Science.gov

Sample records for comparative genomics analysis

  1. Image analysis in comparative genomic hybridization

    SciTech Connect

    Lundsteen, C.; Maahr, J.; Christensen, B.

    1995-01-01

    Comparative genomic hybridization (CGH) is a new technique by which genomic imbalances can be detected by combining in situ suppression hybridization of whole genomic DNA and image analysis. We have developed software for rapid, quantitative CGH image analysis by a modification and extension of the standard software used for routine karyotyping of G-banded metaphase spreads in the Magiscan chromosome analysis system. The DAPI-counterstained metaphase spread is karyotyped interactively. Corrections for image shifts between the DAPI, FITC, and TRITC images are done manually by moving the three images relative to each other. The fluorescence background is subtracted. A mean filter is applied to smooth the FITC and TRITC images before the fluorescence ratio between the individual FITC- and TRITC-stained chromosomes is computed pixel by pixel inside the area of the chromosomes determined by the DAPI boundaries. Fluorescence intensity ratio profiles are generated, and peaks and valleys indicating possible gains and losses of test DNA are marked where the ratio rises above 1.25 or falls below 0.75. By combining the analysis of several metaphase spreads, consistent findings of gains and losses in all or almost all spreads indicate chromosomal imbalance. Chromosomal imbalances are detected either by visual inspection of fluorescence ratio (FR) profiles or by a statistical approach that compares FR measurements of the individual case with measurements of normal chromosomes. The complete analysis of one metaphase can be carried out in approximately 10 minutes. 8 refs., 7 figs., 1 tab.
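The processing chain in this abstract (background subtraction, mean filtering, ratio computation, thresholding at 0.75 and 1.25) can be illustrated with a minimal sketch. It is not the Magiscan software: it works on a one-dimensional intensity profile along a chromosome rather than on pixels inside a DAPI mask, and the function names, the `window` parameter, and the choice of which fluorochrome carries the test DNA are all assumptions made for illustration.

```python
from statistics import mean


def mean_filter(values, window=3):
    """Smooth a 1-D profile with a centred moving average.

    Positions near the ends use a shorter window (an assumption of this sketch).
    """
    half = window // 2
    return [mean(values[max(0, i - half): i + half + 1]) for i in range(len(values))]


def ratio_profile(test, reference, background=0.0, window=3):
    """Subtract background, smooth both channels, then divide position by position."""
    test_s = mean_filter([max(v - background, 0.0) for v in test], window)
    ref_s = mean_filter([max(v - background, 0.0) for v in reference], window)
    # A zero reference intensity gives an undefined ratio, recorded as NaN.
    return [t / r if r > 0 else float("nan") for t, r in zip(test_s, ref_s)]


def classify(ratios, loss_below=0.75, gain_above=1.25):
    """Label each position using the two thresholds quoted in the abstract."""
    labels = []
    for r in ratios:
        if r != r:  # NaN is the only value not equal to itself
            labels.append("undefined")
        elif r < loss_below:
            labels.append("loss")
        elif r > gain_above:
            labels.append("gain")
        else:
            labels.append("balanced")
    return labels


if __name__ == "__main__":
    test_channel = [10, 10, 20, 20, 5, 5]       # hypothetical intensities
    reference_channel = [10, 10, 10, 10, 10, 10]
    print(classify(ratio_profile(test_channel, reference_channel, window=1)))
```

With `window=1` the filter leaves the profile unchanged, which makes the thresholds easy to check by hand; a real analysis would also combine several metaphase spreads before calling an imbalance, as the abstract describes.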

  2. Comparative genome analysis of Basidiomycete fungi

    SciTech Connect

    Riley, Robert; Salamov, Asaf; Henrissat, Bernard; Nagy, Laszlo; Brown, Daren; Held, Benjamin; Baker, Scott; Blanchette, Robert; Boussau, Bastien; Doty, Sharon L.; Fagnan, Kirsten; Floudas, Dimitris; Levasseur, Anthony; Manning, Gerard; Martin, Francis; Morin, Emmanuelle; Otillar, Robert; Pisabarro, Antonio; Walton, Jonathan; Wolfe, Ken; Hibbett, David; Grigoriev, Igor

    2013-08-07

    Fungi of the phylum Basidiomycota (basidiomycetes) make up some 37% of the described fungi, and are important in forestry, agriculture, medicine, and bioenergy. This diverse phylum includes symbionts, pathogens, and saprotrophs including the majority of wood decaying and ectomycorrhizal species. To better understand the genetic diversity of this phylum we compared the genomes of 35 basidiomycetes including 6 newly sequenced genomes. These genomes span extremes of genome size, gene number, and repeat content. Analysis of core genes reveals that some 48% of basidiomycete proteins are unique to the phylum with nearly half of those (22%) found in only one organism. Correlations between lifestyle and certain gene families are evident. Phylogenetic patterns of plant biomass-degrading genes in Agaricomycotina suggest a continuum rather than a dichotomy between the white rot and brown rot modes of wood decay. Based on phylogenetically-informed PCA analysis of wood decay genes, we predict that Botryobasidium botryosum and Jaapia argillacea have properties similar to white rot species, although neither has typical ligninolytic class II fungal peroxidases (PODs). This prediction is supported by growth assays in which both fungi exhibit wood decay with white rot-like characteristics. Based on this, we suggest that the white/brown rot dichotomy may be inadequate to describe the full range of wood decaying fungi. Analysis of the rate of discovery of proteins with no or few homologs suggests the value of continued sequencing of basidiomycete fungi.

  3. Comparative Analysis of Genome Sequences with VISTA

    DOE Data Explorer

    Dubchak, Inna

    VISTA is a comprehensive suite of programs and databases developed by and hosted at the Genomics Division of Lawrence Berkeley National Laboratory. They provide information and tools designed to facilitate comparative analysis of genomic sequences. Users have two ways to interact with the suite of applications at the VISTA portal. They can submit their own sequences and alignments for analysis (VISTA servers) or examine pre-computed whole-genome alignments of different species. A key menu option is the Enhancer Browser and Database at http://enhancer.lbl.gov/. The VISTA Enhancer Browser is a central resource for experimentally validated human noncoding fragments with gene enhancer activity as assessed in transgenic mice. Most of these noncoding elements were selected for testing based on their extreme conservation with other vertebrates. The results of this enhancer screen are provided through this publicly available website. The browser also features relevant results by external contributors and a large collection of additional genome-wide conserved noncoding elements which are candidate enhancer sequences. The LBL developers invite external groups to submit computational predictions of developmental enhancers. As of 10/19/2009 the database contains information on 1109 in vivo tested elements - 508 elements with enhancer activity.

  4. Comparative Genome Analysis in the Integrated Microbial Genomes (IMG) System

    SciTech Connect

    Kyrpides, Nikos C.; Markowitz, Victor M.

    2006-03-01

    Comparative genome analysis is critical for the effective exploration of a rapidly growing number of complete and draft sequences for microbial genomes. The Integrated Microbial Genomes (IMG) system (img.jgi.doe.gov) has been developed as a community resource that provides support for comparative analysis of microbial genomes in an integrated context. IMG allows users to navigate the multidimensional microbial genome data space and focus their analysis on a subset of genes, genomes, and functions of interest. IMG provides graphical viewers, summaries and occurrence profile tools for comparing genes, pathways and functions (terms) across specific genomes. Genes can be further examined using gene neighborhoods and compared with sequence alignment tools.

  5. Quantitative analysis of comparative genomic hybridization

    SciTech Connect

    Manoir, S. du; Bentz, M.; Joos, S.

    1995-01-01

    Comparative genomic hybridization (CGH) is a new molecular cytogenetic method for the detection of chromosomal imbalances. Following cohybridization of DNA prepared from a sample to be studied and control DNA to normal metaphase spreads, probes are detected via different fluorochromes. The ratio of the test and control fluorescence intensities along a chromosome reflects the relative copy number of segments of a chromosome in the test genome. Quantitative evaluation of CGH experiments is required for the determination of low copy changes, e.g., monosomy or trisomy, and for the definition of the breakpoints involved in unbalanced rearrangements. In this study, a program for quantitation of CGH preparations is presented. This program is based on the extraction of the fluorescence ratio profile along each chromosome, followed by averaging of individual profiles from several metaphase spreads. Objective parameters critical for quantitative evaluations were tested, and the criteria for selection of suitable CGH preparations are described. The granularity of the chromosome painting and the regional inhomogeneity of fluorescence intensities in metaphase spreads proved to be crucial parameters. The coefficient of variation of the ratio value for chromosomes in balanced state (CVBS) provides a general quality criterion for CGH experiments. Different cutoff levels (thresholds) of average fluorescence ratio values were compared for their specificity and sensitivity with regard to the detection of chromosomal imbalances. 27 refs., 15 figs., 1 tab.
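Two steps named in this abstract, averaging ratio profiles over several metaphase spreads and summarising quality with a coefficient of variation, can be sketched as follows. The abstract does not say how the authors pooled values for the CVBS, so the sketch uses the textbook definition (standard deviation divided by mean) with the population standard deviation; the function names and input layout are assumptions.

```python
from statistics import mean, pstdev


def average_profiles(profiles):
    """Position-wise mean of equal-length ratio profiles, one per metaphase spread."""
    lengths = {len(p) for p in profiles}
    if len(lengths) != 1:
        raise ValueError("profiles must be resampled to a common length first")
    return [mean(column) for column in zip(*profiles)]


def coefficient_of_variation(ratios):
    """Standard deviation divided by mean (population SD is an assumption here)."""
    m = mean(ratios)
    if m == 0:
        raise ValueError("mean ratio is zero; the coefficient of variation is undefined")
    return pstdev(ratios) / m


if __name__ == "__main__":
    # Hypothetical ratio profiles of one balanced chromosome in three spreads.
    spreads = [
        [0.95, 1.00, 1.05],
        [1.05, 1.00, 0.95],
        [1.00, 1.00, 1.00],
    ]
    averaged = average_profiles(spreads)
    print(averaged)
    print(coefficient_of_variation(averaged))
```

A low value on chromosomes known to be balanced would indicate a less noisy preparation under this definition; the actual acceptance thresholds are given in the paper, not in the abstract.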

  6. Initial sequencing and comparative analysis of the mouse genome

    SciTech Connect

    Waterston, Robert H.; Lindblad-Toh, Kerstin; Birney, Ewan; Rogers, Jane; Abril, Josep F.; Agarwal, Pankaj; Agarwala, Richa; Ainscough, Rachel; Alexandersson, Marina; An, Peter; Antonarakis, Stylianos E.; Attwood, John; Baertsch, Robert; Bailey, Jonathon; Barlow, Karen; Beck, Stephan; Berry, Eric; Birren, Bruce; Bloom, Toby; Bork, Peer; Botcherby, Marc; Bray, Nicolas; Brent, Michael R.; Brown, Daniel G.; Brown, Stephen D.; Bult, Carol; Burton, John; Butler, Jonathan; Campbell, Robert D.; Carninci, Piero; Cawley, Simon; Chiaromonte, Francesca; Chinwalla, Asif T.; Church, Deanna M.; Clamp, Michele; Clee, Christopher; Collins, Francis S.; Cook, Lisa L.; Copley, Richard R.; Coulson, Alan; Couronne, Olivier; Cuff, James; Curwen, Val; Cutts, Tim; Daly, Mark; David, Robert; Davies, Joy; Delehaunty, Kimberly D.; Deri, Justin; Dermitzakis, Emmanouil T.; Dewey, Colin; Dickens, Nicholas J.; Diekhans, Mark; Dodge, Sheila; Dubchak, Inna; Dunn, Diane M.; Eddy, Sean R.; Elnitski, Laura; Emes, Richard D.; Eswara, Pallavi; Eyras, Eduardo; Felsenfeld, Adam; Fewell, Ginger A.; Flicek, Paul; Foley, Karen; Frankel, Wayne N.; Fulton, Lucinda A.; Fulton, Robert S.; Furey, Terrence S.; Gage, Diane; Gibbs, Richard A.; Glusman, Gustavo; Gnerre, Sante; Goldman, Nick; Goodstadt, Leo; Grafham, Darren; Graves, Tina A.; Green, Eric D.; Gregory, Simon; Guigo, Roderic; Guyer, Mark; Hardison, Ross C.; Haussler, David; Hayashizaki, Yoshihide; Hillier, LaDeana W.; Hinrichs, Angela; Hlavina, Wratko; Holzer, Timothy; Hsu, Fan; Hua, Axin; Hubbard, Tim; Hunt, Adrienne; Jackson, Ian; Jaffe, David B.; Johnson, L. Steven; Jones, Matthew; Jones, Thomas A.; Joy, Ann; Kamal, Michael; Karlsson, Elinor K.; Karolchik, Donna; Kasprzyk, Arkadiusz; Kawai, Jun; Keibler, Evan; Kells, Cristyn; Kent, W. James; Kirby, Andrew; Kolbe, Diana L.; Korf, Ian; Kucherlapati, Raju S.; Kulbokas III, Edward J.; Kulp, David; Landers, Tom; Leger, J.P.; Leonard, Steven; Letunic, Ivica; Levine, Rosie; et al.

    2002-12-15

    The sequence of the mouse genome is a key informational tool for understanding the contents of the human genome and a key experimental tool for biomedical research. Here, we report the results of an international collaboration to produce a high-quality draft sequence of the mouse genome. We also present an initial comparative analysis of the mouse and human genomes, describing some of the insights that can be gleaned from the two sequences. We discuss topics including the analysis of the evolutionary forces shaping the size, structure and sequence of the genomes; the conservation of large-scale synteny across most of the genomes; the much lower extent of sequence orthology covering less than half of the genomes; the proportions of the genomes under selection; the number of protein-coding genes; the expansion of gene families related to reproduction and immunity; the evolution of proteins; and the identification of intraspecies polymorphism.

  7. Computational Methods for the Analysis of Array Comparative Genomic Hybridization

    PubMed Central

    Chari, Raj; Lockwood, William W.; Lam, Wan L.

    2006-01-01

    Array comparative genomic hybridization (array CGH) is a technique for assaying the copy number status of cancer genomes. The widespread use of this technology has led to a rapid accumulation of high throughput data, which in turn has prompted the development of computational strategies for the analysis of array CGH data. Here we explain the principles behind array image processing, data visualization and genomic profile analysis, review currently available software packages, and raise considerations for future software development. PMID:17992253

  8. Comparative Genome Analysis of Basidiomycete Fungi

    SciTech Connect

    Riley, Robert; Salamov, Asaf; Morin, Emmanuelle; Nagy, Laszlo; Manning, Gerard; Baker, Scott; Brown, Daren; Henrissat, Bernard; Levasseur, Anthony; Hibbett, David; Martin, Francis; Grigoriev, Igor

    2012-03-19

    Fungi of the phylum Basidiomycota (basidiomycetes) make up some 37% of the described fungi, and are important in forestry, agriculture, medicine, and bioenergy. This diverse phylum includes the mushrooms, wood rots, symbionts, and plant and animal pathogens. To better understand the diversity of phenotypes in basidiomycetes, we performed a comparative analysis of 35 basidiomycete fungi spanning the diversity of the phylum. Phylogenetic patterns of lignocellulose degrading genes suggest a continuum rather than a sharp dichotomy between the white rot and brown rot modes of wood decay. Patterns of secondary metabolic enzymes give additional insight into the broad array of phenotypes found in the basidiomycetes. We suggest that the profile of an organism in lignocellulose-targeting genes can be used to predict its nutritional mode, and predict Dacryopinax sp. as a brown rot; Botryobasidium botryosum and Jaapia argillacea as white rots.

  9. DNA sequence copy number analysis by Comparative Genomic Hybridization (CGH)

    SciTech Connect

    Pinkel, D.; Kallioniemi, A.; Kallioniemi, O.; Waldman, F.; Sudar, D.; Gray, I.; Rutovitz, D.; Piper, I.

    1993-01-01

    Comparative Genomic Hybridization (CGH) uses the kinetics of in situ hybridization to compare the copy numbers of different DNA sequences within the same genome and the copy numbers of the same sequences among different genomes. In a typical application genomic DNA from a tumor and from normal cells are differentially labeled and simultaneously hybridized to normal metaphase chromosomes, and detected with different fluorochromes. Properly registered images of each fluorochrome are obtained using a microscope equipped with multi-band filters and a CCD camera. Digital image analysis permits measurement of intensity ratio profiles along each of the target chromosomes. Studies of cells with known aberrations indicate that the intensity ratio at each position is proportional to the ratio of the copy numbers of the sequences that bind there in the tumor and normal genomes. Analytical challenges posed by the need to efficiently obtain copy number karyotypes are discussed.

  10. Analysis of the allohexaploid bread wheat genome (Triticum aestivum) using comparative whole genome shotgun sequencing

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The large 17 Gb allopolyploid genome of bread wheat is a major challenge for genome analysis because it is composed of three closely related and independently maintained genomes, with genes dispersed as small “islands” separated by vast tracts of repetitive DNA. We used a novel comparative genomi...

  11. Mycobacterial species as case-study of comparative genome analysis.

    PubMed

    Zakham, F; Belayachi, L; Ussery, D; Akrim, M; Benjouad, A; El Aouad, R; Ennaji, M M

    2011-01-01

    The genus Mycobacterium comprises more than 120 species, including important human pathogens that cause major public health problems and illnesses. Further, with more than 100 genome sequences available from this genus, comparative genome analysis can provide new insights into the evolutionary events of these species and help improve drugs, vaccines, and diagnostic tools for controlling Mycobacterial diseases. In the present study we aim to outline a comparative genome analysis of fourteen Mycobacterial genomes: M. avium subsp. paratuberculosis K-10, M. bovis AF2122/97, M. bovis BCG str. Pasteur 1173P2, M. leprae Br4923, M. marinum M, M. sp. KMS, M. sp. MCS, M. tuberculosis CDC1551, M. tuberculosis F11, M. tuberculosis H37Ra, M. tuberculosis H37Rv, M. tuberculosis KZN 1435, M. ulcerans Agy99, and M. vanbaalenii PYR-1. For this purpose, the genomes were compared by genome length, GC content, and number of genes in different databases (GenBank, RefSeq, and Prodigal). A BLAST matrix of these genomes was computed to summarize the similarity between species in a simple scheme. As a result of multiple genome analysis, the pan and core genome have been defined for twelve Mycobacterial species. We have also introduced the genome atlas of the reference strain M. tuberculosis H37Rv, which gives a good overview of this genome. To examine the phylogenetic relationships among these bacteria, a phylogenetic tree was constructed from the 16S rRNA gene for tuberculosis and non-tuberculosis Mycobacteria to understand the evolutionary events of these species. PMID:21396338
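Two of the per-genome summary statistics mentioned in this abstract, genome length and GC content, are simple to compute from a sequence string. The sketch below is illustrative only and is not the authors' pipeline; the function names are hypothetical, and the choice to exclude ambiguous bases such as N from the GC denominator (while still counting them in the length) is an assumption.

```python
def gc_content(sequence):
    """Fraction of G and C among the unambiguous bases (A, C, G, T) of a sequence."""
    seq = sequence.upper()
    unambiguous = sum(seq.count(base) for base in "ACGT")
    if unambiguous == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return (seq.count("G") + seq.count("C")) / unambiguous


def summarize(genomes):
    """Map each genome name to its length (all characters) and GC fraction."""
    return {
        name: {"length": len(seq), "gc": gc_content(seq)}
        for name, seq in genomes.items()
    }


if __name__ == "__main__":
    # Toy sequences for illustration; real genomes would be read from FASTA files.
    toy = {"genome_a": "ATGCGCGC", "genome_b": "ATATATGCNN"}
    for name, stats in summarize(toy).items():
        print(name, stats["length"], round(stats["gc"], 3))
```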

  12. Comparative genome analysis of Solanum lycopersicum and Solanum tuberosum

    PubMed Central

    Lall, Rohit; Thomas, George; Singh, Satendra; Singh, Archana; Wadhwa, Gulshan

    2013-01-01

    Solanum lycopersicum and Solanum tuberosum are agriculturally important crop species as they are rich sources of starch, protein, antioxidants, lycopene, beta-carotene, vitamin C, and fiber. The genomes of S. lycopersicum and S. tuberosum are currently available. However, the linear strings of nucleotides that together comprise a genome sequence are of limited significance by themselves. Computational and bioinformatics approaches can be used to exploit the genomes for fundamental research aimed at improving their varieties. Comparative genome analysis and Pfam analysis of predicted reviewed paralogous proteins were performed. It was found that S. lycopersicum proteins belong to more families, domains and clans than those of S. tuberosum. It was also found that intergenic regions are the most conserved between the two genomes, followed by exons, introns and UTRs. This can be exploited to predict regions between genomes that are similar to each other and to study the evolutionary relationship between two genomes, leading towards the development of disease resistance, stress tolerance and improved varieties of tomato. PMID:24307771

  13. Utility of array comparative genomic hybridization in cytogenetic analysis.

    PubMed

    Singh, Rashmi R; Cheung, K-John J; Horsman, Douglas E

    2011-01-01

    Conventional comparative genomic hybridization (CGH), high-resolution oligonucleotide, and BAC array CGH have modernized the field of cytogenetics to enable access to unbalanced genomic aberrations such as whole or partial chromosomal gains and losses. The basic principle of array CGH involves hybridizing differentially labeled proband/test (e.g., tumor) and normal reference DNA on an array of oligonucleotide or BAC clones instead of normal metaphases as in conventional CGH. The sub-megabase resolution tiling BAC arrays are extremely useful for the analysis of acquired aberrations in cancer genomes. Array CGH can be extremely useful to identify the chromosomal makeup of marker and ring chromosomes, to define/delineate the precise location/bands involved in structural aberrations and the accurate localization of translocation breakpoints in both simple and complex karyotypes either alone or in combination with standard karyotype analysis. PMID:21431645

  14. Comparative analysis of methods for genome-wide nucleosome cartography.

    PubMed

    Quintales, Luis; Vázquez, Enrique; Antequera, Francisco

    2015-07-01

    Nucleosomes contribute to compacting the genome into the nucleus and regulate the physical access of regulatory proteins to DNA either directly or through the epigenetic modifications of the histone tails. Precise mapping of nucleosome positioning across the genome is, therefore, essential to understanding the genome regulation. In recent years, several experimental protocols have been developed for this purpose that include the enzymatic digestion, chemical cleavage or immunoprecipitation of chromatin followed by next-generation sequencing of the resulting DNA fragments. Here, we compare the performance and resolution of these methods from the initial biochemical steps through the alignment of the millions of short-sequence reads to a reference genome to the final computational analysis to generate genome-wide maps of nucleosome occupancy. Because of the lack of a unified protocol to process data sets obtained through the different approaches, we have developed a new computational tool (NUCwave), which facilitates their analysis, comparison and assessment and will enable researchers to choose the most suitable method for any particular purpose. NUCwave is freely available at http://nucleosome.usal.es/nucwave along with a step-by-step protocol for its use. PMID:25296770

  15. FusoBase: an online Fusobacterium comparative genomic analysis platform

    PubMed Central

    Ang, Mia Yang; Heydari, Hamed; Jakubovics, Nick S.; Mahmud, Mahafizul Imran; Dutta, Avirup; Wee, Wei Yee; Wong, Guat Jah; Mutha, Naresh V.R.; Tan, Shi Yang; Choo, Siew Woh

    2014-01-01

    Fusobacterium are anaerobic gram-negative bacteria that have been associated with a wide spectrum of human infections and diseases. As the biology of Fusobacterium is still not well understood, comparative genomic analysis on members of this species will provide further insights on their taxonomy, phylogeny, pathogenicity and other information that may contribute to better management of infections and diseases. To facilitate the ongoing genomic research on Fusobacterium, a specialized database with easy-to-use analysis tools is necessary. Here we present FusoBase, an online database providing access to genome-wide annotated sequences of Fusobacterium strains as well as bioinformatics tools, to support the expanding scientific community. Using our custom-developed Pairwise Genome Comparison tool, we demonstrate how differences between two user-defined genomes and how insertion of putative prophages can be identified. In addition, Pathogenomics Profiling Tool is capable of clustering predicted genes across Fusobacterium strains and visualizing the results in the form of a heat map with dendrogram. Database URL: http://fusobacterium.um.edu.my. PMID:25149689

  16. Phylogeny and comparative genome analysis of a Basidiomycete fungi

    SciTech Connect

    Riley, Robert W.; Salamov, Asaf; Grigoriev, Igor; Hibbett, David

    2011-03-14

    Fungi of the phylum Basidiomycota make up some 37% of the described fungi, and are important from the perspectives of forestry, agriculture, medicine, and bioenergy. This diverse phylum includes the mushrooms, wood rots, plant pathogenic rusts and smuts, and some human pathogens. To better understand these important fungi, we have undertaken a comparative genomic analysis of the Basidiomycetes with available sequenced genomes. We report a phylogeny that sheds light on previously unclear evolutionary relationships among the Basidiomycetes. We also define a "core proteome" based on protein families conserved in all Basidiomycetes. We identify key expansions and contractions in protein families that may be responsible for the degradation of plant biomass such as cellulose, hemicellulose, and lignin. Finally, we speculate as to the genomic changes that drove such expansions and contractions.

  17. Genome sequence and comparative genome analysis of Pseudomonas syringae pv. syringae type strain ATCC 19310.

    PubMed

    Park, Yong-Soon; Jeong, Haeyoung; Sim, Young Mi; Yi, Hwe-Su; Ryu, Choong-Min

    2014-04-01

    Pseudomonas syringae pv. syringae (Psy) is a major bacterial pathogen of many economically important plant species. Despite the severity of its impact, the genome sequence of the type strain has not been reported. Here, we present the draft genome sequence of Psy ATCC 19310. Comparative genomic analysis revealed that Psy ATCC 19310 is closely related to Psy B728a. However, only a few type III effectors, which are key virulence factors, are shared by the two strains, indicating the possibility of host-pathogen specificity and genome dynamics, even under the pathovar level. PMID:24444998

  18. Complete genome sequencing and comparative genomic analysis of functionally diverse Lysinibacillus sphaericus III(3)7.

    PubMed

    Rey, Andrés; Silva-Quintero, Laura; Dussán, Jenny

    2016-09-01

    Lysinibacillus sphaericus III(3)7 is a native Colombian strain, the first one isolated from soil samples. This strain has shown high levels of pathogenic activity against Culex quinquefasciatus larvae in laboratory assays compared to other members of the same species. Using Pacific Biosciences sequencing technology we sequenced, annotated (de novo) and described the genome of strain III(3)7, achieving a complete genome sequence status. We then performed a comparative analysis between the newly sequenced genome and the ones previously reported for Colombian isolates L. sphaericus OT4b.31, CBAM5 and OT4b.25, with the inclusion of L. sphaericus C3-41, which has been used as a reference genome for most previous genome sequencing projects. We concluded that L. sphaericus III(3)7 is highly similar to strain OT4b.25 and shares high levels of synteny with isolates CBAM5 and C3-41. PMID:27419068

  19. Comparative analysis of genomic signal processing for microarray data clustering.

    PubMed

    Istepanian, Robert S H; Sungoor, Ala; Nebel, Jean-Christophe

    2011-12-01

    Genomic signal processing is a new area of research that combines advanced digital signal processing methodologies for enhanced genetic data analysis. It has many promising applications in bioinformatics and next generation of healthcare systems, in particular, in the field of microarray data clustering. In this paper we present a comparative performance analysis of enhanced digital spectral analysis methods for robust clustering of gene expression across multiple microarray data samples. Three digital signal processing methods: linear predictive coding, wavelet decomposition, and fractal dimension are studied to provide a comparative evaluation of the clustering performance of these methods on several microarray datasets. The results of this study show that the fractal approach provides the best clustering accuracy compared to other digital signal processing and well known statistical methods. PMID:22157075

  20. Sequencing and comparative analysis of the gorilla MHC genomic sequence.

    PubMed

    Wilming, Laurens G; Hart, Elizabeth A; Coggill, Penny C; Horton, Roger; Gilbert, James G R; Clee, Chris; Jones, Matt; Lloyd, Christine; Palmer, Sophie; Sims, Sarah; Whitehead, Siobhan; Wiley, David; Beck, Stephan; Harrow, Jennifer L

    2013-01-01

    Major histocompatibility complex (MHC) genes play a critical role in vertebrate immune response and because the MHC is linked to a significant number of auto-immune and other diseases it is of great medical interest. Here we describe the clone-based sequencing and subsequent annotation of the MHC region of the gorilla genome. Because the MHC is subject to extensive variation, both structural and sequence-wise, it is not readily amenable to study in whole genome shotgun sequence such as the recently published gorilla genome. The variation of the MHC also makes it of evolutionary interest and therefore we analyse the sequence in the context of human and chimpanzee. In our comparisons with human and re-annotated chimpanzee MHC sequence we find that gorilla has a trimodular RCCX cluster, versus the reference human bimodular cluster, and additional copies of Class I (pseudo)genes between Gogo-K and Gogo-A (the orthologues of HLA-K and -A). We also find that Gogo-H (and Patr-H) is coding versus the HLA-H pseudogene and, conversely, there is a Gogo-DQB2 pseudogene versus the HLA-DQB2 coding gene. Our analysis, which is freely available through the VEGA genome browser, provides the research community with a comprehensive dataset for comparative and evolutionary research of the MHC. PMID:23589541

  1. Comparative genomic analysis of seven Mycoplasma hyosynoviae strains

    PubMed Central

    Bumgardner, Eric A; Kittichotirat, Weerayuth; Bumgarner, Roger E; Lawrence, Paulraj K

    2015-01-01

    Infection with Mycoplasma hyosynoviae can result in debilitating arthritis in pigs, particularly those aged 10 weeks or older. Strategies for controlling this pathogen are becoming increasingly important due to the rise in the number of cases of arthritis that have been attributed to infection in recent years. In order to begin to develop interventions to prevent arthritis caused by M. hyosynoviae, more information regarding the specific proteins and potential virulence factors that its genome encodes was needed. However, the genome of this emerging swine pathogen had not been sequenced previously. In this report, we present a comparative analysis of the genomes of seven strains of M. hyosynoviae isolated from different locations in North America during the years 2010 to 2013. We identified several putative virulence factors that may contribute to the ability of this pathogen to adhere to host cells. Additionally, we discovered several prophage genes present within the genomes of three strains that show significant similarity to MAV1, a phage isolated from the related species, M. arthritidis. We also identified CRISPR-Cas and type III restriction and modification systems present in two strains that may contribute to their ability to defend against phage infection. PMID:25693846

  2. Industrial Acetogenic Biocatalysts: A Comparative Metabolic and Genomic Analysis

    PubMed Central

    Bengelsdorf, Frank R.; Poehlein, Anja; Linder, Sonja; Erz, Catarina; Hummel, Tim; Hoffmeister, Sabrina; Daniel, Rolf; Dürre, Peter

    2016-01-01

    Synthesis gas (syngas) fermentation by anaerobic acetogenic bacteria employing the Wood–Ljungdahl pathway is a bioprocess for production of biofuels and biocommodities. The major fermentation products of the most relevant biocatalytic strains (Clostridium ljungdahlii, C. autoethanogenum, C. ragsdalei, and C. coskatii) are acetic acid and ethanol. A comparative metabolic and genomic analysis using the mentioned biocatalysts might offer targets for metabolic engineering and thus improve the production of compounds apart from ethanol. Autotrophic growth and product formation of the four wild type (WT) strains were compared in uncontrolled batch experiments. The genomes of C. ragsdalei and C. coskatii were sequenced and the genome sequences of all four biocatalytic strains analyzed in comparative manner. Growth and product spectra (acetate, ethanol, 2,3-butanediol) of C. autoethanogenum, C. ljungdahlii, and C. ragsdalei were rather similar. In contrast, C. coskatii produced significantly less ethanol and its genome sequence lacks two genes encoding aldehyde:ferredoxin oxidoreductases (AOR). Comparative genome sequence analysis of the four WT strains revealed high average nucleotide identity (ANI) of C. ljungdahlii and C. autoethanogenum (99.3%) and C. coskatii (98.3%). In contrast, C. ljungdahlii WT and C. ragsdalei WT showed an ANI-based similarity of only 95.8%. Additionally, recombinant C. ljungdahlii strains were constructed that harbor an artificial acetone synthesis operon (ASO) consisting of the following genes: adc, ctfA, ctfB, and thlA (encoding acetoacetate decarboxylase, acetoacetyl-CoA:acetate/butyrate:CoA-transferase subunits A and B, and thiolase) under the control of thlA promoter (PthlA) from C. acetobutylicum or native pta-ack promoter (Ppta-ack) from C. ljungdahlii. Respective recombinant strains produced 2-propanol rather than acetone, due to the presence of a NADPH-dependent primary-secondary alcohol dehydrogenase that converts acetone to 2

  3. Industrial Acetogenic Biocatalysts: A Comparative Metabolic and Genomic Analysis.

    PubMed

    Bengelsdorf, Frank R; Poehlein, Anja; Linder, Sonja; Erz, Catarina; Hummel, Tim; Hoffmeister, Sabrina; Daniel, Rolf; Dürre, Peter

    2016-01-01

    Synthesis gas (syngas) fermentation by anaerobic acetogenic bacteria employing the Wood-Ljungdahl pathway is a bioprocess for production of biofuels and biocommodities. The major fermentation products of the most relevant biocatalytic strains (Clostridium ljungdahlii, C. autoethanogenum, C. ragsdalei, and C. coskatii) are acetic acid and ethanol. A comparative metabolic and genomic analysis using the mentioned biocatalysts might offer targets for metabolic engineering and thus improve the production of compounds apart from ethanol. Autotrophic growth and product formation of the four wild type (WT) strains were compared in uncontrolled batch experiments. The genomes of C. ragsdalei and C. coskatii were sequenced and the genome sequences of all four biocatalytic strains analyzed in comparative manner. Growth and product spectra (acetate, ethanol, 2,3-butanediol) of C. autoethanogenum, C. ljungdahlii, and C. ragsdalei were rather similar. In contrast, C. coskatii produced significantly less ethanol and its genome sequence lacks two genes encoding aldehyde:ferredoxin oxidoreductases (AOR). Comparative genome sequence analysis of the four WT strains revealed high average nucleotide identity (ANI) of C. ljungdahlii and C. autoethanogenum (99.3%) and C. coskatii (98.3%). In contrast, C. ljungdahlii WT and C. ragsdalei WT showed an ANI-based similarity of only 95.8%. Additionally, recombinant C. ljungdahlii strains were constructed that harbor an artificial acetone synthesis operon (ASO) consisting of the following genes: adc, ctfA, ctfB, and thlA (encoding acetoacetate decarboxylase, acetoacetyl-CoA:acetate/butyrate:CoA-transferase subunits A and B, and thiolase) under the control of thlA promoter (PthlA) from C. acetobutylicum or native pta-ack promoter (Ppta-ack) from C. ljungdahlii. Respective recombinant strains produced 2-propanol rather than acetone, due to the presence of a NADPH-dependent primary-secondary alcohol dehydrogenase that converts acetone to 2

  4. Comparative analysis of essential genes in prokaryotic genomic islands

    PubMed Central

    Zhang, Xi; Peng, Chong; Zhang, Ge; Gao, Feng

    2015-01-01

    Essential genes are thought to encode proteins that carry out the basic functions to sustain a cellular life, and genomic islands (GIs) usually contain clusters of horizontally transferred genes. It has been assumed that essential genes are not likely to be located in GIs, but a systematic analysis of essential genes in GIs has not been carried out before. Here, we have analyzed the essential genes in 28 prokaryotes by a statistical method and concluded that essential genes in GIs are significantly fewer than those outside GIs. The function of 362 essential genes found in GIs has been explored further by BLAST against the Virulence Factor Database (VFDB) and the phage/prophage sequence database of PHAge Search Tool (PHAST). Consequently, 64 and 60 eligible essential genes are found to share sequence similarity with virulence factors and phage/prophage-related genes, respectively. Meanwhile, we find several toxin-related proteins and repressors encoded by these essential genes in GIs. The comparative analysis of essential genes in genomic islands will not only shed new light on the development of prediction algorithms for essential genes, but also give a clue to the functionality of essential genes in genomic islands. PMID:26223387
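
The abstract above does not say which statistical method was used. As a minimal sketch of one way such a comparison could be made, the following computes a one-sided hypergeometric (Fisher-type) p-value for depletion of essential genes inside GIs. The function name and all gene counts are hypothetical and are not taken from the paper.

```python
from math import comb


def hypergeom_depletion_p(total_genes, essential_genes, gi_genes, essential_in_gi):
    """One-sided P(X <= essential_in_gi) under a hypergeometric null.

    Null model (an assumption of this sketch): the gi_genes genes lying in
    genomic islands are a random draw from total_genes, of which
    essential_genes are essential.
    """
    non_essential = total_genes - essential_genes
    denom = comb(total_genes, gi_genes)
    lowest = max(0, gi_genes - non_essential)  # smallest feasible overlap
    p = 0.0
    for k in range(lowest, essential_in_gi + 1):
        p += comb(essential_genes, k) * comb(non_essential, gi_genes - k) / denom
    return p


# Hypothetical genome: 4000 genes, 300 essential, 400 located in GIs,
# of which only 10 are essential (about 30 would be expected by chance).
p_value = hypergeom_depletion_p(4000, 300, 400, 10)
print(f"depletion p-value: {p_value:.3g}")
```

A small p-value here would indicate fewer essential genes in GIs than expected under random placement; a study across many genomes would also need a correction for multiple testing.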

  5. A Comparative Analysis of Mitochondrial Genomes in Eustigmatophyte Algae

    PubMed Central

    Ševčíková, Tereza; Klimeš, Vladimír; Zbránková, Veronika; Strnad, Hynek; Hroudová, Miluše; Vlček, Čestmír; Eliáš, Marek

    2016-01-01

    Eustigmatophyceae (Ochrophyta, Stramenopiles) is a small algal group with species of the genus Nannochloropsis being its best studied representatives. Nuclear and organellar genomes have been recently sequenced for several Nannochloropsis spp., but phylogenetically wider genomic studies are missing for eustigmatophytes. We sequenced mitochondrial genomes (mitogenomes) of three species representing most major eustigmatophyte lineages, Monodopsis sp. MarTras21, Vischeria sp. CAUP Q 202 and Trachydiscus minutus, and carried out their comparative analysis in the context of available data from Nannochloropsis and other stramenopiles, revealing a number of noticeable findings. First, mitogenomes of most eustigmatophytes are highly collinear and similar in the gene content, but extensive rearrangements and loss of three otherwise ubiquitous genes happened in the Vischeria lineage; this correlates with an accelerated evolution of mitochondrial gene sequences in this lineage. Second, eustigmatophytes appear to be the only ochrophyte group with the Atp1 protein encoded by the mitogenome. Third, eustigmatophyte mitogenomes uniquely share a truncated nad11 gene encoding only the C-terminal part of the Nad11 protein, while the N-terminal part is encoded by a separate gene in the nuclear genome. Fourth, UGA as a termination codon and the cognate release factor mRF2 were lost from mitochondria independently by the Nannochloropsis and T. minutus lineages. Finally, the rps3 gene in the mitogenome of Vischeria sp. is interrupted by the UAG codon, but the genome includes a gene for an unusual tRNA with an extended anticodon loop that we speculate may serve as a suppressor tRNA to properly decode the rps3 gene. PMID:26872774

  6. A Comparative Analysis of Mitochondrial Genomes in Eustigmatophyte Algae.

    PubMed

    Ševčíková, Tereza; Klimeš, Vladimír; Zbránková, Veronika; Strnad, Hynek; Hroudová, Miluše; Vlček, Čestmír; Eliáš, Marek

    2016-03-01

    Eustigmatophyceae (Ochrophyta, Stramenopiles) is a small algal group with species of the genus Nannochloropsis being its best studied representatives. Nuclear and organellar genomes have been recently sequenced for several Nannochloropsis spp., but phylogenetically wider genomic studies are missing for eustigmatophytes. We sequenced mitochondrial genomes (mitogenomes) of three species representing most major eustigmatophyte lineages, Monodopsis sp. MarTras21, Vischeria sp. CAUP Q 202 and Trachydiscus minutus, and carried out their comparative analysis in the context of available data from Nannochloropsis and other stramenopiles, revealing a number of noticeable findings. First, mitogenomes of most eustigmatophytes are highly collinear and similar in the gene content, but extensive rearrangements and loss of three otherwise ubiquitous genes happened in the Vischeria lineage; this correlates with an accelerated evolution of mitochondrial gene sequences in this lineage. Second, eustigmatophytes appear to be the only ochrophyte group with the Atp1 protein encoded by the mitogenome. Third, eustigmatophyte mitogenomes uniquely share a truncated nad11 gene encoding only the C-terminal part of the Nad11 protein, while the N-terminal part is encoded by a separate gene in the nuclear genome. Fourth, UGA as a termination codon and the cognate release factor mRF2 were lost from mitochondria independently by the Nannochloropsis and T. minutus lineages. Finally, the rps3 gene in the mitogenome of Vischeria sp. is interrupted by the UAG codon, but the genome includes a gene for an unusual tRNA with an extended anticodon loop that we speculate may serve as a suppressor tRNA to properly decode the rps3 gene. PMID:26872774

  7. Genome-wide Comparative Analysis of Annexin Superfamily in Plants

    PubMed Central

    Jami, Sravan Kumar; Clark, Greg B.; Ayele, Belay T.; Ashe, Paula; Kirti, Pulugurtha Bharadwaja

    2012-01-01

    Most annexins are calcium-dependent, phospholipid-binding proteins with suggested functions in response to environmental stresses and signaling during plant growth and development. They have previously been identified and characterized in Arabidopsis and rice, and constitute a multigene family in plants. In this study, we performed a comparative analysis of annexin gene families in the sequenced genomes of Viridiplantae ranging from unicellular green algae to multicellular plants, and identified 149 genes. Phylogenetic studies of these deduced annexins classified them into nine different arbitrary groups. The occurrence and distribution of bona fide type II calcium binding sites within the four annexin domains were found to be different in each of these groups. Analysis of chromosomal distribution of annexin genes in rice, Arabidopsis and poplar revealed their localization on various chromosomes with some members also found on duplicated chromosomal segments leading to gene family expansion. Analysis of gene structure suggests sequential or differential loss of introns during the evolution of land plant annexin genes. Intron positions and phases are well conserved in annexin genes from representative genomes ranging from Physcomitrella to higher plants. The occurrence of alternative motifs such as K/R/HGD was found to be overlapping or at the mutated regions of the type II calcium binding sites indicating potential functional divergence in certain plant annexins. This study provides a basis for further functional analysis and characterization of annexin multigene families in the plant lineage. PMID:23133603

  8. Comparative genomic analysis of ten Streptococcus pneumoniae temperate bacteriophages.

    PubMed

    Romero, Patricia; Croucher, Nicholas J; Hiller, N Luisa; Hu, Fen Z; Ehrlich, Garth D; Bentley, Stephen D; García, Ernesto; Mitchell, Tim J

    2009-08-01

    Streptococcus pneumoniae is an important human pathogen that often carries temperate bacteriophages. As part of a program to characterize the genetic makeup of prophages associated with clinical strains and to assess the potential roles that they play in the biology and pathogenesis in their host, we performed comparative genomic analysis of 10 temperate pneumococcal phages. All of the genomes are organized into five major gene clusters: lysogeny, replication, packaging, morphogenesis, and lysis clusters. All of the phage particles observed showed a Siphoviridae morphology. The only genes that are well conserved in all the genomes studied are those involved in the integration and the lysis of the host in addition to two genes, of unknown function, within the replication module. We observed that a high percentage of the open reading frames contained no similarities to any sequences catalogued in public databases; however, genes that were homologous to known phage virulence genes, including the pblB gene of Streptococcus mitis and the vapE gene of Dichelobacter nodosus, were also identified. Interestingly, bioinformatic tools showed the presence of a toxin-antitoxin system in the phage phiSpn_6, and this represents the first time that an addiction system in a pneumophage has been identified. Collectively, the temperate pneumophages contain a diverse set of genes with various levels of similarity among them. PMID:19502408

  9. Comparative analysis of rosaceous genomes and the reconstruction of a putative ancestral genome for the family

    PubMed Central

    2011-01-01

    Background: Comparative genome mapping studies in Rosaceae have been conducted until now by aligning genetic maps within the same genus, or closely related genera and using a limited number of common markers. The growing body of genomics resources and sequence data for both Prunus and Fragaria permits detailed comparisons between these genera and the recently released Malus × domestica genome sequence. Results: We generated a comparative analysis using 806 molecular markers that are anchored genetically to the Prunus and/or Fragaria reference maps, and physically to the Malus genome sequence. Markers in common for Malus and Prunus, and Malus and Fragaria, respectively were 784 and 148. The correspondence between marker positions was high and conserved syntenic blocks were identified among the three genera in the Rosaceae. We reconstructed a proposed ancestral genome for the Rosaceae. Conclusions: A genome containing nine chromosomes is the most likely candidate for the ancestral Rosaceae progenitor. The number of chromosomal translocations observed between the three genera investigated was low. However, the number of inversions identified among Malus and Prunus was much higher than any reported genome comparisons in plants, suggesting that small inversions have played an important role in the evolution of these two genera or of the Rosaceae. PMID:21226921

  10. Comparative Genomic Analysis of Mannheimia haemolytica from Bovine Sources.

    PubMed

    Klima, Cassidy L; Cook, Shaun R; Zaheer, Rahat; Laing, Chad; Gannon, Vick P; Xu, Yong; Rasmussen, Jay; Potter, Andrew; Hendrick, Steve; Alexander, Trevor W; McAllister, Tim A

    2016-01-01

    Bovine respiratory disease is a common health problem in beef production. The primary bacterial agent involved, Mannheimia haemolytica, is a target for antimicrobial therapy and at risk for associated antimicrobial resistance development. The role of M. haemolytica in pathogenesis is linked to serotype with serotypes 1 (S1) and 6 (S6) isolated from pneumonic lesions and serotype 2 (S2) found in the upper respiratory tract of healthy animals. Here, we sequenced the genomes of 11 strains of M. haemolytica, representing all three serotypes and performed comparative genomics analysis to identify genetic features that may contribute to pathogenesis. Possible virulence associated genes were identified within 14 distinct prophage, including a periplasmic chaperone, a lipoprotein, peptidoglycan glycosyltransferase and a stress response protein. Prophage content ranged from 2-8 per genome, but was higher in S1 and S6 strains. A type I-C CRISPR-Cas system was identified in each strain with spacer diversity and organization conserved among serotypes. The majority of spacers occur in S1 and S6 strains and originate from phage suggesting that serotypes 1 and 6 may be more resistant to phage predation. However, two spacers complementary to the host chromosome targeting a UDP-N-acetylglucosamine 2-epimerase and a glycosyl transferases group 1 gene are present in S1 and S6 strains only indicating these serotypes may employ CRISPR-Cas to regulate gene expression to avoid host immune responses or enhance adhesion during infection. Integrative conjugative elements are present in nine of the eleven genomes. Three of these harbor extensive multi-drug resistance cassettes encoding resistance against the majority of drugs used to combat infection in beef cattle, including macrolides and tetracyclines used in human medicine. The findings here identify key features that are likely contributing to serotype related pathogenesis and specific targets for vaccine design intended to reduce the

  11. Comparative Genomic Analysis of Mannheimia haemolytica from Bovine Sources

    PubMed Central

    Klima, Cassidy L.; Cook, Shaun R.; Zaheer, Rahat; Laing, Chad; Gannon, Vick P.; Xu, Yong; Rasmussen, Jay; Potter, Andrew; Hendrick, Steve; Alexander, Trevor W.; McAllister, Tim A.

    2016-01-01

    Bovine respiratory disease is a common health problem in beef production. The primary bacterial agent involved, Mannheimia haemolytica, is a target for antimicrobial therapy and at risk for associated antimicrobial resistance development. The role of M. haemolytica in pathogenesis is linked to serotype with serotypes 1 (S1) and 6 (S6) isolated from pneumonic lesions and serotype 2 (S2) found in the upper respiratory tract of healthy animals. Here, we sequenced the genomes of 11 strains of M. haemolytica, representing all three serotypes and performed comparative genomics analysis to identify genetic features that may contribute to pathogenesis. Possible virulence associated genes were identified within 14 distinct prophage, including a periplasmic chaperone, a lipoprotein, peptidoglycan glycosyltransferase and a stress response protein. Prophage content ranged from 2–8 per genome, but was higher in S1 and S6 strains. A type I-C CRISPR-Cas system was identified in each strain with spacer diversity and organization conserved among serotypes. The majority of spacers occur in S1 and S6 strains and originate from phage suggesting that serotypes 1 and 6 may be more resistant to phage predation. However, two spacers complementary to the host chromosome targeting a UDP-N-acetylglucosamine 2-epimerase and a glycosyl transferases group 1 gene are present in S1 and S6 strains only indicating these serotypes may employ CRISPR-Cas to regulate gene expression to avoid host immune responses or enhance adhesion during infection. Integrative conjugative elements are present in nine of the eleven genomes. Three of these harbor extensive multi-drug resistance cassettes encoding resistance against the majority of drugs used to combat infection in beef cattle, including macrolides and tetracyclines used in human medicine. The findings here identify key features that are likely contributing to serotype related pathogenesis and specific targets for vaccine design intended to reduce the

  12. Establishing a framework for comparative analysis of genome sequences

    SciTech Connect

    Bansal, A.K.

    1995-06-01

    This paper describes a framework and a high-level language toolkit for comparative analysis of genome sequence alignment. The framework integrates the information derived from multiple sequence alignment and phylogenetic tree (hypothetical tree of evolution) to derive new properties about sequences. Multiple sequence alignments are treated as an abstract data type. Abstract operations have been described to manipulate a multiple sequence alignment and to derive mutation related information from a phylogenetic tree by superimposing parsimonious analysis. The framework has been applied to protein alignments to derive constrained columns (in a multiple sequence alignment) that exhibit evolutionary pressure to preserve a common property in a column despite mutation. A Prolog toolkit based on the framework has been implemented and demonstrated on alignments containing 3000 sequences and 3904 columns.
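
The idea of a constrained column can be illustrated with a short sketch. It is written in Python, not the Prolog of the original toolkit, and it leaves out the phylogenetic tree and the parsimony analysis entirely. It only flags alignment columns whose residues differ but all fall into one property class. The property classes and the function name are assumptions made for illustration.

```python
# Illustrative residue groupings; the classes used in the paper are not given here.
PROPERTY_CLASSES = {
    "hydrophobic": set("AVLIMFW"),
    "positive": set("KRH"),
    "negative": set("DE"),
    "polar": set("STNQ"),
}


def constrained_columns(alignment):
    """Return (column_index, class_name) for columns that mutate within one class.

    alignment is a list of equal-length aligned sequences. Columns that are
    fully conserved or contain a gap character ('-') are skipped.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    hits = []
    for i in range(length):
        column = {seq[i] for seq in alignment}
        if len(column) < 2 or "-" in column:
            continue
        for name, members in PROPERTY_CLASSES.items():
            if column <= members:  # every residue shares this property
                hits.append((i, name))
                break
    return hits


toy_alignment = ["AKDS", "VRDT", "LKEG"]
print(constrained_columns(toy_alignment))
```

Treating the alignment as a plain list of strings keeps the sketch short; the abstract data type described in the paper would wrap operations like this one behind a dedicated interface.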

  13. Comparative genome analysis of Pseudomonas genomes including Populus-associated isolates

    SciTech Connect

    Jun, Se Ran; Wassenaar, Trudy; Nookaew, Intawat; Hauser, Loren John; Wanchai, Visanu; Land, Miriam L.; Timm, Collin M.; Lu, Tse-Yuan S.; Schadt, Christopher Warren; Doktycz, Mitchel John; Pelletier, Dale A; Ussery, David W

    2016-01-01

    The Pseudomonas genus contains a metabolically versatile group of organisms that are known to occupy numerous ecological niches including the rhizosphere and endosphere of many plants influencing phylogenetic diversity and heterogeneity. In this study, comparative genome analysis was performed on over one thousand Pseudomonas genomes, including 21 Pseudomonas strains isolated from the roots of native Populus deltoides. Based on average amino acid identity, genomic clusters were identified within the Pseudomonas genus, which showed agreements with clades by NCBI and cliques by IMG. The P. fluorescens group was organized into 20 distinct genomic clusters, representing enormous diversity and heterogeneity. The species P. aeruginosa showed clear distinction in their genomic relatedness compared to other Pseudomonas species groups based on the pan and core genome analysis. The 19 isolates of our 21 Populus-associated isolates formed three distinct subgroups within the P. fluorescens major group, supported by pathway profiles analysis, while two isolates were more closely related to P. chlororaphis and P. putida. The specific genes to Populus-associated subgroups were identified where genes specific to subgroup 1 include several sensory systems such as proteins which act in two-component signal transduction, a TonB-dependent receptor, and a phosphorelay sensor; specific genes to subgroup 2 contain unique hypothetical genes; and genes specific to subgroup 3 organisms have a different hydrolase activity. IMPORTANCE The comparative genome analyses of the genus Pseudomonas that included Populus-associated isolates resulted in novel insights into high diversity of Pseudomonas. Consistent and robust genomic clusters with phylogenetic homogeneity were identified, which resolved species-clades that are not clearly defined by 16S rRNA gene sequence analysis alone. The genomic clusters may be reflective of distinct ecological niches to which the organisms have adapted, but this

  14. Comparative genome analysis of Pseudomonas genomes including Populus-associated isolates

    DOE PAGESBeta

    Jun, Se Ran; Wassenaar, Trudy; Nookaew, Intawat; Hauser, Loren John; Wanchai, Visanu; Land, Miriam L.; Timm, Collin M.; Lu, Tse-Yuan S.; Schadt, Christopher Warren; Doktycz, Mitchel John; et al

    2016-01-01

    The Pseudomonas genus contains a metabolically versatile group of organisms that are known to occupy numerous ecological niches including the rhizosphere and endosphere of many plants influencing phylogenetic diversity and heterogeneity. In this study, comparative genome analysis was performed on over one thousand Pseudomonas genomes, including 21 Pseudomonas strains isolated from the roots of native Populus deltoides. Based on average amino acid identity, genomic clusters were identified within the Pseudomonas genus, which showed agreements with clades by NCBI and cliques by IMG. The P. fluorescens group was organized into 20 distinct genomic clusters, representing enormous diversity and heterogeneity. The species P. aeruginosa showed clear distinction in their genomic relatedness compared to other Pseudomonas species groups based on the pan and core genome analysis. The 19 isolates of our 21 Populus-associated isolates formed three distinct subgroups within the P. fluorescens major group, supported by pathway profiles analysis, while two isolates were more closely related to P. chlororaphis and P. putida. The specific genes to Populus-associated subgroups were identified where genes specific to subgroup 1 include several sensory systems such as proteins which act in two-component signal transduction, a TonB-dependent receptor, and a phosphorelay sensor; specific genes to subgroup 2 contain unique hypothetical genes; and genes specific to subgroup 3 organisms have a different hydrolase activity. IMPORTANCE The comparative genome analyses of the genus Pseudomonas that included Populus-associated isolates resulted in novel insights into high diversity of Pseudomonas. Consistent and robust genomic clusters with phylogenetic homogeneity were identified, which resolved species-clades that are not clearly defined by 16S rRNA gene sequence analysis alone. The genomic clusters may be reflective of distinct ecological niches to which the organisms have adapted, but

  15. Genome Sequence and Comparative Genome Analysis of Lactobacillus casei: Insights into Their Niche-Associated Evolution

    PubMed Central

    Cai, Hui; Thompson, Rebecca; Budinich, Mateo F.; Broadbent, Jeff R.

    2009-01-01

    Lactobacillus casei is remarkably adaptable to diverse habitats and widely used in the food industry. To reveal the genomic features that contribute to its broad ecological adaptability and examine the evolution of the species, the genome sequence of L. casei ATCC 334 is analyzed and compared with other sequenced lactobacilli. This analysis reveals that ATCC 334 contains a high number of coding sequences involved in carbohydrate utilization and transcriptional regulation, reflecting its requirement for dealing with diverse environmental conditions. A comparison of the genome sequences of ATCC 334 to L. casei BL23 reveals 12 and 19 genomic islands, respectively. For a broader assessment of the genetic variability within L. casei, gene content of 21 L. casei strains isolated from various habitats (cheeses, n = 7; plant materials, n = 8; and human sources, n = 6) was examined by comparative genome hybridization with an ATCC 334-based microarray. This analysis resulted in identification of 25 hypervariable regions. One of these regions contains an overrepresentation of genes involved in carbohydrate utilization and transcriptional regulation and was thus proposed as a lifestyle adaptation island. Differences in L. casei genome inventory reveal both gene gain and gene decay. Gene gain, via acquisition of genomic islands, likely confers a fitness benefit in specific habitats. Gene decay, that is, loss of unnecessary ancestral traits, is observed in the cheese isolates and likely results in enhanced fitness in the dairy niche. This study gives the first picture of the stable versus variable regions in L. casei and provides valuable insights into evolution, lifestyle adaptation, and metabolic diversity of L. casei. PMID:20333194

  16. Genome sequence and comparative genome analysis of Lactobacillus casei: insights into their niche-associated evolution.

    PubMed

    Cai, Hui; Thompson, Rebecca; Budinich, Mateo F; Broadbent, Jeff R; Steele, James L

    2009-01-01

    Lactobacillus casei is remarkably adaptable to diverse habitats and widely used in the food industry. To reveal the genomic features that contribute to its broad ecological adaptability and examine the evolution of the species, the genome sequence of L. casei ATCC 334 is analyzed and compared with other sequenced lactobacilli. This analysis reveals that ATCC 334 contains a high number of coding sequences involved in carbohydrate utilization and transcriptional regulation, reflecting its requirement for dealing with diverse environmental conditions. A comparison of the genome sequences of ATCC 334 to L. casei BL23 reveals 12 and 19 genomic islands, respectively. For a broader assessment of the genetic variability within L. casei, gene content of 21 L. casei strains isolated from various habitats (cheeses, n = 7; plant materials, n = 8; and human sources, n = 6) was examined by comparative genome hybridization with an ATCC 334-based microarray. This analysis resulted in identification of 25 hypervariable regions. One of these regions contains an overrepresentation of genes involved in carbohydrate utilization and transcriptional regulation and was thus proposed as a lifestyle adaptation island. Differences in L. casei genome inventory reveal both gene gain and gene decay. Gene gain, via acquisition of genomic islands, likely confers a fitness benefit in specific habitats. Gene decay, that is, loss of unnecessary ancestral traits, is observed in the cheese isolates and likely results in enhanced fitness in the dairy niche. This study gives the first picture of the stable versus variable regions in L. casei and provides valuable insights into evolution, lifestyle adaptation, and metabolic diversity of L. casei. PMID:20333194

  17. The complete genome sequence and comparative genome analysis of the high pathogenicity Yersinia enterocolitica strain 8081.

    PubMed

    Thomson, Nicholas R; Howard, Sarah; Wren, Brendan W; Holden, Matthew T G; Crossman, Lisa; Challis, Gregory L; Churcher, Carol; Mungall, Karen; Brooks, Karen; Chillingworth, Tracey; Feltwell, Theresa; Abdellah, Zahra; Hauser, Heidi; Jagels, Kay; Maddison, Mark; Moule, Sharon; Sanders, Mandy; Whitehead, Sally; Quail, Michael A; Dougan, Gordon; Parkhill, Julian; Prentice, Michael B

    2006-12-15

    The human enteropathogen, Yersinia enterocolitica, is a significant link in the range of Yersinia pathologies extending from mild gastroenteritis to bubonic plague. Comparison at the genomic level is a key step in our understanding of the genetic basis for this pathogenicity spectrum. Here we report the genome of Y. enterocolitica strain 8081 (serotype O:8; biotype 1B) and extensive microarray data relating to the genetic diversity of the Y. enterocolitica species. Our analysis reveals that the genome of Y. enterocolitica strain 8081 is a patchwork of horizontally acquired genetic loci, including a plasticity zone of 199 kb containing an extraordinarily high density of virulence genes. Microarray analysis has provided insights into species-specific Y. enterocolitica gene functions and the intraspecies differences between the high, low, and nonpathogenic Y. enterocolitica biotypes. Through comparative genome sequence analysis we provide new information on the evolution of the Yersinia. We identify numerous loci that represent ancestral clusters of genes potentially important in enteric survival and pathogenesis, which have been lost or are in the process of being lost, in the other sequenced Yersinia lineages. Our analysis also highlights large metabolic operons in Y. enterocolitica that are absent in the related enteropathogen, Yersinia pseudotuberculosis, indicating major differences in niche and nutrients used within the mammalian gut. These include clusters directing the production of hydrogenases, tetrathionate respiration, cobalamin synthesis, and propanediol utilisation. Along with ancestral gene clusters, the genome of Y. enterocolitica has revealed species-specific and enteropathogen-specific loci. This has provided important insights into the pathology of this bacterium and, more broadly, into the evolution of the genus. Moreover, wider investigations looking at the patterns of gene loss and gain in the Yersinia have highlighted common themes in the

  18. The Complete Genome Sequence and Comparative Genome Analysis of the High Pathogenicity Yersinia enterocolitica Strain 8081

    PubMed Central

    Thomson, Nicholas R; Howard, Sarah; Wren, Brendan W; Holden, Matthew T. G; Crossman, Lisa; Challis, Gregory L; Churcher, Carol; Mungall, Karen; Brooks, Karen; Chillingworth, Tracey; Feltwell, Theresa; Abdellah, Zahra; Hauser, Heidi; Jagels, Kay; Maddison, Mark; Moule, Sharon; Sanders, Mandy; Whitehead, Sally; Quail, Michael A; Dougan, Gordon; Parkhill, Julian; Prentice, Michael B

    2006-01-01

    The human enteropathogen, Yersinia enterocolitica, is a significant link in the range of Yersinia pathologies extending from mild gastroenteritis to bubonic plague. Comparison at the genomic level is a key step in our understanding of the genetic basis for this pathogenicity spectrum. Here we report the genome of Y. enterocolitica strain 8081 (serotype O:8; biotype 1B) and extensive microarray data relating to the genetic diversity of the Y. enterocolitica species. Our analysis reveals that the genome of Y. enterocolitica strain 8081 is a patchwork of horizontally acquired genetic loci, including a plasticity zone of 199 kb containing an extraordinarily high density of virulence genes. Microarray analysis has provided insights into species-specific Y. enterocolitica gene functions and the intraspecies differences between the high, low, and nonpathogenic Y. enterocolitica biotypes. Through comparative genome sequence analysis we provide new information on the evolution of the Yersinia. We identify numerous loci that represent ancestral clusters of genes potentially important in enteric survival and pathogenesis, which have been lost or are in the process of being lost, in the other sequenced Yersinia lineages. Our analysis also highlights large metabolic operons in Y. enterocolitica that are absent in the related enteropathogen, Yersinia pseudotuberculosis, indicating major differences in niche and nutrients used within the mammalian gut. These include clusters directing the production of hydrogenases, tetrathionate respiration, cobalamin synthesis, and propanediol utilisation. Along with ancestral gene clusters, the genome of Y. enterocolitica has revealed species-specific and enteropathogen-specific loci. This has provided important insights into the pathology of this bacterium and, more broadly, into the evolution of the genus. Moreover, wider investigations looking at the patterns of gene loss and gain in the Yersinia have highlighted common themes in the

  19. Comparative Bacterial Proteomics: Analysis of the Core Genome Concept

    SciTech Connect

    Callister, Stephen J.; McCue, Lee Ann; Turse, Josh E.; Monroe, Matthew E.; Auberry, Kenneth J.; Smith, Richard D.; Adkins, Joshua N.; Lipton, Mary S.

    2008-02-06

    Comparative bacterial genomic studies commonly predict a set of genes indicative of common ancestry. Experimental validation of the existence of this core genome requires extensive measurement and is not typically undertaken. Enabled by an extensive proteome database development over a six year period, we experimentally verified the expression of proteins predicted from genomic ortholog comparisons among 17 environmental and pathogenic bacteria. More exclusive relationships were observed among the expressed protein content of phenotypically related bacteria, which is indicative of the specific lifestyles associated with these organisms. While genomic studies establish relative orthologous relationships among a set of bacteria and propose a set of ancestral genes, our proteomics study establishes expressed lifestyle differences among conserved genes and proposes a set of expressed ancestral traits.
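
The distinction the abstract draws between a predicted core genome and an experimentally expressed one can be sketched with plain set operations. This is only an illustration under simplifying assumptions: ortholog groups are taken as already assigned, and the organism names, group identifiers and function names are hypothetical.

```python
def core_genome(orthologs_by_organism):
    """Ortholog groups predicted in every organism (the genomic core)."""
    sets = [set(groups) for groups in orthologs_by_organism.values()]
    return set.intersection(*sets) if sets else set()


def expressed_core(core, detected_by_organism):
    """Core groups whose protein product was detected in every organism."""
    return {
        group
        for group in core
        if all(group in detected for detected in detected_by_organism.values())
    }


# Hypothetical ortholog-group assignments from genome comparison.
predicted = {
    "org_A": {"OG1", "OG2", "OG3", "OG4"},
    "org_B": {"OG1", "OG2", "OG3"},
    "org_C": {"OG1", "OG2", "OG3", "OG5"},
}
# Hypothetical proteomics detections for the same organisms.
detected = {
    "org_A": {"OG1", "OG2", "OG4"},
    "org_B": {"OG1", "OG2", "OG3"},
    "org_C": {"OG1", "OG3"},
}

core = core_genome(predicted)
print(sorted(core))
print(sorted(expressed_core(core, detected)))
```

Requiring detection in every organism is the strictest possible rule; an analysis of real proteome data would more likely need a detection threshold that tolerates missing observations.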

  20. Assigning protein functions by comparative genome analysis protein phylogenetic profiles

    DOEpatents

    Pellegrini, Matteo; Marcotte, Edward M.; Thompson, Michael J.; Eisenberg, David; Grothe, Robert; Yeates, Todd O.

    2003-05-13

    A computational method, system, and computer program are provided for inferring functional links from genome sequences. One method is based on the observation that some pairs of proteins A' and B' have homologs in another organism fused into a single protein chain AB. A trans-genome comparison of sequences can reveal these AB sequences, which are Rosetta Stone sequences because they decipher an interaction between A' and B'. Another method compares the genomic sequences of two or more organisms to create a phylogenetic profile for each protein indicating its presence or absence across all the genomes. The profile provides information regarding functional links between different families of proteins. In yet another method a combination of the above two methods is used to predict functional links.
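
The phylogenetic-profile method described above can be sketched in a few lines: each protein gets a presence/absence vector across the genomes, and proteins with matching vectors become candidates for a functional link. This is a minimal sketch, not the patented implementation. The exact-match grouping rule, the protein and genome names, and the function names are all assumptions made for illustration.

```python
from collections import defaultdict


def phylogenetic_profiles(presence, genomes):
    """Map each protein to a 0/1 tuple marking its presence in each genome."""
    return {
        protein: tuple(int(genome in found_in) for genome in genomes)
        for protein, found_in in presence.items()
    }


def linked_groups(profiles):
    """Group proteins that share an identical profile (candidate functional links)."""
    by_profile = defaultdict(list)
    for protein, profile in profiles.items():
        by_profile[profile].append(protein)
    return [sorted(members) for members in by_profile.values() if len(members) > 1]


# Hypothetical input: the genomes in which a homolog of each protein was found.
genomes = ["genome_1", "genome_2", "genome_3"]
presence = {
    "protein_P1": {"genome_1", "genome_3"},
    "protein_P2": {"genome_1", "genome_3"},
    "protein_P3": {"genome_2"},
}

profiles = phylogenetic_profiles(presence, genomes)
print(profiles["protein_P1"])
print(linked_groups(profiles))
```

Exact matching is the simplest possible rule; allowing profiles to differ in a small number of positions would tolerate homologs that a sequence search missed.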

  1. Comparative genome analysis and genome-guided physiological analysis of Roseobacter litoralis

    PubMed Central

    2011-01-01

    Background Roseobacter litoralis OCh149, the type species of the genus, and Roseobacter denitrificans OCh114 were the first described organisms of the Roseobacter clade, an ecologically important group of marine bacteria. Both species were isolated from seaweed and are able to perform aerobic anoxygenic photosynthesis. Results The genome of R. litoralis OCh149 contains one circular chromosome of 4,505,211 bp and three plasmids of 93,578 bp (pRLO149_94), 83,129 bp (pRLO149_83) and 63,532 bp (pRLO149_63). Of the 4537 genes predicted for R. litoralis, 1122 (24.7%) are not present in the genome of R. denitrificans. Many of the unique genes of R. litoralis are located in genomic islands and on plasmids. On pRLO149_83 several potential heavy metal resistance genes are encoded which are not present in the genome of R. denitrificans. The comparison of the heavy metal tolerance of the two organisms showed an increased zinc tolerance of R. litoralis. In contrast to R. denitrificans, the photosynthesis genes of R. litoralis are plasmid encoded. The activity of the photosynthetic apparatus was confirmed by respiration rate measurements, indicating a growth-phase dependent response to light. Comparative genomics with other members of the Roseobacter clade revealed several genomic regions that were only conserved in the two Roseobacter species. One of those regions encodes a variety of genes that might play a role in host association of the organisms. The catabolism of different carbon and nitrogen sources was predicted from the genome and combined with experimental data. In several cases, e.g. the degradation of some algal osmolytes and sugars, the genome-derived predictions of the metabolic pathways in R. litoralis differed from the phenotype. Conclusions The genomic differences between the two Roseobacter species are mainly due to lateral gene transfer and genomic rearrangements. Plasmid pRLO149_83 contains predominantly recently acquired genetic material whereas pRLO149

  2. Initial sequence and comparative analysis of the cat genome

    PubMed Central

    Pontius, Joan U.; Mullikin, James C.; Smith, Douglas R.; Lindblad-Toh, Kerstin; Gnerre, Sante; Clamp, Michele; Chang, Jean; Stephens, Robert; Neelam, Beena; Volfovsky, Natalia; Schäffer, Alejandro A.; Agarwala, Richa; Narfström, Kristina; Murphy, William J.; Giger, Urs; Roca, Alfred L.; Antunes, Agostinho; Menotti-Raymond, Marilyn; Yuhki, Naoya; Pecon-Slattery, Jill; Johnson, Warren E.; Bourque, Guillaume; Tesler, Glenn; O’Brien, Stephen J.

    2007-01-01

    The genome sequence (1.9-fold coverage) of an inbred Abyssinian domestic cat was assembled, mapped, and annotated with a comparative approach that involved cross-reference to annotated genome assemblies of six mammals (human, chimpanzee, mouse, rat, dog, and cow). The results resolved chromosomal positions for 663,480 contigs, 20,285 putative feline gene orthologs, and 133,499 conserved sequence blocks (CSBs). Additional annotated features include repetitive elements, endogenous retroviral sequences, nuclear mitochondrial (numt) sequences, micro-RNAs, and evolutionary breakpoints that suggest historic balancing of translocation and inversion incidences in distinct mammalian lineages. Large numbers of single nucleotide polymorphisms (SNPs), deletion insertion polymorphisms (DIPs), and short tandem repeats (STRs), suitable for linkage or association studies were characterized in the context of long stretches of chromosome homozygosity. In spite of the light coverage capturing ∼65% of euchromatin sequence from the cat genome, these comparative insights shed new light on the tempo and mode of gene/genome evolution in mammals, promise several research applications for the cat, and also illustrate that a comparative approach using more deeply covered mammals provides an informative, preliminary annotation of a light (1.9-fold) coverage mammal genome sequence. PMID:17975172

  3. Sequence and comparative genomic analysis of actin-related proteins.

    PubMed

    Muller, Jean; Oma, Yukako; Vallar, Laurent; Friederich, Evelyne; Poch, Olivier; Winsor, Barbara

    2005-12-01

    Actin-related proteins (ARPs) are key players in cytoskeleton activities and nuclear functions. Two complexes, ARP2/3 and ARP1/11, also known as dynactin, are implicated in actin dynamics and in microtubule-based trafficking, respectively. ARP4 to ARP9 are components of many chromatin-modulating complexes. Conventional actins and ARPs codefine a large family of homologous proteins, the actin superfamily, with a tertiary structure known as the actin fold. Because ARPs and actin share high sequence conservation, clear family definition requires distinct features to easily and systematically identify each subfamily. In this study we performed an in depth sequence and comparative genomic analysis of ARP subfamilies. A high-quality multiple alignment of approximately 700 complete protein sequences homologous to actin, including 148 ARP sequences, allowed us to extend the ARP classification to new organisms. Sequence alignments revealed conserved residues, motifs, and inserted sequence signatures to define each ARP subfamily. These discriminative characteristics allowed us to develop ARPAnno (http://bips.u-strasbg.fr/ARPAnno), a new web server dedicated to the annotation of ARP sequences. Analyses of sequence conservation among actins and ARPs highlight part of the actin fold and suggest interactions between ARPs and actin-binding proteins. Finally, analysis of ARP distribution across eukaryotic phyla emphasizes the central importance of nuclear ARPs, particularly the multifunctional ARP4. PMID:16195354

  4. Comparative Analysis of Genome Diversity in Bullmastiff Dogs

    PubMed Central

    Mortlock, Sally-Anne; Khatkar, Mehar S.; Williamson, Peter

    2016-01-01

    Management and preservation of genomic diversity in dog breeds is a major objective for maintaining health. The present study was undertaken to characterise genomic diversity in Bullmastiff dogs using both genealogical and molecular analysis. Genealogical analysis of diversity was conducted using a database consisting of 16,378 Bullmastiff pedigrees from year 1980 to 2013. Additionally, a total of 188 Bullmastiff dogs were genotyped using the 170,000 SNP Illumina CanineHD Beadchip. Genealogical parameters revealed a mean inbreeding coefficient of 0.047; 142 total founders (f); an effective number of founders (fe) of 79; an effective number of ancestors (fa) of 62; and an effective population size of the reference population of 41. Genetic diversity and the degree of genome-wide homogeneity within the breed were also investigated using molecular data. Multiple-locus heterozygosity (MLH) was equal to 0.206; runs of homozygosity (ROH) as proportion of the genome, averaged 16.44%; effective population size was 29.1, with an average inbreeding coefficient of 0.035, all estimated using SNP Data. Fine-scale population structure was analysed using NETVIEW, a population analysis pipeline. Visualisation of the high definition network captured relationships among individuals within and between subpopulations. Effects of unequal founder use, and ancestral inbreeding and selection, were evident. While current levels of Bullmastiff heterozygosity, inbreeding and homozygosity are not unusual, a relatively small effective population size indicates that a breeding strategy to reduce the inbreeding rate may be beneficial. PMID:26824579
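    Two of the quantities reported in this record can be sketched in a few lines. The segment coordinates and the toy genome length are hypothetical, and the relation delta F = 1 / (2 * Ne) is the textbook idealized-population formula, used here as an assumption rather than as the authors' estimator.

```python
def roh_proportion(roh_segments, genome_length):
    """Fraction of the genome covered by runs of homozygosity.

    roh_segments: iterable of (start, end) positions in bp, assumed non-overlapping.
    """
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    covered = sum(end - start for start, end in roh_segments)
    return covered / genome_length


def inbreeding_rate(effective_population_size):
    """Expected per-generation increase in inbreeding, delta F = 1 / (2 * Ne)."""
    return 1.0 / (2.0 * effective_population_size)


# Hypothetical ROH calls for one dog on a toy 1,000,000 bp genome.
segments = [(10_000, 60_000), (200_000, 310_000)]
print(roh_proportion(segments, 1_000_000))

# Pedigree-based effective population size reported in the abstract (41).
print(inbreeding_rate(41))
```

    Under that idealized relation, a smaller effective population size implies a faster rise in inbreeding per generation, which is the concern the abstract raises.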

  5. Comparative Analysis of Genome Diversity in Bullmastiff Dogs.

    PubMed

    Mortlock, Sally-Anne; Khatkar, Mehar S; Williamson, Peter

    2016-01-01

    Management and preservation of genomic diversity in dog breeds is a major objective for maintaining health. The present study was undertaken to characterise genomic diversity in Bullmastiff dogs using both genealogical and molecular analysis. Genealogical analysis of diversity was conducted using a database consisting of 16,378 Bullmastiff pedigrees from year 1980 to 2013. Additionally, a total of 188 Bullmastiff dogs were genotyped using the 170,000 SNP Illumina CanineHD Beadchip. Genealogical parameters revealed a mean inbreeding coefficient of 0.047; 142 total founders (f); an effective number of founders (fe) of 79; an effective number of ancestors (fa) of 62; and an effective population size of the reference population of 41. Genetic diversity and the degree of genome-wide homogeneity within the breed were also investigated using molecular data. Multiple-locus heterozygosity (MLH) was equal to 0.206; runs of homozygosity (ROH) as proportion of the genome, averaged 16.44%; effective population size was 29.1, with an average inbreeding coefficient of 0.035, all estimated using SNP Data. Fine-scale population structure was analysed using NETVIEW, a population analysis pipeline. Visualisation of the high definition network captured relationships among individuals within and between subpopulations. Effects of unequal founder use, and ancestral inbreeding and selection, were evident. While current levels of Bullmastiff heterozygosity, inbreeding and homozygosity are not unusual, a relatively small effective population size indicates that a breeding strategy to reduce the inbreeding rate may be beneficial. PMID:26824579

  6. Comparative Analysis of Codon Usage Bias Patterns in Microsporidian Genomes

    PubMed Central

    Xiang, Heng; Zhang, Ruizhi; Butler, Robert R.; Liu, Tie; Zhang, Li; Pombert, Jean-François; Zhou, Zeyang

    2015-01-01

    The sub-3 Mbp genomes from microsporidian species of the Encephalitozoon genus are the smallest known among eukaryotes and paragons of genomic reduction and compaction in parasites. However, their diminutive stature is not characteristic of all Microsporidia, whose genome sizes vary by an order of magnitude. This large variability suggests that different evolutionary forces are applied on the group as a whole. In this study, we have compared the codon usage bias (CUB) between eight taxonomically distinct microsporidian genomes: Encephalitozoon intestinalis, Encephalitozoon cuniculi, Spraguea lophii, Trachipleistophora hominis, Enterocytozoon bieneusi, Nematocida parisii, Nosema bombycis and Nosema ceranae. While the CUB was found to be weak in all eight Microsporidia, nearly all (98%) of the optimal codons in S. lophii, T. hominis, E. bieneusi, N. parisii, N. bombycis and N. ceranae favor A/U in the third position whereas most (64.6%) optimal codons in the Encephalitozoon species E. intestinalis and E. cuniculi are biased towards G/C. Although nucleotide composition biases are likely the main factor driving the CUB in Microsporidia according to correlation analyses, directed mutational pressure also likely affects the CUB as suggested by ENc-plots, correspondence and neutrality analyses. Overall, the Encephalitozoon genomes were found to be markedly different from the other microsporidians and, despite being the first sequenced representatives of this lineage, are uncharacteristic of the group as a whole. The disparities observed cannot be attributed solely to differences in host specificity and we hypothesize that other forces are at play in the lineage leading to Encephalitozoon species. PMID:26057384
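    The third-position bias described in this record (A/U versus G/C) is often summarized as GC3, the fraction of codons ending in G or C. The sketch below is a generic illustration with made-up sequences; it is not the authors' pipeline, which relied on ENc-plots, correspondence analysis, and neutrality analysis.

```python
def gc3(cds):
    """Fraction of codons whose third base is G or C.

    cds: coding sequence as a DNA string, assumed to start in frame.
    Trailing bases that do not complete a codon are ignored.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3
    thirds = seq[2:usable:3]  # every third base, starting at codon position 3
    if not thirds:
        raise ValueError("sequence shorter than one codon")
    return sum(base in "GC" for base in thirds) / len(thirds)


# Hypothetical toy sequences: one A/T-ending, one G/C-ending.
at_rich = "GCAGCTAAATTT"  # codons GCA GCT AAA TTT
gc_rich = "GCCGCGAAGTTC"  # codons GCC GCG AAG TTC
print(gc3(at_rich), gc3(gc_rich))
```

    A genome whose optimal codons favor A/U at the third position would be expected to show low GC3 values across its coding sequences, and the reverse for a G/C-biased genome.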

  7. Comparative Analysis of Codon Usage Bias Patterns in Microsporidian Genomes.

    PubMed

    Xiang, Heng; Zhang, Ruizhi; Butler, Robert R; Liu, Tie; Zhang, Li; Pombert, Jean-François; Zhou, Zeyang

    2015-01-01

    The sub-3 Mbp genomes from microsporidian species of the Encephalitozoon genus are the smallest known among eukaryotes and paragons of genomic reduction and compaction in parasites. However, their diminutive stature is not characteristic of all Microsporidia, whose genome sizes vary by an order of magnitude. This large variability suggests that different evolutionary forces are applied on the group as a whole. In this study, we have compared the codon usage bias (CUB) between eight taxonomically distinct microsporidian genomes: Encephalitozoon intestinalis, Encephalitozoon cuniculi, Spraguea lophii, Trachipleistophora hominis, Enterocytozoon bieneusi, Nematocida parisii, Nosema bombycis and Nosema ceranae. While the CUB was found to be weak in all eight Microsporidia, nearly all (98%) of the optimal codons in S. lophii, T. hominis, E. bieneusi, N. parisii, N. bombycis and N. ceranae favor A/U in the third position whereas most (64.6%) optimal codons in the Encephalitozoon species E. intestinalis and E. cuniculi are biased towards G/C. Although nucleotide composition biases are likely the main factor driving the CUB in Microsporidia according to correlation analyses, directed mutational pressure also likely affects the CUB as suggested by ENc-plots, correspondence and neutrality analyses. Overall, the Encephalitozoon genomes were found to be markedly different from the other microsporidians and, despite being the first sequenced representatives of this lineage, are uncharacteristic of the group as a whole. The disparities observed cannot be attributed solely to differences in host specificity and we hypothesize that other forces are at play in the lineage leading to Encephalitozoon species. PMID:26057384

  8. Genome-wide comparative analysis of the Brassica rapa gene space reveals genome shrinkage and differential loss of duplicated genes after whole genome triplication

    PubMed Central

    2009-01-01

    Background Brassica rapa is one of the most economically important vegetable crops worldwide. Owing to its agronomic importance and phylogenetic position, B. rapa provides a crucial reference to understand polyploidy-related crop genome evolution. The high degree of sequence identity and remarkably conserved genome structure between Arabidopsis and Brassica genomes enables comparative tiling sequencing using Arabidopsis sequences as references to select the counterpart regions in B. rapa, which is a strong challenge of structural and comparative crop genomics. Results We assembled 65.8 megabase-pairs of non-redundant euchromatic sequence of B. rapa and compared this sequence to the Arabidopsis genome to investigate chromosomal relationships, macrosynteny blocks, and microsynteny within blocks. The triplicated B. rapa genome contains only approximately twice the number of genes as in Arabidopsis because of genome shrinkage. Genome comparisons suggest that B. rapa has a distinct organization of ancestral genome blocks as a result of recent whole genome triplication followed by a unique diploidization process. A lack of the most recent whole genome duplication (3R) event in the B. rapa genome, atypical of other Brassica genomes, may account for the emergence of B. rapa from the Brassica progenitor around 8 million years ago. Conclusions This work demonstrates the potential of using comparative tiling sequencing for genome analysis of crop species. Based on a comparative analysis of the B. rapa sequences and the Arabidopsis genome, it appears that polyploidy and chromosomal diploidization are ongoing processes that collectively stabilize the B. rapa genome and facilitate its evolution. PMID:19821981

  9. Ensembl comparative genomics resources.

    PubMed

    Herrero, Javier; Muffato, Matthieu; Beal, Kathryn; Fitzgerald, Stephen; Gordon, Leo; Pignatelli, Miguel; Vilella, Albert J; Searle, Stephen M J; Amode, Ridwan; Brent, Simon; Spooner, William; Kulesha, Eugene; Yates, Andrew; Flicek, Paul

    2016-01-01

    Evolution provides the unifying framework with which to understand biology. The coherent investigation of genic and genomic data often requires comparative genomics analyses based on whole-genome alignments, sets of homologous genes and other relevant datasets in order to evaluate and answer evolutionary-related questions. However, the complexity and computational requirements of producing such data are substantial: this has led to only a small number of reference resources that are used for most comparative analyses. The Ensembl comparative genomics resources are one such reference set that facilitates comprehensive and reproducible analysis of chordate genome data. Ensembl computes pairwise and multiple whole-genome alignments from which large-scale synteny, per-base conservation scores and constrained elements are obtained. Gene alignments are used to define Ensembl Protein Families, GeneTrees and homologies for both protein-coding and non-coding RNA genes. These resources are updated frequently and have a consistent informatics infrastructure and data presentation across all supported species. Specialized web-based visualizations are also available including synteny displays, collapsible gene tree plots, a gene family locator and different alignment views. The Ensembl comparative genomics infrastructure is extensively reused for the analysis of non-vertebrate species by other projects including Ensembl Genomes and Gramene and much of the information here is relevant to these projects. The consistency of the annotation across species and the focus on vertebrates makes Ensembl an ideal system to perform and support vertebrate comparative genomic analyses. We use robust software and pipelines to produce reference comparative data and make it freely available. Database URL: http://www.ensembl.org. PMID:26896847

  10. Ensembl comparative genomics resources

    PubMed Central

    Muffato, Matthieu; Beal, Kathryn; Fitzgerald, Stephen; Gordon, Leo; Pignatelli, Miguel; Vilella, Albert J.; Searle, Stephen M. J.; Amode, Ridwan; Brent, Simon; Spooner, William; Kulesha, Eugene; Yates, Andrew; Flicek, Paul

    2016-01-01

    Evolution provides the unifying framework with which to understand biology. The coherent investigation of genic and genomic data often requires comparative genomics analyses based on whole-genome alignments, sets of homologous genes and other relevant datasets in order to evaluate and answer evolutionary-related questions. However, the complexity and computational requirements of producing such data are substantial: this has led to only a small number of reference resources that are used for most comparative analyses. The Ensembl comparative genomics resources are one such reference set that facilitates comprehensive and reproducible analysis of chordate genome data. Ensembl computes pairwise and multiple whole-genome alignments from which large-scale synteny, per-base conservation scores and constrained elements are obtained. Gene alignments are used to define Ensembl Protein Families, GeneTrees and homologies for both protein-coding and non-coding RNA genes. These resources are updated frequently and have a consistent informatics infrastructure and data presentation across all supported species. Specialized web-based visualizations are also available including synteny displays, collapsible gene tree plots, a gene family locator and different alignment views. The Ensembl comparative genomics infrastructure is extensively reused for the analysis of non-vertebrate species by other projects including Ensembl Genomes and Gramene and much of the information here is relevant to these projects. The consistency of the annotation across species and the focus on vertebrates makes Ensembl an ideal system to perform and support vertebrate comparative genomic analyses. We use robust software and pipelines to produce reference comparative data and make it freely available. Database URL: http://www.ensembl.org. PMID:26896847

  11. Delineation of Steroid-Degrading Microorganisms through Comparative Genomic Analysis

    PubMed Central

    Bergstrand, Lee H.; Cardenas, Erick; Holert, Johannes; Van Hamme, Jonathan D.

    2016-01-01

    ABSTRACT Steroids are ubiquitous in natural environments and are a significant growth substrate for microorganisms. Microbial steroid metabolism is also important for some pathogens and for biotechnical applications. This study delineated the distribution of aerobic steroid catabolism pathways among over 8,000 microorganisms whose genomes are available in the NCBI RefSeq database. Combined analysis of bacterial, archaeal, and fungal genomes with both hidden Markov models and reciprocal BLAST identified 265 putative steroid degraders within only Actinobacteria and Proteobacteria, which mainly originated from soil, eukaryotic host, and aquatic environments. These bacteria include members of 17 genera not previously known to contain steroid degraders. A pathway for cholesterol degradation was conserved in many actinobacterial genera, particularly in members of the Corynebacterineae, and a pathway for cholate degradation was conserved in members of the genus Rhodococcus. A pathway for testosterone and, sometimes, cholate degradation had a patchy distribution among Proteobacteria. The steroid degradation genes tended to occur within large gene clusters. Growth experiments confirmed bioinformatic predictions of steroid metabolism capacity in nine bacterial strains. The results indicate there was a single ancestral 9,10-seco-steroid degradation pathway. Gene duplication, likely in a progenitor of Rhodococcus, later gave rise to a cholate degradation pathway. Proteobacteria and additional Actinobacteria subsequently obtained a cholate degradation pathway via horizontal gene transfer, in some cases facilitated by plasmids. Catabolism of steroids appears to be an important component of the ecological niches of broad groups of Actinobacteria and individual species of Proteobacteria. PMID:26956583
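    The reciprocal BLAST step mentioned in this record is commonly implemented as a reciprocal-best-hit filter. The sketch below assumes the hits have already been parsed into (query, subject, score) tuples; the gene identifiers and scores are hypothetical, and the study's actual thresholds and hidden Markov model step are not reproduced.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    hits: iterable of (query, subject, score) tuples.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Pairs (a, b) where b is a's best hit and a is b's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Hypothetical hits: reference pathway genes (a*) against a target genome (b*).
a_vs_b = [("a1", "b1", 200), ("a1", "b2", 150), ("a2", "b2", 90), ("a3", "b1", 120)]
b_vs_a = [("b1", "a1", 195), ("b2", "a1", 140), ("b2", "a2", 80)]
print(reciprocal_best_hits(a_vs_b, b_vs_a))
```

    Here a2 finds b2 as its best hit, but b2 prefers a1, so only the a1/b1 pair survives the reciprocity check.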

  12. The tiger genome and comparative analysis with lion and snow leopard genomes.

    PubMed

    Cho, Yun Sung; Hu, Li; Hou, Haolong; Lee, Hang; Xu, Jiaohui; Kwon, Soowhan; Oh, Sukhun; Kim, Hak-Min; Jho, Sungwoong; Kim, Sangsoo; Shin, Young-Ah; Kim, Byung Chul; Kim, Hyunmin; Kim, Chang-Uk; Luo, Shu-Jin; Johnson, Warren E; Koepfli, Klaus-Peter; Schmidt-Küntzel, Anne; Turner, Jason A; Marker, Laurie; Harper, Cindy; Miller, Susan M; Jacobs, Wilhelm; Bertola, Laura D; Kim, Tae Hyung; Lee, Sunghoon; Zhou, Qian; Jung, Hyun-Ju; Xu, Xiao; Gadhvi, Priyvrat; Xu, Pengwei; Xiong, Yingqi; Luo, Yadan; Pan, Shengkai; Gou, Caiyun; Chu, Xiuhui; Zhang, Jilin; Liu, Sanyang; He, Jing; Chen, Ying; Yang, Linfeng; Yang, Yulan; He, Jiaju; Liu, Sha; Wang, Junyi; Kim, Chul Hong; Kwak, Hwanjong; Kim, Jong-Soo; Hwang, Seungwoo; Ko, Junsu; Kim, Chang-Bae; Kim, Sangtae; Bayarlkhagva, Damdin; Paek, Woon Kee; Kim, Seong-Jin; O'Brien, Stephen J; Wang, Jun; Bhak, Jong

    2013-01-01

    Tigers and their close relatives (Panthera) are some of the world's most endangered species. Here we report the de novo assembly of an Amur tiger whole-genome sequence as well as the genomic sequences of a white Bengal tiger, African lion, white African lion and snow leopard. Through comparative genetic analyses of these genomes, we find genetic signatures that may reflect molecular adaptations consistent with the big cats' hypercarnivorous diet and muscle strength. We report a snow leopard-specific genetic determinant in EGLN1 (Met39>Lys39), which is likely to be associated with adaptation to high altitude. We also detect a TYR260G>A mutation likely responsible for the white lion coat colour. Tiger and cat genomes show similar repeat composition and an appreciably conserved synteny. Genomic data from the five big cats provide an invaluable resource for resolving easily identifiable phenotypes evident in very close, but distinct, species. PMID:24045858

  13. The tiger genome and comparative analysis with lion and snow leopard genomes

    PubMed Central

    Cho, Yun Sung; Hu, Li; Hou, Haolong; Lee, Hang; Xu, Jiaohui; Kwon, Soowhan; Oh, Sukhun; Kim, Hak-Min; Jho, Sungwoong; Kim, Sangsoo; Shin, Young-Ah; Kim, Byung Chul; Kim, Hyunmin; Kim, Chang-uk; Luo, Shu-Jin; Johnson, Warren E.; Koepfli, Klaus-Peter; Schmidt-Küntzel, Anne; Turner, Jason A.; Marker, Laurie; Harper, Cindy; Miller, Susan M.; Jacobs, Wilhelm; Bertola, Laura D.; Kim, Tae Hyung; Lee, Sunghoon; Zhou, Qian; Jung, Hyun-Ju; Xu, Xiao; Gadhvi, Priyvrat; Xu, Pengwei; Xiong, Yingqi; Luo, Yadan; Pan, Shengkai; Gou, Caiyun; Chu, Xiuhui; Zhang, Jilin; Liu, Sanyang; He, Jing; Chen, Ying; Yang, Linfeng; Yang, Yulan; He, Jiaju; Liu, Sha; Wang, Junyi; Kim, Chul Hong; Kwak, Hwanjong; Kim, Jong-Soo; Hwang, Seungwoo; Ko, Junsu; Kim, Chang-Bae; Kim, Sangtae; Bayarlkhagva, Damdin; Paek, Woon Kee; Kim, Seong-Jin; O’Brien, Stephen J.; Wang, Jun; Bhak, Jong

    2013-01-01

    Tigers and their close relatives (Panthera) are some of the world’s most endangered species. Here we report the de novo assembly of an Amur tiger whole-genome sequence as well as the genomic sequences of a white Bengal tiger, African lion, white African lion and snow leopard. Through comparative genetic analyses of these genomes, we find genetic signatures that may reflect molecular adaptations consistent with the big cats’ hypercarnivorous diet and muscle strength. We report a snow leopard-specific genetic determinant in EGLN1 (Met39>Lys39), which is likely to be associated with adaptation to high altitude. We also detect a TYR260G>A mutation likely responsible for the white lion coat colour. Tiger and cat genomes show similar repeat composition and an appreciably conserved synteny. Genomic data from the five big cats provide an invaluable resource for resolving easily identifiable phenotypes evident in very close, but distinct, species. PMID:24045858

  14. Comparative genomic analysis of hyperthermophilic archaeal Fuselloviridae viruses

    SciTech Connect

    B. Wiedenheft; K. Stedman; F. Roberto; D. Willits; A. K. Gleske; L. Zoeller; J. Snyder; T. Douglas; M. Young

    2004-02-01

    The complete genome sequences of two Sulfolobus spindle-shaped viruses (SSVs) from acidic hot springs in Kamchatka (Russia) and Yellowstone National Park (United States) have been determined. These nonlytic temperate viruses were isolated from hyperthermophilic Sulfolobus hosts, and both viruses share the spindle-shaped morphology characteristic of the Fuselloviridae family. These two genomes, in combination with the previously determined SSV1 genome from Japan and the SSV2 genome from Iceland, have allowed us to carry out a phylogenetic comparison of these geographically distributed hyperthermal viruses. Each virus contains a circular double-stranded DNA genome of approximately 15 kbp with approximately 34 open reading frames (ORFs). These Fusellovirus ORFs show little or no similarity to genes in the public databases. In contrast, 18 ORFs are common to all four isolates and may represent the minimal gene set defining this viral group. In general, ORFs on one half of the genome are colinear and highly conserved, while ORFs on the other half are not. One shared ORF among all four genomes is an integrase of the tyrosine recombinase family. All four viral genomes integrate into their host tRNA genes. The specific tRNA gene used for integration varies, and one genome integrates into multiple loci. Several unique ORFs are found in the genome of each isolate.

  15. Comparative genomic analysis of hyperthermophilic archaeal Fuselloviridae viruses.

    PubMed

    Wiedenheft, Blake; Stedman, Kenneth; Roberto, Francisco; Willits, Deborah; Gleske, Anne-Kathrin; Zoeller, Luisa; Snyder, Jamie; Douglas, Trevor; Young, Mark

    2004-02-01

    The complete genome sequences of two Sulfolobus spindle-shaped viruses (SSVs) from acidic hot springs in Kamchatka (Russia) and Yellowstone National Park (United States) have been determined. These nonlytic temperate viruses were isolated from hyperthermophilic Sulfolobus hosts, and both viruses share the spindle-shaped morphology characteristic of the Fuselloviridae family. These two genomes, in combination with the previously determined SSV1 genome from Japan and the SSV2 genome from Iceland, have allowed us to carry out a phylogenetic comparison of these geographically distributed hyperthermal viruses. Each virus contains a circular double-stranded DNA genome of approximately 15 kbp with approximately 34 open reading frames (ORFs). These Fusellovirus ORFs show little or no similarity to genes in the public databases. In contrast, 18 ORFs are common to all four isolates and may represent the minimal gene set defining this viral group. In general, ORFs on one half of the genome are colinear and highly conserved, while ORFs on the other half are not. One shared ORF among all four genomes is an integrase of the tyrosine recombinase family. All four viral genomes integrate into their host tRNA genes. The specific tRNA gene used for integration varies, and one genome integrates into multiple loci. Several unique ORFs are found in the genome of each isolate. PMID:14747560

  16. IMG 4 version of the integrated microbial genomes comparative analysis system

    SciTech Connect

    Markowitz, Victor M.; Chen, I-Min A.; Palaniappan, Krishna; Chu, Ken; Szeto, Ernest; Pillay, Manoj; Ratner, Anna; Huang, Jinghua; Woyke, Tanja; Huntemann, Marcel; Anderson, Iain; Billis, Konstantinos; Varghese, Neha; Mavromatis, Konstantinos; Pati, Amrita; Ivanova, Natalia N.; Kyrpides, Nikos C.

    2013-10-27

    The Integrated Microbial Genomes (IMG) data warehouse integrates genomes from all three domains of life, as well as plasmids, viruses and genome fragments. IMG provides tools for analyzing and reviewing the structural and functional annotations of genomes in a comparative context. IMG’s data content and analytical capabilities have increased continuously since its first version released in 2005. Since the last report published in the 2012 NAR Database Issue, IMG’s annotation and data integration pipelines have evolved while new tools have been added for recording and analyzing single cell genomes, RNA Seq and biosynthetic cluster data. Finally, different IMG datamarts provide support for the analysis of publicly available genomes (IMG/W: http://img.jgi.doe.gov/w), expert review of genome annotations (IMG/ER: http://img.jgi.doe.gov/er) and teaching and training in the area of microbial genome analysis (IMG/EDU: http://img.jgi.doe.gov/edu).

  17. IMG 4 version of the integrated microbial genomes comparative analysis system.

    PubMed

    Markowitz, Victor M; Chen, I-Min A; Palaniappan, Krishna; Chu, Ken; Szeto, Ernest; Pillay, Manoj; Ratner, Anna; Huang, Jinghua; Woyke, Tanja; Huntemann, Marcel; Anderson, Iain; Billis, Konstantinos; Varghese, Neha; Mavromatis, Konstantinos; Pati, Amrita; Ivanova, Natalia N; Kyrpides, Nikos C

    2014-01-01

    The Integrated Microbial Genomes (IMG) data warehouse integrates genomes from all three domains of life, as well as plasmids, viruses and genome fragments. IMG provides tools for analyzing and reviewing the structural and functional annotations of genomes in a comparative context. IMG's data content and analytical capabilities have increased continuously since its first version released in 2005. Since the last report published in the 2012 NAR Database Issue, IMG's annotation and data integration pipelines have evolved while new tools have been added for recording and analyzing single cell genomes, RNA Seq and biosynthetic cluster data. Different IMG datamarts provide support for the analysis of publicly available genomes (IMG/W: http://img.jgi.doe.gov/w), expert review of genome annotations (IMG/ER: http://img.jgi.doe.gov/er) and teaching and training in the area of microbial genome analysis (IMG/EDU: http://img.jgi.doe.gov/edu). PMID:24165883

  18. Comparative Genomic Analysis of Meningitis- and Bacteremia-Causing Pneumococci Identifies a Common Core Genome.

    PubMed

    Kulohoma, Benard W; Cornick, Jennifer E; Chaguza, Chrispin; Yalcin, Feyruz; Harris, Simon R; Gray, Katherine J; Kiran, Anmol M; Molyneux, Elizabeth; French, Neil; Parkhill, Julian; Faragher, Brian E; Everett, Dean B; Bentley, Stephen D; Heyderman, Robert S

    2015-10-01

    Streptococcus pneumoniae is a nasopharyngeal commensal that occasionally invades normally sterile sites to cause bloodstream infection and meningitis. Although the pneumococcal population structure and evolutionary genetics are well defined, it is not clear whether pneumococci that cause meningitis are genetically distinct from those that do not. Here, we used whole-genome sequencing of 140 isolates of S. pneumoniae recovered from bloodstream infection (n = 70) and meningitis (n = 70) to compare their genetic contents. By fitting a double-exponential decaying-function model, we show that these isolates share a core of 1,427 genes (95% confidence interval [CI], 1,425 to 1,435 genes) and that there is no difference in the core genome or accessory gene content from these disease manifestations. Gene presence/absence alone therefore does not explain the virulence behavior of pneumococci that reach the meninges. Our analysis, however, supports the requirement of a range of previously described virulence factors and vaccine candidates for both meningitis- and bacteremia-causing pneumococci. This high-resolution view suggests that, despite considerable competency for genetic exchange, all pneumococci are under considerable pressure to retain key components advantageous for colonization and transmission and that these components are essential for access to and survival in sterile sites. PMID:26259813
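
    The core and accessory gene sets mentioned in this abstract can be illustrated with plain set operations. The sketch below is only an illustration under stated assumptions: it is not the double-exponential fitting procedure the authors used, and the isolate names and gene identifiers are hypothetical placeholders.

```python
def core_and_accessory(gene_sets):
    """Split gene content into core (present in every isolate) and accessory.

    gene_sets maps an isolate name to the set of gene identifiers found in it.
    All names and identifiers used with this function here are hypothetical.
    """
    if not gene_sets:
        raise ValueError("at least one isolate is required")
    sets = [set(genes) for genes in gene_sets.values()]
    pan = set().union(*sets)          # every gene seen in any isolate
    core = set.intersection(*sets)    # genes shared by all isolates
    return core, pan - core


# Toy input, invented for illustration only.
toy_isolates = {
    "isolate_1": {"geneA", "geneB", "geneC"},
    "isolate_2": {"geneA", "geneB", "geneD"},
    "isolate_3": {"geneA", "geneB"},
}
core, accessory = core_and_accessory(toy_isolates)
```

    Comparing the accessory sets of two groups of isolates (for example bloodstream versus meningitis) would be one simple way to look for group-specific gene content, which is the kind of difference the study reports not finding.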

  19. Sequencing and Comparative Genome Analysis of Two Pathogenic Streptococcus gallolyticus Subspecies: Genome Plasticity, Adaptation and Virulence

    PubMed Central

    Teng, Yu-Ting; Wu, Hui-Lun; Liu, Yen-Ming; Wu, Keh-Ming; Chang, Chuan-Hsiung; Hsu, Ming-Ta

    2011-01-01

    Streptococcus gallolyticus infections in humans are often associated with bacteremia, infective endocarditis and colon cancers. The disease manifestations differ depending on the subspecies of S. gallolyticus causing the infection. Here, we present the complete genomes of S. gallolyticus ATCC 43143 (biotype I) and S. pasteurianus ATCC 43144 (biotype II.2). The genomic differences between the two biotypes were characterized with comparative genomic analyses. The chromosomes of ATCC 43143 and ATCC 43144 are 2.36 and 2.10 Mb in length and encode 2246 and 1869 CDS, respectively. The organization and genomic contents of both genomes were most similar to the recently published S. gallolyticus UCN34, where 2073 (92%) and 1607 (86%) of the ATCC 43143 and ATCC 43144 CDS were conserved in UCN34, respectively. There are around 600 CDS conserved in all Streptococcus genomes, indicating that the Streptococcus genus has a small core genome (constituting around 30% of total CDS) and substantial evolutionary plasticity. We identified eight and five regions of genome plasticity in ATCC 43143 and ATCC 43144, respectively. Within these regions, several proteins were recognized to contribute to the fitness and virulence of each of the two subspecies. We have also predicted putative cell-surface associated proteins that could play a role in adherence to host tissues, leading to persistent infections causing sub-acute and chronic diseases in humans. This study showed evidence that S. gallolyticus still possesses genes making it suited to a rumen environment, whereas the ability of S. pasteurianus to live in the rumen is reduced. The genome heterogeneity and genetic diversity among the two biotypes, especially in membrane proteins and lipoproteins, most likely contribute to the differences in the pathogenesis of the two S. gallolyticus biotypes and the type of disease an infected patient eventually develops. PMID:21633709

  20. Comparative genomic analysis of novel Acinetobacter symbionts: A combined systems biology and genomics approach.

    PubMed

    Gupta, Vipin; Haider, Shazia; Sood, Utkarsh; Gilbert, Jack A; Ramjee, Meenakshi; Forbes, Ken; Singh, Yogendra; Lopes, Bruno S; Lal, Rup

    2016-01-01

    The increasing trend of antibiotic resistance in Acinetobacter drastically limits the range of therapeutic agents required to treat multidrug resistant (MDR) infections. This study focused on analysis of novel Acinetobacter strains using a genomics and systems biology approach. Here we used a network theory method for pathogenic and non-pathogenic Acinetobacter spp. to identify the key regulatory proteins (hubs) in each strain. We identified nine key regulatory proteins, guaA, guaB, rpsB, rpsI, rpsL, rpsE, rpsC, rplM and trmD, which have functional roles as hubs in a hierarchical scale-free fractal protein-protein interaction network. Two key hubs (guaA and guaB) were important for insect-associated strains, and comparative analysis identified guaA as more important than guaB due to its role in effective module regulation. rpsI played a significant role in all the novel strains, while rplM was unique to sheep-associated strains. rpsM, rpsB and rpsI were involved in the regulation of overall network topology across all Acinetobacter strains analyzed in this study. Future analysis will investigate whether these hubs are useful as drug targets for treating Acinetobacter infections. PMID:27378055
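
    One common way to nominate hubs in a protein-protein interaction network is to rank proteins by degree, that is, by their number of distinct interaction partners. The sketch below shows only that idea. The abstract does not state which criteria the authors applied, so degree ranking is an assumption here, and the node names and edges are hypothetical placeholders rather than real interactions.

```python
from collections import defaultdict


def degree_by_node(edges):
    """Count distinct interaction partners per protein in an undirected network."""
    neighbours = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbours[a].add(b)
        neighbours[b].add(a)
    return {node: len(partners) for node, partners in neighbours.items()}


def top_hubs(edges, n=3):
    """Return the n highest-degree proteins; ties are broken alphabetically."""
    degrees = degree_by_node(edges)
    return sorted(degrees, key=lambda node: (-degrees[node], node))[:n]


# Toy network, invented for illustration only.
toy_edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3"), ("P4", "P5")]
hubs = top_hubs(toy_edges, n=1)
```

    Running the same ranking on the network of each strain and comparing the resulting lists is one way hubs shared across strains could be separated from strain-specific ones.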

  1. Comparative genomic analysis of novel Acinetobacter symbionts: A combined systems biology and genomics approach

    PubMed Central

    Gupta, Vipin; Haider, Shazia; Sood, Utkarsh; Gilbert, Jack A.; Ramjee, Meenakshi; Forbes, Ken; Singh, Yogendra; Lopes, Bruno S.; Lal, Rup

    2016-01-01

    The increasing trend of antibiotic resistance in Acinetobacter drastically limits the range of therapeutic agents required to treat multidrug resistant (MDR) infections. This study focused on analysis of novel Acinetobacter strains using a genomics and systems biology approach. Here we used a network theory method for pathogenic and non-pathogenic Acinetobacter spp. to identify the key regulatory proteins (hubs) in each strain. We identified nine key regulatory proteins, guaA, guaB, rpsB, rpsI, rpsL, rpsE, rpsC, rplM and trmD, which have functional roles as hubs in a hierarchical scale-free fractal protein-protein interaction network. Two key hubs (guaA and guaB) were important for insect-associated strains, and comparative analysis identified guaA as more important than guaB due to its role in effective module regulation. rpsI played a significant role in all the novel strains, while rplM was unique to sheep-associated strains. rpsM, rpsB and rpsI were involved in the regulation of overall network topology across all Acinetobacter strains analyzed in this study. Future analysis will investigate whether these hubs are useful as drug targets for treating Acinetobacter infections. PMID:27378055

  2. Comparative Analysis of Alu Repeats in Primate Genomes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Background: Alu repeats are SINEs (short interspersed repetitive elements) that have been applied successfully in studies of genome evolution, population biology, phylogenetics and forensics. Human Alu consensus sequences were widely used as surrogates in nonhuman primate studies with an assumption that all p...

  3. Comparative genome analysis of Bacillus cereus group genomes with Bacillus subtilis

    SciTech Connect

    Anderson, Iain; Sorokin, Alexei; Kapatral, Vinayak; Reznik, Gary; Bhattacharya, Anamitra; Mikhailova, Natalia; Burd, Henry; Joukov, Victor; Kaznadzey, Denis; Walunas, Theresa; D'Souza, Mark; Larsen, Niels; Pusch, Gordon; Liolios, Konstantinos; Grechkin, Yuri; Lapidus, Alla; Goltsman, Eugene; Chu, Lien; Fonstein, Michael; Ehrlich, S. Dusko; Overbeek, Ross; Kyrpides, Nikos; Ivanova, Natalia

    2005-09-14

    Genome features of the Bacillus cereus group genomes (representative strains of Bacillus cereus, Bacillus anthracis and Bacillus thuringiensis subsp. israelensis) were analyzed and compared with the Bacillus subtilis genome. A core set of 1,381 protein families among the four Bacillus genomes, with an additional set of 933 families common to the B. cereus group, was identified. Differences in signal transduction pathways, membrane transporters, cell surface structures, cell wall, and S-layer proteins, suggesting differences in their phenotype, were identified. The B. cereus group has signal transduction systems including a tyrosine kinase related to two-component system histidine kinases from B. subtilis. A model for regulation of the stress-responsive sigma factor sigmaB in the B. cereus group, different from the well-studied regulation in B. subtilis, has been proposed. Despite a high degree of chromosomal synteny among these genomes, significant differences in cell wall and spore coat proteins that contribute to survival and adaptation in specific hosts have been identified.

  4. Ebolavirus comparative genomics

    DOE PAGESBeta

    Jun, Se-Ran; Leuze, Michael R.; Nookaew, Intawat; Uberbacher, Edward C.; Land, Miriam; Zhang, Qian; Wanchai, Visanu; Chai, Juanjuan; Nielsen, Morten; Trolle, Thomas; et al

    2015-07-14

    The 2014 Ebola outbreak in West Africa is the largest documented for this virus. We examine the dynamics of this genome, comparing more than one hundred currently available ebolavirus genomes to each other and to other viral genomes. Based on oligomer frequency analysis, the family Filoviridae forms a distinct group from all other sequenced viral genomes. All filovirus genomes sequenced to date encode proteins with similar functions and gene order, although there is considerable divergence in sequences between the three genera Ebolavirus, Cuevavirus, and Marburgvirus within the family Filoviridae. Whereas all ebolavirus genomes are quite similar (multiple sequences of the same strain are often identical), variation is most common in the intergenic regions and within specific areas of the genes encoding the glycoprotein (GP), nucleoprotein (NP), and polymerase (L). We predict regions that could contain epitope-binding sites, which might be good vaccine targets. In conclusion, this information, combined with glycosylation sites and experimentally determined epitopes, can identify the most promising regions for the development of therapeutic strategies.
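
    Oligomer (k-mer) frequency analysis, as mentioned in this abstract, summarizes a genome by how often each short word of length k occurs. The sketch below is a minimal illustration and not the authors' pipeline. It assumes sequences written in the DNA alphabet (as viral genomes usually are in sequence databases), skips windows containing other characters, and uses a Manhattan distance between profiles as one arbitrary choice of comparison.

```python
from collections import Counter

DNA = set("ACGT")


def kmer_frequencies(sequence, k):
    """Return relative frequencies of all overlapping k-mers in a sequence."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    seq = sequence.upper()
    windows = (seq[i:i + k] for i in range(len(seq) - k + 1))
    # Windows with ambiguous bases (e.g. N) are skipped rather than counted.
    counts = Counter(w for w in windows if set(w) <= DNA)
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}


def profile_distance(p, q):
    """Manhattan distance between two k-mer frequency profiles."""
    return sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in set(p) | set(q))


# Toy sequence, invented for illustration only.
profile = kmer_frequencies("ACGTAC", 2)
```

    Genomes whose profiles lie close together under such a distance would group together, which is the general sense in which a family can form a distinct group on oligomer frequencies.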

  5. Whole genomic DNA sequencing and comparative genomic analysis of Arthrospira platensis: high genome plasticity and genetic diversity

    PubMed Central

    Xu, Teng; Qin, Song; Hu, Yongwu; Song, Zhijian; Ying, Jianchao; Li, Peizhen; Dong, Wei; Zhao, Fangqing; Yang, Huanming; Bao, Qiyu

    2016-01-01

    Arthrospira platensis is a multi-cellular and filamentous non-N2-fixing cyanobacterium that is capable of performing oxygenic photosynthesis. In this study, we determined the nearly complete genome sequence of A. platensis YZ. The A. platensis YZ genome is a single, circular chromosome of 6.62 Mb in size. Phylogenetic and comparative genomic analyses revealed that A. platensis YZ was more closely related to A. platensis NIES-39 than to Arthrospira sp. PCC 8005 and A. platensis C1. Broad gene gains were identified between A. platensis YZ and three other Arthrospira species, some of which have previously been shown to be laterally transferable among different species, such as genes encoding restriction-modification systems. Moreover, unprecedented extensive chromosomal rearrangements among different strains were observed. The chromosomal rearrangements, particularly the chromosomal inversions, were analysed and estimated to be closely related to palindromes that involved long inverted repeat sequences and the extensively distributed type IIR restriction enzyme in the Arthrospira genome. In addition, species from genus Arthrospira consistently contained the highest rate of repetitive sequence compared with the other species of order Oscillatoriales, suggesting that sequence duplication significantly contributed to Arthrospira genome phylogeny. These results provide in-depth views into the genomic phylogeny and structural variation of A. platensis, as well as a valuable resource for functional genomics studies. PMID:27330141

  6. Whole genomic DNA sequencing and comparative genomic analysis of Arthrospira platensis: high genome plasticity and genetic diversity.

    PubMed

    Xu, Teng; Qin, Song; Hu, Yongwu; Song, Zhijian; Ying, Jianchao; Li, Peizhen; Dong, Wei; Zhao, Fangqing; Yang, Huanming; Bao, Qiyu

    2016-08-01

    Arthrospira platensis is a multi-cellular and filamentous non-N2-fixing cyanobacterium that is capable of performing oxygenic photosynthesis. In this study, we determined the nearly complete genome sequence of A. platensis YZ. The A. platensis YZ genome is a single, circular chromosome of 6.62 Mb in size. Phylogenetic and comparative genomic analyses revealed that A. platensis YZ was more closely related to A. platensis NIES-39 than to Arthrospira sp. PCC 8005 and A. platensis C1. Broad gene gains were identified between A. platensis YZ and three other Arthrospira species, some of which have previously been shown to be laterally transferable among different species, such as genes encoding restriction-modification systems. Moreover, unprecedented extensive chromosomal rearrangements among different strains were observed. The chromosomal rearrangements, particularly the chromosomal inversions, were analysed and estimated to be closely related to palindromes that involved long inverted repeat sequences and the extensively distributed type IIR restriction enzyme in the Arthrospira genome. In addition, species from genus Arthrospira consistently contained the highest rate of repetitive sequence compared with the other species of order Oscillatoriales, suggesting that sequence duplication significantly contributed to Arthrospira genome phylogeny. These results provide in-depth views into the genomic phylogeny and structural variation of A. platensis, as well as a valuable resource for functional genomics studies. PMID:27330141

  7. Comparative genomics in chicken and Pekin duck using FISH mapping and microarray analysis

    PubMed Central

    2009-01-01

    Background: The availability of the complete chicken (Gallus gallus) genome sequence as well as a large number of chicken probes for fluorescent in-situ hybridization (FISH) and microarray resources facilitate comparative genomic studies between chicken and other bird species. In a previous study, we provided a comprehensive cytogenetic map for the turkey (Meleagris gallopavo) and the first analysis of copy number variants (CNVs) in birds. Here, we extend this approach to the Pekin duck (Anas platyrhynchos), an obvious target for comparative genomic studies due to its agricultural importance and resistance to avian flu. Results: We provide a detailed molecular cytogenetic map of the duck genome through FISH assignment of 155 chicken clones. We identified one inter- and six intrachromosomal rearrangements between chicken and duck macrochromosomes and demonstrated conserved synteny among all microchromosomes analysed. Array comparative genomic hybridisation revealed 32 CNVs, of which 5 overlap previously designated "hotspot" regions between chicken and turkey. Conclusion: Our results suggest extensive conservation of avian genomes across 90 million years of evolution in both macro- and microchromosomes. The data on CNVs between chicken and duck extends previous analyses in chicken and turkey and supports the hypotheses that avian genomes contain fewer CNVs than mammalian genomes and that genomes of evolutionarily distant species share regions of copy number variation ("CNV hotspots"). Our results will expedite duck genomics, assist marker development and highlight areas of interest for future evolutionary and functional studies. PMID:19656363

  8. The Integrated Microbial Genomes (IMG) System: An Expanding Comparative Analysis Resource

    SciTech Connect

    Markowitz, Victor M.; Chen, I-Min A.; Palaniappan, Krishna; Chu, Ken; Szeto, Ernest; Grechkin, Yuri; Ratner, Anna; Anderson, Iain; Lykidis, Athanasios; Mavromatis, Konstantinos; Ivanova, Natalia N.; Kyrpides, Nikos C.

    2009-09-13

    The integrated microbial genomes (IMG) system serves as a community resource for comparative analysis of publicly available genomes in a comprehensive integrated context. IMG contains both draft and complete microbial genomes integrated with other publicly available genomes from all three domains of life, together with a large number of plasmids and viruses. IMG provides tools and viewers for analyzing and reviewing the annotations of genes and genomes in a comparative context. Since its first release in 2005, IMG's data content and analytical capabilities have been constantly expanded through regular releases. Several companion IMG systems have been set up in order to serve domain specific needs, such as expert review of genome annotations. IMG is available at .

  9. Comparative genomic and phylogeographic analysis of Mycobacterium leprae.

    PubMed

    Monot, Marc; Honoré, Nadine; Garnier, Thierry; Zidane, Nora; Sherafi, Diana; Paniz-Mondolfi, Alberto; Matsuoka, Masanori; Taylor, G Michael; Donoghue, Helen D; Bouwman, Abi; Mays, Simon; Watson, Claire; Lockwood, Diana; Khamesipour, Ali; Khamispour, Ali; Dowlati, Yahya; Jianping, Shen; Rea, Thomas H; Vera-Cabrera, Lucio; Stefani, Mariane M; Banu, Sayera; Macdonald, Murdo; Sapkota, Bishwa Raj; Spencer, John S; Thomas, Jérôme; Harshman, Keith; Singh, Pushpendra; Busso, Philippe; Gattiker, Alexandre; Rougemont, Jacques; Brennan, Patrick J; Cole, Stewart T

    2009-12-01

    Reductive evolution and massive pseudogene formation have shaped the 3.31-Mb genome of Mycobacterium leprae, an unculturable obligate pathogen that causes leprosy in humans. The complete genome sequence of M. leprae strain Br4923 from Brazil was obtained by conventional methods (6x coverage), and Illumina resequencing technology was used to obtain the sequences of strains Thai53 (38x coverage) and NHDP63 (46x coverage) from Thailand and the United States, respectively. Whole-genome comparisons with the previously sequenced TN strain from India revealed that the four strains share 99.995% sequence identity and differ only in 215 polymorphic sites, mainly SNPs, and by 5 pseudogenes. Sixteen interrelated SNP subtypes were defined by genotyping both extant and extinct strains of M. leprae from around the world. The 16 SNP subtypes showed a strong geographical association that reflects the migration patterns of early humans and trade routes, with the Silk Road linking Europe to China having contributed to the spread of leprosy. PMID:19881526

  10. Comparative genomic analysis of integral membrane transport proteins in ciliates.

    PubMed

    Kumar, Ujjwal; Saier, Milton H

    2015-01-01

    Integral membrane transport proteins homologous to those found in the Transporter Classification Database (TCDB; www.tcdb.org) were identified and bioinformatically characterized by transporter class, family, and substrate specificity in three ciliates, Paramecium tetraurelia (Para), Tetrahymena thermophila (Tetra), and Ichthyophthirius multifiliis (Ich). In these three organisms, 1,326 of 39,600 proteins (3.4%), 1,017 of 24,800 proteins (4.2%), and 504 of 8,100 proteins (6.2%), respectively, were identified as integral membrane transport proteins. Thus, an inverse relationship was observed between the percentage of transporters identified and the total number of proteins reported per genome. This surprising observation provides insight into the evolutionary process, giving rise to genome reduction following whole genome duplication (as in the case of Para) or during pathogenic association with a host organism (Ich). Of these transport proteins in Para and Tetra, about 41% were channels (more than any other type of organism studied), 31% were secondary carriers (fewer than most eukaryotes) and 26% were primary active transporters, mostly ATP-hydrolysis driven (more than most other eukaryotes). In Ich, the number of channels was selectively reduced by 66%, relative to Para and Tetra. Para has four times more inorganic anion transporters than Tetra, and Ich has nonselectively lost most of these. Tetra and Ich preferentially transport sugars and monocarboxylates while Para prefers di- and tricarboxylates. These observations serve to characterize the transport proteins of these related ciliates, providing insight into their nutrition and metabolism. PMID:25099884

  11. Comparative analysis of prophage-like elements in Helicobacter sp. genomes

    PubMed Central

    Fan, Xiangyu; Li, Yumei; He, Rong

    2016-01-01

    Prophages are regarded as one of the factors underlying bacterial virulence, genomic diversification, and fitness, and are ubiquitous in bacterial genomes. Information on Helicobacter sp. prophages remains scarce. In this study, sixteen prophages were identified and analyzed in detail. Eight of them are described for the first time. Based on a comparative genomic analysis, these sixteen prophages can be classified into four different clusters. Phylogenetic relationships of Cluster A Helicobacter prophages were investigated. Furthermore, genomes of Helicobacter prophages from Clusters B, C, and D were analyzed. Interestingly, some putative antibiotic resistance proteins and virulence factors were associated with Helicobacter prophages. PMID:27169002

  12. Comparative Analysis of Salmonella Genomes Identifies a Metabolic Network for Escalating Growth in the Inflamed Gut

    PubMed Central

    Nuccio, Sean-Paul; Bäumler, Andreas J.

    2014-01-01

    The Salmonella genus comprises a group of pathogens associated with illnesses ranging from gastroenteritis to typhoid fever. We performed an in silico analysis of comparatively reannotated Salmonella genomes to identify genomic signatures indicative of disease potential. By removing numerous annotation inconsistencies and inaccuracies, the process of reannotation identified a network of 469 genes involved in central anaerobic metabolism, which was intact in genomes of gastrointestinal pathogens but degrading in genomes of extraintestinal pathogens. This large network contained pathways that enable gastrointestinal pathogens to utilize inflammation-derived nutrients as well as many of the biochemical reactions used for the enrichment and biochemical discrimination of Salmonella serovars. Thus, comparative genome analysis identifies a metabolic network that provides clues about the strategies for nutrient acquisition and utilization that are characteristic of gastrointestinal pathogens. PMID:24643865

  13. EDGAR: A software framework for the comparative analysis of prokaryotic genomes

    PubMed Central

    Blom, Jochen; Albaum, Stefan P; Doppmeier, Daniel; Pühler, Alfred; Vorhölter, Frank-Jörg; Zakrzewski, Martha; Goesmann, Alexander

    2009-01-01

    Background: The introduction of next generation sequencing approaches has caused a rapid increase in the number of completely sequenced genomes. As one result of this development, it is now feasible to analyze large groups of related genomes in a comparative approach. A main task in comparative genomics is the identification of orthologous genes in different genomes and the classification of genes as core genes or singletons. Results: To support these studies, EDGAR – "Efficient Database framework for comparative Genome Analyses using BLAST score Ratios" – was developed. EDGAR is designed to automatically perform genome comparisons in a high throughput approach. Comparative analyses for 582 genomes across 75 genus groups taken from the NCBI genomes database were conducted with the software and the results were integrated into an underlying database. To demonstrate a specific application case, we analyzed ten genomes of the bacterial genus Xanthomonas, for which phylogenetic studies were awkward due to divergent taxonomic systems. The resultant phylogeny EDGAR provided was consistent with outcomes from traditional approaches performed recently and moreover, it was possible to root each strain with unprecedented accuracy. Conclusion: EDGAR provides novel analysis features and significantly simplifies the comparative analysis of related genomes. The software supports a quick survey of evolutionary relationships and simplifies the process of obtaining new biological insights into the differential gene content of kindred genomes. Visualization features, like synteny plots or Venn diagrams, are offered to the scientific community through a web-based and therefore platform independent user interface, where the precomputed data sets can be browsed. PMID:19457249
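
    A BLAST score ratio normalizes the bit score of a hit by the score the query gene obtains against itself, so that values fall between 0 and 1 regardless of gene length. The sketch below shows how such ratios could be used to label genes of a reference genome as core genes or singletons. It is an illustration under stated assumptions, not EDGAR's implementation: the data layout, the function names and the cutoff value are hypothetical, and the abstract does not say how a cutoff is chosen.

```python
def score_ratio(hit_score, self_score):
    """Bit score of a hit divided by the query's self-hit bit score."""
    if self_score <= 0:
        raise ValueError("self-hit score must be positive")
    return hit_score / self_score


def classify_genes(self_scores, best_hits, genomes, cutoff):
    """Label each reference gene as 'core', 'singleton' or 'accessory'.

    self_scores: gene -> bit score of the gene against itself
    best_hits:   gene -> {genome: best bit score found in that genome}
    genomes:     the other genomes being compared (must not be empty)
    cutoff:      minimum score ratio counted as a match (hypothetical value)
    """
    if not genomes:
        raise ValueError("at least one comparison genome is required")
    labels = {}
    for gene, self_score in self_scores.items():
        hits = best_hits.get(gene, {})
        matched = [g for g in genomes
                   if score_ratio(hits.get(g, 0.0), self_score) >= cutoff]
        if len(matched) == len(genomes):
            labels[gene] = "core"        # matched in every other genome
        elif not matched:
            labels[gene] = "singleton"   # matched in no other genome
        else:
            labels[gene] = "accessory"   # matched in some genomes only
    return labels


# Toy scores, invented for illustration only.
toy_self = {"g1": 200.0, "g2": 150.0, "g3": 100.0}
toy_hits = {"g1": {"B": 180.0, "C": 190.0}, "g2": {"B": 140.0}, "g3": {"B": 10.0}}
labels = classify_genes(toy_self, toy_hits, ["B", "C"], cutoff=0.5)
```

    A stricter orthology call would typically also require the hit to be reciprocal (best hit in both directions), which this sketch leaves out for brevity.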

  14. Comparative functional genomic analysis of Pasteurellaceae adhesins using phage display.

    PubMed

    Mullen, Lisa M; Nair, Sean P; Ward, John M; Rycroft, Andrew N; Williams, Rachel J; Henderson, Brian

    2007-05-16

    The Pasteurellaceae contain a number of important animal pathogens. Although related, the various members of this family cause a diversity of pathology in a wide variety of organ systems. Adhesion is an important virulence factor in bacterial infections. Surprisingly little is known about the adhesins of the Pasteurellaceae. To attempt to identify the genes coding for adhesins to some key components of the host's extracellular matrix, phage display libraries of fragmented genomic DNA from Haemophilus influenzae, Actinobacillus pleuropneumoniae, Pasteurella multocida and Aggregatibacter actinomycetemcomitans were prepared in the phage display vector pG8SAET. The libraries were screened against human or porcine fibronectin, serum albumin or a commercial extracellular matrix containing type IV collagen, laminin and heparin sulphate. Four genes encoding putative adhesins were identified. These genes code for: (i) a 34 kDa human serum albumin binding protein from Haemophilus influenzae; (ii) a 12.8 kDa fibronectin-binding protein from Pasteurella multocida; (iii) a 13.7 kDa fibronectin-binding protein from A. actinomycetemcomitans; (iv) a 9.5 kDa serum albumin-binding protein from A. pleuropneumoniae. None of these genes have previously been proposed to code for adhesins. The applications of phage display with whole bacterial genomes to identify genes encoding novel adhesins in this family of bacteria are discussed. PMID:17258409

  15. Comparative Genomic Analysis of Human Fungal Pathogens Causing Paracoccidioidomycosis

    PubMed Central

    Desjardins, Christopher A.; Champion, Mia D.; Holder, Jason W.; Muszewska, Anna; Goldberg, Jonathan; Bailão, Alexandre M.; Brigido, Marcelo Macedo; Ferreira, Márcia Eliana da Silva; Garcia, Ana Maria; Grynberg, Marcin; Gujja, Sharvari; Heiman, David I.; Henn, Matthew R.; Kodira, Chinnappa D.; León-Narváez, Henry; Longo, Larissa V. G.; Ma, Li-Jun; Malavazi, Iran; Matsuo, Alisson L.; Morais, Flavia V.; Pereira, Maristela; Rodríguez-Brito, Sabrina; Sakthikumar, Sharadha; Salem-Izacc, Silvia M.; Sykes, Sean M.; Teixeira, Marcus Melo; Vallejo, Milene C.; Walter, Maria Emília Machado Telles; Yandava, Chandri; Young, Sarah; Zeng, Qiandong; Zucker, Jeremy; Felipe, Maria Sueli; Goldman, Gustavo H.; Haas, Brian J.; McEwen, Juan G.; Nino-Vega, Gustavo; Puccia, Rosana; San-Blas, Gioconda; Soares, Celia Maria de Almeida; Birren, Bruce W.; Cuomo, Christina A.

    2011-01-01

    Paracoccidioides is a fungal pathogen and the cause of paracoccidioidomycosis, a health-threatening human systemic mycosis endemic to Latin America. Infection by Paracoccidioides, a dimorphic fungus in the order Onygenales, is coupled with a thermally regulated transition from a soil-dwelling filamentous form to a yeast-like pathogenic form. To better understand the genetic basis of growth and pathogenicity in Paracoccidioides, we sequenced the genomes of two strains of Paracoccidioides brasiliensis (Pb03 and Pb18) and one strain of Paracoccidioides lutzii (Pb01). These genomes range in size from 29.1 Mb to 32.9 Mb and encode 7,610 to 8,130 genes. To enable genetic studies, we mapped 94% of the P. brasiliensis Pb18 assembly onto five chromosomes. We characterized gene family content across Onygenales and related fungi, and within Paracoccidioides we found expansions of the fungal-specific kinase family FunK1. Additionally, the Onygenales have lost many genes involved in carbohydrate metabolism and fewer genes involved in protein metabolism, resulting in a higher ratio of proteases to carbohydrate active enzymes in the Onygenales than their relatives. To determine if gene content correlated with growth on different substrates, we screened the non-pathogenic onygenale Uncinocarpus reesii, which has orthologs for 91% of Paracoccidioides metabolic genes, for growth on 190 carbon sources. U. reesii showed growth on a limited range of carbohydrates, primarily basic plant sugars and cell wall components; this suggests that Onygenales, including dimorphic fungi, can degrade cellulosic plant material in the soil. In addition, U. reesii grew on gelatin and a wide range of dipeptides and amino acids, indicating a preference for proteinaceous growth substrates over carbohydrates, which may enable these fungi to also degrade animal biomass. These capabilities for degrading plant and animal substrates suggest a duality in lifestyle that could enable pathogenic species of

  16. Comparative Genome Analysis Provides Insights into the Pathogenicity of Flavobacterium psychrophilum.

    PubMed

    Castillo, Daniel; Christiansen, Rói Hammershaimb; Dalsgaard, Inger; Madsen, Lone; Espejo, Romilio; Middelboe, Mathias

    2016-01-01

    Flavobacterium psychrophilum is a fish pathogen in salmonid aquaculture worldwide that causes cold water disease (CWD) and rainbow trout fry syndrome (RTFS). Comparative genome analyses of 11 F. psychrophilum isolates representing temporally and geographically distant populations were used to describe the F. psychrophilum pan-genome and to examine virulence factors, prophages, CRISPR arrays, and genomic islands present in the genomes. Analysis of the genomic DNA sequences was complemented with selected phenotypic characteristics of the strains. The pan-genome analysis showed that F. psychrophilum could hold at least 3373 genes, while the core genome contained 1743 genes. On average, 67 new genes were detected for every new genome added to the analysis, indicating that F. psychrophilum possesses an open pan-genome. The putative virulence factors were equally distributed among isolates, independent of geographic location, year of isolation and source of isolates. Only one prophage-related sequence was found, which corresponded to the previously described prophage 6H and appeared in 5 out of 11 isolates. CRISPR array analysis revealed two different loci with dissimilar spacer content, which only matched one sequence in the database, the temperate bacteriophage 6H. Genomic Islands (GIs) were identified in F. psychrophilum isolates 950106-1/1 and CSF 259-93, associated with toxins and antibiotic resistance. Finally, phenotypic characterization revealed a high degree of similarity among the strains with respect to biofilm formation and secretion of extracellular enzymes. Global scale dispersion of virulence factors in the genomes and the abilities for biofilm formation, hemolytic activity and secretion of extracellular enzymes among the strains suggested that F. psychrophilum isolates have a similar mode of action on adhesion, colonization and destruction of fish tissues across large spatial and temporal scales of occurrence. Overall, the genomic characterization and

  17. Comparative Genome Analysis Provides Insights into the Pathogenicity of Flavobacterium psychrophilum

    PubMed Central

    Castillo, Daniel; Christiansen, Rói Hammershaimb; Dalsgaard, Inger; Madsen, Lone; Espejo, Romilio

    2016-01-01

    Flavobacterium psychrophilum is a fish pathogen in salmonid aquaculture worldwide that causes cold water disease (CWD) and rainbow trout fry syndrome (RTFS). Comparative genome analyses of 11 F. psychrophilum isolates representing temporally and geographically distant populations were used to describe the F. psychrophilum pan-genome and to examine virulence factors, prophages, CRISPR arrays, and genomic islands present in the genomes. Analysis of the genomic DNA sequences was complemented with selected phenotypic characteristics of the strains. The pan-genome analysis showed that F. psychrophilum could hold at least 3373 genes, while the core genome contained 1743 genes. On average, 67 new genes were detected for every new genome added to the analysis, indicating that F. psychrophilum possesses an open pan-genome. The putative virulence factors were equally distributed among isolates, independent of geographic location, year of isolation and source of isolates. Only one prophage-related sequence was found, which corresponded to the previously described prophage 6H and appeared in 5 out of 11 isolates. CRISPR array analysis revealed two different loci with dissimilar spacer content, which only matched one sequence in the database, the temperate bacteriophage 6H. Genomic Islands (GIs) were identified in F. psychrophilum isolates 950106-1/1 and CSF 259–93, associated with toxins and antibiotic resistance. Finally, phenotypic characterization revealed a high degree of similarity among the strains with respect to biofilm formation and secretion of extracellular enzymes. Global scale dispersion of virulence factors in the genomes and the abilities for biofilm formation, hemolytic activity and secretion of extracellular enzymes among the strains suggested that F. psychrophilum isolates have a similar mode of action on adhesion, colonization and destruction of fish tissues across large spatial and temporal scales of occurrence. Overall, the genomic characterization and

  18. Genome evolution in diploid and tetraploid Coffea species as revealed by comparative analysis of orthologous genome segments.

    PubMed

    Cenci, Alberto; Combes, Marie-Christine; Lashermes, Philippe

    2012-01-01

    Sequence comparison of orthologous regions enables estimation of the divergence between genomes, analysis of their evolution and detection of particular features of the genomes, such as sequence rearrangements and transposable elements. Despite the economic importance of Coffea species, little genomic information is currently available. Coffea is a relatively young genus that includes more than one hundred diploid species and a single tetraploid species. Three Coffea orthologous regions of 470-900 kb were analyzed and compared: both subgenomes of allotetraploid Coffea arabica (contributed by the diploid species Coffea eugenioides and Coffea canephora) and the genome of diploid C. canephora. Sequence divergence was calculated on global alignments or on coding and non-coding sequences separately. A search for transposable elements detected 43 retrotransposons and 198 transposons in the sequences analyzed. Comparative insertion analysis made it possible to locate 165 TE insertions in the phylogenetic tree of the three genomes/subgenomes. In the tetraploid C. arabica, a homoeologous non-reciprocal transposition (HNRT) was detected and characterized: a 50 kb region of the C. eugenioides derived subgenome replaced the C. canephora derived counterpart. Comparative sequence analysis on three Coffea genomes/subgenomes revealed almost perfect gene synteny, low sequence divergence and a high number of shared transposable elements. Compared to the results of similar analysis in other genera (Aegilops/Triticum and Oryza), Coffea genomes/subgenomes appeared to be dramatically less diverged, which is consistent with the relatively recent radiation of the Coffea genus. Based on nucleotide substitution frequency, the HNRT was dated at 10,000-50,000 years BP, which is also the most recent estimation of the origin of C. arabica. PMID:22086332

  19. Bacillus anthracis comparative genome analysis in support of the Amerithrax investigation

    PubMed Central

    Rasko, David A.; Worsham, Patricia L.; Abshire, Terry G.; Stanley, Scott T.; Bannan, Jason D.; Wilson, Mark R.; Langham, Richard J.; Decker, R. Scott; Jiang, Lingxia; Read, Timothy D.; Phillippy, Adam M.; Salzberg, Steven L.; Pop, Mihai; Van Ert, Matthew N.; Kenefic, Leo J.; Keim, Paul S.; Fraser-Liggett, Claire M.; Ravel, Jacques

    2011-01-01

    Before the anthrax letter attacks of 2001, the developing field of microbial forensics relied on microbial genotyping schemes based on a small portion of a genome sequence. Amerithrax, the investigation into the anthrax letter attacks, applied high-resolution whole-genome sequencing and comparative genomics to identify key genetic features of the letters’ Bacillus anthracis Ames strain. During systematic microbiological analysis of the spore material from the letters, we identified a number of morphological variants based on phenotypic characteristics and the ability to sporulate. The genomes of these morphological variants were sequenced and compared with that of the B. anthracis Ames ancestor, the progenitor of all B. anthracis Ames strains. Through comparative genomics, we identified four distinct loci with verifiable genetic mutations. Three of the four mutations could be directly linked to sporulation pathways in B. anthracis and more specifically to the regulation of the phosphorylation state of Spo0F, a key regulatory protein in the initiation of the sporulation cascade, thus linking phenotype to genotype. None of these variant genotypes were identified in single-colony environmental B. anthracis Ames isolates associated with the investigation. These genotypes were identified only in B. anthracis morphotypes isolated from the letters, indicating that the variants were not prevalent in the environment, not even the environments associated with the investigation. This study demonstrates the forensic value of systematic microbiological analysis combined with whole-genome sequencing and comparative genomics. PMID:21383169

  20. Comparative genomic analysis as a tool for biological discovery

    SciTech Connect

    Nobrega, Marcelo A.; Pennacchio, Len A.

    2003-03-30

    Biology is a discipline rooted in comparisons. Comparative physiology has assembled a detailed catalogue of the biological similarities and differences between species, revealing insights into how life has adapted to fill a wide range of environmental niches. For example, the oxygen and carbon dioxide carrying capacity of vertebrates has evolved to provide strong advantages for species respiring at sea level, at high elevation or within water. Comparative anatomy, biochemistry, pharmacology, immunology and cell biology have provided the fundamental paradigms from which each discipline has grown.

  1. Ebolavirus comparative genomics.

    PubMed

    Jun, Se-Ran; Leuze, Michael R; Nookaew, Intawat; Uberbacher, Edward C; Land, Miriam; Zhang, Qian; Wanchai, Visanu; Chai, Juanjuan; Nielsen, Morten; Trolle, Thomas; Lund, Ole; Buzard, Gregory S; Pedersen, Thomas D; Wassenaar, Trudy M; Ussery, David W

    2015-09-01

    The 2014 Ebola outbreak in West Africa is the largest documented for this virus. To examine the dynamics of this genome, we compare more than 100 currently available ebolavirus genomes to each other and to other viral genomes. Based on oligomer frequency analysis, the family Filoviridae forms a distinct group from all other sequenced viral genomes. All filovirus genomes sequenced to date encode proteins with similar functions and gene order, although there is considerable divergence in sequences between the three genera Ebolavirus, Cuevavirus and Marburgvirus within the family Filoviridae. Whereas all ebolavirus genomes are quite similar (multiple sequences of the same strain are often identical), variation is most common in the intergenic regions and within specific areas of the genes encoding the glycoprotein (GP), nucleoprotein (NP) and polymerase (L). We predict regions that could contain epitope-binding sites, which might be good vaccine targets. This information, combined with glycosylation sites and experimentally determined epitopes, can identify the most promising regions for the development of therapeutic strategies. This manuscript has been authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan). PMID:26175035

  2. Ebolavirus comparative genomics

    PubMed Central

    Jun, Se-Ran; Leuze, Michael R.; Nookaew, Intawat; Uberbacher, Edward C.; Land, Miriam; Zhang, Qian; Wanchai, Visanu; Chai, Juanjuan; Nielsen, Morten; Trolle, Thomas; Lund, Ole; Buzard, Gregory S.; Pedersen, Thomas D.; Wassenaar, Trudy M.; Ussery, David W.

    2015-01-01

    The 2014 Ebola outbreak in West Africa is the largest documented for this virus. To examine the dynamics of this genome, we compare more than 100 currently available ebolavirus genomes to each other and to other viral genomes. Based on oligomer frequency analysis, the family Filoviridae forms a distinct group from all other sequenced viral genomes. All filovirus genomes sequenced to date encode proteins with similar functions and gene order, although there is considerable divergence in sequences between the three genera Ebolavirus, Cuevavirus and Marburgvirus within the family Filoviridae. Whereas all ebolavirus genomes are quite similar (multiple sequences of the same strain are often identical), variation is most common in the intergenic regions and within specific areas of the genes encoding the glycoprotein (GP), nucleoprotein (NP) and polymerase (L). We predict regions that could contain epitope-binding sites, which might be good vaccine targets. This information, combined with glycosylation sites and experimentally determined epitopes, can identify the most promising regions for the development of therapeutic strategies. PMID:26175035

  3. Comparative genomic analysis of the Tribolium immune system

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The red flour beetle Tribolium castaneum has contributed a wealth of knowledge on insect development but limited information about innate immunity. With its complete nucleotide sequence determined, we have taken the opportunity to annotate immunity-related genes and compare them with homologous mole...

  4. Comparative Genomic Analysis of Brucella melitensis Vaccine Strain M5 Provides Insights into Virulence Attenuation

    PubMed Central

    Zhang, Wen; Wang, Heng; Zhao, Hongyan; Piao, Dongri; Tian, Guozhong; Chen, Chen; Cui, Buyun

    2013-01-01

    The Brucella melitensis vaccine strain M5 is widely used to prevent and control brucellosis in animals. In this study, we determined the whole-genome sequence of M5, and conducted a comprehensive comparative analysis against the whole-genome sequence of the virulent strain 16 M and other reference strains. This analysis revealed 11 regions of deletion (RDs) and 2 regions of insertion (RIs) within the M5 genome. Among these regions, the sequences encompassed in 5 RDs and 1 RI showed consistent variation, with a large deletion between the M5 and the 16 M genomes. RD4 and RD5 showed the large diversity among all Brucella genomes, both in RD length and RD copy number. Thus, RD4 and RD5 are potential sites for typing different Brucella strains. Other RD and RI regions exhibited multiple single nucleotide polymorphisms (SNPs). In addition, a genome fragment with a 56 kb rearrangement was determined to be consistent with previous studies. Comparative genomic analysis indicated that genomic island inversion in Brucella was widely present. With the genetic pattern common among all strains analyzed, these 2 RDs, 1 RI, and one inversion region are potential sites for detection of genomic differences. Several SNPs of important virulence-related genes (motB, dhbC, sfuB, dsbAB, aidA, aroC, and lysR) were also detected, and may be used to determine the mechanism of virulence attenuation. Collectively, this study reveals that comparative analysis between wild-type and vaccine strains can provide resources for the study of virulence and microevolution of Brucella. PMID:23967122

  5. Comparative Genomics Analysis of Streptomyces Species Reveals Their Adaptation to the Marine Environment and Their Diversity at the Genomic Level

    PubMed Central

    Tian, Xinpeng; Zhang, Zhewen; Yang, Tingting; Chen, Meili; Li, Jie; Chen, Fei; Yang, Jin; Li, Wenjie; Zhang, Bing; Zhang, Zhang; Wu, Jiayan; Zhang, Changsheng; Long, Lijuan; Xiao, Jingfa

    2016-01-01

    Over 200 genomes of streptomycete strains that were isolated from various environments are available from the NCBI. However, little is known about the characteristics that are linked to marine adaptation in marine-derived streptomycetes. The particularity and complexity of the marine environment suggest that marine streptomycetes are genetically diverse. Here, we sequenced nine strains from the Streptomyces genus that were isolated from different longitudes, latitudes, and depths of the South China Sea. Then we compared these strains to 22 NCBI downloaded streptomycete strains. Thirty-one streptomycete strains are clearly grouped into a marine-derived subgroup and multiple source subgroup-based phylogenetic tree. The phylogenetic analyses have revealed the dynamic process underlying streptomycete genome evolution, and lateral gene transfer is an important driving force during the process. Pan-genomics analyses have revealed that streptomycetes have an open pan-genome, which reflects the diversity of these streptomycetes and guarantees the species a quick and economical response to diverse environments. Functional and comparative genomics analyses indicate that the marine-derived streptomycetes subgroup possesses some common characteristics of marine adaptation. Our findings have expanded our knowledge of how ocean isolates of streptomycete strains adapt to marine environments. The availability of streptomycete genomes from the South China Sea will be beneficial for further analysis on marine streptomycetes and will enrich the South China Sea’s genetic data sources. PMID:27446038

  6. Comparative Genomics Analysis of Streptomyces Species Reveals Their Adaptation to the Marine Environment and Their Diversity at the Genomic Level.

    PubMed

    Tian, Xinpeng; Zhang, Zhewen; Yang, Tingting; Chen, Meili; Li, Jie; Chen, Fei; Yang, Jin; Li, Wenjie; Zhang, Bing; Zhang, Zhang; Wu, Jiayan; Zhang, Changsheng; Long, Lijuan; Xiao, Jingfa

    2016-01-01

    Over 200 genomes of streptomycete strains that were isolated from various environments are available from the NCBI. However, little is known about the characteristics that are linked to marine adaptation in marine-derived streptomycetes. The particularity and complexity of the marine environment suggest that marine streptomycetes are genetically diverse. Here, we sequenced nine strains from the Streptomyces genus that were isolated from different longitudes, latitudes, and depths of the South China Sea. Then we compared these strains to 22 NCBI downloaded streptomycete strains. Thirty-one streptomycete strains are clearly grouped into a marine-derived subgroup and multiple source subgroup-based phylogenetic tree. The phylogenetic analyses have revealed the dynamic process underlying streptomycete genome evolution, and lateral gene transfer is an important driving force during the process. Pan-genomics analyses have revealed that streptomycetes have an open pan-genome, which reflects the diversity of these streptomycetes and guarantees the species a quick and economical response to diverse environments. Functional and comparative genomics analyses indicate that the marine-derived streptomycetes subgroup possesses some common characteristics of marine adaptation. Our findings have expanded our knowledge of how ocean isolates of streptomycete strains adapt to marine environments. The availability of streptomycete genomes from the South China Sea will be beneficial for further analysis on marine streptomycetes and will enrich the South China Sea's genetic data sources. PMID:27446038

  7. Genome Based Phylogeny and Comparative Genomic Analysis of Intra-Mammary Pathogenic Escherichia coli

    PubMed Central

    Richards, Vincent P.; Lefébure, Tristan; Pavinski Bitar, Paulina D.; Dogan, Belgin; Simpson, Kenneth W.; Schukken, Ynte H.; Stanhope, Michael J.

    2015-01-01

    Escherichia coli is an important cause of bovine mastitis and can cause both severe inflammation with a short-term transient infection, as well as less severe, but more chronic inflammation and infection persistence. E. coli is a highly diverse organism that has been classified into a number of different pathotypes or pathovars, and mammary pathogenic E. coli (MPEC) has been proposed as a new such pathotype. The purpose of this study was to use genome sequence data derived from both transient and persistent MPEC isolates (two isolates of each phenotype) to construct a genome-based phylogeny that places MPEC in its phylogenetic context with other E. coli pathovars. A subsidiary goal was to conduct comparative genomic analyses of these MPEC isolates with other E. coli pathovars to provide a preliminary perspective on loci that might be correlated with the MPEC phenotype. Both concatenated and consensus tree phylogenies did not support MPEC monophyly or the monophyly of either transient or persistent phenotypes. Three of the MPEC isolates (ECA-727, ECC-Z, and ECA-O157) originated from within the predominately commensal clade of E. coli, referred to as phylogroup A. The fourth MPEC isolate, of the persistent phenotype (ECC-1470), was sister group to an isolate of ETEC, falling within the E. coli B1 clade. This suggests that the MPEC phenotype has arisen on numerous independent occasions and that this has often, although not invariably, occurred from commensal ancestry. Examination of the genes present in the MPEC strains relative to the commensal strains identified a consistent presence of the type VI secretion system (T6SS) in the MPEC strains, with only occasional representation in commensal strains, suggesting that T6SS may be associated with MPEC pathogenesis and/or as an inter-bacterial competitive attribute and therefore could represent a useful target to explore for the development of MPEC specific inhibitors. PMID:25807497

  8. Comparative genomics of Brassicaceae crops

    PubMed Central

    Sharma, Ashutosh; Li, Xiaonan; Lim, Yong Pyo

    2014-01-01

    The family Brassicaceae is one of the major groups of the plant kingdom and comprises diverse species of great economic, agronomic and scientific importance, including the model plant Arabidopsis. The sequencing of the Arabidopsis genome has revolutionized our knowledge in the field of plant biology and provides a foundation in genomics and comparative biology. Genomic resources have been utilized in Brassica for diversity analyses, construction of genetic maps and identification of agronomic traits. In Brassicaceae, comparative sequence analysis across the species has been utilized to understand genome structure, evolution and the detection of conserved genomic segments. In this review, we focus on the progress made in genetic resource development, genome sequencing and comparative mapping in Brassica and related species. The utilization of genomic resources and next-generation sequencing approaches in improvement of Brassica crops is also discussed. PMID:24987286

  9. Comparative genomics of Brassicaceae crops.

    PubMed

    Sharma, Ashutosh; Li, Xiaonan; Lim, Yong Pyo

    2014-05-01

    The family Brassicaceae is one of the major groups of the plant kingdom and comprises diverse species of great economic, agronomic and scientific importance, including the model plant Arabidopsis. The sequencing of the Arabidopsis genome has revolutionized our knowledge in the field of plant biology and provides a foundation in genomics and comparative biology. Genomic resources have been utilized in Brassica for diversity analyses, construction of genetic maps and identification of agronomic traits. In Brassicaceae, comparative sequence analysis across the species has been utilized to understand genome structure, evolution and the detection of conserved genomic segments. In this review, we focus on the progress made in genetic resource development, genome sequencing and comparative mapping in Brassica and related species. The utilization of genomic resources and next-generation sequencing approaches in improvement of Brassica crops is also discussed. PMID:24987286

  10. Genetic Characterization and Comparative Genome Analysis of Brucella melitensis Isolates from India

    PubMed Central

    Azam, Sarwar; Rao, Sashi Bhushan; Jakka, Padmaja; NarasimhaRao, Veera; Bhargavi, Bindu; Gupta, Vivek Kumar

    2016-01-01

    Brucellosis is the most frequent zoonotic disease worldwide, with over 500,000 new human infections every year. Brucella melitensis, the most virulent species in humans, primarily affects goats and the zoonotic transmission occurs by ingestion of unpasteurized milk products or through direct contact with fetal tissues. Brucellosis is endemic in India but no information is available on population structure and genetic diversity of Brucella spp. in India. We performed multilocus sequence typing of four B. melitensis strains isolated from naturally infected goats from India. For more detailed genetic characterization, we carried out whole genome sequencing and comparative genome analysis of one of the B. melitensis isolates, Bm IND1. Genome analysis identified 141 unique SNPs, 78 VNTRs, 51 Indels, and 2 putative prophage integrations in the Bm IND1 genome. Our data may help to develop improved epidemiological typing tools and efficient preventive strategies to control brucellosis. PMID:27525259

  11. Genetic Characterization and Comparative Genome Analysis of Brucella melitensis Isolates from India.

    PubMed

    Azam, Sarwar; Rao, Sashi Bhushan; Jakka, Padmaja; NarasimhaRao, Veera; Bhargavi, Bindu; Gupta, Vivek Kumar; Radhakrishnan, Girish

    2016-01-01

    Brucellosis is the most frequent zoonotic disease worldwide, with over 500,000 new human infections every year. Brucella melitensis, the most virulent species in humans, primarily affects goats and the zoonotic transmission occurs by ingestion of unpasteurized milk products or through direct contact with fetal tissues. Brucellosis is endemic in India but no information is available on population structure and genetic diversity of Brucella spp. in India. We performed multilocus sequence typing of four B. melitensis strains isolated from naturally infected goats from India. For more detailed genetic characterization, we carried out whole genome sequencing and comparative genome analysis of one of the B. melitensis isolates, Bm IND1. Genome analysis identified 141 unique SNPs, 78 VNTRs, 51 Indels, and 2 putative prophage integrations in the Bm IND1 genome. Our data may help to develop improved epidemiological typing tools and efficient preventive strategies to control brucellosis. PMID:27525259

  12. Comparative analysis of plastid genomes of non-photosynthetic Ericaceae and their photosynthetic relatives

    PubMed Central

    Logacheva, Maria D.; Schelkunov, Mikhail I.; Shtratnikova, Victoria Y.; Matveeva, Maria V.; Penin, Aleksey A.

    2016-01-01

    Although plastid genomes of flowering plants are typically highly conserved regarding their size, gene content and order, there are some exceptions. Ericaceae, a large and diverse family of flowering plants, warrants special attention within the context of plastid genome evolution because it includes both non-photosynthetic and photosynthetic species with rearranged plastomes and putative losses of “essential” genes. We characterized plastid genomes of three species of Ericaceae, non-photosynthetic Monotropa uniflora and Hypopitys monotropa and photosynthetic Pyrola rotundifolia, using high-throughput sequencing. As expected for non-photosynthetic plants, M. uniflora and H. monotropa have small plastid genomes (46 kb and 35 kb, respectively) lacking genes related to photosynthesis, whereas P. rotundifolia has a larger genome (169 kb) with a gene set similar to other photosynthetic plants. The examined genomes contain an unusually high number of repeats and translocations. Comparative analysis of the expanded set of Ericaceae plastomes suggests that the genes clpP and accD that are present in the plastid genomes of almost all plants have not been lost in this family (as was previously thought) but rather persist in these genomes in unusual forms. Also we found a new gene in P. rotundifolia that emerged as a result of duplication of rps4 gene. PMID:27452401

  13. Organization and comparative analysis of the mitochondrial genomes of bioluminescent Elateroidea (Coleoptera: Polyphaga).

    PubMed

    Amaral, Danilo T; Mitani, Yasuo; Ohmiya, Yoshihiro; Viviani, Vadim R

    2016-07-25

    Mitochondrial genome organization in the Elateroidea superfamily (Coleoptera), which includes the main families of bioluminescent beetles, has been poorly studied, and information about the Phengodidae family is lacking. We sequenced the mitochondrial genomes of Neotropical Lampyridae (Bicellonycha lividipennis), Phengodidae (Brasilocerus sp.2 and Phrixothrix hirtus) and Elateridae (Pyrearinus termitilluminans, Hapsodrilus ignifer and Teslasena femoralis). All species had a typical insect mitochondrial genome except for the following: in the elaterid T. femoralis genome there is a non-coding region between NADH2 and tRNA-Trp; in the phengodids Brasilocerus sp.2 and P. hirtus genomes we did not find the tRNA-Ile and tRNA-Gln. The P. hirtus genome showed a ~1.6 kb non-coding region, the rearrangement of tRNA-Tyr, a new tRNA-Leu copy, and several regions with higher AT contents. Phylogenetic analysis using Bayesian and ML models indicated that the Phengodidae+Rhagophthalmidae are closely related to the Lampyridae family, and included Drilus flavescens (Drilidae) as an internal clade within Elateridae. This is the first report that compares the mitochondrial genome organization of the three main families of bioluminescent Elateroidea, including the first Neotropical Lampyridae and Phengodidae. The losses of tRNAs, and translocation and duplication events found in Phengodidae mt genomes, mainly in P. hirtus, may indicate different evolutionary rates in these mitochondrial genomes. The mitophylogenomics analysis indicates the monophyly of the three bioluminescent families and a closer relationship between Lampyridae and Phengodidae/Rhagophthalmidae, in contrast with previous molecular analysis. PMID:27060405

  14. The complete chloroplast genome sequence of Morus mongolica and a comparative analysis within the Fabidae clade.

    PubMed

    Kong, Weiqing; Yang, Jinhong

    2016-02-01

    The complete nucleotide sequence of the Morus mongolica chloroplast (cp) genome was reported and characterized in this study. The cp genome is a circular molecule of 158,459 bp containing a pair of 25,678 bp IR regions, separated by small and large single-copy regions of 19,736 and 87,363 bp, respectively. The number and relative positions of the 114 unique genes (80 PCGs, 30 tRNAs, and 4 rRNA genes) are almost identical to Morus indica cp genome. Further detailed comparative analyses revealed one hypervariable region, which is responsible for 88% of the total variation, and 64 indel events between two individuals. There are 78 simple sequence repeats (SSRs) in M. mongolica cp genome, in which 58 of them are mononucleotide repeats. Comparative analysis with M. indica cp genome indicated 22 SSRs with length polymorphisms and 1 SSR with nucleotide content polymorphism. The phylogenetic analysis of 60 PCGs from 62 cp genomes provided strong support for the monophyletic, single origin of Fabidae (N2-fixing) clade. PMID:26205390

  15. Analysis of the Complete Chloroplast Genome of a Medicinal Plant, Dianthus superbus var. longicalycinus, from a Comparative Genomics Perspective

    PubMed Central

    Raman, Gurusamy; Park, SeonJoo

    2015-01-01

    Dianthus superbus var. longicalycinus is an economically important traditional Chinese medicinal plant that is also used for ornamental purposes. In this study, D. superbus was compared to its closely related family of Caryophyllaceae chloroplast (cp) genomes such as Lychnis chalcedonica and Spinacia oleracea. D. superbus had the longest large single copy (LSC) region (82,805 bp), with some variations in the inverted repeat region A (IRA)/LSC regions. The IRs underwent both expansion and constriction during evolution of the Caryophyllaceae family; however, intense variations were not identified. The pseudogene ribosomal protein subunit S19 (rps19) was identified at the IRA/LSC junction, but was not present in the cp genome of other Caryophyllaceae family members. The translation initiation factor IF-1 (infA) and ribosomal protein subunit L23 (rpl23) genes were absent from the Dianthus cp genome. When the cp genome of Dianthus was compared with 31 other angiosperm lineages, the infA gene was found to have been lost in most members of rosids, solanales of asterids and Lychnis of Caryophyllales, whereas rpl23 gene loss or pseudogenization had occurred exclusively in Caryophyllales. Nevertheless, the cp genome of Dianthus and Spinacia has two introns in the proteolytic subunit of ATP-dependent protease (clpP) gene, but Lychnis has lost introns from the clpP gene. Furthermore, phylogenetic analysis of individual protein-coding genes infA and rpl23 revealed that gene loss or pseudogenization occurred independently in the cp genome of Dianthus. Molecular phylogenetic analysis also demonstrated a sister relationship between Dianthus and Lychnis based on 78 protein-coding sequences. The results presented herein will contribute to studies of the evolution, molecular biology and genetic engineering of the medicinal and ornamental plant, D. superbus var. longicalycinus. PMID:26513163

  16. Comparative genome analysis of Lysinibacillus B1-CDA, a bacterium that accumulates arsenics.

    PubMed

    Rahman, Aminur; Nahar, Noor; Nawani, Neelu N; Jass, Jana; Ghosh, Sibdas; Olsson, Björn; Mandal, Abul

    2015-12-01

    Previously, we reported an arsenic-resistant bacterium, Lysinibacillus sphaericus B1-CDA, isolated from arsenic-contaminated land. Here, we have investigated its genetic composition and evolutionary history by using massively parallel sequencing and comparative analysis with other known Lysinibacillus genomes. Assembly of the sequencing reads revealed a genome of ~4.5 Mb in size encompassing ~80% of the chromosomal DNA. We found that the set of ordered contigs contains abundant regions of similarity with other Lysinibacillus genomes and clearly identifiable genome rearrangements. Furthermore, all genes of B1-CDA that were predicted to be involved in its resistance to arsenic and/or other heavy metals were annotated. The presence of arsenic-responsive genes was verified by PCR under in vitro conditions. The findings of this study highlight the significance of this bacterium in removing arsenics and other toxic metals from contaminated sources. The genetic mechanisms of the isolate could be used to cope with arsenic toxicity. PMID:26387925

  17. Comparative Genomic Analysis of Malaria Mosquito Vector-Associated Novel Pathogen Elizabethkingia anophelis

    PubMed Central

    Teo, Jeanette; Tan, Sean Yang-Yi; Liu, Yang; Tay, Martin; Ding, Yichen; Li, Yingying; Kjelleberg, Staffan; Givskov, Michael; Lin, Raymond T.P.; Yang, Liang

    2014-01-01

    Acquisition of Elizabethkingia infections in intensive care units (ICUs) has risen in the past decade. Treatment of Elizabethkingia infections is challenging due to the lack of effective therapeutic regimens, leading to a high mortality rate. Elizabethkingia infections have long been attributed to Elizabethkingia meningoseptica. Recently, we used whole-genome sequencing to reveal that E. anophelis is the pathogenic agent for an Elizabethkingia outbreak at two ICUs. We performed comparative genomic analysis of seven hospital-isolated E. anophelis strains with five available Elizabethkingia spp. genomes deposited in the National Center for Biotechnology Information Database. A pan-genomic approach was applied to identify the core- and pan-genome for the Elizabethkingia genus. We showed that unlike the hospital-isolated pathogen E. meningoseptica ATCC 12535 strain, the hospital-isolated E. anophelis strains have genome content and organization similar to the E. anophelis Ag1 and R26 strains isolated from the midgut microbiota of the malaria mosquito vector Anopheles gambiae. Both the core- and accessory genomes of Elizabethkingia spp. possess genes conferring antibiotic resistance and virulence. Our study highlights that E. anophelis is an emerging bacterial pathogen for hospital environments. PMID:24803570

  18. Comparative genomics and functional analysis of the 936 group of lactococcal Siphoviridae phages

    PubMed Central

    Murphy, James; Bottacini, Francesca; Mahony, Jennifer; Kelleher, Philip; Neve, Horst; Zomer, Aldert; Nauta, Arjen; van Sinderen, Douwe

    2016-01-01

    Genome sequencing and comparative analysis of bacteriophage collections have greatly enhanced our understanding of their prevalence, phage-host interactions, and the overall biodiversity of their genomes. This knowledge is very relevant to phages infecting Lactococcus lactis, since they constitute a significant risk factor for dairy fermentations. Of the eighty-four lactococcal phage genomes currently available, fifty-five belong to the so-called 936 group, the most prevalent of the ten currently recognized lactococcal phage groups. Here, we report the genetic characteristics of a new collection of 936 group phages. By combining these genomes with those sequenced previously, we determined the core and variable elements of the 936 genome. Genomic variation occurs across the 936 phage genome, such as genetic elements that (i) lead to a +1 translational frameshift resulting in the formation of additional structures on the phage tail, (ii) specify a double neck passage structure, and (iii) encode packaging module-associated methylases. Hierarchical clustering of the gene complement of the 936 group phages and nucleotide alignments allowed grouping of the ninety 936 group phages into distinct clusters, which in general appear to correspond with their geographical origin. PMID:26892066

  19. Psittacid Herpesvirus 1 and Infectious Laryngotracheitis Virus: Comparative Genome Sequence Analysis of Two Avian Alphaherpesviruses

    PubMed Central

    Thureen, Dean R.; Keeler, Calvin L.

    2006-01-01

    Psittacid herpesvirus 1 (PsHV-1) is the causative agent of Pacheco's disease, an acute, highly contagious, and potentially lethal respiratory herpesvirus infection in psittacine birds, while infectious laryngotracheitis virus (ILTV) is a highly contagious and economically significant avian herpesvirus which is responsible for an acute respiratory disease limited to galliform birds. The complete genome sequence of PsHV-1 has been determined and compared to the ILTV sequence, assembled from published data. The PsHV-1 and ILTV genomes exhibit similar structural characteristics and are 163,025 bp and 148,665 bp in length, respectively. The PsHV-1 genome contains 73 predicted open reading frames (ORFs), while the ILTV genome contains 77 predicted ORFs. Both genomes contain an inversion in the unique long region similar to that observed in pseudorabies virus. PsHV-1 is closely related to ILTV, and it is proposed that it be assigned to the Iltovirus genus. These two avian herpesviruses represent a phylogenetically unique clade of alphaherpesviruses that are distinct from the Marek's disease-like viruses (Mardivirus). The determination of the complete genomic nucleotide sequences of PsHV-1 and ILTV provides a tool for further comparative and functional analysis of this unique class of avian alphaherpesviruses. PMID:16873243

  20. Complete Genome Sequence of Borrelia afzelii K78 and Comparative Genome Analysis

    PubMed Central

    Schüler, Wolfgang; Bunikis, Ignas; Weber-Lehman, Jacqueline; Comstedt, Pär; Kutschan-Bunikis, Sabrina; Stanek, Gerold; Huber, Jutta; Meinke, Andreas; Bergström, Sven; Lundberg, Urban

    2015-01-01

    The main Borrelia species causing Lyme borreliosis in Europe and Asia are Borrelia afzelii, B. garinii, B. burgdorferi and B. bavariensis. This is in contrast to the United States, where infections are exclusively caused by B. burgdorferi. To date, the genome sequences of four B. afzelii strains are available, of which only two include the numerous plasmids. In order to further assess the genetic diversity of B. afzelii, the most common species in Europe, responsible for the large variety of clinical manifestations of Lyme borreliosis, we have determined the full genome sequence of the B. afzelii strain K78, a clinical isolate from Austria. The K78 genome contains a linear chromosome (905,949 bp) and 13 plasmids (8 linear and 5 circular), together presenting 1,309 open reading frames, of which 496 are located on plasmids. With the exception of lp28-8, all linear replicons have been sequenced in their full length, including their telomeres. The comparison with the genomes of the four other B. afzelii strains, ACA-1, PKo, HLJ01 and Tom3107, as well as that of B. burgdorferi strain B31, confirmed a high degree of conservation within the linear chromosome of B. afzelii, whereas plasmid-encoded genes showed a much larger diversity. Since some plasmids present in B. burgdorferi are missing in the B. afzelii genomes, the corresponding virulence factors of B. burgdorferi are found in B. afzelii on other unrelated plasmids. In addition, we have identified a species-specific region in the circular plasmid cp26, which could be used for species determination. Different non-coding RNAs have been located on the B. afzelii K78 genome, which have not previously been annotated in any of the published Borrelia genomes. PMID:25798594

  1. Five Complete Chloroplast Genome Sequences from Diospyros: Genome Organization and Comparative Analysis

    PubMed Central

    Hu, Jingjing; Liang, Yuqin; Liang, Jinjun; Wuyun, Tana; Tan, Xiaofeng

    2016-01-01

    Diospyros is the largest genus in Ebenaceae, comprising more than 500 species with remarkable economic value, especially Diospyros kaki Thunb., which has traditionally been an important food resource in China, Korea, and Japan. Complete chloroplast (cp) genomes from D. kaki, D. lotus L., D. oleifera Cheng., D. glaucifolia Metc., and Diospyros ‘Jinzaoshi’ were sequenced using Illumina sequencing technology. This is the first cp genome reported in Ebenaceae. The cp genome sequences of Diospyros ranged from 157,300 to 157,784 bp in length, presenting a typical quadripartite structure with two inverted repeats each separated by one large and one small single-copy region. For each cp genome, 134 genes were annotated, including 80 protein-coding, 31 tRNA, and 4 rRNA unique genes. In all, 179 repeats and 283 simple sequence repeats were identified. Four hypervariable regions, namely, intergenic region of trnQ_rps16, trnV_ndhC, and psbD_trnT, and intron of ndhA, were identified in the Diospyros genomes. Phylogenetic analyses based on the whole cp genome, protein-coding, and intergenic and intron sequences indicated that D. oleifera is closely related to D. kaki and could be used as a model plant for future research on D. kaki; to our knowledge, this is proposed for the first time. Further, these analyses, together with two large deletions (301 and 140 bp) in the cp genome of D. ‘Jinzaoshi’, support its placement as a new species in Diospyros. Both maximum parsimony and likelihood analyses for 19 taxa indicated the basal position of Ericales in asterids and suggested that Ebenaceae is monophyletic in Ericales. PMID:27442423

  2. Five Complete Chloroplast Genome Sequences from Diospyros: Genome Organization and Comparative Analysis.

    PubMed

    Fu, Jianmin; Liu, Huimin; Hu, Jingjing; Liang, Yuqin; Liang, Jinjun; Wuyun, Tana; Tan, Xiaofeng

    2016-01-01

    Diospyros is the largest genus in Ebenaceae, comprising more than 500 species with remarkable economic value, especially Diospyros kaki Thunb., which has traditionally been an important food resource in China, Korea, and Japan. Complete chloroplast (cp) genomes from D. kaki, D. lotus L., D. oleifera Cheng., D. glaucifolia Metc., and Diospyros 'Jinzaoshi' were sequenced using Illumina sequencing technology. This is the first cp genome reported in Ebenaceae. The cp genome sequences of Diospyros ranged from 157,300 to 157,784 bp in length, presenting a typical quadripartite structure with two inverted repeats each separated by one large and one small single-copy region. For each cp genome, 134 genes were annotated, including 80 protein-coding, 31 tRNA, and 4 rRNA unique genes. In all, 179 repeats and 283 simple sequence repeats were identified. Four hypervariable regions, namely, intergenic region of trnQ_rps16, trnV_ndhC, and psbD_trnT, and intron of ndhA, were identified in the Diospyros genomes. Phylogenetic analyses based on the whole cp genome, protein-coding, and intergenic and intron sequences indicated that D. oleifera is closely related to D. kaki and could be used as a model plant for future research on D. kaki; to our knowledge, this is proposed for the first time. Further, these analyses, together with two large deletions (301 and 140 bp) in the cp genome of D. 'Jinzaoshi', support its placement as a new species in Diospyros. Both maximum parsimony and likelihood analyses for 19 taxa indicated the basal position of Ericales in asterids and suggested that Ebenaceae is monophyletic in Ericales. PMID:27442423

  3. Comparative genomic analysis of Brazilian Leptospira kirschneri serogroup Pomona serovar Mozdok.

    PubMed

    Moreno, Luisa Z; Kremer, Frederico S; Miraglia, Fabiana; Loureiro, Ana P; Eslabao, Marcus R; Dellagostin, Odir A; Lilenbaum, Walter; Moreno, Andrea M

    2016-08-01

    Leptospira kirschneri is one of the pathogenic species of the Leptospira genus. Human and animal infection from L. kirschneri gained further attention over the last few decades. Here we present the isolation and characterisation of Brazilian L. kirschneri serogroup Pomona serovar Mozdok strain M36/05 and the comparative genomic analysis with Brazilian human strain 61H. The M36/05 strain caused pulmonary hemorrhagic lesions in the hamster model, showing high virulence. The studied genomes presented high symmetrical identity and the in silico multilocus sequence typing analysis resulted in a new allelic profile (ST101) that so far has only been associated with the Brazilian L. kirschneri serogroup Pomona serovar Mozdok strains. Considering the environmental conditions and high genomic similarity observed between strains, we suggest the existence of a Brazilian L. kirschneri serogroup Pomona serovar Mozdok lineage that could represent a high public health risk; further studies are necessary to confirm the lineage significance and distribution. PMID:27581124

  4. Comparative genomic analysis of Brazilian Leptospira kirschneri serogroup Pomona serovar Mozdok

    PubMed Central

    Moreno, Luisa Z; Kremer, Frederico S; Miraglia, Fabiana; Loureiro, Ana P; Eslabao, Marcus R; Dellagostin, Odir A; Lilenbaum, Walter; Moreno, Andrea M

    2016-01-01

    Leptospira kirschneri is one of the pathogenic species of the Leptospira genus. Human and animal infection from L. kirschneri gained further attention over the last few decades. Here we present the isolation and characterisation of Brazilian L. kirschneri serogroup Pomona serovar Mozdok strain M36/05 and the comparative genomic analysis with Brazilian human strain 61H. The M36/05 strain caused pulmonary hemorrhagic lesions in the hamster model, showing high virulence. The studied genomes presented high symmetrical identity and the in silico multilocus sequence typing analysis resulted in a new allelic profile (ST101) that so far has only been associated with the Brazilian L. kirschneri serogroup Pomona serovar Mozdok strains. Considering the environmental conditions and high genomic similarity observed between strains, we suggest the existence of a Brazilian L. kirschneri serogroup Pomona serovar Mozdok lineage that could represent a high public health risk; further studies are necessary to confirm the lineage significance and distribution. PMID:27581124

  5. Comparative genomic analysis of Brazilian Leptospira kirschneri serogroup Pomona serovar Mozdok.

    PubMed

    Moreno, Luisa Z; Kremer, Frederico S; Miraglia, Fabiana; Loureiro, Ana P; Eslabao, Marcus R; Dellagostin, Odir A; Lilenbaum, Walter; Moreno, Andrea M

    2016-07-11

    Leptospira kirschneri is one of the pathogenic species of the Leptospira genus. Human and animal infection from L. kirschneri gained further attention over the last few decades. Here we present the isolation and characterisation of Brazilian L. kirschneri serogroup Pomona serovar Mozdok strain M36/05 and the comparative genomic analysis with Brazilian human strain 61H. The M36/05 strain caused pulmonary hemorrhagic lesions in the hamster model, showing high virulence. The studied genomes presented high symmetrical identity and the in silico multilocus sequence typing analysis resulted in a new allelic profile (ST101) that so far has only been associated with the Brazilian L. kirschneri serogroup Pomona serovar Mozdok strains. Considering the environmental conditions and high genomic similarity observed between strains, we suggest the existence of a Brazilian L. kirschneri serogroup Pomona serovar Mozdok lineage that could represent a high public health risk; further studies are necessary to confirm the lineage significance and distribution. PMID:27409843

  6. The First Complete Chloroplast Genome Sequences in Actinidiaceae: Genome Structure and Comparative Analysis

    PubMed Central

    Yao, Xiaohong; Tang, Ping; Li, Zuozhou; Li, Dawei; Liu, Yifei; Huang, Hongwen

    2015-01-01

    Actinidia chinensis is an important economic plant belonging to the basal lineage of the asterids. Availability of a complete Actinidia chloroplast genome sequence is crucial to understanding phylogenetic relationships among major lineages of angiosperms and facilitates kiwifruit genetic improvement. We report here the complete nucleotide sequences of the chloroplast genomes for Actinidia chinensis and A. chinensis var. deliciosa obtained through de novo assembly of Illumina paired-end reads produced by total DNA sequencing. The total genome size ranges from 155,446 to 157,557 bp, with an inverted repeat (IR) of 24,013 to 24,391 bp, a large single copy region (LSC) of 87,984 to 88,337 bp and a small single copy region (SSC) of 20,332 to 20,336 bp. The genome encodes 113 different genes, including 79 unique protein-coding genes, 30 tRNA genes and 4 ribosomal RNA genes, with 16 duplicated in the inverted repeats, and a tRNA gene (trnfM-CAU) duplicated once in the LSC region. Comparisons of IR boundaries among four asterid species showed that IR/LSC borders were extended into the 5’ portion of the psbA gene and IR contraction occurred in Actinidia. The clpP gene has been lost from the chloroplast genome in Actinidia, and may have been transferred to the nucleus during chloroplast evolution. Twenty-seven polymorphic simple sequence repeat (SSR) loci were identified in the Actinidia chloroplast genome. Maximum parsimony analyses of a 72-gene, 16-taxon angiosperm dataset strongly support the placement of Actinidiaceae in Ericales within the basal asterids. PMID:26046631

  7. Comparative genomic analysis of four representative plant growth-promoting rhizobacteria in Pseudomonas

    PubMed Central

    2013-01-01

    Background: Some Pseudomonas strains function as predominant plant growth-promoting rhizobacteria (PGPR). Within this group, Pseudomonas chlororaphis and Pseudomonas fluorescens are non-pathogenic biocontrol agents, and some Pseudomonas aeruginosa and Pseudomonas stutzeri strains are PGPR. P. chlororaphis GP72 is a plant growth-promoting rhizobacterium with a fully sequenced genome. We conducted a genomic analysis comparing GP72 with three other pseudomonad PGPR: P. fluorescens Pf-5, P. aeruginosa M18, and the nitrogen-fixing strain P. stutzeri A1501. Our aim was to identify the similarities and differences among these strains using a comparative genomic approach to clarify the mechanisms of plant growth-promoting activity. Results: The genome sizes of GP72, Pf-5, M18, and A1501 ranged from 4.6 to 7.1 Mb, and the number of protein-coding genes varied among the four species. Clusters of Orthologous Groups (COGs) analysis assigned functions to predicted proteins. The COGs distributions were similar among the four species. However, the percentage of genes encoding transposases and their inactivated derivatives (COG L) was 1.33% of the total genes with COGs classifications in A1501, 0.21% in GP72, 0.02% in Pf-5, and 0.11% in M18. A phylogenetic analysis indicated that GP72 and Pf-5 were the most closely related strains, consistent with the genome alignment results. Comparisons of predicted coding sequences (CDSs) between GP72 and Pf-5 revealed 3544 conserved genes. There were fewer conserved genes when GP72 CDSs were compared with those of A1501 and M18. Comparisons among the four Pseudomonas species revealed 603 conserved genes in GP72, illustrating common plant growth-promoting traits shared among these PGPR. Conserved genes were related to catabolism, transport of plant-derived compounds, stress resistance, and rhizosphere colonization. Some strain-specific CDSs were related to different kinds of biocontrol activities or plant growth promotion. The GP72 genome

  8. Comparative Analysis of Genomics and Proteomics in Bacillus thuringiensis 4.0718

    PubMed Central

    Rang, Jie; He, Hao; Wang, Ting; Ding, Xuezhi; Zuo, Mingxing; Quan, Meifang; Sun, Yunjun; Yu, Ziquan; Hu, Shengbiao; Xia, Liqiu

    2015-01-01

    Bacillus thuringiensis is a widely used biopesticide that produces various insecticidal active substances during its life cycle. Separation and purification of numerous insecticidal active substances have been difficult because of the relatively short half-life of such substances. On the other hand, substances can be synthesized at different times during development, so samples at different stages have to be studied, further complicating the analysis. A dual genomic and proteomic approach, particularly one using mass spectrometry-based proteomic methods, would enhance our ability to identify such substances. The comparative analysis of genomic and proteomic data showed that not all of the products deduced from the annotated genome could be identified among the proteomic data. For instance, genome annotation results showed that 39 coding sequences in the whole genome were related to insect pathogenicity, including five cry genes. However, Cry2Ab, Cry1Ia, Cytotoxin K, Bacteriocin, Exoenzyme C3 and Alveolysin could not be detected in the proteomic data obtained. The sporulation-related proteins were also compared, and the results showed that the great majority of sporulation-related proteins can be detected by mass spectrometry. This analysis revealed that Spo0A~P, SigF, SigE(+), SigK(+) and SigG(+), all known to play an important role in the regulatory network of spore formation, were also present in the proteomic data. Through the comparison of the two data sets, it was possible to infer that some genes were silenced or were expressed at very low levels. For instance, we found that cry2Ab seems to lack a functional promoter, while cry1Ia may not be expressed due to the presence of transposons. With this comparative study a relatively complete database can be constructed and used to transform hereditary material, thereby prompting the high expression of toxic proteins. A theoretical basis is provided for constructing highly virulent engineered bacteria and for

  9. Complete genome and comparative analysis of the chemolithoautotrophic bacterium Oligotropha carboxidovorans OM5

    PubMed Central

    2010-01-01

    Background: Oligotropha carboxidovorans OM5T (DSM 1227, ATCC 49405) is a chemolithoautotrophic bacterium capable of utilizing CO (carbon monoxide) and fixing CO2 (carbon dioxide). We previously published the draft genome of this organism and recently submitted the complete genome sequence to GenBank. Results: The genome sequence of the chemolithoautotrophic bacterium Oligotropha carboxidovorans OM5 consists of a 3.74-Mb chromosome and a 133-kb megaplasmid that contains the genes responsible for utilization of carbon monoxide, carbon dioxide, and hydrogen. To our knowledge, this strain is the first one to be sequenced in the genus Oligotropha, the closest fully sequenced relatives being Bradyrhizobium sp. BTAi and USDA110 and Nitrobacter hamburgiensis X14. Analysis of the O. carboxidovorans genome reveals potential links between plasmid-encoded chemolithoautotrophy and chromosomally encoded lipid metabolism. Comparative analysis of O. carboxidovorans with closely related species revealed differences in metabolic pathways, particularly in carbohydrate and lipid metabolism, as well as transport pathways. Conclusion: Oligotropha, Bradyrhizobium sp. and Nitrobacter hamburgiensis X14 are phylogenetically proximal. Although there is significant conservation of genome organization between the species, there are major differences in many metabolic pathways that reflect the adaptive strategies unique to each species. PMID:20863402

  10. Construction of Genetic Linkage Maps and Comparative Genome Analysis of Catfish Using Gene-Associated Markers

    PubMed Central

    Kucuktas, Huseyin; Wang, Shaolin; Li, Ping; He, Chongbo; Xu, Peng; Sha, Zhenxia; Liu, Hong; Jiang, Yanliang; Baoprasertkul, Puttharat; Somridhivej, Benjaporn; Wang, Yaping; Abernathy, Jason; Guo, Ximing; Liu, Lei; Muir, William; Liu, Zhanjiang

    2009-01-01

    A genetic linkage map of the channel catfish genome (N = 29) was constructed using EST-based microsatellite and single nucleotide polymorphism (SNP) markers in an interspecific reference family. A total of 413 microsatellites and 125 SNP markers were polymorphic in the reference family. Linkage analysis using JoinMap 4.0 allowed mapping of 331 markers (259 microsatellites and 72 SNPs) to 29 linkage groups. Each linkage group contained 3–18 markers. The largest linkage group contained 18 markers and spanned 131.2 cM, while the smallest linkage group contained 14 markers and spanned only 7.9 cM. The linkage map covered a genetic distance of 1811 cM with an average marker interval of 6.0 cM. Sex-specific maps were also constructed; the recombination rate for females was 1.6 times higher than that for males. Putative conserved syntenies between catfish and zebrafish, medaka, and Tetraodon were established, but the overall levels of genome rearrangements were high among the teleost genomes. This study represents a first-generation linkage map constructed by using EST-derived microsatellites and SNPs, laying a framework for large-scale comparative genome analysis in catfish. The conserved syntenies identified here between the catfish and the three model fish species should facilitate structural genome analysis and evolutionary studies, but more importantly should facilitate functional inference of catfish genes. Given that determination of gene functions is difficult in nonmodel species such as catfish, functional genome analysis will have to rely heavily on the establishment of orthologies from model species. PMID:19171943

  11. Isolation and Comparative Genomic Analysis of T1-Like Shigella Bacteriophage pSf-2.

    PubMed

    Jun, Jin Woo; Kim, Hyoun Joong; Yun, Sae Kil; Chai, Ji Young; Lee, Byeong Chun; Park, Se Chang

    2016-03-01

    The increasing prevalence of antibiotic-resistant Shigella sp. emphasizes that alternatives to conventional antibiotics are needed. A Siphoviridae bacteriophage (phage), pSf-2, infecting S. flexneri ATCC(®) 12022 was isolated from Geolpocheon stream in Korea. Morphological analysis by transmission electron microscopy revealed that pSf-2 has a head of about 57 ± 4 nm in diameter with a long tail of 136 ± 3 nm in length and 15 ± 2 nm in width. One-step growth analysis revealed that pSf-2 has a latent period of 30 min and a burst size of 16 PFU/infected cell. The DNA genome of pSf-2 is composed of 50,109 bp with a G+C content of 45.44%. The genome encodes 83 putative ORFs, 19 putative promoters, and 23 transcriptional terminator regions. Genome sequence analysis of pSf-2 and comparative analysis with the homologous T1-like Shigella phages, Shfl1 and pSf-1, revealed that pSf-2 is a novel T1-like Shigella phage. These results showed that pSf-2 might have a high potential as a biocontrol agent to control shigellosis. Also, the genomic information may lead to further understanding of phage biodiversity, especially T1-like phages. PMID:26612033

  12. Complete genome sequence and comparative genome analysis of a new special Yersinia enterocolitica.

    PubMed

    Shi, Guoxiang; Su, Mingming; Liang, Junrong; Duan, Ran; Gu, Wenpeng; Xiao, Yuchun; Zhang, Zhewen; Qiu, Haiyan; Zhang, Zheng; Li, Yi; Zhang, Xiaohe; Ling, Yunchao; Song, Lai; Chen, Meili; Zhao, Yongbing; Wu, Jiayan; Jing, Huaiqi; Xiao, Jingfa; Wang, Xin

    2016-09-01

    Yersinia enterocolitica is the most diverse species in the genus Yersinia and shows more polymorphism, especially among the non-pathogenic strains. Individual non-pathogenic Y. enterocolitica strains are wrongly identified because of atypical phenotypes. In this study, we isolated an unusual Y. enterocolitica strain, LC20, from Rattus norvegicus. The strain did not utilize urea and could not be assigned to a biotype. API 20E identified it as Escherichia coli; however, it grew well at 25 °C, whereas E. coli grows well at 37 °C. We analyzed the genome of LC20 and found that the whole chromosome of LC20 was collinear with that of Y. enterocolitica 8081, and that the urease gene was absent from the genome, which is consistent with the result of API 20E. Also, the 16S and 23S rRNA genes of LC20 lay on a branch of Y. enterocolitica. Furthermore, the core-based and pan-based phylogenetic trees showed that LC20 was classified into the Y. enterocolitica cluster. Two plasmids (80 and 50 kb) from LC20 shared low genetic homology with pYV from the Yersinia genus; one was an ancestral Yersinia plasmid and the other was novel, encoding a number of transposases. Some pathogenic and non-pathogenic Y. enterocolitica-specific genes coexisted in LC20. Thus, although it could not be classified into any Y. enterocolitica biotype due to its special biochemical metabolism, we concluded that LC20 was a Y. enterocolitica strain because its genome was similar to those of other Y. enterocolitica strains, and it might be a strain with many mutations and combinations that emerged in the course of its evolution. PMID:27129539

  13. The Genome of Nosema sp. Isolate YNPr: A Comparative Analysis of Genome Evolution within the Nosema/Vairimorpha Clade.

    PubMed

    Xu, Jinshan; He, Qiang; Ma, Zhenggang; Li, Tian; Zhang, Xiaoyan; Debrunner-Vossbrinck, Bettina A; Zhou, Zeyang; Vossbrinck, Charles R

    2016-01-01

    The microsporidian parasite designated here as Nosema sp. Isolate YNPr was isolated from the cabbage butterfly Pieris rapae collected in Honghe Prefecture, Yunnan Province, China. The genome was sequenced by Illumina sequencing and compared to those of two related members of the Nosema/Vairimorpha clade, Nosema ceranae and Nosema apis. Based upon assembly statistics, the Nosema sp. YNPr genome is 3.36 x 10^6 bp with a G+C content of 23.18% and 2,075 protein coding sequences. An "ACCCTT" motif is present approximately 50 bp upstream of the start codon, as reported from other members of the clade and from Encephalitozoon cuniculi, a sister taxon. Comparative small subunit ribosomal DNA (SSU rDNA) analysis as well as genome-wide phylogenetic analysis confirms a closer relationship between N. ceranae and Nosema sp. YNPr than between the two honeybee parasites N. ceranae and N. apis. The more closely related N. ceranae and Nosema sp. YNPr show similarities in a number of structural characteristics such as gene synteny, gene length, gene number, transposon composition and gene reduction. Based on transposable element content of the assemblies, the transposon content of Nosema sp. YNPr is 4.8%, that of N. ceranae is 3.7%, and that of N. apis is 2.5%, with large differences in the types of transposons present among these 3 species. Gene function annotation indicates that the number of genes participating in most metabolic activities is similar in all three species. However, the number of genes in the transcription, general function, and cysteine protease categories is greater in N. apis than in the other two species. Our studies further characterize the evolution of the Nosema/Vairimorpha clade of microsporidia. These organisms maintain variable but very reduced genomes. We are interested in understanding the effects of genetic drift versus natural selection on genome size in the microsporidia and in developing a testable hypothesis for further studies on the genomic ecology

  14. Comparative genomic analysis of Lactobacillus plantarum ZJ316 reveals its genetic adaptation and potential probiotic profiles

    PubMed Central

    Li, Ping; Li, Xuan; Gu, Qing; Lou, Xiu-yu; Zhang, Xiao-mei; Song, Da-feng; Zhang, Chen

    2016-01-01

    Objective: In previous studies, Lactobacillus plantarum ZJ316 showed probiotic properties, such as antimicrobial activity against various pathogens and the capacity to significantly improve pig growth and pork quality. The purpose of this study was to reveal the genes potentially related to its genetic adaptation and probiotic profiles based on comparative genomic analysis. Methods: The genome sequence of L. plantarum ZJ316 was compared with those of eight L. plantarum strains deposited in GenBank. BLASTN, Mauve, and MUMmer programs were used for genome alignment and comparison. CRISPRFinder was applied for searching the clustered regularly interspaced short palindromic repeats (CRISPRs). Results: We identified genes that encode proteins related to genetic adaptation and probiotic profiles, including carbohydrate transport and metabolism, proteolytic enzyme systems and amino acid biosynthesis, CRISPR adaptive immunity, stress responses, bile salt resistance, ability to adhere to the host intestinal wall, exopolysaccharide (EPS) biosynthesis, and bacteriocin biosynthesis. Conclusions: Comparative characterization of the L. plantarum ZJ316 genome provided the genetic basis for further elucidating the functional mechanisms of its probiotic properties. ZJ316 could be considered a potential probiotic candidate. PMID:27487802

  15. The Methanosarcina barkeri genome: comparative analysis with Methanosarcina acetivorans and Methanosarcina mazei reveals extensive rearrangement within methanosarcinal genomes

    SciTech Connect

    Maeder, Dennis L.; Anderson, Iain; Brettin, Thomas S.; Bruce, David C.; Gilna, Paul; Han, Cliff S.; Lapidus, Alla; Metcalf, William W.; Saunders, Elizabeth; Tapia, Roxanne; Sowers, Kevin R.

    2006-05-19

    We report here a comparative analysis of the genome sequence of Methanosarcina barkeri with those of Methanosarcina acetivorans and Methanosarcina mazei. All three genomes share a conserved double origin of replication and many gene clusters. M. barkeri is distinguished by having an organization that is well conserved with respect to the other Methanosarcinae in the region proximal to the origin of replication, with interspecies gene similarities as high as 95%. However, it is disordered and marked by increased transposase frequency and decreased gene synteny and gene density in the distal semi-genome. Of the 3680 open reading frames (ORFs) in M. barkeri, 678 had paralogs with better than 80% similarity to both M. acetivorans and M. mazei, while 128 nonhypothetical ORFs were unique (non-paralogous) amongst these species, including a complete formate dehydrogenase operon, two genes required for N-acetylmuramic acid synthesis, a 14-gene gas vesicle cluster and a bacterial P450-specific ferredoxin reductase cluster not previously observed or characterized in this genus. A cryptic 36 kbp plasmid sequence was detected in M. barkeri that contains an orc1 gene flanked by a presumptive origin of replication consisting of 38 tandem repeats of a 143 nt motif. Three-way comparison of these genomes reveals differing mechanisms for the accrual of changes. Elongation of the large M. acetivorans genome is the result of multiple gene-scale insertions and duplications uniformly distributed in that genome, while M. barkeri is characterized by localized inversions associated with the loss of gene content. In contrast, the relatively short M. mazei genome most closely approximates the ancestral organizational state.

  16. Genome-level identification, gene expression, and comparative analysis of porcine β-defensin genes

    PubMed Central

    2012-01-01

    Background: Beta-defensins (β-defensins) are innate immune peptides with evolutionary conservation across a wide range of species and have been suggested to play important roles in innate immune reactions against pathogens. However, the complete β-defensin repertoire in the pig has not been fully addressed. Result: A BLAST analysis was performed against the available pig genomic sequence in the NCBI database to identify β-defensin-related sequences using previously reported β-defensin sequences of pigs, humans, and cattle. The porcine β-defensin gene clusters were mapped to chromosomes 7, 14, 15 and 17. The gene expression analysis of 17 newly annotated porcine β-defensin genes across 15 tissues using semi-quantitative reverse transcription polymerase chain reaction (RT-PCR) showed differences in their tissue distribution, with the kidney and testis having the largest pBD expression repertoire. We also analyzed single nucleotide polymorphisms (SNPs) in the mature peptide region of pBD genes from 35 pigs of 7 breeds. We found 8 cSNPs in 7 pBDs. Conclusion: We identified 29 porcine β-defensin (pBD) gene-like sequences, including 17 unreported pBDs in the porcine genome. Comparative analysis of β-defensin genes in the pig genome with those in human and cattle genomes showed structural conservation of β-defensin syntenic regions among these species. PMID:23150902

  17. Comparative Analysis of 35 Basidiomycete Genomes Reveals Diversity and Uniqueness of the Phylum

    SciTech Connect

    Riley, Robert; Salamov, Asaf; Otillar, Robert; Fagnan, Kirsten; Boussau, Bastien; Brown, Daren; Henrissat, Bernard; Levasseur, Anthony; Held, Benjamin; Nagy, Laszlo; Floudas, Dimitris; Morin, Emmanuelle; Manning, Gerard; Baker, Scott; Martin, Francis; Blanchette, Robert; Hibbett, David; Grigoriev, Igor V.

    2013-03-11

    Fungi of the phylum Basidiomycota (basidiomycetes) make up some 37% of the described fungi, and are important in forestry, agriculture, medicine, and bioenergy. This diverse phylum includes symbionts, pathogens, and saprobes including wood decaying fungi. To better understand the diversity of this phylum we compared the genomes of 35 basidiomycete fungi including 6 newly sequenced genomes. The genomes of basidiomycetes span extremes of genome size, gene number, and repeat content. A phylogenetic tree of Basidiomycota was generated using the Phyldog software, which uses all available protein sequence data to simultaneously infer gene and species trees. Analysis of core genes reveals that some 48% of basidiomycete proteins are unique to the phylum with nearly half of those (22%) comprising proteins found in only one organism. Phylogenetic patterns of plant biomass-degrading genes suggest a continuum rather than a sharp dichotomy between the white rot and brown rot modes of wood decay among the members of the Agaricomycotina subphylum. There is a correlation of the profile of certain gene families to nutritional mode in Agaricomycotina. Based on phylogenetically-informed PCA of such profiles, we predict that Botryobasidium botryosum and Jaapia argillacea have properties similar to white rot species, although neither has ligninolytic class II fungal peroxidases. Furthermore, we find that both fungi exhibit wood decay with white rot-like characteristics in growth assays. Analysis of the rate of discovery of proteins with no or few homologs suggests the high value of continued sequencing of basidiomycete fungi.

  18. Genome Sequencing and Comparative Analysis of the Biocontrol Agent Trichoderma harzianum sensu stricto TR274

    SciTech Connect

    Steindorff, Andrei S.; Noronha, Elilane F.; Ulhoa, Cirano J.; Kuo, Alan; Salamov, Asaf A.; Haridas, Sajeet; Riley, Robert W.; Druzhinina, Irina S.; Kubicek, Christian P.; Grigoriev, Igor V.

    2015-03-17

    Biological control is a complex process which requires many mechanisms and a high diversity of biochemical pathways. The species of Trichoderma harzianum are well known for their biocontrol activity against many plant pathogens. To gain new insights into the biocontrol mechanism used by T. harzianum, we sequenced the isolate TR274 genome using Illumina. The assembly was performed using AllPaths-LG with a maximum coverage of 100x. The assembly resulted in 2282 contigs with an N50 of 37,033 bp. The genome size generated was 40.8 Mb and the GC content was 47.7%, similar to other Trichoderma genomes. Using the JGI Annotation Pipeline we predicted 13,932 genes with high transcriptome support. CEGMA tests suggested 100% genome completeness and 97.9% of RNA-Seq reads were mapped to the genome. The phylogenetic comparison using orthologous proteins with all Trichoderma genomes sequenced at JGI corroborates the Trichoderma (T. asperellum and T. atroviride), Longibrachiatum (T. reesei and T. longibrachiatum) and Pachybasium (T. harzianum and T. virens) section division described previously. The comparison between two Trichoderma harzianum species suggests a high genome similarity but some strain-specific expansions. Analyses of the secondary metabolites, CAZymes, transporters, proteases, and transcription factors were performed. The Pachybasium section expanded virtually all categories analyzed compared with the other sections, especially the Longibrachiatum section, which shows a clear contraction. These results suggest that these protein families have an important role in their respective phenotypes. Future analysis will improve the understanding of this complex genus and give some insights about its lifestyle and the interactions with the environment.
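
    The assembly statistics quoted in this record (N50 and GC content) follow simple definitions that can be sketched in a few lines. The Python below is only an illustration on toy data: the function names n50 and gc_content are hypothetical, and nothing here reproduces the AllPaths-LG or JGI pipeline output.

```python
def n50(contig_lengths):
    """Return the N50: the length L such that contigs of length >= L
    together cover at least half of the total assembly size."""
    if not contig_lengths:
        raise ValueError("no contigs given")
    total = sum(contig_lengths)
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def gc_content(sequence):
    """Return the GC fraction of a DNA sequence, ignoring non-ACGT symbols."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    return gc / acgt if acgt else 0.0


# Toy values for illustration only; not taken from the TR274 assembly.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_lengths))
print(gc_content("ATGCGCNNAT"))
```

    On a real assembly the contig lengths would be read from the FASTA file; ambiguous bases (N) are excluded from the GC denominator in this sketch, which is one common convention but not the only one.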

  19. Comparative genomic analysis of multiple strains of two unusual plant pathogens: Pseudomonas corrugata and Pseudomonas mediterranea

    PubMed Central

    Trantas, Emmanouil A.; Licciardello, Grazia; Almeida, Nalvo F.; Witek, Kamil; Strano, Cinzia P.; Duxbury, Zane; Ververidis, Filippos; Goumas, Dimitrios E.; Jones, Jonathan D. G.; Guttman, David S.; Catara, Vittoria; Sarris, Panagiotis F.

    2015-01-01

    The non-fluorescent pseudomonads, Pseudomonas corrugata (Pcor) and P. mediterranea (Pmed), are closely related species that cause pith necrosis, a disease of tomato that causes severe crop losses. However, they also show strong antagonistic effects against economically important pathogens, demonstrating their potential for utilization as biological control agents. In addition, their metabolic versatility makes them attractive for the production of commercial biomolecules and bioremediation. An extensive comparative genomics study is required to dissect the mechanisms that Pcor and Pmed employ to cause disease, prevent disease caused by other pathogens, and to mine their genomes for genes that encode proteins involved in commercially important chemical pathways. Here, we present the draft genomes of nine Pcor and Pmed strains from different geographical locations. This analysis covered significant genetic heterogeneity and allowed in-depth genomic comparison. All examined strains were able to trigger symptoms in tomato plants but not all induced a hypersensitive-like response in Nicotiana benthamiana. Genome-mining revealed the absence of type III secretion system and known type III effector-encoding genes from all examined Pcor and Pmed strains. The lack of a type III secretion system appears to be unique among the plant pathogenic pseudomonads. Several gene clusters coding for type VI secretion system were detected in all genomes. Genome-mining also revealed the presence of gene clusters for biosynthesis of siderophores, polyketides, non-ribosomal peptides, and hydrogen cyanide. A highly conserved quorum sensing system was detected in all strains, although species specific differences were observed. Our study provides the basis for in-depth investigations regarding the molecular mechanisms underlying virulence strategies in the battle between plants and microbes. PMID:26300874

  20. Expanding the repertoire of secretory peptides controlling root development with comparative genome analysis and functional assays

    PubMed Central

    Ghorbani, Sarieh; Lin, Yao-Cheng; Parizot, Boris; Fernandez, Ana; Njo, Maria Fransiska; Van de Peer, Yves; Beeckman, Tom; Hilson, Pierre

    2015-01-01

    Plant genomes encode numerous small secretory peptides (SSPs) whose functions have yet to be explored. Based on structural features that characterize SSP families known to take part in postembryonic development, this comparative genome analysis resulted in the identification of genes coding for oligopeptides potentially involved in cell-to-cell communication. Because genome annotation based on short sequence homology is difficult, the criteria for the de novo identification and aggregation of conserved SSP sequences were first benchmarked across five reference plant species. The resulting gene families were then extended to 32 genome sequences, including major crops. The global phylogenetic pattern common to the functionally characterized SSP families suggests that their appearance and expansion coincide with that of the land plants. The SSP families can be searched online for members, sequences and consensus (http://bioinformatics.psb.ugent.be/webtools/PlantSSP/). Looking for putative regulators of root development, Arabidopsis thaliana SSP genes were further selected through transcriptome meta-analysis based on their expression at specific stages and in specific cell types in the course of lateral root formation. As an additional indication that formerly uncharacterized SSPs may control development, this study showed that root growth and branching were altered by the application of synthetic peptides matching conserved SSP motifs, sometimes in very specific ways. The strategy used in the study, combining comparative genomics, transcriptome meta-analysis and peptide functional assays in planta, pinpoints factors potentially involved in non-cell-autonomous regulatory mechanisms. A similar approach can be implemented in different species for the study of a wide range of developmental programmes. PMID:26195730

  1. Comparative Genomic Analysis of Drosophila melanogaster and Vector Mosquito Developmental Genes

    PubMed Central

    Behura, Susanta K.; Haugen, Morgan; Flannery, Ellen; Sarro, Joseph; Tessier, Charles R.; Severson, David W.; Duman-Scheel, Molly

    2011-01-01

    Genome sequencing projects have presented the opportunity for analysis of developmental genes in three vector mosquito species: Aedes aegypti, Culex quinquefasciatus, and Anopheles gambiae. A comparative genomic analysis of developmental genes in Drosophila melanogaster and these three important vectors of human disease was performed in this investigation. While the study was comprehensive, special emphasis centered on genes that 1) are components of developmental signaling pathways, 2) regulate fundamental developmental processes, 3) are critical for the development of tissues of vector importance, 4) function in developmental processes known to have diverged within insects, and 5) encode microRNAs (miRNAs) that regulate developmental transcripts in Drosophila. While most fruit fly developmental genes are conserved in the three vector mosquito species, several genes known to be critical for Drosophila development were not identified in one or more mosquito genomes. In other cases, mosquito lineage-specific gene gains with respect to D. melanogaster were noted. Sequence analyses also revealed that numerous repetitive sequences are a common structural feature of Drosophila and mosquito developmental genes. Finally, analysis of predicted miRNA binding sites in fruit fly and mosquito developmental genes suggests that the repertoire of developmental genes targeted by miRNAs is species-specific. The results of this study provide insight into the evolution of developmental genes and processes in dipterans and other arthropods, serve as a resource for those pursuing analysis of mosquito development, and will promote the design and refinement of functional analysis experiments. PMID:21754989

  2. Comparative 3D Genome Structure Analysis of the Fission and the Budding Yeast

    PubMed Central

    Gong, Ke; Tjong, Harianto; Zhou, Xianghong Jasmine; Alber, Frank

    2015-01-01

    We studied the 3D structural organization of the fission yeast genome, which emerges from the tethering of heterochromatic regions in otherwise randomly configured chromosomes represented as flexible polymer chains in a nuclear environment. This model is sufficient to explain in a statistical manner many experimentally determined distinctive features of the fission yeast genome, including chromatin interaction patterns from Hi-C experiments and the co-locations of functionally related and co-expressed genes, such as genes expressed by Pol-III. Our findings demonstrate that some previously described structure-function correlations can be explained as a consequence of random chromatin collisions driven by a few geometric constraints (mainly due to centromere-SPB and telomere-NE tethering) combined with the specific gene locations in the chromosome sequence. We also performed a comparative analysis between the fission and budding yeast genome structures, for which we previously detected a similar organizing principle. However, due to the different chromosome sizes and numbers, substantial differences are observed in the 3D structural genome organization between the two species, most notably in the nuclear locations of orthologous genes, and the extent of nuclear territories for genes and chromosomes. However, despite those differences, remarkably, functional similarities are maintained, which is evident when comparing spatial clustering of functionally related genes in both yeasts. Functionally related genes show a similar spatial clustering behavior in both yeasts, even though their nuclear locations are largely different between the yeast species. PMID:25799503

  3. Comparative genomic analysis of Helicobacter pylori from Malaysia identifies three distinct lineages suggestive of differential evolution

    PubMed Central

    Kumar, Narender; Mariappan, Vanitha; Baddam, Ramani; Lankapalli, Aditya K.; Shaik, Sabiha; Goh, Khean-Lee; Loke, Mun Fai; Perkins, Tim; Benghezal, Mohammed; Hasnain, Seyed E.; Vadivelu, Jamuna; Marshall, Barry J.; Ahmed, Niyaz

    2015-01-01

    The discordant prevalence of Helicobacter pylori and its related diseases, for a long time, fostered certain enigmatic situations observed in the countries of the southern world. Variation in H. pylori infection rates and disease outcomes among different populations in multi-ethnic Malaysia provides a unique opportunity to understand dynamics of host–pathogen interaction and genome evolution. In this study, we extensively analyzed and compared genomes of 27 Malaysian H. pylori isolates and identified three major phylogeographic lineages: hspEastAsia, hpEurope and hpSouthIndia. The analysis of the virulence genes within the core genome, however, revealed a comparable pathogenic potential of the strains. In addition, we identified four genes limited to strains of East-Asian lineage. Our analyses identified a few strain-specific genes encoding restriction modification systems and outlined 311 core genes possibly under differential evolutionary constraints, among the strains representing different ethnic groups. The cagA and vacA genes also showed variations in accordance with the host genetic background of the strains. Moreover, restriction modification genes were found to be significantly enriched in East-Asian strains. An understanding of these variations in the genome content would provide significant insights into various adaptive and host modulation strategies harnessed by H. pylori to effectively persist in a host-specific manner. PMID:25452339

  4. Comparative Analysis of Mitochondrial Genomes of Five Aphid Species (Hemiptera: Aphididae) and Phylogenetic Implications

    PubMed Central

    Wang, Yuan; Huang, Xiao-Lei; Qiao, Ge-Xia

    2013-01-01

    Insect mitochondrial genomes (mitogenomes) are of great interest in exploring molecular evolution, phylogenetics and population genetics. Only two mitogenomes have been previously released in the insect group Aphididae, which consists of about 5,000 known species including some agricultural, forestry and horticultural pests. Here we report the complete 16,317 bp mitogenome of Cavariella salicicola and two nearly complete mitogenomes of Aphis glycines and Pterocomma pilosum. We also present a first comparative analysis of mitochondrial genomes of aphids. Results showed that aphid mitogenomes share conserved genomic organization, nucleotide and amino acid composition, and codon usage features. All 37 genes usually present in animal mitogenomes were sequenced and annotated. The analysis of gene evolutionary rate revealed the lowest and highest rates for COI and ATP8, respectively. A unique repeat region exclusively in aphid mitogenomes, which included variable numbers of tandem repeats in a lineage-specific manner, was highlighted for the first time. This region may have a function as another origin of replication. Phylogenetic reconstructions based on protein-coding genes and the stem-loop structures of control regions confirmed a sister relationship between Cavariella and pterocommatines. Current evidence suggests that pterocommatines could be formally transferred into Macrosiphini. Our paper also offers methodological instructions for obtaining other Aphididae mitochondrial genomes. PMID:24147014

  5. Comparative Genomic Analysis of Mycobacterium tuberculosis Drug Resistant Strains from Russia

    PubMed Central

    Ilina, Elena N.; Shitikov, Egor A.; Ikryannikova, Larisa N.; Alekseev, Dmitry G.; Kamashev, Dmitri E.; Malakhova, Maja V.; Parfenova, Tatjana V.; Afanas’ev, Maxim V.; Ischenko, Dmitry S.; Bazaleev, Nikolai A.; Smirnova, Tatjana G.; Larionova, Elena E.; Chernousova, Larisa N.; Beletsky, Alexey V.; Mardanov, Andrei V.; Ravin, Nikolai V.; Skryabin, Konstantin G.; Govorun, Vadim M.

    2013-01-01

    Tuberculosis caused by multidrug-resistant (MDR) and extensively drug-resistant (XDR) Mycobacterium tuberculosis (MTB) strains is a growing problem in many countries. The availability of the complete nucleotide sequences of several MTB genomes allows comparative genomics to be used as a tool to study the relationships of strains and differences in their evolutionary history, including acquisition of drug resistance. In our work, we sequenced three genomes of Russian MTB strains of different phenotypes – drug susceptible, MDR and XDR. Of these, the MDR and XDR strains were collected in Tomsk (Siberia, Russia) during the local TB outbreak in 1998–1999 and belonged to the rare KQ and KY families in accordance with IS6110 typing, which are considered endemic for Russia. Based on phylogenetic analysis, our isolates belonged to different genetic families, Beijing, Ural and LAM, which made the direct comparison of their genomes impossible. For this reason we performed their comparison in the broader context of all M. tuberculosis genomes available in GenBank. The list of unique individual non-synonymous SNPs for each sequenced isolate was formed by comparison with all SNPs detected within the same phylogenetic group. For further functional analysis, all proteins with unique SNPs were ascribed to 20 different functional classes based on Clusters of Orthologous Groups (COG). We have confirmed the drug-resistant status of our isolates, which harbored almost all known drug-resistance associated mutations. Unique SNPs of the XDR isolate CTRI-4XDR, belonging to the Beijing family, were compared in more detail with SNPs of 14 additional Russian XDR strains of the same family. Only type-specific mutations in genes of the repair, replication and recombination system (COG category L) were found common within this group. The other unique SNPs discovered in CTRI-4XDR may have an important role in adaptation of this microorganism to its surroundings and in escape from antituberculosis drugs

  6. MultiMetEval: Comparative and Multi-Objective Analysis of Genome-Scale Metabolic Models

    PubMed Central

    Gevorgyan, Albert; Kierzek, Andrzej M.; Breitling, Rainer; Takano, Eriko

    2012-01-01

    Comparative metabolic modelling is emerging as a novel field, supported by the development of reliable and standardized approaches for constructing genome-scale metabolic models in high throughput. New software solutions are needed to allow efficient comparative analysis of multiple models in the context of multiple cellular objectives. Here, we present the user-friendly software framework Multi-Metabolic Evaluator (MultiMetEval), built upon SurreyFBA, which allows the user to compose collections of metabolic models that together can be subjected to flux balance analysis. Additionally, MultiMetEval implements functionalities for multi-objective analysis by calculating the Pareto front between two cellular objectives. Using a previously generated dataset of 38 actinobacterial genome-scale metabolic models, we show how these approaches can lead to exciting novel insights. Firstly, after incorporating several pathways for the biosynthesis of natural products into each of these models, comparative flux balance analysis predicted that species like Streptomyces that harbour the highest diversity of secondary metabolite biosynthetic gene clusters in their genomes do not necessarily have the metabolic network topology most suitable for compound overproduction. Secondly, multi-objective analysis of biomass production and natural product biosynthesis in these actinobacteria shows that the well-studied occurrence of discrete metabolic switches during the change of cellular objectives is inherent to their metabolic network architecture. Comparative and multi-objective modelling can lead to insights that could not be obtained by normal flux balance analyses. MultiMetEval provides a powerful platform that makes these analyses straightforward for biologists. Sources and binaries of MultiMetEval are freely available from https://github.com/PiotrZakrzewski/MetEval/downloads. PMID:23272111
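
    The Pareto front mentioned in this record is the set of trade-off points that cannot be improved on one cellular objective without losing on the other. The Python sketch below only filters a list of already-computed (objective 1, objective 2) pairs for non-dominated points; it is not MultiMetEval's method, which obtains such points from flux balance analysis. The function name pareto_front and the toy values are assumptions made for illustration.

```python
def pareto_front(points):
    """Return the non-dominated points when both objectives are maximised.

    A point is dominated if another point is at least as good on both
    objectives and strictly better on at least one of them.
    """
    front = []
    best_second = float("-inf")
    # Scan by first objective (descending), ties broken by second (descending).
    for first, second in sorted(set(points), key=lambda p: (-p[0], -p[1])):
        # Keep a point only if it beats every earlier point on objective 2.
        if second > best_second:
            front.append((first, second))
            best_second = second
    return front


# Hypothetical (biomass, product) flux pairs; not taken from any real model.
candidates = [(1.0, 0.0), (0.8, 0.3), (0.6, 0.3), (0.5, 0.6), (0.2, 0.5), (0.0, 0.9)]
print(pareto_front(candidates))
```

    In this toy set, (0.6, 0.3) is dominated by (0.8, 0.3) and (0.2, 0.5) by (0.5, 0.6), so neither appears on the front.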

  7. Whole-Genome Sequencing and Comparative Genome Analysis of Bacillus subtilis Strains Isolated from Non-Salted Fermented Soybean Foods

    PubMed Central

    Kamada, Mayumi; Hase, Sumitaka; Fujii, Kazushi; Miyake, Masato; Sato, Kengo; Kimura, Keitarou; Sakakibara, Yasubumi

    2015-01-01

    Bacillus subtilis is the main component in the fermentation of soybeans. To investigate the genetics of the soybean-fermenting B. subtilis strains and their relationship with the productivity of extracellular poly-γ-glutamic acid (γPGA), we sequenced the whole genome of eight B. subtilis strains isolated from non-salted fermented soybean foods in Southeast Asia. Assembled nucleotide sequences were compared with those of a natto (fermented soybean food) starter strain B. subtilis BEST195 and the laboratory standard strain B. subtilis 168 that is incapable of γPGA production. Detected variants were investigated in terms of insertion sequences, biotin synthesis, production of subtilisin NAT, and regulatory genes for γPGA synthesis, which are related to the fermentation process. Comparing genome sequences, we found that the strains that produce γPGA have a deletion in a protein that constitutes the flagellar basal body, and this deletion was not found in the non-producing strains. We further identified diversity in variants of the bio operon, which is responsible for the biotin auxotrophism of the natto starter strains. Phylogenetic analysis using multilocus sequence typing revealed that the B. subtilis strains isolated from the non-salted fermented soybeans were not clustered together, while the natto-fermenting strains were tightly clustered; this analysis also suggested that the strain isolated from “Tua Nao” of Thailand traces a different evolutionary process from other strains. PMID:26505996

  8. Whole-Genome Sequencing and Comparative Genome Analysis of Bacillus subtilis Strains Isolated from Non-Salted Fermented Soybean Foods.

    PubMed

    Kamada, Mayumi; Hase, Sumitaka; Fujii, Kazushi; Miyake, Masato; Sato, Kengo; Kimura, Keitarou; Sakakibara, Yasubumi

    2015-01-01

    Bacillus subtilis is the main component in the fermentation of soybeans. To investigate the genetics of the soybean-fermenting B. subtilis strains and their relationship with the productivity of extracellular poly-γ-glutamic acid (γPGA), we sequenced the whole genome of eight B. subtilis strains isolated from non-salted fermented soybean foods in Southeast Asia. Assembled nucleotide sequences were compared with those of a natto (fermented soybean food) starter strain B. subtilis BEST195 and the laboratory standard strain B. subtilis 168 that is incapable of γPGA production. Detected variants were investigated in terms of insertion sequences, biotin synthesis, production of subtilisin NAT, and regulatory genes for γPGA synthesis, which are related to the fermentation process. Comparing genome sequences, we found that the strains that produce γPGA have a deletion in a protein that constitutes the flagellar basal body, and this deletion was not found in the non-producing strains. We further identified diversity in variants of the bio operon, which is responsible for the biotin auxotrophism of the natto starter strains. Phylogenetic analysis using multilocus sequence typing revealed that the B. subtilis strains isolated from the non-salted fermented soybeans were not clustered together, while the natto-fermenting strains were tightly clustered; this analysis also suggested that the strain isolated from "Tua Nao" of Thailand traces a different evolutionary process from other strains. PMID:26505996

  9. New insights on the biology of swine respiratory tract mycoplasmas from a comparative genome analysis

    PubMed Central

    2013-01-01

    Background: Mycoplasma hyopneumoniae, Mycoplasma flocculare and Mycoplasma hyorhinis live in swine respiratory tracts. M. flocculare, a commensal bacterium, is genetically closely related to M. hyopneumoniae, the causative agent of enzootic porcine pneumonia. M. hyorhinis is also pathogenic, causing polyserositis and arthritis. In this work, we present the genome sequences of M. flocculare and M. hyopneumoniae strain 7422, and we compare these genomes with the genomes of other M. hyopneumoniae strains and with a M. hyorhinis genome. These analyses were performed to identify possible characteristics that may help to explain the different behaviors of these species in swine respiratory tracts. Results: The overall genome organization of the three species was analyzed, revealing that the ORF clusters (OCs) differ considerably and that inversions and rearrangements are common. Although M. flocculare and M. hyopneumoniae display a high degree of similarity with respect to the gene content, only some genomic regions display considerable synteny. Genes encoding proteins that may be involved in host-cell adhesion in M. hyopneumoniae and M. flocculare display differences in genomic structure and organization. Some genes encoding adhesins of the P97 family are absent in M. flocculare and some contain sequence differences or lack domains that are considered to be important for adhesion to host cells. The phylogenetic relationship of the three species was confirmed by a phylogenomic approach. The set of genes involved in metabolism, especially in the uptake of precursors for nucleic acid synthesis and nucleotide metabolism, displays some differences in copy number and in presence/absence among the three species. Conclusions: The comparative analyses of three mycoplasma species that inhabit the swine respiratory tract facilitated the identification of some characteristics that may be related to their different behaviors. M. hyopneumoniae and M. flocculare display many differences

  10. Genome Sequence and Comparative Genomics Analysis of a Vibrio cholerae O1 Strain Isolated from a Cholera Patient in Malaysia

    PubMed Central

    Osama, Abdulrazak; Gan, Han Ming; Teh, Cindy Shuan Ju; Yap, Kien-Pong

    2012-01-01

    The genome sequence analysis of a clinical Vibrio cholerae VC35 strain from an outbreak case in Malaysia indicates multiple genes involved in host adaptation and a novel Na+-driven multidrug efflux pump-coding gene in the genome of Vibrio cholerae with the highest similarity to VMA_001754 of Vibrio mimicus VMA223. PMID:23209200

  11. Comparative Genome Analysis of the Pathogenic Spirochetes Borrelia burgdorferi and Treponema pallidum

    PubMed Central

    Subramanian, G.; Koonin, Eugene V.; Aravind, L.

    2000-01-01

    A comparative analysis of the predicted protein sequences encoded in the complete genomes of Borrelia burgdorferi and Treponema pallidum provides a number of insights into evolutionary trends and adaptive strategies of the two spirochetes. A measure of orthologous relationships between gene sets, termed the orthology coefficient (OC), was developed. The overall OC value for the gene sets of the two spirochetes is about 0.43, which means that less than one-half of the genes show readily detectable orthologous relationships. This emphasizes significant divergence between the two spirochetes, apparently driven by different biological niches. Different functional categories of proteins as well as different protein families show a broad distribution of OC values, from near 1 (a perfect, one-to-one correspondence) to near 0. The proteins involved in core biological functions, such as genome replication and expression, typically show high OC values. In contrast, marked variability is seen among proteins that are involved in specific processes, such as nutrient transport, metabolism, gene-specific transcription regulation, signal transduction, and host response. Differences in the gene complements encoded in the two spirochete genomes suggest active adaptive evolution for their distinct niches. Comparative analysis of the spirochete genomes produced evidence of gene exchanges with other bacteria, archaea, and eukaryotic hosts that seem to have occurred at different points in the evolution of the spirochetes. Examples are presented of the use of sequence profile analysis to predict proteins that are likely to play a role in pathogenesis, including secreted proteins that contain specific protein-protein interaction domains, such as von Willebrand A, YWTD, TPR, and PR1, some of which hitherto have been reported only in eukaryotes. We tentatively reconstruct the likely evolutionary process that has led to the divergence of the two spirochete lineages; this reconstruction seems

  12. Comparative and phylogenetic analysis of the mitochondrial genomes in basal hymenopterans

    PubMed Central

    Song, Sheng-Nan; Tang, Pu; Wei, Shu-Jun; Chen, Xue-Xin

    2016-01-01

    The Symphyta is traditionally accepted as a paraphyletic group located in a basal position of the order Hymenoptera. Herein, we conducted a comparative analysis of the mitochondrial genomes in the Symphyta by describing two newly sequenced ones, from Trichiosoma anthracinum, representing the first mitochondrial genome in family Cimbicidae, and Asiemphytus rufocephalus, from family Tenthredinidae. The sequenced lengths of these two mitochondrial genomes were 15,392 and 14,864 bp, respectively. Within the sequenced region, trnC and trnY were rearranged to the upstream of trnI-nad2 in T. anthracinum, while in A. rufocephalus all sequenced genes were arranged in the putative insect ancestral gene arrangement. Rearrangement of the tRNA genes is common in the Symphyta. The rearranged genes are mainly from trnL1 and two tRNA clusters of trnI-trnQ-trnM and trnW-trnC-trnY. The mitochondrial genomes of Symphyta show a biased usage of A and T rather than G and C. Protein-coding genes in Symphyta species show a lower evolutionary rate than those of Apocrita. The Ka/Ks ratios were all less than 1, indicating purifying selection of Symphyta species. Phylogenetic analyses supported the paraphyly and basal position of Symphyta in Hymenoptera. The well-supported phylogenetic relationship in the study is Tenthredinoidea + (Cephoidea + (Orussoidea + Apocrita)). PMID:26879745
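
    Two of the quantities in this record have simple arithmetic behind them: the A+T bias of a sequence, and the reading of a Ka/Ks ratio below 1 as purifying selection. The Python sketch below illustrates both under stated assumptions: the function names are hypothetical, the Ka and Ks values are taken as already estimated by some other tool, and the input numbers are toy values rather than data from the study.

```python
def at_content(sequence):
    """Return the A+T fraction of a DNA sequence, ignoring non-ACGT symbols."""
    seq = sequence.upper()
    at = seq.count("A") + seq.count("T")
    acgt = at + seq.count("G") + seq.count("C")
    return at / acgt if acgt else 0.0


def selection_regime(ka, ks, tolerance=1e-9):
    """Classify a gene by its Ka/Ks ratio using the conventional reading:
    below 1 purifying, about 1 neutral, above 1 positive selection."""
    if ks <= 0:
        raise ValueError("Ks must be positive for the ratio to be defined")
    ratio = ka / ks
    if abs(ratio - 1.0) <= tolerance:
        return "neutral"
    return "purifying" if ratio < 1.0 else "positive"


# Toy inputs for illustration only.
print(at_content("AATTGC"))
print(selection_regime(ka=0.05, ks=0.5))
```

    A ratio close to 1 is treated as neutral only within a small numerical tolerance here; a real analysis would use a statistical test rather than a fixed cutoff.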

  13. Comparative karyotyping as a tool for genome structure analysis of Trypanosoma cruzi.

    PubMed

    Branche, Carole; Ochaya, Stephen; Aslund, Lena; Andersson, Björn

    2006-05-01

    As a part of the Trypanosoma cruzi genome project, 239 genetic markers were hybridised to PFGE-separated DNA from T. cruzi, in order to determine the number and size of chromosomes and to aid the assembly of the genome sequence. We used three strains, T. cruzi IIe CL Brener (the genome project reference strain) and two T. cruzi I strains, Sylvio X10/7 and CAI/72, to perform a comparative study of their karyotypes and to determine marker linkage. A densitometry analysis of the separations estimated the total chromosome numbers to be 55 in CL Brener and 57 in the two other strains. In all, 45 markers hybridised to single chromosomal bands and 103 markers to two bands in CL Brener, while the numbers of markers in Sylvio X10/7 and CAI/72 were 102/68 and 61/105, respectively. Size differences between homologous chromosomes were often large, up to 1900 kb (173%). The average difference was 36% for CL Brener and 23.5% for the T. cruzi I strains. Larger differences in CL Brener are consistent with a recent hybrid origin. Forty markers distributed into 15 linkage groups were found to identify specific chromosomes or chromosome pairs. While the same markers are generally linked in all three strains, the sizes of the chromosomes vary extensively, indicating large chromosomal rearrangements. These data provide valuable information for the finishing of the CL Brener genome sequence. PMID:16481054

  14. Comparative in-silico genome analysis of Leishmania (Leishmania) donovani: A step towards its species specificity

    PubMed Central

    S., Satheesh Kumar; R.K., Gokulasuriyan; Ghosh, Monidipa

    2014-01-01

    Comparative genome analysis of the recently sequenced Leishmania (L.) donovani has not been explored so far. The present study deals with the complete scanning of the L. (L.) donovani genome, revealing its interspecies variations. Sixty genes present only in L. (L.) donovani were identified when the whole genome was compared with Leishmania (L.) infantum. Similarly, 72, 159, and 265 species-specific genes were identified in L. (L.) donovani when compared to Leishmania (L.) major, Leishmania (L.) mexicana and Leishmania (Viannia) braziliensis, respectively. The cross comparison of L. (L.) donovani in parallel with the other sequenced Leishmania species led to the identification of 55 genes which are highly specific and expressed exclusively in L. (L.) donovani. The discrepancies were found mainly in surface proteins such as amastins, proteases, and peptidases. In addition, 415 repeat-containing proteins in L. (L.) donovani, which might have a potential role during pathogenesis, were identified together with their differential distribution in other Leishmania species. The genes identified can be evaluated as drug targets for anti-leishmanial treatment, exploring the scope for extensive future investigations. PMID:25606461
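
    The pairwise and cross-species comparisons described in this abstract amount to set differences over gene identifiers. The following is a minimal sketch only, assuming genes have already been reduced to shared ortholog-group identifiers; the function name, species keys and toy IDs are hypothetical and are not taken from the study, which does not describe its pipeline here.

```python
def species_specific(target, others):
    """Return (per_species, exclusive) gene sets for a target genome.

    target -- set of ortholog-group IDs found in the genome of interest
    others -- dict mapping a species name to its set of ortholog-group IDs

    per_species[name] holds the target genes missing from that one species;
    exclusive holds the target genes missing from every other species.
    """
    per_species = {name: target - genes for name, genes in others.items()}
    exclusive = target.difference(*others.values())
    return per_species, exclusive


# Toy data (hypothetical IDs, for illustration only).
donovani = {"g1", "g2", "g3", "g4", "g5"}
others = {
    "infantum": {"g1", "g2", "g3", "g4"},
    "major": {"g1", "g2", "g3"},
}

per_species, exclusive = species_specific(donovani, others)
print(sorted(per_species["infantum"]))  # genes absent from the first comparison genome
print(sorted(per_species["major"]))     # genes absent from the second comparison genome
print(sorted(exclusive))                # genes absent from both
```

    The exclusive set can never be larger than any of the pairwise sets, which is consistent with the abstract reporting fewer genes for the cross comparison (55) than for any single pairwise comparison.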

  15. Evaluation of Breast Cancer Polyclonality by Combined Chromosome Banding and Comparative Genomic Hybridization Analysis

    PubMed Central

    Teixeira, Manuel R; Tsarouha, Haroula; Kraggerud, Sigrid M; Pandis, Nikos; Dimitriadis, Euthymios; Andersen, Johan A; Lothe, Ragnhild A; Heim, Sverre

    2001-01-01

    Abstract Cytogenetically unrelated clones have been detected by chromosome banding analysis in many breast carcinomas. Because these karyotypic studies were performed on short-term cultured samples, it may be argued that in vitro selection occurred or that small clones may have arisen during culturing. To address this issue, we analyzed 37 breast carcinomas by G-banding and comparative genomic hybridization (CGH), a fluorescent in situ hybridization-based screening technique that does not require culturing or tumor metaphases. All but two of the 37 karyotypically abnormal cases presented copy number changes by CGH. The picture of genomic alterations revealed by the two techniques overlapped only partly. Sometimes the CGH analysis revealed genomic imbalances that belonged to cell populations not picked up by the cytogenetic analysis and in other cases, especially when the karyotypes had many markers and chromosomes with additional material of unknown origin, CGH gave a more reliable overall picture of the copy number gains and losses. However, besides sometimes revealing cell populations with balanced chromosome aberrations or unbalanced changes that nevertheless remained undetected by CGH, G-banding analysis was essential to understand how the genomic imbalances arose in the many cases in which both techniques detected the same clonal abnormalities. Furthermore, because CGH pictures only imbalances present in a significant proportion of the test sample, the very detection by this technique of imbalances belonging to apparently small, cytogenetically unrelated clones of cells proves that these clones must have been present in vivo. This constitutes compelling evidence that the cytogenetic polyclonality observed after short-term culturing of breast carcinomas is not an artifact. PMID:11494114

  16. Correction: Comparative analysis of fungal genomes reveals different plant cell wall degrading capacity in fungi

    PubMed Central

    2014-01-01

    Abstract The version of this article published in BMC Genomics 2013, 14: 274, contains 9 unpublished genomes (Botryobasidium botryosum, Gymnopus luxurians, Hypholoma sublateritium, Jaapia argillacea, Hebeloma cylindrosporum, Conidiobolus coronatus, Laccaria amethystina, Paxillus involutus, and P. rubicundulus) downloaded from the JGI website. In this correction, we removed these genomes after discussion with editors and data producers, whom we should have contacted before downloading these genomes. Removing these data did not alter the principal results and conclusions of our original work. The relevant Figures 1, 2, 3, 4 and 6 and Table 1 have been revised. Additional files 1, 3, 4, and 5 were also revised. We would like to apologize for any confusion or inconvenience this may have caused. Background Fungi produce a variety of carbohydrate-active enzymes (CAZymes) for the degradation of plant polysaccharide materials to facilitate infection and/or gain nutrition. Identifying and comparing CAZymes from fungi with different nutritional modes or infection mechanisms may provide information for better understanding of their life styles and infection models. To date, hundreds of fungal genomes are publicly available. However, a systematic comparative analysis of fungal CAZymes across the entire fungal kingdom has not been reported. Results In this study, we systematically identified glycoside hydrolases (GHs), polysaccharide lyases (PLs), carbohydrate esterases (CEs), and glycosyltransferases (GTs) as well as carbohydrate-binding modules (CBMs) in the predicted proteomes of 94 representative fungi from Ascomycota, Basidiomycota, Chytridiomycota, and Zygomycota. Comparative analysis of these CAZymes that play major roles in plant polysaccharide degradation revealed that fungi exhibit tremendous diversity in the number and variety of CAZymes. Among them, some families of GHs and CEs are the most prevalent CAZymes that are distributed in all of the fungi analyzed

  17. Phylogenetic Analysis and Comparative Genomics of Purine Riboswitch Distribution in Prokaryotes

    PubMed Central

    Singh, Payal; Sengupta, Supratim

    2012-01-01

    Riboswitches are regulatory RNA that control gene expression by undergoing conformational changes on ligand binding. Using phylogenetic analysis and comparative genomics we have been able to identify the class of genes/operons regulated by the purine riboswitch and obtain a high-resolution map of purine riboswitch distribution across all bacterial groups. In the process, we are able to explain the absence of purine riboswitches upstream to specific genes in certain genomes. We also identify the point of origin of various purine riboswitches and argue that not all purine riboswitches are of primordial origin, and that some purine riboswitches must have originated after the divergence of certain Firmicute orders in the course of evolution. Our study also reveals the role of horizontal transfer events in accounting for the presence of purine riboswitches in some gammaproteobacterial species. Our work provides significant insights into the origin, distribution and regulatory role of purine riboswitches in prokaryotes. PMID:23170063

  18. Comparative Genomic and Functional Analysis of Lactobacillus casei and Lactobacillus rhamnosus Strains Marketed as Probiotics

    PubMed Central

    Douillard, François P.; Ribbera, Angela; Järvinen, Hanna M.; Kant, Ravi; Pietilä, Taija E.; Randazzo, Cinzia; Paulin, Lars; Laine, Pia K.; Caggia, Cinzia; von Ossowski, Ingemar; Reunanen, Justus; Satokari, Reetta; Salminen, Seppo; Palva, Airi

    2013-01-01

    Four Lactobacillus strains were isolated from marketed probiotic products, including L. rhamnosus strains from Vifit (Friesland Campina) and Idoform (Ferrosan) and L. casei strains from Actimel (Danone) and Yakult (Yakult Honsa Co.). Their genomes and phenotypes were characterized and compared in detail with L. casei strain BL23 and L. rhamnosus strain GG. Phenotypic analysis of the new isolates indicated differences in carbohydrate utilization between L. casei and L. rhamnosus strains, which could be linked to their genotypes. The two isolated L. rhamnosus strains had genomes that were virtually identical to that of L. rhamnosus GG, testifying to their genomic stability and integrity in food products. The L. casei strains showed much greater genomic heterogeneity. Remarkably, all strains contained an intact spaCBA pilus gene cluster. However, only the L. rhamnosus strains produced mucus-binding SpaCBA pili under the conditions tested. Transcription initiation mapping demonstrated that the insertion of an iso-IS30 element upstream of the pilus gene cluster in L. rhamnosus strains but absent in L. casei strains had constituted a functional promoter driving pilus gene expression. All L. rhamnosus strains triggered an NF-κB response via Toll-like receptor 2 (TLR2) in a reporter cell line, whereas the L. casei strains did not or did so to a much lesser extent. This study demonstrates that the two L. rhamnosus strains isolated from probiotic products are virtually identical to L. rhamnosus GG and further highlights the differences between these and L. casei strains widely marketed as probiotics, in terms of genome content, mucus-binding and metabolic capacities, and host signaling capabilities. PMID:23315726

  19. Comparative genomic and functional analysis of Lactobacillus casei and Lactobacillus rhamnosus strains marketed as probiotics.

    PubMed

    Douillard, François P; Ribbera, Angela; Järvinen, Hanna M; Kant, Ravi; Pietilä, Taija E; Randazzo, Cinzia; Paulin, Lars; Laine, Pia K; Caggia, Cinzia; von Ossowski, Ingemar; Reunanen, Justus; Satokari, Reetta; Salminen, Seppo; Palva, Airi; de Vos, Willem M

    2013-03-01

    Four Lactobacillus strains were isolated from marketed probiotic products, including L. rhamnosus strains from Vifit (Friesland Campina) and Idoform (Ferrosan) and L. casei strains from Actimel (Danone) and Yakult (Yakult Honsa Co.). Their genomes and phenotypes were characterized and compared in detail with L. casei strain BL23 and L. rhamnosus strain GG. Phenotypic analysis of the new isolates indicated differences in carbohydrate utilization between L. casei and L. rhamnosus strains, which could be linked to their genotypes. The two isolated L. rhamnosus strains had genomes that were virtually identical to that of L. rhamnosus GG, testifying to their genomic stability and integrity in food products. The L. casei strains showed much greater genomic heterogeneity. Remarkably, all strains contained an intact spaCBA pilus gene cluster. However, only the L. rhamnosus strains produced mucus-binding SpaCBA pili under the conditions tested. Transcription initiation mapping demonstrated that the insertion of an iso-IS30 element upstream of the pilus gene cluster in L. rhamnosus strains but absent in L. casei strains had constituted a functional promoter driving pilus gene expression. All L. rhamnosus strains triggered an NF-κB response via Toll-like receptor 2 (TLR2) in a reporter cell line, whereas the L. casei strains did not or did so to a much lesser extent. This study demonstrates that the two L. rhamnosus strains isolated from probiotic products are virtually identical to L. rhamnosus GG and further highlights the differences between these and L. casei strains widely marketed as probiotics, in terms of genome content, mucus-binding and metabolic capacities, and host signaling capabilities. PMID:23315726

  20. Complete genome sequences and comparative genome analysis of Lactobacillus plantarum strain 5-2 isolated from fermented soybean.

    PubMed

    Liu, Chen-Jian; Wang, Rui; Gong, Fu-Ming; Liu, Xiao-Feng; Zheng, Hua-Jun; Luo, Yi-Yong; Li, Xiao-Ran

    2015-12-01

    Lactobacillus plantarum is an important probiotic and is mostly isolated from fermented foods. We sequenced the genome of L. plantarum strain 5-2, which was isolated from fermented soybean from Yunnan province, China. The strain was determined to contain 3114 genes. Fourteen complete insertion sequence (IS) elements were found in the 5-2 chromosome. There were 24 DNA replication proteins and 76 DNA repair proteins in the 5-2 genome. Consistent with the classification of L. plantarum as a facultative heterofermentative lactobacillus, the 5-2 genome encodes key enzymes required for the EMP (Embden-Meyerhof-Parnas) and phosphoketolase (PK) pathways. Several components of the secretion machinery are found in the 5-2 genome, which was compared with L. plantarum ST-III, JDM1 and WCFS1. Most of the specific proteins in the four genomes appeared to be related to their prophage elements. PMID:26212213

  1. Genome-scale metabolic modeling of Mucor circinelloides and comparative analysis with other oleaginous species.

    PubMed

    Vongsangnak, Wanwipa; Klanchui, Amornpan; Tawornsamretkit, Iyarest; Tatiyaborwornchai, Witthawin; Laoteng, Kobkul; Meechai, Asawin

    2016-06-01

    We present a novel genome-scale metabolic model iWV1213 of Mucor circinelloides, which is an oleaginous fungus for industrial applications. The model contains 1213 genes, 1413 metabolites and 1326 metabolic reactions across different compartments. We demonstrate that iWV1213 is able to accurately predict the growth rates of M. circinelloides on various nutrient sources and culture conditions using Flux Balance Analysis and Phenotypic Phase Plane analysis. Comparative analysis of three oleaginous genome-scale models, including M. circinelloides (iWV1213), Mortierella alpina (iCY1106) and Yarrowia lipolytica (iYL619_PCP) revealed that iWV1213 possesses a higher number of genes involved in carbohydrate, amino acid, and lipid metabolisms that might contribute to its versatility in nutrient utilization. Moreover, the identification of unique and common active reactions among the Zygomycetes oleaginous models using Flux Variability Analysis unveiled a set of gene/enzyme candidates as metabolic engineering targets for cellular improvement. Thus, iWV1213 offers a powerful metabolic engineering tool for multi-level omics analysis, enabling strain optimization as a cell factory platform of lipid-based production. PMID:26911256

  2. Pyrosequencing-Based Comparative Genome Analysis of Vibrio vulnificus Environmental Isolates

    PubMed Central

    Morrison, Shatavia S.; Williams, Tiffany; Cain, Aurora; Froelich, Brett; Taylor, Casey; Baker-Austin, Craig; Verner-Jeffreys, David; Hartnell, Rachel; Oliver, James D.; Gibas, Cynthia J.

    2012-01-01

    Between 1996 and 2006, the US Centers for Disease Control reported that the only category of food-borne infections increasing in frequency were those caused by members of the genus Vibrio. The Gram-negative bacterium Vibrio vulnificus is a ubiquitous inhabitant of estuarine waters, and is the number one cause of seafood-related deaths in the US. Many V. vulnificus isolates have been studied, and it has been shown that two genetically distinct subtypes, distinguished by 16S rDNA and other gene polymorphisms, are associated predominantly with either environmental or clinical isolation. While local genetic differences between the subtypes have been probed, only the genomes of clinical isolates have so far been completely sequenced. In order to better understand V. vulnificus as an agent of disease and to identify the molecular components of its virulence mechanisms, we have completed whole genome shotgun sequencing of three diverse environmental genotypes using a pyrosequencing approach. V. vulnificus strain JY1305 was sequenced to a depth of 33×, and strains E64MW and JY1701 were sequenced to lesser depth, covering approximately 99.9% of each genome. We have performed a comparative analysis of these sequences against the previously published sequences of three V. vulnificus clinical isolates. We find that the genome of V. vulnificus is dynamic, with 1.27% of genes in the C-genotype genomes not found in the E-genotype genomes. We identified key genes that differentiate between the genomes of the clinical and environmental genotypes. 167 genes were found to be specifically associated with environmental genotypes and 278 genes with clinical genotypes. Genes specific to the clinical strains include components of sialic acid catabolism, mannitol fermentation, and a component of a Type IV secretory pathway VirB4, as well as several other genes with potential significance for human virulence. Genes specific to environmental strains included several that may have

  3. Comparative Mitochondrial Genome Analysis of Eligma narcissus and other Lepidopteran Insects Reveals Conserved Mitochondrial Genome Organization and Phylogenetic Relationships.

    PubMed

    Dai, Li-Shang; Zhu, Bao-Jian; Zhao, Yue; Zhang, Cong-Fen; Liu, Chao-Liang

    2016-01-01

    In this study, we sequenced the complete mitochondrial genome of Eligma narcissus and compared it with 18 other lepidopteran species. The mitochondrial genome (mitogenome) was a circular molecule of 15,376 bp containing 13 protein-coding genes (PCGs), 22 transfer RNA (tRNA) genes, two ribosomal RNA (rRNA) genes and an adenine (A) + thymine (T) - rich region. The positive AT skew (0.007) indicated the occurrence of more As than Ts. The arrangement of 13 PCGs was similar to that of other sequenced lepidopterans. All PCGs were initiated by ATN codons, except for the cytochrome c oxidase subunit 1 (cox1) gene, which was initiated by the CGA sequence, as observed in other lepidopterans. The results of the codon usage analysis indicated that Asn, Ile, Leu, Tyr and Phe were the five most frequent amino acids. All tRNA genes were shown to be folded into the expected typical cloverleaf structure observed for mitochondrial tRNA genes. Phylogenetic relationships were analyzed based on the nucleotide sequences of 13 PCGs from other insect mitogenomes, which confirmed that E. narcissus is a member of the Noctuidae superfamily. PMID:27222440
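
    The AT skew quoted in this abstract is conventionally computed as (A - T) / (A + T) over one strand, so a positive value means more As than Ts. The abstract itself does not state the formula; the sketch below assumes that conventional definition, and the function name and toy sequences are illustrative only.

```python
from collections import Counter


def at_skew(seq):
    """AT skew of a nucleotide sequence: (A - T) / (A + T).

    Positive values mean A outnumbers T on the given strand,
    negative values mean T outnumbers A.
    """
    counts = Counter(seq.upper())
    a, t = counts["A"], counts["T"]
    if a + t == 0:
        raise ValueError("sequence contains no A or T bases")
    return (a - t) / (a + t)


# Toy sequences, not taken from the E. narcissus mitogenome.
print(at_skew("AAAT"))   # three As, one T -> 0.5
print(at_skew("ATGC"))   # one A, one T -> 0.0
```

    A value as small as the 0.007 reported here indicates that A and T occur in nearly equal numbers on the sequenced strand, with As only slightly in excess.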

  4. A Comparative Analysis of Mitochondrial Genomes in Coleoptera (Arthropoda: Insecta) and Genome Descriptions of Six New Beetles

    PubMed Central

    Song, H.; Cameron, S. L.; Whiting, M. F.

    2008-01-01

    Coleoptera is the most diverse group of insects with over 360,000 described species divided into four suborders: Adephaga, Archostemata, Myxophaga, and Polyphaga. In this study, we present six new complete mitochondrial genome (mtgenome) descriptions, including a representative of each suborder, and analyze the evolution of mtgenomes from a comparative framework using all available coleopteran mtgenomes. We propose a modification of atypical cox1 start codons based on sequence alignment to better reflect the conservation observed across species as well as findings of TTG start codons in other genes. We also analyze tRNA-Ser(AGN) anticodons, usually GCU in arthropods, and report a conserved UCU anticodon as a possible synapomorphy across Polyphaga. We further analyze the secondary structure of tRNA-Ser(AGN) and present a consensus structure and an updated covariance model that allows tRNAscan-SE (via the COVE software package) to locate and fold these atypical tRNAs with much greater consistency. We also report secondary structure predictions for both rRNA genes based on conserved stems. All six species of beetle have the same gene order as the ancestral insect. We report noncoding DNA regions, including a small gap region of about 20 bp between tRNA-Ser(UCN) and nad1 that is present in all six genomes, and present results of a base composition analysis. PMID:18779259

  5. Comparative Mitochondrial Genome Analysis of Eligma narcissus and other Lepidopteran Insects Reveals Conserved Mitochondrial Genome Organization and Phylogenetic Relationships

    PubMed Central

    Dai, Li-Shang; Zhu, Bao-Jian; Zhao, Yue; Zhang, Cong-Fen; Liu, Chao-Liang

    2016-01-01

    In this study, we sequenced the complete mitochondrial genome of Eligma narcissus and compared it with 18 other lepidopteran species. The mitochondrial genome (mitogenome) was a circular molecule of 15,376 bp containing 13 protein-coding genes (PCGs), 22 transfer RNA (tRNA) genes, two ribosomal RNA (rRNA) genes and an adenine (A) + thymine (T) − rich region. The positive AT skew (0.007) indicated the occurrence of more As than Ts. The arrangement of 13 PCGs was similar to that of other sequenced lepidopterans. All PCGs were initiated by ATN codons, except for the cytochrome c oxidase subunit 1 (cox1) gene, which was initiated by the CGA sequence, as observed in other lepidopterans. The results of the codon usage analysis indicated that Asn, Ile, Leu, Tyr and Phe were the five most frequent amino acids. All tRNA genes were shown to be folded into the expected typical cloverleaf structure observed for mitochondrial tRNA genes. Phylogenetic relationships were analyzed based on the nucleotide sequences of 13 PCGs from other insect mitogenomes, which confirmed that E. narcissus is a member of the Noctuidae superfamily. PMID:27222440

  6. A comparative analysis of the DNA recombination repair pathway in mycobacterial genomes.

    PubMed

    Singh, Amandeep; Bhagavat, Raghu; Vijayan, M; Chandra, Nagasuma

    2016-07-01

    In prokaryotes, repair by homologous recombination provides a major means to reinstate the genetic information lost in DNA damage. The recombination repair pathway in mycobacteria differs in multiple ways from that in Escherichia coli. Of about 20 proteins known to be involved in the pathway, a set of 9 proteins, namely, RecF, RecO, RecR, RecA, SSBa, RuvA, RuvB and RuvC was found to be indispensable among the 43 mycobacterial strains. A domain-level analysis indicated that most domains involved in recombination repair are unique to these proteins and are present as single copies in the genomes. Synteny analysis reveals that the order of the genes encoding the pathway proteins is not conserved, suggesting that they may be regulated differently in different species. Sequence conservation among the same protein from different strains suggests the importance of the RecO-RecA and RecFOR-RecA presynaptic pathways in the repair of double-strand breaks and single-strand breaks, respectively. New annotations obtained from the analysis include identification of a protein with a probable Holliday junction binding role present in 41 mycobacterial genomes and that of a RecB-like nuclease, containing a cas4 domain, present in 42 genomes. New insights into the binding of small molecules to the relevant proteins are provided by binding pocket analysis using three-dimensional structural models. Analysis of the various features of the recombination repair pathway, presented here, is likely to provide a framework for further exploring stress response and emergence of drug resistance in mycobacteria. PMID:27450012

  7. Genome sequence of the model sulfate reducer Desulfovibrio gigas: a comparative analysis within the Desulfovibrio genus

    PubMed Central

    Morais-Silva, Fabio O; Rezende, Antonio Mauro; Pimentel, Catarina; Santos, Catia I; Clemente, Carla; Varela–Raposo, Ana; Resende, Daniela M; da Silva, Sofia M; de Oliveira, Luciana Márcia; Matos, Marcia; Costa, Daniela A; Flores, Orfeu; Ruiz, Jerónimo C; Rodrigues-Pousada, Claudina

    2014-01-01

    Desulfovibrio gigas is a model sulfate-reducing bacterium whose energy metabolism and stress response have been extensively studied. The complete genomic context of this organism was, however, not yet available. The sequencing of the D. gigas genome provides insights into the integrated network of energy-conserving complexes and structures present in this bacterium. Comparison with genomes of other Desulfovibrio spp. reveals the presence of two different CRISPR/Cas systems in D. gigas. Phylogenetic analysis using conserved protein sequences (encoded by rpoB and gyrB) indicates two main groups of Desulfovibrio spp., with D. gigas more closely related to D. vulgaris and D. desulfuricans strains. Gene duplications were found, such as those encoding fumarate reductase, formate dehydrogenase, and superoxide dismutase. Complexes not yet described within the Desulfovibrio genus were identified: the Mnh complex and a v-type ATP-synthase, as well as genes encoding the MinCDE system that could be responsible for the larger size of D. gigas when compared to other members of the genus. A low number of hydrogenases and the absence of the codh/acs and pfl genes, both present in D. vulgaris strains, indicate that intermediate cycling mechanisms may contribute substantially less to the energy gain in D. gigas compared to other Desulfovibrio spp. This might be compensated by the presence of other unique genomic arrangements of complexes such as the Rnf and the Hdr/Flox, or by the presence of NAD(P)H related complexes, like the Nuo, NfnAB or Mnh. PMID:25055974

  8. Significance of genomic instability in breast cancer in atomic bomb survivors: analysis of microarray-comparative genomic hybridization

    PubMed Central

    2011-01-01

    Background It has been postulated that ionizing radiation induces breast cancers among atomic bomb (A-bomb) survivors. We have reported a higher incidence of HER2 and C-MYC oncogene amplification in breast cancers from A-bomb survivors. The purpose of this study was to clarify the effect of A-bomb radiation exposure on genomic instability (GIN), which is an important hallmark of carcinogenesis, in archival formalin-fixed paraffin-embedded (FFPE) tissues of breast cancer by using microarray-comparative genomic hybridization (aCGH). Methods Tumor DNA was extracted from FFPE tissues of invasive ductal cancers from 15 survivors who were exposed at 1.5 km or less from the hypocenter and 13 calendar year-matched non-exposed patients, followed by aCGH analysis using a high-density oligonucleotide microarray. The total length of copy number aberrations (CNA) was used as an indicator of GIN, and its correlation with clinicopathological factors was statistically tested. Results The mean of the derivative log ratio spread (DLRSpread), which estimates the noise by calculating the spread of log ratio differences between consecutive probes for all chromosomes, was 0.54 (range, 0.26 to 1.05). The concordance of results between aCGH and fluorescence in situ hybridization (FISH) for HER2 gene amplification was 88%. The incidence of HER2 amplification and the histological grade were each significantly higher in the A-bomb survivors than in the control group (P = 0.04 for each). The total length of CNA tended to be larger in the A-bomb survivors (P = 0.15). Correlation analysis of CNA and clinicopathological factors revealed that DLRSpread was significantly negatively correlated with the total length of CNA (P = 0.034, r = -0.40). Multivariate analysis with covariance revealed that the exposure to A-bomb was a significant (P = 0.005) independent factor which was associated with larger total length of CNA of breast cancers. Conclusions Thus, archival FFPE tissues from A-bomb survivors are useful for genome-wide a
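
    The DLRSpread statistic is described in this abstract only as "the spread of log ratio differences between consecutive probes". The sketch below uses one common robust formulation, stated here as an assumption and not as the study's procedure: take the interquartile range (IQR) of the consecutive differences, divide by 1.349 to convert an IQR to a standard deviation under normality, and divide by the square root of 2 because each difference combines the noise of two probes. The function name and toy values are hypothetical.

```python
import math
import statistics


def dlr_spread(log_ratios):
    """Robust noise estimate from probe log ratios ordered by genomic position.

    Assumed formulation: IQR of consecutive differences / (1.349 * sqrt(2)).
    Differences should only be taken within a chromosome, so call this per
    chromosome or on a series that does not cross chromosome boundaries.
    """
    if len(log_ratios) < 3:
        raise ValueError("need at least three probes")
    diffs = [b - a for a, b in zip(log_ratios, log_ratios[1:])]
    q1, _median, q3 = statistics.quantiles(diffs, n=4)
    return (q3 - q1) / (1.349 * math.sqrt(2))


# Toy log ratios (hypothetical values, not data from the study).
flat = [0.1] * 10
noisy = [0.0, 0.2, -0.1, 0.3, 0.05, -0.2, 0.1]
print(dlr_spread(flat))   # no probe-to-probe variation
print(dlr_spread(noisy))  # larger value for a noisier profile
```

    Because the statistic is built from differences between neighbouring probes, a broad gain or loss shifts it very little, which is why it is used as a measure of array noise and not of copy number change.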

  9. The Complete Chloroplast Genome Sequence of a Relict Conifer Glyptostrobus pensilis: Comparative Analysis and Insights into Dynamics of Chloroplast Genome Rearrangement in Cupressophytes and Pinaceae.

    PubMed

    Hao, Zhaodong; Cheng, Tielong; Zheng, Renhua; Xu, Haibin; Zhou, Yanwei; Li, Meiping; Lu, Fengjuan; Dong, Yini; Liu, Xin; Chen, Jinhui; Shi, Jisen

    2016-01-01

    Glyptostrobus pensilis, belonging to the monotypic genus Glyptostrobus (Family: Cupressaceae), is an ancient conifer that is naturally distributed in low-lying wet areas. Here, we report the complete chloroplast (cp) genome sequence (132,239 bp) of G. pensilis. The G. pensilis cp genome is similar in gene content, organization and genome structure to the sequenced cp genomes from other cupressophytes, especially with respect to the loss of the inverted repeat region A (IRA). Through phylogenetic analysis, we demonstrated that the genus Glyptostrobus is closely related to the genus Cryptomeria, supporting previous findings based on physiological characteristics. Since IRs play an important role in stabilizing the cp genome, and conifer cp genomes lost different IR regions after splitting into two clades (cupressophytes and Pinaceae), we performed cp genome rearrangement analysis and found more extensive cp genome rearrangements among the species of cupressophytes relative to Pinaceae. Additional repeat analysis indicated that cupressophyte cp genomes, especially those of Cupressaceae, contained fewer potentially functional repeats than those of Pinaceae. These results suggested that the dynamics of cp genome rearrangement in conifers differed because the two clades, Pinaceae and cupressophytes, lost IR copies independently and developed different repeats to complement the residual IRs. In addition, we identified 170 perfect simple sequence repeats that will be useful in future research focusing on the evolution of genetic diversity and conservation of genetic variation for this endangered species in the wild. PMID:27560965

  10. The Complete Chloroplast Genome Sequence of a Relict Conifer Glyptostrobus pensilis: Comparative Analysis and Insights into Dynamics of Chloroplast Genome Rearrangement in Cupressophytes and Pinaceae

    PubMed Central

    Zheng, Renhua; Xu, Haibin; Zhou, Yanwei; Li, Meiping; Lu, Fengjuan; Dong, Yini; Liu, Xin; Chen, Jinhui; Shi, Jisen

    2016-01-01

    Glyptostrobus pensilis, belonging to the monotypic genus Glyptostrobus (Family: Cupressaceae), is an ancient conifer that is naturally distributed in low-lying wet areas. Here, we report the complete chloroplast (cp) genome sequence (132,239 bp) of G. pensilis. The G. pensilis cp genome is similar in gene content, organization and genome structure to the sequenced cp genomes from other cupressophytes, especially with respect to the loss of the inverted repeat region A (IRA). Through phylogenetic analysis, we demonstrated that the genus Glyptostrobus is closely related to the genus Cryptomeria, supporting previous findings based on physiological characteristics. Since IRs play an important role in stabilizing the cp genome, and conifer cp genomes lost different IR regions after splitting into two clades (cupressophytes and Pinaceae), we performed cp genome rearrangement analysis and found more extensive cp genome rearrangements among the species of cupressophytes relative to Pinaceae. Additional repeat analysis indicated that cupressophyte cp genomes, especially those of Cupressaceae, contained fewer potentially functional repeats than those of Pinaceae. These results suggested that the dynamics of cp genome rearrangement in conifers differed because the two clades, Pinaceae and cupressophytes, lost IR copies independently and developed different repeats to complement the residual IRs. In addition, we identified 170 perfect simple sequence repeats that will be useful in future research focusing on the evolution of genetic diversity and conservation of genetic variation for this endangered species in the wild. PMID:27560965

  11. Array Comparative Genomic Hybridization (aCGH) Analysis in Patients with Anophthalmia, Microphthalmia and Coloboma

    PubMed Central

    Raca, Gordana; Jackson, Craig A.; Kucinskas, Laimutis; Warman, Berta; Shieh, Joseph T. C.; Schneider, Adele; Bardakjian, Tanya M.; Schimmenti, Lisa A.

    2014-01-01

    Purpose: The goal of our study was to determine whether genomic copy number abnormalities (deletions and duplications) affecting genes involved in eye development contribute to the etiology of anophthalmia, microphthalmia and coloboma. Methods: The affected individuals were tested for deletions and duplications in genomic DNA using 2 million probe (HD2) comparative genomic hybridization arrays (aCGH) from Roche-NimbleGen. Results: Array analysis of 32 patients detected one case with a deletion encompassing the Renal-coloboma syndrome associated gene PAX2. Non-polymorphic copy number changes were also observed at several candidate chromosomal regions, including 6p12.3, 8q23.1q23.2, 13q31.3, 15q11.2q13.1, 16p13.13 and 20q13.13. Conclusions: This study identified the first patient with the typical phenotype of the Renal-coloboma syndrome caused by a submicroscopic deletion of the coding region of the PAX2 gene. The finding suggests that PAX2 deletion testing should be performed in addition to gene sequencing as a part of molecular evaluation for the Renal-coloboma syndrome. aCGH testing of 32 affected individuals showed that genomic deletions and duplications are not a common cause of non-syndromic anophthalmia, microphthalmia and/or coloboma, but undoubtedly contribute to the etiology of these eye anomalies. aCGH testing therefore represents an important and valuable addition to candidate gene sequencing in research and diagnostics of ocular birth defects. PMID:21285886

  12. Comparative Whole-Genome Analysis of Clinical Isolates Reveals Characteristic Architecture of Mycobacterium tuberculosis Pangenome

    PubMed Central

    Periwal, Vinita; Patowary, Ashok; Vellarikkal, Shamsudheen Karuthedath; Gupta, Anju; Singh, Meghna; Mittal, Ashish; Jeyapaul, Shamini; Chauhan, Rajendra Kumar; Singh, Ajay Vir; Singh, Pravin Kumar; Garg, Parul; Katoch, Viswa Mohan; Katoch, Kiran; Chauhan, Devendra Singh; Sivasubbu, Sridhar; Scaria, Vinod

    2015-01-01

    The tubercle complex consists of closely related Mycobacterium species which appear to be variants of a single species. Comparative genome analysis of different strains could provide useful clues and insights into the genetic diversity of the species. We integrated genome assemblies of 96 strains from the Mycobacterium tuberculosis complex (MTBC), which included 8 Indian clinical isolates sequenced and assembled in this study, to understand its pangenome architecture. We predicted genes for all 96 strains and clustered their respective CDSs into homologous gene clusters (HGCs) to reveal hard-core, soft-core and accessory genome components of MTBC. The hard-core (HGCs shared amongst 100% of the strains) comprised 2,066 gene clusters, whereas the soft-core (HGCs shared amongst at least 95% of the strains) comprised 3,374 gene clusters. The change in the core and accessory genome components when observed as a function of their size revealed that MTBC has an open pangenome. We identified 74 HGCs that were absent from reference strains H37Rv and H37Ra but were present in most of the clinical isolates. PCR validation of 9 candidate genes showed that 7 genes were completely absent from H37Rv and H37Ra, whereas 2 genes shared partial homology with them, attributable to probable insertion and deletion events. The pangenome approach is a promising tool for studying strain-specific genetic differences occurring within species. We also suggest that, since selecting appropriate target genes for typing purposes requires that the expected target gene be present in all isolates being typed, estimating the core component of the species becomes a subject of prime importance. PMID:25853708
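
    The hard-core/soft-core partition described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name, the cluster and strain identifiers, and the toy input are all assumptions made for illustration. Only the 100% and 95% thresholds come from the abstract. The labels here are mutually exclusive, whereas the abstract's soft-core count appears to include the hard-core clusters.

```python
def classify_clusters(presence, n_strains, soft_core_percent=95):
    """Label each homologous gene cluster (HGC) by how many strains carry it.

    presence: dict mapping a cluster id to the set of strain ids containing it
              (hypothetical input format, assumed for this sketch).
    n_strains: total number of strains compared.
    """
    labels = {}
    for cluster, strains in presence.items():
        count = len(strains)
        if count == n_strains:
            labels[cluster] = "hard-core"      # present in 100% of strains
        elif 100 * count >= soft_core_percent * n_strains:
            labels[cluster] = "soft-core"      # present in at least 95% of strains
        else:
            labels[cluster] = "accessory"      # everything else
    return labels


# Toy data: 20 hypothetical strains and three hypothetical clusters.
strains = [f"strain{i}" for i in range(20)]
toy_presence = {
    "hgc_a": set(strains),        # 20/20 strains
    "hgc_b": set(strains[:19]),   # 19/20 strains
    "hgc_c": set(strains[:5]),    # 5/20 strains
}
toy_labels = classify_clusters(toy_presence, n_strains=len(strains))
```

    Integer arithmetic is used for the threshold test so that a cluster sitting exactly on the 95% boundary is not misclassified by floating-point rounding.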

  13. Genome Sequencing and Comparative Analysis of Saccharomyces cerevisiae Strains of the Peterhof Genetic Collection

    PubMed Central

    Drozdova, Polina B.; Tarasov, Oleg V.; Matveenko, Andrew G.; Radchenko, Elina A.; Sopova, Julia V.; Polev, Dmitrii E.; Inge-Vechtomov, Sergey G.; Dobrynin, Pavel V.

    2016-01-01

    The Peterhof genetic collection of Saccharomyces cerevisiae strains (PGC) is a large laboratory stock that has accumulated several thousand strains over more than half a century. It originated independently of other common laboratory stocks from a distillery lineage (race XII). Several PGC strains have been extensively used in certain fields of yeast research but their genomes have not been thoroughly explored yet. Here we employed whole genome sequencing to characterize five selected PGC strains including one of the closest to the progenitor, 15V-P4, and several strains that have been used to study translation termination and prions in yeast (25-25-2V-P3982, 1B-D1606, 74-D694, and 6P-33G-D373). The genetic distance between the PGC progenitor and S288C is comparable to that between two geographically isolated populations. The PGC seems to be closer to two bakery strains than to S288C-related laboratory stocks or European wine strains. In genomes of the PGC strains, we found several loci which are absent from the S288C genome; 15V-P4 harbors a rare combination of the gene cluster characteristic for wine strains and the RTM1 cluster. We closely examined known and previously uncharacterized gene variants of particular strains and were able to establish the molecular basis for known phenotypes including phenylalanine auxotrophy, clumping behavior and galactose utilization. Finally, we made sequencing data and results of the analysis available for the yeast community. Our data widen the knowledge about genetic variation between Saccharomyces cerevisiae strains and can form the basis for planning future work in PGC-related strains and with PGC-derived alleles. PMID:27152522

  14. Comparative Genomics Analysis of Mycobacterium ulcerans for the Identification of Putative Essential Genes and Therapeutic Candidates

    PubMed Central

    Tahir, Shifa; Tong, Yigang

    2012-01-01

    Mycobacterium ulcerans is the causative agent of Buruli ulcer, the third most common mycobacterial disease after tuberculosis and leprosy. The present treatment options are limited, and the emergence of treatment-resistant isolates represents a serious concern and a need for better therapeutics. Conventional drug discovery methods are time-consuming and labor-intensive. Unfortunately, the slow-growing nature of M. ulcerans in experimental conditions is also a barrier for drug discovery and development. In contrast, recent advancements in complete genome sequencing, in combination with cheminformatics and computational biology, represent an attractive alternative approach for the identification of therapeutic candidates worthy of experimental research. A computational, comparative genomics workflow was defined for the identification of novel therapeutic candidates against M. ulcerans, with the aim that a selected target should be essential to the pathogen and have no homolog in the human host. Initially, a total of 424 genes were predicted as essential from the M. ulcerans genome, via homology searching of essential genome content from 20 different bacteria. Metabolic pathway analysis showed that most of the essential genes are associated with carbohydrate and amino acid metabolism. Among these, 236 proteins were identified as non-host and essential, and could serve as potential drug and vaccine candidates. Several drug target prioritization parameters including druggability were also calculated. Enzymes from several pathways are discussed as potential drug targets, including those from cell wall synthesis, thiamine biosynthesis, protein biosynthesis, and histidine biosynthesis. It is expected that our data will facilitate selection of M. ulcerans proteins for successful entry into drug design pipelines. PMID:22912793
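
    The selection rule stated in this abstract (essential to the pathogen, no homolog in the host) amounts to a two-step set filter. The Python sketch below shows only that logic; the function name and gene identifiers are hypothetical, and the homology searches that would produce the two hit sets are assumed to have been run elsewhere.

```python
def candidate_targets(pathogen_genes, essential_hits, host_hits):
    """Keep pathogen genes that match an essential gene and have no host homolog.

    pathogen_genes: iterable of gene ids from the pathogen genome.
    essential_hits: set of gene ids with a homolog among known essential genes.
    host_hits:      set of gene ids with a homolog in the host proteome.
    All three inputs are hypothetical placeholders for upstream search results.
    """
    return sorted(
        gene for gene in pathogen_genes
        if gene in essential_hits and gene not in host_hits
    )


# Toy example with made-up gene ids.
toy_candidates = candidate_targets(
    ["g1", "g2", "g3", "g4"],
    essential_hits={"g1", "g2", "g3"},
    host_hits={"g2"},
)
```

    In this toy run, g4 is dropped for lacking an essential-gene homolog and g2 is dropped for having a host homolog.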

  15. Comparative genomic analysis of mitochondrial protein-coding genes in Veneroida clams: Analysis of superfamily-specific genomic and evolutionary features.

    PubMed

    Hwang, Jae Yeon; Lee, Chang-Kyu; Kim, Heebal; Nam, Bo-Hye; An, Cheul Min; Park, Jung Youn; Park, Kyu-Hyun; Huh, Chul-Sung; Kim, Eun Bae

    2015-12-01

    Veneroida is the largest order of bivalves, and these clams are commercially important in Asian countries. Although numerous studies have focused on the genomic characters of individual species or genera in Veneroida, superfamily-specific genomic characters have not been determined. In this study, we performed a comparative genomic analysis of 12 mitochondrial protein-coding genes (PCGs) from 25 clams in six Veneroida superfamilies to determine genomic and evolutionary features of each superfamily. Length and distribution of nucleotides encoding the PCGs were too variable to define superfamily-specific genomic characters. Phylogenetic analysis revealed that PCGs are suitable for classification of species in three superfamilies: Cardioidea, Mactroidea, and Veneroidea. However, one species classified in Tellinoidea, Sinonovacula constricta, was evolutionarily closer to Solenoidea clams than Tellinoidea clams. dN/dS analysis showed that positively selected sites in the NADH dehydrogenase subunit gene nd4 and the ATP synthase subunit gene atp6 were present in Mactroidea. Differences in selected sites in nd4 and atp6 could be caused by superfamily-level differences in sodium transport or ATP synthesis functions, respectively. These differences in selected sites in NADH dehydrogenase may have conferred on these animals, which have low motility and do not generally move, increased flexibility to maintain homeostasis in the face of osmotic pressure. Our study provides insight into evolutionary traits and facilitates identification of veneroids. PMID:26343338

  16. Phytozome Comparative Plant Genomics Portal

    SciTech Connect

    Goodstein, David; Batra, Sajeev; Carlson, Joseph; Hayes, Richard; Phillips, Jeremy; Shu, Shengqiang; Schmutz, Jeremy; Rokhsar, Daniel

    2014-09-09

    The Dept. of Energy Joint Genome Institute is a genomics user facility supporting DOE mission science in the areas of Bioenergy, Carbon Cycling, and Biogeochemistry. The Plant Program at the JGI applies genomic, analytical, computational and informatics platforms and methods to: (1) understand and accelerate the improvement (domestication) of bioenergy crops; (2) characterize and moderate plant response to climate change; (3) use comparative genomics to identify constrained elements and infer gene function; (4) build high quality genomic resource platforms of JGI Plant Flagship genomes for functional and experimental work; (5) expand functional genomic resources for Plant Flagship genomes.

  17. Genome-Wide Comparative Analysis of Chemosensory Gene Families in Five Tsetse Fly Species

    PubMed Central

    Macharia, Rosaline; Mireji, Paul; Murungi, Edwin; Murilla, Grace; Christoffels, Alan; Aksoy, Serap; Masiga, Daniel

    2016-01-01

    For decades, odour-baited traps have been used for control of tsetse flies (Diptera: Glossinidae), vectors of African trypanosomes. However, differential responses to known attractants have been reported in different Glossina species, hindering establishment of a universal vector control tool. Availability of full genome sequences of five Glossina species offers an opportunity to compare their chemosensory repertoire and enhance our understanding of their biology in relation to chemosensation. Here, we identified and annotated the major chemosensory gene families in Glossina. We identified a total of 118, 115, 124, and 123 chemosensory genes in Glossina austeni, G. brevipalpis, G. f. fuscipes, and G. pallidipes, respectively, relative to 127 reported in G. m. morsitans. Our results show that tsetse fly genomes have fewer chemosensory genes when compared to other dipterans such as Musca domestica (n>393), Drosophila melanogaster (n = 246) and Anopheles gambiae (n>247). We also found that Glossina chemosensory genes are dispersed across distantly located scaffolds in their respective genomes, in contrast to other insects like D. melanogaster whose genes occur in clusters. Further, Glossina appears to be devoid of sugar receptors and to have expanded CO2-associated receptors, potentially reflecting Glossina's obligate hematophagy and the need to detect hosts that may be out of sight. We also identified, in all species, homologs of Ir84a, a Drosophila-specific ionotropic receptor that promotes male courtship, suggesting that this is a conserved trait in tsetse flies. Notably, our selection analysis revealed that a total of four gene loci (Gr21a, GluRIIA, Gr28b, and Obp83a) were under positive selection, which confers fitness advantage to species. These findings provide a platform for studies to further define the language of communication of tsetse with their environment, and influence development of novel approaches for control. PMID:26886411

  18. Exome sequencing and array-based comparative genomic hybridisation analysis of preferential 6-methylmercaptopurine producers.

    PubMed

    Chua, E W; Cree, S; Barclay, M L; Doudney, K; Lehnert, K; Aitchison, A; Kennedy, M A

    2015-10-01

    Preferential conversion of azathioprine or 6-mercaptopurine into methylated metabolites is a major cause of thiopurine resistance. To seek potentially Mendelian causes of thiopurine hypermethylation, we recruited 12 individuals who exhibited extreme therapeutic resistance while taking azathioprine or 6-mercaptopurine and performed whole-exome sequencing (WES) and copy-number variant analysis by array-based comparative genomic hybridisation (aCGH). Exome-wide variant filtering highlighted four genes potentially associated with thiopurine metabolism (ENOSF1 and NFS1), transport (SLC17A4) or therapeutic action (RCC2). However, variants of each gene were found only in two or three patients, and it is unclear whether these genes could influence thiopurine hypermethylation. Analysis by aCGH did not identify any unusual or pathogenic copy-number variants. This suggests that if causative mutations for the hypermethylation phenotype exist they may be heterogeneous, occurring in several different genes, or they may lie within regulatory regions not captured by WES. Alternatively, hypermethylation may arise from the involvement of multiple genes with small effects. To test this hypothesis would require recruitment of large patient samples and application of genome-wide association studies. PMID:25752523

  19. A comparative genomic analysis of the alkalitolerant soil bacterium Bacillus lehensis G1.

    PubMed

    Noor, Yusuf Muhammad; Samsulrizal, Nurul Hidayah; Jema'on, Noor Azah; Low, Kheng Oon; Ramli, Aizi Nor Mazila; Alias, Noor Izawati; Damis, Siti Intan Rosdianah; Fuzi, Siti Fatimah Zaharah Mohd; Isa, Mohd Noor Mat; Murad, Abdul Munir Abdul; Raih, Mohd Firdaus Mohd; Bakar, Farah Diba Abu; Najimudin, Nazalan; Mahadi, Nor Muhammad; Illias, Rosli Md

    2014-07-25

    Bacillus lehensis G1 is a Gram-positive, moderately alkalitolerant bacterium isolated from soil samples. B. lehensis produces cyclodextrin glucanotransferase (CGTase), an enzyme that has enabled the extensive use of cyclodextrin in foodstuffs, chemicals, and pharmaceuticals. The genome sequence of B. lehensis G1 consists of a single circular 3.99 Mb chromosome containing 4017 protein-coding sequences (CDSs), of which 2818 (70.15%) have assigned biological roles, 936 (23.30%) have conserved domains with unknown functions, and 263 (6.55%) have no match with any protein database. Bacillus clausii KSM-K16 was established as the closest relative to B. lehensis G1 based on gene content similarity and 16S rRNA phylogenetic analysis. A total of 2820 proteins from B. lehensis G1 were found to have orthologues in B. clausii, including sodium-proton antiporters, transport proteins, and proteins involved in ATP synthesis. A comparative analysis of these proteins and those in B. clausii and other alkaliphilic Bacillus species was carried out to investigate their contributions towards the alkalitolerance of the microorganism. The similarities and differences in alkalitolerance-related genes among alkalitolerant/alkaliphilic Bacillus species highlight the complex mechanism of pH homeostasis. The B. lehensis G1 genome was also mined for proteins and enzymes with potential viability for industrial and commercial purposes. PMID:24811681

  20. Characterization and comparative genomic analysis of bacteriophages infecting members of the Bacillus cereus group.

    PubMed

    Lee, Ju-Hoon; Shin, Hakdong; Ryu, Sangryeol

    2014-05-01

    The Bacillus cereus group phages infecting B. cereus, B. anthracis, and B. thuringiensis (Bt) have been studied at the molecular level and, recently, at the genomic level to control the pathogens B. cereus and B. anthracis and to prevent phage contamination of the natural insect pesticide Bt. A comparative phylogenetic analysis has revealed three different major phage groups with different morphologies (Myoviridae for group I, Siphoviridae for group II, and Tectiviridae for group III), genome size (group I > group II > group III), and lifestyle (virulent for group I and temperate for group II and III). A subsequent phage genome comparison using a dot plot analysis showed that phages in each group are highly homologous, substantiating the grouping of B. cereus phages. Endolysin is a host lysis protein that contains two conserved domains: a cell-wall-binding domain (CBD) and an enzymatic activity domain (EAD). In B. cereus sensu lato phage group I, four different endolysin groups have been detected, according to combinations of two types of CBD and four types of EAD. Group I phages have two copies of tail lysins and one copy of endolysin, but the functions of the tail lysins are still unknown. In the B. cereus sensu lato phage group II, the B. anthracis phages have been studied and applied for typing and rapid detection of pathogenic host strains. In the B. cereus sensu lato phage group III, the B. thuringiensis phages Bam35 and GIL01 have been studied to understand phage entry and lytic switch regulation mechanisms. In this review, we suggest that further study of the B. cereus group phages would be useful for various phage applications, such as biocontrol, typing, and rapid detection of the pathogens B. cereus and B. anthracis and for the prevention of phage contamination of the natural insect pesticide Bt. PMID:24264384

  1. Exploring a Nonmodel Teleost Genome Through RAD Sequencing—Linkage Mapping in Common Pandora, Pagellus erythrinus and Comparative Genomic Analysis

    PubMed Central

    Manousaki, Tereza; Tsakogiannis, Alexandros; Taggart, John B.; Palaiokostas, Christos; Tsaparis, Dimitris; Lagnel, Jacques; Chatziplis, Dimitrios; Magoulas, Antonios; Papandroulakis, Nikos; Mylonas, Constantinos C.; Tsigenopoulos, Costas S.

    2015-01-01

    Common pandora (Pagellus erythrinus) is a benthopelagic marine fish belonging to the teleost family Sparidae, and a newly recruited species in Mediterranean aquaculture. The paucity of genetic information relating to sparids, despite their growing economic value for aquaculture, provides the impetus for exploring the genomics of this fish group. Genomic tool development, such as genetic linkage maps provision, lays the groundwork for linking genotype to phenotype, allowing fine-mapping of loci responsible for beneficial traits. In this study, we applied ddRAD methodology to identify polymorphic markers in a full-sib family of common pandora. Employing the Illumina MiSeq platform, we sampled and sequenced a size-selected genomic fraction of 99 individuals, which led to the identification of 920 polymorphic loci. Downstream mapping analysis resulted in the construction of 24 robust linkage groups, corresponding to the karyotype of the species. The common pandora linkage map showed varying degrees of conserved synteny with four other teleost genomes, namely the European seabass (Dicentrarchus labrax), Nile tilapia (Oreochromis niloticus), stickleback (Gasterosteus aculeatus), and medaka (Oryzias latipes), suggesting a conserved genomic evolution in Sparidae. Our work exploits the possibilities of genotyping by sequencing to gain novel insights into genome structure and evolution. Such information will boost the study of cultured species and will set the foundation for a deeper understanding of the complex evolutionary history of teleosts. PMID:26715088

  2. A Comparative Genomic Analysis of Diverse Clonal Types of Enterotoxigenic Escherichia coli Reveals Pathovar-Specific Conservation

    PubMed Central

    Sahl, Jason W.; Steinsland, Hans; Redman, Julia C.; Angiuoli, Samuel V.; Nataro, James P.; Sommerfelt, Halvor; Rasko, David A.

    2011-01-01

    Enterotoxigenic Escherichia coli (ETEC) is a major cause of diarrheal illness in children less than 5 years of age in low- and middle-income nations, whereas it is an emerging enteric pathogen in industrialized nations. Despite being an important cause of diarrhea, little is known about the genomic composition of ETEC. To address this, we sequenced the genomes of five ETEC isolates obtained from children in Guinea-Bissau with diarrhea. These five isolates represent distinct and globally dominant ETEC clonal groups. Comparative genomic analyses utilizing a gene-independent whole-genome alignment method demonstrated that sequenced ETEC strains share approximately 2.7 million bases of genomic sequence. Phylogenetic analysis of this “core genome” confirmed the diverse history of the ETEC pathovar and provides a finer resolution of the E. coli relationships than multilocus sequence typing. No identified genomic regions were conserved exclusively in all ETEC genomes; however, we identified more genomic content conserved among ETEC genomes than among non-ETEC E. coli genomes, suggesting that ETEC isolates share a genomic core. Comparisons of known virulence and of surface-exposed and colonization factor genes across all sequenced ETEC genomes not only identified variability but also indicated that some antigens are restricted to the ETEC pathovar. Overall, the generation of these five genome sequences, in addition to the two previously generated ETEC genomes, highlights the genomic diversity of ETEC. These studies increase our understanding of ETEC evolution, as well as provide insight into virulence factors and conserved proteins, which may be targets for vaccine development. PMID:21078854

  3. Use of methylation filtration and C(0)t fractionation for analysis of genome composition and comparative genomics in bread wheat.

    PubMed

    Bandopadhyay, Rajib; Rustgi, Sachin; Chaudhuri, Rajat Kanti; Khurana, Paramjit; Khurana, Jitendra Paul; Tyagi, Akhilesh Kumar; Balyan, Harindra Singh; Houben, Andreas; Gupta, Pushpendra Kumar

    2011-07-20

    We investigated the compositional and structural differences in sequences derived from different fractions of wheat genomic DNA obtained using methylation filtration and C(0)t fractionation. Comparative analysis of these sequences revealed large compositional and structural variations in terms of GC content, different structural elements including repeat sequences (e.g., transposable elements and simple sequence repeats), protein-coding genes, and non-coding RNA genes. A correlation between methylation status [determined on the basis of selective inclusion/exclusion in the methylation-filtered (MF) library] of different repeat elements and expression level was observed. The expression levels were determined by comparing MF sequences with expressed sequence tags (ESTs) available in the public domain. Only a limited overlap among MF, high C(0)t (HC), and ESTs was observed, suggesting that these sequences may largely represent either low-copy non-transcribed sequences or genes with low expression levels. Thus, these results indicated a need to study MF and HC sequences along with ESTs to fully appreciate the complexity of the wheat gene space. PMID:21777856

  4. Comparative Genomic Analysis of Sulfurospirillum cavolei MES Reconstructed from the Metagenome of an Electrosynthetic Microbiome

    PubMed Central

    Ross, Daniel E.; Marshall, Christopher W.; May, Harold D.; Norman, R. Sean

    2016-01-01

    Sulfurospirillum spp. play an important role in sulfur and nitrogen cycling, and contain metabolic versatility that enables reduction of a wide range of electron acceptors, including thiosulfate, tetrathionate, polysulfide, nitrate, and nitrite. Here we describe the assembly of a Sulfurospirillum genome obtained from the metagenome of an electrosynthetic microbiome. The ubiquity and persistence of this organism in microbial electrosynthesis systems suggest it plays an important role in reactor stability and performance. Understanding why this organism is present and elucidating its genetic repertoire provide a genomic and ecological foundation for future studies where Sulfurospirillum are found, especially in electrode-associated communities. Metabolic comparisons and in-depth analysis of unique genes revealed potential ecological niche-specific capabilities within the Sulfurospirillum genus. The functional similarities common to all genomes, i.e., core genome, and unique gene clusters found only in a single genome were identified. Based upon 16S rRNA gene phylogenetic analysis and average nucleotide identity, the Sulfurospirillum draft genome was found to be most closely related to Sulfurospirillum cavolei. Characterization of the draft genome described herein provides pathway-specific details of the metabolic significance of the newly described Sulfurospirillum cavolei MES and, importantly, yields insight to the ecology of the genus as a whole. Comparison of eleven sequenced Sulfurospirillum genomes revealed a total of 6246 gene clusters in the pan-genome. Of the total gene clusters, 18.5% were shared among all eleven genomes and 50% were unique to a single genome. While most Sulfurospirillum spp. reduce nitrate to ammonium, five of the eleven Sulfurospirillum strains encode for a nitrous oxide reductase (nos) cluster with an atypical nitrous-oxide reductase, suggesting a utility for this genus in reduction of the nitrous oxide, and as a potential sink for this ...

  5. Genomic Sequencing and Comparative Analysis of Epstein-Barr Virus Genome Isolated from Primary Nasopharyngeal Carcinoma Biopsy

    PubMed Central

    Kwok, Hin; Tong, Amy H. Y.; Lin, Chi Ho; Lok, Si; Farrell, Paul J.; Kwong, Dora L. W.; Chiang, Alan K. S.

    2012-01-01

    Whether certain Epstein-Barr virus (EBV) strains are associated with pathogenesis of nasopharyngeal carcinoma (NPC) is still an unresolved question. In the present study, the EBV genome contained in a primary NPC tumor biopsy was amplified by polymerase chain reaction (PCR) and sequenced using next-generation (Illumina) and conventional dideoxy-DNA sequencing. The EBV genome, designated HKNPC1 (GenBank accession number JQ009376), is a type 1 EBV of approximately 171.5 kb. The virus appears to be a uniform strain, in line with the accepted monoclonal nature of EBV in NPC, but is heterogeneous at 172 nucleotide positions. Phylogenetic analysis with the four published EBV strains, B95-8, AG876, GD1, and GD2, indicated HKNPC1 was more closely related to the Chinese NPC patient-derived strains, GD1 and GD2. HKNPC1 contains 1,589 single nucleotide variations (SNVs) and 132 insertions or deletions (indels) in comparison to the reference EBV sequence (accession number NC_007605). When compared to AG876, a strain derived from Ghanaian Burkitt's lymphoma, we found 322 SNVs, of which 76 were non-synonymous SNVs and were shared amongst the Chinese GD1, GD2 and HKNPC1 isolates. We observed 88 non-synonymous SNVs shared only by HKNPC1 and GD2, the only other NPC tumor-derived strain reported thus far. Non-synonymous SNVs were mainly found in the latent, tegument and glycoprotein genes. The same point mutations were found in glycoprotein (BLLF1 and BALF4) genes of GD1, GD2 and HKNPC1 strains and might affect cell type specific binding. Variations in LMP1 and EBNA3B epitopes and mutations in Cp (11404 C>T) and Qp (50134 G>C) found in GD1, GD2 and HKNPC1 could potentially affect CD8+ T cell recognition and latent gene expression pattern in NPC, respectively. In conclusion, we showed that whole genome sequencing of EBV in NPC may facilitate discovery of previously unknown variations of pathogenic significance. PMID:22590638

  6. COMPARATIVE GENOMICS IN LEGUMES

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The legume plant family will soon include three sequenced genomes. The majority of the gene-containing portions of the model legumes Medicago truncatula and Lotus japonicus have been sequenced in clone-by-clone projects, and the sequencing of the soybean genome is underway in a whole-genome shotgun ...

  7. Genome sequence of Candida versatilis and comparative analysis with other yeast.

    PubMed

    Hou, Lihua; Guo, Lin; Wang, Chunling; Wang, Cong

    2016-08-01

    The genome of Candida versatilis was sequenced to understand its characteristics in soy sauce fermentation. The genome size of C. versatilis was 9.7 Mb, the G + C content was 39.74 %, and the scaffold N50 was 1,229,640 bp; the genome contained 4711 genes. A total of 269 tRNA genes and 2201 proteins with clear function were predicted. Moreover, the genome information of C. versatilis was compared with that of another salt-tolerant yeast, Zygosaccharomyces rouxii, and the model organism Saccharomyces cerevisiae. The genome sizes of C. versatilis and Z. rouxii were close, and both were smaller than the 12.1 Mb of S. cerevisiae. Using OrthoMCL, the proteins of the three genomes were divided into 4663 groups. There were about 3326 homologous proteins in C. versatilis, Z. rouxii and S. cerevisiae. PMID:27234221
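
    The scaffold N50 quoted in this abstract is a standard assembly statistic: the length L such that scaffolds of length L or longer together cover at least half of the assembly. The Python sketch below shows the usual way it is computed; the function name and the toy scaffold lengths are illustrative assumptions, not data from the study.

```python
def n50(lengths):
    """Return the N50 of a list of scaffold (or contig) lengths.

    Scaffolds are visited from longest to shortest, and the length at which
    the running total first reaches half of the assembly size is returned.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must contain at least one scaffold")


# Toy assembly of five hypothetical scaffolds (total length 20).
toy_n50 = n50([2, 3, 4, 5, 6])
```

    For the toy input, the two longest scaffolds (6 and 5) already cover 11 of the 20 bases, so the function returns 5.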

  8. Genome sequence, comparative analysis and haplotype structure of the domestic dog.

    PubMed

    Lindblad-Toh, Kerstin; Wade, Claire M; Mikkelsen, Tarjei S; Karlsson, Elinor K; Jaffe, David B; Kamal, Michael; Clamp, Michele; Chang, Jean L; Kulbokas, Edward J; Zody, Michael C; Mauceli, Evan; Xie, Xiaohui; Breen, Matthew; Wayne, Robert K; Ostrander, Elaine A; Ponting, Chris P; Galibert, Francis; Smith, Douglas R; DeJong, Pieter J; Kirkness, Ewen; Alvarez, Pablo; Biagi, Tara; Brockman, William; Butler, Jonathan; Chin, Chee-Wye; Cook, April; Cuff, James; Daly, Mark J; DeCaprio, David; Gnerre, Sante; Grabherr, Manfred; Kellis, Manolis; Kleber, Michael; Bardeleben, Carolyne; Goodstadt, Leo; Heger, Andreas; Hitte, Christophe; Kim, Lisa; Koepfli, Klaus-Peter; Parker, Heidi G; Pollinger, John P; Searle, Stephen M J; Sutter, Nathan B; Thomas, Rachael; Webber, Caleb; Baldwin, Jennifer; Abebe, Adal; Abouelleil, Amr; Aftuck, Lynne; Ait-Zahra, Mostafa; Aldredge, Tyler; Allen, Nicole; An, Peter; Anderson, Scott; Antoine, Claudel; Arachchi, Harindra; Aslam, Ali; Ayotte, Laura; Bachantsang, Pasang; Barry, Andrew; Bayul, Tashi; Benamara, Mostafa; Berlin, Aaron; Bessette, Daniel; Blitshteyn, Berta; Bloom, Toby; Blye, Jason; Boguslavskiy, Leonid; Bonnet, Claude; Boukhgalter, Boris; Brown, Adam; Cahill, Patrick; Calixte, Nadia; Camarata, Jody; Cheshatsang, Yama; Chu, Jeffrey; Citroen, Mieke; Collymore, Alville; Cooke, Patrick; Dawoe, Tenzin; Daza, Riza; Decktor, Karin; DeGray, Stuart; Dhargay, Norbu; Dooley, Kimberly; Dooley, Kathleen; Dorje, Passang; Dorjee, Kunsang; Dorris, Lester; Duffey, Noah; Dupes, Alan; Egbiremolen, Osebhajajeme; Elong, Richard; Falk, Jill; Farina, Abderrahim; Faro, Susan; Ferguson, Diallo; Ferreira, Patricia; Fisher, Sheila; FitzGerald, Mike; Foley, Karen; Foley, Chelsea; Franke, Alicia; Friedrich, Dennis; Gage, Diane; Garber, Manuel; Gearin, Gary; Giannoukos, Georgia; Goode, Tina; Goyette, Audra; Graham, Joseph; Grandbois, Edward; Gyaltsen, Kunsang; Hafez, Nabil; Hagopian, Daniel; Hagos, Birhane; Hall, Jennifer; Healy, Claire; Hegarty, Ryan; Honan, 
Tracey; Horn, Andrea; Houde, Nathan; Hughes, Leanne; Hunnicutt, Leigh; Husby, M; Jester, Benjamin; Jones, Charlien; Kamat, Asha; Kanga, Ben; Kells, Cristyn; Khazanovich, Dmitry; Kieu, Alix Chinh; Kisner, Peter; Kumar, Mayank; Lance, Krista; Landers, Thomas; Lara, Marcia; Lee, William; Leger, Jean-Pierre; Lennon, Niall; Leuper, Lisa; LeVine, Sarah; Liu, Jinlei; Liu, Xiaohong; Lokyitsang, Yeshi; Lokyitsang, Tashi; Lui, Annie; Macdonald, Jan; Major, John; Marabella, Richard; Maru, Kebede; Matthews, Charles; McDonough, Susan; Mehta, Teena; Meldrim, James; Melnikov, Alexandre; Meneus, Louis; Mihalev, Atanas; Mihova, Tanya; Miller, Karen; Mittelman, Rachel; Mlenga, Valentine; Mulrain, Leonidas; Munson, Glen; Navidi, Adam; Naylor, Jerome; Nguyen, Tuyen; Nguyen, Nga; Nguyen, Cindy; Nguyen, Thu; Nicol, Robert; Norbu, Nyima; Norbu, Choe; Novod, Nathaniel; Nyima, Tenchoe; Olandt, Peter; O'Neill, Barry; O'Neill, Keith; Osman, Sahal; Oyono, Lucien; Patti, Christopher; Perrin, Danielle; Phunkhang, Pema; Pierre, Fritz; Priest, Margaret; Rachupka, Anthony; Raghuraman, Sujaa; Rameau, Rayale; Ray, Verneda; Raymond, Christina; Rege, Filip; Rise, Cecil; Rogers, Julie; Rogov, Peter; Sahalie, Julie; Settipalli, Sampath; Sharpe, Theodore; Shea, Terrance; Sheehan, Mechele; Sherpa, Ngawang; Shi, Jianying; Shih, Diana; Sloan, Jessie; Smith, Cherylyn; Sparrow, Todd; Stalker, John; Stange-Thomann, Nicole; Stavropoulos, Sharon; Stone, Catherine; Stone, Sabrina; Sykes, Sean; Tchuinga, Pierre; Tenzing, Pema; Tesfaye, Senait; Thoulutsang, Dawa; Thoulutsang, Yama; Topham, Kerri; Topping, Ira; Tsamla, Tsamla; Vassiliev, Helen; Venkataraman, Vijay; Vo, Andy; Wangchuk, Tsering; Wangdi, Tsering; Weiand, Michael; Wilkinson, Jane; Wilson, Adam; Yadav, Shailendra; Yang, Shuli; Yang, Xiaoping; Young, Geneva; Yu, Qing; Zainoun, Joanne; Zembek, Lisa; Zimmer, Andrew; Lander, Eric S

    2005-12-01

    Here we report a high-quality draft genome sequence of the domestic dog (Canis familiaris), together with a dense map of single nucleotide polymorphisms (SNPs) across breeds. The dog is of particular interest because it provides important evolutionary information and because existing breeds show great phenotypic diversity for morphological, physiological and behavioural traits. We use sequence comparison with the primate and rodent lineages to shed light on the structure and evolution of genomes and genes. Notably, the majority of the most highly conserved non-coding sequences in mammalian genomes are clustered near a small subset of genes with important roles in development. Analysis of SNPs reveals long-range haplotypes across the entire dog genome, and defines the nature of genetic diversity within and across breeds. The current SNP map now makes it possible for genome-wide association studies to identify genes responsible for diseases and traits, with important consequences for human and companion animal health. PMID:16341006

  9. Comparative Analysis of Lacinutrix Genomes and Their Association with Bacterial Habitat.

    PubMed

    Lee, Yung Mi; Kim, Mi-Kyeong; Ahn, Do Hwan; Kim, Han-Woo; Park, Hyun; Shin, Seung Chul

    2016-01-01

    The genus Lacinutrix, which belongs to the family Flavobacteriaceae, consists of seven bacterial species that were mainly isolated from marine life and sediments. As most bacteria in the family Flavobacteriaceae favor aerobic conditions, the seven bacterial species in the genus Lacinutrix also showed aerobic growth. We selected four monophyletic bacterial species living in a polar environment. Two of these species were isolated from sediment and two types were isolated from algae. In a comparative analysis, we investigated how these different environments were related to genomic features of these four species in the genus Lacinutrix. We found that the gene sets for glycolysis, the Krebs cycle, and oxidative phosphorylation were conserved in these four type strains. However, the presence of nitrous oxide reductase for denitrification and the absence of essential components related to thiamin biosynthesis for aerobic respiration were only found in isolates from sediment. Elevated bacterial metabolism on the surface of marine sediments might limit the oxygen penetration into sediment, and such an environment might affect the genomes of bacteria isolated from these habitats. PMID:26882010

  10. The Mycobacterium DosR regulon structure and diversity revealed by comparative genomic analysis.

    PubMed

    Chen, Tian; He, Liming; Deng, Wanyan; Xie, Jianping

    2013-01-01

    Tuberculosis (TB), caused by Mycobacterium tuberculosis (Mtb), claims approximately two million lives annually and remains a global health concern. The non-replicating, dormancy-like state of this pathogen, which is impervious to anti-tuberculosis drugs, is widely recognized as the culprit. The dormancy survival regulator (DosR) regulon, composed of 48 co-regulated genes, is held to be essential for Mtb persistence. The DosR regulon is regulated by a two-component regulatory system consisting of two sensor kinases, DosS (Rv3132c) and DosT (Rv2027c), and a response regulator, DosR (Rv3133c). The underlying regulatory mechanism of DosR regulon expression is very complex. Many factors are involved, particularly the oxygen tension. The DosR regulon enables the pathogen to persist during lengthy hypoxia. Comparative genomic analysis demonstrated that the DosR regulon is widely distributed among the mycobacterial genomes, ranging from the pathogenic strains to the environmental strains. In-depth studies on the DosR response should provide insights into its role in TB latency in vivo and shape new measures to combat this exceedingly recalcitrant pathogen. PMID:22833514

  11. The (d)evolution of methanotrophy in the Beijerinckiaceae—a comparative genomics analysis

    PubMed Central

    Tamas, Ivica; Smirnova, Angela V; He, Zhiguo; Dunfield, Peter F

    2014-01-01

    The alphaproteobacterial family Beijerinckiaceae contains generalists that grow on a wide range of substrates, and specialists that grow only on methane and methanol. We investigated the evolution of this family by comparing the genomes of the generalist organotroph Beijerinckia indica, the facultative methanotroph Methylocella silvestris and the obligate methanotroph Methylocapsa acidiphila. Highly resolved phylogenetic construction based on universally conserved genes demonstrated that the Beijerinckiaceae forms a monophyletic cluster with the Methylocystaceae, the only other family of alphaproteobacterial methanotrophs. Phylogenetic analyses also demonstrated a vertical inheritance pattern of methanotrophy and methylotrophy genes within these families. Conversely, many lateral gene transfer (LGT) events were detected for genes encoding carbohydrate transport and metabolism, energy production and conversion, and transcriptional regulation in the genome of B. indica, suggesting that it has recently acquired these genes. A key difference between the generalist B. indica and its specialist methanotrophic relatives was an abundance of transporter elements, particularly periplasmic-binding proteins and major facilitator transporters. The most parsimonious scenario for the evolution of methanotrophy in the Alphaproteobacteria is that it occurred only once, when a methylotroph acquired methane monooxygenases (MMOs) via LGT. This was supported by a compositional analysis suggesting that all MMOs in Alphaproteobacteria methanotrophs are foreign in origin. Some members of the Beijerinckiaceae subsequently lost methanotrophic functions and regained the ability to grow on multicarbon energy substrates. We conclude that B. indica is a recidivist multitroph, the only known example of a bacterium having completely abandoned an evolved lifestyle of specialized methanotrophy. PMID:23985741

  12. Comparative Analysis of Lacinutrix Genomes and Their Association with Bacterial Habitat

    PubMed Central

    Lee, Yung Mi; Kim, Mi-Kyeong; Ahn, Do Hwan; Kim, Han-Woo; Park, Hyun; Shin, Seung Chul

    2016-01-01

    The genus Lacinutrix, which belongs to the family Flavobacteriaceae, consists of seven bacterial species that were mainly isolated from marine life and sediments. As most bacteria in the family Flavobacteriaceae favor aerobic conditions, the seven bacterial species in the genus Lacinutrix also showed aerobic growth. We selected four monophyletic bacterial species living in a polar environment. Two of these species were isolated from sediment and two types were isolated from algae. In a comparative analysis, we investigated how these different environments were related to genomic features of these four species in the genus Lacinutrix. We found that the gene sets for glycolysis, the Krebs cycle, and oxidative phosphorylation were conserved in these four type strains. However, the presence of nitrous oxide reductase for denitrification and the absence of essential components related to thiamin biosynthesis for aerobic respiration were only found in isolates from sediment. Elevated bacterial metabolism on the surface of marine sediments might limit the oxygen penetration into sediment, and such an environment might affect the genomes of bacteria isolated from these habitats. PMID:26882010

  13. Comparative genomic analysis and phenazine production of Pseudomonas chlororaphis, a plant growth-promoting rhizobacterium.

    PubMed

    Chen, Yawen; Shen, Xuemei; Peng, Huasong; Hu, Hongbo; Wang, Wei; Zhang, Xuehong

    2015-06-01

    Pseudomonas chlororaphis HT66, a plant growth-promoting rhizobacterium that produces phenazine-1-carboxamide with high yield, was compared with three genome-sequenced P. chlororaphis strains, GP72, 30-84 and O6. The genome sizes of the four strains vary from 6.66 to 7.30 Mb. Comparisons of predicted coding sequences indicated 4833 conserved genes in 5869-6455 protein-encoding genes. Phylogenetic analysis showed that the four strains are closely related to each other. Its competitive colonization indicates that P. chlororaphis can adapt well to its environment. No virulence or virulence-related factor was found in P. chlororaphis. All four strains could synthesize antimicrobial metabolites including different phenazines and insecticidal protein FitD. Some genes related to the regulation of phenazine biosynthesis were detected among the four strains. It was shown that P. chlororaphis is a safe PGPR in agricultural application and could also be used to produce some phenazine antibiotics with high yield. PMID:26484173
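    The conserved-gene count quoted above is the kind of figure that comes from intersecting per-strain gene sets. The sketch below is only an illustration under stated assumptions, not the authors' pipeline: it assumes each strain has already been reduced to a set of ortholog-group identifiers, and the strain names and group labels in the toy data are made up. Only the counts 4833, 5869 and 6455 come from the abstract.

```python
from typing import Dict, Set, Tuple


def core_and_accessory(
    genomes: Dict[str, Set[str]]
) -> Tuple[Set[str], Dict[str, Set[str]]]:
    """Split per-strain ortholog-group sets into a shared core and
    a strain-specific remainder (accessory genes)."""
    if not genomes:
        raise ValueError("need at least one genome")
    # Core genome: groups present in every strain.
    core = set.intersection(*genomes.values())
    # Accessory genome: whatever each strain carries beyond the core.
    accessory = {name: genes - core for name, genes in genomes.items()}
    return core, accessory


def core_fraction(core_size: int, genome_size: int) -> float:
    """Fraction of a strain's protein-coding genes that belong to the core."""
    return core_size / genome_size


# Hypothetical toy data: three strains, six ortholog groups.
toy = {
    "strainA": {"g1", "g2", "g3", "g4"},
    "strainB": {"g1", "g2", "g3", "g5"},
    "strainC": {"g1", "g2", "g6"},
}
core, accessory = core_and_accessory(toy)
print(sorted(core))

# Counts taken from the abstract: 4833 conserved genes out of
# 5869 to 6455 protein-encoding genes per strain.
for total in (5869, 6455):
    print(total, round(100 * core_fraction(4833, total), 1))
```

    With the abstract's counts, the conserved genes make up roughly three quarters to four fifths of each strain's protein-coding complement, depending on the strain's total gene count.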

  14. Emergence and Evolutionary Analysis of the Human DDR Network: Implications in Comparative Genomics and Downstream Analyses

    PubMed Central

    Arcas, Aida; Fernández-Capetillo, Oscar; Cases, Ildefonso; Rojas, Ana M.

    2014-01-01

    The DNA damage response (DDR) is a crucial signaling network that preserves the integrity of the genome. This network is an ensemble of distinct but often overlapping subnetworks, where different components fulfill distinct functions in precise spatial and temporal scenarios. To understand how these elements have been assembled together in humans, we performed comparative genomic analyses in 47 selected species to trace back their emergence using systematic phylogenetic analyses and estimated gene ages. The emergence of the contribution of posttranslational modifications to the complex regulation of DDR was also investigated. This is the first time a systematic analysis has focused on the evolution of DDR subnetworks as a whole. Our results indicate that a DDR core, mostly constructed around metabolic activities, appeared soon after the emergence of eukaryotes, and that additional regulatory capacities appeared later through complex evolutionary process. Potential key posttranslational modifications were also in place then, with interacting pairs preferentially appearing at the same evolutionary time, although modifications often led to the subsequent acquisition of new targets afterwards. We also found extensive gene loss in essential modules of the regulatory network in fungi, plants, and arthropods, important for their validation as model organisms for DDR studies. PMID:24441036

  15. Comparative genomic analysis and phenazine production of Pseudomonas chlororaphis, a plant growth-promoting rhizobacterium

    PubMed Central

    Chen, Yawen; Shen, Xuemei; Peng, Huasong; Hu, Hongbo; Wang, Wei; Zhang, Xuehong

    2015-01-01

    Pseudomonas chlororaphis HT66, a plant growth-promoting rhizobacterium that produces phenazine-1-carboxamide with high yield, was compared with three genome-sequenced P. chlororaphis strains, GP72, 30–84 and O6. The genome sizes of the four strains vary from 6.66 to 7.30 Mb. Comparisons of predicted coding sequences indicated 4833 conserved genes in 5869–6455 protein-encoding genes. Phylogenetic analysis showed that the four strains are closely related to each other. Its competitive colonization indicates that P. chlororaphis can adapt well to its environment. No virulence or virulence-related factor was found in P. chlororaphis. All four strains could synthesize antimicrobial metabolites including different phenazines and insecticidal protein FitD. Some genes related to the regulation of phenazine biosynthesis were detected among the four strains. It was shown that P. chlororaphis is a safe PGPR in agricultural application and could also be used to produce some phenazine antibiotics with high yield. PMID:26484173

  16. Comparative genome analysis of lignin biosynthesis gene families across the plant kingdom

    PubMed Central

    2009-01-01

    Background: As a major component of the plant cell wall, lignin plays important roles in mechanical support, water transport, and stress responses. Because lignin is the main cause of cell wall recalcitrance, its modification has been a major goal of bioenergy feedstock improvement. The study of the evolution and function of lignin biosynthesis genes thus has two-fold implications. First, the lignin biosynthesis pathway provides an excellent model to study the coordinated evolution of a biochemical pathway in plants. Second, understanding the function and evolution of lignin biosynthesis genes will guide us to develop better strategies for bioenergy feedstock improvement. Results: We analyzed lignin biosynthesis genes from fourteen plant species and one symbiotic fungal species. Comprehensive comparative genome analysis was carried out to study the distribution, relatedness, and family expansion of the lignin biosynthesis genes across the plant kingdom. In addition, we analyzed the comparative synteny map between rice and sorghum to study the evolution of lignin biosynthesis genes within the Poaceae family and the chromosome evolution between the two species. Comprehensive lignin biosynthesis gene expression analysis was performed in rice, poplar and Arabidopsis. The representative data from rice indicate that different fates of gene duplications exist for lignin biosynthesis genes. We also carried out the biomass composition analysis of nine Arabidopsis mutants with both MBMS analysis and traditional wet chemistry methods. The results were analyzed together with the genomics analysis. Conclusion: The research revealed that, among the species analyzed, the complete lignin biosynthesis pathway first appeared in moss; the pathway is absent in green algae. The expansion of lignin biosynthesis gene families correlates with substrate diversity. In addition, we found that the expansion of the gene families mostly occurred after the divergence of monocots

  17. Comparative genomic analysis of clinical and environmental Vibrio vulnificus isolates revealed biotype 3 evolutionary relationships

    PubMed Central

    Koton, Yael; Gordon, Michal; Chalifa-Caspi, Vered; Bisharat, Naiel

    2015-01-01

    In 1996 a common-source outbreak of severe soft tissue and bloodstream infections erupted among Israeli fish farmers and fish consumers due to changes in fish marketing policies. The causative pathogen was a new strain of Vibrio vulnificus, named biotype 3, which displayed a unique biochemical and genotypic profile. Initial observations suggested that the pathogen erupted as a result of genetic recombination between two distinct populations. We applied a whole genome shotgun sequencing approach using several V. vulnificus strains from Israel in order to study the pan genome of V. vulnificus and determine the phylogenetic relationship of biotype 3 with existing populations. The core genome of V. vulnificus based on 16 draft and complete genomes consisted of 3068 genes, representing between 59 and 78% of the whole genome of the 16 strains. The accessory genome varied in size from 781 to 2044 kbp. Phylogenetic analysis based on whole, core, and accessory genomes displayed similar clustering patterns with two main clusters, clinical (C) and environmental (E); all biotype 3 strains formed a distinct group within the E cluster. Annotation of accessory genomic regions found in biotype 3 strains and absent from the core genome yielded 1732 genes, of which the vast majority encoded hypothetical proteins, phage-related proteins, and mobile element proteins. A total of 1916 proteins (including 713 hypothetical proteins) were present in all human pathogenic strains (both biotype 3 and non-biotype 3) and absent from the environmental strains. Clustering analysis of the non-hypothetical proteins revealed 148 protein clusters shared by all human pathogenic strains; these included transcriptional regulators, arylsulfatases, methyl-accepting chemotaxis proteins, acetyltransferases, GGDEF family proteins, transposases, type IV secretory system (T4SS) proteins, and integrases. Our study showed that V. vulnificus biotype 3 evolved from environmental populations and formed a genetically

  18. Comparative analysis of fungal genomes reveals different plant cell wall degrading capacity in fungi

    PubMed Central

    2013-01-01

    Background: Fungi produce a variety of carbohydrate-active enzymes (CAZymes) for the degradation of plant polysaccharide materials to facilitate infection and/or gain nutrition. Identifying and comparing CAZymes from fungi with different nutritional modes or infection mechanisms may provide information for better understanding of their life styles and infection models. To date, hundreds of fungal genomes are publicly available. However, a systematic comparative analysis of fungal CAZymes across the entire fungal kingdom has not been reported. Results: In this study, we systematically identified glycoside hydrolases (GHs), polysaccharide lyases (PLs), carbohydrate esterases (CEs), and glycosyltransferases (GTs) as well as carbohydrate-binding modules (CBMs) in the predicted proteomes of 103 representative fungi from Ascomycota, Basidiomycota, Chytridiomycota, and Zygomycota. Comparative analysis of these CAZymes that play major roles in plant polysaccharide degradation revealed that fungi exhibit tremendous diversity in the number and variety of CAZymes. Among them, some families of GHs and CEs are the most prevalent CAZymes that are distributed in all of the fungi analyzed. Importantly, cellulases of some GH families are present in fungi that are not known to have cellulose-degrading ability. In addition, our results also showed that in general, plant pathogenic fungi have the highest number of CAZymes. Biotrophic fungi tend to have fewer CAZymes than necrotrophic and hemibiotrophic fungi. Pathogens of dicots often contain more pectinases than fungi infecting monocots. Interestingly, besides yeasts, many saprophytic fungi that are highly active in degrading plant biomass contain fewer CAZymes than plant pathogenic fungi. Furthermore, analysis of the gene expression profile of the wheat scab fungus Fusarium graminearum revealed that most of the CAZyme genes related to cell wall degradation were up-regulated during plant infection. Phylogenetic analysis also

  19. Comparative Genomic Analysis of Mycobacterium avium subspecies Obtained from Multiple Host Species

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A comparative genomic approach was used to identify large sequence polymorphisms among Mycobacterium avium (M. avium) subspecies obtained from a variety of host animals. DNA microarrays were used as a platform for comparing mycobacterial isolates with the sequenced bovine isolate M. avium subsp. p...

  20. Comparative Analysis of the Complete Chloroplast Genomes of Five Quercus Species

    PubMed Central

    Yang, Yanci; Zhou, Tao; Duan, Dong; Yang, Jia; Feng, Li; Zhao, Guifang

    2016-01-01

    Quercus is considered economically and ecologically one of the most important genera in the Northern Hemisphere. Oaks are taxonomically perplexing because of shared interspecific morphological traits and intraspecific morphological variation, which are mainly attributed to hybridization. Universal plastid markers cannot provide a sufficient number of variable sites to explore the phylogeny of this genus, and chloroplast genome-scale data have proven to be useful in resolving intractable phylogenetic relationships. In this study, the complete chloroplast genomes of four Quercus species were sequenced, and one published chloroplast genome of Quercus baronii was retrieved for comparative analyses. The five chloroplast genomes ranged from 161,072 bp (Q. baronii) to 161,237 bp (Q. dolicholepis) in length, and their gene organization and order, and GC content, were similar to those of other Fagaceae species. We analyzed nucleotide substitutions, indels, and repeats in the chloroplast genomes, and found 19 relatively highly variable regions that will potentially provide plastid markers for further taxonomic and phylogenetic studies within Quercus. We observed that four genes (ndhA, ndhK, petA, and ycf1) were subject to positive selection. The phylogenetic relationships of the Quercus species inferred from the chloroplast genomes obtained moderate-to-high support, indicating that chloroplast genome data may be useful in resolving relationships in this genus. PMID:27446185

  1. [Comparative genomics and evolutionary analysis of CRISPR loci in acetic acid bacteria].

    PubMed

    Kai, Xia; Xinle, Liang; Yudong, Li

    2015-12-01

    The clustered regularly interspaced short palindromic repeat (CRISPR) system is a widespread adaptive immunity system that exists in most archaea and many bacteria and acts against foreign DNA, such as phages, viruses and plasmids. In general, a CRISPR system consists of direct repeat, leader, spacer and CRISPR-associated sequences. Acetic acid bacteria (AAB) play an important role in industrial fermentation of vinegar and bioelectrochemistry. To investigate the polymorphism and evolution pattern of CRISPR loci in acetic acid bacteria, bioinformatic analyses were performed on 48 species from three main genera (Acetobacter, Gluconacetobacter and Gluconobacter) with whole genome sequences available from the NCBI database. The results showed that the CRISPR system existed in 32 species of the 48 strains studied. Most of the CRISPR-Cas systems in AAB belonged to the type I CRISPR-Cas system (subtypes E and C), but the type II CRISPR-Cas system, which contains the cas9 gene, was found only in the genera Acetobacter and Gluconacetobacter. The repeat sequences of some CRISPR loci were highly conserved among species from different genera, and the leader sequences of some CRISPR loci possessed a conserved motif, which was associated with regulated promoters. Moreover, phylogenetic analysis of cas1 demonstrated that cas1 genes were suitable for classification of species. The conservation of cas1 genes was associated with that of repeat sequences among different strains, suggesting they were subjected to similar functional constraints. Moreover, the number of spacers was positively correlated with the number of prophages and insertion sequences, indicating that the acetic acid bacteria were continually invaded by new foreign DNA. The comparative analysis of CRISPR loci in acetic acid bacteria provided the basis for investigating the molecular mechanism of different acetic acid tolerance and genome stability in acetic acid bacteria. PMID:26704949

  2. A Comparative Genomic Analysis of Energy Metabolism in Sulfate Reducing Bacteria and Archaea

    PubMed Central

    Pereira, Inês A. Cardoso; Ramos, Ana Raquel; Grein, Fabian; Marques, Marta Coimbra; da Silva, Sofia Marques; Venceslau, Sofia Santos

    2011-01-01

    The number of sequenced genomes of sulfate reducing organisms (SRO) has increased significantly in the recent years, providing an opportunity for a broader perspective into their energy metabolism. In this work we carried out a comparative survey of energy metabolism genes found in 25 available genomes of SRO. This analysis revealed a higher diversity of possible energy conserving pathways than classically considered to be present in these organisms, and permitted the identification of new proteins not known to be present in this group. The Deltaproteobacteria (and Thermodesulfovibrio yellowstonii) are characterized by a large number of cytochromes c and cytochrome c-associated membrane redox complexes, indicating that periplasmic electron transfer pathways are important in these bacteria. The Archaea and Clostridia groups contain practically no cytochromes c or associated membrane complexes. However, despite the absence of a periplasmic space, a few extracytoplasmic membrane redox proteins were detected in the Gram-positive bacteria. Several ion-translocating complexes were detected in SRO including H+-pyrophosphatases, complex I homologs, Rnf, and Ech/Coo hydrogenases. Furthermore, we found evidence that cytoplasmic electron bifurcating mechanisms, recently described for other anaerobes, are also likely to play an important role in energy metabolism of SRO. A number of cytoplasmic [NiFe] and [FeFe] hydrogenases, formate dehydrogenases, and heterodisulfide reductase-related proteins are likely candidates to be involved in energy coupling through electron bifurcation, from diverse electron donors such as H2, formate, pyruvate, NAD(P)H, β-oxidation, and others. In conclusion, this analysis indicates that energy metabolism of SRO is far more versatile than previously considered, and that both chemiosmotic and flavin-based electron bifurcating mechanisms provide alternative strategies for energy conservation. PMID:21747791

  3. Comparative genomic analysis of the thermophilic biomass-degrading fungi Myceliophthora thermophila and Thielavia terrestris.

    PubMed

    Berka, Randy M; Grigoriev, Igor V; Otillar, Robert; Salamov, Asaf; Grimwood, Jane; Reid, Ian; Ishmael, Nadeeza; John, Tricia; Darmond, Corinne; Moisan, Marie-Claude; Henrissat, Bernard; Coutinho, Pedro M; Lombard, Vincent; Natvig, Donald O; Lindquist, Erika; Schmutz, Jeremy; Lucas, Susan; Harris, Paul; Powlowski, Justin; Bellemare, Annie; Taylor, David; Butler, Gregory; de Vries, Ronald P; Allijn, Iris E; van den Brink, Joost; Ushinsky, Sophia; Storms, Reginald; Powell, Amy J; Paulsen, Ian T; Elbourne, Liam D H; Baker, Scott E; Magnuson, Jon; Laboissiere, Sylvie; Clutterbuck, A John; Martinez, Diego; Wogulis, Mark; de Leon, Alfredo Lopez; Rey, Michael W; Tsang, Adrian

    2011-10-01

    Thermostable enzymes and thermophilic cell factories may afford economic advantages in the production of many chemicals and biomass-based fuels. Here we describe and compare the genomes of two thermophilic fungi, Myceliophthora thermophila and Thielavia terrestris. To our knowledge, these genomes are the first described for thermophilic eukaryotes and the first complete telomere-to-telomere genomes for filamentous fungi. Genome analyses and experimental data suggest that both thermophiles are capable of hydrolyzing all major polysaccharides found in biomass. Examination of transcriptome data and secreted proteins suggests that the two fungi use shared approaches in the hydrolysis of cellulose and xylan but distinct mechanisms in pectin degradation. Characterization of the biomass-hydrolyzing activity of recombinant enzymes suggests that these organisms are highly efficient in biomass decomposition at both moderate and high temperatures. Furthermore, we present evidence suggesting that aside from representing a potential reservoir of thermostable enzymes, thermophilic fungi are amenable to manipulation using classical and molecular genetics. PMID:21964414

  4. Comparative genomic analysis of the thermophilic biomass-degrading fungi Myceliophthora thermophila and Thielavia terrestris

    SciTech Connect

    Berka, Randy M.; Grigoriev, Igor V.; Otillar, Robert; Salamov, Asaf; Grimwood, Jane; Reid, Ian; Ishmael, Nadeeza; John, Tricia; Darmond, Corinne; Moisan, Marie-Claude; Henrissat, Bernard; Coutinho, Pedro M.; Lombard, Vincent; Natvig, Donald O.; Lindquist, Erika; Schmutz, Jeremy; Lucas, Susan; Harris, Paul; Powlowski, Justin; Bellemare, Annie; Taylor, David; Butler, Gregory; de Vries, Ronald P.; Allijn, Iris E.; van den Brink, Joost; Ushinsky, Sophia; Storms, Reginald; Powell, Amy J.; Paulsen, Ian T.; Elbourne, Liam D. H.; Baker, Scott E.; Magnuson, Jon; LaBoissiere, Sylvie; Clutterbuck, A. John; Martinez, Diego; Wogulis, Mark; Lopez de Leon, Alfredo; Rey, Michael W.; Tsang, Adrian

    2011-05-16

    Thermostable enzymes and thermophilic cell factories may afford economic advantages in the production of many chemicals and biomass-based fuels. Here we describe and compare the genomes of two thermophilic fungi, Myceliophthora thermophila and Thielavia terrestris. To our knowledge, these genomes are the first described for thermophilic eukaryotes and the first complete telomere-to-telomere genomes for filamentous fungi. Genome analyses and experimental data suggest that both thermophiles are capable of hydrolyzing all major polysaccharides found in biomass. Examination of transcriptome data and secreted proteins suggests that the two fungi use shared approaches in the hydrolysis of cellulose and xylan but distinct mechanisms in pectin degradation. Characterization of the biomass-hydrolyzing activity of recombinant enzymes suggests that these organisms are highly efficient in biomass decomposition at both moderate and high temperatures. Furthermore, we present evidence suggesting that aside from representing a potential reservoir of thermostable enzymes, thermophilic fungi are amenable to manipulation using classical and molecular genetics.

  5. Comparative genomic analysis of the thermophilic biomass-degrading fungi Myceliophthora thermophila and Thielavia terrestris

    SciTech Connect

    Berka, Randy; Grigoriev, Igor V.; Otillar, Robert P.; Salamov, Asaf; Grimwood, Jane; Reid, Ian; Ishmael, Nadeeza; John, Tricia; Darmond, Corinne; Moisan, Marie-Claude; Henrissat, Bernard; Coutinho, Pedro M.; Lombard, Vincent; Natvig, Donald O.; Lindquist, Erika; Schmutz, Jeremy; Lucas, Susan; Harris, Paul; Powlowski, Justin; Bellemare, Annie; Taylor, David; Butler, Gregory; de Vries, Ronald P.; Allijn, Iris E.; van den Brink, Joost; Ushinsky, Sophia; Storms, Reginald; Powell, Amy J.; Paulsen, Ian T.; Elbourne, Liam D. H.; Baker, Scott E.; Magnuson, Jon K.; LaBoissiere, Sylvie; Martinez, Diego; Wogulis, Mark; Lopez de Leon, Alfredo; Rey, Michael; Tsang, Adrian

    2011-10-02

    Thermostable enzymes and thermophilic cell factories may afford economic advantages in the production of many chemicals and biomass-based fuels. Here we describe and compare the genomes of two thermophilic fungi, Myceliophthora thermophila and Thielavia terrestris. To our knowledge, these genomes are the first described for thermophilic eukaryotes and the first complete telomere-to-telomere genomes for filamentous fungi. Genome analyses and experimental data suggest that both thermophiles are capable of hydrolyzing all major polysaccharides found in biomass. Examination of transcriptome data and secreted proteins suggests that the two fungi use shared approaches in the hydrolysis of cellulose and xylan but distinct mechanisms in pectin degradation. Characterization of the biomass-hydrolyzing activity of recombinant enzymes suggests that these organisms are highly efficient in biomass decomposition at both moderate and high temperatures. Furthermore, we present evidence suggesting that aside from representing a potential reservoir of thermostable enzymes, thermophilic fungi are amenable to manipulation using classical and molecular genetics.

  6. Comparative mitochondrial genome analysis reveals the evolutionary rearrangement mechanism in Brassica.

    PubMed

    Yang, J; Liu, G; Zhao, N; Chen, S; Liu, D; Ma, W; Hu, Z; Zhang, M

    2016-05-01

    The genus Brassica has many species that are important for oil, vegetable and other food products. Three mitochondrial genome types (mitotypes) originated from its common ancestor. In this paper, a B. nigra mitochondrial main circle genome with 232,407 bp was generated through de novo assembly. Synteny analysis showed that the mitochondrial genomes of B. rapa and B. oleracea had a better syntenic relationship with each other than with B. nigra. Principal components analysis and development of a phylogenetic tree indicated the maternal ancestors of the three allotetraploid species in U's triangle of Brassica. Diversified mitotypes were found in allotetraploid B. napus, in which napus-type B. napus was derived from B. oleracea, while polima-type B. napus was inherited from B. rapa. In addition, the mitochondrial genome of napus-type B. napus was closer to botrytis-type than capitata-type B. oleracea. The sub-stoichiometric shifting of several mitochondrial genes suggested that mitochondrial genome rearrangement underwent evolutionary selection during domestication and/or plant breeding. Our findings clarify the role of diploid species in the maternal origin of allotetraploid species in Brassica and suggest the possibility of breeding selection of the mitochondrial genome. PMID:27079962

  7. Comparative whole-genome analysis of virulent and avirulent strains of Porphyromonas gingivalis.

    PubMed

    Chen, Tsute; Hosogi, Yumiko; Nishikawa, Kiyoshi; Abbey, Kevin; Fleischmann, Robert D; Walling, Jennifer; Duncan, Margaret J

    2004-08-01

    We used Porphyromonas gingivalis gene microarrays to compare the total gene contents of the virulent strain W83 and the avirulent type strain, ATCC 33277. Signal ratios and scatter plots indicated that the chromosomes were very similar, with approximately 93% of the predicted genes in common, while at least 7% of them showed very low or no signals in ATCC 33277. Verification of the array results by PCR indicated that several of the disparate genes were either absent from or variant in ATCC 33277. Divergent features included already reported insertion sequences and ragB, as well as additional hypothetical and functionally assigned genes. Several of the latter were organized in a putative operon in W83 and encoded enzymes involved in capsular polysaccharide synthesis. Another cluster was associated with two paralogous regions of the chromosome with a low G+C content, at 41%, compared to that of the whole genome, at 48%. These regions also contained conserved and species-specific hypothetical genes, transposons, insertion sequences, and integrases and were located adjacent to tRNA genes; thus, they had several characteristics of pathogenicity islands. While this global comparative analysis showed the close relationship between W83 and ATCC 33277, the clustering of genes that are present in W83 but divergent in or absent from ATCC 33277 is suggestive of chromosomal islands that may have been acquired by lateral gene transfer. PMID:15292149
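
    The abstract above flags two chromosomal regions because their G+C content (41%) is lower than that of the whole genome (48%). As a minimal sketch of how such regions could be screened for, assuming a simple sliding-window scan (the function names, window size, step, and margin are illustrative assumptions, not the authors' method):

```python
def gc_content(seq: str) -> float:
    """Return the G+C fraction of a DNA sequence, ignoring non-ACGT symbols."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / counted


def low_gc_windows(seq: str, window: int, step: int, genome_gc: float, margin: float):
    """Yield (start, gc) for windows whose G+C is below genome_gc - margin.

    window, step, and margin are hypothetical tuning parameters.
    """
    for start in range(0, len(seq) - window + 1, step):
        gc = gc_content(seq[start:start + window])
        if gc < genome_gc - margin:
            yield start, gc


if __name__ == "__main__":
    # Toy sequence: a GC-rich half followed by an AT-rich half.
    toy = "GCGCGCGC" + "ATATATAT"
    for start, gc in low_gc_windows(toy, window=8, step=8, genome_gc=0.5, margin=0.05):
        print(start, gc)
```

    On a real genome the window would typically span kilobases, and a low-G+C window is only a hint; the abstract also relies on adjacent tRNA genes, integrases, and insertion sequences before calling a region island-like.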

  8. Comparative genomic analysis of phylogenetically closely related Hydrogenobaculum sp. isolates from Yellowstone National Park.

    PubMed

    Romano, Christine; D'Imperio, Seth; Woyke, Tanja; Mavromatis, Konstantinos; Lasken, Roger; Shock, Everett L; McDermott, Timothy R

    2013-05-01

    We describe the complete genome sequences of four closely related Hydrogenobaculum sp. isolates (≥ 99.7% 16S rRNA gene identity) that were isolated from the outflow channel of Dragon Spring (DS), Norris Geyser Basin, in Yellowstone National Park (YNP), WY. The genomes range in size from 1,552,607 to 1,552,931 bp, contain 1,667 to 1,676 predicted genes, and are highly syntenic. There are subtle differences among the DS isolates, which as a group are different from Hydrogenobaculum sp. strain Y04AAS1 that was previously isolated from a geographically distinct YNP geothermal feature. Genes unique to the DS genomes encode arsenite [As(III)] oxidation, NADH-ubiquinone-plastoquinone (complex I), NADH-ubiquinone oxidoreductase chain, a DNA photolyase, and elements of a type II secretion system. Functions unique to strain Y04AAS1 include thiosulfate metabolism, nitrate respiration, and mercury resistance determinants. DS genomes contain seven CRISPR loci that are almost identical but are different from the single CRISPR locus in strain Y04AAS1. Other differences between the DS and Y04AAS1 genomes include average nucleotide identity (94.764%) and percentage conserved DNA (80.552%). Approximately half of the genes unique to Y04AAS1 are predicted to have been acquired via horizontal gene transfer. Fragment recruitment analysis and marker gene searches demonstrated that the DS metagenome was more similar to the DS genomes than to the Y04AAS1 genome, but that the DS community is likely comprised of a continuum of Hydrogenobaculum genotypes that span from the DS genomes described here to an Y04AAS1-like organism, which appears to represent a distinct ecotype relative to the DS genomes characterized. PMID:23435891

  9. Comparative Genomic Analysis of Phylogenetically Closely Related Hydrogenobaculum sp. Isolates from Yellowstone National Park

    PubMed Central

    Romano, Christine; D'Imperio, Seth; Woyke, Tanja; Mavromatis, Konstantinos; Lasken, Roger; Shock, Everett L.

    2013-01-01

    We describe the complete genome sequences of four closely related Hydrogenobaculum sp. isolates (≥99.7% 16S rRNA gene identity) that were isolated from the outflow channel of Dragon Spring (DS), Norris Geyser Basin, in Yellowstone National Park (YNP), WY. The genomes range in size from 1,552,607 to 1,552,931 bp, contain 1,667 to 1,676 predicted genes, and are highly syntenic. There are subtle differences among the DS isolates, which as a group are different from Hydrogenobaculum sp. strain Y04AAS1 that was previously isolated from a geographically distinct YNP geothermal feature. Genes unique to the DS genomes encode arsenite [As(III)] oxidation, NADH-ubiquinone-plastoquinone (complex I), NADH-ubiquinone oxidoreductase chain, a DNA photolyase, and elements of a type II secretion system. Functions unique to strain Y04AAS1 include thiosulfate metabolism, nitrate respiration, and mercury resistance determinants. DS genomes contain seven CRISPR loci that are almost identical but are different from the single CRISPR locus in strain Y04AAS1. Other differences between the DS and Y04AAS1 genomes include average nucleotide identity (94.764%) and percentage conserved DNA (80.552%). Approximately half of the genes unique to Y04AAS1 are predicted to have been acquired via horizontal gene transfer. Fragment recruitment analysis and marker gene searches demonstrated that the DS metagenome was more similar to the DS genomes than to the Y04AAS1 genome, but that the DS community is likely comprised of a continuum of Hydrogenobaculum genotypes that span from the DS genomes described here to an Y04AAS1-like organism, which appears to represent a distinct ecotype relative to the DS genomes characterized. PMID:23435891

  10. Genome-wide comparative analysis of the transposable elements in the related species Arabidopsis thaliana and Brassica oleracea.

    PubMed

    Zhang, Xiaoyu; Wessler, Susan R

    2004-04-13

    Transposable elements (TEs) are the major component of plant genomes where they contribute significantly to the >1,000-fold genome size variation. To understand the dynamics of TE-mediated genome expansion, we have undertaken a comparative analysis of the TEs in two related organisms: the weed Arabidopsis thaliana (125 megabases) and Brassica oleracea (approximately 600 megabases), a species with many crop plants. Comparison of the whole genome sequence of A. thaliana with a partial draft of B. oleracea has permitted an estimation of the patterns of TE amplification, diversification, and loss that has occurred in related species since their divergence from a common ancestor. Although we find that nearly all TE lineages are shared, the number of elements in each lineage is almost always greater in B. oleracea. Class 1 (retro) elements are the most abundant TE class in both species with LTR and non-LTR elements comprising the largest fraction of each genome. However, several families of class 2 (DNA) elements have amplified to very high copy number in B. oleracea where they have contributed significantly to genome expansion. Taken together, the results of this analysis indicate that amplification of both class 1 and class 2 TEs is responsible, in part, for B. oleracea genome expansion since divergence from a common ancestor with A. thaliana. In addition, the observation that B. oleracea and A. thaliana share virtually all TE lineages makes it unlikely that wholesale removal of TEs is responsible for the compact genome of A. thaliana. PMID:15064405

  11. Complete genome sequence of Nitrobacter hamburgensis X14 and comparative genomic analysis of species within the genus Nitrobacter.

    SciTech Connect

    Starkenburg, Shawn R; Larimer, Frank W; Stein, Lisa Y; Klotz, Martin G; Chain, Patrick S. G.; Sayavedra-Soto, LA; Poret-Peterson, Amisha T.; Gentry, ME; Arp, D J; Ward, Bess B.; Bottomley, Peter J

    2008-05-01

    The alphaproteobacterium Nitrobacter hamburgensis X14 is a gram-negative facultative chemolithoautotroph that conserves energy from the oxidation of nitrite to nitrate. Sequencing and analysis of the Nitrobacter hamburgensis X14 genome revealed four replicons comprised of one chromosome (4.4 Mbp) and three plasmids (294, 188, and 121 kbp). Over 20% of the genome is composed of pseudogenes and paralogs. Whole-genome comparisons were conducted between N. hamburgensis and the finished and draft genome sequences of Nitrobacter winogradskyi and Nitrobacter sp. strain Nb-311A, respectively. Most of the plasmid-borne genes were unique to N. hamburgensis and encode a variety of functions (central metabolism, energy conservation, conjugation, and heavy metal resistance), yet approximately 21 kb of an approximately 28-kb "autotrophic" island on the largest plasmid was conserved in the chromosomes of Nitrobacter winogradskyi Nb-255 and Nitrobacter sp. strain Nb-311A. The N. hamburgensis chromosome also harbors many unique genes, including those for heme-copper oxidases, cytochrome b(561), and putative pathways for the catabolism of aromatic, organic, and one-carbon compounds, which help verify and extend its mixotrophic potential. A Nitrobacter "subcore" genome was also constructed by removing homologs found in strains of the closest evolutionary relatives, Bradyrhizobium japonicum and Rhodopseudomonas palustris. Among the Nitrobacter subcore inventory (116 genes), copies of genes or gene clusters for nitrite oxidoreductase (NXR), cytochromes associated with a dissimilatory nitrite reductase (NirK), PII-like regulators, and polysaccharide formation were identified. Many of the subcore genes have diverged significantly from, or have origins outside, the alphaproteobacterial lineage and may indicate some of the unique genetic requirements for nitrite oxidation in Nitrobacter.

  12. Sequencing and comparative genomic analysis of 1227 Felis catus cDNA sequences enriched for developmental, clinical and nutritional phenotypes

    PubMed Central

    2012-01-01

    Background: The feline genome is valuable to the veterinary and model organism genomics communities because the cat is an obligate carnivore and a model for endangered felids. The initial public release of the Felis catus genome assembly provided a framework for investigating the genomic basis of feline biology. However, the entire set of protein coding genes has not been elucidated. Results: We identified and characterized 1227 protein coding feline sequences, of which 913 map to public sequences and 314 are novel. These sequences have been deposited into NCBI's GenBank database and complement public genomic resources by providing additional protein coding sequences that fill in some of the gaps in the feline genome assembly. Through functional and comparative genomic analyses, we gained an understanding of the role of these sequences in feline development, nutrition and health. Specifically, we identified 104 orthologs of human genes associated with Mendelian disorders. We detected negative selection within sequences with gene ontology annotations associated with intracellular trafficking, cytoskeleton and muscle functions. We detected relatively less negative selection on protein sequences encoding extracellular networks, apoptotic pathways and mitochondrial gene ontology annotations. Additionally, we characterized feline cDNA sequences that have mouse orthologs associated with clinical, nutritional and developmental phenotypes. Together, this analysis provides an overview of the value of our cDNA sequences and enhances our understanding of how the feline genome is similar to, and different from, other mammalian genomes. Conclusions: The cDNA sequences reported here expand existing feline genomic resources by providing high-quality sequences annotated with comparative genomic information providing functional, clinical, nutritional and orthologous gene information. PMID:22257742

  13. Mapping Reads on a Genomic Sequence: An Algorithmic Overview and a Practical Comparative Analysis

    PubMed Central

    Martin, Véronique; Zytnicki, Matthias; Fayolle, Julien; Loux, Valentin; Gibrat, Jean-François

    2012-01-01

    Mapping short reads against a reference genome is classically the first step of many next-generation sequencing data analyses, and it should be as accurate as possible. Because of the large number of reads to handle, numerous sophisticated algorithms have been developed in the last 3 years to tackle this problem. In this article, we first review the underlying algorithms used in most of the existing mapping tools, and then we compare the performance of nine of these tools on a well-controlled benchmark built for this purpose. We built a set of reads that exist in single or multiple copies in a reference genome and for which there is no mismatch, and a set of reads with three mismatches. We considered as reference genome both the human genome and a concatenation of all complete bacterial genomes. On each dataset, we quantified the capacity of the different tools to retrieve all the occurrences of the reads in the reference genome. Special attention was paid to reads uniquely reported and to reads with multiple hits. PMID:22506536
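
    The benchmark described above asks whether a tool retrieves every occurrence of a read in the reference, with zero or three mismatches, and separates uniquely mapped reads from reads with multiple hits. A minimal brute-force sketch of that ground truth follows. It is an illustration only: the function names are hypothetical, and the tools reviewed in the article use indexed algorithms rather than this quadratic scan.

```python
from __future__ import annotations


def within_mismatches(a: str, b: str, max_mismatches: int) -> bool:
    """True if equal-length strings a and b differ at no more than max_mismatches positions."""
    mismatches = 0
    for x, y in zip(a, b):
        if x != y:
            mismatches += 1
            if mismatches > max_mismatches:
                return False
    return True


def map_read(reference: str, read: str, max_mismatches: int = 0) -> list[int]:
    """Return every 0-based start position where read aligns without gaps."""
    n, m = len(reference), len(read)
    if m == 0 or m > n:
        return []
    return [
        i for i in range(n - m + 1)
        if within_mismatches(reference[i:i + m], read, max_mismatches)
    ]


def classify(hits: list[int]) -> str:
    """Label a read as unmapped, unique, or multiple, based on its hit list."""
    if not hits:
        return "unmapped"
    return "unique" if len(hits) == 1 else "multiple"


if __name__ == "__main__":
    reference = "ACGTACGTTT"
    for read in ("ACGT", "CGTT", "GGGG"):
        hits = map_read(reference, read)
        print(read, hits, classify(hits))
```

    A scan like this is only practical for toy inputs, but it can serve as a reference answer when checking whether a faster mapper misses occurrences.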

  14. Comparative genomics and transcriptional analysis of prophages identified in the genomes of Lactobacillus gasseri, Lactobacillus salivarius, and Lactobacillus casei.

    PubMed

    Ventura, Marco; Canchaya, Carlos; Bernini, Valentina; Altermann, Eric; Barrangou, Rodolphe; McGrath, Stephen; Claesson, Marcus J; Li, Yin; Leahy, Sinead; Walker, Carey D; Zink, Ralf; Neviani, Erasmo; Steele, Jim; Broadbent, Jeff; Klaenhammer, Todd R; Fitzgerald, Gerald F; O'toole, Paul W; van Sinderen, Douwe

    2006-05-01

    Lactobacillus gasseri ATCC 33323, Lactobacillus salivarius subsp. salivarius UCC 118, and Lactobacillus casei ATCC 334 contain one (LgaI), four (Sal1, Sal2, Sal3, Sal4), and one (Lca1) distinguishable prophage sequences, respectively. Sequence analysis revealed that LgaI, Lca1, Sal1, and Sal2 prophages belong to the group of Sfi11-like pac site and cos site Siphoviridae, respectively. Phylogenetic investigation of these newly described prophage sequences revealed that they have not followed an evolutionary development similar to that of their bacterial hosts and that they show a high degree of diversity, even within a species. The attachment sites were determined for all these prophage elements; LgaI as well as Sal1 integrates in tRNA genes, while prophage Sal2 integrates in a predicted arginino-succinate lyase-encoding gene. In contrast, Lca1 and the Sal3 and Sal4 prophage remnants are integrated in noncoding regions in the L. casei ATCC 334 and L. salivarius UCC 118 genomes. Northern analysis showed that large parts of the prophage genomes are transcriptionally silent and that transcription is limited to genome segments located near the attachment site. Finally, pulsed-field gel electrophoresis followed by Southern blot hybridization with specific prophage probes indicates that these prophage sequences are narrowly distributed within lactobacilli. PMID:16672450

  15. Comparative genomic hybridization analysis of invasive ductal breast carcinomas in the Chinese population

    PubMed Central

    ZHANG, JIANWEI; ZHANG, HONGYAN; XU, XIN; WANG, MINGRONG; YU, ZHONGHE

    2015-01-01

    Breast cancer is the most common malignancy in Chinese women. The aim of the present study was to investigate the genetic alterations that occur in breast cancer cells in Chinese women. Comparative genomic hybridization (CGH) analysis was performed on 34 tumors obtained from patients with primary invasive ductal breast carcinoma (IDC). Recurrent genetic alterations in breast cancer include gains on chromosomes 1q (59%), 16p (50%), 17q (44%), 8q (38%), 11q (32%), 20q (32%), 1p (24%), 20p (24%), 19q (21%) and 19p (18%). Losses are common on chromosomes 6q (15%), 8p (12%), 18 (12%), 4q (9%), X (9%) and 17p (9%). In the present study, high-level amplifications were observed on chromosomes 1q32, 8p, 11q13, 17q and 20q. Overall, the chromosomal DNA gains observed were consistent with the changes reported in Caucasian populations. However, the incidence of chromosomal DNA loss was lower in the present study compared with the incidence reported in the literature. The present results demonstrate the pattern of chromosomal imbalances in the invasive ductal breast carcinomas of Chinese females. PMID:26622803

  16. Cytogenetic analysis of myxoid liposarcoma and myxofibrosarcoma by array‐based comparative genomic hybridisation

    PubMed Central

    Ohguri, T; Hisaoka, M; Kawauchi, S; Sasaki, K; Aoki, T; Kanemitsu, S; Matsuyama, A; Korogi, Y; Hashimoto, H

    2006-01-01

    Aim: To investigate overall chromosomal alterations using array‐based comparative genomic hybridisation (CGH) of myxoid liposarcomas (MLSs) and myxofibrosarcomas (MFSs). Materials and methods: Genomic DNA extracted from fresh‐frozen tumour tissues was labelled with fluorochromes and then hybridised on to an array consisting of 1440 bacterial artificial chromosome clones representing regions throughout the entire human genome important in cytogenetics and oncology. Results: DNA copy number aberrations (CNAs) were found in all the 8 MFSs, but no alterations were found in 7 (70%) of 10 MLSs. In MFSs, the most frequent CNAs were gains at 7p21.1–p22.1 and 12q15–q21.1 and a loss at 13q14.3–q34. The second most frequent CNAs were gains at 7q33–q35, 9q22.31–q22.33, 12p13.32–pter, 17q22–q23, Xp11.2 and Xq12 and losses at 10p13–p14, 10q25, 11p11–p14, 11q23.3–q25, 20p11–p12 and 21q22.13–q22.2, which were detected in 38% of the MFSs examined. In MLSs, only a few CNAs were found in two sarcomas with gains at 8p21.2–p23.3, 8q11.22–q12.2 and 8q23.1–q24.3, and in one with gains at 5p13.2–p14.3 and 5q11.2–5q35.2 and a loss at 21q22.2–qter. Conclusions: MFS has more frequent and diverse CNAs than MLS, which reinforces the hypothesis that MFS is genetically different from MLS. Our array CGH analysis may also provide several entry points for the identification of candidate genes associated with oncogenesis and progression in MFS. PMID:16751306

  17. MicroScope--an integrated microbial resource for the curation and comparative analysis of genomic and metabolic data.

    PubMed

    Vallenet, David; Belda, Eugeni; Calteau, Alexandra; Cruveiller, Stéphane; Engelen, Stefan; Lajus, Aurélie; Le Fèvre, François; Longin, Cyrille; Mornico, Damien; Roche, David; Rouy, Zoé; Salvignol, Gregory; Scarpelli, Claude; Thil Smith, Adam Alexander; Weiman, Marion; Médigue, Claudine

    2013-01-01

    MicroScope is an integrated platform dedicated to both the methodical updating of microbial genome annotation and to comparative analysis. The resource provides data from completed and ongoing genome projects (automatic and expert annotations), together with data sources from post-genomic experiments (i.e. transcriptomics, mutant collections) allowing users to perfect and improve the understanding of gene functions. MicroScope (http://www.genoscope.cns.fr/agc/microscope) combines tools and graphical interfaces to analyse genomes and to perform the manual curation of gene annotations in a comparative context. Since its first publication in January 2006, the system (previously named MaGe for Magnifying Genomes) has been continuously extended both in terms of data content and analysis tools. The last update of MicroScope was published in 2009 in the Database journal. Today, the resource contains data for >1600 microbial genomes, of which ∼300 are manually curated and maintained by biologists (1200 personal accounts today). Expert annotations are continuously gathered in the MicroScope database (∼50 000 a year), contributing to the improvement of the quality of microbial genomes annotations. Improved data browsing and searching tools have been added, original tools useful in the context of expert annotation have been developed and integrated and the website has been significantly redesigned to be more user-friendly. Furthermore, in the context of the European project Microme (Framework Program 7 Collaborative Project), MicroScope is becoming a resource providing for the curation and analysis of both genomic and metabolic data. An increasing number of projects are related to the study of environmental bacterial (meta)genomes that are able to metabolize a large variety of chemical compounds that may be of high industrial interest. PMID:23193269

  18. Beginner’s guide to comparative bacterial genome analysis using next-generation sequence data

    PubMed Central

    2013-01-01

    High throughput sequencing is now fast and cheap enough to be considered part of the toolbox for investigating bacteria, and there are thousands of bacterial genome sequences available for comparison in the public domain. Bacterial genome analysis is increasingly being performed by diverse groups in research, clinical and public health labs alike, who are interested in a wide array of topics related to bacterial genetics and evolution. Examples include outbreak analysis and the study of pathogenicity and antimicrobial resistance. In this beginner’s guide, we aim to provide an entry point for individuals with a biology background who want to perform their own bioinformatics analysis of bacterial genome data, to enable them to answer their own research questions. We assume readers will be familiar with genetics and the basic nature of sequence data, but do not assume any computer programming skills. The main topics covered are assembly, ordering of contigs, annotation, genome comparison and extracting common typing information. Each section includes worked examples using publicly available E. coli data and free software tools, all which can be performed on a desktop computer. PMID:23575213

  19. Complete Chloroplast Genome Sequence of Omani Lime (Citrus aurantiifolia) and Comparative Analysis within the Rosids

    PubMed Central

    Su, Huei-Jiun; Hogenhout, Saskia A.; Al-Sadi, Abdullah M.; Kuo, Chih-Horng

    2014-01-01

    The genus Citrus contains many economically important fruits that are grown worldwide for their high nutritional and medicinal value. Due to frequent hybridizations among species and cultivars, the exact number of natural species and the taxonomic relationships within this genus are unclear. To compare the differences between the Citrus chloroplast genomes and to develop useful genetic markers, we used a reference-assisted approach to assemble the complete chloroplast genome of Omani lime (C. aurantiifolia). The complete C. aurantiifolia chloroplast genome is 159,893 bp in length; the organization and gene content are similar to most of the rosids lineages characterized to date. Through comparison with the sweet orange (C. sinensis) chloroplast genome, we identified three intergenic regions and 94 simple sequence repeats (SSRs) that are potentially informative markers with resolution for interspecific relationships. These markers can be utilized to better understand the origin of cultivated Citrus. A comparison among 72 species belonging to 10 families of representative rosids lineages also provides new insights into their chloroplast genome evolution. PMID:25398081

  20. Microarray-Based Comparative Genomic and Transcriptome Analysis of Borrelia burgdorferi.

    PubMed

    Iyer, Radha; Schwartz, Ira

    2016-01-01

    Borrelia burgdorferi, the spirochetal agent of Lyme disease, is maintained in nature in a cycle involving a tick vector and a mammalian host. Adaptation to the diverse conditions of temperature, pH, oxygen tension and nutrient availability in these two environments requires the precise orchestration of gene expression. Over 25 microarray analyses relating to B. burgdorferi genomics and transcriptomics have been published. The majority of these studies has explored the global transcriptome under a variety of conditions and has contributed substantially to the current understanding of B. burgdorferi transcriptional regulation. In this review, we present a summary of these studies with particular focus on those that helped define the roles of transcriptional regulators in modulating gene expression in the tick and mammalian milieus. By performing comparative analysis of results derived from the published microarray expression profiling studies, we identified composite gene lists comprising differentially expressed genes in these two environments. Further, we explored the overlap between the regulatory circuits that function during the tick and mammalian phases of the enzootic cycle. Taken together, the data indicate that there is interplay among the distinct signaling pathways that function in feeding ticks and during adaptation to growth in the mammal. PMID:27600075

  1. Chromosomal microarray analysis, or comparative genomic hybridization: A high throughput approach

    PubMed Central

    Haeri, Mohammad; Gelowani, Violet; Beaudet, Arthur L.

    2015-01-01

    Pathological copy number variants (CNVs) and point mutations are major genetic causes of hundreds of disorders. Comparative genomic hybridization (CGH), also known as chromosomal microarray analysis (CMA), is the best available tool to detect copy number variations in chromosomal makeup. We have optimized several different protocols and introduce a high-throughput approach to perform cost-effective, fast, and high-quality CMA. We obtained high-quality arrays with a Derivative Log Ratio (DLR) spread, a measure of array quality (<0.20 is considered excellent), of 0.17 ± 0.04 (mean ± SD, n = 90). High-throughput and high-quality arrays are gaining more attention, and the current manuscript is a step toward meeting this increasing demand.
    • This manuscript introduces a low-cost, fast, efficient, high-throughput, and high-quality aCGH protocol;
    • This protocol provides specific instructions and crucial detail for processing up to 24 slides, which is equal to 48, 96, or 192 arrays, by only one person in one day;
    • This manuscript is accompanied by a step-by-step video. PMID:26862485

  2. Comparative genomic analysis of upstream miRNA regulatory motifs in Caenorhabditis.

    PubMed

    Jovelin, Richard; Krizus, Aldis; Taghizada, Bakhtiyar; Gray, Jeremy C; Phillips, Patrick C; Claycomb, Julie M; Cutter, Asher D

    2016-07-01

    MicroRNAs (miRNAs) comprise a class of short noncoding RNA molecules that play diverse developmental and physiological roles by controlling mRNA abundance and protein output of the vast majority of transcripts. Despite the importance of miRNAs in regulating gene function, we still lack a complete understanding of how miRNAs themselves are transcriptionally regulated. To fill this gap, we predicted regulatory sequences by searching for abundant short motifs located upstream of miRNAs in eight species of Caenorhabditis nematodes. We identified three conserved motifs across the Caenorhabditis phylogeny that show clear signatures of purifying selection from comparative genomics, patterns of nucleotide changes in motifs of orthologous miRNAs, and correlation between motif incidence and miRNA expression. We then validated our predictions with transgenic green fluorescent protein reporters and site-directed mutagenesis for a subset of motifs located in an enhancer region upstream of let-7. We demonstrate that a CT-dinucleotide motif is sufficient for proper expression of GFP in the seam cells of adult C. elegans, and that two other motifs play incremental roles in combination with the CT-rich motif. Thus, functional tests of sequence motifs identified through analysis of molecular evolutionary signatures provide a powerful path for efficiently characterizing the transcriptional regulation of miRNA genes. PMID:27140965

  3. Microarray-Based Comparative Genomic and Transcriptome Analysis of Borrelia burgdorferi

    PubMed Central

    Iyer, Radha; Schwartz, Ira

    2016-01-01

    Borrelia burgdorferi, the spirochetal agent of Lyme disease, is maintained in nature in a cycle involving a tick vector and a mammalian host. Adaptation to the diverse conditions of temperature, pH, oxygen tension and nutrient availability in these two environments requires the precise orchestration of gene expression. Over 25 microarray analyses relating to B. burgdorferi genomics and transcriptomics have been published. The majority of these studies has explored the global transcriptome under a variety of conditions and has contributed substantially to the current understanding of B. burgdorferi transcriptional regulation. In this review, we present a summary of these studies with particular focus on those that helped define the roles of transcriptional regulators in modulating gene expression in the tick and mammalian milieus. By performing comparative analysis of results derived from the published microarray expression profiling studies, we identified composite gene lists comprising differentially expressed genes in these two environments. Further, we explored the overlap between the regulatory circuits that function during the tick and mammalian phases of the enzootic cycle. Taken together, the data indicate that there is interplay among the distinct signaling pathways that function in feeding ticks and during adaptation to growth in the mammal. PMID:27600075

  4. Comparative genome analysis and identification of competitive and cooperative interactions in a polymicrobial disease

    PubMed Central

    Endo, Akiko; Watanabe, Takayasu; Ogata, Nachiko; Nozawa, Takashi; Aikawa, Chihiro; Arakawa, Shinichi; Maruyama, Fumito; Izumi, Yuichi; Nakagawa, Ichiro

    2015-01-01

    Polymicrobial diseases are caused by combinations of multiple bacteria, which can lead to not only mild but also life-threatening illnesses. Periodontitis represents a polymicrobial disease; Porphyromonas gingivalis, Treponema denticola and Tannerella forsythia, called ‘the red complex', have been recognized as the causative agents of periodontitis. Although molecular interactions among the three species could be responsible for progression of periodontitis, the relevant genetic mechanisms are unknown. In this study, we uncovered novel interactions in comparative genome analysis among the red complex species. Clustered regularly interspaced short palindromic repeats (CRISPRs) of T. forsythia might attack the restriction modification system of P. gingivalis, and possibly work as a defense system against DNA invasion from P. gingivalis. On the other hand, gene deficiencies were mutually compensated in metabolic pathways when the genes of all the three species were taken into account, suggesting that there are cooperative relationships among the three species. This notion was supported by the observation that each of the three species had its own virulence factors, which might facilitate persistence and manifestations of virulence of the three species. Here, we propose new mechanisms of bacterial symbiosis in periodontitis; these mechanisms consist of competitive and cooperative interactions. Our results might shed light on the pathogenesis of periodontitis and of other polymicrobial diseases. PMID:25171331

  5. Genome-Wide Comparative Analysis of Flowering-Related Genes in Arabidopsis, Wheat, and Barley

    PubMed Central

    Peng, Fred Y.; Hu, Zhiqiu; Yang, Rong-Cai

    2015-01-01

    Early flowering is an important trait influencing grain yield and quality in wheat (Triticum aestivum L.) and barley (Hordeum vulgare L.) in short-season cropping regions. However, due to large and complex genomes of these species, direct identification of flowering genes and their molecular characterization remain challenging. Here, we used a bioinformatic approach to predict flowering-related genes in wheat and barley from 190 known Arabidopsis (Arabidopsis thaliana (L.) Heynh.) flowering genes. We identified 900 and 275 putative orthologs in wheat and barley, respectively. The annotated flowering-related genes were clustered into 144 orthologous groups with one-to-one, one-to-many, many-to-one, and many-to-many orthology relationships. Our approach was further validated by domain and phylogenetic analyses of flowering-related proteins and comparative analysis of publicly available microarray data sets for in silico expression profiling of flowering-related genes in 13 different developmental stages of wheat and barley. These further analyses showed that orthologous gene pairs in three critical flowering gene families (PEBP, MADS, and BBX) exhibited similar expression patterns among 13 developmental stages in wheat and barley, suggesting similar functions among the orthologous genes with sequence and expression similarities. The predicted candidate flowering genes can be confirmed and incorporated into molecular breeding for early flowering wheat and barley in short-season cropping regions. PMID:26435710
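
    The grouping of orthologs into one-to-one, one-to-many, many-to-one, and many-to-many relationships can be illustrated with a small sketch. The abstract does not say how the authors built their 144 groups, so the approach below (connected components over a list of reference-gene/target-gene pairs) is only one plausible reading, and the function name and gene identifiers are hypothetical.

```python
from collections import defaultdict


def orthologous_groups(pairs):
    """Group (reference_gene, target_gene) pairs into connected components
    and label each component by its orthology relationship type."""
    neighbours = defaultdict(set)
    for ref, target in pairs:
        # Tag each node with its side so identical names cannot collide.
        neighbours[("ref", ref)].add(("target", target))
        neighbours[("target", target)].add(("ref", ref))

    seen = set()
    groups = []
    for start in sorted(neighbours):
        if start in seen:
            continue
        # Depth-first search collects one connected component.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        refs = sorted(name for side, name in component if side == "ref")
        targets = sorted(name for side, name in component if side == "target")
        label = ("one" if len(refs) == 1 else "many") + "-to-" + (
            "one" if len(targets) == 1 else "many"
        )
        groups.append((refs, targets, label))
    return groups


# Hypothetical identifiers: "A*" for the reference species, "W*" for the target.
example_pairs = [
    ("A1", "W1"),
    ("A2", "W2a"), ("A2", "W2b"),
    ("A3", "W3"), ("A4", "W3"),
    ("A5", "W5"), ("A5", "W6"), ("A6", "W5"),
]
for refs, targets, label in orthologous_groups(example_pairs):
    print(label, refs, targets)
```

    Labelling whole components rather than individual pairs keeps every pair in a group under the same relationship type, which matches the idea of an orthologous group.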

  6. Comparative genomic hybridization analysis of abnormalities in chromosome 21 in childhood osteosarcoma.

    PubMed

    dos Santos Aguiar, Simone; de Jesus Girotto Zambaldi, Lilian; dos Santos, Adilson Manoel; Pinto, Walter; Brandalise, Silvia Regina

    2007-05-01

    Osteosarcomas (OS) are aggressive tumors of the bone and often have a poor prognosis. The tumors exhibit karyotypes with a high degree of complexity, which has made it difficult to determine whether any recurrent chromosomal aberrations characterize OS. To address inherent difficulties associated with classical cytogenetic analysis, comparative genomic hybridization (CGH) was applied to OS tissue. Forty-one pediatric OS specimens were analyzed by a CGH technique: 24 female and 17 male patients, with a median age of 12 years and 4 months. Chromosomal abnormalities were highly diverse and variable, including gains of chromosome 1p, 2p, 3q, 5q, 5p, and 6p and losses of 14q (50% in 14q11.2), 15q, and 16p. A high level of losses of chromosome 21 was present (26/41 cases; P = 0.008), most often loss of the 21q11.2~21 region. These novel findings in chromosome 21 of pediatric OS tumors suggest that specific sequences mapping to these chromosomal regions are likely to play a role in the development of OS. PMID:17498555

  7. Genome-Wide Comparative Analysis of Flowering-Related Genes in Arabidopsis, Wheat, and Barley.

    PubMed

    Peng, Fred Y; Hu, Zhiqiu; Yang, Rong-Cai

    2015-01-01

    Early flowering is an important trait influencing grain yield and quality in wheat (Triticum aestivum L.) and barley (Hordeum vulgare L.) in short-season cropping regions. However, due to large and complex genomes of these species, direct identification of flowering genes and their molecular characterization remain challenging. Here, we used a bioinformatic approach to predict flowering-related genes in wheat and barley from 190 known Arabidopsis (Arabidopsis thaliana (L.) Heynh.) flowering genes. We identified 900 and 275 putative orthologs in wheat and barley, respectively. The annotated flowering-related genes were clustered into 144 orthologous groups with one-to-one, one-to-many, many-to-one, and many-to-many orthology relationships. Our approach was further validated by domain and phylogenetic analyses of flowering-related proteins and comparative analysis of publicly available microarray data sets for in silico expression profiling of flowering-related genes in 13 different developmental stages of wheat and barley. These further analyses showed that orthologous gene pairs in three critical flowering gene families (PEBP, MADS, and BBX) exhibited similar expression patterns among 13 developmental stages in wheat and barley, suggesting similar functions among the orthologous genes with sequence and expression similarities. The predicted candidate flowering genes can be confirmed and incorporated into molecular breeding for early flowering wheat and barley in short-season cropping regions. PMID:26435710

  8. Genome-wide characterization and comparative analysis of the MLO gene family in cotton.

    PubMed

    Wang, Xiaoyan; Ma, Qifeng; Dou, Lingling; Liu, Zhen; Peng, Renhai; Yu, Shuxun

    2016-06-01

    In plants, the MLO (Mildew Locus O) gene encodes a plant-specific seven-transmembrane (TM) domain protein involved in several cellular processes, including susceptibility to powdery mildew (PM). In this study, a genome-wide characterization of the MLO gene family in G. raimondii L., G. arboreum L. and G. hirsutum L. was performed. In total, 22, 17 and 38 homologous sequences were identified for each species, respectively. Gene organization, including chromosomal location, gene clustering and gene duplication, was investigated. Homologues related to PM susceptibility in upland cotton were inferred by phylogenetic relationships with functionally characterized MLO proteins. To conduct a comparative analysis between MLO candidate genes from G. raimondii L., G. arboreum L. and G. hirsutum L., orthologous relationships and conserved synteny blocks were constructed. The transcriptional variation of 38 GhMLO genes in response to exogenous application of salt, mannitol (Man), abscisic acid (ABA), ethylene (ETH), jasmonic acid (JA) and salicylic acid (SA) was monitored. Further studies should be conducted to elucidate the functions of MLO genes in PM susceptibility and phytohormone signalling pathways. PMID:26986931

  9. Comparative genomics and functional analysis of niche-specific adaptation in Pseudomonas putida

    PubMed Central

    Wu, Xiao; Monchy, Sébastien; Taghavi, Safiyh; Zhu, Wei; Ramos, Juan; van der Lelie, Daniel

    2011-01-01

    Pseudomonas putida is a gram-negative rod-shaped gammaproteobacterium that is found throughout various environments. Members of the species P. putida show a diverse spectrum of metabolic activities, which is indicative of their adaptation to various niches, which includes the ability to live in soils and sediments contaminated with high concentrations of heavy metals and organic contaminants. Pseudomonas putida strains are also found as plant growth-promoting rhizospheric and endophytic bacteria. The genome sequences of several P. putida species have become available and provide a unique tool to study the specific niche adaptation of the various P. putida strains. In this review, we compare the genomes of four P. putida strains: the rhizospheric strain KT2440, the endophytic strain W619, the aromatic hydrocarbon-degrading strain F1 and the manganese-oxidizing strain GB-1. Comparative genomics provided a powerful tool to gain new insights into the adaptation of P. putida to specific lifestyles and environmental niches, and clearly demonstrated that horizontal gene transfer played a key role in this adaptation process, as many of the niche-specific functions were found to be encoded on clearly defined genomic islands. PMID:20796030

  10. Comparative genomics and functional analysis of niche-specific adaptation in Pseudomonas putida

    SciTech Connect

    Wu X.; van der Lelie D.; Monchy, S.; Taghavi, S.; Zhu, W.; Ramos, J.

    2011-03-01

    Pseudomonas putida is a gram-negative rod-shaped gammaproteobacterium that is found throughout various environments. Members of the species P. putida show a diverse spectrum of metabolic activities, which is indicative of their adaptation to various niches, which includes the ability to live in soils and sediments contaminated with high concentrations of heavy metals and organic contaminants. Pseudomonas putida strains are also found as plant growth-promoting rhizospheric and endophytic bacteria. The genome sequences of several P. putida species have become available and provide a unique tool to study the specific niche adaptation of the various P. putida strains. In this review, we compare the genomes of four P. putida strains: the rhizospheric strain KT2440, the endophytic strain W619, the aromatic hydrocarbon-degrading strain F1 and the manganese-oxidizing strain GB-1. Comparative genomics provided a powerful tool to gain new insights into the adaptation of P. putida to specific lifestyles and environmental niches, and clearly demonstrated that horizontal gene transfer played a key role in this adaptation process, as many of the niche-specific functions were found to be encoded on clearly defined genomic islands.

  11. Comparative analysis of the Oenococcus oeni pan genome reveals genetic diversity in industrially-relevant pathways

    PubMed Central

    2012-01-01

    Background: Oenococcus oeni, a member of the lactic acid bacteria, is one of a limited number of microorganisms that not only survive, but actively proliferate in wine. It is also unusual as, unlike the majority of bacteria present in wine, it is beneficial to wine quality rather than causing spoilage. These benefits are realised primarily through catalysing malolactic fermentation, but also through imparting other positive sensory properties. However, many of these industrially-important secondary attributes have been shown to be strain-dependent and their genetic basis is yet to be determined. Results: In order to investigate the scale and scope of genetic variation in O. oeni, we have performed whole-genome sequencing on eleven strains of this bacterium, bringing the total number of strains for which genome sequences are available to fourteen. While any single strain of O. oeni was shown to contain around 1800 protein-coding genes, in-depth comparative annotation based on genomic synteny and protein orthology identified over 2800 orthologous open reading frames that comprise the pan genome of this species, and fewer than 1200 genes that make up the conserved genomic core present in all of the strains. The expansion of the pan genome relative to the coding potential of individual strains was shown to be due to the varied presence and location of multiple distinct bacteriophage sequences and also in various metabolic functions with potential impacts on the industrial performance of this species, including cell wall exopolysaccharide biosynthesis, sugar transport and utilisation and amino acid biosynthesis. Conclusions: By providing a large cohort of sequenced strains, this study provides a broad insight into the genetic variation present within O. oeni. This data is vital to understanding and harnessing the phenotypic variation present in this economically-important species. PMID:22863143
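
    The relationship between the pan genome and the conserved core can be sketched with plain set operations: the pan genome is the union of the ortholog groups found in any strain, and the core is their intersection. This is a minimal illustration that assumes ortholog groups have already been assigned; the strain names and group identifiers are hypothetical and do not come from the study.

```python
def pan_and_core(strain_genes):
    """Return (pan_genome, core_genome) for a mapping of
    strain name -> iterable of ortholog-group identifiers."""
    gene_sets = [set(genes) for genes in strain_genes.values()]
    if not gene_sets:
        return set(), set()
    pan = set().union(*gene_sets)          # present in at least one strain
    core = set.intersection(*gene_sets)    # present in every strain
    return pan, core


# Hypothetical toy input: three strains, five ortholog groups.
strains = {
    "strain_A": {"og1", "og2", "og3"},
    "strain_B": {"og1", "og2", "og4"},
    "strain_C": {"og1", "og3", "og5"},
}
pan, core = pan_and_core(strains)
accessory = pan - core  # groups missing from at least one strain
print(len(pan), len(core), len(accessory))  # prints: 5 1 4
```

    In this toy case each strain carries three groups while the pan genome holds five, the same pattern of a pan genome larger than any single strain that the abstract reports at a much larger scale.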

  12. Genome-Wide Comparative Analysis Reveals Similar Types of NBS Genes in Hybrid Citrus sinensis Genome and Original Citrus clementine Genome and Provides New Insights into Non-TIR NBS Genes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    In this study, we identified and compared nucleotide-binding site (NBS) domain-containing genes from three Citrus genomes (C. clementina, C. sinensis from USA and C. sinensis from China). Phylogenetic analysis of all Citrus NBS genes across these three genomes revealed that there are three approxima...

  13. Comparative analysis of two phenotypically-similar but genomically-distinct Burkholderia cenocepacia-specific bacteriophages

    PubMed Central

    2012-01-01

    Background: Genomic analysis of bacteriophages infecting the Burkholderia cepacia complex (BCC) is an important preliminary step in the development of a phage therapy protocol for these opportunistic pathogens. The objective of this study was to characterize KL1 (vB_BceS_KL1) and AH2 (vB_BceS_AH2), two novel Burkholderia cenocepacia-specific siphoviruses isolated from environmental samples. Results: KL1 and AH2 exhibit several unique phenotypic similarities: they infect the same B. cenocepacia strains, they require prolonged incubation at 30°C for the formation of plaques at low titres, and they do not form plaques at similar titres following incubation at 37°C. However, despite these similarities, we have determined using whole-genome pyrosequencing that these phages show minimal relatedness to one another. The KL1 genome is 42,832 base pairs (bp) in length and is most closely related to Pseudomonas phage 73 (PA73). In contrast, the AH2 genome is 58,065 bp in length and is most closely related to Burkholderia phage BcepNazgul. Using both BLASTP and HHpred analysis, we have identified and analyzed the putative virion morphogenesis, lysis, DNA binding, and MazG proteins of these two phages. Notably, MazG homologs identified in cyanophages have been predicted to facilitate infection of stationary phase cells and may contribute to the unique plaque phenotype of KL1 and AH2. Conclusions: The nearly indistinguishable phenotypes but distinct genomes of KL1 and AH2 provide further evidence of both vast diversity and convergent evolution in the BCC-specific phage population. PMID:22676492

  14. Study of Modern Human Evolution via Comparative Analysis with the Neanderthal Genome

    PubMed Central

    Ahmed, Musaddeque

    2013-01-01

    Many other human species appeared in evolution in the last 6 million years that have not been able to survive to modern times and are broadly known as archaic humans, as opposed to the extant modern humans. It has always been considered fascinating to compare the modern human genome with that of archaic humans to identify modern human-specific sequence variants and figure out those that made modern humans different from their predecessors or cousin species. Neanderthals are the latest humans to become extinct, and many factors made them the best representatives of archaic humans. Even though a number of comparisons have been made sporadically between Neanderthals and modern humans, mostly following a candidate gene approach, the major breakthrough took place with the sequencing of the Neanderthal genome. The initial genome-wide comparison, based on the first draft of the Neanderthal genome, has generated some interesting inferences regarding variations in functional elements that are not shared by the two species and the debated admixture question. However, there are certain other genetic elements that were not included or included at a smaller scale in those studies, and they should be compared comprehensively to better understand the molecular make-up of modern humans and their phenotypic characteristics. Besides briefly discussing the important outcomes of the comparative analyses made so far between modern humans and Neanderthals, we propose that future comparative studies may include retrotransposons, pseudogenes, and conserved non-coding regions, all of which might have played significant roles during the evolution of modern humans. PMID:24465235

  15. Comparative analysis of the domestic cat genome reveals genetic signatures underlying feline biology and domestication.

    PubMed

    Montague, Michael J; Li, Gang; Gandolfi, Barbara; Khan, Razib; Aken, Bronwen L; Searle, Steven M J; Minx, Patrick; Hillier, LaDeana W; Koboldt, Daniel C; Davis, Brian W; Driscoll, Carlos A; Barr, Christina S; Blackistone, Kevin; Quilez, Javier; Lorente-Galdos, Belen; Marques-Bonet, Tomas; Alkan, Can; Thomas, Gregg W C; Hahn, Matthew W; Menotti-Raymond, Marilyn; O'Brien, Stephen J; Wilson, Richard K; Lyons, Leslie A; Murphy, William J; Warren, Wesley C

    2014-12-01

    Little is known about the genetic changes that distinguish domestic cat populations from their wild progenitors. Here we describe a high-quality domestic cat reference genome assembly and comparative inferences made with other cat breeds, wildcats, and other mammals. Based upon these comparisons, we identified positively selected genes enriched for genes involved in lipid metabolism that underpin adaptations to a hypercarnivorous diet. We also found positive selection signals within genes underlying sensory processes, especially those affecting vision and hearing in the carnivore lineage. We observed an evolutionary tradeoff between functional olfactory and vomeronasal receptor gene repertoires in the cat and dog genomes, with an expansion of the feline chemosensory system for detecting pheromones at the expense of odorant detection. Genomic regions harboring signatures of natural selection that distinguish domestic cats from their wild congeners are enriched in neural crest-related genes associated with behavior and reward in mouse models, as predicted by the domestication syndrome hypothesis. Our description of a previously unidentified allele for the gloving pigmentation pattern found in the Birman breed supports the hypothesis that cat breeds experienced strong selection on specific mutations drawn from random bred populations. Collectively, these findings provide insight into how the process of domestication altered the ancestral wildcat genome and build a resource for future disease mapping and phylogenomic studies across all members of the Felidae. PMID:25385592

  16. Gramene: A Resource for Comparative Analysis of Plants Genomes and Pathways.

    PubMed

    Tello-Ruiz, Marcela Karey; Stein, Joshua; Wei, Sharon; Youens-Clark, Ken; Jaiswal, Pankaj; Ware, Doreen

    2016-01-01

    Gramene is an integrated informatics resource for accessing, visualizing, and comparing plant genomes and biological pathways. Originally targeting grasses, Gramene has grown to host annotations for economically important and research model crops, including wheat, potato, tomato, banana, grape, poplar, and Chlamydomonas. Its strength derives from the application of a phylogenetic framework for genome comparison and the use of ontologies to integrate structural and functional annotation data. This chapter outlines system requirements for end users and database hosting, data types and basic navigation within Gramene, and provides examples of how to (1) view a phylogenetic tree for a family of transcription factors, (2) explore genetic variation in the orthologues of a gene with a known trait association, and (3) upload, visualize, and privately share end user data into a new genome browser track. Moreover, this is the first publication describing Gramene's new web interface, intended to provide a simplified portal to the most complete and up-to-date set of plant genome and pathway annotations. PMID:26519404

  17. Genome-scale comparative analysis of gene fusions, gene fissions, and the fungal tree of life

    PubMed Central

    Leonard, Guy; Richards, Thomas A.

    2012-01-01

    During the course of evolution genes undergo both fusion and fission by which ORFs are joined or separated. These processes can amend gene function and represent an important factor in the evolution of protein interaction networks. Gene fusions have been suggested to be useful characters for identifying evolutionary relationships because they constitute synapomorphies or cladistic characters. To investigate the fidelity of gene-fusion characters, we developed an approach for identifying differentially distributed gene fusions among whole-genome datasets: fdfBLAST. Applying this tool to the Fungi, we identified 63 gene fusions present in two or more genomes. Using a combination of phylogenetic and comparative genomic analyses, we then investigated the evolution of these genes across 115 fungal genomes, testing each gene fusion for evidence of homoplasy, including gene fission, convergence, and horizontal gene transfer. These analyses demonstrated 110 gene-fission events. We then identified a minimum of three mechanisms that drive gene fission: separation, degeneration, and duplication. These data suggest that gene fission plays an important and hitherto underestimated role in gene evolution. Gene fusions therefore are highly labile characters, and their use for polarizing evolutionary relationships, without reference to gene and species phylogenies, is limited. Accounting for these considerable sources of homoplasy, we identified fusion characters that provide support for multiple nodes in the phylogeny of the Fungi, including relationships within the deeply derived flagellum-forming fungi (i.e., the chytrids). PMID:23236161

  18. Comparative analysis of the domestic cat genome reveals genetic signatures underlying feline biology and domestication

    PubMed Central

    Li, Gang; Gandolfi, Barbara; Khan, Razib; Aken, Bronwen L.; Searle, Steven M. J.; Minx, Patrick; Hillier, LaDeana W.; Koboldt, Daniel C.; Davis, Brian W.; Driscoll, Carlos A.; Barr, Christina S.; Blackistone, Kevin; Quilez, Javier; Lorente-Galdos, Belen; Marques-Bonet, Tomas; Alkan, Can; Thomas, Gregg W. C.; Hahn, Matthew W.; Menotti-Raymond, Marilyn; O’Brien, Stephen J.; Wilson, Richard K.; Lyons, Leslie A.; Murphy, William J.; Warren, Wesley C.

    2014-01-01

    Little is known about the genetic changes that distinguish domestic cat populations from their wild progenitors. Here we describe a high-quality domestic cat reference genome assembly and comparative inferences made with other cat breeds, wildcats, and other mammals. Based upon these comparisons, we identified positively selected genes enriched for genes involved in lipid metabolism that underpin adaptations to a hypercarnivorous diet. We also found positive selection signals within genes underlying sensory processes, especially those affecting vision and hearing in the carnivore lineage. We observed an evolutionary tradeoff between functional olfactory and vomeronasal receptor gene repertoires in the cat and dog genomes, with an expansion of the feline chemosensory system for detecting pheromones at the expense of odorant detection. Genomic regions harboring signatures of natural selection that distinguish domestic cats from their wild congeners are enriched in neural crest-related genes associated with behavior and reward in mouse models, as predicted by the domestication syndrome hypothesis. Our description of a previously unidentified allele for the gloving pigmentation pattern found in the Birman breed supports the hypothesis that cat breeds experienced strong selection on specific mutations drawn from random bred populations. Collectively, these findings provide insight into how the process of domestication altered the ancestral wildcat genome and build a resource for future disease mapping and phylogenomic studies across all members of the Felidae. PMID:25385592

  19. Comparative genomic hybridization: an overview.

    PubMed Central

    Houldsworth, J.; Chaganti, R. S.

    1994-01-01

    Comparative genomic hybridization (CGH) is a newly described molecular-cytogenetic assay that globally assays for chromosomal gains and losses in a genomic complement. In this assay, normal human metaphase chromosomes are competitively hybridized with two differentially labeled genomic DNAs (test and reference), which upon fluorescence microscopy, reveal the chromosomal locations of copy number changes in DNA sequences between the two complements. Application of CGH to DNAs extracted from fresh frozen specimens and cell lines of various tumor types has revealed a number of recurring chromosomal gains and losses that were undetected by traditional cytogenetic analysis. Few previously known sites were found to be in higher copy number, or lost by CGH, while many novel amplified regions were identified. These regions warrant further molecular genetic studies aimed at isolating the perturbed genes. Since CGH can also be performed on DNA extracted from formalin-fixed paraffin-embedded archived tumor specimens with few modifications, gains and losses of genetic material can be determined for specimens that would otherwise be unanalyzable. Prospective and retrospective application of CGH to tumor specimens would permit correlative studies to be performed, possibly identifying diagnostic and prognostic indicators of disease. CGH may also have a future role in detection and identification of chromosomal abnormalities in prenatal diagnosis and in dysmorphic anomalies. PMID:7992829

  20. Comparative genomic analysis of Acidithiobacillus ferrooxidans strains using the A. ferrooxidans ATCC 23270 whole-genome oligonucleotide microarray.

    PubMed

    Luo, Hailang; Shen, Li; Yin, Huaqun; Li, Qian; Chen, Qijiong; Luo, Yanjie; Liao, Liqin; Qiu, Guanzhou; Liu, Xueduan

    2009-05-01

    Acidithiobacillus ferrooxidans is an important microorganism used in biomining operations for metal recovery. Whole-genomic diversity analysis based on the oligonucleotide microarray was used to analyze the gene content of 12 strains of A. ferrooxidans purified from various mining areas in China. Among the 3100 open reading frames (ORFs) on the slides, 1235 ORFs were absent in at least 1 strain of bacteria and 1385 ORFs were conserved in all strains. The hybridization results showed that these strains were highly diverse from a genomic perspective. The hybridization results of 4 major functional gene categories, namely electron transport, carbon metabolism, extracellular polysaccharides, and detoxification, were analyzed. Based on the hybridization signals obtained, a phylogenetic tree was built to analyze the evolution of the 12 tested strains, which indicated that the geographic distribution was the main factor influencing the strain diversity of these strains. Based on the hybridization signals of genes associated with bioleaching, another phylogenetic tree showed an evolutionary relationship from which the co-relation between the clustering of specific genes and geochemistry could be observed. The results revealed that the main factor was geochemistry, among which the following 6 factors were the most important: pH, Mg, Cu, S, Fe, and Al. PMID:19483787

  1. Comparative Genome Analysis Reveals Metabolic Versatility and Environmental Adaptations of Sulfobacillus thermosulfidooxidans Strain ST

    PubMed Central

    Guo, Xue; Yin, Huaqun; Liang, Yili; Hu, Qi; Zhou, Xishu; Xiao, Yunhua; Ma, Liyuan; Zhang, Xian; Qiu, Guanzhou; Liu, Xueduan

    2014-01-01

    The genus Sulfobacillus is a cohort of mildly thermophilic or thermotolerant acidophiles within the phylum Firmicutes and requires extremely acidic environments and hypersalinity for optimal growth. However, our understanding of them is still preliminary partly because few genome sequences are available. Here, the draft genome of Sulfobacillus thermosulfidooxidans strain ST was deciphered to obtain a comprehensive insight into the genetic content and to understand the cellular mechanisms necessary for its survival. Furthermore, the expression of key genes related to iron and sulfur oxidation was verified by semi-quantitative RT-PCR analysis. The draft genome sequence of Sulfobacillus thermosulfidooxidans strain ST, which encodes 3225 predicted coding genes on a total length of 3,333,554 bp with a G+C content of 48.35%, revealed a high degree of heterogeneity with other Sulfobacillus species. The presence of numerous transposases, genomic islands and complete CRISPR/Cas defence systems testifies to its dynamic evolution consistent with the genome heterogeneity. As expected, S. thermosulfidooxidans encodes a suite of conserved enzymes required for the oxidation of inorganic sulfur compounds (ISCs). A model of sulfur oxidation in S. thermosulfidooxidans was proposed, which showed some different characteristics from the sulfur oxidation of Gram-negative A. ferrooxidans. Sulfur oxygenase reductase and heterodisulfide reductase were suggested to play important roles in the sulfur oxidation. Although the iron oxidation ability was observed, some key proteins cannot be identified in S. thermosulfidooxidans. Unexpectedly, a predicted sulfocyanin is proposed to transfer electrons in the iron oxidation. Furthermore, its carbon metabolism is rather flexible: it can perform the transformation of pentose through the oxidative and non-oxidative pentose phosphate pathways and has the ability to take up small organic compounds. It encodes a multitude of heavy metal resistance systems to

  2. Comparative Reannotation of 21 Aspergillus Genomes

    SciTech Connect

    Salamov, Asaf; Riley, Robert; Kuo, Alan; Grigoriev, Igor

    2013-03-08

    We used comparative gene modeling to reannotate 21 Aspergillus genomes. Initial automatic annotation of individual genomes may contain errors of different kinds, e.g. missing genes, incorrect exon-intron structures, 'chimeras' that fuse two or more real genes, or, alternatively, models that split a real gene into two or more. The main premise behind the comparative modeling approach is that for closely related genomes most orthologous families have the same conserved gene structure. The algorithm maps all gene models predicted in each individual Aspergillus genome to the other genomes and, for each locus, selects from potentially many competing models the one that most closely resembles the orthologous genes from other genomes. This procedure is iterated until no further change in gene models is observed. For Aspergillus genomes we predicted in total 4503 new gene models (~2% per genome), supported by comparative analysis, additionally correcting ~18% of old gene models. This resulted in a total of 4065 more genes with annotated PFAM domains (~3% increase per genome). Analysis of a few genomes with EST/transcriptomics data shows that the new annotation sets also have a higher number of EST-supported splice sites at exon-intron boundaries.
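
    The iterate-until-stable selection described above can be sketched for a single locus as follows. This is a toy illustration, not the authors' algorithm: each candidate gene model is reduced to a tuple of exon lengths, and the resemblance score (difference in exon count plus difference in total coding length) is an assumption made purely for the example. All names and numbers are hypothetical.

```python
def model_distance(a, b):
    """Toy dissimilarity between two gene models given as tuples of exon
    lengths: penalise differing exon counts and differing coding lengths."""
    return 100 * abs(len(a) - len(b)) + abs(sum(a) - sum(b))


def refine_models(candidates, initial, max_rounds=100):
    """For one locus, repeatedly pick in each genome the candidate model
    closest to the models currently chosen in the other genomes, and stop
    when a full pass changes nothing (or after max_rounds passes)."""
    chosen = dict(initial)
    for _ in range(max_rounds):
        changed = False
        for genome, options in candidates.items():
            others = [m for g, m in chosen.items() if g != genome]
            best = min(
                options,
                key=lambda m: sum(model_distance(m, o) for o in others),
            )
            if best != chosen[genome]:
                chosen[genome] = best
                changed = True
        if not changed:
            break
    return chosen


# Hypothetical locus: genome_B starts with a model missing an exon and
# genome_C starts with a chimera that fuses two neighbouring genes.
candidates = {
    "genome_A": [(120, 300, 180)],
    "genome_B": [(118, 305, 180), (118, 305)],
    "genome_C": [(120, 300, 175, 450, 600), (120, 300, 175)],
}
initial = {
    "genome_A": (120, 300, 180),
    "genome_B": (118, 305),
    "genome_C": (120, 300, 175, 450, 600),
}
print(refine_models(candidates, initial))
```

    On this input the pass over genome_B restores the three-exon model and the pass over genome_C then replaces the chimera, after which a second pass makes no changes and the loop stops.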

  3. Comparative genomic analysis reveals a distant liver enhancer upstream of the COUP-TFII gene

    SciTech Connect

    Baroukh, Nadine; Ahituv, Nadav; Chang, Jessie; Shoukry, Malak; Afzal, Veena; Rubin, Edward M.; Pennacchio, Len A.

    2004-08-20

    COUP-TFII is a central nuclear hormone receptor that tightly regulates the expression of numerous target lipid metabolism genes in vertebrates. However, it remains unclear how COUP-TFII itself is transcriptionally controlled since studies with its promoter and upstream region fail to recapitulate the gene's liver expression. In an attempt to identify liver enhancers in the vicinity of COUP-TFII, we employed a comparative genomic approach. Initial comparisons between humans and mice of the 3,470-kb gene-poor region surrounding COUP-TFII revealed 2,023 conserved non-coding elements. To prioritize a subset of these elements for functional studies, we performed further genomic comparisons with the orthologous pufferfish (Fugu rubripes) locus and uncovered two anciently conserved non-coding sequences (CNS) upstream of COUP-TFII (CNS-62kb and CNS-66kb). Testing these two elements using reporter constructs in liver (HepG2) cells revealed that CNS-66kb, but not CNS-62kb, yielded robust in vitro enhancer activity. In addition, an in vivo reporter assay using naked DNA transfer with CNS-66kb linked to luciferase displayed strong reproducible liver expression in adult mice, further supporting its role as a liver enhancer. Together, these studies further support the utility of comparative genomics to uncover gene regulatory sequences based on evolutionary conservation and provide the substrates to better understand the regulation and expression of COUP-TFII.

  4. Comparative Genomic Sequence Analysis of the Human Chromosome 21 Down Syndrome Critical Region

    PubMed Central

    Toyoda, Atsushi; Noguchi, Hideki; Taylor, Todd D.; Ito, Takehiko; Pletcher, Mathew T.; Sakaki, Yoshiyuki; Reeves, Roger H.; Hattori, Masahira

    2002-01-01

    Comprehensive knowledge of the gene content of human chromosome 21 (HSA21) is essential for understanding the etiology of Down syndrome (DS). Here we report the largest comparison of finished mouse and human sequence to date for a 1.35-Mb region of mouse chromosome 16 (MMU16) that corresponds to human chromosome 21q22.2. This includes a portion of the commonly described “DS critical region,” thought to contain a gene or genes whose dosage imbalance contributes to a number of phenotypes associated with DS. We used comparative sequence analysis to construct a DNA feature map of this region that includes all known genes, plus 144 conserved sequences ≥100 bp long that show ≥80% identity between mouse and human but do not match known exons. Twenty of these have matches to expressed sequence tag and cDNA databases, indicating that they may be transcribed sequences from chromosome 21. Eight putative CpG islands are found at conserved positions. Models for two human genes, DSCR4 and DSCR8, are not supported by conserved sequence, and close examination indicates that low-level transcripts from these loci are unlikely to encode proteins. Gene prediction programs give different results when used to analyze the well-conserved regions between mouse and human sequences. Our findings have implications for evolution and for modeling the genetic basis of DS in mice. [Sequence data described in this paper have been submitted to the DDBJ/GenBank under accession nos. AP003148 through AP003158, and AB066227. Supplemental material is available at http://www.genome.org.] PMID:12213769
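    The conservation criterion quoted in this abstract (sequences at least 100 bp long with at least 80% mouse-human identity) can be sketched as a sliding-window scan over a pairwise alignment. This is only an illustration under stated assumptions: the function name is hypothetical, the input is assumed to be two already aligned, equal-length strings with "-" for gaps, and the paper's actual pipeline is not described here and may differ.

```python
from typing import List, Tuple


def conserved_windows(
    aln_a: str, aln_b: str, window: int = 100, min_identity: float = 0.80
) -> List[Tuple[int, int]]:
    """Return merged [start, end) alignment intervals in which at least one
    `window`-column window reaches `min_identity` (gap columns count as
    mismatches). Defaults mirror the >=100 bp / >=80% thresholds."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    # 1 where both sequences carry the same non-gap character, else 0.
    matches = [int(a == b and a != "-") for a, b in zip(aln_a, aln_b)]
    regions: List[Tuple[int, int]] = []
    running = sum(matches[:window])  # matches in the first window
    for start in range(0, len(matches) - window + 1):
        if start > 0:
            # Slide by one column: add the new column, drop the old one.
            running += matches[start + window - 1] - matches[start - 1]
        if running / window >= min_identity:
            end = start + window
            if regions and start <= regions[-1][1]:
                regions[-1] = (regions[-1][0], end)  # extend previous hit
            else:
                regions.append((start, end))
    return regions
```

    Because overlapping windows are merged, a reported interval can include a few mismatched flanking columns; a stricter variant would trim each interval back to its outermost matching positions.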

  5. Comparative Genomic Analysis Reveals 2-Oxoacid Dehydrogenase Complex Lipoylation Correlation with Aerobiosis in Archaea

    PubMed Central

    Borziak, Kirill; Posner, Mareike G.; Upadhyay, Abhishek; Danson, Michael J.; Bagby, Stefan; Dorus, Steve

    2014-01-01

    Metagenomic analyses have advanced our understanding of ecological microbial diversity, but to what extent can metagenomic data be used to predict the metabolic capacity of difficult-to-study organisms and their abiotic environmental interactions? We tackle this question, using a comparative genomic approach, by considering the molecular basis of aerobiosis within archaea. Lipoylation, the covalent attachment of lipoic acid to 2-oxoacid dehydrogenase multienzyme complexes (OADHCs), is essential for metabolism in aerobic bacteria and eukarya. Lipoylation is catalysed either by lipoate protein ligase (LplA), which in archaea is typically encoded by two genes (LplA-N and LplA-C), or by a lipoyl(octanoyl) transferase (LipB or LipM) plus a lipoic acid synthetase (LipA). Does the genomic presence of lipoylation and OADHC genes across archaea from diverse habitats correlate with aerobiosis? First, analyses of 11,826 biotin protein ligase (BPL)-LplA-LipB transferase family members and 147 archaeal genomes identified 85 species with lipoylation capabilities and provided support for multiple ancestral acquisitions of lipoylation pathways during archaeal evolution. Second, with the exception of the Sulfolobales order, the majority of species possessing lipoylation systems exclusively retain LplA, or either LipB or LipM, consistent with archaeal genome streamlining. Third, obligate anaerobic archaea display widespread loss of lipoylation and OADHC genes. Conversely, a high level of correspondence is observed between aerobiosis and the presence of LplA/LipB/LipM, LipA and OADHC E2, consistent with the role of lipoylation in aerobic metabolism. This correspondence between OADHC lipoylation capacity and aerobiosis indicates that genomic pathway profiling in archaea is informative and that well characterized pathways may be predictive in relation to abiotic conditions in difficult-to-study extremophiles. Given the highly variable retention of gene repertoires across the archaea

  6. Comparative analysis of bat genomes provides insight into the evolution of flight and immunity.

    PubMed

    Zhang, Guojie; Cowled, Christopher; Shi, Zhengli; Huang, Zhiyong; Bishop-Lilly, Kimberly A; Fang, Xiaodong; Wynne, James W; Xiong, Zhiqiang; Baker, Michelle L; Zhao, Wei; Tachedjian, Mary; Zhu, Yabing; Zhou, Peng; Jiang, Xuanting; Ng, Justin; Yang, Lan; Wu, Lijun; Xiao, Jin; Feng, Yue; Chen, Yuanxin; Sun, Xiaoqing; Zhang, Yong; Marsh, Glenn A; Crameri, Gary; Broder, Christopher C; Frey, Kenneth G; Wang, Lin-Fa; Wang, Jun

    2013-01-25

    Bats are the only mammals capable of sustained flight and are notorious reservoir hosts for some of the world's most highly pathogenic viruses, including Nipah, Hendra, Ebola, and severe acute respiratory syndrome (SARS). To identify genetic changes associated with the development of bat-specific traits, we performed whole-genome sequencing and comparative analyses of two distantly related species, fruit bat Pteropus alecto and insectivorous bat Myotis davidii. We discovered an unexpected concentration of positively selected genes in the DNA damage checkpoint and nuclear factor κB pathways that may be related to the origin of flight, as well as expansion and contraction of important gene families. Comparison of bat genomes with other mammalian species has provided new insights into bat biology and evolution. PMID:23258410

  7. Complete Sequence and Comparative Analysis of the Chloroplast Genome of Coconut Palm (Cocos nucifera)

    PubMed Central

    Huang, Ya-Yi; Matzke, Antonius J. M.; Matzke, Marjori

    2013-01-01

    Coconut, a member of the palm family (Arecaceae), is one of the most economically important trees used by mankind. Despite its diverse morphology, coconut is recognized taxonomically as only a single species (Cocos nucifera L.). There are two major coconut varieties, tall and dwarf, the latter of which displays traits resulting from selection by humans. We report here the complete chloroplast (cp) genome of a dwarf coconut plant, and describe the gene content and organization, inverted repeat fluctuations, repeated sequence structure, and occurrence of RNA editing. Phylogenetic relationships of monocots were inferred based on 47 chloroplast protein-coding genes. Potential nodes for events of gene duplication and pseudogenization related to inverted repeat fluctuation were mapped onto the tree using parsimony criteria. We compare our findings with those from other palm species for which complete cp genome sequences are available. PMID:24023703

  8. Complete sequence and comparative analysis of the chloroplast genome of coconut palm (Cocos nucifera).

    PubMed

    Huang, Ya-Yi; Matzke, Antonius J M; Matzke, Marjori

    2013-01-01

    Coconut, a member of the palm family (Arecaceae), is one of the most economically important trees used by mankind. Despite its diverse morphology, coconut is recognized taxonomically as only a single species (Cocos nucifera L.). There are two major coconut varieties, tall and dwarf, the latter of which displays traits resulting from selection by humans. We report here the complete chloroplast (cp) genome of a dwarf coconut plant, and describe the gene content and organization, inverted repeat fluctuations, repeated sequence structure, and occurrence of RNA editing. Phylogenetic relationships of monocots were inferred based on 47 chloroplast protein-coding genes. Potential nodes for events of gene duplication and pseudogenization related to inverted repeat fluctuation were mapped onto the tree using parsimony criteria. We compare our findings with those from other palm species for which complete cp genome sequences are available. PMID:24023703

  9. Comparative Genomic Analysis Reveals a Possible Novel Non-Tuberculous Mycobacterium Species with High Pathogenic Potential

    PubMed Central

    Choo, Siew Woh; Dutta, Avirup; Wong, Guat Jah; Wee, Wei Yee; Ang, Mia Yang; Siow, Cheuk Chuen

    2016-01-01

    Mycobacteria have been reported to cause a wide range of human diseases. We present the first whole-genome study of a Non-Tuberculous Mycobacterium, Mycobacterium sp. UM_CSW (referred to hereafter as UM_CSW), isolated from a patient diagnosed with bronchiectasis. Our data suggest that this clinical isolate is likely a novel mycobacterial species, supported by clear evidence from molecular phylogenetic, comparative genomic, ANI and AAI analyses. UM_CSW is closely related to the Mycobacterium avium complex. While it has characteristic features of an environmental bacterium, it also shows a high pathogenic potential with the presence of a wide variety of putative genes related to bacterial virulence and shares very similar pathogenomic profiles with the known pathogenic mycobacterial species. Thus, we conclude that this possible novel Mycobacterium species should be tightly monitored for its possible causative role in human infections. PMID:27035710

  10. Comparative genome analysis between Agrostis stolonifera and members of the Pooideae subfamily, including Brachypodium distachyon.

    PubMed

    Araneda, Loreto; Sim, Sung-Chur; Bae, Jin-Joo; Chakraborty, Nanda; Curley, Joe; Chang, Taehyun; Inoue, Maiko; Warnke, Scott; Jung, Geunhwa

    2013-01-01

    Creeping bentgrass (Agrostis stolonifera, allotetraploid 2n = 4x = 28) is one of the major cool-season turfgrasses. It is widely used on golf courses due to its tolerance to low mowing and aggressive growth habit. In this study, we investigated genome relationships of creeping bentgrass relative to the Triticeae (a consensus map of Triticum aestivum, T. tauschii, Hordeum vulgare, and H. spontaneum), oat, rice, and ryegrass maps using a common set of 229 EST-RFLP markers. The genome comparisons based on the RFLP markers revealed large-scale chromosomal rearrangements on different numbers of linkage groups (LGs) of creeping bentgrass relative to the Triticeae (3 LGs), oat (4 LGs), and rice (8 LGs). However, we detected no chromosomal rearrangement between creeping bentgrass and ryegrass, suggesting that these recently domesticated species might be closely related, despite their membership in different Pooideae tribes. In addition, the genome of creeping bentgrass was compared with the complete genome sequence of Brachypodium distachyon in the Pooideae subfamily using both sequences of the above-mentioned mapped EST-RFLP markers and sequences of 8,470 publicly available A. stolonifera ESTs (AgEST). We discovered large-scale chromosomal rearrangements on six LGs of creeping bentgrass relative to B. distachyon. Also, a total of 24 syntenic blocks based on 678 orthologous loci were identified between these two grass species. The EST orthologs can be utilized in further comparative mapping of Pooideae species. These results will be useful for genetic improvement of Agrostis species and will provide a better understanding of evolution within Pooideae species. PMID:24244501

  11. Gramene: a growing plant comparative genomics resource

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Gramene (www.gramene.org) is a curated genetic, genomic and comparative genome analysis resource for the major crop species, such as rice, maize, wheat and many other plant (mainly grass) species. Gramene is an open-source project, with all data and software freely downloadable through the ftp site ...

  12. Whole-Genome Sequencing and Comparative Analysis of Mycobacterium brisbanense Reveals a Possible Soil Origin and Capability in Fertiliser Synthesis

    PubMed Central

    Wee, Wei Yee; Tan, Tze King; Jakubovics, Nicholas S.; Choo, Siew Woh

    2016-01-01

    Mycobacterium brisbanense is a member of Mycobacterium fortuitum third biovariant complex, which includes rapidly growing Mycobacterium spp. that normally inhabit soil, dust and water, and can sometimes cause respiratory tract infections in humans. We present the first whole-genome analysis of M. brisbanense UM_WWY which was isolated from a 70-year-old Malaysian patient. Molecular phylogenetic analyses confirmed the identification of this strain as M. brisbanense and showed that it has an unusually large genome compared with related mycobacteria. The large genome size of M. brisbanense UM_WWY (~7.7Mbp) is consistent with further findings that this strain has a highly variable genome structure that contains many putative horizontally transferred genomic islands and prophage. Comparative analysis showed that M. brisbanense UM_WWY is the only Mycobacterium species that possesses a complete set of genes encoding enzymes involved in the urea cycle, suggesting that this soil bacterium is able to synthesize urea for use as plant fertilizers. It is likely that M. brisbanense UM_WWY is adapted to live in soil as its primary habitat since the genome contains many genes associated with nitrogen metabolism. Nevertheless, a large number of predicted virulence genes were identified in M. brisbanense UM_WWY that are mostly shared with well-studied mycobacterial pathogens such as Mycobacterium tuberculosis and Mycobacterium abscessus. These findings are consistent with the role of M. brisbanense as an opportunistic pathogen of humans. The whole-genome study of UM_WWY has provided the basis for future work of M. brisbanense. PMID:27031249

  13. Whole-Genome Sequencing and Comparative Analysis of Mycobacterium brisbanense Reveals a Possible Soil Origin and Capability in Fertiliser Synthesis.

    PubMed

    Wee, Wei Yee; Tan, Tze King; Jakubovics, Nicholas S; Choo, Siew Woh

    2016-01-01

    Mycobacterium brisbanense is a member of Mycobacterium fortuitum third biovariant complex, which includes rapidly growing Mycobacterium spp. that normally inhabit soil, dust and water, and can sometimes cause respiratory tract infections in humans. We present the first whole-genome analysis of M. brisbanense UM_WWY which was isolated from a 70-year-old Malaysian patient. Molecular phylogenetic analyses confirmed the identification of this strain as M. brisbanense and showed that it has an unusually large genome compared with related mycobacteria. The large genome size of M. brisbanense UM_WWY (~7.7Mbp) is consistent with further findings that this strain has a highly variable genome structure that contains many putative horizontally transferred genomic islands and prophage. Comparative analysis showed that M. brisbanense UM_WWY is the only Mycobacterium species that possesses a complete set of genes encoding enzymes involved in the urea cycle, suggesting that this soil bacterium is able to synthesize urea for use as plant fertilizers. It is likely that M. brisbanense UM_WWY is adapted to live in soil as its primary habitat since the genome contains many genes associated with nitrogen metabolism. Nevertheless, a large number of predicted virulence genes were identified in M. brisbanense UM_WWY that are mostly shared with well-studied mycobacterial pathogens such as Mycobacterium tuberculosis and Mycobacterium abscessus. These findings are consistent with the role of M. brisbanense as an opportunistic pathogen of humans. The whole-genome study of UM_WWY has provided the basis for future work of M. brisbanense. PMID:27031249

  14. The use of clustering software for the classification of comparative genomic hybridization data. an analysis of 109 malignant fibrous histiocytomas.

    PubMed

    Chibon, Frédéric; Mariani, Odette; Mairal, Aline; Derré, Josette; Coindre, Jean-Michel; Terrier, Philippe; Lagacé, Réal; Sastre, Xavier; Aurias, Alain

    2003-02-01

    Malignant fibrous histiocytoma (MFH) is considered the most frequent soft-tissue sarcoma of late adult life. Nevertheless, the validity of this entity has been recurrently questioned by pathologists. Preliminary analyses by comparative genomic hybridization (CGH) of series of MFH have suggested that this tumor group is heterogeneous at the genomic level, and that at least two main genetic subgroups exist. We report an analysis by CGH of a large series of 109 MFH and on the use of clustering software for an objective classification of these tumors. We confirm our preliminary CGH results and demonstrate that two main clusters of tumors are present in the series analyzed. PMID:12581902

  15. Joint GWAS Analysis: Comparing similar GWAS at different genomic resolutions identifies novel pathway associations with six complex diseases

    PubMed Central

    McGeachie, Michael J.; Clemmer, George L.; Lasky-Su, Jessica; Dahlin, Amber; Raby, Benjamin A.; Weiss, Scott T.

    2014-01-01

    We show here that combining two existing genome wide association studies (GWAS) yields additional biologically relevant information, beyond that obtained by either GWAS separately. We propose Joint GWAS Analysis, a method that compares a pair of GWAS for similarity among the top SNP associations, top genes identified, gene functional clusters, and top biological pathways. We show that Joint GWAS Analysis identifies additional enriched biological pathways that would be missed by traditional Single-GWAS analysis. Furthermore, we examine the similarities of six complex genetic disorders at the SNP-level, gene-level, gene-cluster-level, and pathway-level. We make concrete hypotheses regarding novel pathway associations for several complex disorders considered, based on the results of Joint GWAS Analysis. Together, these results demonstrate that common complex disorders share substantially more genomic architecture than has been previously realized and that the meta-analysis of GWAS needs not be limited to GWAS of the same phenotype to be informative. PMID:25838990

  16. Comparative genomic hybridization analysis of newly established retinoblastoma cell lines of adherent growth compared with Y79 of nonadherent growth.

    PubMed

    Kim, Jeong Hun; Kim, Jin Hyoung; Yu, Young Suk; Kim, Dong Hun; Kim, Yong Kyu; Kim, Kyu-Won

    2008-08-01

    Retinoblastoma (RB) shows cytogenetic aberrations involving genes other than the RB gene located on 13q14. We analyzed genomic aberrations in the newly established RB cell lines SNUOT-RB1 and SNUOT-RB4, which grow adherently, and in the nonadherent Y79 cell line by microarray comparative genomic hybridization. SNUOT-RB1 showed 44 significant copy number changes (gain in 11 and loss in 33, P<0.0005). SNUOT-RB4 showed 42 significant copy number changes (gain in 8 and loss in 34, P<0.0005). The Y79 cell line had the greatest gain, 19.65-fold, at the MYCN gene locus (2p24.1), whereas SNUOT-RB1 and SNUOT-RB4 showed no significant gain at that locus. SNUOT-RB1 and SNUOT-RB4 both gained copy number on chromosome 11, especially at locus 11q13, which harbors cancer-related genes such as CCND1, MEN1, and FGF3. Losses of copy number occurred in chromosomes 3, 9, 10, 11, 16, and 17. In summary, SNUOT-RB1 and SNUOT-RB4 showed a similar pattern of chromosomal copy number gains and losses, which differed from that of Y79. Loss of the tumor suppressor gene CYLD (16q12-q13) was the only change common to all 3 cell lines. PMID:18799932

  17. Comparative genome analysis reveals the molecular basis of nicotine degradation and survival capacities of Arthrobacter

    PubMed Central

    Yao, Yuxiang; Tang, Hongzhi; Su, Fei; Xu, Ping

    2015-01-01

    Arthrobacter is one of the most prevalent genera of nicotine-degrading bacteria; however, studies of nicotine degradation in Arthrobacter species remain at the plasmid level (plasmid pAO1). Here, we report the bioinformatic analysis of a nicotine-degrading strain, Arthrobacter aurescens M2012083, and show that the moeB and mogA genes that are essential for nicotine degradation in Arthrobacter are absent from plasmid pAO1. Homologues of all the nicotine degradation-related genes of plasmid pAO1 were found to be located on a 68,622-bp DNA segment (nic segment-1) in the M2012083 genome, showing 98.1% nucleotide sequence identity to the 69,252-bp nic segment of plasmid pAO1. However, the remainder of the plasmid pAO1 sequence, outside the nic segment, shows no significant similarity to the genome sequence of strain M2012083. Taken together, our data suggest that the nicotine degradation-related genes of strain M2012083 are located on the chromosome or a plasmid other than pAO1. Based on the genomic sequence comparison of strain M2012083 and six other Arthrobacter strains, we have identified 17 σ70 transcription factors reported to be involved in stress responses and 109 genes involved in environmental adaptability of strain M2012083. These results reveal the molecular basis of nicotine degradation and survival capacities of Arthrobacter species. PMID:25721465

  18. Comparative genome analysis of Pseudomonas knackmussii B13, the first bacterium known to degrade chloroaromatic compounds.

    PubMed

    Miyazaki, Ryo; Bertelli, Claire; Benaglio, Paola; Canton, Jonas; De Coi, Nicoló; Gharib, Walid H; Gjoksi, Bebeka; Goesmann, Alexander; Greub, Gilbert; Harshman, Keith; Linke, Burkhard; Mikulic, Josip; Mueller, Linda; Nicolas, Damien; Robinson-Rechavi, Marc; Rivolta, Carlo; Roggo, Clémence; Roy, Shantanu; Sentchilo, Vladimir; Siebenthal, Alexandra Von; Falquet, Laurent; van der Meer, Jan Roelof

    2015-01-01

    Pseudomonas knackmussii B13, isolated in 1974, was the first strain found to degrade chlorinated aromatic hydrocarbons. This discovery was the prologue to the subsequent characterization of numerous bacterial metabolic pathways and to genetic and biochemical studies, and it spurred ideas for pollutant bioremediation. In this study, we determined the complete genome sequence of B13 using next generation sequencing technologies and optical mapping. Genome annotation indicated that B13 has a variety of metabolic pathways for degrading monoaromatic hydrocarbons including chlorobenzoate, aminophenol, anthranilate and hydroxyquinol, but not polyaromatic compounds. Comparative genome analysis revealed that B13 is closest to Pseudomonas denitrificans and Pseudomonas aeruginosa. The B13 genome contains at least eight genomic islands [prophages and integrative conjugative elements (ICEs)], which were absent in closely related pseudomonads. We confirm that two ICEs are identical copies of the 103 kb self-transmissible element ICEclc that carries the genes for chlorocatechol metabolism. Comparison of ICEclc showed that it is composed of a variable and a 'core' region, which is very conserved among proteobacterial genomes, suggesting a widely distributed family of so far uncharacterized ICE. Resequencing of two spontaneous B13 mutants revealed a number of single nucleotide substitutions, as well as excision of a large 220 kb region and a prophage that drastically change the host metabolic capacity and survivability. PMID:24803113

  19. Comparative Genomics Analysis of Streptococcus Isolates from the Human Small Intestine Reveals their Adaptation to a Highly Dynamic Ecosystem

    PubMed Central

    Van den Bogert, Bartholomeus; Boekhorst, Jos; Herrmann, Ruth; Smid, Eddy J.; Zoetendal, Erwin G.; Kleerebezem, Michiel

    2013-01-01

    The human small-intestinal microbiota is characterised by relatively large and dynamic Streptococcus populations. In this study, genome sequences of small-intestinal streptococci from S. mitis, S. bovis, and S. salivarius species-groups were determined and compared with those from 58 Streptococcus strains in public databases. The Streptococcus pangenome consists of 12,403 orthologous groups of which 574 are shared among all sequenced streptococci and are defined as the Streptococcus core genome. Genome mining of the small-intestinal streptococci focused on functions playing an important role in the interaction of these streptococci in the small-intestinal ecosystem, including natural competence and nutrient-transport and metabolism. Analysis of the small-intestinal Streptococcus genomes predicts a high capacity to synthesize amino acids and various vitamins as well as substantial divergence in their carbohydrate transport and metabolic capacities, which is in agreement with observed physiological differences between these Streptococcus strains. Gene-specific PCR-strategies enabled evaluation of conservation of Streptococcus populations in intestinal samples from different human individuals, revealing that the S. salivarius strains were frequently detected in the small-intestine microbiota, supporting the representative value of the genomes provided in this study. Finally, the Streptococcus genomes allow prediction of the effect of dietary substances on Streptococcus population dynamics in the human small-intestine. PMID:24386196
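    The pangenome and core-genome definitions used in this abstract (all orthologous groups seen in any strain versus those shared by every strain) reduce to a set union and a set intersection. The sketch below assumes a hypothetical input layout, a mapping from strain name to the set of orthologous-group identifiers found in that strain; it does not cover the ortholog clustering step itself.

```python
from typing import Dict, Set, Tuple


def pan_and_core(genomes: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Return (pangenome, core genome) for a strain -> orthologous-group map.

    The pangenome is the union of all groups; the core genome is the
    intersection, i.e. groups present in every strain.
    """
    groups = list(genomes.values())
    if not groups:
        return set(), set()
    pan = set().union(*groups)
    core = set.intersection(*groups)
    return pan, core
```

    For three toy strains carrying {og1, og2, og3}, {og1, og3} and {og1, og4}, the pangenome has four groups and the core genome contains only og1, which mirrors how the reported 574 shared groups relate to the 12,403 groups of the full pangenome.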

  20. Sequencing and comparative analysis of the straw mushroom (Volvariella volvacea) genome.

    PubMed

    Bao, Dapeng; Gong, Ming; Zheng, Huajun; Chen, Mingjie; Zhang, Liang; Wang, Hong; Jiang, Jianping; Wu, Lin; Zhu, Yongqiang; Zhu, Gang; Zhou, Yan; Li, Chuanhua; Wang, Shengyue; Zhao, Yan; Zhao, Guoping; Tan, Qi

    2013-01-01

    Volvariella volvacea, the edible straw mushroom, is a highly nutritious food source that is widely cultivated on a commercial scale in many parts of Asia using agricultural wastes (rice straw, cotton wastes) as growth substrates. However, developments in V. volvacea cultivation have been limited due to a low biological efficiency (i.e. conversion of growth substrate to mushroom fruit bodies), sensitivity to low temperatures, and an unclear sexuality pattern that has restricted the breeding of improved strains. We have now sequenced the genome of V. volvacea and assembled it into 62 scaffolds with a total genome size of 35.7 megabases (Mb), containing 11,084 predicted gene models. Comparative analyses were performed with the model species in basidiomycete on mating type system, carbohydrate active enzymes, and fungal oxidative lignin enzymes. We also studied transcriptional regulation of the response to low temperature (4°C). We found that the genome of V. volvacea has many genes that code for enzymes, which are involved in the degradation of cellulose, hemicellulose, and pectin. The molecular genetics of the mating type system in V. volvacea was also found to be similar to the bipolar system in basidiomycetes, suggesting that it is secondary homothallism. Sensitivity to low temperatures could be due to the lack of the initiation of the biosynthesis of unsaturated fatty acids, trehalose and glycogen biosyntheses in this mushroom. Genome sequencing of V. volvacea has improved our understanding of the biological characteristics related to the degradation of the cultivating compost consisting of agricultural waste, the sexual reproduction mechanism, and the sensitivity to low temperatures at the molecular level which in turn will enable us to increase the industrial production of this mushroom. PMID:23526973

  1. Comparative genomic analysis of Lactobacillus rhamnosus GG reveals pili containing a human-mucus binding protein

    PubMed Central

    Kankainen, Matti; Paulin, Lars; Tynkkynen, Soile; von Ossowski, Ingemar; Reunanen, Justus; Partanen, Pasi; Satokari, Reetta; Vesterlund, Satu; Hendrickx, Antoni P. A.; Lebeer, Sarah; De Keersmaecker, Sigrid C. J.; Vanderleyden, Jos; Hämäläinen, Tuula; Laukkanen, Suvi; Salovuori, Noora; Ritari, Jarmo; Alatalo, Edward; Korpela, Riitta; Mattila-Sandholm, Tiina; Lassig, Anna; Hatakka, Katja; Kinnunen, Katri T.; Karjalainen, Heli; Saxelin, Maija; Laakso, Kati; Surakka, Anu; Palva, Airi; Salusjärvi, Tuomas; Auvinen, Petri; de Vos, Willem M.

    2009-01-01

    To unravel the biological function of the widely used probiotic bacterium Lactobacillus rhamnosus GG, we compared its 3.0-Mbp genome sequence with the similarly sized genome of L. rhamnosus LC705, an adjunct starter culture exhibiting reduced binding to mucus. Both genomes demonstrated high sequence identity and synteny. However, for both strains, genomic islands, 5 in GG and 4 in LC705, punctuated the colinearity. A significant number of strain-specific genes were predicted in these islands (80 in GG and 72 in LC705). The GG-specific islands included genes coding for bacteriophage components, sugar metabolism and transport, and exopolysaccharide biosynthesis. One island only found in L. rhamnosus GG contained genes for 3 secreted LPXTG-like pilins (spaCBA) and a pilin-dedicated sortase. Using anti-SpaC antibodies, the physical presence of cell wall-bound pili was confirmed by immunoblotting. Immunogold electron microscopy showed that the SpaC pilin is located at the pilus tip but also sporadically throughout the structure. Moreover, the adherence of strain GG to human intestinal mucus was blocked by SpaC antiserum and abolished in a mutant carrying an inactivated spaC gene. Similarly, binding to mucus was demonstrated for the purified SpaC protein. We conclude that the presence of SpaC is essential for the mucus interaction of L. rhamnosus GG and likely explains its ability to persist in the human intestinal tract longer than LC705 during an intervention trial. The presence of mucus-binding pili on the surface of a nonpathogenic Gram-positive bacterial strain reveals a previously undescribed mechanism for the interaction of selected probiotic lactobacilli with host tissues. PMID:19805152

  2. Sequencing and Comparative Analysis of the Straw Mushroom (Volvariella volvacea) Genome

    PubMed Central

    Bao, Dapeng; Gong, Ming; Zheng, Huajun; Chen, Mingjie; Zhang, Liang; Wang, Hong; Jiang, Jianping; Wu, Lin; Zhu, Yongqiang; Zhu, Gang; Zhou, Yan; Li, Chuanhua; Wang, Shengyue; Zhao, Yan; Zhao, Guoping; Tan, Qi

    2013-01-01

    Volvariella volvacea, the edible straw mushroom, is a highly nutritious food source that is widely cultivated on a commercial scale in many parts of Asia using agricultural wastes (rice straw, cotton wastes) as growth substrates. However, developments in V. volvacea cultivation have been limited due to a low biological efficiency (i.e. conversion of growth substrate to mushroom fruit bodies), sensitivity to low temperatures, and an unclear sexuality pattern that has restricted the breeding of improved strains. We have now sequenced the genome of V. volvacea and assembled it into 62 scaffolds with a total genome size of 35.7 megabases (Mb), containing 11,084 predicted gene models. Comparative analyses were performed with the model species in basidiomycete on mating type system, carbohydrate active enzymes, and fungal oxidative lignin enzymes. We also studied transcriptional regulation of the response to low temperature (4°C). We found that the genome of V. volvacea has many genes that code for enzymes, which are involved in the degradation of cellulose, hemicellulose, and pectin. The molecular genetics of the mating type system in V. volvacea was also found to be similar to the bipolar system in basidiomycetes, suggesting that it is secondary homothallism. Sensitivity to low temperatures could be due to the lack of the initiation of the biosynthesis of unsaturated fatty acids, trehalose and glycogen biosyntheses in this mushroom. Genome sequencing of V. volvacea has improved our understanding of the biological characteristics related to the degradation of the cultivating compost consisting of agricultural waste, the sexual reproduction mechanism, and the sensitivity to low temperatures at the molecular level which in turn will enable us to increase the industrial production of this mushroom. PMID:23526973

  3. [Comparative genome analysis in pea Pisum sativum L. varieties and lines with chromosomal and molecular markers].

    PubMed

    Samatadze, T E; Zelenina, D A; Shostak, N G; Volkov, A A; Popov, K V; Rachinskaia, O V; Borisov, A Iu; Tikhonovich, I A; Zelenin, A V; Muravenko, O V

    2008-12-01

    C banding, Ag-NOR staining, FISH with pTa71 (45S rDNA) and pTa794 (5S rDNA), and RAPD-PCR analysis were used to study the genome and chromosome polymorphism in four varieties (Frisson, Sparkle, Rondo, and Finale) and two genetic lines (Sprint-2 and SGE) of pea Pisum sativum L. A comparison of the C-banding patterns did not reveal any polymorphism within the varieties. The most significant between-variety differences were observed for the size of C bands on satellite chromosomes 4 and 7. All grain pea varieties (Frisson, Sparkle, and Rondo) had a large C band in the satellite of chromosome 4 and a medium C band in the region adjacent to the satellite thread on chromosome 7. C bands were almost of the same size in the genetic lines and vegetable variety Finale. In all accessions, 45S rDNA mapped to the secondary constriction regions of chromosomes 1, 3, and 5. The signal from chromosome 5 in the lines was more intense than in the varieties. Ag-NOR staining showed that the transcriptional activity of the 45S rRNA genes on chromosome 7 was higher than on chromosome 4 in all accessions. No more than four Ag-NOR-positive nucleoli were observed in interphase nuclei. Statistical analysis of the total area of Ag-NOR-stained nucleoli did not detect any significant difference between the accessions examined. RAPD-PCR analysis revealed high between-variety and low within-variety genomic polymorphism. Chromosomal and molecular markers proved to be promising for genome identification in pea varieties and lines. PMID:19178083

  4. Comparative genome analysis of a large Dutch Legionella pneumophila strain collection identifies five markers highly correlated with clinical strains

    PubMed Central

    2010-01-01

    Background: Discrimination between clinical and environmental strains within many bacterial species is currently underexplored. Genomic analyses have clearly shown the enormous variability in genome composition between different strains of a bacterial species. In this study we have used Legionella pneumophila, the causative agent of Legionnaires' disease, to search for genomic markers related to pathogenicity. During a large surveillance study in The Netherlands, well-characterized patient-derived strains and environmental strains were collected. We have used a mixed-genome microarray to perform comparative-genome analysis of 257 strains from this collection. Results: Microarray analysis indicated that 480 DNA markers (out of a total of 3360 markers) showed clear variation in presence between individual strains and these were therefore selected for further analysis. Unsupervised statistical analysis of these markers showed the enormous genomic variation within the species but did not show any correlation with a pathogenic phenotype. We therefore used supervised statistical analysis to identify discriminating markers. Genetic programming was used both to identify predictive markers and to define their interrelationships. A model consisting of five markers was developed that together correctly predicted 100% of the clinical strains and 69% of the environmental strains. Conclusions: A novel approach for identifying predictive markers enabling discrimination between clinical and environmental isolates of L. pneumophila is presented. Out of over 3000 possible markers, five were selected that together enabled correct prediction of all the clinical strains included in this study. This novel approach for identifying predictive markers can be applied to all bacterial species, allowing for better discrimination between strains well equipped to cause human disease and relatively harmless strains. PMID:20630115
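
    The two percentages reported in this abstract (100% of clinical strains and 69% of environmental strains correctly predicted) correspond to sensitivity and specificity when "clinical" is treated as the positive class. The sketch below shows how such figures are computed from true and predicted labels. The function name and the toy labels are illustrative assumptions; they are not the study's data or its genetic-programming model.

```python
def sensitivity_specificity(truth, predicted, positive="clinical"):
    """Return (sensitivity, specificity) for two equal-length label lists.

    Sensitivity: fraction of positive-class items predicted as positive.
    Specificity: fraction of all other items predicted as not positive.
    """
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must have the same length")
    tp = fn = tn = fp = 0
    for t, p in zip(truth, predicted):
        if t == positive:
            if p == positive:
                tp += 1
            else:
                fn += 1
        else:
            if p == positive:
                fp += 1
            else:
                tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in truth")
    return tp / (tp + fn), tn / (tn + fp)


# Toy labels, made up for illustration only (NOT taken from the study).
toy_truth = ["clinical"] * 4 + ["environmental"] * 4
toy_pred = ["clinical"] * 4 + ["environmental"] * 3 + ["clinical"]

sens, spec = sensitivity_specificity(toy_truth, toy_pred)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

    With these toy labels every clinical item is recovered, at the cost of one environmental item being called clinical. The abstract reports the same kind of trade-off.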

  5. Theobroma cacao: A genetically integrated physical map and genome-scale comparative synteny analysis

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A comprehensive integrated genomic framework is considered a centerpiece of genomic research. In collaboration with the USDA-ARS (SHRS) and Mars Inc., the Clemson University Genomics Institute (CUGI) has developed a genetically anchored physical map of the T. cacao genome. Three BAC libraries contai...

  6. Comparative Genomic Analysis of Pseudomonas chlororaphis PCL1606 Reveals New Insight into Antifungal Compounds Involved in Biocontrol.

    PubMed

    Calderón, Claudia E; Ramos, Cayo; de Vicente, Antonio; Cazorla, Francisco M

    2015-03-01

    Pseudomonas chlororaphis PCL1606 is a rhizobacterium that has biocontrol activity against many soilborne phytopathogenic fungi. The whole genome sequence of this strain was obtained using the Illumina HiSeq 2000 sequencing platform and was assembled using SOAPdenovo software. The resulting 6.66-Mb complete sequence of the PCL1606 genome was further analyzed. A comparative genomic analysis using 10 plant-associated strains within the fluorescent Pseudomonas group, including the complete genome of P. chlororaphis PCL1606, revealed a diverse spectrum of traits involved in multitrophic interactions with plants and microbes as well as biological control. Phylogenetic analysis of these strains using eight housekeeping genes clearly placed strain PCL1606 into the P. chlororaphis group. The genome sequence of P. chlororaphis PCL1606 revealed the presence of sequences that were homologous to biosynthetic genes for the antifungal compounds 2-hexyl, 5-propyl resorcinol (HPR), hydrogen cyanide, and pyrrolnitrin; this is the first report of pyrrolnitrin-encoding genes in this P. chlororaphis strain. Single-, double-, and triple-insertional mutants in the biosynthetic genes of each antifungal compound were used to test their roles in the production of these antifungal compounds and in antagonism and biocontrol of two fungal pathogens. The results confirmed the function of HPR in the antagonistic phenotype and in the biocontrol activity of P. chlororaphis PCL1606. PMID:25679537

  7. Complete Mitochondrial Genome Sequence of Acrida cinerea (Acrididae: Orthoptera) and Comparative Analysis of Mitochondrial Genomes in Orthoptera

    PubMed Central

    Liu, Nian; Huang, Yuan

    2010-01-01

    The complete 15,599-bp mitogenome of Acrida cinerea was determined and compared with that of the other 20 orthopterans. It displays characteristic gene content, genome organization, nucleotide composition, and codon usage found in other Caelifera mitogenomes. Comparison of 21 orthopteran sequences revealed that the tRNAs encoded by the H-strand appear more conserved than those encoded by the L-strand. All tRNAs form the typical clover-leaf structure except trnS (agn), and most of the size variation among tRNAs stemmed from the length variation in the arm and loop of TΨC and the loop of DHU. The derived secondary structure models of the rrnS and rrnL from 21 orthopteran species closely resemble those from other insects on CRW except a considerably enlarged loop of helix 1399 of rrnS in Caelifera, which is a potential autapomorphy of Caelifera. In the A+T-rich region, tandem repeats are not only conserved in the closely related mitogenome but also share some conserved motifs in the same subfamily. A stem-loop structure, 16 bp or longer, is likely to be involved in replication initiation in Caelifera and Grylloidea. A long T-stretch (>17 bp) with conserved stem-loop structure next to rrnS on the H-strand, bounded by a purine at either end, exists in the three species from Tettigoniidae. PMID:21197069

  8. Comparative genomic analysis of a neurotoxigenic Clostridium species using partial genome sequence: Phylogenetic analysis of a few conserved proteins involved in cellular processes and metabolism.

    PubMed

    Alam, Syed Imteyaz; Dixit, Aparna; Tomar, Arvind; Singh, Lokendra

    2010-04-01

    Clostridial organisms produce neurotoxins, which are generally regarded as the most potent toxic substances of biological origin and potential biological warfare agents. Clostridium tetani produces tetanus neurotoxin and is responsible for the fatal tetanus disease. In spite of the extensive immunization regimen, the disease is an important cause of death, especially among neonates. Strains of C. tetani have not been genetically characterized except for the complete genome sequencing of strain E88. The present study reports the genetic makeup and phylogenetic affiliations of an environmental strain of this bacterium with respect to C. tetani E88 and other clostridia. A shotgun library was constructed from the genomic DNA of C. tetani drde, isolated from a decaying fish sample. Unique clones were sequenced and the sequences compared with its closest relative, C. tetani E88. A total of 275 clones were obtained and 32,457 bases of non-redundant sequence were generated. A total of 150 base changes were observed over the entire length of sequence obtained, including additions, deletions, and base substitutions. Of the total 120 ORFs detected, 48 exhibited closest similarity to E88 proteins, of which three are hypothetical proteins. Eight of the ORFs exhibited similarity with hypothetical proteins from other organisms and 10 aligned with other proteins from unrelated organisms. There is an overall conservation of protein sequences between the two strains of C. tetani. Selected ORFs involved in cellular processes and metabolism were subjected to phylogenetic analysis. PMID:19527791
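
    The counts in this abstract can be turned into rough summary rates. The sketch below computes the percentage of changed bases and the share of ORFs closest to E88 proteins from the reported numbers. It assumes each addition, deletion, or substitution counts as a single changed base, which the abstract does not state; the function and variable names are illustrative.

```python
def percent_of(part, total):
    """Return part as a percentage of total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * part / total


# Figures as reported in the abstract.
BASE_CHANGES = 150        # additions, deletions, and substitutions combined
SEQUENCED_BASES = 32_457  # non-redundant sequence generated
TOTAL_ORFS = 120
ORFS_CLOSEST_TO_E88 = 48

# Assumption: every reported change affects exactly one base.
divergence = percent_of(BASE_CHANGES, SEQUENCED_BASES)
e88_share = percent_of(ORFS_CLOSEST_TO_E88, TOTAL_ORFS)

print(f"changed bases: {divergence:.2f}% of sequenced bases")
print(f"ORFs closest to E88 proteins: {e88_share:.0f}% of detected ORFs")
```

    Under that assumption the two strains differ at well under 1% of the sequenced positions, which fits the abstract's statement of overall conservation.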

  9. Genome sequence of a proteolytic (Group I) Clostridium botulinum strain Hall A and comparative analysis of the clostridial genomes

    PubMed Central

    Sebaihia, Mohammed; Peck, Michael W.; Minton, Nigel P.; Thomson, Nicholas R.; Holden, Matthew T.G.; Mitchell, Wilfrid J.; Carter, Andrew T.; Bentley, Stephen D.; Mason, David R.; Crossman, Lisa; Paul, Catherine J.; Ivens, Alasdair; Wells-Bennik, Marjon H.J.; Davis, Ian J.; Cerdeño-Tárraga, Ana M.; Churcher, Carol; Quail, Michael A.; Chillingworth, Tracey; Feltwell, Theresa; Fraser, Audrey; Goodhead, Ian; Hance, Zahra; Jagels, Kay; Larke, Natasha; Maddison, Mark; Moule, Sharon; Mungall, Karen; Norbertczak, Halina; Rabbinowitsch, Ester; Sanders, Mandy; Simmonds, Mark; White, Brian; Whithead, Sally; Parkhill, Julian

    2007-01-01

    Clostridium botulinum is a heterogeneous Gram-positive species that comprises four genetically and physiologically distinct groups of bacteria that share the ability to produce botulinum neurotoxin, the most poisonous toxin known to man, and the causative agent of botulism, a severe disease of humans and animals. We report here the complete genome sequence of a representative of Group I (proteolytic) C. botulinum (strain Hall A, ATCC 3502). The genome consists of a chromosome (3,886,916 bp) and a plasmid (16,344 bp), which carry 3650 and 19 predicted genes, respectively. Consistent with the proteolytic phenotype of this strain, the genome harbors a large number of genes encoding secreted proteases and enzymes involved in uptake and metabolism of amino acids. The genome also reveals a hitherto unknown ability of C. botulinum to degrade chitin. There is a significant lack of recently acquired DNA, indicating a stable genomic content, in strong contrast to the fluid genome of Clostridium difficile, which can form longer-term relationships with its host. Overall, the genome indicates that C. botulinum is adapted to a saprophytic lifestyle both in soil and aquatic environments. This pathogen relies on its toxin to rapidly kill a wide range of prey species, and to gain access to nutrient sources, it releases a large number of extracellular enzymes to soften and destroy rotting or decayed tissues. PMID:17519437

  10. Genome-wide Comparative Analysis of Atopic Dermatitis and Psoriasis Gives Insight into Opposing Genetic Mechanisms

    PubMed Central

    Baurecht, Hansjörg; Hotze, Melanie; Brand, Stephan; Büning, Carsten; Cormican, Paul; Corvin, Aiden; Ellinghaus, David; Ellinghaus, Eva; Esparza-Gordillo, Jorge; Fölster-Holst, Regina; Franke, Andre; Gieger, Christian; Hubner, Norbert; Illig, Thomas; Irvine, Alan D.; Kabesch, Michael; Lee, Young A.E.; Lieb, Wolfgang; Marenholz, Ingo; McLean, W.H. Irwin; Morris, Derek W.; Mrowietz, Ulrich; Nair, Rajan; Nöthen, Markus M.; Novak, Natalija; O’Regan, Grainne M.; Schreiber, Stefan; Smith, Catherine; Strauch, Konstantin; Stuart, Philip E.; Trembath, Richard; Tsoi, Lam C.; Weichenthal, Michael; Barker, Jonathan; Elder, James T.; Weidinger, Stephan; Cordell, Heather J.; Brown, Sara J.

    2015-01-01

    Atopic dermatitis and psoriasis are the two most common immune-mediated inflammatory disorders affecting the skin. Genome-wide studies demonstrate a high degree of genetic overlap, but these diseases have mutually exclusive clinical phenotypes and opposing immune mechanisms. Despite their prevalence, atopic dermatitis and psoriasis very rarely co-occur within one individual. By utilizing genome-wide association study and ImmunoChip data from >19,000 individuals and methodologies developed from meta-analysis, we have identified opposing risk alleles at shared loci as well as independent disease-specific loci within the epidermal differentiation complex (chromosome 1q21.3), the Th2 locus control region (chromosome 5q31.1), and the major histocompatibility complex (chromosome 6p21–22). We further identified previously unreported pleiotropic alleles with opposing effects on atopic dermatitis and psoriasis risk in PRKRA and ANXA6/TNIP1. In contrast, there was no evidence for shared loci with effects operating in the same direction on both diseases. Our results show that atopic dermatitis and psoriasis have distinct genetic mechanisms with opposing effects in shared pathways influencing epidermal differentiation and immune response. The statistical analysis methods developed in the conduct of this study have produced additional insight from previously published data sets. The approach is likely to be applicable to the investigation of the genetic basis of other complex traits with overlapping and distinct clinical features. PMID:25574825