Science.gov

Sample records for comparative genomics analysis

  1. Comparative Genome Analysis of Enterobacter cloacae

    PubMed Central

    Liu, Wing-Yee; Wong, Chi-Fat; Chung, Karl Ming-Kar; Jiang, Jing-Wei; Leung, Frederick Chi-Ching

    2013-01-01

    The Enterobacter cloacae species includes an extremely diverse group of bacteria that are associated with plants, soil and humans. Publication of the complete genome sequence of the plant growth-promoting endophytic E. cloacae subsp. cloacae ENHKU01 provided an opportunity to perform the first comparative genome analysis between strains of this dynamic species. Examination of the pan-genome of E. cloacae showed that the conserved core genome retains the general physiological and survival genes of the species, while genomic factors in plasmids and variable regions determine the virulence of the human pathogenic E. cloacae strain; additionally, the diversity of fimbriae contributes to variation in colonization and host determination of different E. cloacae strains. Comparative genome analysis further illustrated that E. cloacae strains possess multiple mechanisms for antagonistic action against other microorganisms, which involve the production of siderophores and various antimicrobial compounds, such as bacteriocins, chitinases and antibiotic resistance proteins. The presence of Type VI secretion systems is expected to provide further fitness advantages for E. cloacae in microbial competition, thus allowing it to survive in different environments. Competition assays were performed to support our observations in genomic analysis, where E. cloacae subsp. cloacae ENHKU01 demonstrated antagonistic activities against a wide range of plant pathogenic fungal and bacterial species. PMID:24069314

  2. Comparative analysis of the Borrelia garinii genome.

    PubMed

    Glöckner, G; Lehmann, R; Romualdi, A; Pradella, S; Schulte-Spechtel, U; Schilhabel, M; Wilske, B; Sühnel, J; Platzer, M

    2004-01-01

    Three members of the genus Borrelia (B.burgdorferi, B.garinii, B.afzelii) cause tick-borne borreliosis. Depending on the Borrelia species involved, the borreliosis differs in its clinical symptoms. Comparative genomics opens up a way to elucidate the underlying differences in Borrelia species. We analysed a low redundancy whole-genome shotgun (WGS) assembly of a B.garinii strain isolated from a patient with neuroborreliosis in comparison to the B.burgdorferi genome. This analysis reveals that most of the chromosome is conserved (92.7% identity on DNA as well as on amino acid level) in the two species, and no chromosomal rearrangement or larger insertions/deletions could be observed. Furthermore, two collinear plasmids (lp54 and cp26) seem to belong to the basic genome inventory of Borrelia species. These three collinear parts of the Borrelia genome encode 861 genes, which are orthologous in the two species examined. The majority of the genetic information of the other plasmids of B.burgdorferi is also present in B.garinii although orthology is not easy to define due to a high redundancy of the plasmid fraction. Yet, we did not find counterparts of the B.burgdorferi plasmids lp36 and lp38 or their respective gene repertoire in the B.garinii genome. Thus, phenotypic differences between the two species could be attributable to the presence or absence of these two plasmids as well as to the potentially positively selected genes. PMID:15547252

  3. Image analysis in comparative genomic hybridization

    SciTech Connect

    Lundsteen, C.; Maahr, J.; Christensen, B.

    1995-01-01

    Comparative genomic hybridization (CGH) is a new technique by which genomic imbalances can be detected by combining in situ suppression hybridization of whole genomic DNA and image analysis. We have developed software for rapid, quantitative CGH image analysis by a modification and extension of the standard software used for routine karyotyping of G-banded metaphase spreads in the Magiscan chromosome analysis system. The DAPI-counterstained metaphase spread is karyotyped interactively. Corrections for image shifts between the DAPI, FITC, and TRITC images are done manually by moving the three images relative to each other. The fluorescence background is subtracted. A mean filter is applied to smooth the FITC and TRITC images before the fluorescence ratio between the individual FITC and TRITC-stained chromosomes is computed pixel by pixel inside the area of the chromosomes determined by the DAPI boundaries. Fluorescence intensity ratio profiles are generated, and peaks and valleys indicating possible gains and losses of test DNA are marked where the ratio rises above 1.25 or falls below 0.75. By combining the analysis of several metaphase spreads, consistent findings of gains and losses in all or almost all spreads indicate chromosomal imbalance. Chromosomal imbalances are detected either by visual inspection of fluorescence ratio (FR) profiles or by a statistical approach that compares FR measurements of the individual case with measurements of normal chromosomes. The complete analysis of one metaphase can be carried out in approximately 10 minutes. 8 refs., 7 figs., 1 tab.
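
    The ratio-profile step described in this abstract can be illustrated with a minimal Python sketch. It is not the Magiscan software: it works on a one-dimensional intensity profile along a chromosome rather than on pixel images, and it assumes that the FITC channel carries the test DNA and the TRITC channel the reference DNA. The function names, the window size, and the sample values are hypothetical; only the 0.75 and 1.25 thresholds come from the abstract.

```python
from statistics import mean

LOSS_THRESHOLD = 0.75  # ratio below this marks a possible loss of test DNA
GAIN_THRESHOLD = 1.25  # ratio above this marks a possible gain of test DNA


def smooth(values, window=3):
    """Moving-average (mean) filter; the window shrinks at the profile edges."""
    half = window // 2
    return [mean(values[max(0, i - half): i + half + 1]) for i in range(len(values))]


def ratio_profile(fitc, tritc, background=0.0, window=3):
    """Subtract background, smooth both channels, return test/reference ratios.

    Assumption: fitc is the test-DNA channel and tritc the reference channel.
    """
    test = smooth([max(v - background, 0.0) for v in fitc], window)
    ref = smooth([max(v - background, 0.0) for v in tritc], window)
    return [t / r if r > 0 else float("nan") for t, r in zip(test, ref)]


def classify(ratios):
    """Label each position as 'loss', 'gain', 'balanced' or 'undetermined'."""
    labels = []
    for r in ratios:
        if r != r:  # NaN: reference intensity was zero after background removal
            labels.append("undetermined")
        elif r < LOSS_THRESHOLD:
            labels.append("loss")
        elif r > GAIN_THRESHOLD:
            labels.append("gain")
        else:
            labels.append("balanced")
    return labels


if __name__ == "__main__":
    # Hypothetical intensities along one chromosome axis.
    fitc = [10, 10, 10, 20, 20, 20]
    tritc = [10, 10, 10, 10, 10, 10]
    print(classify(ratio_profile(fitc, tritc)))
```

    In practice the profiles from several metaphase spreads would be combined before calling an imbalance, as the abstract notes; the sketch covers a single profile only.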

  4. Comparative genome analysis of Basidiomycete fungi

    SciTech Connect

    Riley, Robert; Salamov, Asaf; Henrissat, Bernard; Nagy, Laszlo; Brown, Daren; Held, Benjamin; Baker, Scott; Blanchette, Robert; Boussau, Bastien; Doty, Sharon L.; Fagnan, Kirsten; Floudas, Dimitris; Levasseur, Anthony; Manning, Gerard; Martin, Francis; Morin, Emmanuelle; Otillar, Robert; Pisabarro, Antonio; Walton, Jonathan; Wolfe, Ken; Hibbett, David; Grigoriev, Igor

    2013-08-07

    Fungi of the phylum Basidiomycota (basidiomycetes) make up some 37 percent of the described fungi, and are important in forestry, agriculture, medicine, and bioenergy. This diverse phylum includes symbionts, pathogens, and saprotrophs including the majority of wood decaying and ectomycorrhizal species. To better understand the genetic diversity of this phylum we compared the genomes of 35 basidiomycetes including 6 newly sequenced genomes. These genomes span extremes of genome size, gene number, and repeat content. Analysis of core genes reveals that some 48 percent of basidiomycete proteins are unique to the phylum with nearly half of those (22 percent) found in only one organism. Correlations between lifestyle and certain gene families are evident. Phylogenetic patterns of plant biomass-degrading genes in Agaricomycotina suggest a continuum rather than a dichotomy between the white rot and brown rot modes of wood decay. Based on phylogenetically-informed PCA analysis of wood decay genes, we predict that Botryobasidium botryosum and Jaapia argillacea have properties similar to white rot species, although neither has typical ligninolytic class II fungal peroxidases (PODs). This prediction is supported by growth assays in which both fungi exhibit wood decay with white rot-like characteristics. Based on this, we suggest that the white/brown rot dichotomy may be inadequate to describe the full range of wood decaying fungi. Analysis of the rate of discovery of proteins with no or few homologs suggests the value of continued sequencing of basidiomycete fungi.

  5. Comparative Analysis of Genome Sequences with VISTA

    DOE Data Explorer

    Dubchak, Inna

    VISTA is a comprehensive suite of programs and databases developed by and hosted at the Genomics Division of Lawrence Berkeley National Laboratory. They provide information and tools designed to facilitate comparative analysis of genomic sequences. Users have two ways to interact with the suite of applications at the VISTA portal. They can submit their own sequences and alignments for analysis (VISTA servers) or examine pre-computed whole-genome alignments of different species. A key menu option is the Enhancer Browser and Database at http://enhancer.lbl.gov/. The VISTA Enhancer Browser is a central resource for experimentally validated human noncoding fragments with gene enhancer activity as assessed in transgenic mice. Most of these noncoding elements were selected for testing based on their extreme conservation with other vertebrates. The results of this enhancer screen are provided through this publicly available website. The browser also features relevant results by external contributors and a large collection of additional genome-wide conserved noncoding elements which are candidate enhancer sequences. The LBL developers invite external groups to submit computational predictions of developmental enhancers. As of 10/19/2009 the database contains information on 1109 in vivo tested elements - 508 elements with enhancer activity.

  6. Comparative Genome Analysis in the Integrated Microbial Genomes (IMG) System

    SciTech Connect

    Kyrpides, Nikos C.; Markowitz, Victor M.

    2006-03-01

    Comparative genome analysis is critical for the effective exploration of a rapidly growing number of complete and draft sequences for microbial genomes. The Integrated Microbial Genomes (IMG) system (img.jgi.doe.gov) has been developed as a community resource that provides support for comparative analysis of microbial genomes in an integrated context. IMG allows users to navigate the multidimensional microbial genome data space and focus their analysis on a subset of genes, genomes, and functions of interest. IMG provides graphical viewers, summaries and occurrence profile tools for comparing genes, pathways and functions (terms) across specific genomes. Genes can be further examined using gene neighborhoods and compared with sequence alignment tools.

  7. Initial sequencing and comparative analysis of the mouse genome

    SciTech Connect

    Waterston, Robert H.; Lindblad-Toh, Kerstin; Birney, Ewan; Rogers, Jane; Abril, Josep F.; Agarwal, Pankaj; Agarwala, Richa; Ainscough, Rachel; Alexandersson, Marina; An, Peter; Antonarakis, Stylianos E.; Attwood, John; Baertsch, Robert; Bailey, Jonathon; Barlow, Karen; Beck, Stephan; Berry, Eric; Birren, Bruce; Bloom, Toby; Bork, Peer; Botcherby, Marc; Bray, Nicolas; Brent, Michael R.; Brown, Daniel G.; Brown, Stephen D.; Bult, Carol; Burton, John; Butler, Jonathan; Campbell, Robert D.; Carninci, Piero; Cawley, Simon; Chiaromonte, Francesca; Chinwalla, Asif T.; Church, Deanna M.; Clamp, Michele; Clee, Christopher; Collins, Francis S.; Cook, Lisa L.; Copley, Richard R.; Coulson, Alan; Couronne, Olivier; Cuff, James; Curwen, Val; Cutts, Tim; Daly, Mark; David, Robert; Davies, Joy; Delehaunty, Kimberly D.; Deri, Justin; Dermitzakis, Emmanouil T.; Dewey, Colin; Dickens, Nicholas J.; Diekhans, Mark; Dodge, Sheila; Dubchak, Inna; Dunn, Diane M.; Eddy, Sean R.; Elnitski, Laura; Emes, Richard D.; Eswara, Pallavi; Eyras, Eduardo; Felsenfeld, Adam; Fewell, Ginger A.; Flicek, Paul; Foley, Karen; Frankel, Wayne N.; Fulton, Lucinda A.; Fulton, Robert S.; Furey, Terrence S.; Gage, Diane; Gibbs, Richard A.; Glusman, Gustavo; Gnerre, Sante; Goldman, Nick; Goodstadt, Leo; Grafham, Darren; Graves, Tina A.; Green, Eric D.; Gregory, Simon; Guigo, Roderic; Guyer, Mark; Hardison, Ross C.; Haussler, David; Hayashizaki, Yoshihide; Hillier, LaDeana W.; Hinrichs, Angela; Hlavina, Wratko; Holzer, Timothy; Hsu, Fan; Hua, Axin; Hubbard, Tim; Hunt, Adrienne; Jackson, Ian; Jaffe, David B.; Johnson, L. Steven; Jones, Matthew; Jones, Thomas A.; Joy, Ann; Kamal, Michael; Karlsson, Elinor K.; Karolchik, Donna; Kasprzyk, Arkadiusz; Kawai, Jun; Keibler, Evan; Kells, Cristyn; Kent, W. James; Kirby, Andrew; Kolbe, Diana L.; Korf, Ian; Kucherlapati, Raju S.; Kulbokas III, Edward J.; Kulp, David; Landers, Tom; Leger, J.P.; Leonard, Steven; Letunic, Ivica; Levine, Rosie; et al.

    2002-12-15

    The sequence of the mouse genome is a key informational tool for understanding the contents of the human genome and a key experimental tool for biomedical research. Here, we report the results of an international collaboration to produce a high-quality draft sequence of the mouse genome. We also present an initial comparative analysis of the mouse and human genomes, describing some of the insights that can be gleaned from the two sequences. We discuss topics including the analysis of the evolutionary forces shaping the size, structure and sequence of the genomes; the conservation of large-scale synteny across most of the genomes; the much lower extent of sequence orthology covering less than half of the genomes; the proportions of the genomes under selection; the number of protein-coding genes; the expansion of gene families related to reproduction and immunity; the evolution of proteins; and the identification of intraspecies polymorphism.

  8. Computational Methods for the Analysis of Array Comparative Genomic Hybridization

    PubMed Central

    Chari, Raj; Lockwood, William W.; Lam, Wan L.

    2006-01-01

    Array comparative genomic hybridization (array CGH) is a technique for assaying the copy number status of cancer genomes. The widespread use of this technology has led to a rapid accumulation of high throughput data, which in turn has prompted the development of computational strategies for the analysis of array CGH data. Here we explain the principles behind array image processing, data visualization and genomic profile analysis, review currently available software packages, and raise considerations for future software development. PMID:17992253

  9. Comparative DNA Sequence Analysis of Wheat and Rice Genomes

    PubMed Central

    Sorrells, Mark E.; La Rota, Mauricio; Bermudez-Kandianis, Catherine E.; Greene, Robert A.; Kantety, Ramesh; Munkvold, Jesse D.; Miftahudin; Mahmoud, Ahmed; Ma, Xuefeng; Gustafson, Perry J.; Qi, Lili L.; Echalier, Benjamin; Gill, Bikram S.; Matthews, David E.; Lazo, Gerard R.; Chao, Shiaoman; Anderson, Olin D.; Edwards, Hugh; Linkiewicz, Anna M.; Dubcovsky, Jorge; Akhunov, Eduard D.; Dvorak, Jan; Zhang, Deshui; Nguyen, Henry T.; Peng, Junhua; Lapitan, Nora L.V.; Gonzalez-Hernandez, Jose L.; Anderson, James A.; Hossain, Khwaja; Kalavacharla, Venu; Kianian, Shahryar F.; Choi, Dong-Woog; Close, Timothy J.; Dilbirligi, Muharrem; Gill, Kulvinder S.; Steber, Camille; Walker-Simmons, Mary K.; McGuire, Patrick E.; Qualset, Calvin O.

    2003-01-01

    The use of DNA sequence-based comparative genomics for evolutionary studies and for transferring information from model species to crop species has revolutionized molecular genetics and crop improvement strategies. This study compared 4485 expressed sequence tags (ESTs) that were physically mapped in wheat chromosome bins, to the public rice genome sequence data from 2251 ordered BAC/PAC clones using BLAST. A rice genome view of homologous wheat genome locations based on comparative sequence analysis revealed numerous chromosomal rearrangements that will significantly complicate the use of rice as a model for cross-species transfer of information in nonconserved regions. PMID:12902377

  10. Comparative Genome Analysis of Basidiomycete Fungi

    SciTech Connect

    Riley, Robert; Salamov, Asaf; Morin, Emmanuelle; Nagy, Laszlo; Manning, Gerard; Baker, Scott; Brown, Daren; Henrissat, Bernard; Levasseur, Anthony; Hibbett, David; Martin, Francis; Grigoriev, Igor

    2012-03-19

    Fungi of the phylum Basidiomycota (basidiomycetes) make up some 37 percent of the described fungi, and are important in forestry, agriculture, medicine, and bioenergy. This diverse phylum includes the mushrooms, wood rots, symbionts, and plant and animal pathogens. To better understand the diversity of phenotypes in basidiomycetes, we performed a comparative analysis of 35 basidiomycete fungi spanning the diversity of the phylum. Phylogenetic patterns of lignocellulose degrading genes suggest a continuum rather than a sharp dichotomy between the white rot and brown rot modes of wood decay. Patterns of secondary metabolic enzymes give additional insight into the broad array of phenotypes found in the basidiomycetes. We suggest that the profile of an organism in lignocellulose-targeting genes can be used to predict its nutritional mode, and predict Dacryopinax sp. as a brown rot; Botryobasidium botryosum and Jaapia argillacea as white rots.

  11. Comparative genomics - a perspective.

    PubMed

    Sivashankari, Selvarajan; Shanmughavel, Piramanayagam

    2007-03-27

    The rapidly emerging field of comparative genomics has yielded dramatic results. Comparative genome analysis has become feasible with the availability of a number of completely sequenced genomes. Comparison of complete genomes between organisms allows for global views on genome evolution, and the availability of many completely sequenced genomes increases the predictive power in deciphering the hidden information in genome design, function and evolution. Thus, comparison of human genes with genes from other genomes in a genomic landscape could help assign novel functions for un-annotated genes. Here, we discuss the recently used techniques for comparative genomics and their derived inferences in genome biology.

  12. Comparative genomics - A perspective

    PubMed Central

    Sivashankari, Selvarajan; Shanmughavel, Piramanayagam

    2007-01-01

    The rapidly emerging field of comparative genomics has yielded dramatic results. Comparative genome analysis has become feasible with the availability of a number of completely sequenced genomes. Comparison of complete genomes between organisms allows for global views on genome evolution, and the availability of many completely sequenced genomes increases the predictive power in deciphering the hidden information in genome design, function and evolution. Thus, comparison of human genes with genes from other genomes in a genomic landscape could help assign novel functions for un-annotated genes. Here, we discuss the recently used techniques for comparative genomics and their derived inferences in genome biology. PMID:17597925

  13. Comparative Genomics via Wavelet Analysis for Closely Related Bacteria

    NASA Astrophysics Data System (ADS)

    Song, Jiuzhou; Ware, Tony; Liu, Shu-Lin; Surette, M.

    2004-12-01

    Comparative genomics has been a valuable method for extracting and extrapolating genome information among closely related bacteria. The efficiency of the traditional methods is strongly influenced by the software method used. To overcome this problem, we propose using wavelet analysis to perform comparative genomics. First, global comparison using wavelet analysis gives the difference at a quantitative level. Then local comparison using keto-excess or purine-excess plots shows precise positions of inversions, translocations, and horizontally transferred DNA fragments. We found for the first time that the level of energy spectra difference is related to the similarity of bacterial strains; it could be a quantitative index to describe the similarities of genomes. The strategy is described in detail by comparisons of closely related strains: S.typhi CT18, S.typhi Ty2, S.typhimurium LT2, H.pylori 26695, and H.pylori J99.
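
    The keto-excess and purine-excess plots mentioned in this abstract can be sketched in a few lines of Python. The sketch assumes the commonly used cumulative definitions: purine excess adds 1 for A or G and subtracts 1 for C or T, and keto excess adds 1 for G or T and subtracts 1 for A or C. The abstract does not spell out the authors' exact procedure, the function names are hypothetical, and the wavelet step is not covered.

```python
from itertools import accumulate

PURINES = frozenset("AG")     # A and G are purines; C and T are pyrimidines
KETO_BASES = frozenset("GT")  # G and T are keto bases; A and C are amino bases


def excess_walk(sequence, plus_bases):
    """Cumulative walk: +1 for bases in plus_bases, -1 for other A/C/G/T, 0 otherwise."""
    steps = []
    for base in sequence.upper():
        if base in plus_bases:
            steps.append(1)
        elif base in "ACGT":
            steps.append(-1)
        else:
            steps.append(0)  # ambiguous symbols such as N leave the walk unchanged
    return list(accumulate(steps))


def purine_excess(sequence):
    """Cumulative (purines minus pyrimidines) along the sequence."""
    return excess_walk(sequence, PURINES)


def keto_excess(sequence):
    """Cumulative (keto minus amino bases) along the sequence."""
    return excess_walk(sequence, KETO_BASES)


if __name__ == "__main__":
    toy = "AGCTGGTTAACC"  # hypothetical toy sequence, not a real genome
    print(purine_excess(toy))
    print(keto_excess(toy))
```

    Plotting these walks against position for two related genomes is one way to make the changes of slope visible at the positions the abstract associates with inversions, translocations, and horizontally transferred fragments.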

  14. Mycobacterial species as case-study of comparative genome analysis.

    PubMed

    Zakham, F; Belayachi, L; Ussery, D; Akrim, M; Benjouad, A; El Aouad, R; Ennaji, M M

    2011-02-08

    The genus Mycobacterium comprises more than 120 species, including important human pathogens that cause major public health problems and illnesses. Further, with more than 100 genome sequences from this genus, comparative genome analysis can provide new insights for better understanding the evolutionary events of these species and improving drugs, vaccines, and diagnostic tools for controlling Mycobacterial diseases. In the present study we aim to outline a comparative genome analysis of fourteen Mycobacterial genomes: M. avium subsp. paratuberculosis K-10, M. bovis AF2122/97, M. bovis BCG str. Pasteur 1173P2, M. leprae Br4923, M. marinum M, M. sp. KMS, M. sp. MCS, M. tuberculosis CDC1551, M. tuberculosis F11, M. tuberculosis H37Ra, M. tuberculosis H37Rv, M. tuberculosis KZN 1435, M. ulcerans Agy99, and M. vanbaalenii PYR-1. For this purpose, the genomes were compared by length, GC content, and number of genes in different databases (GenBank, RefSeq, and Prodigal). A BLAST matrix of these genomes was constructed to summarize the similarity between species in a simple scheme. As a result of multiple genome analysis, the pan and core genomes have been defined for twelve Mycobacterial species. We also present the genome atlas of the reference strain M. tuberculosis H37Rv, which gives a good overview of this genome. To examine the phylogenetic relationships among these bacteria, a phylogenetic tree was constructed from the 16S rRNA gene for tuberculosis and non-tuberculosis Mycobacteria to understand the evolutionary events of these species.
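
    Two of the quantities named in this abstract, GC content and the pan and core genome, reduce to simple computations once sequences and gene-family assignments are available. The Python sketch below is illustrative only: it assumes each genome has already been reduced to a set of gene-family identifiers (in practice these come from a clustering step such as BLAST-based grouping), and the strain names, family identifiers, and function names are hypothetical.

```python
def gc_content(sequence):
    """Fraction of G and C among the unambiguous bases (A, C, G, T) of a sequence."""
    seq = sequence.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / counted


def pan_and_core(genomes):
    """Return (pan, core) for a mapping of genome name -> set of gene-family IDs.

    The pan genome is the union of all families; the core genome is the
    intersection, i.e. the families present in every genome.
    """
    family_sets = [set(families) for families in genomes.values()]
    if not family_sets:
        return set(), set()
    pan = set().union(*family_sets)
    core = family_sets[0].intersection(*family_sets[1:])
    return pan, core


if __name__ == "__main__":
    # Hypothetical gene-family assignments for three made-up strains.
    genomes = {
        "strain_A": {"fam1", "fam2", "fam3"},
        "strain_B": {"fam2", "fam3", "fam4"},
        "strain_C": {"fam2", "fam3", "fam5"},
    }
    pan, core = pan_and_core(genomes)
    print(sorted(pan), sorted(core))
    print(gc_content("GGCCAATT"))
```

    Adding genomes can only grow the pan genome and shrink the core genome, which is why the two are usually reported together.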

  15. Cytogenetic analysis from DNA by comparative genomic hybridization.

    PubMed

    Tachdjian, G; Aboura, A; Lapierre, J M; Viguié, F

    2000-01-01

    Comparative genomic hybridization (CGH) is a modified in situ hybridization technique which allows detection and mapping of DNA sequence copy differences between two genomes in a single experiment. In CGH analysis, two differentially labelled genomic DNA (study and reference) are co-hybridized to normal metaphase spreads. Chromosomal locations of copy number changes in the DNA segments of the study genome are revealed by a variable fluorescence intensity ratio along each target chromosome. Since its development, CGH has been applied mostly as a research tool in the field of cancer cytogenetics to identify genetic changes in many previously unknown regions. CGH may also have a role in clinical cytogenetics for detection and identification of unbalanced chromosomal abnormalities.

  16. Comparative analysis of methods for genome-wide nucleosome cartography.

    PubMed

    Quintales, Luis; Vázquez, Enrique; Antequera, Francisco

    2015-07-01

    Nucleosomes contribute to compacting the genome into the nucleus and regulate the physical access of regulatory proteins to DNA either directly or through the epigenetic modifications of the histone tails. Precise mapping of nucleosome positioning across the genome is, therefore, essential to understanding the genome regulation. In recent years, several experimental protocols have been developed for this purpose that include the enzymatic digestion, chemical cleavage or immunoprecipitation of chromatin followed by next-generation sequencing of the resulting DNA fragments. Here, we compare the performance and resolution of these methods from the initial biochemical steps through the alignment of the millions of short-sequence reads to a reference genome to the final computational analysis to generate genome-wide maps of nucleosome occupancy. Because of the lack of a unified protocol to process data sets obtained through the different approaches, we have developed a new computational tool (NUCwave), which facilitates their analysis, comparison and assessment and will enable researchers to choose the most suitable method for any particular purpose. NUCwave is freely available at http://nucleosome.usal.es/nucwave along with a step-by-step protocol for its use. PMID:25296770

  17. Comparative analysis of methods for genome-wide nucleosome cartography.

    PubMed

    Quintales, Luis; Vázquez, Enrique; Antequera, Francisco

    2015-07-01

    Nucleosomes contribute to compacting the genome into the nucleus and regulate the physical access of regulatory proteins to DNA either directly or through the epigenetic modifications of the histone tails. Precise mapping of nucleosome positioning across the genome is, therefore, essential to understanding the genome regulation. In recent years, several experimental protocols have been developed for this purpose that include the enzymatic digestion, chemical cleavage or immunoprecipitation of chromatin followed by next-generation sequencing of the resulting DNA fragments. Here, we compare the performance and resolution of these methods from the initial biochemical steps through the alignment of the millions of short-sequence reads to a reference genome to the final computational analysis to generate genome-wide maps of nucleosome occupancy. Because of the lack of a unified protocol to process data sets obtained through the different approaches, we have developed a new computational tool (NUCwave), which facilitates their analysis, comparison and assessment and will enable researchers to choose the most suitable method for any particular purpose. NUCwave is freely available at http://nucleosome.usal.es/nucwave along with a step-by-step protocol for its use.

  18. Phylogeny and comparative genome analysis of a Basidiomycete fungi

    SciTech Connect

    Riley, Robert W.; Salamov, Asaf; Grigoriev, Igor; Hibbett, David

    2011-03-14

    Fungi of the phylum Basidiomycota make up some 37 percent of the described fungi, and are important from the perspectives of forestry, agriculture, medicine, and bioenergy. This diverse phylum includes the mushrooms, wood rots, plant pathogenic rusts and smuts, and some human pathogens. To better understand these important fungi, we have undertaken a comparative genomic analysis of the Basidiomycetes with available sequenced genomes. We report a phylogeny that sheds light on previously unclear evolutionary relationships among the Basidiomycetes. We also define a 'core proteome' based on protein families conserved in all Basidiomycetes. We identify key expansions and contractions in protein families that may be responsible for the degradation of plant biomass such as cellulose, hemicellulose, and lignin. Finally, we speculate as to the genomic changes that drove such expansions and contractions.

  19. The Chlamydia psittaci Genome: A Comparative Analysis of Intracellular Pathogens

    PubMed Central

    Saluz, Hans Peter

    2012-01-01

    Background Chlamydiaceae are a family of obligate intracellular pathogens causing a wide range of diseases in animals and humans, and facing unique evolutionary constraints not encountered by free-living prokaryotes. To investigate genomic aspects of infection, virulence and host preference we have sequenced Chlamydia psittaci, the pathogenic agent of ornithosis. Results A comparison of the genome of the avian Chlamydia psittaci isolate 6BC with the genomes of other chlamydial species, C. trachomatis, C. muridarum, C. pneumoniae, C. abortus, C. felis and C. caviae, revealed a high level of sequence conservation and synteny across taxa, with the major exception of the human pathogen C. trachomatis. Important differences manifest in the polymorphic membrane protein family specific for the Chlamydiae and in the highly variable chlamydial plasticity zone. We identified a number of psittaci-specific polymorphic membrane proteins of the G family that may be related to differences in host-range and/or virulence as compared to closely related Chlamydiaceae. We calculated non-synonymous to synonymous substitution rate ratios for pairs of orthologous genes to identify putative targets of adaptive evolution and predicted type III secreted effector proteins. Conclusions This study is the first detailed analysis of the Chlamydia psittaci genome sequence. It provides insights into the genome architecture of C. psittaci and proposes a number of novel candidate genes mostly of yet unknown function that may be important for pathogen-host interactions. PMID:22506068

  20. Comparative Analysis of Genome Sequences Covering the Seven Cronobacter Species

    PubMed Central

    Cummings, Craig A.; Shih, Rita; Degoricija, Lovorka; Rico, Alain; Brzoska, Pius; Hamby, Stephen E.; Masood, Naqash; Hariri, Sumyya; Sonbol, Hana; Chuzhanova, Nadia; McClelland, Michael; Furtado, Manohar R.; Forsythe, Stephen J.

    2012-01-01

    Background Species of Cronobacter are widespread in the environment and are occasional food-borne pathogens associated with serious neonatal diseases, including bacteraemia, meningitis, and necrotising enterocolitis. The genus is composed of seven species: C. sakazakii, C. malonaticus, C. turicensis, C. dublinensis, C. muytjensii, C. universalis, and C. condimenti. Clinical cases are associated with three species, C. malonaticus, C. turicensis and, in particular, with C. sakazakii multilocus sequence type 4. Thus, it is plausible that virulence determinants have evolved in certain lineages. Methodology/Principal Findings We generated high quality sequence drafts for eleven Cronobacter genomes representing the seven Cronobacter species, including an ST4 strain of C. sakazakii. Comparative analysis of these genomes together with the two publicly available genomes revealed Cronobacter has over 6,000 genes in one or more strains and over 2,000 genes shared by all Cronobacter. Considerable variation in the presence of traits such as type six secretion systems, metal resistance (tellurite, copper and silver), and adhesins were found. C. sakazakii is unique in the Cronobacter genus in encoding genes enabling the utilization of exogenous sialic acid which may have clinical significance. The C. sakazakii ST4 strain 701 contained additional genes as compared to other C. sakazakii but none of them were known specific virulence-related genes. Conclusions/Significance Genome comparison revealed that pair-wise DNA sequence identity varies between 89 and 97% in the seven Cronobacter species, and also suggested various degrees of divergence. Sets of universal core genes and accessory genes unique to each strain were identified. These gene sequences can be used for designing genus/species specific detection assays. 
Genes encoding adhesins, T6SS, and metal resistance genes as well as prophages are found in only subsets of genomes and have contributed considerably to the variation of

  1. Complete genome sequencing and comparative genomic analysis of functionally diverse Lysinibacillus sphaericus III(3)7.

    PubMed

    Rey, Andrés; Silva-Quintero, Laura; Dussán, Jenny

    2016-09-01

    Lysinibacillus sphaericus III(3)7 is a native Colombian strain, the first one isolated from soil samples. This strain has shown high levels of pathogenic activity against Culex quinquefasciatus larvae in laboratory assays compared to other members of the same species. Using Pacific Biosciences sequencing technology we sequenced, annotated (de novo) and described the genome of strain III(3)7, achieving a complete genome sequence status. We then performed a comparative analysis between the newly sequenced genome and the ones previously reported for Colombian isolates L. sphaericus OT4b.31, CBAM5 and OT4b.25, with the inclusion of L. sphaericus C3-41, which has been used as a reference genome for most previous genome sequencing projects. We concluded that L. sphaericus III(3)7 is highly similar to strain OT4b.25 and shares high levels of synteny with isolates CBAM5 and C3-41. PMID:27419068

  2. Comparative analysis of genomic signal processing for microarray data clustering.

    PubMed

    Istepanian, Robert S H; Sungoor, Ala; Nebel, Jean-Christophe

    2011-12-01

    Genomic signal processing is a new area of research that combines advanced digital signal processing methodologies for enhanced genetic data analysis. It has many promising applications in bioinformatics and next generation of healthcare systems, in particular, in the field of microarray data clustering. In this paper we present a comparative performance analysis of enhanced digital spectral analysis methods for robust clustering of gene expression across multiple microarray data samples. Three digital signal processing methods: linear predictive coding, wavelet decomposition, and fractal dimension are studied to provide a comparative evaluation of the clustering performance of these methods on several microarray datasets. The results of this study show that the fractal approach provides the best clustering accuracy compared to other digital signal processing and well known statistical methods.

  3. Comparative genomic analysis of seven Mycoplasma hyosynoviae strains

    PubMed Central

    Bumgardner, Eric A; Kittichotirat, Weerayuth; Bumgarner, Roger E; Lawrence, Paulraj K

    2015-01-01

    Infection with Mycoplasma hyosynoviae can result in debilitating arthritis in pigs, particularly those aged 10 weeks or older. Strategies for controlling this pathogen are becoming increasingly important due to the rise in the number of cases of arthritis that have been attributed to infection in recent years. In order to begin to develop interventions to prevent arthritis caused by M. hyosynoviae, more information regarding the specific proteins and potential virulence factors that its genome encodes was needed. However, the genome of this emerging swine pathogen had not been sequenced previously. In this report, we present a comparative analysis of the genomes of seven strains of M. hyosynoviae isolated from different locations in North America during the years 2010 to 2013. We identified several putative virulence factors that may contribute to the ability of this pathogen to adhere to host cells. Additionally, we discovered several prophage genes present within the genomes of three strains that show significant similarity to MAV1, a phage isolated from the related species, M. arthritidis. We also identified CRISPR-Cas and type III restriction and modification systems present in two strains that may contribute to their ability to defend against phage infection. PMID:25693846

  4. Comparative analysis of Acinetobacters: three genomes for three lifestyles.

    PubMed

    Vallenet, David; Nordmann, Patrice; Barbe, Valérie; Poirel, Laurent; Mangenot, Sophie; Bataille, Elodie; Dossat, Carole; Gas, Shahinaz; Kreimeyer, Annett; Lenoble, Patricia; Oztas, Sophie; Poulain, Julie; Segurens, Béatrice; Robert, Catherine; Abergel, Chantal; Claverie, Jean-Michel; Raoult, Didier; Médigue, Claudine; Weissenbach, Jean; Cruveiller, Stéphane

    2008-03-19

    Acinetobacter baumannii is the source of numerous nosocomial infections in humans and therefore deserves close attention as multidrug or even pandrug resistant strains are increasingly being identified worldwide. Here we report the comparison of two newly sequenced genomes of A. baumannii. The human isolate A. baumannii AYE is multidrug resistant whereas strain SDF, which was isolated from body lice, is antibiotic susceptible. As reference for comparison in this analysis, the genome of the soil-living bacterium A. baylyi strain ADP1 was used. The most interesting dissimilarities we observed were that i) whereas strain AYE and A. baylyi genomes harbored very few Insertion Sequence elements which could promote expression of downstream genes, strain SDF sequence contains several hundred of them that have played a crucial role in its genome reduction (gene disruptions and simple DNA loss); ii) strain SDF has low catabolic capacities compared to strain AYE. Interestingly, the latter has even higher catabolic capacities than A. baylyi which has already been reported as a very nutritionally versatile organism. This metabolic performance could explain the persistence of A. baumannii nosocomial strains in environments where nutrients are scarce; iii) several processes known to play a key role during host infection (biofilm formation, iron uptake, quorum sensing, virulence factors) were either different or absent, the best example of which is iron uptake. Indeed, strain AYE and A. baylyi use siderophore-based systems to scavenge iron from the environment whereas strain SDF uses an alternate system similar to the Haem Acquisition System (HAS). Taken together, all these observations suggest that the genome contents of the 3 Acinetobacters compared are partly shaped by life in distinct ecological niches: human (and more largely hospital environment), louse, soil.

  5. Industrial Acetogenic Biocatalysts: A Comparative Metabolic and Genomic Analysis.

    PubMed

    Bengelsdorf, Frank R; Poehlein, Anja; Linder, Sonja; Erz, Catarina; Hummel, Tim; Hoffmeister, Sabrina; Daniel, Rolf; Dürre, Peter

    2016-01-01

    Synthesis gas (syngas) fermentation by anaerobic acetogenic bacteria employing the Wood-Ljungdahl pathway is a bioprocess for production of biofuels and biocommodities. The major fermentation products of the most relevant biocatalytic strains (Clostridium ljungdahlii, C. autoethanogenum, C. ragsdalei, and C. coskatii) are acetic acid and ethanol. A comparative metabolic and genomic analysis using the mentioned biocatalysts might offer targets for metabolic engineering and thus improve the production of compounds apart from ethanol. Autotrophic growth and product formation of the four wild type (WT) strains were compared in uncontrolled batch experiments. The genomes of C. ragsdalei and C. coskatii were sequenced and the genome sequences of all four biocatalytic strains analyzed in comparative manner. Growth and product spectra (acetate, ethanol, 2,3-butanediol) of C. autoethanogenum, C. ljungdahlii, and C. ragsdalei were rather similar. In contrast, C. coskatii produced significantly less ethanol and its genome sequence lacks two genes encoding aldehyde:ferredoxin oxidoreductases (AOR). Comparative genome sequence analysis of the four WT strains revealed high average nucleotide identity (ANI) of C. ljungdahlii and C. autoethanogenum (99.3%) and C. coskatii (98.3%). In contrast, C. ljungdahlii WT and C. ragsdalei WT showed an ANI-based similarity of only 95.8%. Additionally, recombinant C. ljungdahlii strains were constructed that harbor an artificial acetone synthesis operon (ASO) consisting of the following genes: adc, ctfA, ctfB, and thlA (encoding acetoacetate decarboxylase, acetoacetyl-CoA:acetate/butyrate:CoA-transferase subunits A and B, and thiolase) under the control of thlA promoter (PthlA) from C. acetobutylicum or native pta-ack promoter (Ppta-ack) from C. ljungdahlii. Respective recombinant strains produced 2-propanol rather than acetone, due to the presence of a NADPH-dependent primary-secondary alcohol dehydrogenase that converts acetone to 2

  6. Industrial Acetogenic Biocatalysts: A Comparative Metabolic and Genomic Analysis

    PubMed Central

    Bengelsdorf, Frank R.; Poehlein, Anja; Linder, Sonja; Erz, Catarina; Hummel, Tim; Hoffmeister, Sabrina; Daniel, Rolf; Dürre, Peter

    2016-01-01

    Synthesis gas (syngas) fermentation by anaerobic acetogenic bacteria employing the Wood–Ljungdahl pathway is a bioprocess for production of biofuels and biocommodities. The major fermentation products of the most relevant biocatalytic strains (Clostridium ljungdahlii, C. autoethanogenum, C. ragsdalei, and C. coskatii) are acetic acid and ethanol. A comparative metabolic and genomic analysis using the mentioned biocatalysts might offer targets for metabolic engineering and thus improve the production of compounds apart from ethanol. Autotrophic growth and product formation of the four wild type (WT) strains were compared in uncontrolled batch experiments. The genomes of C. ragsdalei and C. coskatii were sequenced and the genome sequences of all four biocatalytic strains analyzed in comparative manner. Growth and product spectra (acetate, ethanol, 2,3-butanediol) of C. autoethanogenum, C. ljungdahlii, and C. ragsdalei were rather similar. In contrast, C. coskatii produced significantly less ethanol and its genome sequence lacks two genes encoding aldehyde:ferredoxin oxidoreductases (AOR). Comparative genome sequence analysis of the four WT strains revealed high average nucleotide identity (ANI) of C. ljungdahlii and C. autoethanogenum (99.3%) and C. coskatii (98.3%). In contrast, C. ljungdahlii WT and C. ragsdalei WT showed an ANI-based similarity of only 95.8%. Additionally, recombinant C. ljungdahlii strains were constructed that harbor an artificial acetone synthesis operon (ASO) consisting of the following genes: adc, ctfA, ctfB, and thlA (encoding acetoacetate decarboxylase, acetoacetyl-CoA:acetate/butyrate:CoA-transferase subunits A and B, and thiolase) under the control of thlA promoter (PthlA) from C. acetobutylicum or native pta-ack promoter (Ppta-ack) from C. ljungdahlii. Respective recombinant strains produced 2-propanol rather than acetone, due to the presence of a NADPH-dependent primary-secondary alcohol dehydrogenase that converts acetone to 2

  8. Comparative analysis of essential genes in prokaryotic genomic islands.

    PubMed

    Zhang, Xi; Peng, Chong; Zhang, Ge; Gao, Feng

    2015-07-30

    Essential genes are thought to encode proteins that carry out the basic functions needed to sustain cellular life, and genomic islands (GIs) usually contain clusters of horizontally transferred genes. It has been assumed that essential genes are not likely to be located in GIs, but a systematic analysis of essential genes in GIs has not been performed before. Here, we have analyzed the essential genes in 28 prokaryotes by a statistical method and concluded that essential genes are significantly less frequent in GIs than outside GIs. The function of 362 essential genes found in GIs has been explored further by BLAST against the Virulence Factor Database (VFDB) and the phage/prophage sequence database of the PHAge Search Tool (PHAST). Consequently, 64 and 60 eligible essential genes are found to share sequence similarity with virulence factors and phage/prophage-related genes, respectively. Meanwhile, we find several toxin-related proteins and repressors encoded by these essential genes in GIs. The comparative analysis of essential genes in genomic islands will not only shed new light on the development of prediction algorithms for essential genes, but also give a clue to the functionality of essential genes in genomic islands.
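
    The abstract above does not name the statistical method used. As a minimal sketch of one way such a depletion claim could be checked, the Python below runs a one-sided hypergeometric test. The choice of test, the function name hypergeom_lower_tail, and all gene counts are illustrative assumptions and are not taken from the paper.

```python
from math import comb


def hypergeom_lower_tail(k, K, n, N):
    """Return P(X <= k) for X ~ Hypergeometric(N, K, n).

    N: total genes in the genome
    K: essential genes in the genome
    n: genes located inside genomic islands (GIs)
    k: essential genes observed inside GIs
    """
    total = comb(N, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k + 1))
    return favourable / total


# Hypothetical counts for a single genome (NOT data from the study):
# 4000 genes, 300 essential, 400 inside GIs, 10 essential inside GIs.
# Under independence about 30 essential genes would be expected inside GIs.
p_value = hypergeom_lower_tail(k=10, K=300, n=400, N=4000)
print(f"one-sided p-value for depletion: {p_value:.3g}")
```

    A small p-value here would indicate fewer essential genes inside GIs than expected by chance. A study covering 28 genomes would also need to combine or correct results across genomes, which this sketch does not attempt.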

  9. A Comparative Analysis of Mitochondrial Genomes in Eustigmatophyte Algae

    PubMed Central

    Ševčíková, Tereza; Klimeš, Vladimír; Zbránková, Veronika; Strnad, Hynek; Hroudová, Miluše; Vlček, Čestmír; Eliáš, Marek

    2016-01-01

    Eustigmatophyceae (Ochrophyta, Stramenopiles) is a small algal group with species of the genus Nannochloropsis being its best studied representatives. Nuclear and organellar genomes have been recently sequenced for several Nannochloropsis spp., but phylogenetically wider genomic studies are missing for eustigmatophytes. We sequenced mitochondrial genomes (mitogenomes) of three species representing most major eustigmatophyte lineages, Monodopsis sp. MarTras21, Vischeria sp. CAUP Q 202 and Trachydiscus minutus, and carried out their comparative analysis in the context of available data from Nannochloropsis and other stramenopiles, revealing a number of noticeable findings. First, mitogenomes of most eustigmatophytes are highly collinear and similar in the gene content, but extensive rearrangements and loss of three otherwise ubiquitous genes happened in the Vischeria lineage; this correlates with an accelerated evolution of mitochondrial gene sequences in this lineage. Second, eustigmatophytes appear to be the only ochrophyte group with the Atp1 protein encoded by the mitogenome. Third, eustigmatophyte mitogenomes uniquely share a truncated nad11 gene encoding only the C-terminal part of the Nad11 protein, while the N-terminal part is encoded by a separate gene in the nuclear genome. Fourth, UGA as a termination codon and the cognate release factor mRF2 were lost from mitochondria independently by the Nannochloropsis and T. minutus lineages. Finally, the rps3 gene in the mitogenome of Vischeria sp. is interrupted by the UAG codon, but the genome includes a gene for an unusual tRNA with an extended anticodon loop that we speculate may serve as a suppressor tRNA to properly decode the rps3 gene. PMID:26872774

  10. A Comparative Analysis of Mitochondrial Genomes in Eustigmatophyte Algae.

    PubMed

    Ševčíková, Tereza; Klimeš, Vladimír; Zbránková, Veronika; Strnad, Hynek; Hroudová, Miluše; Vlček, Čestmír; Eliáš, Marek

    2016-03-01

    Eustigmatophyceae (Ochrophyta, Stramenopiles) is a small algal group with species of the genus Nannochloropsis being its best studied representatives. Nuclear and organellar genomes have been recently sequenced for several Nannochloropsis spp., but phylogenetically wider genomic studies are missing for eustigmatophytes. We sequenced mitochondrial genomes (mitogenomes) of three species representing most major eustigmatophyte lineages, Monodopsis sp. MarTras21, Vischeria sp. CAUP Q 202 and Trachydiscus minutus, and carried out their comparative analysis in the context of available data from Nannochloropsis and other stramenopiles, revealing a number of noticeable findings. First, mitogenomes of most eustigmatophytes are highly collinear and similar in the gene content, but extensive rearrangements and loss of three otherwise ubiquitous genes happened in the Vischeria lineage; this correlates with an accelerated evolution of mitochondrial gene sequences in this lineage. Second, eustigmatophytes appear to be the only ochrophyte group with the Atp1 protein encoded by the mitogenome. Third, eustigmatophyte mitogenomes uniquely share a truncated nad11 gene encoding only the C-terminal part of the Nad11 protein, while the N-terminal part is encoded by a separate gene in the nuclear genome. Fourth, UGA as a termination codon and the cognate release factor mRF2 were lost from mitochondria independently by the Nannochloropsis and T. minutus lineages. Finally, the rps3 gene in the mitogenome of Vischeria sp. is interrupted by the UAG codon, but the genome includes a gene for an unusual tRNA with an extended anticodon loop that we speculate may serve as a suppressor tRNA to properly decode the rps3 gene. PMID:26872774

  11. Genome-wide Comparative Analysis of Annexin Superfamily in Plants

    PubMed Central

    Jami, Sravan Kumar; Clark, Greg B.; Ayele, Belay T.; Ashe, Paula; Kirti, Pulugurtha Bharadwaja

    2012-01-01

    Most annexins are calcium-dependent, phospholipid-binding proteins with suggested functions in response to environmental stresses and signaling during plant growth and development. They have previously been identified and characterized in Arabidopsis and rice, and constitute a multigene family in plants. In this study, we performed a comparative analysis of annexin gene families in the sequenced genomes of Viridiplantae ranging from unicellular green algae to multicellular plants, and identified 149 genes. Phylogenetic studies of these deduced annexins classified them into nine different arbitrary groups. The occurrence and distribution of bona fide type II calcium binding sites within the four annexin domains were found to be different in each of these groups. Analysis of chromosomal distribution of annexin genes in rice, Arabidopsis and poplar revealed their localization on various chromosomes with some members also found on duplicated chromosomal segments leading to gene family expansion. Analysis of gene structure suggests sequential or differential loss of introns during the evolution of land plant annexin genes. Intron positions and phases are well conserved in annexin genes from representative genomes ranging from Physcomitrella to higher plants. The occurrence of alternative motifs such as K/R/HGD was found to be overlapping or at the mutated regions of the type II calcium binding sites indicating potential functional divergence in certain plant annexins. This study provides a basis for further functional analysis and characterization of annexin multigene families in the plant lineage. PMID:23133603

  12. Comparative genomic analysis of ten Streptococcus pneumoniae temperate bacteriophages.

    PubMed

    Romero, Patricia; Croucher, Nicholas J; Hiller, N Luisa; Hu, Fen Z; Ehrlich, Garth D; Bentley, Stephen D; García, Ernesto; Mitchell, Tim J

    2009-08-01

    Streptococcus pneumoniae is an important human pathogen that often carries temperate bacteriophages. As part of a program to characterize the genetic makeup of prophages associated with clinical strains and to assess the potential roles that they play in the biology and pathogenesis in their host, we performed comparative genomic analysis of 10 temperate pneumococcal phages. All of the genomes are organized into five major gene clusters: lysogeny, replication, packaging, morphogenesis, and lysis clusters. All of the phage particles observed showed a Siphoviridae morphology. The only genes that are well conserved in all the genomes studied are those involved in the integration and the lysis of the host in addition to two genes, of unknown function, within the replication module. We observed that a high percentage of the open reading frames contained no similarities to any sequences catalogued in public databases; however, genes that were homologous to known phage virulence genes, including the pblB gene of Streptococcus mitis and the vapE gene of Dichelobacter nodosus, were also identified. Interestingly, bioinformatic tools showed the presence of a toxin-antitoxin system in the phage phiSpn_6, and this represents the first time that an addiction system in a pneumophage has been identified. Collectively, the temperate pneumophages contain a diverse set of genes with various levels of similarity among them. PMID:19502408

  13. Comparative analysis of rosaceous genomes and the reconstruction of a putative ancestral genome for the family

    PubMed Central

    2011-01-01

    Background: Comparative genome mapping studies in Rosaceae have been conducted until now by aligning genetic maps within the same genus, or closely related genera and using a limited number of common markers. The growing body of genomics resources and sequence data for both Prunus and Fragaria permits detailed comparisons between these genera and the recently released Malus × domestica genome sequence. Results: We generated a comparative analysis using 806 molecular markers that are anchored genetically to the Prunus and/or Fragaria reference maps, and physically to the Malus genome sequence. Markers in common for Malus and Prunus, and Malus and Fragaria, respectively were 784 and 148. The correspondence between marker positions was high and conserved syntenic blocks were identified among the three genera in the Rosaceae. We reconstructed a proposed ancestral genome for the Rosaceae. Conclusions: A genome containing nine chromosomes is the most likely candidate for the ancestral Rosaceae progenitor. The number of chromosomal translocations observed between the three genera investigated was low. However, the number of inversions identified among Malus and Prunus was much higher than any reported genome comparisons in plants, suggesting that small inversions have played an important role in the evolution of these two genera or of the Rosaceae. PMID:21226921

  14. Comparative Genomic Analysis of Mannheimia haemolytica from Bovine Sources.

    PubMed

    Klima, Cassidy L; Cook, Shaun R; Zaheer, Rahat; Laing, Chad; Gannon, Vick P; Xu, Yong; Rasmussen, Jay; Potter, Andrew; Hendrick, Steve; Alexander, Trevor W; McAllister, Tim A

    2016-01-01

    Bovine respiratory disease is a common health problem in beef production. The primary bacterial agent involved, Mannheimia haemolytica, is a target for antimicrobial therapy and at risk for associated antimicrobial resistance development. The role of M. haemolytica in pathogenesis is linked to serotype with serotypes 1 (S1) and 6 (S6) isolated from pneumonic lesions and serotype 2 (S2) found in the upper respiratory tract of healthy animals. Here, we sequenced the genomes of 11 strains of M. haemolytica, representing all three serotypes and performed comparative genomics analysis to identify genetic features that may contribute to pathogenesis. Possible virulence associated genes were identified within 14 distinct prophage, including a periplasmic chaperone, a lipoprotein, peptidoglycan glycosyltransferase and a stress response protein. Prophage content ranged from 2-8 per genome, but was higher in S1 and S6 strains. A type I-C CRISPR-Cas system was identified in each strain with spacer diversity and organization conserved among serotypes. The majority of spacers occur in S1 and S6 strains and originate from phage suggesting that serotypes 1 and 6 may be more resistant to phage predation. However, two spacers complementary to the host chromosome targeting a UDP-N-acetylglucosamine 2-epimerase and a glycosyl transferases group 1 gene are present in S1 and S6 strains only indicating these serotypes may employ CRISPR-Cas to regulate gene expression to avoid host immune responses or enhance adhesion during infection. Integrative conjugative elements are present in nine of the eleven genomes. Three of these harbor extensive multi-drug resistance cassettes encoding resistance against the majority of drugs used to combat infection in beef cattle, including macrolides and tetracyclines used in human medicine. The findings here identify key features that are likely contributing to serotype related pathogenesis and specific targets for vaccine design intended to reduce the

  15. Comparative Genomic Analysis of Mannheimia haemolytica from Bovine Sources

    PubMed Central

    Klima, Cassidy L.; Cook, Shaun R.; Zaheer, Rahat; Laing, Chad; Gannon, Vick P.; Xu, Yong; Rasmussen, Jay; Potter, Andrew; Hendrick, Steve; Alexander, Trevor W.; McAllister, Tim A.

    2016-01-01

    Bovine respiratory disease is a common health problem in beef production. The primary bacterial agent involved, Mannheimia haemolytica, is a target for antimicrobial therapy and at risk for associated antimicrobial resistance development. The role of M. haemolytica in pathogenesis is linked to serotype with serotypes 1 (S1) and 6 (S6) isolated from pneumonic lesions and serotype 2 (S2) found in the upper respiratory tract of healthy animals. Here, we sequenced the genomes of 11 strains of M. haemolytica, representing all three serotypes and performed comparative genomics analysis to identify genetic features that may contribute to pathogenesis. Possible virulence associated genes were identified within 14 distinct prophage, including a periplasmic chaperone, a lipoprotein, peptidoglycan glycosyltransferase and a stress response protein. Prophage content ranged from 2–8 per genome, but was higher in S1 and S6 strains. A type I-C CRISPR-Cas system was identified in each strain with spacer diversity and organization conserved among serotypes. The majority of spacers occur in S1 and S6 strains and originate from phage suggesting that serotypes 1 and 6 may be more resistant to phage predation. However, two spacers complementary to the host chromosome targeting a UDP-N-acetylglucosamine 2-epimerase and a glycosyl transferases group 1 gene are present in S1 and S6 strains only indicating these serotypes may employ CRISPR-Cas to regulate gene expression to avoid host immune responses or enhance adhesion during infection. Integrative conjugative elements are present in nine of the eleven genomes. Three of these harbor extensive multi-drug resistance cassettes encoding resistance against the majority of drugs used to combat infection in beef cattle, including macrolides and tetracyclines used in human medicine. The findings here identify key features that are likely contributing to serotype related pathogenesis and specific targets for vaccine design intended to reduce the

  16. Microarray Comparative Genomic Hybridisation Analysis Incorporating Genomic Organisation, and Application to Enterobacterial Plant Pathogens

    PubMed Central

    Pritchard, Leighton; Liu, Hui; Booth, Clare; Douglas, Emma; François, Patrice; Schrenzel, Jacques; Hedley, Peter E.; Birch, Paul R. J.; Toth, Ian K.

    2009-01-01

    Microarray comparative genomic hybridisation (aCGH) provides an estimate of the relative abundance of genomic DNA (gDNA) taken from comparator and reference organisms by hybridisation to a microarray containing probes that represent sequences from the reference organism. The experimental method is used in a number of biological applications, including the detection of human chromosomal aberrations, and in comparative genomic analysis of bacterial strains, but optimisation of the analysis is desirable in each problem domain. We present a method for analysis of bacterial aCGH data that encodes spatial information from the reference genome in a hidden Markov model. This technique is the first such method to be validated in comparisons of sequenced bacteria that diverge at the strain and at the genus level: Pectobacterium atrosepticum SCRI1043 (Pba1043) and Dickeya dadantii 3937 (Dda3937); and Lactococcus lactis subsp. lactis IL1403 and L. lactis subsp. cremoris MG1363. In all cases our method is found to outperform common and widely used aCGH analysis methods that do not incorporate spatial information. This analysis is applied to comparisons between commercially important plant pathogenic soft-rotting enterobacteria (SRE) Pba1043, P. atrosepticum SCRI1039, P. carotovorum 193, and Dda3937. Our analysis indicates that it should not be assumed that hybridisation strength is a reliable proxy for sequence identity in aCGH experiments, and robustly extends the applicability of aCGH to bacterial comparisons at the genus level. Our results in the SRE further provide evidence for a dynamic, plastic ‘accessory’ genome, revealing major genomic islands encoding gene products that provide insight into, and may play a direct role in determining, variation amongst the SRE in terms of their environmental survival, host range and aetiology, such as phytotoxin synthesis, multidrug resistance, and nitrogen fixation. PMID:19696881
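
    To illustrate the general idea of using probe order along the reference genome when calling aCGH results, the following is a minimal two-state hidden Markov model decoded with the Viterbi algorithm. It is a generic sketch and not the authors' model: the state names, emission means, standard deviation, transition probability and example log-ratios are all hypothetical values chosen for illustration.

```python
import math

STATES = ("present", "absent")
# Hypothetical Gaussian emission parameters for probe log-ratios.
MEANS = {"present": 0.0, "absent": -2.0}
SD = 0.7
# Hypothetical probability that neighbouring probes share the same state.
STAY = 0.9


def log_gauss(x, mu, sd):
    """Log density of a normal distribution at x."""
    return -0.5 * ((x - mu) / sd) ** 2 - math.log(sd * math.sqrt(2 * math.pi))


def viterbi(log_ratios):
    """Most likely state path for log-ratios listed in reference-genome order."""
    if not log_ratios:
        return []
    log_stay, log_switch = math.log(STAY), math.log(1 - STAY)

    def trans(prev, cur):
        return log_stay if prev == cur else log_switch

    # Uniform prior over the state of the first probe.
    score = {s: math.log(0.5) + log_gauss(log_ratios[0], MEANS[s], SD) for s in STATES}
    back = []
    for x in log_ratios[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            best_prev = max(STATES, key=lambda p: score[p] + trans(p, s))
            pointers[s] = best_prev
            new_score[s] = score[best_prev] + trans(best_prev, s) + log_gauss(x, MEANS[s], SD)
        score = new_score
        back.append(pointers)

    # Trace back from the best final state.
    state = max(STATES, key=score.get)
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Hypothetical log-ratios for eight consecutive probes.
ratios = [0.1, -0.2, -1.2, 0.0, -2.1, -1.8, -2.3, 0.2]
calls = viterbi(ratios)
print(calls)
```

    With these assumed parameters, the ambiguous probe at index 2 (log-ratio -1.2) is kept as "present" because its neighbours are clearly present, whereas a per-probe threshold at the midpoint of the two means would call it "absent". The run of three strongly negative probes is still called "absent".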

  17. Establishing a framework for comparative analysis of genome sequences

    SciTech Connect

    Bansal, A.K.

    1995-06-01

    This paper describes a framework and a high-level language toolkit for comparative analysis of genome sequence alignment. The framework integrates the information derived from multiple sequence alignment and phylogenetic tree (hypothetical tree of evolution) to derive new properties about sequences. Multiple sequence alignments are treated as an abstract data type. Abstract operations have been described to manipulate a multiple sequence alignment and to derive mutation related information from a phylogenetic tree by superimposing parsimonious analysis. The framework has been applied on protein alignments to derive constrained columns (in a multiple sequence alignment) that exhibit evolutionary pressure to preserve a common property in a column despite mutation. A Prolog toolkit based on the framework has been implemented and demonstrated on alignments containing 3000 sequences and 3904 columns.
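
    The notion of a constrained column can be sketched roughly as follows. This is a simplified Python illustration, not the Prolog toolkit described above: it ignores the phylogenetic tree and parsimony analysis, and the residue property classes and the function name constrained_columns are assumptions made for the example.

```python
# Simplified, hypothetical grouping of amino acids by physicochemical property.
PROPERTY_CLASSES = {
    "hydrophobic": set("AVILMFWC"),
    "positive": set("KRH"),
    "negative": set("DE"),
    "polar": set("STNQYGP"),
}


def constrained_columns(alignment):
    """Return (column index, property) for columns that vary in residue
    but whose non-gap residues all belong to one property class."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    hits = []
    for col in range(length):
        residues = {seq[col] for seq in alignment if seq[col] != "-"}
        if len(residues) < 2:
            continue  # invariant or all-gap column: no mutation observed
        for name, members in PROPERTY_CLASSES.items():
            if residues <= members:
                hits.append((col, name))
                break
    return hits


# Toy alignment of three hypothetical protein fragments.
toy_alignment = ["MKVLDS", "MRILEA", "MKLLDT"]
print(constrained_columns(toy_alignment))
```

    In the toy alignment, columns 1, 2 and 4 change residue while staying within a single class (positive, hydrophobic and negative, respectively). Column 5 mixes polar and hydrophobic residues and is therefore not reported.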

  18. Comparative genome analysis of Pseudomonas genomes including Populus-associated isolates

    SciTech Connect

    Jun, Se Ran; Wassenaar, Trudy; Nookaew, Intawat; Hauser, Loren John; Wanchai, Visanu; Land, Miriam L.; Timm, Collin M.; Lu, Tse-Yuan S.; Schadt, Christopher Warren; Doktycz, Mitchel John; Pelletier, Dale A; Ussery, David W

    2016-01-01

    The Pseudomonas genus contains a metabolically versatile group of organisms that are known to occupy numerous ecological niches including the rhizosphere and endosphere of many plants influencing phylogenetic diversity and heterogeneity. In this study, comparative genome analysis was performed on over one thousand Pseudomonas genomes, including 21 Pseudomonas strains isolated from the roots of native Populus deltoides. Based on average amino acid identity, genomic clusters were identified within the Pseudomonas genus, which showed agreements with clades by NCBI and cliques by IMG. The P. fluorescens group was organized into 20 distinct genomic clusters, representing enormous diversity and heterogeneity. The species P. aeruginosa showed clear distinction in their genomic relatedness compared to other Pseudomonas species groups based on the pan and core genome analysis. The 19 isolates of our 21 Populus-associated isolates formed three distinct subgroups within the P. fluorescens major group, supported by pathway profiles analysis, while two isolates were more closely related to P. chlororaphis and P. putida. The specific genes to Populus-associated subgroups were identified where genes specific to subgroup 1 include several sensory systems such as proteins which act in two-component signal transduction, a TonB-dependent receptor, and a phosphorelay sensor; specific genes to subgroup 2 contain unique hypothetical genes; and genes specific to subgroup 3 organisms have a different hydrolase activity. IMPORTANCE The comparative genome analyses of the genus Pseudomonas that included Populus-associated isolates resulted in novel insights into high diversity of Pseudomonas. Consistent and robust genomic clusters with phylogenetic homogeneity were identified, which resolved species-clades that are not clearly defined by 16S rRNA gene sequence analysis alone. The genomic clusters may be reflective of distinct ecological niches to which the organisms have adapted, but this

  19. Comparative genome analysis of Pseudomonas genomes including Populus-associated isolates

    DOE PAGESBeta

    Jun, Se Ran; Wassenaar, Trudy; Nookaew, Intawat; Hauser, Loren John; Wanchai, Visanu; Land, Miriam L.; Timm, Collin M.; Lu, Tse-Yuan S.; Schadt, Christopher Warren; Doktycz, Mitchel John; et al

    2016-01-01

    The Pseudomonas genus contains a metabolically versatile group of organisms that are known to occupy numerous ecological niches including the rhizosphere and endosphere of many plants influencing phylogenetic diversity and heterogeneity. In this study, comparative genome analysis was performed on over one thousand Pseudomonas genomes, including 21 Pseudomonas strains isolated from the roots of native Populus deltoides. Based on average amino acid identity, genomic clusters were identified within the Pseudomonas genus, which showed agreements with clades by NCBI and cliques by IMG. The P. fluorescens group was organized into 20 distinct genomic clusters, representing enormous diversity and heterogeneity. The species P. aeruginosa showed clear distinction in their genomic relatedness compared to other Pseudomonas species groups based on the pan and core genome analysis. The 19 isolates of our 21 Populus-associated isolates formed three distinct subgroups within the P. fluorescens major group, supported by pathway profiles analysis, while two isolates were more closely related to P. chlororaphis and P. putida. The specific genes to Populus-associated subgroups were identified where genes specific to subgroup 1 include several sensory systems such as proteins which act in two-component signal transduction, a TonB-dependent receptor, and a phosphorelay sensor; specific genes to subgroup 2 contain unique hypothetical genes; and genes specific to subgroup 3 organisms have a different hydrolase activity. IMPORTANCE The comparative genome analyses of the genus Pseudomonas that included Populus-associated isolates resulted in novel insights into high diversity of Pseudomonas. Consistent and robust genomic clusters with phylogenetic homogeneity were identified, which resolved species-clades that are not clearly defined by 16S rRNA gene sequence analysis alone. The genomic clusters may be reflective of distinct ecological niches to which the organisms have adapted, but

  20. The genome sequence of Blochmannia floridanus: Comparative analysis of reduced genomes

    PubMed Central

    Gil, Rosario; Silva, Francisco J.; Zientz, Evelyn; Delmotte, François; González-Candelas, Fernando; Latorre, Amparo; Rausell, Carolina; Kamerbeek, Judith; Gadau, Jürgen; Hölldobler, Bert; van Ham, Roeland C. H. J.; Gross, Roy; Moya, Andrés

    2003-01-01

    Bacterial symbioses are widespread among insects, probably being one of the key factors of their evolutionary success. We present the complete genome sequence of Blochmannia floridanus, the primary endosymbiont of carpenter ants. Although these ants feed on a complex diet, this symbiosis very likely has a nutritional basis: Blochmannia is able to supply nitrogen and sulfur compounds to the host while it takes advantage of the host metabolic machinery. Remarkably, these bacteria lack all known genes involved in replication initiation (dnaA, priA, and recA). The phylogenetic analysis of a set of conserved protein-coding genes shows that Bl. floridanus is phylogenetically related to Buchnera aphidicola and Wigglesworthia glossinidia, the other endosymbiotic bacteria whose complete genomes have been sequenced so far. Comparative analysis of the five known genomes from insect endosymbiotic bacteria reveals they share only 313 genes, a number that may be close to the minimum gene set necessary to sustain endosymbiotic life. PMID:12886019

  1. The genome sequence of Blochmannia floridanus: comparative analysis of reduced genomes.

    PubMed

    Gil, Rosario; Silva, Francisco J; Zientz, Evelyn; Delmotte, François; González-Candelas, Fernando; Latorre, Amparo; Rausell, Carolina; Kamerbeek, Judith; Gadau, Jürgen; Hölldobler, Bert; van Ham, Roeland C H J; Gross, Roy; Moya, Andrés

    2003-08-01

    Bacterial symbioses are widespread among insects, probably being one of the key factors of their evolutionary success. We present the complete genome sequence of Blochmannia floridanus, the primary endosymbiont of carpenter ants. Although these ants feed on a complex diet, this symbiosis very likely has a nutritional basis: Blochmannia is able to supply nitrogen and sulfur compounds to the host while it takes advantage of the host metabolic machinery. Remarkably, these bacteria lack all known genes involved in replication initiation (dnaA, priA, and recA). The phylogenetic analysis of a set of conserved protein-coding genes shows that Bl. floridanus is phylogenetically related to Buchnera aphidicola and Wigglesworthia glossinidia, the other endosymbiotic bacteria whose complete genomes have been sequenced so far. Comparative analysis of the five known genomes from insect endosymbiotic bacteria reveals they share only 313 genes, a number that may be close to the minimum gene set necessary to sustain endosymbiotic life. PMID:12886019

  2. Genome sequence and comparative genome analysis of Lactobacillus casei: insights into their niche-associated evolution.

    PubMed

    Cai, Hui; Thompson, Rebecca; Budinich, Mateo F; Broadbent, Jeff R; Steele, James L

    2009-01-01

    Lactobacillus casei is remarkably adaptable to diverse habitats and widely used in the food industry. To reveal the genomic features that contribute to its broad ecological adaptability and examine the evolution of the species, the genome sequence of L. casei ATCC 334 is analyzed and compared with other sequenced lactobacilli. This analysis reveals that ATCC 334 contains a high number of coding sequences involved in carbohydrate utilization and transcriptional regulation, reflecting its requirement for dealing with diverse environmental conditions. A comparison of the genome sequences of ATCC 334 to L. casei BL23 reveals 12 and 19 genomic islands, respectively. For a broader assessment of the genetic variability within L. casei, gene content of 21 L. casei strains isolated from various habitats (cheeses, n = 7; plant materials, n = 8; and human sources, n = 6) was examined by comparative genome hybridization with an ATCC 334-based microarray. This analysis resulted in identification of 25 hypervariable regions. One of these regions contains an overrepresentation of genes involved in carbohydrate utilization and transcriptional regulation and was thus proposed as a lifestyle adaptation island. Differences in L. casei genome inventory reveal both gene gain and gene decay. Gene gain, via acquisition of genomic islands, likely confers a fitness benefit in specific habitats. Gene decay, that is, loss of unnecessary ancestral traits, is observed in the cheese isolates and likely results in enhanced fitness in the dairy niche. This study gives the first picture of the stable versus variable regions in L. casei and provides valuable insights into evolution, lifestyle adaptation, and metabolic diversity of L. casei. PMID:20333194

  3. A Mitochondrial Genome of Rhyparochromidae (Hemiptera: Heteroptera) and a Comparative Analysis of Related Mitochondrial Genomes

    PubMed Central

    Li, Teng; Yang, Jie; Li, Yinwan; Cui, Ying; Xie, Qiang; Bu, Wenjun; Hillis, David M.

    2016-01-01

    The Rhyparochromidae, the largest family of Lygaeoidea, encompasses more than 1,850 described species, but no mitochondrial genome has been sequenced to date. Here we describe the first mitochondrial genome for Rhyparochromidae: a complete mitochondrial genome of Panaorus albomaculatus (Scott, 1874). This mitochondrial genome is comprised of 16,345 bp, and contains the expected 37 genes and control region. The majority of the control region is made up of a large tandem-repeat region, which has a novel pattern not previously observed in other insects. The tandem-repeats region of P. albomaculatus consists of 53 tandem duplications (including one partial repeat), which is the largest number of tandem repeats among all the known insect mitochondrial genomes. Slipped-strand mispairing during replication is likely to have generated this novel pattern of tandem repeats. Comparative analysis of tRNA gene families in sequenced Pentatomomorpha and Lygaeoidea species shows that the pattern of nucleotide conservation is markedly higher on the J-strand. Phylogenetic reconstruction based on mitochondrial genomes suggests that Rhyparochromidae is not the sister group to all the remaining Lygaeoidea, and supports the monophyly of Lygaeoidea. PMID:27756915

  4. MGcV: the microbial genomic context viewer for comparative genome analysis

    PubMed Central

    2013-01-01

    Background Conserved gene context is used in many types of comparative genome analyses. It is used to provide leads on gene function, to guide the discovery of regulatory sequences, and to aid in the reconstruction of metabolic networks. We present the Microbial Genomic context Viewer (MGcV), an interactive, web-based application tailored to strengthen the practice of manual comparative genome context analysis for bacteria. Results MGcV is a versatile, easy-to-use tool that renders a visualization of the genomic context of any set of selected genes, genes within a phylogenetic tree, genomic segments, or regulatory elements. It is tailored to facilitate laborious tasks such as the interactive annotation of gene function, the discovery of regulatory elements, or the sequence-based reconstruction of gene regulatory networks. We illustrate that MGcV can be used in gene function annotation by visually integrating information on prokaryotic genes, such as their annotation as available from NCBI, with other annotation data such as Pfam domains, sub-cellular location predictions and gene-sequence characteristics such as GC content. We also illustrate the usefulness of the interactive features that allow the graphical selection of genes to facilitate data gathering (e.g. upstream regions, IDs or annotation) in the analysis and reconstruction of transcription regulation. Moreover, putative regulatory elements and their corresponding scores or data from RNA-seq and microarray experiments can be uploaded, visualized and interpreted in (ranked-) comparative context maps. The ranked maps allow the interpretation of predicted regulatory elements and experimental data in light of each other. Conclusion MGcV advances the manual comparative analysis of genes and regulatory elements by providing fast and flexible integration of gene-related data combined with straightforward data retrieval. MGcV is available at http://mgcv.cmbi.ru.nl. PMID:23547764

  5. Diversity of Pseudomonas Genomes, Including Populus-Associated Isolates, as Revealed by Comparative Genome Analysis

    PubMed Central

    Jun, Se-Ran; Wassenaar, Trudy M.; Nookaew, Intawat; Hauser, Loren; Wanchai, Visanu; Land, Miriam; Timm, Collin M.; Lu, Tse-Yuan S.; Schadt, Christopher W.; Doktycz, Mitchel J.; Pelletier, Dale A.

    2015-01-01

    The Pseudomonas genus contains a metabolically versatile group of organisms that are known to occupy numerous ecological niches, including the rhizosphere and endosphere of many plants. Their diversity influences the phylogenetic diversity and heterogeneity of these communities. On the basis of average amino acid identity, comparative genome analysis of >1,000 Pseudomonas genomes, including 21 Pseudomonas strains isolated from the roots of native Populus deltoides (eastern cottonwood) trees resulted in consistent and robust genomic clusters with phylogenetic homogeneity. All Pseudomonas aeruginosa genomes clustered together, and these were clearly distinct from other Pseudomonas species groups on the basis of pangenome and core genome analyses. In contrast, the genomes of Pseudomonas fluorescens were organized into 20 distinct genomic clusters, representing enormous diversity and heterogeneity. Most of our 21 Populus-associated isolates formed three distinct subgroups within the major P. fluorescens group, supported by pathway profile analysis, while two isolates were more closely related to Pseudomonas chlororaphis and Pseudomonas putida. Genes specific to Populus-associated subgroups were identified. Genes specific to subgroup 1 include several sensory systems that act in two-component signal transduction, a TonB-dependent receptor, and a phosphorelay sensor. Genes specific to subgroup 2 contain hypothetical genes, and genes specific to subgroup 3 were annotated with hydrolase activity. This study justifies the need to sequence multiple isolates, especially from P. fluorescens, which displays the most genetic variation, in order to study functional capabilities from a pangenomic perspective. This information will prove useful when choosing Pseudomonas strains for use to promote growth and increase disease resistance in plants. PMID:26519390
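
    The pangenome and core genome comparison mentioned in this abstract can be illustrated with plain set operations. The following is a minimal sketch, not the authors' pipeline: it assumes each genome has already been reduced to a set of gene-family identifiers, and the function name `pan_and_core` and the toy isolate data are hypothetical.

```python
from typing import Dict, Set, Tuple


def pan_and_core(genomes: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Return (pan_genome, core_genome) for a mapping genome -> gene families.

    pan genome  = families found in at least one genome (union)
    core genome = families found in every genome (intersection)
    """
    if not genomes:
        return set(), set()
    families = list(genomes.values())
    pan = set().union(*families)
    core = set(families[0]).intersection(*families[1:])
    return pan, core


# Hypothetical toy data: three isolates described by gene-family IDs.
toy = {
    "isolate_A": {"fam1", "fam2", "fam3"},
    "isolate_B": {"fam1", "fam2", "fam4"},
    "isolate_C": {"fam1", "fam2", "fam5"},
}

pan, core = pan_and_core(toy)
print(sorted(core))  # ['fam1', 'fam2']
print(len(pan))      # 5
```

    In practice the hard part is the step assumed away here, namely grouping genes into families by homology; once that is done, the pan/core split is a union and an intersection.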

  6. Assigning protein functions by comparative genome analysis protein phylogenetic profiles

    DOEpatents

    Pellegrini, Matteo; Marcotte, Edward M.; Thompson, Michael J.; Eisenberg, David; Grothe, Robert; Yeates, Todd O.

    2003-05-13

    A computational method, system, and computer program are provided for inferring functional links from genome sequences. One method is based on the observation that some pairs of proteins A' and B' have homologs in another organism fused into a single protein chain AB. A trans-genome comparison of sequences can reveal these AB sequences, which are Rosetta Stone sequences because they decipher an interaction between A' and B'. Another method compares the genomic sequences of two or more organisms to create a phylogenetic profile for each protein, indicating its presence or absence across all the genomes. The profile provides information regarding functional links between different families of proteins. In yet another method, a combination of the above two methods is used to predict functional links.
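
    The phylogenetic-profile idea described above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the patented method: homolog detection is assumed to have been done already, profiles are compared by Hamming distance (an assumption made here), and the names `build_profiles`, `linked_pairs` and the toy families are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple

Profile = Tuple[int, ...]


def build_profiles(presence: Dict[str, Set[str]], genomes: List[str]) -> Dict[str, Profile]:
    """Turn 'family -> genomes containing a homolog' into 0/1 vectors over `genomes`."""
    return {
        family: tuple(1 if g in found_in else 0 for g in genomes)
        for family, found_in in presence.items()
    }


def hamming(a: Profile, b: Profile) -> int:
    """Number of genomes in which two profiles disagree."""
    return sum(x != y for x, y in zip(a, b))


def linked_pairs(profiles: Dict[str, Profile], max_distance: int = 0) -> List[Tuple[str, str]]:
    """Pairs of families whose presence/absence patterns (nearly) match.

    Families present in every genome or in none are skipped, because such
    profiles match each other trivially and carry no signal.
    """
    informative = {f: p for f, p in profiles.items() if 0 < sum(p) < len(p)}
    return [
        (f1, f2)
        for (f1, p1), (f2, p2) in combinations(sorted(informative.items()), 2)
        if hamming(p1, p2) <= max_distance
    ]


# Hypothetical toy data: four genomes, four protein families.
genomes = ["g1", "g2", "g3", "g4"]
presence = {
    "famA": {"g1", "g3"},
    "famB": {"g1", "g3"},
    "famC": {"g2", "g4"},
    "famD": {"g1", "g2", "g3", "g4"},
}

profiles = build_profiles(presence, genomes)
print(linked_pairs(profiles))  # [('famA', 'famB')]
```

    Here famA and famB are reported as candidates for a functional link because they are present in exactly the same genomes; famD is ignored because it occurs everywhere.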

  7. Comparative Bacterial Proteomics: Analysis of the Core Genome Concept

    PubMed Central

    Callister, Stephen J.; McCue, Lee Ann; Turse, Joshua E.; Monroe, Matthew E.; Auberry, Kenneth J.; Smith, Richard D.; Adkins, Joshua N.; Lipton, Mary S.

    2008-01-01

    While comparative bacterial genomic studies commonly predict a set of genes indicative of common ancestry, experimental validation of the existence of this core genome requires extensive measurement and is typically not undertaken. Enabled by an extensive proteome database developed over six years, we have experimentally verified the expression of proteins predicted from genomic ortholog comparisons among 17 environmental and pathogenic bacteria. More exclusive relationships were observed among the expressed protein content of phenotypically related bacteria, which is indicative of the specific lifestyles associated with these organisms. Although genomic studies can establish relative orthologous relationships among a set of bacteria and propose a set of ancestral genes, our proteomics study establishes expressed lifestyle differences among conserved genes and proposes a set of expressed ancestral traits. PMID:18253490

  8. Comparative genomic analysis reveals bilateral breast cancers are genetically independent.

    PubMed

    Song, Fangfang; Li, Xiangchun; Song, Fengju; Zhao, Yanrui; Li, Haixin; Zheng, Hong; Gao, Zhibo; Wang, Jun; Zhang, Wei; Chen, Kexin

    2015-10-13

    Bilateral breast cancer (BBC) poses a major challenge for oncologists because of the cryptic relationship between the two lesions. The purpose of this study was to determine the origin of the contralateral breast cancer (either dependent on or independent of the index tumor). Here, we used ultra-deep whole-exome sequencing and array comparative genomic hybridization (aCGH) to study four paired samples of BBCs with different tumor subtypes and time intervals between the development of the two tumors. We used two paired primary breast tumors and corresponding metastatic liver lesions as the control. We tested the origin-independent nature of BBC in three ways: mutational concordance, mutational signature clustering, and clonality analysis using copy number profiles. We found that the paired BBC samples had near-zero concordant mutation rates, which were much lower than those of the paired primary/metastasis samples. The results of a mutational signature analysis also suggested that BBCs are independent of one another. A clonality analysis using aCGH data further revealed that paired BBC samples were clonally independent, in contrast to the clonally related origin found for paired primary/metastasis samples. Our preliminary findings show that BBCs in Han Chinese women are origin-independent and thus should be treated separately. PMID:26378809
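
    The notion of a concordant mutation rate between two paired samples can be made concrete with a small example. The abstract does not say how concordance was computed, so the sketch below assumes one common choice, the fraction of shared mutations among all mutations seen in either sample (a Jaccard index); the `concordance` function and all mutation records are hypothetical.

```python
from typing import Set, Tuple

# A mutation is identified here by (chromosome, position, reference, alternate).
Mutation = Tuple[str, int, str, str]


def concordance(sample_a: Set[Mutation], sample_b: Set[Mutation]) -> float:
    """Shared mutations divided by all mutations seen in either sample.

    Assumption: Jaccard index; the study may have used a different definition.
    """
    union = sample_a | sample_b
    if not union:
        return 0.0
    return len(sample_a & sample_b) / len(union)


# Hypothetical toy data.
left_tumor = {("chr1", 100, "A", "T"), ("chr2", 200, "G", "C")}
right_tumor = {("chr3", 300, "C", "T"), ("chr4", 400, "T", "G")}

primary = {("chr1", 100, "A", "T"), ("chr2", 200, "G", "C"), ("chr5", 500, "G", "A")}
metastasis = {("chr1", 100, "A", "T"), ("chr2", 200, "G", "C"), ("chr6", 600, "C", "G")}

print(concordance(left_tumor, right_tumor))  # 0.0
print(concordance(primary, metastasis))      # 0.5
```

    A value near zero for the bilateral pair and a clearly higher value for the primary/metastasis pair is the qualitative pattern the abstract reports.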

  9. Initial sequence and comparative analysis of the cat genome

    PubMed Central

    Pontius, Joan U.; Mullikin, James C.; Smith, Douglas R.; Lindblad-Toh, Kerstin; Gnerre, Sante; Clamp, Michele; Chang, Jean; Stephens, Robert; Neelam, Beena; Volfovsky, Natalia; Schäffer, Alejandro A.; Agarwala, Richa; Narfström, Kristina; Murphy, William J.; Giger, Urs; Roca, Alfred L.; Antunes, Agostinho; Menotti-Raymond, Marilyn; Yuhki, Naoya; Pecon-Slattery, Jill; Johnson, Warren E.; Bourque, Guillaume; Tesler, Glenn; O’Brien, Stephen J.

    2007-01-01

    The genome sequence (1.9-fold coverage) of an inbred Abyssinian domestic cat was assembled, mapped, and annotated with a comparative approach that involved cross-reference to annotated genome assemblies of six mammals (human, chimpanzee, mouse, rat, dog, and cow). The results resolved chromosomal positions for 663,480 contigs, 20,285 putative feline gene orthologs, and 133,499 conserved sequence blocks (CSBs). Additional annotated features include repetitive elements, endogenous retroviral sequences, nuclear mitochondrial (numt) sequences, micro-RNAs, and evolutionary breakpoints that suggest historic balancing of translocation and inversion incidences in distinct mammalian lineages. Large numbers of single nucleotide polymorphisms (SNPs), deletion insertion polymorphisms (DIPs), and short tandem repeats (STRs), suitable for linkage or association studies were characterized in the context of long stretches of chromosome homozygosity. In spite of the light coverage capturing ∼65% of euchromatin sequence from the cat genome, these comparative insights shed new light on the tempo and mode of gene/genome evolution in mammals, promise several research applications for the cat, and also illustrate that a comparative approach using more deeply covered mammals provides an informative, preliminary annotation of a light (1.9-fold) coverage mammal genome sequence. PMID:17975172

  10. Sequence and comparative genomic analysis of actin-related proteins.

    PubMed

    Muller, Jean; Oma, Yukako; Vallar, Laurent; Friederich, Evelyne; Poch, Olivier; Winsor, Barbara

    2005-12-01

    Actin-related proteins (ARPs) are key players in cytoskeleton activities and nuclear functions. Two complexes, ARP2/3 and ARP1/11, also known as dynactin, are implicated in actin dynamics and in microtubule-based trafficking, respectively. ARP4 to ARP9 are components of many chromatin-modulating complexes. Conventional actins and ARPs codefine a large family of homologous proteins, the actin superfamily, with a tertiary structure known as the actin fold. Because ARPs and actin share high sequence conservation, clear family definition requires distinct features to easily and systematically identify each subfamily. In this study we performed an in-depth sequence and comparative genomic analysis of ARP subfamilies. A high-quality multiple alignment of approximately 700 complete protein sequences homologous to actin, including 148 ARP sequences, allowed us to extend the ARP classification to new organisms. Sequence alignments revealed conserved residues, motifs, and inserted sequence signatures to define each ARP subfamily. These discriminative characteristics allowed us to develop ARPAnno (http://bips.u-strasbg.fr/ARPAnno), a new web server dedicated to the annotation of ARP sequences. Analyses of sequence conservation among actins and ARPs highlight part of the actin fold and suggest interactions between ARPs and actin-binding proteins. Finally, analysis of ARP distribution across eukaryotic phyla emphasizes the central importance of nuclear ARPs, particularly the multifunctional ARP4.

  11. Comparative Analysis of Genome Diversity in Bullmastiff Dogs.

    PubMed

    Mortlock, Sally-Anne; Khatkar, Mehar S; Williamson, Peter

    2016-01-01

    Management and preservation of genomic diversity in dog breeds is a major objective for maintaining health. The present study was undertaken to characterise genomic diversity in Bullmastiff dogs using both genealogical and molecular analysis. Genealogical analysis of diversity was conducted using a database consisting of 16,378 Bullmastiff pedigrees from year 1980 to 2013. Additionally, a total of 188 Bullmastiff dogs were genotyped using the 170,000 SNP Illumina CanineHD Beadchip. Genealogical parameters revealed a mean inbreeding coefficient of 0.047; 142 total founders (f); an effective number of founders (fe) of 79; an effective number of ancestors (fa) of 62; and an effective population size of the reference population of 41. Genetic diversity and the degree of genome-wide homogeneity within the breed were also investigated using molecular data. Multiple-locus heterozygosity (MLH) was equal to 0.206; runs of homozygosity (ROH) as proportion of the genome, averaged 16.44%; effective population size was 29.1, with an average inbreeding coefficient of 0.035, all estimated using SNP Data. Fine-scale population structure was analysed using NETVIEW, a population analysis pipeline. Visualisation of the high definition network captured relationships among individuals within and between subpopulations. Effects of unequal founder use, and ancestral inbreeding and selection, were evident. While current levels of Bullmastiff heterozygosity, inbreeding and homozygosity are not unusual, a relatively small effective population size indicates that a breeding strategy to reduce the inbreeding rate may be beneficial. PMID:26824579
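
    The effective number of founders (fe) reported above is conventionally computed from the founders' proportional genetic contributions p_i as fe = 1 / sum(p_i^2). The sketch below illustrates that standard formula only; it is not the software used in the study, and the function name and contribution values are hypothetical.

```python
from typing import Sequence


def effective_founders(contributions: Sequence[float]) -> float:
    """Effective number of founders, fe = 1 / sum(p_i ** 2).

    `contributions` are the founders' expected genetic contributions to the
    reference population; they are normalised here so that they sum to 1.
    """
    total = sum(contributions)
    if total <= 0:
        raise ValueError("contributions must sum to a positive value")
    return 1.0 / sum((c / total) ** 2 for c in contributions)


# Hypothetical toy data: four founders used equally vs. one dominant founder.
print(effective_founders([0.25, 0.25, 0.25, 0.25]))        # 4.0
print(round(effective_founders([0.7, 0.1, 0.1, 0.1]), 2))  # 1.92
```

    When founders contribute equally, fe equals the actual number of founders; unequal use pulls fe down, which is the effect behind the gap between 142 total founders and an fe of 79 in the abstract.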

  12. Comparative Analysis of Genome Diversity in Bullmastiff Dogs

    PubMed Central

    Mortlock, Sally-Anne; Khatkar, Mehar S.; Williamson, Peter

    2016-01-01

    Management and preservation of genomic diversity in dog breeds is a major objective for maintaining health. The present study was undertaken to characterise genomic diversity in Bullmastiff dogs using both genealogical and molecular analysis. Genealogical analysis of diversity was conducted using a database consisting of 16,378 Bullmastiff pedigrees from year 1980 to 2013. Additionally, a total of 188 Bullmastiff dogs were genotyped using the 170,000 SNP Illumina CanineHD Beadchip. Genealogical parameters revealed a mean inbreeding coefficient of 0.047; 142 total founders (f); an effective number of founders (fe) of 79; an effective number of ancestors (fa) of 62; and an effective population size of the reference population of 41. Genetic diversity and the degree of genome-wide homogeneity within the breed were also investigated using molecular data. Multiple-locus heterozygosity (MLH) was equal to 0.206; runs of homozygosity (ROH) as proportion of the genome, averaged 16.44%; effective population size was 29.1, with an average inbreeding coefficient of 0.035, all estimated using SNP Data. Fine-scale population structure was analysed using NETVIEW, a population analysis pipeline. Visualisation of the high definition network captured relationships among individuals within and between subpopulations. Effects of unequal founder use, and ancestral inbreeding and selection, were evident. While current levels of Bullmastiff heterozygosity, inbreeding and homozygosity are not unusual, a relatively small effective population size indicates that a breeding strategy to reduce the inbreeding rate may be beneficial. PMID:26824579

  13. Hidden Markov models for evolution and comparative genomics analysis.

    PubMed

    Bykova, Nadezda A; Favorov, Alexander V; Mironov, Andrey A

    2013-01-01

    The problem of reconstructing ancestral states given a phylogeny and data from extant species arises in a wide range of biological studies. The continuous-time Markov model for the evolution of discrete states is generally used for the reconstruction of ancestral states. We modify this model to account for the case in which the states of the extant species are uncertain. This situation appears, for example, if the states for extant species are predicted by some program and thus are known only with some level of reliability; this is common in the bioinformatics field. The main idea is to formulate the problem as a hidden Markov model on a tree (tree HMM, tHMM), where the basic continuous-time Markov model is expanded with the introduction of emission probabilities of observed data (e.g. prediction scores) for each underlying discrete state. Our tHMM decoding algorithm allows us to predict states at the ancestral nodes as well as to refine states at the leaves on the basis of quantitative comparative genomics. The test on simulated data shows that the tHMM approach applied to the continuous variable reflecting the probabilities of the states (i.e. the prediction score) appears to be more accurate than the reconstruction from the discrete state assignment defined by the best score threshold. We provide examples of applying our model to the evolutionary analysis of N-terminal signal peptides and transcription factor binding sites in bacteria. The program is freely available at http://bioinf.fbb.msu.ru/~nadya/tHMM and via web-service at http://bioinf.fbb.msu.ru/treehmmweb.
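
    To make the tree-HMM idea more tangible, the sketch below computes the posterior state probabilities at the root of a small tree when the leaves carry emission likelihoods rather than hard state calls. It is an illustration under simplifying assumptions, not the authors' tHMM program: it uses a symmetric two-state continuous-time Markov model with a single rate, performs only the upward (pruning) pass, and the tree encoding, function names and numbers are all hypothetical.

```python
import math
from typing import List, Sequence


def transition(t: float, rate: float = 1.0) -> List[List[float]]:
    """Transition matrix of a symmetric two-state continuous-time Markov model."""
    same = 0.5 + 0.5 * math.exp(-2.0 * rate * t)
    return [[same, 1.0 - same], [1.0 - same, same]]


def upward(node, rate: float = 1.0) -> List[float]:
    """Likelihood of all data below `node`, for each possible state of `node`.

    A node is ("leaf", (e0, e1)) with emission likelihoods of the observed
    score under state 0 and state 1, or ("node", [(child, branch_length), ...]).
    """
    kind, payload = node
    if kind == "leaf":
        return list(payload)
    like = [1.0, 1.0]
    for child, branch_length in payload:
        child_like = upward(child, rate)
        p = transition(branch_length, rate)
        for s in (0, 1):
            like[s] *= sum(p[s][c] * child_like[c] for c in (0, 1))
    return like


def root_posterior(tree, prior: Sequence[float] = (0.5, 0.5), rate: float = 1.0) -> List[float]:
    """Posterior probability of each state at the root."""
    like = upward(tree, rate)
    joint = [prior[s] * like[s] for s in (0, 1)]
    total = sum(joint)
    return [j / total for j in joint]


# Hypothetical toy tree: two close leaves that favour state 1 and a more
# distant leaf that weakly favours state 0.
leaf1 = ("leaf", (0.1, 0.9))
leaf2 = ("leaf", (0.2, 0.8))
leaf3 = ("leaf", (0.6, 0.4))
inner = ("node", [(leaf1, 0.1), (leaf2, 0.1)])
tree = ("node", [(inner, 0.1), (leaf3, 0.5)])

posterior = root_posterior(tree)
print([round(x, 3) for x in posterior])
```

    Using soft emission likelihoods at the leaves, instead of thresholding each score into a hard state first, is the feature of the approach that the abstract credits for the improved accuracy.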

  14. Ensembl comparative genomics resources

    PubMed Central

    Muffato, Matthieu; Beal, Kathryn; Fitzgerald, Stephen; Gordon, Leo; Pignatelli, Miguel; Vilella, Albert J.; Searle, Stephen M. J.; Amode, Ridwan; Brent, Simon; Spooner, William; Kulesha, Eugene; Yates, Andrew; Flicek, Paul

    2016-01-01

    Evolution provides the unifying framework with which to understand biology. The coherent investigation of genic and genomic data often requires comparative genomics analyses based on whole-genome alignments, sets of homologous genes and other relevant datasets in order to evaluate and answer evolutionary-related questions. However, the complexity and computational requirements of producing such data are substantial: this has led to only a small number of reference resources that are used for most comparative analyses. The Ensembl comparative genomics resources are one such reference set that facilitates comprehensive and reproducible analysis of chordate genome data. Ensembl computes pairwise and multiple whole-genome alignments from which large-scale synteny, per-base conservation scores and constrained elements are obtained. Gene alignments are used to define Ensembl Protein Families, GeneTrees and homologies for both protein-coding and non-coding RNA genes. These resources are updated frequently and have a consistent informatics infrastructure and data presentation across all supported species. Specialized web-based visualizations are also available including synteny displays, collapsible gene tree plots, a gene family locator and different alignment views. The Ensembl comparative genomics infrastructure is extensively reused for the analysis of non-vertebrate species by other projects including Ensembl Genomes and Gramene and much of the information here is relevant to these projects. The consistency of the annotation across species and the focus on vertebrates makes Ensembl an ideal system to perform and support vertebrate comparative genomic analyses. We use robust software and pipelines to produce reference comparative data and make it freely available. Database URL: http://www.ensembl.org. PMID:26896847

  15. Ensembl comparative genomics resources.

    PubMed

    Herrero, Javier; Muffato, Matthieu; Beal, Kathryn; Fitzgerald, Stephen; Gordon, Leo; Pignatelli, Miguel; Vilella, Albert J; Searle, Stephen M J; Amode, Ridwan; Brent, Simon; Spooner, William; Kulesha, Eugene; Yates, Andrew; Flicek, Paul

    2016-01-01

    Evolution provides the unifying framework with which to understand biology. The coherent investigation of genic and genomic data often requires comparative genomics analyses based on whole-genome alignments, sets of homologous genes and other relevant datasets in order to evaluate and answer evolutionary-related questions. However, the complexity and computational requirements of producing such data are substantial: this has led to only a small number of reference resources that are used for most comparative analyses. The Ensembl comparative genomics resources are one such reference set that facilitates comprehensive and reproducible analysis of chordate genome data. Ensembl computes pairwise and multiple whole-genome alignments from which large-scale synteny, per-base conservation scores and constrained elements are obtained. Gene alignments are used to define Ensembl Protein Families, GeneTrees and homologies for both protein-coding and non-coding RNA genes. These resources are updated frequently and have a consistent informatics infrastructure and data presentation across all supported species. Specialized web-based visualizations are also available including synteny displays, collapsible gene tree plots, a gene family locator and different alignment views. The Ensembl comparative genomics infrastructure is extensively reused for the analysis of non-vertebrate species by other projects including Ensembl Genomes and Gramene and much of the information here is relevant to these projects. The consistency of the annotation across species and the focus on vertebrates makes Ensembl an ideal system to perform and support vertebrate comparative genomic analyses. We use robust software and pipelines to produce reference comparative data and make it freely available. Database URL: http://www.ensembl.org.

  16. Delineation of Steroid-Degrading Microorganisms through Comparative Genomic Analysis

    PubMed Central

    Bergstrand, Lee H.; Cardenas, Erick; Holert, Johannes; Van Hamme, Jonathan D.

    2016-01-01

    ABSTRACT Steroids are ubiquitous in natural environments and are a significant growth substrate for microorganisms. Microbial steroid metabolism is also important for some pathogens and for biotechnical applications. This study delineated the distribution of aerobic steroid catabolism pathways among over 8,000 microorganisms whose genomes are available in the NCBI RefSeq database. Combined analysis of bacterial, archaeal, and fungal genomes with both hidden Markov models and reciprocal BLAST identified 265 putative steroid degraders within only Actinobacteria and Proteobacteria, which mainly originated from soil, eukaryotic host, and aquatic environments. These bacteria include members of 17 genera not previously known to contain steroid degraders. A pathway for cholesterol degradation was conserved in many actinobacterial genera, particularly in members of the Corynebacterineae, and a pathway for cholate degradation was conserved in members of the genus Rhodococcus. A pathway for testosterone and, sometimes, cholate degradation had a patchy distribution among Proteobacteria. The steroid degradation genes tended to occur within large gene clusters. Growth experiments confirmed bioinformatic predictions of steroid metabolism capacity in nine bacterial strains. The results indicate there was a single ancestral 9,10-seco-steroid degradation pathway. Gene duplication, likely in a progenitor of Rhodococcus, later gave rise to a cholate degradation pathway. Proteobacteria and additional Actinobacteria subsequently obtained a cholate degradation pathway via horizontal gene transfer, in some cases facilitated by plasmids. Catabolism of steroids appears to be an important component of the ecological niches of broad groups of Actinobacteria and individual species of Proteobacteria. PMID:26956583
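
    The reciprocal BLAST step mentioned in this abstract is commonly implemented as a reciprocal-best-hit filter. The sketch below shows that filter on pre-parsed hit tables; it is not the authors' workflow, it omits the hidden Markov model search entirely, and the `(query, subject, bit score)` tuple format, function names and toy hits are assumptions for illustration.

```python
from typing import Dict, Iterable, List, Tuple

# One alignment hit: (query ID, subject ID, bit score).
Hit = Tuple[str, str, float]


def best_hits(hits: Iterable[Hit]) -> Dict[str, str]:
    """Map each query to its highest-scoring subject."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b: Iterable[Hit], b_vs_a: Iterable[Hit]) -> List[Tuple[str, str]]:
    """Pairs (a, b) where b is a's best hit and a is b's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Hypothetical toy data: genome A searched against genome B and vice versa.
a_vs_b = [("a1", "b1", 250.0), ("a1", "b2", 90.0), ("a2", "b2", 180.0), ("a3", "b1", 120.0)]
b_vs_a = [("b1", "a1", 245.0), ("b2", "a2", 175.0), ("b2", "a1", 88.0)]

print(reciprocal_best_hits(a_vs_b, b_vs_a))  # [('a1', 'b1'), ('a2', 'b2')]
```

    Gene a3 is dropped because its best hit b1 points back to a1 rather than a3, which is how the reciprocal requirement reduces false ortholog assignments among paralogs.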

  17. Complete genome sequencing and comparative analysis of the linezolid-resistant Enterococcus faecalis strain DENG1.

    PubMed

    Yu, Zhijian; Chen, Zhong; Cheng, Hang; Zheng, Jinxin; Li, Duoyun; Deng, Xiangbin; Pan, Weiguang; Yang, Weizhi; Deng, Qiwen

    2014-07-01

    Genome level analysis of bacterial strains provides information on genetic composition and resistance mechanisms to clinically relevant antibiotics. To date, whole genome characterization of linezolid-resistant Enterococcus faecalis isolated in the clinic is lacking. In this study, we report the entire genome sequence, genomic characteristics and virulence factors of a pathogenic E. faecalis strain, DENG1. Our results showed considerable differences in genomic characteristics and virulence factors compared with other E. faecalis strains (V583 and OG1RF). The genome of this LZD-resistant E. faecalis strain can be used as a reference to study the mechanism of LZD resistance and the phylogenetic relationship of E. faecalis strains worldwide.

  18. e-Fungi: a data resource for comparative analysis of fungal genomes

    PubMed Central

    Hedeler, Cornelia; Wong, Han Min; Cornell, Michael J; Alam, Intikhab; Soanes, Darren M; Rattray, Magnus; Hubbard, Simon J; Talbot, Nicholas J; Oliver, Stephen G; Paton, Norman W

    2007-01-01

    Background The number of sequenced fungal genomes is ever increasing, with about 200 genomes already fully sequenced or in progress. Only a small percentage of those genomes have been comprehensively studied, for example using techniques from functional genomics. Comparative analysis has proven to be a useful strategy for enhancing our understanding of evolutionary biology and of the less well understood genomes. However, the data required for these analyses tend to be distributed across various heterogeneous data sources, making systematic comparative studies a cumbersome task. Furthermore, comparative analyses benefit from close integration of derived data sets that cluster genes or organisms in a way that eases the expression of requests that clarify points of similarity or difference between species. Description To support systematic comparative analyses of fungal genomes we have developed the e-Fungi database, which integrates a variety of data for more than 30 fungal genomes. Publicly available genome data, functional annotations, and pathway information have been integrated into a single data repository and complemented with results of comparative analyses, such as MCL and OrthoMCL cluster analysis, and predictions of signaling proteins and the sub-cellular localisation of proteins. To access the data, a library of analysis tasks is available through a web interface. The analysis tasks are motivated by recent comparative genomics studies, and aim to support the study of evolutionary biology as well as community efforts for improving the annotation of genomes. Web services for each query are also available, enabling the tasks to be incorporated into workflows. Conclusion The e-Fungi database provides fungal biologists with a resource for comparative studies of a large range of fungal genomes. Its analysis library supports the comparative study of genome data, functional annotation, and results of large scale analyses over all the genomes stored in the database.

  19. Comparative analysis of trichomonad genome sizes and karyotypes.

    PubMed

    Zubácová, Zuzana; Cimbůrek, Zdenek; Tachezy, Jan

    2008-09-01

    In parasitic protists, the genome sizes range from 2.9Mb in Encephalitozoon cuniculi to about 160Mb in Trichomonas vaginalis. The surprisingly large genome size of the latter human parasite resulted from the expansion of various repetitive elements, specific gene families, and possibly from large-scale genome duplication. The reason for this phenomenon, as well as whether other trichomonad species have undergone a similar genome expansion, is not known. In this work we studied the genomes of nine selected species of the Trichomonadea group. We found that each species has a characteristic karyotype with a stable and haploid number of chromosomes. Relatively large genome sizes were found in all the tested species, although over a rather broad range (86-177Mb). The largest genomes were typically observed in the Trichomonas and Tritrichomonas genera (133-177Mb), while Tetratrichomonas gallinarum contains the smallest genome (86Mb). The genome size correlated with the cell volume; however, no relationship between genome size and the site of infection or trichomonad phagocytic ability was observed. The data presented here provide primary information towards selecting a trichomonad species for future large-scale sequencing to elucidate the evolution of unusual parabasalid genomes. PMID:18606195

  1. The tiger genome and comparative analysis with lion and snow leopard genomes.

    PubMed

    Cho, Yun Sung; Hu, Li; Hou, Haolong; Lee, Hang; Xu, Jiaohui; Kwon, Soowhan; Oh, Sukhun; Kim, Hak-Min; Jho, Sungwoong; Kim, Sangsoo; Shin, Young-Ah; Kim, Byung Chul; Kim, Hyunmin; Kim, Chang-Uk; Luo, Shu-Jin; Johnson, Warren E; Koepfli, Klaus-Peter; Schmidt-Küntzel, Anne; Turner, Jason A; Marker, Laurie; Harper, Cindy; Miller, Susan M; Jacobs, Wilhelm; Bertola, Laura D; Kim, Tae Hyung; Lee, Sunghoon; Zhou, Qian; Jung, Hyun-Ju; Xu, Xiao; Gadhvi, Priyvrat; Xu, Pengwei; Xiong, Yingqi; Luo, Yadan; Pan, Shengkai; Gou, Caiyun; Chu, Xiuhui; Zhang, Jilin; Liu, Sanyang; He, Jing; Chen, Ying; Yang, Linfeng; Yang, Yulan; He, Jiaju; Liu, Sha; Wang, Junyi; Kim, Chul Hong; Kwak, Hwanjong; Kim, Jong-Soo; Hwang, Seungwoo; Ko, Junsu; Kim, Chang-Bae; Kim, Sangtae; Bayarlkhagva, Damdin; Paek, Woon Kee; Kim, Seong-Jin; O'Brien, Stephen J; Wang, Jun; Bhak, Jong

    2013-01-01

    Tigers and their close relatives (Panthera) are some of the world's most endangered species. Here we report the de novo assembly of an Amur tiger whole-genome sequence as well as the genomic sequences of a white Bengal tiger, African lion, white African lion and snow leopard. Through comparative genetic analyses of these genomes, we find genetic signatures that may reflect molecular adaptations consistent with the big cats' hypercarnivorous diet and muscle strength. We report a snow leopard-specific genetic determinant in EGLN1 (Met39>Lys39), which is likely to be associated with adaptation to high altitude. We also detect a TYR260G>A mutation likely responsible for the white lion coat colour. Tiger and cat genomes show similar repeat composition and an appreciably conserved synteny. Genomic data from the five big cats provide an invaluable resource for resolving easily identifiable phenotypes evident in very close, but distinct, species.

  2. The tiger genome and comparative analysis with lion and snow leopard genomes

    PubMed Central

    Cho, Yun Sung; Hu, Li; Hou, Haolong; Lee, Hang; Xu, Jiaohui; Kwon, Soowhan; Oh, Sukhun; Kim, Hak-Min; Jho, Sungwoong; Kim, Sangsoo; Shin, Young-Ah; Kim, Byung Chul; Kim, Hyunmin; Kim, Chang-uk; Luo, Shu-Jin; Johnson, Warren E.; Koepfli, Klaus-Peter; Schmidt-Küntzel, Anne; Turner, Jason A.; Marker, Laurie; Harper, Cindy; Miller, Susan M.; Jacobs, Wilhelm; Bertola, Laura D.; Kim, Tae Hyung; Lee, Sunghoon; Zhou, Qian; Jung, Hyun-Ju; Xu, Xiao; Gadhvi, Priyvrat; Xu, Pengwei; Xiong, Yingqi; Luo, Yadan; Pan, Shengkai; Gou, Caiyun; Chu, Xiuhui; Zhang, Jilin; Liu, Sanyang; He, Jing; Chen, Ying; Yang, Linfeng; Yang, Yulan; He, Jiaju; Liu, Sha; Wang, Junyi; Kim, Chul Hong; Kwak, Hwanjong; Kim, Jong-Soo; Hwang, Seungwoo; Ko, Junsu; Kim, Chang-Bae; Kim, Sangtae; Bayarlkhagva, Damdin; Paek, Woon Kee; Kim, Seong-Jin; O’Brien, Stephen J.; Wang, Jun; Bhak, Jong

    2013-01-01

    Tigers and their close relatives (Panthera) are some of the world’s most endangered species. Here we report the de novo assembly of an Amur tiger whole-genome sequence as well as the genomic sequences of a white Bengal tiger, African lion, white African lion and snow leopard. Through comparative genetic analyses of these genomes, we find genetic signatures that may reflect molecular adaptations consistent with the big cats’ hypercarnivorous diet and muscle strength. We report a snow leopard-specific genetic determinant in EGLN1 (Met39>Lys39), which is likely to be associated with adaptation to high altitude. We also detect a TYR260G>A mutation likely responsible for the white lion coat colour. Tiger and cat genomes show similar repeat composition and an appreciably conserved synteny. Genomic data from the five big cats provide an invaluable resource for resolving easily identifiable phenotypes evident in very close, but distinct, species. PMID:24045858

  4. Comparative genomic analysis of hyperthermophilic archaeal fuselloviridae viruses

    SciTech Connect

    B. Wiedenheft; K. Stedman; F. Roberto; D. Willits; A. K. Gleske; L. Zoeller; J. Snyder; T. Douglas; M. Young

    2004-02-01

    The complete genome sequences of two Sulfolobus spindle-shaped viruses (SSVs) from acidic hot springs in Kamchatka (Russia) and Yellowstone National Park (United States) have been determined. These nonlytic temperate viruses were isolated from hyperthermophilic Sulfolobus hosts, and both viruses share the spindle-shaped morphology characteristic of the Fuselloviridae family. These two genomes, in combination with the previously determined SSV1 genome from Japan and the SSV2 genome from Iceland, have allowed us to carry out a phylogenetic comparison of these geographically distributed hyperthermal viruses. Each virus contains a circular double-stranded DNA genome of ~15 kbp with approximately 34 open reading frames (ORFs). These Fusellovirus ORFs show little or no similarity to genes in the public databases. In contrast, 18 ORFs are common to all four isolates and may represent the minimal gene set defining this viral group. In general, ORFs on one half of the genome are colinear and highly conserved, while ORFs on the other half are not. One shared ORF among all four genomes is an integrase of the tyrosine recombinase family. All four viral genomes integrate into their host tRNA genes. The specific tRNA gene used for integration varies, and one genome integrates into multiple loci. Several unique ORFs are found in the genome of each isolate.

  5. Comparative Genomic Analysis of Meningitis- and Bacteremia-Causing Pneumococci Identifies a Common Core Genome.

    PubMed

    Kulohoma, Benard W; Cornick, Jennifer E; Chaguza, Chrispin; Yalcin, Feyruz; Harris, Simon R; Gray, Katherine J; Kiran, Anmol M; Molyneux, Elizabeth; French, Neil; Parkhill, Julian; Faragher, Brian E; Everett, Dean B; Bentley, Stephen D; Heyderman, Robert S

    2015-10-01

    Streptococcus pneumoniae is a nasopharyngeal commensal that occasionally invades normally sterile sites to cause bloodstream infection and meningitis. Although the pneumococcal population structure and evolutionary genetics are well defined, it is not clear whether pneumococci that cause meningitis are genetically distinct from those that do not. Here, we used whole-genome sequencing of 140 isolates of S. pneumoniae recovered from bloodstream infection (n = 70) and meningitis (n = 70) to compare their genetic contents. By fitting a double-exponential decaying-function model, we show that these isolates share a core of 1,427 genes (95% confidence interval [CI], 1,425 to 1,435 genes) and that there is no difference in the core genome or accessory gene content from these disease manifestations. Gene presence/absence alone therefore does not explain the virulence behavior of pneumococci that reach the meninges. Our analysis, however, supports the requirement of a range of previously described virulence factors and vaccine candidates for both meningitis- and bacteremia-causing pneumococci. This high-resolution view suggests that, despite considerable competency for genetic exchange, all pneumococci are under considerable pressure to retain key components advantageous for colonization and transmission and that these components are essential for access to and survival in sterile sites. PMID:26259813
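
    The core-genome estimate in this record comes from fitting a decay curve to the number of genes still shared as more isolates are added. The sketch below shows one way the underlying curve could be tabulated from gene presence/absence data before any fitting; it is a minimal illustration, not the authors' pipeline. The function name, the toy isolates and the averaging over random orderings are all assumptions.

```python
import random


def core_genome_curve(genomes, n_orders=200, seed=0):
    """Mean number of genes shared by the first k genomes (k = 1..N).

    `genomes` maps a hypothetical isolate name to its set of gene IDs.
    The mean is taken over `n_orders` random orderings of the isolates.
    """
    rng = random.Random(seed)
    names = list(genomes)
    totals = [0.0] * len(names)
    for _ in range(n_orders):
        rng.shuffle(names)
        shared = set(genomes[names[0]])
        for k, name in enumerate(names):
            shared &= genomes[name]   # genes present in every isolate so far
            totals[k] += len(shared)
    return [t / n_orders for t in totals]


# Toy presence/absence data (invented for illustration only).
isolates = {
    "iso1": {"a", "b", "c", "d"},
    "iso2": {"a", "b", "c", "e"},
    "iso3": {"a", "b", "f"},
}
curve = core_genome_curve(isolates)
print(curve)
```

    The last point of the curve is the set of genes present in every isolate; a decaying function fitted to such points extrapolates the core size to unsampled genomes.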

  7. The Korea brassica genome project: a glimpse of the brassica genome based on comparative genome analysis with Arabidopsis.

    PubMed

    Yang, Tae-Jin; Kim, Jung-Sun; Lim, Ki-Byung; Kwon, Soo-Jin; Kim, Jin-A; Jin, Mina; Park, Jee Young; Lim, Myung-Ho; Kim, Ho-Il; Kim, Seog Hyung; Lim, Yong Pyo; Park, Beom-Seok

    2005-01-01

    A complete genome sequence provides unlimited information in the sequenced organism as well as in related taxa. According to the guidance of the Multinational Brassica Genome Project (MBGP), the Korea Brassica Genome Project (KBGP) is sequencing chromosome 1 (cytogenetically oriented chromosome #1) of Brassica rapa. We have selected 48 seed BACs on chromosome 1 using EST genetic markers and FISH analyses. Among them, 30 BAC clones have been sequenced and 18 are on the way. Comparative genome analyses of the EST sequences and sequenced BAC clones from Brassica chromosome 1 revealed their homeologous partner regions on the Arabidopsis genome and a syntenic comparative map between Brassica chromosome 1 and Arabidopsis chromosomes. In silico chromosome walking and clone validation have been successfully applied to extending sequence contigs based on the comparative map and BAC end sequences. In addition, we have defined the (peri)centromeric heterochromatin blocks with centromeric tandem repeats, rDNA and centromeric retrotransposons. In-depth sequence analyses of five homeologous BAC clones and an Arabidopsis chromosomal region reveal overall co-linearity, with 82% sequence similarity. The data indicate that the Brassica genome has undergone triplication and subsequent gene losses after the divergence of Arabidopsis and Brassica. Based on in-depth comparative genome analyses, we propose a comparative genomics approach for conquering the Brassica genome. In 2005 we intend to construct an integrated physical map, including sequence information from 500 BAC clones and integration of fingerprinting data and end sequence data of more than 100,000 BAC clones.

  8. IMG 4 version of the integrated microbial genomes comparative analysis system.

    PubMed

    Markowitz, Victor M; Chen, I-Min A; Palaniappan, Krishna; Chu, Ken; Szeto, Ernest; Pillay, Manoj; Ratner, Anna; Huang, Jinghua; Woyke, Tanja; Huntemann, Marcel; Anderson, Iain; Billis, Konstantinos; Varghese, Neha; Mavromatis, Konstantinos; Pati, Amrita; Ivanova, Natalia N; Kyrpides, Nikos C

    2014-01-01

    The Integrated Microbial Genomes (IMG) data warehouse integrates genomes from all three domains of life, as well as plasmids, viruses and genome fragments. IMG provides tools for analyzing and reviewing the structural and functional annotations of genomes in a comparative context. IMG's data content and analytical capabilities have increased continuously since its first version released in 2005. Since the last report published in the 2012 NAR Database Issue, IMG's annotation and data integration pipelines have evolved while new tools have been added for recording and analyzing single cell genomes, RNA Seq and biosynthetic cluster data. Different IMG datamarts provide support for the analysis of publicly available genomes (IMG/W: http://img.jgi.doe.gov/w), expert review of genome annotations (IMG/ER: http://img.jgi.doe.gov/er) and teaching and training in the area of microbial genome analysis (IMG/EDU: http://img.jgi.doe.gov/edu). PMID:24165883

  9. IMG 4 version of the integrated microbial genomes comparative analysis system

    SciTech Connect

    Markowitz, Victor M.; Chen, I-Min A.; Palaniappan, Krishna; Chu, Ken; Szeto, Ernest; Pillay, Manoj; Ratner, Anna; Huang, Jinghua; Woyke, Tanja; Huntemann, Marcel; Anderson, Iain; Billis, Konstantinos; Varghese, Neha; Mavromatis, Konstantinos; Pati, Amrita; Ivanova, Natalia N.; Kyrpides, Nikos C.

    2013-10-27

    The Integrated Microbial Genomes (IMG) data warehouse integrates genomes from all three domains of life, as well as plasmids, viruses and genome fragments. IMG provides tools for analyzing and reviewing the structural and functional annotations of genomes in a comparative context. IMG’s data content and analytical capabilities have increased continuously since its first version released in 2005. Since the last report published in the 2012 NAR Database Issue, IMG’s annotation and data integration pipelines have evolved while new tools have been added for recording and analyzing single cell genomes, RNA Seq and biosynthetic cluster data. Finally, different IMG datamarts provide support for the analysis of publicly available genomes (IMG/W: http://img.jgi.doe.gov/w), expert review of genome annotations (IMG/ER: http://img.jgi.doe.gov/er) and teaching and training in the area of microbial genome analysis (IMG/EDU: http://img.jgi.doe.gov/edu).

  10. Comparative Transcriptional and Genomic Analysis of Plasmodium falciparum Field Isolates

    PubMed Central

    Mackinnon, Margaret J.; Li, Jinguang; Mok, Sachel; Kortok, Moses M.; Marsh, Kevin; Preiser, Peter R.; Bozdech, Zbynek

    2009-01-01

    Mechanisms for differential regulation of gene expression may underlie much of the phenotypic variation and adaptability of malaria parasites. Here we describe transcriptional variation among culture-adapted field isolates of Plasmodium falciparum, the species responsible for most malarial disease. It was found that genes coding for parasite protein export into the red cell cytosol and onto its surface, and genes coding for sexual stage proteins involved in parasite transmission are up-regulated in field isolates compared with long-term laboratory isolates. Much of this variability was associated with the loss of small or large chromosomal segments, or other forms of gene copy number variation that are prevalent in the P. falciparum genome (copy number variants, CNVs). Expression levels of genes inside these segments were correlated to that of genes outside and adjacent to the segment boundaries, and this association declined with distance from the CNV boundary. This observation could not be explained by copy number variation in these adjacent genes. This suggests a local-acting regulatory role for CNVs in transcription of neighboring genes and helps explain the chromosomal clustering that we observed here. Transcriptional co-regulation of physical clusters of adaptive genes may provide a way for the parasite to readily adapt to its highly heterogeneous and strongly selective environment. PMID:19898609

  11. Sequencing and Comparative Genome Analysis of Two Pathogenic Streptococcus gallolyticus Subspecies: Genome Plasticity, Adaptation and Virulence

    PubMed Central

    Teng, Yu-Ting; Wu, Hui-Lun; Liu, Yen-Ming; Wu, Keh-Ming; Chang, Chuan-Hsiung; Hsu, Ming-Ta

    2011-01-01

    Streptococcus gallolyticus infections in humans are often associated with bacteremia, infective endocarditis and colon cancers. The disease manifestations differ depending on the subspecies of S. gallolyticus causing the infection. Here, we present the complete genomes of S. gallolyticus ATCC 43143 (biotype I) and S. pasteurianus ATCC 43144 (biotype II.2). The genomic differences between the two biotypes were characterized with comparative genomic analyses. The chromosomes of ATCC 43143 and ATCC 43144 are 2.36 and 2.10 Mb in length and encode 2246 and 1869 CDS, respectively. The organization and genomic contents of both genomes were most similar to the recently published S. gallolyticus UCN34, where 2073 (92%) and 1607 (86%) of the ATCC 43143 and ATCC 43144 CDS, respectively, were conserved in UCN34. There are around 600 CDS conserved in all Streptococcus genomes, indicating that the Streptococcus genus has a small core genome (constituting around 30% of total CDS) and substantial evolutionary plasticity. We identified eight and five regions of genome plasticity in ATCC 43143 and ATCC 43144, respectively. Within these regions, several proteins were recognized to contribute to the fitness and virulence of each of the two subspecies. We have also predicted putative cell-surface associated proteins that could play a role in adherence to host tissues, leading to persistent infections causing sub-acute and chronic diseases in humans. This study showed evidence that S. gallolyticus still possesses genes making it suited to a rumen environment, whereas the ability of S. pasteurianus to live in the rumen is reduced. The genome heterogeneity and genetic diversity between the two biotypes, especially in membrane proteins and lipoproteins, most likely contribute to the differences in the pathogenesis of the two S. gallolyticus biotypes and the type of disease an infected patient eventually develops. PMID:21633709

  12. Comparative genomic analysis of novel Acinetobacter symbionts: A combined systems biology and genomics approach

    PubMed Central

    Gupta, Vipin; Haider, Shazia; Sood, Utkarsh; Gilbert, Jack A.; Ramjee, Meenakshi; Forbes, Ken; Singh, Yogendra; Lopes, Bruno S.; Lal, Rup

    2016-01-01

    The increasing trend of antibiotic resistance in Acinetobacter drastically limits the range of therapeutic agents required to treat multidrug resistant (MDR) infections. This study focused on analysis of novel Acinetobacter strains using a genomics and systems biology approach. Here we used a network theory method for pathogenic and non-pathogenic Acinetobacter spp. to identify the key regulatory proteins (hubs) in each strain. We identified nine key regulatory proteins, guaA, guaB, rpsB, rpsI, rpsL, rpsE, rpsC, rplM and trmD, which have functional roles as hubs in a hierarchical scale-free fractal protein-protein interaction network. Two key hubs (guaA and guaB) were important for insect-associated strains, and comparative analysis identified guaA as more important than guaB due to its role in effective module regulation. rpsI played a significant role in all the novel strains, while rplM was unique to sheep-associated strains. rpsM, rpsB and rpsI were involved in the regulation of overall network topology across all Acinetobacter strains analyzed in this study. Future analysis will investigate whether these hubs are useful as drug targets for treating Acinetobacter infections. PMID:27378055

  13. Comparative promoter analysis in vertebrate genomes with the CORG workbench.

    PubMed

    Dieterich, Christoph; Vingron, Martin

    2006-01-01

    CORG is a versatile web-based workbench for comparative promoter analysis in vertebrate model organisms. Two kinds of information are explicitly considered in the automated annotation process. First, local conservation patterns in upstream regions of homologous genes: These phylogenetic footprints are likely to stem from sequence elements that are under selective pressure. The CORG pipeline detects and exploits patterns of local similarity to annotate promoter regions. Second, experimental data on transcription start sites: exon positions and DNA binding site descriptions complete the promoter annotation. These data are made available via an interactive web portal. Individual promoter studies are supported by a JAVA applet that supplies all data down to the nucleotide level.

  14. Comparative Analysis of Alu Repeats in Primate Genomes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Background: Alu repeats are SINEs (short interspersed repetitive elements) which enjoy a successful application in genome evolution, population biology, phylogenetics and forensics. Human Alu consensus sequences were widely used as surrogates in nonhuman primate studies with an assumption that all p...

  15. Ebolavirus comparative genomics

    DOE PAGESBeta

    Jun, Se-Ran; Leuze, Michael R.; Nookaew, Intawat; Uberbacher, Edward C.; Land, Miriam; Zhang, Qian; Wanchai, Visanu; Chai, Juanjuan; Nielsen, Morten; Trolle, Thomas; et al

    2015-07-14

    The 2014 Ebola outbreak in West Africa is the largest documented for this virus. We examine the dynamics of this genome, comparing more than one hundred currently available ebolavirus genomes to each other and to other viral genomes. Based on oligomer frequency analysis, the family Filoviridae forms a distinct group from all other sequenced viral genomes. All filovirus genomes sequenced to date encode proteins with similar functions and gene order, although there is considerable divergence in sequences between the three genera Ebolavirus, Cuevavirus, and Marburgvirus within the family Filoviridae. Whereas all ebolavirus genomes are quite similar (multiple sequences of the same strain are often identical), variation is most common in the intergenic regions and within specific areas of the genes encoding the glycoprotein (GP), nucleoprotein (NP), and polymerase (L). We predict regions that could contain epitope-binding sites, which might be good vaccine targets. In conclusion, this information, combined with glycosylation sites and experimentally determined epitopes, can identify the most promising regions for the development of therapeutic strategies.
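
    The oligomer frequency analysis mentioned in this record compares genomes by how often each short subsequence occurs. The abstract does not state the oligomer length or the distance measure, so the sketch below assumes both (k = 3 and Euclidean distance) and uses invented toy sequences. It illustrates the general idea only, not the authors' method.

```python
import math
from collections import Counter


def kmer_profile(seq, k=3):
    """Relative frequency of each k-mer in `seq`, skipping ambiguous bases."""
    seq = seq.upper()
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= set("ACGT")
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kmer: n / total for kmer, n in counts.items()}


def profile_distance(p, q):
    """Euclidean distance between two k-mer frequency profiles."""
    keys = set(p) | set(q)
    return math.sqrt(sum((p.get(x, 0.0) - q.get(x, 0.0)) ** 2 for x in keys))


# Toy sequences (invented); real input would be whole genome sequences.
genome_a = "ACGTACGTACGTTTGA"
genome_b = "ACGTACGTACGTTTGC"
genome_c = "GGGGCCCCGGGGCCCC"

pa, pb, pc = (kmer_profile(s) for s in (genome_a, genome_b, genome_c))
print(profile_distance(pa, pb), profile_distance(pa, pc))
```

    Genomes with similar composition give small distances, so clustering on such distances can separate one group of genomes from the rest without any alignment.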

  16. Draft genome sequence of Cellulomonas carbonis T26(T) and comparative analysis of six Cellulomonas genomes.

    PubMed

    Zhuang, Weiping; Zhang, Shengzhe; Xia, Xian; Wang, Gejiao

    2015-01-01

    Most Cellulomonas strains are cellulolytic and this feature may be applied in straw degradation and bioremediation. In this study, Cellulomonas carbonis T26(T), Cellulomonas bogoriensis DSM 16987(T) and Cellulomonas cellasea 20108(T) were sequenced. Here we describe the draft genomic information of C. carbonis T26(T) and compare it to the related Cellulomonas genomes. Strain T26(T) has a genome of 3,990,666 bp with a G + C content of 73.4 %, containing 3418 protein-coding genes and 59 RNA genes. The results showed good correlation between the genotypes and the physiological phenotypes. This information is useful for better application of the Cellulomonas strains.

  17. Genomic analysis by oligonucleotide array Comparative Genomic Hybridization utilizing formalin-fixed, paraffin-embedded tissues.

    PubMed

    Savage, Stephanie J; Hostetter, Galen

    2011-01-01

    Formalin fixation has been used to preserve tissues for more than a hundred years, and there are currently more than 300 million archival samples in the United States alone. The application of genomic protocols such as high-density oligonucleotide array Comparative Genomic Hybridization (aCGH) to formalin-fixed, paraffin-embedded (FFPE) tissues, therefore, opens an untapped resource of available tissues for research and facilitates utilization of existing clinical data in a research sample set. However, formalin fixation results in cross-linking of proteins and DNA, typically leading to such a significant degradation of DNA template that little is available for use in molecular applications. Here, we describe a protocol to circumvent formalin fixation artifact by utilizing enzymatic reactions to obtain quality DNA from a wide range of FFPE tissues for successful genome-wide discovery of gene dosage alterations in archival clinical samples.

  18. Comparative genome analysis of Bacillus cereus group genomes with Bacillus subtilis

    SciTech Connect

    Anderson, Iain; Sorokin, Alexei; Kapatral, Vinayak; Reznik, Gary; Bhattacharya, Anamitra; Mikhailova, Natalia; Burd, Henry; Joukov, Victor; Kaznadzey, Denis; Walunas, Theresa; D'Souza, Mark; Larsen, Niels; Pusch, Gordon; Liolios, Konstantinos; Grechkin, Yuri; Lapidus, Alla; Goltsman, Eugene; Chu, Lien; Fonstein, Michael; Ehrlich, S. Dusko; Overbeek, Ross; Kyrpides, Nikos; Ivanova, Natalia

    2005-09-14

    Genome features of the Bacillus cereus group genomes (representative strains of Bacillus cereus, Bacillus anthracis and Bacillus thuringiensis subsp. israelensis) were analyzed and compared with the Bacillus subtilis genome. A core set of 1,381 protein families among the four Bacillus genomes, with an additional set of 933 families common to the B. cereus group, was identified. Differences in signal transduction pathways, membrane transporters, cell surface structures, cell wall, and S-layer proteins, suggesting differences in their phenotype, were identified. The B. cereus group has signal transduction systems including a tyrosine kinase related to two-component system histidine kinases from B. subtilis. A model for regulation of the stress-responsive sigma factor sigmaB in the B. cereus group, different from the well-studied regulation in B. subtilis, has been proposed. Despite a high degree of chromosomal synteny among these genomes, significant differences in cell wall and spore coat proteins that contribute to survival and adaptation in specific hosts have been identified.

  19. Whole genomic DNA sequencing and comparative genomic analysis of Arthrospira platensis: high genome plasticity and genetic diversity

    PubMed Central

    Xu, Teng; Qin, Song; Hu, Yongwu; Song, Zhijian; Ying, Jianchao; Li, Peizhen; Dong, Wei; Zhao, Fangqing; Yang, Huanming; Bao, Qiyu

    2016-01-01

    Arthrospira platensis is a multi-cellular and filamentous non-N2-fixing cyanobacterium that is capable of performing oxygenic photosynthesis. In this study, we determined the nearly complete genome sequence of A. platensis YZ. The A. platensis YZ genome is a single, circular chromosome of 6.62 Mb in size. Phylogenetic and comparative genomic analyses revealed that A. platensis YZ was more closely related to A. platensis NIES-39 than to Arthrospira sp. PCC 8005 and A. platensis C1. Broad gene gains were identified between A. platensis YZ and three other Arthrospira species, some of which have previously been shown to be laterally transferable among different species, such as genes encoding restriction-modification systems. Moreover, unprecedented extensive chromosomal rearrangements among different strains were observed. The chromosomal rearrangements, particularly the chromosomal inversions, were analysed and estimated to be closely related to palindromes that involved long inverted repeat sequences and the extensively distributed type IIR restriction enzyme in the Arthrospira genome. In addition, species from genus Arthrospira unanimously contained the highest rate of repetitive sequence compared with the other species of order Oscillatoriales, suggesting that sequence duplication significantly contributed to Arthrospira genome phylogeny. These results provide in-depth views into the genomic phylogeny and structural variation of A. platensis, as well as a valuable resource for functional genomics studies. PMID:27330141

  20. Whole genomic DNA sequencing and comparative genomic analysis of Arthrospira platensis: high genome plasticity and genetic diversity.

    PubMed

    Xu, Teng; Qin, Song; Hu, Yongwu; Song, Zhijian; Ying, Jianchao; Li, Peizhen; Dong, Wei; Zhao, Fangqing; Yang, Huanming; Bao, Qiyu

    2016-08-01

    Arthrospira platensis is a multi-cellular and filamentous non-N2-fixing cyanobacterium that is capable of performing oxygenic photosynthesis. In this study, we determined the nearly complete genome sequence of A. platensis YZ. The A. platensis YZ genome is a single, circular chromosome of 6.62 Mb in size. Phylogenetic and comparative genomic analyses revealed that A. platensis YZ was more closely related to A. platensis NIES-39 than to Arthrospira sp. PCC 8005 and A. platensis C1. Broad gene gains were identified between A. platensis YZ and three other Arthrospira species, some of which, such as genes coding for restriction-modification systems, have previously been shown to be laterally transferable among different species. Moreover, unprecedented extensive chromosomal rearrangements among different strains were observed. The chromosomal rearrangements, particularly the chromosomal inversions, were analysed and estimated to be closely related to palindromes that involved long inverted repeat sequences and the extensively distributed type IIR restriction enzyme in the Arthrospira genome. In addition, species from genus Arthrospira unanimously contained the highest rate of repetitive sequence compared with the other species of order Oscillatoriales, suggesting that sequence duplication significantly contributed to Arthrospira genome phylogeny. These results provide in-depth views into the genomic phylogeny and structural variation of A. platensis, as well as a valuable resource for functional genomics studies. PMID:27330141

  2. Jam packed genomes--a preliminary, comparative analysis of nucleomorphs.

    PubMed

    Gilson, Paul R; McFadden, Geoffrey I

    2002-05-01

    There are two ways eukaryotic cells can permanently acquire chloroplasts. They can take up a cyanobacterium and turn it into a chloroplast or they can engulf an alga that already has a chloroplast. The second method is far more common and there are at least seven major groups of protists that have obtained their chloroplasts this way. In most cases little remains of the engulfed alga apart from its chloroplast, but in two groups, the cryptomonads and chlorarachniophytes, a small remnant nucleus of the engulfed alga is still present. These tiny nuclei, called nucleomorphs, are the smallest and most compact eukaryotic genomes known, and recently the nucleomorph of the cryptomonad alga Guillardia theta was completely sequenced (551 kilobases). The nucleomorph of the chlorarachniophyte Bigelowiella natans (380 kilobases) is also being sequenced and is about half complete. We discuss some of the similarities and differences that are emerging between these two nucleomorph genomes. Both genomes contain just three chromosomes that encode mainly housekeeping genes and a few proteins for chloroplast functions. The bulk of nucleomorph gene coding capacity, therefore, appears to be devoted to self perpetuation and creating gene and protein expression machineries to make a small number of essential chloroplast proteins. We discuss reasons why both nucleomorphs are extraordinarily compact and why their gene sequences are evolving rapidly.

  3. The Integrated Microbial Genomes (IMG) System: An Expanding Comparative Analysis Resource

    SciTech Connect

    Markowitz, Victor M.; Chen, I-Min A.; Palaniappan, Krishna; Chu, Ken; Szeto, Ernest; Grechkin, Yuri; Ratner, Anna; Anderson, Iain; Lykidis, Athanasios; Mavromatis, Konstantinos; Ivanova, Natalia N.; Kyrpides, Nikos C.

    2009-09-13

    The integrated microbial genomes (IMG) system serves as a community resource for comparative analysis of publicly available genomes in a comprehensive integrated context. IMG contains both draft and complete microbial genomes integrated with other publicly available genomes from all three domains of life, together with a large number of plasmids and viruses. IMG provides tools and viewers for analyzing and reviewing the annotations of genes and genomes in a comparative context. Since its first release in 2005, IMG's data content and analytical capabilities have been constantly expanded through regular releases. Several companion IMG systems have been set up in order to serve domain specific needs, such as expert review of genome annotations. IMG is available at .

  4. Comparative Analysis of Apicomplexa and Genomic Diversity in Eukaryotes

    PubMed Central

    Templeton, Thomas J.; Iyer, Lakshminarayan M.; Anantharaman, Vivek; Enomoto, Shinichiro; Abrahante, Juan E.; Subramanian, G.M.; Hoffman, Stephen L.; Abrahamsen, Mitchell S.; Aravind, L.

    2004-01-01

    The apicomplexans Plasmodium and Cryptosporidium have developed distinctive adaptations via lineage-specific gene loss and gene innovation in the process of diverging from a common parasitic ancestor. The two lineages have acquired distinct but overlapping sets of surface protein adhesion domains typical of animal proteins, but in no case do they share multidomain architectures identical to animals. Cryptosporidium, but not Plasmodium, possesses an animal-type O-linked glycosylation pathway, along with >30 predicted surface proteins having mucin-like segments. The two parasites have notable qualitative differences in conserved protein architectures associated with chromatin dynamics and transcription. Cryptosporidium shows considerable reduction in the number of introns and a concomitant loss of spliceosomal machinery components. We also describe additional molecular characteristics distinguishing Apicomplexa from other eukaryotes for which complete genome sequences are available. PMID:15342554

  5. Comparative Genomic Analysis of Human Fungal Pathogens Causing Paracoccidioidomycosis

    PubMed Central

    Desjardins, Christopher A.; Champion, Mia D.; Holder, Jason W.; Muszewska, Anna; Goldberg, Jonathan; Bailão, Alexandre M.; Brigido, Marcelo Macedo; Ferreira, Márcia Eliana da Silva; Garcia, Ana Maria; Grynberg, Marcin; Gujja, Sharvari; Heiman, David I.; Henn, Matthew R.; Kodira, Chinnappa D.; León-Narváez, Henry; Longo, Larissa V. G.; Ma, Li-Jun; Malavazi, Iran; Matsuo, Alisson L.; Morais, Flavia V.; Pereira, Maristela; Rodríguez-Brito, Sabrina; Sakthikumar, Sharadha; Salem-Izacc, Silvia M.; Sykes, Sean M.; Teixeira, Marcus Melo; Vallejo, Milene C.; Walter, Maria Emília Machado Telles; Yandava, Chandri; Young, Sarah; Zeng, Qiandong; Zucker, Jeremy; Felipe, Maria Sueli; Goldman, Gustavo H.; Haas, Brian J.; McEwen, Juan G.; Nino-Vega, Gustavo; Puccia, Rosana; San-Blas, Gioconda; Soares, Celia Maria de Almeida; Birren, Bruce W.; Cuomo, Christina A.

    2011-01-01

    Paracoccidioides is a fungal pathogen and the cause of paracoccidioidomycosis, a health-threatening human systemic mycosis endemic to Latin America. Infection by Paracoccidioides, a dimorphic fungus in the order Onygenales, is coupled with a thermally regulated transition from a soil-dwelling filamentous form to a yeast-like pathogenic form. To better understand the genetic basis of growth and pathogenicity in Paracoccidioides, we sequenced the genomes of two strains of Paracoccidioides brasiliensis (Pb03 and Pb18) and one strain of Paracoccidioides lutzii (Pb01). These genomes range in size from 29.1 Mb to 32.9 Mb and encode 7,610 to 8,130 genes. To enable genetic studies, we mapped 94% of the P. brasiliensis Pb18 assembly onto five chromosomes. We characterized gene family content across Onygenales and related fungi, and within Paracoccidioides we found expansions of the fungal-specific kinase family FunK1. Additionally, the Onygenales have lost many genes involved in carbohydrate metabolism and fewer genes involved in protein metabolism, resulting in a higher ratio of proteases to carbohydrate active enzymes in the Onygenales than their relatives. To determine if gene content correlated with growth on different substrates, we screened the non-pathogenic onygenale Uncinocarpus reesii, which has orthologs for 91% of Paracoccidioides metabolic genes, for growth on 190 carbon sources. U. reesii showed growth on a limited range of carbohydrates, primarily basic plant sugars and cell wall components; this suggests that Onygenales, including dimorphic fungi, can degrade cellulosic plant material in the soil. In addition, U. reesii grew on gelatin and a wide range of dipeptides and amino acids, indicating a preference for proteinaceous growth substrates over carbohydrates, which may enable these fungi to also degrade animal biomass. These capabilities for degrading plant and animal substrates suggest a duality in lifestyle that could enable pathogenic species of

  6. Comparative genomic analysis of human fungal pathogens causing paracoccidioidomycosis.

    PubMed

    Desjardins, Christopher A; Champion, Mia D; Holder, Jason W; Muszewska, Anna; Goldberg, Jonathan; Bailão, Alexandre M; Brigido, Marcelo Macedo; Ferreira, Márcia Eliana da Silva; Garcia, Ana Maria; Grynberg, Marcin; Gujja, Sharvari; Heiman, David I; Henn, Matthew R; Kodira, Chinnappa D; León-Narváez, Henry; Longo, Larissa V G; Ma, Li-Jun; Malavazi, Iran; Matsuo, Alisson L; Morais, Flavia V; Pereira, Maristela; Rodríguez-Brito, Sabrina; Sakthikumar, Sharadha; Salem-Izacc, Silvia M; Sykes, Sean M; Teixeira, Marcus Melo; Vallejo, Milene C; Walter, Maria Emília Machado Telles; Yandava, Chandri; Young, Sarah; Zeng, Qiandong; Zucker, Jeremy; Felipe, Maria Sueli; Goldman, Gustavo H; Haas, Brian J; McEwen, Juan G; Nino-Vega, Gustavo; Puccia, Rosana; San-Blas, Gioconda; Soares, Celia Maria de Almeida; Birren, Bruce W; Cuomo, Christina A

    2011-10-01

    Paracoccidioides is a fungal pathogen and the cause of paracoccidioidomycosis, a health-threatening human systemic mycosis endemic to Latin America. Infection by Paracoccidioides, a dimorphic fungus in the order Onygenales, is coupled with a thermally regulated transition from a soil-dwelling filamentous form to a yeast-like pathogenic form. To better understand the genetic basis of growth and pathogenicity in Paracoccidioides, we sequenced the genomes of two strains of Paracoccidioides brasiliensis (Pb03 and Pb18) and one strain of Paracoccidioides lutzii (Pb01). These genomes range in size from 29.1 Mb to 32.9 Mb and encode 7,610 to 8,130 genes. To enable genetic studies, we mapped 94% of the P. brasiliensis Pb18 assembly onto five chromosomes. We characterized gene family content across Onygenales and related fungi, and within Paracoccidioides we found expansions of the fungal-specific kinase family FunK1. Additionally, the Onygenales have lost many genes involved in carbohydrate metabolism and fewer genes involved in protein metabolism, resulting in a higher ratio of proteases to carbohydrate active enzymes in the Onygenales than their relatives. To determine if gene content correlated with growth on different substrates, we screened the non-pathogenic onygenale Uncinocarpus reesii, which has orthologs for 91% of Paracoccidioides metabolic genes, for growth on 190 carbon sources. U. reesii showed growth on a limited range of carbohydrates, primarily basic plant sugars and cell wall components; this suggests that Onygenales, including dimorphic fungi, can degrade cellulosic plant material in the soil. In addition, U. reesii grew on gelatin and a wide range of dipeptides and amino acids, indicating a preference for proteinaceous growth substrates over carbohydrates, which may enable these fungi to also degrade animal biomass. These capabilities for degrading plant and animal substrates suggest a duality in lifestyle that could enable pathogenic species of

  7. Comparative analysis of microsatellites in chloroplast genomes of lower and higher plants.

    PubMed

    George, Biju; Bhatt, Bhavin S; Awasthi, Mayur; George, Binu; Singh, Achuit K

    2015-11-01

    Microsatellites, or simple sequence repeats (SSRs), are repetitive DNA sequences in which a motif of one to six base pairs is repeated in tandem a number of times. Chloroplast genome sequences have been shown to possess extensive variations in the length, number and distribution of SSRs. However, a comparative analysis of chloroplast microsatellites is not available. Considering their potential importance in generating genomic diversity, we have systematically analysed the abundance and distribution of simple and compound microsatellites in 164 sequenced chloroplast genomes from a wide range of plants. The key findings of these studies are (1) a large number of mononucleotide repeats as compared to SSR(2-6) (di-, tri-, tetra-, penta-, hexanucleotide repeats) are present in all chloroplast genomes investigated, (2) lower plants such as algae show wide variation in relative abundance, density and distribution of microsatellite repeats as compared to flowering plants, (3) longer SSRs are excluded from coding regions of most chloroplast genomes, (4) GC content has a weak influence on number, relative abundance and relative density of mononucleotide as well as SSR(2-6). However, GC content showed a strong negative correlation with relative density (R² = 0.5, P < 0.05) and relative abundance (R² = 0.6, P < 0.05) of cSSRs. In summary, our comparative studies of chloroplast genomes illustrate the variable distribution of microsatellites and revealed that chloroplast genome of smaller plants possesses relatively more genomic diversity compared to higher plants.
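
    The definition of an SSR used in this abstract (a motif of one to six base pairs repeated in tandem) can be illustrated with a small search routine. The following Python sketch is not the authors' method: the function name find_ssrs and the minimum repeat counts per motif length are assumptions chosen only for illustration, and it detects perfect repeats only, not compound microsatellites (cSSRs).

```python
import re

def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, repeat_count) tuples for perfect tandem
    repeats whose motif is 1 to 6 bp long."""
    if min_repeats is None:
        # Hypothetical thresholds per motif length; the abstract does not state any.
        min_repeats = {1: 8, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}
    seq = seq.upper()
    hits = []
    for k, n in min_repeats.items():
        # A k-bp motif followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves built from a shorter unit
            # (e.g. "AA" or "ATAT"), which are reported at the shorter length.
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)

# Toy sequence: a run of ten A followed by six copies of AT.
example = "GGC" + "A" * 10 + "CG" + "AT" * 6 + "GCC"
print(find_ssrs(example))
```

    Relative abundance and relative density, as used in the abstract, would then be derived from such hits by normalising counts and total repeat length by sequence length; that step is left out here because the exact normalisation is not given in the text.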

  8. Parallel WGA and WTA for Comparative Genome and Transcriptome NGS Analysis Using Tiny Cell Numbers.

    PubMed

    Korfhage, Christian; Fricke, Evelyn; Meier, Andreas

    2015-07-01

    Genomic DNA determines how and when the transcriptome is changed by a trigger or environmental change and how cellular metabolism is influenced. Comparative genome and transcriptome analysis of the same cell sample links a defined genome with all changes in the bases, structure, or numbers of the transcriptome. However, comparative genome and transcriptome analysis using next-generation sequencing (NGS) or real-time PCR is often limited by the small amount of sample available. In mammals, the amount of DNA and RNA in a single cell is ∼10 picograms, but deep analysis of the genome and transcriptome currently requires several hundred nanograms of nucleic acids for library preparation for NGS sequencing. Consequently, accurate whole-genome amplification (WGA) and whole-transcriptome amplification (WTA) is required for such quantitative analysis. This unit describes how the genome and the transcriptome of a tiny number of cells can be amplified in a highly parallel and comparable process. Protocols for quality control of amplified DNA and application of amplified DNA for NGS are included.

  9. Comparative Genome Analysis Provides Insights into the Pathogenicity of Flavobacterium psychrophilum.

    PubMed

    Castillo, Daniel; Christiansen, Rói Hammershaimb; Dalsgaard, Inger; Madsen, Lone; Espejo, Romilio; Middelboe, Mathias

    2016-01-01

    Flavobacterium psychrophilum is a fish pathogen in salmonid aquaculture worldwide that causes cold water disease (CWD) and rainbow trout fry syndrome (RTFS). Comparative genome analyses of 11 F. psychrophilum isolates representing temporally and geographically distant populations were used to describe the F. psychrophilum pan-genome and to examine virulence factors, prophages, CRISPR arrays, and genomic islands present in the genomes. Analysis of the genomic DNA sequences was complemented with selected phenotypic characteristics of the strains. The pan genome analysis showed that F. psychrophilum could hold at least 3373 genes, while the core genome contained 1743 genes. On average, 67 new genes were detected for every new genome added to the analysis, indicating that F. psychrophilum possesses an open pan genome. The putative virulence factors were equally distributed among isolates, independent of geographic location, year of isolation and source of isolates. Only one prophage-related sequence was found which corresponded to the previously described prophage 6H, and appeared in 5 out of 11 isolates. CRISPR array analysis revealed two different loci with dissimilar spacer content, which only matched one sequence in the database, the temperate bacteriophage 6H. Genomic Islands (GIs) were identified in F. psychrophilum isolates 950106-1/1 and CSF 259-93, associated with toxins and antibiotic resistance. Finally, phenotypic characterization revealed a high degree of similarity among the strains with respect to biofilm formation and secretion of extracellular enzymes. Global scale dispersion of virulence factors in the genomes and the abilities for biofilm formation, hemolytic activity and secretion of extracellular enzymes among the strains suggested that F. psychrophilum isolates have a similar mode of action on adhesion, colonization and destruction of fish tissues across large spatial and temporal scales of occurrence. Overall, the genomic characterization and

  10. Comparative Genome Analysis Provides Insights into the Pathogenicity of Flavobacterium psychrophilum

    PubMed Central

    Castillo, Daniel; Christiansen, Rói Hammershaimb; Dalsgaard, Inger; Madsen, Lone; Espejo, Romilio

    2016-01-01

    Flavobacterium psychrophilum is a fish pathogen in salmonid aquaculture worldwide that causes cold water disease (CWD) and rainbow trout fry syndrome (RTFS). Comparative genome analyses of 11 F. psychrophilum isolates representing temporally and geographically distant populations were used to describe the F. psychrophilum pan-genome and to examine virulence factors, prophages, CRISPR arrays, and genomic islands present in the genomes. Analysis of the genomic DNA sequences was complemented with selected phenotypic characteristics of the strains. The pan genome analysis showed that F. psychrophilum could hold at least 3373 genes, while the core genome contained 1743 genes. On average, 67 new genes were detected for every new genome added to the analysis, indicating that F. psychrophilum possesses an open pan genome. The putative virulence factors were equally distributed among isolates, independent of geographic location, year of isolation and source of isolates. Only one prophage-related sequence was found which corresponded to the previously described prophage 6H, and appeared in 5 out of 11 isolates. CRISPR array analysis revealed two different loci with dissimilar spacer content, which only matched one sequence in the database, the temperate bacteriophage 6H. Genomic Islands (GIs) were identified in F. psychrophilum isolates 950106-1/1 and CSF 259–93, associated with toxins and antibiotic resistance. Finally, phenotypic characterization revealed a high degree of similarity among the strains with respect to biofilm formation and secretion of extracellular enzymes. Global scale dispersion of virulence factors in the genomes and the abilities for biofilm formation, hemolytic activity and secretion of extracellular enzymes among the strains suggested that F. psychrophilum isolates have a similar mode of action on adhesion, colonization and destruction of fish tissues across large spatial and temporal scales of occurrence. Overall, the genomic characterization and
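
    The pan-genome, core-genome and "new genes per added genome" figures quoted in this abstract follow from simple set operations once every isolate has been reduced to a set of gene-family identifiers. The Python sketch below illustrates that bookkeeping under stated assumptions: the function names, isolate labels and gene identifiers are hypothetical toy values, and the clustering of genes into families (the hard part of a real analysis) is assumed to have been done already.

```python
def pan_and_core(genomes):
    """genomes: dict mapping isolate name -> set of gene-family identifiers.
    Returns (pan_genome, core_genome) as sets."""
    gene_sets = list(genomes.values())
    pan = set().union(*gene_sets)          # families seen in at least one isolate
    core = set.intersection(*gene_sets)    # families shared by every isolate
    return pan, core

def new_genes_per_genome(genomes, order):
    """Count previously unseen gene families contributed by each genome
    when genomes are added in the given order."""
    seen = set()
    counts = []
    for name in order:
        counts.append(len(genomes[name] - seen))
        seen |= genomes[name]
    return counts

# Toy input with hypothetical isolate names and gene-family labels.
toy = {
    "iso1": {"a", "b", "c"},
    "iso2": {"a", "b", "d"},
    "iso3": {"a", "c", "e", "f"},
}
pan, core = pan_and_core(toy)
print(len(pan), len(core))
print(new_genes_per_genome(toy, ["iso1", "iso2", "iso3"]))
```

    A pan genome is usually called open when the count of new families per added genome does not fall towards zero; because that count depends on the order in which genomes are added, real analyses typically average over many random orderings, which this sketch does not do.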

  12. Genome Mapping in Plant Comparative Genomics.

    PubMed

    Chaney, Lindsay; Sharp, Aaron R; Evans, Carrie R; Udall, Joshua A

    2016-09-01

    Genome mapping produces fingerprints of DNA sequences to construct a physical map of the whole genome. It provides contiguous, long-range information that complements and, in some cases, replaces sequencing data. Recent advances in genome-mapping technology will better allow researchers to detect large (>1kbp) structural variations between plant genomes. Some molecular and informatics complications need to be overcome for this novel technology to achieve its full utility. This technology will be useful for understanding phenotype responses due to DNA rearrangements and will yield insights into genome evolution, particularly in polyploids. In this review, we outline recent advances in genome-mapping technology, including the processes required for data collection and analysis, and applications in plant comparative genomics.

  13. Investigating hookworm genomes by comparative analysis of two Ancylostoma species

    PubMed Central

    Mitreva, Makedonka; McCarter, James P; Arasu, Prema; Hawdon, John; Martin, John; Dante, Mike; Wylie, Todd; Xu, Jian; Stajich, Jason E; Kapulkin, Wadim; Clifton, Sandra W; Waterston, Robert H; Wilson, Richard K

    2005-01-01

    Background: Hookworms, infecting over one billion people, are the most closely related major human parasites to the model nematode Caenorhabditis elegans. Applying genomics techniques to these species, we analyzed 3,840 and 3,149 genes from Ancylostoma caninum and A. ceylanicum. Results: Transcripts originated from libraries representing infective L3 larva, stimulated L3, arrested L3, and adults. Most genes are represented in single stages including abundant transcripts like hsp-20 in infective L3 and vit-3 in adults. Over 80% of the genes have homologs in C. elegans, and nearly 30% of these had observable RNA interference phenotypes. Homologies were identified to nematode-specific and clade V specific gene families. To study the evolution of hookworm genes, 574 A. caninum / A. ceylanicum orthologs were identified, all of which were found to be under purifying selection with distribution ratios of nonsynonymous to synonymous amino acid substitutions similar to that reported for C. elegans / C. briggsae orthologs. The phylogenetic distance between A. caninum and A. ceylanicum is almost identical to that for C. elegans / C. briggsae. Conclusion: The genes discovered should substantially accelerate research toward better understanding of the parasites' basic biology as well as new therapies including vaccines and novel anthelmintics. PMID:15854223

  14. Ebolavirus comparative genomics

    PubMed Central

    Jun, Se-Ran; Leuze, Michael R.; Nookaew, Intawat; Uberbacher, Edward C.; Land, Miriam; Zhang, Qian; Wanchai, Visanu; Chai, Juanjuan; Nielsen, Morten; Trolle, Thomas; Lund, Ole; Buzard, Gregory S.; Pedersen, Thomas D.; Wassenaar, Trudy M.; Ussery, David W.

    2015-01-01

    The 2014 Ebola outbreak in West Africa is the largest documented for this virus. To examine the dynamics of this genome, we compare more than 100 currently available ebolavirus genomes to each other and to other viral genomes. Based on oligomer frequency analysis, the family Filoviridae forms a distinct group from all other sequenced viral genomes. All filovirus genomes sequenced to date encode proteins with similar functions and gene order, although there is considerable divergence in sequences between the three genera Ebolavirus, Cuevavirus and Marburgvirus within the family Filoviridae. Whereas all ebolavirus genomes are quite similar (multiple sequences of the same strain are often identical), variation is most common in the intergenic regions and within specific areas of the genes encoding the glycoprotein (GP), nucleoprotein (NP) and polymerase (L). We predict regions that could contain epitope-binding sites, which might be good vaccine targets. This information, combined with glycosylation sites and experimentally determined epitopes, can identify the most promising regions for the development of therapeutic strategies. PMID:26175035
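
    Oligomer frequency analysis, mentioned in this abstract, compares genomes by how often each short word (k-mer) occurs rather than by alignment. The abstract does not give the oligomer length or the distance measure, so the Python sketch below assumes k = 3 and Euclidean distance purely for illustration; the function names kmer_profile and profile_distance are hypothetical.

```python
from collections import Counter
from itertools import product
import math

def kmer_profile(seq, k=3):
    """Relative frequency of every k-mer over A, C, G, T, in a fixed
    lexicographic order. Windows containing other symbols are ignored."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[kmer] for kmer in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[kmer] / total for kmer in kmers]

def profile_distance(p, q):
    """Euclidean distance between two k-mer frequency profiles
    (an assumed metric, not one named in the abstract)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Toy sequences: two similar fragments and one with very different composition.
a = kmer_profile("ATGGATAAGCGGATTATTGG")
b = kmer_profile("ATGGATAAGCGGATCATTGG")
c = kmer_profile("CCCCGGGGCCCCGGGGCCCC")
print(profile_distance(a, b) < profile_distance(a, c))
```

    Grouping genomes, as in the statement that Filoviridae forms a distinct group, would then amount to clustering such profiles; the clustering step is not sketched here because the abstract gives no detail on it.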

  15. Comparative genomic analysis as a tool for biological discovery

    SciTech Connect

    Nobrega, Marcelo A.; Pennacchio, Len A.

    2003-03-30

    Biology is a discipline rooted in comparisons. Comparative physiology has assembled a detailed catalogue of the biological similarities and differences between species, revealing insights into how life has adapted to fill a wide range of environmental niches. For example, the oxygen and carbon dioxide carrying capacity of vertebrates has evolved to provide strong advantages for species respiring at sea level, at high elevation or within water. Comparative anatomy, biochemistry, pharmacology, immunology and cell biology have provided the fundamental paradigms from which each discipline has grown.

  16. Comparative genomic analysis of the Tribolium immune system

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The red flour beetle Tribolium castaneum has contributed a wealth of knowledge on insect development but limited information about innate immunity. With its complete nucleotide sequence determined, we have taken the opportunity to annotate immunity-related genes and compare them with homologous mole...

  17. Comparative genome analysis of the oleaginous yeast Trichosporon fermentans reveals its potential applications in lipid accumulation.

    PubMed

    Shen, Qi; Chen, Yue; Jin, Danfeng; Lin, Hui; Wang, Qun; Zhao, Yu-Hua

    2016-11-01

    In this work, Trichosporon fermentans CICC 1368, which has been shown to accumulate cellular lipids efficiently using industry-agricultural wastes, was subjected to preliminary genome analysis, yielding a genome size of 31.3 million bases and 12,702 predicted protein-coding genes. Our analysis also showed a high degree of gene duplications and unique genes compared with those observed in other oleaginous yeasts, with 3-4-fold more genes related to fatty acid elongation and degradation compared with those in Rhodosporidium toruloides NP11 and Yarrowia lipolytica CLIB122. Phylogenetic analysis with other oleaginous microbes suggested that the lipogenic capacity of T. fermentans was obtained during evolution after the divergence of genera. Thus, our study provided the first draft genome and comparative analysis of T. fermentans, laying the foundation for its genetic improvement to facilitate cost-effective lipid production. PMID:27664738

  19. Genome Based Phylogeny and Comparative Genomic Analysis of Intra-Mammary Pathogenic Escherichia coli

    PubMed Central

    Richards, Vincent P.; Lefébure, Tristan; Pavinski Bitar, Paulina D.; Dogan, Belgin; Simpson, Kenneth W.; Schukken, Ynte H.; Stanhope, Michael J.

    2015-01-01

    Escherichia coli is an important cause of bovine mastitis and can cause both severe inflammation with a short-term transient infection, as well as less severe, but more chronic inflammation and infection persistence. E. coli is a highly diverse organism that has been classified into a number of different pathotypes or pathovars, and mammary pathogenic E. coli (MPEC) has been proposed as a new such pathotype. The purpose of this study was to use genome sequence data derived from both transient and persistent MPEC isolates (two isolates of each phenotype) to construct a genome-based phylogeny that places MPEC in its phylogenetic context with other E. coli pathovars. A subsidiary goal was to conduct comparative genomic analyses of these MPEC isolates with other E. coli pathovars to provide a preliminary perspective on loci that might be correlated with the MPEC phenotype. Both concatenated and consensus tree phylogenies did not support MPEC monophyly or the monophyly of either transient or persistent phenotypes. Three of the MPEC isolates (ECA-727, ECC-Z, and ECA-O157) originated from within the predominately commensal clade of E. coli, referred to as phylogroup A. The fourth MPEC isolate, of the persistent phenotype (ECC-1470), was sister group to an isolate of ETEC, falling within the E. coli B1 clade. This suggests that the MPEC phenotype has arisen on numerous independent occasions and that this has often, although not invariably, occurred from commensal ancestry. Examination of the genes present in the MPEC strains relative to the commensal strains identified a consistent presence of the type VI secretion system (T6SS) in the MPEC strains, with only occasional representation in commensal strains, suggesting that T6SS may be associated with MPEC pathogenesis and/or as an inter-bacterial competitive attribute and therefore could represent a useful target to explore for the development of MPEC specific inhibitors. PMID:25807497

  20. Comparative genomic analysis of esophageal adenocarcinoma and squamous cell carcinoma.

    PubMed

    Agrawal, Nishant; Jiao, Yuchen; Bettegowda, Chetan; Hutfless, Susan M; Wang, Yuxuan; David, Stefan; Cheng, Yulan; Twaddell, William S; Latt, Nyan L; Shin, Eun J; Wang, Li-Dong; Wang, Liang; Yang, Wancai; Velculescu, Victor E; Vogelstein, Bert; Papadopoulos, Nickolas; Kinzler, Kenneth W; Meltzer, Stephen J

    2012-10-01

    Esophageal cancer ranks sixth in cancer death. To explore its genetic origins, we conducted exomic sequencing on 11 esophageal adenocarcinomas (EAC) and 12 esophageal squamous cell carcinomas (ESCC) from the United States. Interestingly, inactivating mutations of NOTCH1 were identified in 21% of ESCCs but not in EACs. There was a substantial disparity in the spectrum of mutations, with more indels in ESCCs, A:T>C:G transversions in EACs, and C:G>G:C transversions in ESCCs (P < 0.0001). Notably, NOTCH1 mutations were more frequent in North American ESCCs (11 of 53 cases) than in ESCCs from China (1 of 48 cases). A parallel analysis found that most mutations in EACs were already present in matched Barrett esophagus. These discoveries highlight key genetic differences between EACs and ESCCs and between American and Chinese ESCCs, and suggest that NOTCH1 is a tumor suppressor gene in the esophagus. Finally, we provide a genetic basis for the evolution of EACs from Barrett esophagus.
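
    The abstract does not say how the NOTCH1 frequency comparison (11 of 53 North American ESCCs versus 1 of 48 Chinese ESCCs) was tested; the quoted P < 0.0001 refers to the mutation spectrum. As a minimal sketch, assuming a two-sided Fisher's exact test is an acceptable way to compare two such proportions, the counts can be checked as follows. The function name and the choice of test are illustrative assumptions, not the authors' method.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Illustrative helper (not from the cited study): sums the hypergeometric
    probabilities of every table that is no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Probability that k of the col1 "mutated" cases fall in row 1
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# Counts taken from the abstract: mutated vs. non-mutated cases per cohort
p_value = fisher_exact_two_sided(11, 53 - 11, 1, 48 - 1)
print(f"two-sided p = {p_value:.4f}")
```

    An exact test is used here rather than a chi-square approximation because one cell of the table contains a single case.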

  1. Comparative Genomics Analysis of Streptomyces Species Reveals Their Adaptation to the Marine Environment and Their Diversity at the Genomic Level

    PubMed Central

    Tian, Xinpeng; Zhang, Zhewen; Yang, Tingting; Chen, Meili; Li, Jie; Chen, Fei; Yang, Jin; Li, Wenjie; Zhang, Bing; Zhang, Zhang; Wu, Jiayan; Zhang, Changsheng; Long, Lijuan; Xiao, Jingfa

    2016-01-01

    Over 200 genomes of streptomycete strains that were isolated from various environments are available from the NCBI. However, little is known about the characteristics that are linked to marine adaptation in marine-derived streptomycetes. The particularity and complexity of the marine environment suggest that marine streptomycetes are genetically diverse. Here, we sequenced nine strains from the Streptomyces genus that were isolated from different longitudes, latitudes, and depths of the South China Sea. Then we compared these strains to 22 NCBI downloaded streptomycete strains. Thirty-one streptomycete strains are clearly grouped into a marine-derived subgroup and multiple source subgroup-based phylogenetic tree. The phylogenetic analyses have revealed the dynamic process underlying streptomycete genome evolution, and lateral gene transfer is an important driving force during the process. Pan-genomics analyses have revealed that streptomycetes have an open pan-genome, which reflects the diversity of these streptomycetes and guarantees the species a quick and economical response to diverse environments. Functional and comparative genomics analyses indicate that the marine-derived streptomycetes subgroup possesses some common characteristics of marine adaptation. Our findings have expanded our knowledge of how ocean isolates of streptomycete strains adapt to marine environments. The availability of streptomycete genomes from the South China Sea will be beneficial for further analysis on marine streptomycetes and will enrich the South China Sea’s genetic data sources. PMID:27446038
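
    The notion of an "open" pan-genome in this abstract can be illustrated with a small sketch: the pan-genome is the union of gene families across strains, the core genome is their intersection, and a pan-genome is described as open when its size keeps growing as genomes are added. The strain names and gene identifiers below are hypothetical toy data, not results from the study, and a real analysis would first cluster genes into orthologous families.

```python
def pan_and_core(gene_sets):
    """Return (pan-genome, core genome) for a mapping of strain -> gene set."""
    genomes = list(gene_sets.values())
    pan = set().union(*genomes)          # genes found in at least one strain
    core = set.intersection(*genomes)    # genes found in every strain
    return pan, core


def pan_genome_growth(gene_sets):
    """Cumulative pan-genome size as strains are added in insertion order."""
    seen, sizes = set(), []
    for genes in gene_sets.values():
        seen |= genes
        sizes.append(len(seen))
    return sizes


# Hypothetical presence/absence data for three strains
strains = {
    "strain_A": {"g1", "g2", "g3"},
    "strain_B": {"g1", "g2", "g4"},
    "strain_C": {"g1", "g5"},
}

pan, core = pan_and_core(strains)
accessory = pan - core                   # genes missing from at least one strain
growth = pan_genome_growth(strains)
print(len(pan), len(core), len(accessory), growth)
```

    In this toy case every added strain contributes a new gene, which is the pattern an open pan-genome shows at scale.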

  2. Comparative Genomics Analysis of Streptomyces Species Reveals Their Adaptation to the Marine Environment and Their Diversity at the Genomic Level.

    PubMed

    Tian, Xinpeng; Zhang, Zhewen; Yang, Tingting; Chen, Meili; Li, Jie; Chen, Fei; Yang, Jin; Li, Wenjie; Zhang, Bing; Zhang, Zhang; Wu, Jiayan; Zhang, Changsheng; Long, Lijuan; Xiao, Jingfa

    2016-01-01

    Over 200 genomes of streptomycete strains that were isolated from various environments are available from the NCBI. However, little is known about the characteristics that are linked to marine adaptation in marine-derived streptomycetes. The particularity and complexity of the marine environment suggest that marine streptomycetes are genetically diverse. Here, we sequenced nine strains from the Streptomyces genus that were isolated from different longitudes, latitudes, and depths of the South China Sea. Then we compared these strains to 22 NCBI downloaded streptomycete strains. Thirty-one streptomycete strains are clearly grouped into a marine-derived subgroup and multiple source subgroup-based phylogenetic tree. The phylogenetic analyses have revealed the dynamic process underlying streptomycete genome evolution, and lateral gene transfer is an important driving force during the process. Pan-genomics analyses have revealed that streptomycetes have an open pan-genome, which reflects the diversity of these streptomycetes and guarantees the species a quick and economical response to diverse environments. Functional and comparative genomics analyses indicate that the marine-derived streptomycetes subgroup possesses some common characteristics of marine adaptation. Our findings have expanded our knowledge of how ocean isolates of streptomycete strains adapt to marine environments. The availability of streptomycete genomes from the South China Sea will be beneficial for further analysis on marine streptomycetes and will enrich the South China Sea's genetic data sources. PMID:27446038

  3. Comparative Genomics Analysis of Streptomyces Species Reveals Their Adaptation to the Marine Environment and Their Diversity at the Genomic Level.

    PubMed

    Tian, Xinpeng; Zhang, Zhewen; Yang, Tingting; Chen, Meili; Li, Jie; Chen, Fei; Yang, Jin; Li, Wenjie; Zhang, Bing; Zhang, Zhang; Wu, Jiayan; Zhang, Changsheng; Long, Lijuan; Xiao, Jingfa

    2016-01-01

    Over 200 genomes of streptomycete strains that were isolated from various environments are available from the NCBI. However, little is known about the characteristics that are linked to marine adaptation in marine-derived streptomycetes. The particularity and complexity of the marine environment suggest that marine streptomycetes are genetically diverse. Here, we sequenced nine strains from the Streptomyces genus that were isolated from different longitudes, latitudes, and depths of the South China Sea. Then we compared these strains to 22 NCBI downloaded streptomycete strains. Thirty-one streptomycete strains are clearly grouped into a marine-derived subgroup and multiple source subgroup-based phylogenetic tree. The phylogenetic analyses have revealed the dynamic process underlying streptomycete genome evolution, and lateral gene transfer is an important driving force during the process. Pan-genomics analyses have revealed that streptomycetes have an open pan-genome, which reflects the diversity of these streptomycetes and guarantees the species a quick and economical response to diverse environments. Functional and comparative genomics analyses indicate that the marine-derived streptomycetes subgroup possesses some common characteristics of marine adaptation. Our findings have expanded our knowledge of how ocean isolates of streptomycete strains adapt to marine environments. The availability of streptomycete genomes from the South China Sea will be beneficial for further analysis on marine streptomycetes and will enrich the South China Sea's genetic data sources.

  4. Comparative genomics of Brassicaceae crops.

    PubMed

    Sharma, Ashutosh; Li, Xiaonan; Lim, Yong Pyo

    2014-05-01

    The family Brassicaceae is one of the major groups of the plant kingdom and comprises diverse species of great economic, agronomic and scientific importance, including the model plant Arabidopsis. The sequencing of the Arabidopsis genome has revolutionized our knowledge in the field of plant biology and provides a foundation in genomics and comparative biology. Genomic resources have been utilized in Brassica for diversity analyses, construction of genetic maps and identification of agronomic traits. In Brassicaceae, comparative sequence analysis across the species has been utilized to understand genome structure, evolution and the detection of conserved genomic segments. In this review, we focus on the progress made in genetic resource development, genome sequencing and comparative mapping in Brassica and related species. The utilization of genomic resources and next-generation sequencing approaches in improvement of Brassica crops is also discussed. PMID:24987286

  5. Comparative analysis of Salmonella genomes identifies a metabolic network for escalating growth in the inflamed gut.

    PubMed

    Nuccio, Sean-Paul; Bäumler, Andreas J

    2014-03-18

    The Salmonella genus comprises a group of pathogens associated with illnesses ranging from gastroenteritis to typhoid fever. We performed an in silico analysis of comparatively reannotated Salmonella genomes to identify genomic signatures indicative of disease potential. By removing numerous annotation inconsistencies and inaccuracies, the process of reannotation identified a network of 469 genes involved in central anaerobic metabolism, which was intact in genomes of gastrointestinal pathogens but degrading in genomes of extraintestinal pathogens. This large network contained pathways that enable gastrointestinal pathogens to utilize inflammation-derived nutrients as well as many of the biochemical reactions used for the enrichment and biochemical discrimination of Salmonella serovars. Thus, comparative genome analysis identifies a metabolic network that provides clues about the strategies for nutrient acquisition and utilization that are characteristic of gastrointestinal pathogens. IMPORTANCE: While some Salmonella serovars cause infections that remain localized to the gut, others disseminate throughout the body. Here, we compared Salmonella genomes to identify characteristics that distinguish gastrointestinal from extraintestinal pathogens. We identified a large metabolic network that is functional in gastrointestinal pathogens but decaying in extraintestinal pathogens. While taxonomists have used traits from this network empirically for many decades for the enrichment and biochemical discrimination of Salmonella serovars, our findings suggest that it is part of a "business plan" for growth in the inflamed gastrointestinal tract. By identifying a large metabolic network characteristic of Salmonella serovars associated with gastroenteritis, our in silico analysis provides a blueprint for potential strategies to utilize inflammation-derived nutrients and edge out competing gut microbes.

  6. Uterine smooth muscle tumor analysis by comparative genomic hybridization: a useful diagnostic tool in challenging lesions.

    PubMed

    Croce, Sabrina; Ribeiro, Agnes; Brulard, Celine; Noel, Jean-Christophe; Amant, Frederic; Stoeckle, Eberhard; Devouassoux-Shisheborah, Mojgan; Floquet, Anne; Arnould, Laurent; Guyon, Frederic; Mishellany, Florence; Garbay, Delphine; Cuppens, Tine; Zikan, Michal; Leroux, Agnès; Frouin, Eric; Duvillard, Pierre; Terrier, Philippe; Farre, Isabelle; Valo, Isabelle; MacGrogan, Gaetan M; Chibon, Frederic

    2015-07-01

    The diagnosis and management of uterine smooth muscle tumors with uncertain malignant potential (STUMP) is often challenging, and genomic data on these lesions as well as on uterine smooth muscle lesions are limited. We tested the hypothesis that genomic profile determination by array-CGH could split STUMP into a benign group with scarce chromosomal alterations akin to leiomyoma and a malignant group with high chromosomal instability akin to leiomyosarcoma. Array-CGH genomic profile analysis was conducted for a series of 29 cases of uterine STUMP. A group of ten uterine leiomyomas and ten uterine leiomyosarcomas served as controls. The mean age was 50 years (range, 24-85) and the follow-up ranged from 12 to 156 months (average 70 months). Since STUMP is a heterogenous group of tumors with genomic profiles that can harbor few to many chromosomal alterations, we compared genomic indices in leiomyomas and leiomyosarcomas and set a genomic index=10 threshold. Tumors with a genomic index <10 were classified as nonrecurring STUMPs and those with a genomic index >10 represented STUMPs with recurrences and unfavorable outcomes. Hence, the genomic index threshold splits the STUMP category into two groups of tumors with different outcomes: a group comparable to leiomyomas and another similar to leiomyosarcomas, but more indolent. In our STUMP series, genomic analysis by array-CGH is an innovative diagnostic tool for problematic smooth muscle uterine lesions, complementary to the morphological evaluation approach. We provide an improved classification method for distinguishing truly malignant tumors from benign lesions within the category of STUMP, especially those with equivocal morphological features.

  7. Sequence Search and Comparative Genomic Analysis of SUMO-Activating Enzymes Using CoGe.

    PubMed

    Carretero-Paulet, Lorenzo; Albert, Victor A

    2016-01-01

    The growing number of genome sequences completed during the last few years has made necessary the development of bioinformatics tools for the easy access and retrieval of sequence data, as well as for downstream comparative genomic analyses. Some of these are implemented as online platforms that integrate genomic data produced by different genome sequencing initiatives with data mining tools as well as various comparative genomic and evolutionary analysis possibilities. Here, we use the online comparative genomics platform CoGe (http://www.genomevolution.org/coge/) (Lyons and Freeling. Plant J 53:661-673, 2008; Tang and Lyons. Front Plant Sci 3:172, 2012) (1) to retrieve the entire complement of orthologous and paralogous genes belonging to the SUMO-Activating Enzymes 1 (SAE1) gene family from a set of species representative of the Brassicaceae plant eudicot family with genomes fully sequenced, and (2) to investigate the history, timing, and molecular mechanisms of the gene duplications driving the evolutionary expansion and functional diversification of the SAE1 family in Brassicaceae. PMID:27424761

  8. Genetic Characterization and Comparative Genome Analysis of Brucella melitensis Isolates from India

    PubMed Central

    Azam, Sarwar; Rao, Sashi Bhushan; Jakka, Padmaja; NarasimhaRao, Veera; Bhargavi, Bindu; Gupta, Vivek Kumar

    2016-01-01

    Brucellosis is the most frequent zoonotic disease worldwide, with over 500,000 new human infections every year. Brucella melitensis, the most virulent species in humans, primarily affects goats and the zoonotic transmission occurs by ingestion of unpasteurized milk products or through direct contact with fetal tissues. Brucellosis is endemic in India but no information is available on population structure and genetic diversity of Brucella spp. in India. We performed multilocus sequence typing of four B. melitensis strains isolated from naturally infected goats from India. For more detailed genetic characterization, we carried out whole genome sequencing and comparative genome analysis of one of the B. melitensis isolates, Bm IND1. Genome analysis identified 141 unique SNPs, 78 VNTRs, 51 Indels, and 2 putative prophage integrations in the Bm IND1 genome. Our data may help to develop improved epidemiological typing tools and efficient preventive strategies to control brucellosis. PMID:27525259

  9. Genetic Characterization and Comparative Genome Analysis of Brucella melitensis Isolates from India.

    PubMed

    Azam, Sarwar; Rao, Sashi Bhushan; Jakka, Padmaja; NarasimhaRao, Veera; Bhargavi, Bindu; Gupta, Vivek Kumar; Radhakrishnan, Girish

    2016-01-01

    Brucellosis is the most frequent zoonotic disease worldwide, with over 500,000 new human infections every year. Brucella melitensis, the most virulent species in humans, primarily affects goats and the zoonotic transmission occurs by ingestion of unpasteurized milk products or through direct contact with fetal tissues. Brucellosis is endemic in India but no information is available on population structure and genetic diversity of Brucella spp. in India. We performed multilocus sequence typing of four B. melitensis strains isolated from naturally infected goats from India. For more detailed genetic characterization, we carried out whole genome sequencing and comparative genome analysis of one of the B. melitensis isolates, Bm IND1. Genome analysis identified 141 unique SNPs, 78 VNTRs, 51 Indels, and 2 putative prophage integrations in the Bm IND1 genome. Our data may help to develop improved epidemiological typing tools and efficient preventive strategies to control brucellosis. PMID:27525259

  10. Comparative Genomics Analysis of Rice and Pineapple Contributes to Understand the Chromosome Number Reduction and Genomic Changes in Grasses

    PubMed Central

    Wang, Jinpeng; Yu, Jiaxiang; Sun, Pengchuan; Li, Yuxian; Xia, Ruiyan; Liu, Yinzhe; Ma, Xuelian; Yu, Jigao; Yang, Nanshan; Lei, Tianyu; Wang, Zhenyi; Wang, Li; Ge, Weina; Song, Xiaoming; Liu, Xiaojian; Sun, Sangrong; Liu, Tao; Jin, Dianchuan; Pan, Yuxin; Wang, Xiyin

    2016-01-01

    Rice is one of the most researched model plants, and its genome structure most closely resembles that of the grass common ancestor after a grass-common tetraploidization ∼100 million years ago. There has been a long-standing controversy over whether there were five or seven basic chromosomes before the tetraploidization; it has been tackled but not resolved, for lack of a sequenced and assembled outgroup plant with a conservative genome structure. Recently, the availability of the pineapple genome, which has not been subjected to the grass-common tetraploidization, has provided a precious opportunity to resolve this controversy and to investigate genome changes in rice and other grasses. Here, we performed a comparative genomics analysis of pineapple and rice, and found solid evidence that the grass common ancestor had 2n = 2x = 14 basic chromosomes before the tetraploidization and duplicated to 2n = 4x = 28 after the event. Moreover, we proposed that the enormous number of genes missing from duplicated regions in rice should be explained by an allotetraploid produced by prominently divergent parental lines, rather than by gene losses after their divergence. This means that genome fractionation might have occurred before the formation of the allotetraploid grass ancestor. PMID:27757123

  11. Comparative analysis of plastid genomes of non-photosynthetic Ericaceae and their photosynthetic relatives.

    PubMed

    Logacheva, Maria D; Schelkunov, Mikhail I; Shtratnikova, Victoria Y; Matveeva, Maria V; Penin, Aleksey A

    2016-07-25

    Although plastid genomes of flowering plants are typically highly conserved regarding their size, gene content and order, there are some exceptions. Ericaceae, a large and diverse family of flowering plants, warrants special attention within the context of plastid genome evolution because it includes both non-photosynthetic and photosynthetic species with rearranged plastomes and putative losses of "essential" genes. We characterized plastid genomes of three species of Ericaceae, non-photosynthetic Monotropa uniflora and Hypopitys monotropa and photosynthetic Pyrola rotundifolia, using high-throughput sequencing. As expected for non-photosynthetic plants, M. uniflora and H. monotropa have small plastid genomes (46 kb and 35 kb, respectively) lacking genes related to photosynthesis, whereas P. rotundifolia has a larger genome (169 kb) with a gene set similar to other photosynthetic plants. The examined genomes contain an unusually high number of repeats and translocations. Comparative analysis of the expanded set of Ericaceae plastomes suggests that the genes clpP and accD that are present in the plastid genomes of almost all plants have not been lost in this family (as was previously thought) but rather persist in these genomes in unusual forms. Also we found a new gene in P. rotundifolia that emerged as a result of duplication of rps4 gene.

  12. Comparative analysis of plastid genomes of non-photosynthetic Ericaceae and their photosynthetic relatives.

    PubMed

    Logacheva, Maria D; Schelkunov, Mikhail I; Shtratnikova, Victoria Y; Matveeva, Maria V; Penin, Aleksey A

    2016-01-01

    Although plastid genomes of flowering plants are typically highly conserved regarding their size, gene content and order, there are some exceptions. Ericaceae, a large and diverse family of flowering plants, warrants special attention within the context of plastid genome evolution because it includes both non-photosynthetic and photosynthetic species with rearranged plastomes and putative losses of "essential" genes. We characterized plastid genomes of three species of Ericaceae, non-photosynthetic Monotropa uniflora and Hypopitys monotropa and photosynthetic Pyrola rotundifolia, using high-throughput sequencing. As expected for non-photosynthetic plants, M. uniflora and H. monotropa have small plastid genomes (46 kb and 35 kb, respectively) lacking genes related to photosynthesis, whereas P. rotundifolia has a larger genome (169 kb) with a gene set similar to other photosynthetic plants. The examined genomes contain an unusually high number of repeats and translocations. Comparative analysis of the expanded set of Ericaceae plastomes suggests that the genes clpP and accD that are present in the plastid genomes of almost all plants have not been lost in this family (as was previously thought) but rather persist in these genomes in unusual forms. Also we found a new gene in P. rotundifolia that emerged as a result of duplication of rps4 gene. PMID:27452401

  13. Comparative analysis of plastid genomes of non-photosynthetic Ericaceae and their photosynthetic relatives

    PubMed Central

    Logacheva, Maria D.; Schelkunov, Mikhail I.; Shtratnikova, Victoria Y.; Matveeva, Maria V.; Penin, Aleksey A.

    2016-01-01

    Although plastid genomes of flowering plants are typically highly conserved regarding their size, gene content and order, there are some exceptions. Ericaceae, a large and diverse family of flowering plants, warrants special attention within the context of plastid genome evolution because it includes both non-photosynthetic and photosynthetic species with rearranged plastomes and putative losses of “essential” genes. We characterized plastid genomes of three species of Ericaceae, non-photosynthetic Monotropa uniflora and Hypopitys monotropa and photosynthetic Pyrola rotundifolia, using high-throughput sequencing. As expected for non-photosynthetic plants, M. uniflora and H. monotropa have small plastid genomes (46 kb and 35 kb, respectively) lacking genes related to photosynthesis, whereas P. rotundifolia has a larger genome (169 kb) with a gene set similar to other photosynthetic plants. The examined genomes contain an unusually high number of repeats and translocations. Comparative analysis of the expanded set of Ericaceae plastomes suggests that the genes clpP and accD that are present in the plastid genomes of almost all plants have not been lost in this family (as was previously thought) but rather persist in these genomes in unusual forms. Also we found a new gene in P. rotundifolia that emerged as a result of duplication of rps4 gene. PMID:27452401

  14. Genome Information Broker for Viruses (GIB-V): database for comparative analysis of virus genomes

    PubMed Central

    Hirahata, Masaki; Abe, Takashi; Tanaka, Naoto; Kuwana, Yoshikazu; Shigemoto, Yasumasa; Miyazaki, Satoru; Suzuki, Yoshiyuki; Sugawara, Hideaki

    2007-01-01

    Genome Information Broker for Viruses (GIB-V) is a comprehensive virus genome/segment database. We extracted 18 418 complete virus genomes/segments from the International Nucleotide Sequence Database Collaboration (INSDC, ) by DNA Data Bank of Japan (DDBJ), EMBL and GenBank and stored them in our system. The list of registered viruses is arranged hierarchically according to taxonomy. Keyword searches can be performed for genome/segment data or biological features of any virus stored in GIB-V. GIB-V is equipped with a BLAST search function, and search results are displayed graphically or in list form. Moreover, the BLAST results can be used online with the ClustalW feature of the DDBJ. All available virus genome/segment data can be collected by the GIB-V download function. GIB-V can be accessed at no charge at . PMID:17158166

  15. Analysis of the Complete Chloroplast Genome of a Medicinal Plant, Dianthus superbus var. longicalyncinus, from a Comparative Genomics Perspective.

    PubMed

    Raman, Gurusamy; Park, SeonJoo

    2015-01-01

    Dianthus superbus var. longicalycinus is an economically important traditional Chinese medicinal plant that is also used for ornamental purposes. In this study, D. superbus was compared to its closely related family of Caryophyllaceae chloroplast (cp) genomes such as Lychnis chalcedonica and Spinacia oleracea. D. superbus had the longest large single copy (LSC) region (82,805 bp), with some variations in the inverted repeat region A (IRA)/LSC regions. The IRs underwent both expansion and constriction during evolution of the Caryophyllaceae family; however, intense variations were not identified. The pseudogene ribosomal protein subunit S19 (rps19) was identified at the IRA/LSC junction, but was not present in the cp genome of other Caryophyllaceae family members. The translation initiation factor IF-1 (infA) and ribosomal protein subunit L23 (rpl23) genes were absent from the Dianthus cp genome. When the cp genome of Dianthus was compared with 31 other angiosperm lineages, the infA gene was found to have been lost in most members of rosids, solanales of asterids and Lychnis of Caryophyllales, whereas rpl23 gene loss or pseudogenization had occurred exclusively in Caryophyllales. Nevertheless, the cp genome of Dianthus and Spinacia has two introns in the proteolytic subunit of ATP-dependent protease (clpP) gene, but Lychnis has lost introns from the clpP gene. Furthermore, phylogenetic analysis of individual protein-coding genes infA and rpl23 revealed that gene loss or pseudogenization occurred independently in the cp genome of Dianthus. Molecular phylogenetic analysis also demonstrated a sister relationship between Dianthus and Lychnis based on 78 protein-coding sequences. The results presented herein will contribute to studies of the evolution, molecular biology and genetic engineering of the medicinal and ornamental plant, D. superbus var. longicalycinus.

  16. Analysis of the Complete Chloroplast Genome of a Medicinal Plant, Dianthus superbus var. longicalyncinus, from a Comparative Genomics Perspective

    PubMed Central

    Raman, Gurusamy; Park, SeonJoo

    2015-01-01

    Dianthus superbus var. longicalycinus is an economically important traditional Chinese medicinal plant that is also used for ornamental purposes. In this study, D. superbus was compared to its closely related family of Caryophyllaceae chloroplast (cp) genomes such as Lychnis chalcedonica and Spinacia oleracea. D. superbus had the longest large single copy (LSC) region (82,805 bp), with some variations in the inverted repeat region A (IRA)/LSC regions. The IRs underwent both expansion and constriction during evolution of the Caryophyllaceae family; however, intense variations were not identified. The pseudogene ribosomal protein subunit S19 (rps19) was identified at the IRA/LSC junction, but was not present in the cp genome of other Caryophyllaceae family members. The translation initiation factor IF-1 (infA) and ribosomal protein subunit L23 (rpl23) genes were absent from the Dianthus cp genome. When the cp genome of Dianthus was compared with 31 other angiosperm lineages, the infA gene was found to have been lost in most members of rosids, solanales of asterids and Lychnis of Caryophyllales, whereas rpl23 gene loss or pseudogenization had occurred exclusively in Caryophyllales. Nevertheless, the cp genome of Dianthus and Spinacia has two introns in the proteolytic subunit of ATP-dependent protease (clpP) gene, but Lychnis has lost introns from the clpP gene. Furthermore, phylogenetic analysis of individual protein-coding genes infA and rpl23 revealed that gene loss or pseudogenization occurred independently in the cp genome of Dianthus. Molecular phylogenetic analysis also demonstrated a sister relationship between Dianthus and Lychnis based on 78 protein-coding sequences. The results presented herein will contribute to studies of the evolution, molecular biology and genetic engineering of the medicinal and ornamental plant, D. superbus var. longicalycinus. PMID:26513163

  17. Comparative genomic analysis of two-component regulatory proteins in Pseudomonas syringae

    PubMed Central

    Lavín, José L; Kiil, Kristoffer; Resano, Ohiana; Ussery, David W; Oguiza, José A

    2007-01-01

    Background: Pseudomonas syringae is a widespread bacterial plant pathogen, and strains of P. syringae may be assigned to different pathovars based on host specificity among different plant species. The genomes of P. syringae pv. syringae (Psy) B728a, pv. tomato (Pto) DC3000 and pv. phaseolicola (Pph) 1448A have been recently sequenced providing a major resource for comparative genomic analysis. A mechanism commonly found in bacteria for signal transduction is the two-component system (TCS), which typically consists of a sensor histidine kinase (HK) and a response regulator (RR). P. syringae requires a complex array of TCS proteins to cope with diverse plant hosts, host responses, and environmental conditions. Results: Based on the genomic data, pattern searches with Hidden Markov Model (HMM) profiles have been used to identify putative HKs and RRs. The genomes of Psy B728a, Pto DC3000 and Pph 1448A were found to contain a large number of genes encoding TCS proteins, and a core of complete TCS proteins were shared between these genomes: 30 putative TCS clusters, 11 orphan HKs, 33 orphan RRs, and 16 hybrid HKs. A close analysis of the distribution of genes encoding TCS proteins revealed important differences in TCS proteins among the three P. syringae pathovars. Conclusion: In this article we present a thorough analysis of the identification and distribution of TCS proteins among the sequenced genomes of P. syringae. We have identified differences in TCS proteins among the three P. syringae pathovars that may contribute to their diverse host ranges and association with plant hosts. The identification and analysis of the repertoire of TCS proteins in the genomes of P. syringae pathovars constitute a basis for future functional genomic studies of the signal transduction pathways in this important bacterial phytopathogen. PMID:17971244

  18. How to become a uropathogen: Comparative genomic analysis of extraintestinal pathogenic Escherichia coli strains

    PubMed Central

    Brzuszkiewicz, Elzbieta; Brüggemann, Holger; Liesegang, Heiko; Emmerth, Melanie; Ölschläger, Tobias; Nagy, Gábor; Albermann, Kaj; Wagner, Christian; Buchrieser, Carmen; Emődy, Levente; Gottschalk, Gerhard; Hacker, Jörg; Dobrindt, Ulrich

    2006-01-01

    Uropathogenic Escherichia coli (UPEC) strain 536 (O6:K15:H31) is one of the model organisms of extraintestinal pathogenic E. coli (ExPEC). To analyze this strain's genetic basis of urovirulence, we sequenced the entire genome and compared the data with the genome sequence of UPEC strain CFT073 (O6:K2:H1) and with the available genomes of nonpathogenic E. coli strain MG1655 (K-12) and enterohemorrhagic E. coli. The genome of strain 536 is ≈292 kb smaller than that of strain CFT073. Genomic differences between both UPEC are mainly restricted to large pathogenicity islands, parts of which are unique to strain 536 or CFT073. Genome comparison underlines that repeated insertions and deletions in certain parts of the genome contribute to genome evolution. Furthermore, 427 and 432 genes are only present in strain 536 or in both UPEC, respectively. The majority of the latter genes is encoded within smaller horizontally acquired DNA regions scattered all over the genome. Several of these genes are involved in increasing the pathogens' fitness and adaptability. Analysis of virulence-associated traits expressed in the two UPEC O6 strains, together with genome comparison, demonstrates the marked genetic and phenotypic variability among UPEC. The ability to accumulate and express a variety of virulence-associated genes distinguishes ExPEC from many commensals and forms the basis for the individual virulence potential of ExPEC. Accordingly, instead of a common virulence mechanism, different ways exist among ExPEC to cause disease. PMID:16912116

  19. How to become a uropathogen: comparative genomic analysis of extraintestinal pathogenic Escherichia coli strains.

    PubMed

    Brzuszkiewicz, Elzbieta; Brüggemann, Holger; Liesegang, Heiko; Emmerth, Melanie; Ölschläger, Tobias; Nagy, Gábor; Albermann, Kaj; Wagner, Christian; Buchrieser, Carmen; Emődy, Levente; Gottschalk, Gerhard; Hacker, Jörg; Dobrindt, Ulrich

    2006-08-22

    Uropathogenic Escherichia coli (UPEC) strain 536 (O6:K15:H31) is one of the model organisms of extraintestinal pathogenic E. coli (ExPEC). To analyze this strain's genetic basis of urovirulence, we sequenced the entire genome and compared the data with the genome sequence of UPEC strain CFT073 (O6:K2:H1) and with the available genomes of nonpathogenic E. coli strain MG1655 (K-12) and enterohemorrhagic E. coli. The genome of strain 536 is approximately 292 kb smaller than that of strain CFT073. Genomic differences between both UPEC are mainly restricted to large pathogenicity islands, parts of which are unique to strain 536 or CFT073. Genome comparison underlines that repeated insertions and deletions in certain parts of the genome contribute to genome evolution. Furthermore, 427 and 432 genes are only present in strain 536 or in both UPEC, respectively. The majority of the latter genes is encoded within smaller horizontally acquired DNA regions scattered all over the genome. Several of these genes are involved in increasing the pathogens' fitness and adaptability. Analysis of virulence-associated traits expressed in the two UPEC O6 strains, together with genome comparison, demonstrates the marked genetic and phenotypic variability among UPEC. The ability to accumulate and express a variety of virulence-associated genes distinguishes ExPEC from many commensals and forms the basis for the individual virulence potential of ExPEC. Accordingly, instead of a common virulence mechanism, different ways exist among ExPEC to cause disease.

  20. Comparative analysis of dinoflagellate chloroplast genomes reveals rRNA and tRNA genes

    PubMed Central

    Barbrook, Adrian C; Santucci, Nicole; Plenderleith, Lindsey J; Hiller, Roger G; Howe, Christopher J

    2006-01-01

    Background: Peridinin-containing dinoflagellates have a highly reduced chloroplast genome, which is unlike that found in other chloroplast-containing organisms. Genome reduction appears to be the result of extensive transfer of genes to the nuclear genome. Unusually, the genes believed to be remaining in the chloroplast genome are found on small DNA 'minicircles'. In this study we present a comparison of sets of minicircle sequences from three dinoflagellate species. Results: PCR was used to amplify several minicircles from Amphidinium carterae so that a homologous set of gene-containing minicircles was available for Amphidinium carterae and Amphidinium operculatum, two apparently closely related peridinin-containing dinoflagellates. We compared the sequences of these minicircles to determine the content and characteristics of their chloroplast genomes. We also made comparisons with minicircles which had been obtained from Heterocapsa triquetra, another peridinin-containing dinoflagellate. These in silico comparisons have revealed several genetic features which were not apparent in single species analyses. The features include further protein coding genes, unusual rRNA genes, which we show are transcribed, and the first examples of tRNA genes from peridinin-containing dinoflagellate chloroplast genomes. Conclusion: Comparative analysis of minicircle sequences has allowed us to identify previously unrecognised features of dinoflagellate chloroplast genomes, including additional protein and RNA genes. The chloroplast rRNA gene sequences are radically different from those in other organisms, and in many ways resemble the rRNA genes found in some highly reduced mitochondrial genomes. The retention of certain tRNA genes in the dinoflagellate chloroplast genome has important implications for models of chloroplast-mitochondrion interaction. PMID:17123435

  1. Comparative genomics and functional analysis of the 936 group of lactococcal Siphoviridae phages

    PubMed Central

    Murphy, James; Bottacini, Francesca; Mahony, Jennifer; Kelleher, Philip; Neve, Horst; Zomer, Aldert; Nauta, Arjen; van Sinderen, Douwe

    2016-01-01

    Genome sequencing and comparative analysis of bacteriophage collections have greatly enhanced our understanding of their prevalence, phage-host interactions and the overall biodiversity of their genomes. This knowledge is very relevant to phages infecting Lactococcus lactis, since they constitute a significant risk factor for dairy fermentations. Of the eighty-four lactococcal phage genomes currently available, fifty-five belong to the so-called 936 group, the most prevalent of the ten currently recognized lactococcal phage groups. Here, we report the genetic characteristics of a new collection of 936 group phages. By combining these genomes with those sequenced previously we determined the core and variable elements of the 936 genome. Genomic variation occurs across the 936 phage genome, such as genetic elements that (i) lead to a +1 translational frameshift resulting in the formation of additional structures on the phage tail, (ii) specify a double neck passage structure, and (iii) encode packaging module-associated methylases. Hierarchical clustering of the gene complement of the 936 group phages and nucleotide alignments allowed grouping of the ninety 936 group phages into distinct clusters, which in general appear to correspond with their geographical origin. PMID:26892066

  2. Comparative Genomic Analysis of Malaria Mosquito Vector-Associated Novel Pathogen Elizabethkingia anophelis

    PubMed Central

    Teo, Jeanette; Tan, Sean Yang-Yi; Liu, Yang; Tay, Martin; Ding, Yichen; Li, Yingying; Kjelleberg, Staffan; Givskov, Michael; Lin, Raymond T.P.; Yang, Liang

    2014-01-01

    Acquisition of Elizabethkingia infections in intensive care units (ICUs) has risen in the past decade. Treatment of Elizabethkingia infections is challenging due to the lack of effective therapeutic regimens, leading to a high mortality rate. Elizabethkingia infections have long been attributed to Elizabethkingia meningoseptica. Recently, we used whole-genome sequencing to reveal that E. anophelis is the pathogenic agent for an Elizabethkingia outbreak at two ICUs. We performed comparative genomic analysis of seven hospital-isolated E. anophelis strains with five available Elizabethkingia spp. genomes deposited in the National Center for Biotechnology Information Database. A pan-genomic approach was applied to identify the core- and pan-genome for the Elizabethkingia genus. We showed that unlike the hospital-isolated pathogen E. meningoseptica ATCC 12535 strain, the hospital-isolated E. anophelis strains have genome content and organization similar to the E. anophelis Ag1 and R26 strains isolated from the midgut microbiota of the malaria mosquito vector Anopheles gambiae. Both the core- and accessory genomes of Elizabethkingia spp. possess genes conferring antibiotic resistance and virulence. Our study highlights that E. anophelis is an emerging bacterial pathogen for hospital environments. PMID:24803570
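    The core- and pan-genome partition mentioned in the abstract above can be illustrated with plain set operations. This is only a minimal sketch under stated assumptions, not the authors' pipeline: it assumes genes have already been clustered into families by some upstream tool, and the function name, strain labels and family identifiers below are hypothetical.

```python
def pan_and_core(families_by_strain):
    """Split gene families into pan-, core- and accessory genome.

    families_by_strain maps a strain name to the gene-family IDs found in
    that strain (the clustering into families is assumed to be done already).
    """
    sets = [set(families) for families in families_by_strain.values()]
    if not sets:
        return set(), set(), set()
    pan = set().union(*sets)          # families present in at least one strain
    core = set.intersection(*sets)    # families present in every strain
    accessory = pan - core            # families missing from at least one strain
    return pan, core, accessory


# Hypothetical toy input: three strains, five gene families.
toy = {
    "strain_A": ["fam1", "fam2", "fam3"],
    "strain_B": ["fam1", "fam2", "fam4"],
    "strain_C": ["fam1", "fam2", "fam3", "fam5"],
}
pan, core, accessory = pan_and_core(toy)
```

    In real analyses the family assignment (orthology clustering) is the hard part; the partition itself reduces to the union and intersection shown here.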

  3. Complete Mitochondrial Genome of Haplorchis taichui and Comparative Analysis with Other Trematodes

    PubMed Central

    Lee, Dongmin; Choe, Seongjun; Park, Hansol; Jeon, Hyeong-Kyu; Chai, Jong-Yil; Sohn, Woon-Mok; Yong, Tai-Soon; Min, Duk-Young; Rim, Han-Jong

    2013-01-01

    Mitochondrial genomes have been extensively studied for phylogenetic purposes and to investigate intra- and interspecific genetic variations. In recent years, numerous groups have undertaken sequencing of platyhelminth mitochondrial genomes. Haplorchis taichui (family Heterophyidae) is a trematode that infects humans and animals mainly in Asia, including the Mekong River basin. We sequenced and determined the organization of the complete mitochondrial genome of H. taichui. The mitochondrial genome is 15,130 bp long, containing 12 protein-coding genes, 2 ribosomal RNAs (rRNAs, a small and a large subunit), and 22 transfer RNAs (tRNAs). Like other trematodes, it does not encode the atp8 gene. All genes are transcribed from the same strand. The ATG initiation codon is used for 9 protein-coding genes, and GTG for the remaining 3 (nad1, nad4, and nad5). The mitochondrial genome of H. taichui has a single long non-coding region between trnE and trnG. H. taichui has evolved as being more closely related to Opisthorchiidae than other trematode groups with maximal support in the phylogenetic analysis. Our results could provide a resource for the comparative mitochondrial genome analysis of trematodes, and may yield genetic markers for molecular epidemiological investigations into intestinal flukes. PMID:24516279

  4. Five Complete Chloroplast Genome Sequences from Diospyros: Genome Organization and Comparative Analysis

    PubMed Central

    Hu, Jingjing; Liang, Yuqin; Liang, Jinjun; Wuyun, Tana; Tan, Xiaofeng

    2016-01-01

    Diospyros is the largest genus in Ebenaceae, comprising more than 500 species with remarkable economic value, especially Diospyros kaki Thunb., which has traditionally been an important food resource in China, Korea, and Japan. Complete chloroplast (cp) genomes from D. kaki, D. lotus L., D. oleifera Cheng., D. glaucifolia Metc., and Diospyros ‘Jinzaoshi’ were sequenced using Illumina sequencing technology. This is the first cp genome reported in Ebenaceae. The cp genome sequences of Diospyros ranged from 157,300 to 157,784 bp in length, presenting a typical quadripartite structure with two inverted repeats each separated by one large and one small single-copy region. For each cp genome, 134 genes were annotated, including 80 protein-coding, 31 tRNA, and 4 rRNA unique genes. In all, 179 repeats and 283 single sequence repeats were identified. Four hypervariable regions, namely, intergenic region of trnQ_rps16, trnV_ndhC, and psbD_trnT, and intron of ndhA, were identified in the Diospyros genomes. Phylogenetic analyses based on the whole cp genome, protein-coding, and intergenic and intron sequences indicated that D. oleifera is closely related to D. kaki and could be used as a model plant for future research on D. kaki; to our knowledge, this is proposed for the first time. Further, these analyses together with two large deletions (301 and 140 bp) in the cp genome of D. ‘Jinzaoshi’, support its placement as a new species in Diospyros. Both maximum parsimony and likelihood analyses for 19 taxa indicated the basal position of Ericales in asterids and suggested that Ebenaceae is monophyletic in Ericales. PMID:27442423

  5. Five Complete Chloroplast Genome Sequences from Diospyros: Genome Organization and Comparative Analysis.

    PubMed

    Fu, Jianmin; Liu, Huimin; Hu, Jingjing; Liang, Yuqin; Liang, Jinjun; Wuyun, Tana; Tan, Xiaofeng

    2016-01-01

    Diospyros is the largest genus in Ebenaceae, comprising more than 500 species with remarkable economic value, especially Diospyros kaki Thunb., which has traditionally been an important food resource in China, Korea, and Japan. Complete chloroplast (cp) genomes from D. kaki, D. lotus L., D. oleifera Cheng., D. glaucifolia Metc., and Diospyros 'Jinzaoshi' were sequenced using Illumina sequencing technology. This is the first cp genome reported in Ebenaceae. The cp genome sequences of Diospyros ranged from 157,300 to 157,784 bp in length, presenting a typical quadripartite structure with two inverted repeats each separated by one large and one small single-copy region. For each cp genome, 134 genes were annotated, including 80 protein-coding, 31 tRNA, and 4 rRNA unique genes. In all, 179 repeats and 283 single sequence repeats were identified. Four hypervariable regions, namely, intergenic region of trnQ_rps16, trnV_ndhC, and psbD_trnT, and intron of ndhA, were identified in the Diospyros genomes. Phylogenetic analyses based on the whole cp genome, protein-coding, and intergenic and intron sequences indicated that D. oleifera is closely related to D. kaki and could be used as a model plant for future research on D. kaki; to our knowledge, this is proposed for the first time. Further, these analyses together with two large deletions (301 and 140 bp) in the cp genome of D. 'Jinzaoshi', support its placement as a new species in Diospyros. Both maximum parsimony and likelihood analyses for 19 taxa indicated the basal position of Ericales in asterids and suggested that Ebenaceae is monophyletic in Ericales. PMID:27442423

  7. Complete Genome Sequence of Borrelia afzelii K78 and Comparative Genome Analysis

    PubMed Central

    Schüler, Wolfgang; Bunikis, Ignas; Weber-Lehman, Jacqueline; Comstedt, Pär; Kutschan-Bunikis, Sabrina; Stanek, Gerold; Huber, Jutta; Meinke, Andreas; Bergström, Sven; Lundberg, Urban

    2015-01-01

    The main Borrelia species causing Lyme borreliosis in Europe and Asia are Borrelia afzelii, B. garinii, B. burgdorferi and B. bavariensis. This is in contrast to the United States, where infections are exclusively caused by B. burgdorferi. To date, the genome sequences of four B. afzelii strains, of which only two include the numerous plasmids, are available. In order to further assess the genetic diversity of B. afzelii, the most common species in Europe, responsible for the large variety of clinical manifestations of Lyme borreliosis, we have determined the full genome sequence of the B. afzelii strain K78, a clinical isolate from Austria. The K78 genome contains a linear chromosome (905,949 bp) and 13 plasmids (8 linear and 5 circular) together presenting 1,309 open reading frames of which 496 are located on plasmids. With the exception of lp28-8, all linear replicons in their full length including their telomeres have been sequenced. The comparison with the genomes of the four other B. afzelii strains, ACA-1, PKo, HLJ01 and Tom3107, as well as the one of B. burgdorferi strain B31, confirmed a high degree of conservation within the linear chromosome of B. afzelii, whereas plasmid encoded genes showed a much larger diversity. Since some plasmids present in B. burgdorferi are missing in the B. afzelii genomes, the corresponding virulence factors of B. burgdorferi are found in B. afzelii on other unrelated plasmids. In addition, we have identified a species specific region in the circular plasmid, cp26, which could be used for species determination. Different non-coding RNAs have been located on the B. afzelii K78 genome, which have not previously been annotated in any of the published Borrelia genomes. PMID:25798594

  8. Genome Sequencing and Comparative Genomics Analysis Revealed Pathogenic Potential in Penicillium capsulatum as a Novel Fungal Pathogen Belonging to Eurotiales

    PubMed Central

    Yang, Ying; Chen, Min; Li, Zongwei; Al-Hatmi, Abdullah M. S.; de Hoog, Sybren; Pan, Weihua; Ye, Qiang; Bo, Xiaochen; Li, Zhen; Wang, Shengqi; Wang, Junzhi; Chen, Huipeng; Liao, Wanqing

    2016-01-01

    Penicillium capsulatum is a rare Penicillium species used in paper manufacturing, but recently it has been reported to cause invasive infection. To research the pathogenicity of the clinical Penicillium strain, we sequenced the genomes and transcriptomes of the clinical and environmental strains of P. capsulatum. Comparative analyses of these two P. capsulatum strains and closely related strains belonging to Eurotiales were performed. The assembled genome sizes of P. capsulatum are approximately 34.4 Mbp in length and encode 11,080 predicted genes. The different isolates of P. capsulatum are highly similar, with the exception of several unique genes, INDELs or SNPs in the genes coding for glycosyl hydrolases, amino acid transporters and circumsporozoite protein. A phylogenomic analysis was performed based on the whole genome data of 38 strains belonging to Eurotiales. By comparing the whole genome sequences and the virulence-related genes from 20 important related species, including fungal pathogens and non-human pathogens belonging to Eurotiales, we found meaningful pathogenicity characteristics between P. capsulatum and its closely related species. Our research indicated that P. capsulatum may be a neglected opportunistic pathogen. This study is beneficial for mycologists, geneticists and epidemiologists to achieve a deeper understanding of the genetic basis of the role of P. capsulatum as a newly reported fungal pathogen. PMID:27761131

  9. Comparative genomic analysis of Brazilian Leptospira kirschneri serogroup Pomona serovar Mozdok.

    PubMed

    Moreno, Luisa Z; Kremer, Frederico S; Miraglia, Fabiana; Loureiro, Ana P; Eslabao, Marcus R; Dellagostin, Odir A; Lilenbaum, Walter; Moreno, Andrea M

    2016-08-01

    Leptospira kirschneri is one of the pathogenic species of the Leptospira genus. Human and animal infection from L. kirschneri gained further attention over the last few decades. Here we present the isolation and characterisation of Brazilian L. kirschneri serogroup Pomona serovar Mozdok strain M36/05 and the comparative genomic analysis with Brazilian human strain 61H. The M36/05 strain caused pulmonary hemorrhagic lesions in the hamster model, showing high virulence. The studied genomes presented high symmetrical identity and the in silico multilocus sequence typing analysis resulted in a new allelic profile (ST101) that so far has only been associated with the Brazilian L. kirschneri serogroup Pomona serovar Mozdok strains. Considering the environmental conditions and high genomic similarity observed between strains, we suggest the existence of a Brazilian L. kirschneri serogroup Pomona serovar Mozdok lineage that could represent a high public health risk; further studies are necessary to confirm the lineage significance and distribution. PMID:27581124

  10. Comparative genomic analysis of Brazilian Leptospira kirschneri serogroup Pomona serovar Mozdok

    PubMed Central

    Moreno, Luisa Z; Kremer, Frederico S; Miraglia, Fabiana; Loureiro, Ana P; Eslabao, Marcus R; Dellagostin, Odir A; Lilenbaum, Walter; Moreno, Andrea M

    2016-01-01

    Leptospira kirschneri is one of the pathogenic species of the Leptospira genus. Human and animal infection from L. kirschneri gained further attention over the last few decades. Here we present the isolation and characterisation of Brazilian L. kirschneri serogroup Pomona serovar Mozdok strain M36/05 and the comparative genomic analysis with Brazilian human strain 61H. The M36/05 strain caused pulmonary hemorrhagic lesions in the hamster model, showing high virulence. The studied genomes presented high symmetrical identity and the in silico multilocus sequence typing analysis resulted in a new allelic profile (ST101) that so far has only been associated with the Brazilian L. kirschneri serogroup Pomona serovar Mozdok strains. Considering the environmental conditions and high genomic similarity observed between strains, we suggest the existence of a Brazilian L. kirschneri serogroup Pomona serovar Mozdok lineage that could represent a high public health risk; further studies are necessary to confirm the lineage significance and distribution. PMID:27581124

  12. The First Complete Chloroplast Genome Sequences in Actinidiaceae: Genome Structure and Comparative Analysis

    PubMed Central

    Yao, Xiaohong; Tang, Ping; Li, Zuozhou; Li, Dawei; Liu, Yifei; Huang, Hongwen

    2015-01-01

    Actinidia chinensis is an important economic plant belonging to the basal lineage of the asterids. Availability of a complete Actinidia chloroplast genome sequence is crucial to understanding phylogenetic relationships among major lineages of angiosperms and facilitates kiwifruit genetic improvement. We report here the complete nucleotide sequences of the chloroplast genomes for Actinidia chinensis and A. chinensis var deliciosa obtained through de novo assembly of Illumina paired-end reads produced by total DNA sequencing. The total genome size ranges from 155,446 to 157,557 bp, with an inverted repeat (IR) of 24,013 to 24,391 bp, a large single copy region (LSC) of 87,984 to 88,337 bp and a small single copy region (SSC) of 20,332 to 20,336 bp. The genome encodes 113 different genes, including 79 unique protein-coding genes, 30 tRNA genes and 4 ribosomal RNA genes, with 16 duplicated in the inverted repeats, and a tRNA gene (trnfM-CAU) duplicated once in the LSC region. Comparisons of IR boundaries among four asterid species showed that IR/LSC borders were extended into the 5' portion of the psbA gene and IR contraction occurred in Actinidia. The clpP gene has been lost from the chloroplast genome in Actinidia, and may have been transferred to the nucleus during chloroplast evolution. Twenty-seven polymorphic simple sequence repeat (SSR) loci were identified in the Actinidia chloroplast genome. Maximum parsimony analyses of a 72-gene, 16 taxa angiosperm dataset strongly support the placement of Actinidiaceae in Ericales within the basal asterids. PMID:26046631

  13. Comparative genomic analysis of four representative plant growth-promoting rhizobacteria in Pseudomonas

    PubMed Central

    2013-01-01

    Background: Some Pseudomonas strains function as predominant plant growth-promoting rhizobacteria (PGPR). Within this group, Pseudomonas chlororaphis and Pseudomonas fluorescens are non-pathogenic biocontrol agents, and some Pseudomonas aeruginosa and Pseudomonas stutzeri strains are PGPR. P. chlororaphis GP72 is a plant growth-promoting rhizobacterium with a fully sequenced genome. We conducted a genomic analysis comparing GP72 with three other pseudomonad PGPR: P. fluorescens Pf-5, P. aeruginosa M18, and the nitrogen-fixing strain P. stutzeri A1501. Our aim was to identify the similarities and differences among these strains using a comparative genomic approach to clarify the mechanisms of plant growth-promoting activity. Results: The genome sizes of GP72, Pf-5, M18, and A1501 ranged from 4.6 to 7.1 M, and the number of protein-coding genes varied among the four species. Clusters of Orthologous Groups (COGs) analysis assigned functions to predicted proteins. The COGs distributions were similar among the four species. However, the percentage of genes encoding transposases and their inactivated derivatives (COG L) was 1.33% of the total genes with COGs classifications in A1501, 0.21% in GP72, 0.02% in Pf-5, and 0.11% in M18. A phylogenetic analysis indicated that GP72 and Pf-5 were the most closely related strains, consistent with the genome alignment results. Comparisons of predicted coding sequences (CDSs) between GP72 and Pf-5 revealed 3544 conserved genes. There were fewer conserved genes when GP72 CDSs were compared with those of A1501 and M18. Comparisons among the four Pseudomonas species revealed 603 conserved genes in GP72, illustrating common plant growth-promoting traits shared among these PGPR. Conserved genes were related to catabolism, transport of plant-derived compounds, stress resistance, and rhizosphere colonization. Some strain-specific CDSs were related to different kinds of biocontrol activities or plant growth promotion. The GP72 genome

  14. OrtholugeDB: a bacterial and archaeal orthology resource for improved comparative genomic analysis.

    PubMed

    Whiteside, Matthew D; Winsor, Geoffrey L; Laird, Matthew R; Brinkman, Fiona S L

    2013-01-01

    Prediction of orthologs (homologous genes that diverged because of speciation) is an integral component of many comparative genomics methods. Although orthologs are more likely to have similar function versus paralogs (genes that diverged because of duplication), recent studies have shown that their degree of functional conservation is variable. Also, there are inherent problems with several large-scale ortholog prediction approaches. To address these issues, we previously developed Ortholuge, which uses phylogenetic distance ratios to provide more precise ortholog assessments for a set of predicted orthologs. However, the original version of Ortholuge required manual intervention and was not easily accessible; therefore, we now report the development of OrtholugeDB, available online at http://www.pathogenomics.sfu.ca/ortholugedb. OrtholugeDB provides ortholog predictions for completely sequenced bacterial and archaeal genomes from NCBI based on reciprocal best Basic Local Alignment Search Tool hits, supplemented with further evaluation by the more precise Ortholuge method. The OrtholugeDB web interface facilitates user-friendly and flexible ortholog analysis, from single genes to genomes, plus flexible data download options. We compare Ortholuge with similar methods, showing how it may more consistently identify orthologs with conserved features across a wide range of taxonomic distances. OrtholugeDB facilitates rapid, and more accurate, bacterial and archaeal comparative genomic analysis and large-scale ortholog predictions.
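    The reciprocal best hit criterion named in the abstract above can be sketched in a few lines. This is a hedged illustration only, not OrtholugeDB's implementation: it assumes pairwise similarity scores (for example, BLAST bit scores) have already been computed in both directions, the gene names and scores are made up, and the Ortholuge phylogenetic distance-ratio evaluation is not modelled.

```python
def best_hits(scores):
    """Map each query gene to its single highest-scoring subject gene.

    scores maps (query, subject) pairs to a similarity score, higher is better.
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs where a's best hit is b and b's best hit is a."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Hypothetical scores for genes a1..a3 (genome A) and b1..b3 (genome B).
a_vs_b = {("a1", "b1"): 200.0, ("a1", "b2"): 50.0,
          ("a2", "b2"): 120.0, ("a3", "b2"): 90.0}
b_vs_a = {("b1", "a1"): 198.0, ("b2", "a2"): 118.0,
          ("b2", "a3"): 95.0, ("b3", "a3"): 60.0}
rbh = reciprocal_best_hits(a_vs_b, b_vs_a)
```

    Here a3 is excluded because its best hit, b2, prefers a2; pairs that pass this filter are the candidates a method such as Ortholuge would then evaluate further.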

  15. Comparative Analysis of Genomics and Proteomics in Bacillus thuringiensis 4.0718

    PubMed Central

    Rang, Jie; He, Hao; Wang, Ting; Ding, Xuezhi; Zuo, Mingxing; Quan, Meifang; Sun, Yunjun; Yu, Ziquan; Hu, Shengbiao; Xia, Liqiu

    2015-01-01

    Bacillus thuringiensis is a widely used biopesticide that produces various insecticidal active substances during its life cycle. Separation and purification of many of these substances has been difficult because of their relatively short half-lives. In addition, the substances can be synthesized at different times during development, so samples from different stages have to be studied, further complicating the analysis. A dual genomic and proteomic approach, particularly one using mass spectrometry-based proteomic methods, would enhance our ability to identify such substances. The comparative analysis of genomic and proteomic data showed that not all of the products deduced from the annotated genome could be identified in the proteomic data. For instance, genome annotation results showed that 39 coding sequences in the whole genome were related to insect pathogenicity, including five cry genes. However, Cry2Ab, Cry1Ia, Cytotoxin K, Bacteriocin, Exoenzyme C3 and Alveolysin could not be detected in the proteomic data obtained. The sporulation-related proteins were also compared, and the results showed that the great majority of them can be detected by mass spectrometry. This analysis revealed that Spo0A~P, SigF, SigE(+), SigK(+) and SigG(+), all known to play an important role in the regulatory network of spore formation, were also present in the proteomic data. Through the comparison of the two data sets, it was possible to infer that some genes were silenced or were expressed at very low levels. For instance, we found that cry2Ab seems to lack a functional promoter while cry1Ia may not be expressed due to the presence of transposons. With this comparative study a relatively complete database can be constructed and used to transform hereditary material, thereby prompting the high expression of toxic proteins. A theoretical basis is provided for constructing highly virulent engineered bacteria and for

  16. Genome-wide comparative analysis of pogo-like transposable elements in different Fusarium species.

    PubMed

    Dufresne, Marie; Lespinet, Olivier; Daboussi, Marie-Josée; Hua-Van, Aurélie

    2011-10-01

    The recent availability of genome sequences of four different Fusarium species offers the opportunity to perform extensive comparative analyses, in particular of repeated sequences. In a recent work, the overall content of such sequences in the genomes of three phylogenetically related Fusarium species, F. graminearum, F. verticillioides, and F. oxysporum f. sp. lycopersici, has been estimated. In this study, we present an exhaustive characterization of pogo-like elements, named Fots, in four Fusarium genomes. Overall, 10 Fot and two Fot-related miniature inverted-repeat transposable element families were identified, revealing a diversification of multiple lineages of pogo-like elements, some of which were accompanied by a gain of introns. This analysis also showed that such elements are present in an unusually high proportion in the genomes of F. oxysporum f. sp. lycopersici and Nectria haematococca (anamorph F. solani f. sp. pisi), in contrast with most other fungal genomes, in which retroelements are the most represented. Interestingly, our analysis showed that the most numerous Fot families all contain potentially active or mobilisable copies, thus conferring a mutagenic potential on these transposable elements and consequently a role in strain adaptation and genome evolution. This role is strongly reinforced when examining their genomic distribution, which is clearly biased, with a high proportion (more than 80%) located in strain- or species-specific regions enriched in genes involved in pathogenicity and/or adaptation. Finally, the different reproductive characteristics of the four Fusarium species allowed us to investigate the impact of the process of repeat-induced point mutations on the expansion and diversification of Fot elements.

  17. Complete genome sequence and comparative genome analysis of a new special Yersinia enterocolitica.

    PubMed

    Shi, Guoxiang; Su, Mingming; Liang, Junrong; Duan, Ran; Gu, Wenpeng; Xiao, Yuchun; Zhang, Zhewen; Qiu, Haiyan; Zhang, Zheng; Li, Yi; Zhang, Xiaohe; Ling, Yunchao; Song, Lai; Chen, Meili; Zhao, Yongbing; Wu, Jiayan; Jing, Huaiqi; Xiao, Jingfa; Wang, Xin

    2016-09-01

    Yersinia enterocolitica is the most diverse species in the genus Yersinia and shows more polymorphism, especially among the non-pathogenic strains. Individual non-pathogenic Y. enterocolitica strains are wrongly identified because of atypical phenotypes. In this study, we isolated an unusual Y. enterocolitica strain, LC20, from Rattus norvegicus. The strain did not utilize urea and could not be assigned to a biotype. API 20E identified it as Escherichia coli; however, it grew well at 25 °C, whereas E. coli grows well at 37 °C. We analyzed the genome of LC20 and found that the whole chromosome of LC20 was collinear with that of Y. enterocolitica 8081, and that the urease gene was absent from the genome, which is consistent with the API 20E result. Also, the 16S and 23S rRNA genes of LC20 lay on a branch of Y. enterocolitica. Furthermore, the core-based and pan-based phylogenetic trees showed that LC20 was classified into the Y. enterocolitica cluster. Two plasmids (80 and 50 k) from LC20 shared low genetic homology with pYV from the Yersinia genus; one was an ancestral Yersinia plasmid and the other was novel, encoding a number of transposases. Some pathogenic and non-pathogenic Y. enterocolitica-specific genes coexisted in LC20. Thus, although it could not be classified into any Y. enterocolitica biotype due to its special biochemical metabolism, we concluded that LC20 was a Y. enterocolitica strain because its genome was similar to those of other Y. enterocolitica strains, and that it might be a strain with many mutations and combinations emerging in the course of its evolution. PMID:27129539

  18. The Genome of Nosema sp. Isolate YNPr: A Comparative Analysis of Genome Evolution within the Nosema/Vairimorpha Clade

    PubMed Central

    Ma, Zhenggang; Li, Tian; Zhang, Xiaoyan; Debrunner-Vossbrinck, Bettina A.; Zhou, Zeyang; Vossbrinck, Charles R.

    2016-01-01

    The microsporidian parasite designated here as Nosema sp. Isolate YNPr was isolated from the cabbage butterfly Pieris rapae collected in Honghe Prefecture, Yunnan Province, China. The genome was sequenced by Illumina sequencing and compared to those of two related members of the Nosema/Vairimorpha clade, Nosema ceranae and Nosema apis. Based upon assembly statistics, the Nosema sp. YNPr genome is 3.36 × 10^6 bp with a G+C content of 23.18% and 2,075 protein coding sequences. An “ACCCTT” motif is present approximately 50-bp upstream of the start codon, as reported from other members of the clade and from Encephalitozoon cuniculi, a sister taxon. Comparative small subunit ribosomal DNA (SSU rDNA) analysis as well as genome-wide phylogenetic analysis confirms a closer relationship between N. ceranae and Nosema sp. YNPr than between the two honeybee parasites N. ceranae and N. apis. The more closely related N. ceranae and Nosema sp. YNPr show similarities in a number of structural characteristics such as gene synteny, gene length, gene number, transposon composition and gene reduction. Based on transposable element content of the assemblies, the transposon content of Nosema sp. YNPr is 4.8%, that of N. ceranae is 3.7%, and that of N. apis is 2.5%, with large differences in the types of transposons present among these 3 species. Gene function annotation indicates that the number of genes participating in most metabolic activities is similar in all three species. However, the number of genes in the transcription, general function, and cysteine protease categories is greater in N. apis than in the other two species. Our studies further characterize the evolution of the Nosema/Vairimorpha clade of microsporidia. These organisms maintain variable but very reduced genomes. We are interested in understanding the effects of genetic drift versus natural selection on genome size in the microsporidia and in developing a testable hypothesis for further studies on the genomic

  19. The Genome of Nosema sp. Isolate YNPr: A Comparative Analysis of Genome Evolution within the Nosema/Vairimorpha Clade.

    PubMed

    Xu, Jinshan; He, Qiang; Ma, Zhenggang; Li, Tian; Zhang, Xiaoyan; Debrunner-Vossbrinck, Bettina A; Zhou, Zeyang; Vossbrinck, Charles R

    2016-01-01

    The microsporidian parasite designated here as Nosema sp. Isolate YNPr was isolated from the cabbage butterfly Pieris rapae collected in Honghe Prefecture, Yunnan Province, China. The genome was sequenced by Illumina sequencing and compared to those of two related members of the Nosema/Vairimorpha clade, Nosema ceranae and Nosema apis. Based upon assembly statistics, the Nosema sp. YNPr genome is 3.36 × 10^6 bp with a G+C content of 23.18% and 2,075 protein coding sequences. An "ACCCTT" motif is present approximately 50-bp upstream of the start codon, as reported from other members of the clade and from Encephalitozoon cuniculi, a sister taxon. Comparative small subunit ribosomal DNA (SSU rDNA) analysis as well as genome-wide phylogenetic analysis confirms a closer relationship between N. ceranae and Nosema sp. YNPr than between the two honeybee parasites N. ceranae and N. apis. The more closely related N. ceranae and Nosema sp. YNPr show similarities in a number of structural characteristics such as gene synteny, gene length, gene number, transposon composition and gene reduction. Based on transposable element content of the assemblies, the transposon content of Nosema sp. YNPr is 4.8%, that of N. ceranae is 3.7%, and that of N. apis is 2.5%, with large differences in the types of transposons present among these 3 species. Gene function annotation indicates that the number of genes participating in most metabolic activities is similar in all three species. However, the number of genes in the transcription, general function, and cysteine protease categories is greater in N. apis than in the other two species. Our studies further characterize the evolution of the Nosema/Vairimorpha clade of microsporidia. These organisms maintain variable but very reduced genomes. We are interested in understanding the effects of genetic drift versus natural selection on genome size in the microsporidia and in developing a testable hypothesis for further studies on the genomic ecology

  20. The Methanosarcina barkeri genome: comparative analysis with Methanosarcina acetivorans and Methanosarcina mazei reveals extensive rearrangement within methanosarcinal genomes

    SciTech Connect

    Maeder, Dennis L.; Anderson, Iain; Brettin, Thomas S.; Bruce, David C.; Gilna, Paul; Han, Cliff S.; Lapidus, Alla; Metcalf, William W.; Saunders, Elizabeth; Tapia, Roxanne; Sowers, Kevin R.

    2006-05-19

    We report here a comparative analysis of the genome sequence of Methanosarcina barkeri with those of Methanosarcina acetivorans and Methanosarcina mazei. All three genomes share a conserved double origin of replication and many gene clusters. M. barkeri is distinguished by having an organization that is well conserved with respect to the other Methanosarcinae in the region proximal to the origin of replication, with interspecies gene similarities as high as 95%. However, it is disordered and marked by increased transposase frequency and decreased gene synteny and gene density in the distal semi-genome. Of the 3680 open reading frames in M. barkeri, 678 had paralogs with better than 80% similarity to both M. acetivorans and M. mazei, while 128 nonhypothetical ORFs were unique (non-paralogous) amongst these species, including a complete formate dehydrogenase operon, two genes required for N-acetylmuramic acid synthesis, a 14-gene gas vesicle cluster and a bacterial P450-specific ferredoxin reductase cluster not previously observed or characterized in this genus. A cryptic 36 kbp plasmid sequence was detected in M. barkeri that contains an orc1 gene flanked by a presumptive origin of replication consisting of 38 tandem repeats of a 143 nt motif. Three-way comparison of these genomes reveals differing mechanisms for the accrual of changes. Elongation of the large M. acetivorans genome is the result of multiple gene-scale insertions and duplications uniformly distributed in that genome, while M. barkeri is characterized by localized inversions associated with the loss of gene content. In contrast, the relatively short M. mazei genome most closely approximates the ancestral organizational state.

  1. Comparative genomic analysis of Lactobacillus plantarum ZJ316 reveals its genetic adaptation and potential probiotic profiles

    PubMed Central

    Li, Ping; Li, Xuan; Gu, Qing; Lou, Xiu-yu; Zhang, Xiao-mei; Song, Da-feng; Zhang, Chen

    2016-01-01

    Objective: In previous studies, Lactobacillus plantarum ZJ316 showed probiotic properties, such as antimicrobial activity against various pathogens and the capacity to significantly improve pig growth and pork quality. The purpose of this study was to reveal the genes potentially related to its genetic adaptation and probiotic profiles based on comparative genomic analysis. Methods: The genome sequence of L. plantarum ZJ316 was compared with those of eight L. plantarum strains deposited in GenBank. BLASTN, Mauve, and MUMmer programs were used for genome alignment and comparison. CRISPRFinder was applied for searching the clustered regularly interspaced short palindromic repeats (CRISPRs). Results: We identified genes that encode proteins related to genetic adaptation and probiotic profiles, including carbohydrate transport and metabolism, proteolytic enzyme systems and amino acid biosynthesis, CRISPR adaptive immunity, stress responses, bile salt resistance, ability to adhere to the host intestinal wall, exopolysaccharide (EPS) biosynthesis, and bacteriocin biosynthesis. Conclusions: Comparative characterization of the L. plantarum ZJ316 genome provided the genetic basis for further elucidating the functional mechanisms of its probiotic properties. ZJ316 could be considered a potential probiotic candidate. PMID:27487802

  2. Genome-level identification, gene expression, and comparative analysis of porcine β-defensin genes

    PubMed Central

    2012-01-01

    Background: Beta-defensins (β-defensins) are innate immune peptides with evolutionary conservation across a wide range of species and have been suggested to play important roles in innate immune reactions against pathogens. However, the complete β-defensin repertoire in the pig has not been fully addressed. Result: A BLAST analysis was performed against the available pig genomic sequence in the NCBI database to identify β-defensin-related sequences using previously reported β-defensin sequences of pigs, humans, and cattle. The porcine β-defensin gene clusters were mapped to chromosomes 7, 14, 15 and 17. The gene expression analysis of 17 newly annotated porcine β-defensin genes across 15 tissues using semi-quantitative reverse transcription polymerase chain reaction (RT-PCR) showed differences in their tissue distribution, with the kidney and testis having the largest pBD expression repertoire. We also analyzed single nucleotide polymorphisms (SNPs) in the mature peptide region of pBD genes from 35 pigs of 7 breeds. We found 8 cSNPs in 7 pBDs. Conclusion: We identified 29 porcine β-defensin (pBD) gene-like sequences, including 17 unreported pBDs in the porcine genome. Comparative analysis of β-defensin genes in the pig genome with those in human and cattle genomes showed structural conservation of β-defensin syntenic regions among these species. PMID:23150902

  3. Comparative Analysis of 35 Basidiomycete Genomes Reveals Diversity and Uniqueness of the Phylum

    SciTech Connect

    Riley, Robert; Salamov, Asaf; Otillar, Robert; Fagnan, Kirsten; Boussau, Bastien; Brown, Daren; Henrissat, Bernard; Levasseur, Anthony; Held, Benjamin; Nagy, Laszlo; Floudas, Dimitris; Morin, Emmanuelle; Manning, Gerard; Baker, Scott; Martin, Francis; Blanchette, Robert; Hibbett, David; Grigoriev, Igor V.

    2013-03-11

    Fungi of the phylum Basidiomycota (basidiomycetes) make up some 37% of the described fungi, and are important in forestry, agriculture, medicine, and bioenergy. This diverse phylum includes symbionts, pathogens, and saprobes including wood decaying fungi. To better understand the diversity of this phylum we compared the genomes of 35 basidiomycete fungi including 6 newly sequenced genomes. The genomes of basidiomycetes span extremes of genome size, gene number, and repeat content. A phylogenetic tree of Basidiomycota was generated using the Phyldog software, which uses all available protein sequence data to simultaneously infer gene and species trees. Analysis of core genes reveals that some 48% of basidiomycete proteins are unique to the phylum with nearly half of those (22%) comprising proteins found in only one organism. Phylogenetic patterns of plant biomass-degrading genes suggest a continuum rather than a sharp dichotomy between the white rot and brown rot modes of wood decay among the members of the Agaricomycotina subphylum. There is a correlation of the profile of certain gene families to nutritional mode in Agaricomycotina. Based on phylogenetically-informed PCA analysis of such profiles, we predict that Botryobasidium botryosum and Jaapia argillacea have properties similar to white rot species, although neither has ligninolytic class II fungal peroxidases. Furthermore, we find that both fungi exhibit wood decay with white rot-like characteristics in growth assays. Analysis of the rate of discovery of proteins with no or few homologs suggests the high value of continued sequencing of basidiomycete fungi.

  4. Comparative analysis of gene expression among low G+C gram-positive genomes.

    PubMed

    Karlin, Samuel; Theriot, Julie; Mrázek, Jan

    2004-04-20

    We present a comparative analysis of predicted highly expressed (PHX) genes in the low G+C Gram-positive genomes of Bacillus subtilis, Bacillus halodurans, Listeria monocytogenes, Listeria innocua, Lactococcus lactis, Streptococcus pyogenes, Streptococcus pneumoniae, Staphylococcus aureus, Clostridium acetobutylicum, and Clostridium perfringens. Most enzymes acting in glycolysis and fermentation pathways are PHX in these genomes, but not those involved in the TCA cycle and respiration, suggesting that these organisms have predominantly adapted to grow rapidly in an anaerobic environment. Only B. subtilis and B. halodurans have several TCA cycle PHX genes, whereas the TCA pathway is entirely missing from the metabolic repertoire of the two Streptococcus species and is incomplete in Listeria, Lactococcus, and Clostridium. Pyruvate-formate lyase, an enzyme critical in mixed acid fermentation, is among the highest PHX genes in all these genomes except for C. acetobutylicum (not PHX), and B. subtilis, and B. halodurans (missing). Pyruvate-formate lyase is also prominently PHX in enteric gamma-proteobacteria, but not in other prokaryotes. Phosphotransferase system genes are generally PHX with selection of different substrates in different genomes. The various substrate specificities among phosphotransferase systems in different genomes apparently reflect on differences in habitat, lifestyle, and nutrient sources.

  5. Genome Sequencing and Comparative Analysis of the Biocontrol Agent Trichoderma harzianum sensu stricto TR274

    SciTech Connect

    Steindorff, Andrei S.; Noronha, Elilane F.; Ulhoa, Cirano J.; Kuo, Alan; Salamov, Asaf A.; Haridas, Sajeet; Riley, Robert W.; Druzhinina, Irina S.; Kubicek, Christian P.; Grigoriev, Igor V.

    2015-03-17

    Biological control is a complex process which requires many mechanisms and a high diversity of biochemical pathways. The species of Trichoderma harzianum are well known for their biocontrol activity against many plant pathogens. To gain new insights into the biocontrol mechanism used by T. harzianum, we sequenced the isolate TR274 genome using Illumina. The assembly was performed using AllPaths-LG with a maximum coverage of 100x. The assembly resulted in 2282 contigs with an N50 of 37,033 bp. The genome size generated was 40.8 Mb and the GC content was 47.7%, similar to other Trichoderma genomes. Using the JGI Annotation Pipeline we predicted 13,932 genes with high transcriptome support. CEGMA tests suggested 100% genome completeness and 97.9% of RNA-Seq reads were mapped to the genome. The phylogenetic comparison using orthologous proteins with all Trichoderma genomes sequenced at JGI corroborates the Trichoderma (T. asperellum and T. atroviride), Longibrachiatum (T. reesei and T. longibrachiatum) and Pachybasium (T. harzianum and T. virens) section division described previously. The comparison between two Trichoderma harzianum species suggests a high genome similarity but some strain-specific expansions. Analyses of the secondary metabolites, CAZymes, transporters, proteases, and transcription factors were performed. The Pachybasium section expanded virtually all categories analyzed compared with the other sections, especially the Longibrachiatum section, which shows a clear contraction. These results suggest that these protein families have an important role in their respective phenotypes. Future analysis will improve the understanding of this complex genus and give some insights about its lifestyle and the interactions with the environment.
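
    The N50 statistic quoted for this assembly can be illustrated with a short sketch. It uses the common definition (the contig length L such that contigs of length L or longer together cover at least half of the total assembly length); the abstract does not say how its value was computed, and the function name and the toy contig lengths below are illustrative assumptions, not data from the study.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    N50 is taken here as the length L such that contigs of length >= L
    together account for at least half of the summed contig length.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    total = sum(contig_lengths)
    running = 0
    # Walk from the longest contig downwards, accumulating length.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point issues with total / 2.
        if 2 * running >= total:
            return length


# Hypothetical toy assembly (lengths in bp), not taken from the abstract.
toy_lengths = [100, 80, 60, 40, 20]
toy_n50 = n50(toy_lengths)
```

    With real data, the list would hold one length per contig (2282 values for the assembly described above), for example read from the assembler's FASTA output.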

  6. Comparative genomic analysis of multiple strains of two unusual plant pathogens: Pseudomonas corrugata and Pseudomonas mediterranea

    PubMed Central

    Trantas, Emmanouil A.; Licciardello, Grazia; Almeida, Nalvo F.; Witek, Kamil; Strano, Cinzia P.; Duxbury, Zane; Ververidis, Filippos; Goumas, Dimitrios E.; Jones, Jonathan D. G.; Guttman, David S.; Catara, Vittoria; Sarris, Panagiotis F.

    2015-01-01

    The non-fluorescent pseudomonads, Pseudomonas corrugata (Pcor) and P. mediterranea (Pmed), are closely related species that cause pith necrosis, a disease of tomato that causes severe crop losses. However, they also show strong antagonistic effects against economically important pathogens, demonstrating their potential for utilization as biological control agents. In addition, their metabolic versatility makes them attractive for the production of commercial biomolecules and bioremediation. An extensive comparative genomics study is required to dissect the mechanisms that Pcor and Pmed employ to cause disease, prevent disease caused by other pathogens, and to mine their genomes for genes that encode proteins involved in commercially important chemical pathways. Here, we present the draft genomes of nine Pcor and Pmed strains from different geographical locations. This analysis covered significant genetic heterogeneity and allowed in-depth genomic comparison. All examined strains were able to trigger symptoms in tomato plants but not all induced a hypersensitive-like response in Nicotiana benthamiana. Genome-mining revealed the absence of type III secretion system and known type III effector-encoding genes from all examined Pcor and Pmed strains. The lack of a type III secretion system appears to be unique among the plant pathogenic pseudomonads. Several gene clusters coding for type VI secretion system were detected in all genomes. Genome-mining also revealed the presence of gene clusters for biosynthesis of siderophores, polyketides, non-ribosomal peptides, and hydrogen cyanide. A highly conserved quorum sensing system was detected in all strains, although species specific differences were observed. Our study provides the basis for in-depth investigations regarding the molecular mechanisms underlying virulence strategies in the battle between plants and microbes. PMID:26300874

  7. Expanding the repertoire of secretory peptides controlling root development with comparative genome analysis and functional assays.

    PubMed

    Ghorbani, Sarieh; Lin, Yao-Cheng; Parizot, Boris; Fernandez, Ana; Njo, Maria Fransiska; Van de Peer, Yves; Beeckman, Tom; Hilson, Pierre

    2015-08-01

    Plant genomes encode numerous small secretory peptides (SSPs) whose functions have yet to be explored. Based on structural features that characterize SSP families known to take part in postembryonic development, this comparative genome analysis resulted in the identification of genes coding for oligopeptides potentially involved in cell-to-cell communication. Because genome annotation based on short sequence homology is difficult, the criteria for the de novo identification and aggregation of conserved SSP sequences were first benchmarked across five reference plant species. The resulting gene families were then extended to 32 genome sequences, including major crops. The global phylogenetic pattern common to the functionally characterized SSP families suggests that their apparition and expansion coincide with that of the land plants. The SSP families can be searched online for members, sequences and consensus (http://bioinformatics.psb.ugent.be/webtools/PlantSSP/). Looking for putative regulators of root development, Arabidopsis thaliana SSP genes were further selected through transcriptome meta-analysis based on their expression at specific stages and in specific cell types in the course of the lateral root formation. As an additional indication that formerly uncharacterized SSPs may control development, this study showed that root growth and branching were altered by the application of synthetic peptides matching conserved SSP motifs, sometimes in very specific ways. The strategy used in the study, combining comparative genomics, transcriptome meta-analysis and peptide functional assays in planta, pinpoints factors potentially involved in non-cell-autonomous regulatory mechanisms. A similar approach can be implemented in different species for the study of a wide range of developmental programmes.

  8. Expanding the repertoire of secretory peptides controlling root development with comparative genome analysis and functional assays

    PubMed Central

    Ghorbani, Sarieh; Lin, Yao-Cheng; Parizot, Boris; Fernandez, Ana; Njo, Maria Fransiska; Van de Peer, Yves; Beeckman, Tom; Hilson, Pierre

    2015-01-01

    Plant genomes encode numerous small secretory peptides (SSPs) whose functions have yet to be explored. Based on structural features that characterize SSP families known to take part in postembryonic development, this comparative genome analysis resulted in the identification of genes coding for oligopeptides potentially involved in cell-to-cell communication. Because genome annotation based on short sequence homology is difficult, the criteria for the de novo identification and aggregation of conserved SSP sequences were first benchmarked across five reference plant species. The resulting gene families were then extended to 32 genome sequences, including major crops. The global phylogenetic pattern common to the functionally characterized SSP families suggests that their apparition and expansion coincide with that of the land plants. The SSP families can be searched online for members, sequences and consensus (http://bioinformatics.psb.ugent.be/webtools/PlantSSP/). Looking for putative regulators of root development, Arabidopsis thaliana SSP genes were further selected through transcriptome meta-analysis based on their expression at specific stages and in specific cell types in the course of the lateral root formation. As an additional indication that formerly uncharacterized SSPs may control development, this study showed that root growth and branching were altered by the application of synthetic peptides matching conserved SSP motifs, sometimes in very specific ways. The strategy used in the study, combining comparative genomics, transcriptome meta-analysis and peptide functional assays in planta, pinpoints factors potentially involved in non-cell-autonomous regulatory mechanisms. A similar approach can be implemented in different species for the study of a wide range of developmental programmes. PMID:26195730

  9. Comparative 3D genome structure analysis of the fission and the budding yeast.

    PubMed

    Gong, Ke; Tjong, Harianto; Zhou, Xianghong Jasmine; Alber, Frank

    2015-01-01

    We studied the 3D structural organization of the fission yeast genome, which emerges from the tethering of heterochromatic regions in otherwise randomly configured chromosomes represented as flexible polymer chains in a nuclear environment. This model is sufficient to explain in a statistical manner many experimentally determined distinctive features of the fission yeast genome, including chromatin interaction patterns from Hi-C experiments and the co-locations of functionally related and co-expressed genes, such as genes expressed by Pol-III. Our findings demonstrate that some previously described structure-function correlations can be explained as a consequence of random chromatin collisions driven by a few geometric constraints (mainly due to centromere-SPB and telomere-NE tethering) combined with the specific gene locations in the chromosome sequence. We also performed a comparative analysis between the fission and budding yeast genome structures, for which we previously detected a similar organizing principle. However, due to the different chromosome sizes and numbers, substantial differences are observed in the 3D structural genome organization between the two species, most notably in the nuclear locations of orthologous genes, and the extent of nuclear territories for genes and chromosomes. However, despite those differences, remarkably, functional similarities are maintained, which is evident when comparing spatial clustering of functionally related genes in both yeasts. Functionally related genes show a similar spatial clustering behavior in both yeasts, even though their nuclear locations are largely different between the yeast species.

  10. Comparative Genomic Analysis among Four Representative Isolates of Phytophthora sojae Reveals Genes under Evolutionary Selection

    PubMed Central

    Ye, Wenwu; Wang, Yang; Tyler, Brett M.; Wang, Yuanchao

    2016-01-01

    Comparative genomic analysis is useful for identifying genes affected by evolutionary selection and for studying adaptive variation in gene functions. In Phytophthora sojae, a model oomycete plant pathogen, such studies are lacking. We compared sequence data among four isolates of P. sojae, which represent its four major genotypes. These isolates exhibited >99.688%, >99.864%, and >98.981% sequence identities at genome, gene, and non-gene regions, respectively. One hundred and fifty-three positive selection and 139 negative selection candidate genes were identified. Between the two categories of genes, the positive selection genes were flanked by larger intergenic regions, poorly annotated in function, and less conserved; they had relatively lower transcription levels but many genes had increased transcripts during infection. Genes coding for predicted secreted proteins, particularly effectors, were overrepresented in positive selection. Several RxLR effector genes were identified as positive selection genes, exhibiting much stronger positive selection levels. In addition, candidate genes with presence/absence polymorphism were analyzed. This study provides a landscape of genomic variation among four representative P. sojae isolates and characterizes several evolutionary selection-affected gene candidates. The results suggest a relatively covert two-speed genome evolution pattern in P. sojae and will provide clues for identification of new virulence factors in the oomycete plant pathogens. PMID:27746768

  11. Comparative genomic analysis of Helicobacter pylori from Malaysia identifies three distinct lineages suggestive of differential evolution

    PubMed Central

    Kumar, Narender; Mariappan, Vanitha; Baddam, Ramani; Lankapalli, Aditya K.; Shaik, Sabiha; Goh, Khean-Lee; Loke, Mun Fai; Perkins, Tim; Benghezal, Mohammed; Hasnain, Seyed E.; Vadivelu, Jamuna; Marshall, Barry J.; Ahmed, Niyaz

    2015-01-01

    The discordant prevalence of Helicobacter pylori and its related diseases, for a long time, fostered certain enigmatic situations observed in the countries of the southern world. Variation in H. pylori infection rates and disease outcomes among different populations in multi-ethnic Malaysia provides a unique opportunity to understand dynamics of host–pathogen interaction and genome evolution. In this study, we extensively analyzed and compared genomes of 27 Malaysian H. pylori isolates and identified three major phylogeographic lineages: hspEastAsia, hpEurope and hpSouthIndia. The analysis of the virulence genes within the core genome, however, revealed a comparable pathogenic potential of the strains. In addition, we identified four genes limited to strains of East-Asian lineage. Our analyses identified a few strain-specific genes encoding restriction modification systems and outlined 311 core genes possibly under differential evolutionary constraints, among the strains representing different ethnic groups. The cagA and vacA genes also showed variations in accordance with the host genetic background of the strains. Moreover, restriction modification genes were found to be significantly enriched in East-Asian strains. An understanding of these variations in the genome content would provide significant insights into various adaptive and host modulation strategies harnessed by H. pylori to effectively persist in a host-specific manner. PMID:25452339
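
    The notions of core genes and lineage-limited genes used above can be sketched with plain set operations. This is only an illustration under stated assumptions: the strain names, gene identifiers and function names are hypothetical, gene presence is given as one set per strain, and "limited to a lineage" is read here as "present in every strain of that lineage and in no strain outside it", which may differ from the criteria the authors applied.

```python
def core_and_pan(genomes):
    """Return (core, pan) gene sets for a dict of strain name -> set of genes.

    core: genes present in every strain; pan: genes present in any strain.
    """
    if not genomes:
        raise ValueError("need at least one genome")
    gene_sets = list(genomes.values())
    core = set.intersection(*gene_sets)
    pan = set.union(*gene_sets)
    return core, pan


def lineage_limited(genomes, lineage):
    """Genes shared by all strains in `lineage` and absent from all others."""
    inside = [genes for name, genes in genomes.items() if name in lineage]
    outside = [genes for name, genes in genomes.items() if name not in lineage]
    if not inside:
        raise ValueError("lineage contains no known strains")
    shared_inside = set.intersection(*inside)
    seen_outside = set().union(*outside)
    return shared_inside - seen_outside


# Hypothetical presence/absence data, not taken from the study.
toy_genomes = {
    "strainA": {"g1", "g2", "g3"},
    "strainB": {"g1", "g2", "g4"},
    "strainC": {"g1", "g5"},
}
toy_core, toy_pan = core_and_pan(toy_genomes)
toy_limited = lineage_limited(toy_genomes, {"strainA", "strainB"})
```

    In practice the per-strain gene sets would come from an ortholog clustering step, which this sketch does not attempt to reproduce.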

  12. Comparative Analysis of Mitochondrial Genomes of Five Aphid Species (Hemiptera: Aphididae) and Phylogenetic Implications

    PubMed Central

    Wang, Yuan; Huang, Xiao-Lei; Qiao, Ge-Xia

    2013-01-01

    Insect mitochondrial genomes (mitogenomes) are of great interest in exploring molecular evolution, phylogenetics and population genetics. Only two mitogenomes have been previously released in the insect group Aphididae, which consists of about 5,000 known species including some agricultural, forestry and horticultural pests. Here we report the complete 16,317 bp mitogenome of Cavariella salicicola and two nearly complete mitogenomes of Aphis glycines and Pterocomma pilosum. We also present a first comparative analysis of mitochondrial genomes of aphids. Results showed that aphid mitogenomes share conserved genomic organization, nucleotide and amino acid composition, and codon usage features. All 37 genes usually present in animal mitogenomes were sequenced and annotated. The analysis of gene evolutionary rate revealed the lowest and highest rates for COI and ATP8, respectively. A unique repeat region exclusively in aphid mitogenomes, which included variable numbers of tandem repeats in a lineage-specific manner, was highlighted for the first time. This region may have a function as another origin of replication. Phylogenetic reconstructions based on protein-coding genes and the stem-loop structures of control regions confirmed a sister relationship between Cavariella and pterocommatines. Current evidence suggests that pterocommatines could be formally transferred into Macrosiphini. Our paper also offers methodological instructions for obtaining other Aphididae mitochondrial genomes. PMID:24147014

  13. Comparative genomic analysis reveals evidence of two novel Vibrio species closely related to V. cholerae

    PubMed Central

    2010-01-01

    Background In recent years genome sequencing has been used to characterize new bacterial species, a method of analysis available as a result of improved methodology and reduced cost. Included in a constantly expanding list of Vibrio species are several that have been reclassified as novel members of the Vibrionaceae. The description of two putative new Vibrio species, Vibrio sp. RC341 and Vibrio sp. RC586 for which we propose the names V. metecus and V. parilis, respectively, previously characterized as non-toxigenic environmental variants of V. cholerae is presented in this study. Results Based on results of whole-genome average nucleotide identity (ANI), average amino acid identity (AAI), rpoB similarity, MLSA, and phylogenetic analysis, the new species are concluded to be phylogenetically closely related to V. cholerae and V. mimicus. Vibrio sp. RC341 and Vibrio sp. RC586 demonstrate features characteristic of V. cholerae and V. mimicus, respectively, on differential and selective media, but their genomes show a 12 to 15% divergence (88 to 85% ANI and 92 to 91% AAI) compared to the sequences of V. cholerae and V. mimicus genomes (ANI <95% and AAI <96% indicative of separate species). Vibrio sp. RC341 and Vibrio sp. RC586 share 2104 ORFs (59%) and 2058 ORFs (56%) with the published core genome of V. cholerae and 2956 (82%) and 3048 ORFs (84%) with V. mimicus MB-451, respectively. The novel species share 2926 ORFs with each other (81% Vibrio sp. RC341 and 81% Vibrio sp. RC586). Virulence-associated factors and genomic islands of V. cholerae and V. mimicus, including VSP-I and II, were found in these environmental Vibrio spp. Conclusions Results of this analysis demonstrate these two environmental vibrios, previously characterized as variant V. cholerae strains, are new species which have evolved from ancestral lineages of the V. cholerae and V. mimicus clade. The presence of conserved integration loci for genomic islands as well as evidence of horizontal gene
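
    The species cutoffs quoted in this abstract (ANI <95% and AAI <96%) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: real ANI and AAI values come from whole-genome comparisons, whereas the hypothetical helper percent_identity below only scores a pair of pre-aligned sequences, and the function names and default arguments are assumptions made for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns in which both sequences carry a residue.
    pairs = [(a, b) for a, b in zip(aligned_a.upper(), aligned_b.upper())
             if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


def likely_separate_species(ani: float, aai: float,
                            ani_cutoff: float = 95.0,
                            aai_cutoff: float = 96.0) -> bool:
    """Apply the cutoffs quoted in the abstract: ANI <95% and AAI <96%."""
    return ani < ani_cutoff and aai < aai_cutoff
```

    With the upper values reported above (88% ANI, 92% AAI), likely_separate_species(88.0, 92.0) returns True, in line with the abstract's conclusion that the two isolates are distinct species.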

  14. Genome-wide analysis of simple sequence repeats in marine animals-a comparative approach.

    PubMed

    Jiang, Qun; Li, Qi; Yu, Hong; Kong, Lingfeng

    2014-10-01

    Tandem simple sequence repeats (SSRs) are one of the most popular molecular markers in genetic analysis owing to their ubiquitous occurrence, high reproducibility, multiallelic nature, and codominant mode. Their high mutability gives SSRs a role in genome evolution, and they correspondingly show different patterns. Comparative analysis of genomic SSRs in different taxonomic groups usually focuses on land species, while marine animals have been neglected. This study examined the abundance of genomic SSRs with repeat unit lengths of 1-6 bp in 30 marine animals from nine taxonomic groups and compared them with land species. Thousands of SSRs were discovered in every organism, providing a huge resource for the development of molecular markers. The thirty marine animals showed profound differences in SSR characteristics, but some group-specific trends were also found. Both similarities and differences in repeat patterns were discovered between the land and marine species. Two taxon-specific SSR types were discovered: the pentanucleotide motif AGAGG in Euteleostei and the hexanucleotide repeat ATGTAC in Porifera and Echinodermata. Gene ontology (GO) enrichment analysis of two representative species (Amphimedon queenslandica for Porifera and Strongylocentrotus purpuratus for Echinodermata) revealed a functional preference of the ATGTAC motif-associated genes, which might hint at evolutionary significance.
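
    Scanning a sequence for perfect SSRs with 1-6 bp repeat units, as described above, can be sketched with a regular expression. This is a minimal illustration, not the tool used in the study: the function name find_ssrs and the min_repeats threshold are assumptions, and real SSR surveys typically apply a different minimum copy number for each unit length.

```python
import re


def find_ssrs(seq: str, min_repeats: int = 3):
    """Yield (start, motif, copies) for perfect tandem repeats with 1-6 bp units."""
    seq = seq.upper()
    # The lazy quantifier {1,6}? tries the shortest unit first, so a run such
    # as "AAAAAA" is reported as a mononucleotide repeat rather than (AA)3.
    # The backreference \1 then requires at least min_repeats - 1 more copies.
    pattern = re.compile(r"([ACGT]{1,6}?)\1{%d,}" % (min_repeats - 1))
    for match in pattern.finditer(seq):
        motif = match.group(1)
        yield match.start(), motif, len(match.group(0)) // len(motif)
```

    For example, list(find_ssrs("TTAGAGGAGAGGAGAGGCC")) finds three tandem copies of the AGAGG motif starting at offset 2 (0-based).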

  15. Clinical utility of an array comparative genomic hybridization analysis for Williams syndrome.

    PubMed

    Yagihashi, Tatsuhiko; Torii, Chiharu; Takahashi, Reiko; Omori, Mikimasa; Kosaki, Rika; Yoshihashi, Hiroshi; Ihara, Masahiro; Minagawa-Kawai, Yasuyo; Yamamoto, Junichi; Takahashi, Takao; Kosaki, Kenjiro

    2014-11-01

    To reveal the relation between intellectual disability and the deleted intervals in Williams syndrome, we performed an array comparative genomic hybridization analysis and standardized developmental testing for 11 patients diagnosed as having Williams syndrome based on fluorescent in situ hybridization testing. One patient had a large 4.2-Mb deletion spanning distally beyond the common 1.5-Mb intervals observed in 10/11 patients. We formulated a linear equation describing the developmental age of the 10 patients with the common deletion; the developmental age of the patient with the 4.2-Mb deletion was significantly below the expectation (developmental age = 0.51 × chronological age). The large deletion may account for the severe intellectual disability; therefore, the use of array comparative genomic hybridization may provide practical information regarding individuals with Williams syndrome.
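
    The linear relation quoted above (developmental age = 0.51 × chronological age) can be applied as in the following sketch. Only the slope comes from the abstract. The function names and the example ages are hypothetical, and the abstract does not state the units, so both ages are assumed to share one unit.

```python
SLOPE = 0.51  # from the abstract: developmental age = 0.51 x chronological age


def expected_developmental_age(chronological_age: float) -> float:
    """Expected developmental age for a patient with the common deletion."""
    return SLOPE * chronological_age


def shortfall(chronological_age: float, observed_developmental_age: float) -> float:
    """How far an observed developmental age falls below the expectation."""
    return expected_developmental_age(chronological_age) - observed_developmental_age
```

    For a hypothetical chronological age of 10, the expected developmental age is about 5.1. A hypothetical observed value of 3.0 would then lie about 2.1 below the line, which is the kind of deviation the abstract reports for the patient with the 4.2-Mb deletion.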

  16. Whole-Genome Sequencing and Comparative Genome Analysis of Bacillus subtilis Strains Isolated from Non-Salted Fermented Soybean Foods

    PubMed Central

    Kamada, Mayumi; Hase, Sumitaka; Fujii, Kazushi; Miyake, Masato; Sato, Kengo; Kimura, Keitarou; Sakakibara, Yasubumi

    2015-01-01

    Bacillus subtilis is the main component in the fermentation of soybeans. To investigate the genetics of the soybean-fermenting B. subtilis strains and its relationship with the productivity of extracellular poly-γ-glutamic acid (γPGA), we sequenced the whole genome of eight B. subtilis strains isolated from non-salted fermented soybean foods in Southeast Asia. Assembled nucleotide sequences were compared with those of a natto (fermented soybean food) starter strain B. subtilis BEST195 and the laboratory standard strain B. subtilis 168 that is incapable of γPGA production. Detected variants were investigated in terms of insertion sequences, biotin synthesis, production of subtilisin NAT, and regulatory genes for γPGA synthesis, which were related to the fermentation process. Comparing genome sequences, we found that the strains that produce γPGA have a deletion in a protein that constitutes the flagellar basal body, and this deletion was not found in the non-producing strains. We further identified diversity in variants of the bio operon, which is responsible for the biotin auxotrophism of the natto starter strains. Phylogenetic analysis using multilocus sequence typing revealed that the B. subtilis strains isolated from the non-salted fermented soybeans were not clustered together, while the natto-fermenting strains were tightly clustered; this analysis also suggested that the strain isolated from “Tua Nao” of Thailand traces a different evolutionary process from other strains. PMID:26505996

  17. Whole-Genome Sequencing and Comparative Genome Analysis of Bacillus subtilis Strains Isolated from Non-Salted Fermented Soybean Foods.

    PubMed

    Kamada, Mayumi; Hase, Sumitaka; Fujii, Kazushi; Miyake, Masato; Sato, Kengo; Kimura, Keitarou; Sakakibara, Yasubumi

    2015-01-01

    Bacillus subtilis is the main component in the fermentation of soybeans. To investigate the genetics of the soybean-fermenting B. subtilis strains and its relationship with the productivity of extracellular poly-γ-glutamic acid (γPGA), we sequenced the whole genome of eight B. subtilis strains isolated from non-salted fermented soybean foods in Southeast Asia. Assembled nucleotide sequences were compared with those of a natto (fermented soybean food) starter strain B. subtilis BEST195 and the laboratory standard strain B. subtilis 168 that is incapable of γPGA production. Detected variants were investigated in terms of insertion sequences, biotin synthesis, production of subtilisin NAT, and regulatory genes for γPGA synthesis, which were related to the fermentation process. Comparing genome sequences, we found that the strains that produce γPGA have a deletion in a protein that constitutes the flagellar basal body, and this deletion was not found in the non-producing strains. We further identified diversity in variants of the bio operon, which is responsible for the biotin auxotrophism of the natto starter strains. Phylogenetic analysis using multilocus sequence typing revealed that the B. subtilis strains isolated from the non-salted fermented soybeans were not clustered together, while the natto-fermenting strains were tightly clustered; this analysis also suggested that the strain isolated from "Tua Nao" of Thailand traces a different evolutionary process from other strains.

  18. Genome wide characterization of simple sequence repeats in watermelon genome and their application in comparative mapping and genetic diversity analysis

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Simple sequence repeats (SSR) or microsatellite markers are one of the most informative and versatile DNA-based markers. The use of next-generation sequencing technologies allows whole genome sequencing and makes it possible to develop large numbers of SSRs through bioinformatic analysis of genome da...

  19. Comparative and phylogenetic analysis of the mitochondrial genomes in basal hymenopterans

    PubMed Central

    Song, Sheng-Nan; Tang, Pu; Wei, Shu-Jun; Chen, Xue-Xin

    2016-01-01

    The Symphyta is traditionally accepted as a paraphyletic group located in a basal position of the order Hymenoptera. Herein, we conducted a comparative analysis of the mitochondrial genomes in the Symphyta by describing two newly sequenced ones, from Trichiosoma anthracinum, representing the first mitochondrial genome in family Cimbicidae, and Asiemphytus rufocephalus, from family Tenthredinidae. The sequenced lengths of these two mitochondrial genomes were 15,392 and 14,864 bp, respectively. Within the sequenced region, trnC and trnY were rearranged to the upstream of trnI-nad2 in T. anthracinum, while in A. rufocephalus all sequenced genes were arranged in the putative insect ancestral gene arrangement. Rearrangement of the tRNA genes is common in the Symphyta. The rearranged genes are mainly from trnL1 and two tRNA clusters of trnI-trnQ-trnM and trnW-trnC-trnY. The mitochondrial genomes of Symphyta show a biased usage of A and T rather than G and C. Protein-coding genes in Symphyta species show a lower evolutionary rate than those of Apocrita. The Ka/Ks ratios were all less than 1, indicating purifying selection of Symphyta species. Phylogenetic analyses supported the paraphyly and basal position of Symphyta in Hymenoptera. The well-supported phylogenetic relationship in the study is Tenthredinoidea + (Cephoidea + (Orussoidea + Apocrita)). PMID:26879745
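
    The Ka/Ks criterion used in this abstract (ratios below 1 read as purifying selection) can be written as a short classification rule. This is a minimal sketch under stated assumptions: the function name selection_regime and the tolerance around 1 are illustrative, and estimating Ka and Ks from aligned coding sequences is a separate step that is not shown here.

```python
def selection_regime(ka: float, ks: float, tol: float = 1e-9) -> str:
    """Classify a gene by its Ka/Ks ratio (nonsynonymous vs. synonymous rate)."""
    if ks <= 0:
        raise ValueError("Ks must be positive for the ratio to be defined")
    omega = ka / ks
    if abs(omega - 1.0) <= tol:
        return "neutral"
    # Ratios below 1 indicate purifying selection; above 1, positive selection.
    return "purifying" if omega < 1.0 else "positive"
```

    With hypothetical rates Ka = 0.05 and Ks = 0.5, the ratio is 0.1 and the gene is classified as being under purifying selection, the regime the abstract reports for the Symphyta protein-coding genes.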

  20. Search for enhancers: teleost models in comparative genomic and transgenic analysis of cis regulatory elements.

    PubMed

    Müller, Ferenc; Blader, Patrick; Strähle, Uwe

    2002-06-01

    Homology searches between DNA sequences of evolutionarily distant species (phylogenetic footprinting) offer a fast detection method for regulatory sequences. Because of the small size of their genomes, tetraodontid species such as the Japanese pufferfish and green spotted pufferfish have become attractive models for comparative genomics. A disadvantage of the tetraodontid species is, however, that they cannot be bred and manipulated routinely under laboratory conditions, so these species are less attractive for developmental and genetic analysis. In contrast, an increasing arsenal of transgene techniques with the developmental model species zebrafish and medaka is being used for functional analysis of cis regulatory sequences. The main disadvantage of these species is their much larger genomes. While comparison between many loci proved the suitability of phylogenetic footprinting using fish and mammalian sequences, the fast rate of change in enhancer structure and gene duplication within teleosts may obscure detection of homologies. Here we discuss the contribution and potentials provided by different teleost models for the detection and functional analysis of conserved cis-regulatory elements. PMID:12111739

  1. Genome Sequence of Babesia bovis and Comparative Analysis of Apicomplexan Hemoprotozoa

    PubMed Central

    Brayton, Kelly A; Lau, Audrey O. T; Herndon, David R; Hannick, Linda; Kappmeyer, Lowell S; Berens, Shawn J; Bidwell, Shelby L; Brown, Wendy C; Crabtree, Jonathan; Fadrosh, Doug; Feldblum, Tamara; Forberger, Heather A; Haas, Brian J; Howell, Jeanne M; Khouri, Hoda; Koo, Hean; Mann, David J; Norimine, Junzo; Paulsen, Ian T; Radune, Diana; Ren, Qinghu; Smith, Roger K; Suarez, Carlos E; White, Owen; Wortman, Jennifer R; Knowles, Donald P; McElwain, Terry F; Nene, Vishvanath M

    2007-01-01

    Babesia bovis is an apicomplexan tick-transmitted pathogen of cattle imposing a global risk and severe constraints to livestock health and economic development. The complete genome sequence was undertaken to facilitate vaccine antigen discovery, and to allow for comparative analysis with the related apicomplexan hemoprotozoa Theileria parva and Plasmodium falciparum. At 8.2 Mbp, the B. bovis genome is similar in size to that of Theileria spp. Structural features of the B. bovis and T. parva genomes are remarkably similar, and extensive synteny is present despite several chromosomal rearrangements. In contrast, B. bovis and P. falciparum, which have similar clinical and pathological features, have major differences in genome size, chromosome number, and gene complement. Chromosomal synteny with P. falciparum is limited to microregions. The B. bovis genome sequence has allowed wide scale analyses of the polymorphic variant erythrocyte surface antigen protein (ves1 gene) family that, similar to the P. falciparum var genes, is postulated to play a role in cytoadhesion, sequestration, and immune evasion. The ∼150 ves1 genes are found in clusters that are distributed throughout each chromosome, with an increased concentration adjacent to a physical gap on chromosome 1 that contains multiple ves1-like sequences. ves1 clusters are frequently linked to a novel family of variant genes termed smorfs that may themselves contribute to immune evasion, may play a role in variant erythrocyte surface antigen protein biology, or both. Initial expression analysis of ves1 and smorf genes indicates coincident transcription of multiple variants. B. bovis displays a limited metabolic potential, with numerous missing pathways, including two pathways previously described for the P. falciparum apicoplast. This reduced metabolic potential is reflected in the B. bovis apicoplast, which appears to have fewer nuclear genes targeted to it than other apicoplast containing organisms. Finally

  2. Comparative Genomic and Functional Analysis of Lactobacillus casei and Lactobacillus rhamnosus Strains Marketed as Probiotics

    PubMed Central

    Douillard, François P.; Ribbera, Angela; Järvinen, Hanna M.; Kant, Ravi; Pietilä, Taija E.; Randazzo, Cinzia; Paulin, Lars; Laine, Pia K.; Caggia, Cinzia; von Ossowski, Ingemar; Reunanen, Justus; Satokari, Reetta; Salminen, Seppo; Palva, Airi

    2013-01-01

    Four Lactobacillus strains were isolated from marketed probiotic products, including L. rhamnosus strains from Vifit (Friesland Campina) and Idoform (Ferrosan) and L. casei strains from Actimel (Danone) and Yakult (Yakult Honsa Co.). Their genomes and phenotypes were characterized and compared in detail with L. casei strain BL23 and L. rhamnosus strain GG. Phenotypic analysis of the new isolates indicated differences in carbohydrate utilization between L. casei and L. rhamnosus strains, which could be linked to their genotypes. The two isolated L. rhamnosus strains had genomes that were virtually identical to that of L. rhamnosus GG, testifying to their genomic stability and integrity in food products. The L. casei strains showed much greater genomic heterogeneity. Remarkably, all strains contained an intact spaCBA pilus gene cluster. However, only the L. rhamnosus strains produced mucus-binding SpaCBA pili under the conditions tested. Transcription initiation mapping demonstrated that the insertion of an iso-IS30 element upstream of the pilus gene cluster in L. rhamnosus strains but absent in L. casei strains had constituted a functional promoter driving pilus gene expression. All L. rhamnosus strains triggered an NF-κB response via Toll-like receptor 2 (TLR2) in a reporter cell line, whereas the L. casei strains did not or did so to a much lesser extent. This study demonstrates that the two L. rhamnosus strains isolated from probiotic products are virtually identical to L. rhamnosus GG and further highlights the differences between these and L. casei strains widely marketed as probiotics, in terms of genome content, mucus-binding and metabolic capacities, and host signaling capabilities. PMID:23315726

  3. Comparative genomic and functional analysis of Lactobacillus casei and Lactobacillus rhamnosus strains marketed as probiotics.

    PubMed

    Douillard, François P; Ribbera, Angela; Järvinen, Hanna M; Kant, Ravi; Pietilä, Taija E; Randazzo, Cinzia; Paulin, Lars; Laine, Pia K; Caggia, Cinzia; von Ossowski, Ingemar; Reunanen, Justus; Satokari, Reetta; Salminen, Seppo; Palva, Airi; de Vos, Willem M

    2013-03-01

    Four Lactobacillus strains were isolated from marketed probiotic products, including L. rhamnosus strains from Vifit (Friesland Campina) and Idoform (Ferrosan) and L. casei strains from Actimel (Danone) and Yakult (Yakult Honsa Co.). Their genomes and phenotypes were characterized and compared in detail with L. casei strain BL23 and L. rhamnosus strain GG. Phenotypic analysis of the new isolates indicated differences in carbohydrate utilization between L. casei and L. rhamnosus strains, which could be linked to their genotypes. The two isolated L. rhamnosus strains had genomes that were virtually identical to that of L. rhamnosus GG, testifying to their genomic stability and integrity in food products. The L. casei strains showed much greater genomic heterogeneity. Remarkably, all strains contained an intact spaCBA pilus gene cluster. However, only the L. rhamnosus strains produced mucus-binding SpaCBA pili under the conditions tested. Transcription initiation mapping demonstrated that the insertion of an iso-IS30 element upstream of the pilus gene cluster in L. rhamnosus strains but absent in L. casei strains had constituted a functional promoter driving pilus gene expression. All L. rhamnosus strains triggered an NF-κB response via Toll-like receptor 2 (TLR2) in a reporter cell line, whereas the L. casei strains did not or did so to a much lesser extent. This study demonstrates that the two L. rhamnosus strains isolated from probiotic products are virtually identical to L. rhamnosus GG and further highlights the differences between these and L. casei strains widely marketed as probiotics, in terms of genome content, mucus-binding and metabolic capacities, and host signaling capabilities. PMID:23315726

  4. Complete genome sequences and comparative genome analysis of Lactobacillus plantarum strain 5-2 isolated from fermented soybean.

    PubMed

    Liu, Chen-Jian; Wang, Rui; Gong, Fu-Ming; Liu, Xiao-Feng; Zheng, Hua-Jun; Luo, Yi-Yong; Li, Xiao-Ran

    2015-12-01

    Lactobacillus plantarum is an important probiotic and is mostly isolated from fermented foods. We sequenced the genome of L. plantarum strain 5-2, which was derived from fermented soybean isolated from Yunnan province, China. The strain was determined to contain 3114 genes. Fourteen complete insertion sequence (IS) elements were found in the 5-2 chromosome. There were 24 DNA replication proteins and 76 DNA repair proteins in the 5-2 genome. Consistent with the classification of L. plantarum as a facultative heterofermentative lactobacillus, the 5-2 genome encodes key enzymes required for the EMP (Embden-Meyerhof-Parnas) and phosphoketolase (PK) pathways. Several components of the secretion machinery are found in the 5-2 genome, which was compared with L. plantarum ST-III, JDM1 and WCFS1. Most of the specific proteins in the four genomes appeared to be related to their prophage elements.

  5. Genome-scale metabolic modeling of Mucor circinelloides and comparative analysis with other oleaginous species.

    PubMed

    Vongsangnak, Wanwipa; Klanchui, Amornpan; Tawornsamretkit, Iyarest; Tatiyaborwornchai, Witthawin; Laoteng, Kobkul; Meechai, Asawin

    2016-06-01

    We present a novel genome-scale metabolic model iWV1213 of Mucor circinelloides, which is an oleaginous fungus for industrial applications. The model contains 1213 genes, 1413 metabolites and 1326 metabolic reactions across different compartments. We demonstrate that iWV1213 is able to accurately predict the growth rates of M. circinelloides on various nutrient sources and culture conditions using Flux Balance Analysis and Phenotypic Phase Plane analysis. Comparative analysis of three oleaginous genome-scale models, including M. circinelloides (iWV1213), Mortierella alpina (iCY1106) and Yarrowia lipolytica (iYL619_PCP) revealed that iWV1213 possesses a higher number of genes involved in carbohydrate, amino acid, and lipid metabolisms that might contribute to its versatility in nutrient utilization. Moreover, the identification of unique and common active reactions among the Zygomycetes oleaginous models using Flux Variability Analysis unveiled a set of gene/enzyme candidates as metabolic engineering targets for cellular improvement. Thus, iWV1213 offers a powerful metabolic engineering tool for multi-level omics analysis, enabling strain optimization as a cell factory platform of lipid-based production.

  6. Comparative genome mapping in Brassica.

    PubMed

    Lagercrantz, U; Lydiate, D J

    1996-12-01

    A Brassica nigra genetic linkage map was developed from a highly polymorphic cross analyzed with a set of low copy number Brassica RFLP probes. The Brassica genome is extensively duplicated with eight distinct sets of chromosomal segments, each present in three copies, covering virtually the whole genome. Thus, B. nigra could be descended from a hexaploid ancestor. A comparative analysis of B. nigra, B. oleracea and B. rapa genomes, based on maps developed using a common set of RFLP probes, was also performed. The three genomes have distinct chromosomal structures differentiated by a large number of rearrangements, but collinear regions involving virtually the whole of each of the three genomes were identified. The genic contents of B. nigra, B. oleracea and B. rapa were basically equivalent and differences in chromosome number (8, 9 and 10, respectively) are probably the result of chromosome fusions and/or fissions. The strong conservation of overall genic content across the three Brassica genomes mirrors the conservation of genic content observed over a much longer evolutionary span in cereals. However, the rate of chromosomal rearrangement in crucifers is much higher than that observed in cereal genomes.

  7. Comparative Mitochondrial Genome Analysis of Eligma narcissus and other Lepidopteran Insects Reveals Conserved Mitochondrial Genome Organization and Phylogenetic Relationships.

    PubMed

    Dai, Li-Shang; Zhu, Bao-Jian; Zhao, Yue; Zhang, Cong-Fen; Liu, Chao-Liang

    2016-01-01

    In this study, we sequenced the complete mitochondrial genome of Eligma narcissus and compared it with 18 other lepidopteran species. The mitochondrial genome (mitogenome) was a circular molecule of 15,376 bp containing 13 protein-coding genes (PCGs), 22 transfer RNA (tRNA) genes, two ribosomal RNA (rRNA) genes and an adenine (A) + thymine (T) - rich region. The positive AT skew (0.007) indicated the occurrence of more As than Ts. The arrangement of 13 PCGs was similar to that of other sequenced lepidopterans. All PCGs were initiated by ATN codons, except for the cytochrome c oxidase subunit 1 (cox1) gene, which was initiated by the CGA sequence, as observed in other lepidopterans. The results of the codon usage analysis indicated that Asn, Ile, Leu, Tyr and Phe were the five most frequent amino acids. All tRNA genes were shown to be folded into the expected typical cloverleaf structure observed for mitochondrial tRNA genes. Phylogenetic relationships were analyzed based on the nucleotide sequences of 13 PCGs from other insect mitogenomes, which confirmed that E. narcissus is a member of the Noctuidae superfamily. PMID:27222440
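
    The AT skew reported above follows the standard definition AT skew = (A − T) / (A + T), so a positive value means more As than Ts. The following Python sketch computes it, together with A+T content, for any nucleotide string. The function names are illustrative, and the example sequences are toy inputs rather than data from the study.

```python
from collections import Counter


def at_skew(seq: str) -> float:
    """AT skew = (A - T) / (A + T); positive when As outnumber Ts."""
    counts = Counter(seq.upper())
    a, t = counts["A"], counts["T"]
    if a + t == 0:
        raise ValueError("sequence contains no A or T")
    return (a - t) / (a + t)


def at_content(seq: str) -> float:
    """Fraction of unambiguous bases (A, C, G, T) that are A or T."""
    counts = Counter(seq.upper())
    total = sum(counts[base] for base in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return (counts["A"] + counts["T"]) / total
```

    A value as small as the 0.007 quoted in the abstract corresponds to a near-balanced strand with only a slight excess of A over T.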

  8. A Comparative Analysis of Mitochondrial Genomes in Coleoptera (Arthropoda: Insecta) and Genome Descriptions of Six New Beetles

    PubMed Central

    Song, H.; Cameron, S. L.; Whiting, M. F.

    2008-01-01

    Coleoptera is the most diverse group of insects with over 360,000 described species divided into four suborders: Adephaga, Archostemata, Myxophaga, and Polyphaga. In this study, we present six new complete mitochondrial genome (mtgenome) descriptions, including a representative of each suborder, and analyze the evolution of mtgenomes from a comparative framework using all available coleopteran mtgenomes. We propose a modification of atypical cox1 start codons based on sequence alignment to better reflect the conservation observed across species as well as findings of TTG start codons in other genes. We also analyze tRNA-Ser(AGN) anticodons, usually GCU in arthropods, and report a conserved UCU anticodon as a possible synapomorphy across Polyphaga. We further analyze the secondary structure of tRNA-Ser(AGN) and present a consensus structure and an updated covariance model that allows tRNAscan-SE (via the COVE software package) to locate and fold these atypical tRNAs with much greater consistency. We also report secondary structure predictions for both rRNA genes based on conserved stems. All six species of beetle have the same gene order as the ancestral insect. We report noncoding DNA regions, including a small gap region of about 20 bp between tRNA-Ser(UCN) and nad1 that is present in all six genomes, and present results of a base composition analysis. PMID:18779259

  9. Comparative Mitochondrial Genome Analysis of Eligma narcissus and other Lepidopteran Insects Reveals Conserved Mitochondrial Genome Organization and Phylogenetic Relationships

    PubMed Central

    Dai, Li-Shang; Zhu, Bao-Jian; Zhao, Yue; Zhang, Cong-Fen; Liu, Chao-Liang

    2016-01-01

    In this study, we sequenced the complete mitochondrial genome of Eligma narcissus and compared it with 18 other lepidopteran species. The mitochondrial genome (mitogenome) was a circular molecule of 15,376 bp containing 13 protein-coding genes (PCGs), 22 transfer RNA (tRNA) genes, two ribosomal RNA (rRNA) genes and an adenine (A) + thymine (T) − rich region. The positive AT skew (0.007) indicated the occurrence of more As than Ts. The arrangement of 13 PCGs was similar to that of other sequenced lepidopterans. All PCGs were initiated by ATN codons, except for the cytochrome c oxidase subunit 1 (cox1) gene, which was initiated by the CGA sequence, as observed in other lepidopterans. The results of the codon usage analysis indicated that Asn, Ile, Leu, Tyr and Phe were the five most frequent amino acids. All tRNA genes were shown to be folded into the expected typical cloverleaf structure observed for mitochondrial tRNA genes. Phylogenetic relationships were analyzed based on the nucleotide sequences of 13 PCGs from other insect mitogenomes, which confirmed that E. narcissus is a member of the Noctuidae superfamily. PMID:27222440

  10. Pyrosequencing-Based Comparative Genome Analysis of Vibrio vulnificus Environmental Isolates

    PubMed Central

    Morrison, Shatavia S.; Williams, Tiffany; Cain, Aurora; Froelich, Brett; Taylor, Casey; Baker-Austin, Craig; Verner-Jeffreys, David; Hartnell, Rachel; Oliver, James D.; Gibas, Cynthia J.

    2012-01-01

    Between 1996 and 2006, the US Centers for Disease Control reported that the only category of food-borne infections increasing in frequency was that caused by members of the genus Vibrio. The Gram-negative bacterium Vibrio vulnificus is a ubiquitous inhabitant of estuarine waters, and is the number one cause of seafood-related deaths in the US. Many V. vulnificus isolates have been studied, and it has been shown that two genetically distinct subtypes, distinguished by 16S rDNA and other gene polymorphisms, are associated predominantly with either environmental or clinical isolation. While local genetic differences between the subtypes have been probed, only the genomes of clinical isolates have so far been completely sequenced. In order to better understand V. vulnificus as an agent of disease and to identify the molecular components of its virulence mechanisms, we have completed whole genome shotgun sequencing of three diverse environmental genotypes using a pyrosequencing approach. V. vulnificus strain JY1305 was sequenced to a depth of 33×, and strains E64MW and JY1701 were sequenced to lesser depth, covering approximately 99.9% of each genome. We have performed a comparative analysis of these sequences against the previously published sequences of three V. vulnificus clinical isolates. We find that the genome of V. vulnificus is dynamic, with 1.27% of genes in the C-genotype genomes not found in the E-genotype genomes. We identified key genes that differentiate between the genomes of the clinical and environmental genotypes. 167 genes were found to be specifically associated with environmental genotypes and 278 genes with clinical genotypes. Genes specific to the clinical strains include components of sialic acid catabolism, mannitol fermentation, and a component of a Type IV secretory pathway VirB4, as well as several other genes with potential significance for human virulence. Genes specific to environmental strains included several that may have
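
    One way to express the genotype-specific gene sets mentioned in this abstract is with set operations on per-genome gene inventories. This is a minimal sketch, not the study's method: the abstract does not define "specifically associated", so the rule assumed here is "present in every genome of one group and absent from every genome of the other". The function name and the gene labels in the example are hypothetical.

```python
def genes_specific_to(group, other):
    """Genes found in every genome of `group` and in no genome of `other`.

    Both arguments are non-empty lists of sets of gene identifiers.
    """
    if not group or not other:
        raise ValueError("both genome groups must be non-empty")
    shared_by_group = set.intersection(*group)  # present in all group genomes
    seen_in_other = set.union(*other)           # present in any other genome
    return shared_by_group - seen_in_other
```

    With hypothetical inventories clinical = [{"g1", "g2", "g3"}, {"g1", "g2", "g4"}] and environmental = [{"g1", "g5"}, {"g1", "g4", "g5"}], the clinical-specific set is {"g2"} and the environmental-specific set is {"g5"}.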

  11. Comparative genome analysis of 19 Ureaplasma urealyticum and Ureaplasma parvum strains

    PubMed Central

    2012-01-01

    Background Ureaplasma urealyticum (UUR) and Ureaplasma parvum (UPA) are sexually transmitted bacteria among humans implicated in a variety of disease states including but not limited to: nongonococcal urethritis, infertility, adverse pregnancy outcomes, chorioamnionitis, and bronchopulmonary dysplasia in neonates. There are 10 distinct serotypes of UUR and 4 of UPA. Efforts to determine whether difference in pathogenic potential exists at the ureaplasma serovar level have been hampered by limitations of antibody-based typing methods, multiple cross-reactions and poor discriminating capacity in clinical samples containing two or more serovars. Results We determined the genome sequences of the American Type Culture Collection (ATCC) type strains of all UUR and UPA serovars as well as four clinical isolates of UUR for which we were not able to determine serovar designation. UPA serovars had 0.75−0.78 Mbp genomes and UUR serovars were 0.84−0.95 Mbp. The original classification of ureaplasma isolates into distinct serovars was largely based on differences in the major ureaplasma surface antigen called the multiple banded antigen (MBA) and reactions of human and animal sera to the organisms. Whole genome analysis of the 14 serovars and the 4 clinical isolates showed the mba gene was part of a large superfamily, which is a phase variable gene system, and that some serovars have identical sets of mba genes. Most of the differences among serovars are hypothetical genes, and in general the two species and 14 serovars are extremely similar at the genome level. Conclusions Comparative genome analysis suggests UUR is more capable of acquiring genes horizontally, which may contribute to its greater virulence for some conditions. The overwhelming evidence of extensive horizontal gene transfer among these organisms from our previous studies combined with our comparative analysis indicates that ureaplasmas exist as quasi-species rather than as stable serovars in their native

  12. A comparative analysis of the DNA recombination repair pathway in mycobacterial genomes.

    PubMed

    Singh, Amandeep; Bhagavat, Raghu; Vijayan, M; Chandra, Nagasuma

    2016-07-01

    In prokaryotes, repair by homologous recombination provides a major means to reinstate the genetic information lost in DNA damage. The recombination repair pathway in mycobacteria differs in multiple ways from that in Escherichia coli. Of about 20 proteins known to be involved in the pathway, a set of 9 proteins, namely, RecF, RecO, RecR, RecA, SSBa, RuvA, RuvB and RuvC was found to be indispensable among the 43 mycobacterial strains. A domain level analysis indicated that most domains involved in recombination repair are unique to these proteins and are present as single copies in the genomes. Synteny analysis reveals that the gene order of proteins involved in the pathway is not conserved, suggesting that they may be regulated differently in different species. Sequence conservation among the same protein from different strains suggests the importance of RecO-RecA and RecFOR-RecA presynaptic pathways in the repair of double-strand breaks and single-strand breaks respectively. New annotations obtained from the analysis include identification of a protein with a probable Holliday junction binding role present in 41 mycobacterial genomes and that of a RecB-like nuclease, containing a cas4 domain, present in 42 genomes. New insights into the binding of small molecules to the relevant proteins are provided by binding pocket analysis using three dimensional structural models. Analysis of the various features of the recombination repair pathway, presented here, is likely to provide a framework for further exploring stress response and emergence of drug resistance in mycobacteria. PMID:27450012

  13. Genome sequence of the model sulfate reducer Desulfovibrio gigas: a comparative analysis within the Desulfovibrio genus*

    PubMed Central

    Morais-Silva, Fabio O; Rezende, Antonio Mauro; Pimentel, Catarina; Santos, Catia I; Clemente, Carla; Varela–Raposo, Ana; Resende, Daniela M; da Silva, Sofia M; de Oliveira, Luciana Márcia; Matos, Marcia; Costa, Daniela A; Flores, Orfeu; Ruiz, Jerónimo C; Rodrigues-Pousada, Claudina

    2014-01-01

    Desulfovibrio gigas is a model organism of sulfate-reducing bacteria whose energy metabolism and stress response have been extensively studied. The complete genomic context of this organism was, however, not yet available. The sequencing of the D. gigas genome provides insights into the integrated network of energy conserving complexes and structures present in this bacterium. Comparison with genomes of other Desulfovibrio spp. reveals the presence of two different CRISPR/Cas systems in D. gigas. Phylogenetic analysis using conserved protein sequences (encoded by rpoB and gyrB) indicates two main groups of Desulfovibrio spp., with D. gigas more closely related to D. vulgaris and D. desulfuricans strains. Gene duplications were found, such as those encoding fumarate reductase, formate dehydrogenase, and superoxide dismutase. Complexes not yet described within the Desulfovibrio genus were identified: the Mnh complex, a v-type ATP-synthase, as well as genes encoding the MinCDE system that could be responsible for the larger size of D. gigas when compared to other members of the genus. A low number of hydrogenases and the absence of the codh/acs and pfl genes, both present in D. vulgaris strains, indicate that intermediate cycling mechanisms may contribute substantially less to the energy gain in D. gigas compared to other Desulfovibrio spp. This might be compensated by the presence of other unique genomic arrangements of complexes such as the Rnf and the Hdr/Flox, or by the presence of NAD(P)H related complexes, like the Nuo, NfnAB or Mnh. PMID:25055974

  14. Significance of genomic instability in breast cancer in atomic bomb survivors: analysis of microarray-comparative genomic hybridization

    PubMed Central

    2011-01-01

    Background It has been postulated that ionizing radiation induces breast cancers among atomic bomb (A-bomb) survivors. We have reported a higher incidence of HER2 and C-MYC oncogene amplification in breast cancers from A-bomb survivors. The purpose of this study was to clarify the effect of A-bomb radiation exposure on genomic instability (GIN), which is an important hallmark of carcinogenesis, in archival formalin-fixed paraffin-embedded (FFPE) tissues of breast cancer by using microarray-comparative genomic hybridization (aCGH). Methods Tumor DNA was extracted from FFPE tissues of invasive ductal cancers from 15 survivors who were exposed at 1.5 km or less from the hypocenter and 13 calendar year-matched non-exposed patients, followed by aCGH analysis using a high-density oligonucleotide microarray. The total length of copy number aberrations (CNA) was used as an indicator of GIN, and correlations with clinicopathological factors were statistically tested. Results The mean of the derivative log ratio spread (DLRSpread), which estimates the noise by calculating the spread of log ratio differences between consecutive probes for all chromosomes, was 0.54 (range, 0.26 to 1.05). The concordance of results between aCGH and fluorescence in situ hybridization (FISH) for HER2 gene amplification was 88%. The incidence of HER2 amplification and the histological grade were significantly higher in the A-bomb survivors than in the control group (P = 0.04, respectively). The total length of CNA tended to be larger in the A-bomb survivors (P = 0.15). Correlation analysis of CNA and clinicopathological factors revealed that DLRSpread was significantly negatively correlated with the total length of CNA (P = 0.034, r = -0.40). Multivariate analysis with covariance revealed that the exposure to A-bomb was a significant (P = 0.005) independent factor which was associated with larger total length of CNA of breast cancers. Conclusions Thus, archival FFPE tissues from A-bomb survivors are useful for genome-wide a

  15. The genome sequence of E. coli W (ATCC 9637): comparative genome analysis and an improved genome-scale reconstruction of E. coli

    PubMed Central

    2011-01-01

    Background Escherichia coli is a model prokaryote, an important pathogen, and a key organism for industrial biotechnology. E. coli W (ATCC 9637), one of four strains designated as safe for laboratory purposes, has not been sequenced. E. coli W is a fast-growing strain and is the only safe strain that can utilize sucrose as a carbon source. Lifecycle analysis has demonstrated that sucrose from sugarcane is a preferred carbon source for industrial bioprocesses. Results We have sequenced and annotated the genome of E. coli W. The chromosome is 4,900,968 bp and encodes 4,764 ORFs. Two plasmids, pRK1 (102,536 bp) and pRK2 (5,360 bp), are also present. W has unique features relative to other sequenced laboratory strains (K-12, B and Crooks): it has a larger genome and belongs to phylogroup B1 rather than A. W also grows on a much broader range of carbon sources than does K-12. A genome-scale reconstruction was developed and validated in order to interrogate metabolic properties. Conclusions The genome of W is more similar to commensal and pathogenic B1 strains than phylogroup A strains, and therefore has greater utility for comparative analyses with these strains. W should therefore be the strain of choice, or 'type strain' for group B1 comparative analyses. The genome annotation and tools created here are expected to allow further utilization and development of E. coli W as an industrial organism for sucrose-based bioprocesses. Refinements in our E. coli metabolic reconstruction allow it to more accurately define E. coli metabolism relative to previous models. PMID:21208457

  16. The Complete Chloroplast Genome Sequence of a Relict Conifer Glyptostrobus pensilis: Comparative Analysis and Insights into Dynamics of Chloroplast Genome Rearrangement in Cupressophytes and Pinaceae

    PubMed Central

    Zheng, Renhua; Xu, Haibin; Zhou, Yanwei; Li, Meiping; Lu, Fengjuan; Dong, Yini; Liu, Xin; Chen, Jinhui; Shi, Jisen

    2016-01-01

    Glyptostrobus pensilis, belonging to the monotypic genus Glyptostrobus (Family: Cupressaceae), is an ancient conifer that is naturally distributed in low-lying wet areas. Here, we report the complete chloroplast (cp) genome sequence (132,239 bp) of G. pensilis. The G. pensilis cp genome is similar in gene content, organization and genome structure to the sequenced cp genomes from other cupressophytes, especially with respect to the loss of the inverted repeat region A (IRA). Through phylogenetic analysis, we demonstrated that the genus Glyptostrobus is closely related to the genus Cryptomeria, supporting previous findings based on physiological characteristics. Since IRs play an important role in stabilizing the cp genome and conifer cp genomes lost different IR regions after splitting into two clades (cupressophytes and Pinaceae), we performed cp genome rearrangement analysis and found more extensive cp genome rearrangements among the species of cupressophytes relative to Pinaceae. Additional repeat analysis indicated that cupressophyte cp genomes contained fewer potential functional repeats, especially in Cupressaceae, compared with Pinaceae. These results suggested that the dynamics of cp genome rearrangement in conifers differed since the two clades, Pinaceae and cupressophytes, lost IR copies independently and developed different repeats to complement the residual IRs. In addition, we identified 170 perfect simple sequence repeats that will be useful in future research focusing on the evolution of genetic diversity and conservation of genetic variation for this endangered species in the wild. PMID:27560965

  17. The Complete Chloroplast Genome Sequence of a Relict Conifer Glyptostrobus pensilis: Comparative Analysis and Insights into Dynamics of Chloroplast Genome Rearrangement in Cupressophytes and Pinaceae.

    PubMed

    Hao, Zhaodong; Cheng, Tielong; Zheng, Renhua; Xu, Haibin; Zhou, Yanwei; Li, Meiping; Lu, Fengjuan; Dong, Yini; Liu, Xin; Chen, Jinhui; Shi, Jisen

    2016-01-01

    Glyptostrobus pensilis, belonging to the monotypic genus Glyptostrobus (Family: Cupressaceae), is an ancient conifer that is naturally distributed in low-lying wet areas. Here, we report the complete chloroplast (cp) genome sequence (132,239 bp) of G. pensilis. The G. pensilis cp genome is similar in gene content, organization and genome structure to the sequenced cp genomes from other cupressophytes, especially with respect to the loss of the inverted repeat region A (IRA). Through phylogenetic analysis, we demonstrated that the genus Glyptostrobus is closely related to the genus Cryptomeria, supporting previous findings based on physiological characteristics. Since IRs play an important role in stabilizing the cp genome and conifer cp genomes lost different IR regions after splitting into two clades (cupressophytes and Pinaceae), we performed cp genome rearrangement analysis and found more extensive cp genome rearrangements among the species of cupressophytes relative to Pinaceae. Additional repeat analysis indicated that cupressophyte cp genomes contained fewer potential functional repeats, especially in Cupressaceae, compared with Pinaceae. These results suggested that the dynamics of cp genome rearrangement in conifers differed since the two clades, Pinaceae and cupressophytes, lost IR copies independently and developed different repeats to complement the residual IRs. In addition, we identified 170 perfect simple sequence repeats that will be useful in future research focusing on the evolution of genetic diversity and conservation of genetic variation for this endangered species in the wild. PMID:27560965

  18. Genome Sequencing and Comparative Analysis of Saccharomyces cerevisiae Strains of the Peterhof Genetic Collection

    PubMed Central

    Drozdova, Polina B.; Tarasov, Oleg V.; Matveenko, Andrew G.; Radchenko, Elina A.; Sopova, Julia V.; Polev, Dmitrii E.; Inge-Vechtomov, Sergey G.; Dobrynin, Pavel V.

    2016-01-01

    The Peterhof genetic collection of Saccharomyces cerevisiae strains (PGC) is a large laboratory stock that has accumulated several thousand strains over more than half a century. It originated independently of other common laboratory stocks from a distillery lineage (race XII). Several PGC strains have been extensively used in certain fields of yeast research but their genomes have not been thoroughly explored yet. Here we employed whole genome sequencing to characterize five selected PGC strains including one of the closest to the progenitor, 15V-P4, and several strains that have been used to study translation termination and prions in yeast (25-25-2V-P3982, 1B-D1606, 74-D694, and 6P-33G-D373). The genetic distance between the PGC progenitor and S288C is comparable to that between two geographically isolated populations. The PGC seems to be closer to two bakery strains than to S288C-related laboratory stocks or European wine strains. In genomes of the PGC strains, we found several loci which are absent from the S288C genome; 15V-P4 harbors a rare combination of the gene cluster characteristic for wine strains and the RTM1 cluster. We closely examined known and previously uncharacterized gene variants of particular strains and were able to establish the molecular basis for known phenotypes including phenylalanine auxotrophy, clumping behavior and galactose utilization. Finally, we made sequencing data and results of the analysis available for the yeast community. Our data widen the knowledge about genetic variation between Saccharomyces cerevisiae strains and can form the basis for planning future work in PGC-related strains and with PGC-derived alleles. PMID:27152522

  19. Comparative Whole-Genome Analysis of Clinical Isolates Reveals Characteristic Architecture of Mycobacterium tuberculosis Pangenome

    PubMed Central

    Periwal, Vinita; Patowary, Ashok; Vellarikkal, Shamsudheen Karuthedath; Gupta, Anju; Singh, Meghna; Mittal, Ashish; Jeyapaul, Shamini; Chauhan, Rajendra Kumar; Singh, Ajay Vir; Singh, Pravin Kumar; Garg, Parul; Katoch, Viswa Mohan; Katoch, Kiran; Chauhan, Devendra Singh; Sivasubbu, Sridhar; Scaria, Vinod

    2015-01-01

    The tubercle complex consists of closely related mycobacterium species which appear to be variants of a single species. Comparative genome analysis of different strains could provide useful clues and insights into the genetic diversity of the species. We integrated genome assemblies of 96 strains from Mycobacterium tuberculosis complex (MTBC), which included 8 Indian clinical isolates sequenced and assembled in this study, to understand its pangenome architecture. We predicted genes for all the 96 strains and clustered their respective CDSs into homologous gene clusters (HGCs) to reveal a hard-core, soft-core and accessory genome component of MTBC. The hard-core (HGCs shared amongst 100% of the strains) comprised 2,066 gene clusters whereas the soft-core (HGCs shared amongst at least 95% of the strains) comprised 3,374 gene clusters. The change in the core and accessory genome components when observed as a function of their size revealed that MTBC has an open pangenome. We identified 74 HGCs that were absent from reference strains H37Rv and H37Ra but were present in most of the clinical isolates. We report PCR validation on 9 candidate genes, showing 7 genes completely absent from H37Rv and H37Ra whereas 2 genes shared partial homology with them, attributable to probable insertion and deletion events. The pangenome approach is a promising tool for studying strain specific genetic differences occurring within species. We also suggest that since selecting appropriate target genes for typing purposes requires the expected target gene to be present in all isolates being typed, estimating the core-component of the species becomes a subject of prime importance. PMID:25853708
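
    The hard-core and soft-core thresholds quoted in the abstract above (100% and at least 95% of strains) can be illustrated with a minimal sketch. The function name, the input layout (a mapping from cluster ID to the set of strains carrying it) and the toy data are assumptions made for illustration and are not taken from the study. The abstract's soft-core definition ("at least 95%") presumably also covers hard-core clusters; this sketch reports the two tiers separately.

```python
from typing import Dict, Set


def classify_clusters(
    presence: Dict[str, Set[str]],
    n_strains: int,
    soft_core_fraction: float = 0.95,  # threshold stated in the abstract
) -> Dict[str, str]:
    """Label each homologous gene cluster by how many strains carry it.

    presence maps a (hypothetical) cluster ID to the set of strain names
    in which at least one member of that cluster was found.
    """
    labels: Dict[str, str] = {}
    for cluster_id, strains in presence.items():
        n_present = len(strains)
        if n_present == n_strains:
            labels[cluster_id] = "hard-core"      # present in 100% of strains
        elif n_present >= soft_core_fraction * n_strains:
            labels[cluster_id] = "soft-core"      # >= 95% but not all strains
        else:
            labels[cluster_id] = "accessory"      # everything else
    return labels


if __name__ == "__main__":
    # Toy example with 20 made-up strains, not real MTBC data.
    strains = [f"strain{i}" for i in range(20)]
    toy = {
        "HGC_A": set(strains),        # in all 20 strains
        "HGC_B": set(strains[:19]),   # in 19 of 20 strains (95%)
        "HGC_C": set(strains[:3]),    # in 3 of 20 strains
    }
    for cluster, label in classify_clusters(toy, len(strains)).items():
        print(cluster, label)
```

    Counting how the hard-core shrinks and the accessory set grows as strains are added is the usual basis for calling a pangenome open or closed; that fitting step is outside this sketch.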

  20. Identification and comparative analysis of a genomic island in Mycobacterium avium subsp. hominissuis.

    PubMed

    Lahiri, Annesha; Sanchini, Andrea; Semmler, Torsten; Schäfer, Hubert; Lewin, Astrid

    2014-11-01

    Mycobacterium avium subsp. hominissuis (MAH) is an environmental bacterium causing opportunistic infections. The objective of this study was to identify flexible genome regions in MAH isolated from different sources. By comparing five complete and draft MAH genomes we identified a genomic island conferring additional flexibility to the MAH genomes. The island was absent in one of the five strains and had sizes between 16.37 and 84.85 kb in the four other strains. The genes present in the islands differed among strains and included phage- and plasmid-derived genes, integrase genes, hypothetical genes, and virulence-associated genes like mmpL or mce genes.

  1. Phytozome Comparative Plant Genomics Portal

    SciTech Connect

    Goodstein, David; Batra, Sajeev; Carlson, Joseph; Hayes, Richard; Phillips, Jeremy; Shu, Shengqiang; Schmutz, Jeremy; Rokhsar, Daniel

    2014-09-09

    The Dept. of Energy Joint Genome Institute is a genomics user facility supporting DOE mission science in the areas of Bioenergy, Carbon Cycling, and Biogeochemistry. The Plant Program at the JGI applies genomic, analytical, computational and informatics platforms and methods to: 1. Understand and accelerate the improvement (domestication) of bioenergy crops 2. Characterize and moderate plant response to climate change 3. Use comparative genomics to identify constrained elements and infer gene function 4. Build high quality genomic resource platforms of JGI Plant Flagship genomes for functional and experimental work 5. Expand functional genomic resources for Plant Flagship genomes

  2. Comparative genome-wide transcriptional analysis of human left and right internal mammary arteries

    PubMed Central

    Ferrari, Giovanni; Quackenbush, John; Strobeck, John; Hu, Lan; Johnson, Christopher K.; Mak, Andrew; Shaw, Richard E.; Sayles, Kathleen; Brizzio, Mariano E.; Zapolanski, Alex; Grau, Juan B.

    2014-01-01

    In coronary artery bypass grafting (CABG), the combined use of left and right internal mammary arteries (LIMA and RIMA), collectively known as bilateral IMAs (BIMAs), provides a survival advantage over the use of LIMA alone. However, gene expression in RIMA has never been compared to that in LIMA. Here we report a genome-wide transcriptional analysis of BIMA to investigate the expression profiles of these conduits in patients undergoing CABG. As expected, in comparing the BIMAs to the aorta, we found differences in pathways and processes associated with atherosclerosis, inflammation, and cell signaling, pathways which provide biological support for the observation that BIMA grafts deliver long-term benefits to the patients and protect against continued atherosclerosis. These data support the widespread use of BIMAs as the preferred conduits in CABG. PMID:24858532

  3. Comparative genomics-based identification and analysis of cis-regulatory elements.

    PubMed

    Ogino, Hajime; Ochi, Haruki; Uchiyama, Chihiro; Louie, Sarah; Grainger, Robert M

    2012-01-01

    Identification of cis-regulatory elements, such as enhancers and promoters, is very important not only for analysis of gene regulatory networks but also as a tool for targeted gene expression experiments. In this chapter, we introduce an easy but reliable approach to predict enhancers of a gene of interest by comparing mammalian and Xenopus genome sequences, and to examine their activity using a co-transgenesis technique in Xenopus embryos. Since the bioinformatics analysis utilizes publicly available web tools, bench biologists can easily perform it without any need for special computing capability. The co-transgenesis assay, which directly uses polymerase chain reaction products, quickly screens for the activity of the candidate elements in a cloning-free manner.

  4. Exploring a Nonmodel Teleost Genome Through RAD Sequencing—Linkage Mapping in Common Pandora, Pagellus erythrinus and Comparative Genomic Analysis

    PubMed Central

    Manousaki, Tereza; Tsakogiannis, Alexandros; Taggart, John B.; Palaiokostas, Christos; Tsaparis, Dimitris; Lagnel, Jacques; Chatziplis, Dimitrios; Magoulas, Antonios; Papandroulakis, Nikos; Mylonas, Constantinos C.; Tsigenopoulos, Costas S.

    2015-01-01

    Common pandora (Pagellus erythrinus) is a benthopelagic marine fish belonging to the teleost family Sparidae, and a newly recruited species in Mediterranean aquaculture. The paucity of genetic information relating to sparids, despite their growing economic value for aquaculture, provides the impetus for exploring the genomics of this fish group. Genomic tool development, such as genetic linkage maps provision, lays the groundwork for linking genotype to phenotype, allowing fine-mapping of loci responsible for beneficial traits. In this study, we applied ddRAD methodology to identify polymorphic markers in a full-sib family of common pandora. Employing the Illumina MiSeq platform, we sampled and sequenced a size-selected genomic fraction of 99 individuals, which led to the identification of 920 polymorphic loci. Downstream mapping analysis resulted in the construction of 24 robust linkage groups, corresponding to the karyotype of the species. The common pandora linkage map showed varying degrees of conserved synteny with four other teleost genomes, namely the European seabass (Dicentrarchus labrax), Nile tilapia (Oreochromis niloticus), stickleback (Gasterosteus aculeatus), and medaka (Oryzias latipes), suggesting a conserved genomic evolution in Sparidae. Our work exploits the possibilities of genotyping by sequencing to gain novel insights into genome structure and evolution. Such information will boost the study of cultured species and will set the foundation for a deeper understanding of the complex evolutionary history of teleosts. PMID:26715088

  5. A comparative genomic analysis of the alkalitolerant soil bacterium Bacillus lehensis G1.

    PubMed

    Noor, Yusuf Muhammad; Samsulrizal, Nurul Hidayah; Jema'on, Noor Azah; Low, Kheng Oon; Ramli, Aizi Nor Mazila; Alias, Noor Izawati; Damis, Siti Intan Rosdianah; Fuzi, Siti Fatimah Zaharah Mohd; Isa, Mohd Noor Mat; Murad, Abdul Munir Abdul; Raih, Mohd Firdaus Mohd; Bakar, Farah Diba Abu; Najimudin, Nazalan; Mahadi, Nor Muhammad; Illias, Rosli Md

    2014-07-25

    Bacillus lehensis G1 is a Gram-positive, moderately alkalitolerant bacterium isolated from soil samples. B. lehensis produces cyclodextrin glucanotransferase (CGTase), an enzyme that has enabled the extensive use of cyclodextrin in foodstuffs, chemicals, and pharmaceuticals. The genome sequence of B. lehensis G1 consists of a single circular 3.99 Mb chromosome containing 4017 protein-coding sequences (CDSs), of which 2818 (70.15%) have assigned biological roles, 936 (23.30%) have conserved domains with unknown functions, and 263 (6.55%) have no match with any protein database. Bacillus clausii KSM-K16 was established as the closest relative to B. lehensis G1 based on gene content similarity and 16S rRNA phylogenetic analysis. A total of 2820 proteins from B. lehensis G1 were found to have orthologues in B. clausii, including sodium-proton antiporters, transport proteins, and proteins involved in ATP synthesis. A comparative analysis of these proteins and those in B. clausii and other alkaliphilic Bacillus species was carried out to investigate their contributions towards the alkalitolerance of the microorganism. The similarities and differences in alkalitolerance-related genes among alkalitolerant/alkaliphilic Bacillus species highlight the complex mechanism of pH homeostasis. The B. lehensis G1 genome was also mined for proteins and enzymes with potential viability for industrial and commercial purposes. PMID:24811681
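
    As a quick consistency check on the CDS breakdown reported in the abstract above, the three category counts can be summed and converted to percentages. This is only an arithmetic sketch; the function name and dictionary keys are labels chosen here for illustration, while the counts are those given in the abstract.

```python
from typing import Dict


def category_percentages(counts: Dict[str, int]) -> Dict[str, float]:
    """Return each category's share of the total, in percent (2 decimals)."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 2) for name, n in counts.items()}


if __name__ == "__main__":
    # Counts as reported in the abstract; key names are illustrative labels.
    cds_counts = {
        "assigned biological role": 2818,
        "conserved domain, unknown function": 936,
        "no database match": 263,
    }
    print("total CDSs:", sum(cds_counts.values()))
    for name, pct in category_percentages(cds_counts).items():
        print(f"{name}: {pct:.2f}%")
```

    The three counts add up to the 4017 CDSs stated in the abstract, and the computed shares agree with the reported 70.15%, 23.30% and 6.55%.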

  6. Genome-Wide Comparative Analysis of Chemosensory Gene Families in Five Tsetse Fly Species.

    PubMed

    Macharia, Rosaline; Mireji, Paul; Murungi, Edwin; Murilla, Grace; Christoffels, Alan; Aksoy, Serap; Masiga, Daniel

    2016-02-01

    For decades, odour-baited traps have been used for control of tsetse flies (Diptera; Glossinidae), vectors of African trypanosomes. However, differential responses to known attractants have been reported in different Glossina species, hindering establishment of a universal vector control tool. Availability of full genome sequences of five Glossina species offers an opportunity to compare their chemosensory repertoire and enhance our understanding of their biology in relation to chemosensation. Here, we identified and annotated the major chemosensory gene families in Glossina. We identified a total of 118, 115, 124, and 123 chemosensory genes in Glossina austeni, G. brevipalpis, G. f. fuscipes, G. pallidipes, respectively, relative to 127 reported in G. m. morsitans. Our results show that tsetse fly genomes have fewer chemosensory genes when compared to other dipterans such as Musca domestica (n>393), Drosophila melanogaster (n = 246) and Anopheles gambiae (n>247). We also found that Glossina chemosensory genes are dispersed across distantly located scaffolds in their respective genomes, in contrast to other insects like D. melanogaster whose genes occur in clusters. Further, Glossina appears to be devoid of sugar receptors and to have expanded CO2 associated receptors, potentially reflecting Glossina's obligate hematophagy and the need to detect hosts that may be out of sight. We also identified, in all species, homologs of Ir84a; a Drosophila-specific ionotropic receptor that promotes male courtship suggesting that this is a conserved trait in tsetse flies. Notably, our selection analysis revealed that a total of four gene loci (Gr21a, GluRIIA, Gr28b, and Obp83a) were under positive selection, which confers fitness advantage to species. These findings provide a platform for studies to further define the language of communication of tsetse with their environment, and influence development of novel approaches for control. PMID:26886411

  7. Genome-Wide Comparative Analysis of Chemosensory Gene Families in Five Tsetse Fly Species.

    PubMed

    Macharia, Rosaline; Mireji, Paul; Murungi, Edwin; Murilla, Grace; Christoffels, Alan; Aksoy, Serap; Masiga, Daniel

    2016-02-01

    For decades, odour-baited traps have been used for control of tsetse flies (Diptera; Glossinidae), vectors of African trypanosomes. However, differential responses to known attractants have been reported in different Glossina species, hindering establishment of a universal vector control tool. Availability of full genome sequences of five Glossina species offers an opportunity to compare their chemosensory repertoire and enhance our understanding of their biology in relation to chemosensation. Here, we identified and annotated the major chemosensory gene families in Glossina. We identified a total of 118, 115, 124, and 123 chemosensory genes in Glossina austeni, G. brevipalpis, G. f. fuscipes, G. pallidipes, respectively, relative to 127 reported in G. m. morsitans. Our results show that tsetse fly genomes have fewer chemosensory genes when compared to other dipterans such as Musca domestica (n>393), Drosophila melanogaster (n = 246) and Anopheles gambiae (n>247). We also found that Glossina chemosensory genes are dispersed across distantly located scaffolds in their respective genomes, in contrast to other insects like D. melanogaster whose genes occur in clusters. Further, Glossina appears to be devoid of sugar receptors and to have expanded CO2 associated receptors, potentially reflecting Glossina's obligate hematophagy and the need to detect hosts that may be out of sight. We also identified, in all species, homologs of Ir84a; a Drosophila-specific ionotropic receptor that promotes male courtship suggesting that this is a conserved trait in tsetse flies. Notably, our selection analysis revealed that a total of four gene loci (Gr21a, GluRIIA, Gr28b, and Obp83a) were under positive selection, which confers fitness advantage to species. These findings provide a platform for studies to further define the language of communication of tsetse with their environment, and influence development of novel approaches for control.

  8. Genome-Wide Comparative Analysis of Chemosensory Gene Families in Five Tsetse Fly Species

    PubMed Central

    Macharia, Rosaline; Mireji, Paul; Murungi, Edwin; Murilla, Grace; Christoffels, Alan; Aksoy, Serap; Masiga, Daniel

    2016-01-01

    For decades, odour-baited traps have been used for control of tsetse flies (Diptera; Glossinidae), vectors of African trypanosomes. However, differential responses to known attractants have been reported in different Glossina species, hindering establishment of a universal vector control tool. Availability of full genome sequences of five Glossina species offers an opportunity to compare their chemosensory repertoire and enhance our understanding of their biology in relation to chemosensation. Here, we identified and annotated the major chemosensory gene families in Glossina. We identified a total of 118, 115, 124, and 123 chemosensory genes in Glossina austeni, G. brevipalpis, G. f. fuscipes, G. pallidipes, respectively, relative to 127 reported in G. m. morsitans. Our results show that tsetse fly genomes have fewer chemosensory genes when compared to other dipterans such as Musca domestica (n>393), Drosophila melanogaster (n = 246) and Anopheles gambiae (n>247). We also found that Glossina chemosensory genes are dispersed across distantly located scaffolds in their respective genomes, in contrast to other insects like D. melanogaster whose genes occur in clusters. Further, Glossina appears to be devoid of sugar receptors and to have expanded CO2 associated receptors, potentially reflecting Glossina's obligate hematophagy and the need to detect hosts that may be out of sight. We also identified, in all species, homologs of Ir84a; a Drosophila-specific ionotropic receptor that promotes male courtship suggesting that this is a conserved trait in tsetse flies. Notably, our selection analysis revealed that a total of four gene loci (Gr21a, GluRIIA, Gr28b, and Obp83a) were under positive selection, which confers fitness advantage to species. These findings provide a platform for studies to further define the language of communication of tsetse with their environment, and influence development of novel approaches for control. PMID:26886411

  9. Characterization and comparative genomic analysis of bacteriophages infecting members of the Bacillus cereus group.

    PubMed

    Lee, Ju-Hoon; Shin, Hakdong; Ryu, Sangryeol

    2014-05-01

    The Bacillus cereus group phages infecting B. cereus, B. anthracis, and B. thuringiensis (Bt) have been studied at the molecular level and, recently, at the genomic level to control the pathogens B. cereus and B. anthracis and to prevent phage contamination of the natural insect pesticide Bt. A comparative phylogenetic analysis has revealed three different major phage groups with different morphologies (Myoviridae for group I, Siphoviridae for group II, and Tectiviridae for group III), genome size (group I > group II > group III), and lifestyle (virulent for group I and temperate for group II and III). A subsequent phage genome comparison using a dot plot analysis showed that phages in each group are highly homologous, substantiating the grouping of B. cereus phages. Endolysin is a host lysis protein that contains two conserved domains: a cell-wall-binding domain (CBD) and an enzymatic activity domain (EAD). In B. cereus sensu lato phage group I, four different endolysin groups have been detected, according to combinations of two types of CBD and four types of EAD. Group I phages have two copies of tail lysins and one copy of endolysin, but the functions of the tail lysins are still unknown. In the B. cereus sensu lato phage group II, the B. anthracis phages have been studied and applied for typing and rapid detection of pathogenic host strains. In the B. cereus sensu lato phage group III, the B. thuringiensis phages Bam35 and GIL01 have been studied to understand phage entry and lytic switch regulation mechanisms. In this review, we suggest that further study of the B. cereus group phages would be useful for various phage applications, such as biocontrol, typing, and rapid detection of the pathogens B. cereus and B. anthracis and for the prevention of phage contamination of the natural insect pesticide Bt.

  10. Comparative Genomic Analysis of Sulfurospirillum cavolei MES Reconstructed from the Metagenome of an Electrosynthetic Microbiome

    PubMed Central

    Ross, Daniel E.; Marshall, Christopher W.; May, Harold D.; Norman, R. Sean

    2016-01-01

    Sulfurospirillum spp. play an important role in sulfur and nitrogen cycling, and possess metabolic versatility that enables reduction of a wide range of electron acceptors, including thiosulfate, tetrathionate, polysulfide, nitrate, and nitrite. Here we describe the assembly of a Sulfurospirillum genome obtained from the metagenome of an electrosynthetic microbiome. The ubiquity and persistence of this organism in microbial electrosynthesis systems suggest it plays an important role in reactor stability and performance. Understanding why this organism is present and elucidating its genetic repertoire provide a genomic and ecological foundation for future studies where Sulfurospirillum are found, especially in electrode-associated communities. Metabolic comparisons and in-depth analysis of unique genes revealed potential ecological niche-specific capabilities within the Sulfurospirillum genus. The functional similarities common to all genomes, i.e., core genome, and unique gene clusters found only in a single genome were identified. Based upon 16S rRNA gene phylogenetic analysis and average nucleotide identity, the Sulfurospirillum draft genome was found to be most closely related to Sulfurospirillum cavolei. Characterization of the draft genome described herein provides pathway-specific details of the metabolic significance of the newly described Sulfurospirillum cavolei MES and, importantly, yields insight into the ecology of the genus as a whole. Comparison of eleven sequenced Sulfurospirillum genomes revealed a total of 6246 gene clusters in the pan-genome. Of the total gene clusters, 18.5% were shared among all eleven genomes and 50% were unique to a single genome. While most Sulfurospirillum spp. reduce nitrate to ammonium, five of the eleven Sulfurospirillum strains encode a nitrous oxide reductase (nos) cluster with an atypical nitrous-oxide reductase, suggesting a utility for this genus in reduction of nitrous oxide, and as a potential sink for this

  11. Comparative Genomic Analysis of Sulfurospirillum cavolei MES Reconstructed from the Metagenome of an Electrosynthetic Microbiome.

    PubMed

    Ross, Daniel E; Marshall, Christopher W; May, Harold D; Norman, R Sean

    2016-01-01

    Sulfurospirillum spp. play an important role in sulfur and nitrogen cycling, and possess metabolic versatility that enables reduction of a wide range of electron acceptors, including thiosulfate, tetrathionate, polysulfide, nitrate, and nitrite. Here we describe the assembly of a Sulfurospirillum genome obtained from the metagenome of an electrosynthetic microbiome. The ubiquity and persistence of this organism in microbial electrosynthesis systems suggest it plays an important role in reactor stability and performance. Understanding why this organism is present and elucidating its genetic repertoire provide a genomic and ecological foundation for future studies where Sulfurospirillum are found, especially in electrode-associated communities. Metabolic comparisons and in-depth analysis of unique genes revealed potential ecological niche-specific capabilities within the Sulfurospirillum genus. The functional similarities common to all genomes, i.e., core genome, and unique gene clusters found only in a single genome were identified. Based upon 16S rRNA gene phylogenetic analysis and average nucleotide identity, the Sulfurospirillum draft genome was found to be most closely related to Sulfurospirillum cavolei. Characterization of the draft genome described herein provides pathway-specific details of the metabolic significance of the newly described Sulfurospirillum cavolei MES and, importantly, yields insight into the ecology of the genus as a whole. Comparison of eleven sequenced Sulfurospirillum genomes revealed a total of 6246 gene clusters in the pan-genome. Of the total gene clusters, 18.5% were shared among all eleven genomes and 50% were unique to a single genome. While most Sulfurospirillum spp. reduce nitrate to ammonium, five of the eleven Sulfurospirillum strains encode a nitrous oxide reductase (nos) cluster with an atypical nitrous-oxide reductase, suggesting a utility for this genus in reduction of nitrous oxide, and as a potential sink for this

  12. A Comparative Genomic Analysis of Diverse Clonal Types of Enterotoxigenic Escherichia coli Reveals Pathovar-Specific Conservation

    PubMed Central

    Sahl, Jason W.; Steinsland, Hans; Redman, Julia C.; Angiuoli, Samuel V.; Nataro, James P.; Sommerfelt, Halvor; Rasko, David A.

    2011-01-01

    Enterotoxigenic Escherichia coli (ETEC) is a major cause of diarrheal illness in children less than 5 years of age in low- and middle-income nations, whereas it is an emerging enteric pathogen in industrialized nations. Despite being an important cause of diarrhea, little is known about the genomic composition of ETEC. To address this, we sequenced the genomes of five ETEC isolates obtained from children in Guinea-Bissau with diarrhea. These five isolates represent distinct and globally dominant ETEC clonal groups. Comparative genomic analyses utilizing a gene-independent whole-genome alignment method demonstrated that sequenced ETEC strains share approximately 2.7 million bases of genomic sequence. Phylogenetic analysis of this “core genome” confirmed the diverse history of the ETEC pathovar and provides a finer resolution of the E. coli relationships than multilocus sequence typing. No identified genomic regions were conserved exclusively in all ETEC genomes; however, we identified more genomic content conserved among ETEC genomes than among non-ETEC E. coli genomes, suggesting that ETEC isolates share a genomic core. Comparisons of known virulence and of surface-exposed and colonization factor genes across all sequenced ETEC genomes not only identified variability but also indicated that some antigens are restricted to the ETEC pathovar. Overall, the generation of these five genome sequences, in addition to the two previously generated ETEC genomes, highlights the genomic diversity of ETEC. These studies increase our understanding of ETEC evolution, as well as provide insight into virulence factors and conserved proteins, which may be targets for vaccine development. PMID:21078854

  13. SOLiD sequencing of four Vibrio vulnificus genomes enables comparative genomic analysis and identification of candidate clade-specific virulence genes

    PubMed Central

    2010-01-01

    Background: Vibrio vulnificus is the leading cause of reported death from consumption of seafood in the United States. Despite several decades of research on molecular pathogenesis, much remains to be learned about the mechanisms of virulence of this opportunistic bacterial pathogen. The two complete and annotated genomic DNA sequences of V. vulnificus belong to strains of clade 2, which is the predominant clade among clinical strains. Clade 2 strains generally possess higher virulence potential in animal models of disease compared with clade 1, which predominates among environmental strains. SOLiD sequencing of four V. vulnificus strains representing different clades (1 and 2) and biotypes (1 and 2) was used for comparative genomic analysis. Results: Greater than 4,100,000 bases were sequenced of each strain, yielding approximately 100-fold coverage for each of the four genomes. Although the read lengths of SOLiD genomic sequencing were only 35 nt, we were able to make significant conclusions about the unique and shared sequences among the genomes, including identification of single nucleotide polymorphisms. Comparative analysis of the newly sequenced genomes to the existing reference genomes enabled the identification of 3,459 core V. vulnificus genes shared among all six strains and 80 clade 2-specific genes. We identified 523,161 SNPs among the six genomes. Conclusions: We were able to glean much information about the genomic content of each strain using next generation sequencing. Flp pili, GGDEF proteins, and genomic island XII were identified as possible virulence factors because of their presence in virulent sequenced strains. Genomic comparisons also point toward the involvement of sialic acid catabolism in pathogenesis. PMID:20863407

  14. Genome sequence, comparative analysis and haplotype structure of the domestic dog.

    PubMed

    Lindblad-Toh, Kerstin; Wade, Claire M; Mikkelsen, Tarjei S; Karlsson, Elinor K; Jaffe, David B; Kamal, Michael; Clamp, Michele; Chang, Jean L; Kulbokas, Edward J; Zody, Michael C; Mauceli, Evan; Xie, Xiaohui; Breen, Matthew; Wayne, Robert K; Ostrander, Elaine A; Ponting, Chris P; Galibert, Francis; Smith, Douglas R; DeJong, Pieter J; Kirkness, Ewen; Alvarez, Pablo; Biagi, Tara; Brockman, William; Butler, Jonathan; Chin, Chee-Wye; Cook, April; Cuff, James; Daly, Mark J; DeCaprio, David; Gnerre, Sante; Grabherr, Manfred; Kellis, Manolis; Kleber, Michael; Bardeleben, Carolyne; Goodstadt, Leo; Heger, Andreas; Hitte, Christophe; Kim, Lisa; Koepfli, Klaus-Peter; Parker, Heidi G; Pollinger, John P; Searle, Stephen M J; Sutter, Nathan B; Thomas, Rachael; Webber, Caleb; Baldwin, Jennifer; Abebe, Adal; Abouelleil, Amr; Aftuck, Lynne; Ait-Zahra, Mostafa; Aldredge, Tyler; Allen, Nicole; An, Peter; Anderson, Scott; Antoine, Claudel; Arachchi, Harindra; Aslam, Ali; Ayotte, Laura; Bachantsang, Pasang; Barry, Andrew; Bayul, Tashi; Benamara, Mostafa; Berlin, Aaron; Bessette, Daniel; Blitshteyn, Berta; Bloom, Toby; Blye, Jason; Boguslavskiy, Leonid; Bonnet, Claude; Boukhgalter, Boris; Brown, Adam; Cahill, Patrick; Calixte, Nadia; Camarata, Jody; Cheshatsang, Yama; Chu, Jeffrey; Citroen, Mieke; Collymore, Alville; Cooke, Patrick; Dawoe, Tenzin; Daza, Riza; Decktor, Karin; DeGray, Stuart; Dhargay, Norbu; Dooley, Kimberly; Dooley, Kathleen; Dorje, Passang; Dorjee, Kunsang; Dorris, Lester; Duffey, Noah; Dupes, Alan; Egbiremolen, Osebhajajeme; Elong, Richard; Falk, Jill; Farina, Abderrahim; Faro, Susan; Ferguson, Diallo; Ferreira, Patricia; Fisher, Sheila; FitzGerald, Mike; Foley, Karen; Foley, Chelsea; Franke, Alicia; Friedrich, Dennis; Gage, Diane; Garber, Manuel; Gearin, Gary; Giannoukos, Georgia; Goode, Tina; Goyette, Audra; Graham, Joseph; Grandbois, Edward; Gyaltsen, Kunsang; Hafez, Nabil; Hagopian, Daniel; Hagos, Birhane; Hall, Jennifer; Healy, Claire; Hegarty, Ryan; Honan, 
Tracey; Horn, Andrea; Houde, Nathan; Hughes, Leanne; Hunnicutt, Leigh; Husby, M; Jester, Benjamin; Jones, Charlien; Kamat, Asha; Kanga, Ben; Kells, Cristyn; Khazanovich, Dmitry; Kieu, Alix Chinh; Kisner, Peter; Kumar, Mayank; Lance, Krista; Landers, Thomas; Lara, Marcia; Lee, William; Leger, Jean-Pierre; Lennon, Niall; Leuper, Lisa; LeVine, Sarah; Liu, Jinlei; Liu, Xiaohong; Lokyitsang, Yeshi; Lokyitsang, Tashi; Lui, Annie; Macdonald, Jan; Major, John; Marabella, Richard; Maru, Kebede; Matthews, Charles; McDonough, Susan; Mehta, Teena; Meldrim, James; Melnikov, Alexandre; Meneus, Louis; Mihalev, Atanas; Mihova, Tanya; Miller, Karen; Mittelman, Rachel; Mlenga, Valentine; Mulrain, Leonidas; Munson, Glen; Navidi, Adam; Naylor, Jerome; Nguyen, Tuyen; Nguyen, Nga; Nguyen, Cindy; Nguyen, Thu; Nicol, Robert; Norbu, Nyima; Norbu, Choe; Novod, Nathaniel; Nyima, Tenchoe; Olandt, Peter; O'Neill, Barry; O'Neill, Keith; Osman, Sahal; Oyono, Lucien; Patti, Christopher; Perrin, Danielle; Phunkhang, Pema; Pierre, Fritz; Priest, Margaret; Rachupka, Anthony; Raghuraman, Sujaa; Rameau, Rayale; Ray, Verneda; Raymond, Christina; Rege, Filip; Rise, Cecil; Rogers, Julie; Rogov, Peter; Sahalie, Julie; Settipalli, Sampath; Sharpe, Theodore; Shea, Terrance; Sheehan, Mechele; Sherpa, Ngawang; Shi, Jianying; Shih, Diana; Sloan, Jessie; Smith, Cherylyn; Sparrow, Todd; Stalker, John; Stange-Thomann, Nicole; Stavropoulos, Sharon; Stone, Catherine; Stone, Sabrina; Sykes, Sean; Tchuinga, Pierre; Tenzing, Pema; Tesfaye, Senait; Thoulutsang, Dawa; Thoulutsang, Yama; Topham, Kerri; Topping, Ira; Tsamla, Tsamla; Vassiliev, Helen; Venkataraman, Vijay; Vo, Andy; Wangchuk, Tsering; Wangdi, Tsering; Weiand, Michael; Wilkinson, Jane; Wilson, Adam; Yadav, Shailendra; Yang, Shuli; Yang, Xiaoping; Young, Geneva; Yu, Qing; Zainoun, Joanne; Zembek, Lisa; Zimmer, Andrew; Lander, Eric S

    2005-12-01

    Here we report a high-quality draft genome sequence of the domestic dog (Canis familiaris), together with a dense map of single nucleotide polymorphisms (SNPs) across breeds. The dog is of particular interest because it provides important evolutionary information and because existing breeds show great phenotypic diversity for morphological, physiological and behavioural traits. We use sequence comparison with the primate and rodent lineages to shed light on the structure and evolution of genomes and genes. Notably, the majority of the most highly conserved non-coding sequences in mammalian genomes are clustered near a small subset of genes with important roles in development. Analysis of SNPs reveals long-range haplotypes across the entire dog genome, and defines the nature of genetic diversity within and across breeds. The current SNP map now makes it possible for genome-wide association studies to identify genes responsible for diseases and traits, with important consequences for human and companion animal health.

  15. Comparative genomic analysis of bacteriophages specific to the channel catfish pathogen Edwardsiella ictaluri

    PubMed Central

    2011-01-01

    Background: The bacterial pathogen Edwardsiella ictaluri is a primary cause of mortality in channel catfish raised commercially in aquaculture farms. Additional treatment and diagnostic regimes are needed for this enteric pathogen, motivating the discovery and characterization of bacteriophages specific to E. ictaluri. Results: The genomes of three Edwardsiella ictaluri-specific bacteriophages isolated from geographically distant aquaculture ponds, at different times, were sequenced and analyzed. The genomes for phages eiAU, eiDWF, and eiMSLS are 42.80 kbp, 42.12 kbp, and 42.69 kbp, respectively, and are greater than 95% identical to each other at the nucleotide level. Nucleotide differences were mostly observed in non-coding regions and in structural proteins, with significant variability in the sequences of putative tail fiber proteins. The genome organization of these phages exhibits a pattern shared by other Siphoviridae. Conclusions: These E. ictaluri-specific phage genomes reveal considerable conservation of genomic architecture and sequence identity, even with considerable temporal and spatial divergence in their isolation. Their genomic homogeneity is similarly observed among E. ictaluri bacterial isolates. The genomic analysis of these phages supports the conclusion that these are virulent phages, lacking the capacity for lysogeny or expression of virulence genes. This study contributes to our knowledge of phage genomic diversity and facilitates studies on the diagnostic and therapeutic applications of these phages. PMID:21214923

  16. Novel proteases from the genome of the carnivorous plant Drosera capensis: Structural prediction and comparative analysis.

    PubMed

    Butts, Carter T; Bierma, Jan C; Martin, Rachel W

    2016-10-01

    In his 1875 monograph on insectivorous plants, Darwin described the feeding reactions of Drosera flypaper traps and predicted that their secretions contained a "ferment" similar to mammalian pepsin, an aspartic protease. Here we report a high-quality draft genome sequence for the cape sundew, Drosera capensis, the first genome of a carnivorous plant from order Caryophyllales, which also includes the Venus flytrap (Dionaea) and the tropical pitcher plants (Nepenthes). This species was selected in part for its hardiness and ease of cultivation, making it an excellent model organism for further investigations of plant carnivory. Analysis of predicted protein sequences yields genes encoding proteases homologous to those found in other plants, some of which display sequence and structural features that suggest novel functionalities. Because the sequence similarity to proteins of known structure is in most cases too low for traditional homology modeling, 3D structures of representative proteases are predicted using comparative modeling with all-atom refinement. Although the overall folds and active residues for these proteins are conserved, we find structural and sequence differences consistent with a diversity of substrate recognition patterns. Finally, we predict differences in substrate specificities using in silico experiments, providing targets for structure/function studies of novel enzymes with biological and technological significance. Proteins 2016; 84:1517-1533. © 2016 Wiley Periodicals, Inc.

  17. Comparative Analysis of Lacinutrix Genomes and Their Association with Bacterial Habitat

    PubMed Central

    Lee, Yung Mi; Kim, Mi-Kyeong; Ahn, Do Hwan; Kim, Han-Woo; Park, Hyun; Shin, Seung Chul

    2016-01-01

    The genus Lacinutrix, which belongs to the family Flavobacteriaceae, consists of seven bacterial species that were mainly isolated from marine life and sediments. As most bacteria in the family Flavobacteriaceae favor aerobic conditions, the seven bacterial species in the genus Lacinutrix also showed aerobic growth. We selected four monophyletic bacterial species living in a polar environment. Two of these species were isolated from sediment and two were isolated from algae. In a comparative analysis, we investigated how these different environments were related to genomic features of these four species in the genus Lacinutrix. We found that the gene sets for glycolysis, the Krebs cycle, and oxidative phosphorylation were conserved in these four type strains. However, the presence of nitrous oxide reductase for denitrification and the absence of essential components related to thiamin biosynthesis for aerobic respiration were only found in isolates from sediment. Elevated bacterial metabolism on the surface of marine sediments might limit the oxygen penetration into sediment, and such an environment might affect the genomes of bacteria isolated from these habitats. PMID:26882010

  18. Comparative genomic analysis and phenazine production of Pseudomonas chlororaphis, a plant growth-promoting rhizobacterium

    PubMed Central

    Chen, Yawen; Shen, Xuemei; Peng, Huasong; Hu, Hongbo; Wang, Wei; Zhang, Xuehong

    2015-01-01

    Pseudomonas chlororaphis HT66, a plant growth-promoting rhizobacterium that produces phenazine-1-carboxamide at high yield, was compared with three genome-sequenced P. chlororaphis strains, GP72, 30–84 and O6. The genome sizes of the four strains vary from 6.66 to 7.30 Mb. Comparisons of predicted coding sequences indicated 4833 conserved genes among 5869–6455 protein-encoding genes. Phylogenetic analysis showed that the four strains are closely related to each other. Its competitive colonization indicates that P. chlororaphis can adapt well to its environment. No virulence or virulence-related factor was found in P. chlororaphis. All four strains could synthesize antimicrobial metabolites, including different phenazines and the insecticidal protein FitD. Some genes related to the regulation of phenazine biosynthesis were detected among the four strains. It was shown that P. chlororaphis is a safe PGPR for agricultural application and could also be used to produce some phenazine antibiotics at high yield. PMID:26484173

  19. Novel proteases from the genome of the carnivorous plant Drosera capensis: Structural prediction and comparative analysis.

    PubMed

    Butts, Carter T; Bierma, Jan C; Martin, Rachel W

    2016-10-01

    In his 1875 monograph on insectivorous plants, Darwin described the feeding reactions of Drosera flypaper traps and predicted that their secretions contained a "ferment" similar to mammalian pepsin, an aspartic protease. Here we report a high-quality draft genome sequence for the cape sundew, Drosera capensis, the first genome of a carnivorous plant from order Caryophyllales, which also includes the Venus flytrap (Dionaea) and the tropical pitcher plants (Nepenthes). This species was selected in part for its hardiness and ease of cultivation, making it an excellent model organism for further investigations of plant carnivory. Analysis of predicted protein sequences yields genes encoding proteases homologous to those found in other plants, some of which display sequence and structural features that suggest novel functionalities. Because the sequence similarity to proteins of known structure is in most cases too low for traditional homology modeling, 3D structures of representative proteases are predicted using comparative modeling with all-atom refinement. Although the overall folds and active residues for these proteins are conserved, we find structural and sequence differences consistent with a diversity of substrate recognition patterns. Finally, we predict differences in substrate specificities using in silico experiments, providing targets for structure/function studies of novel enzymes with biological and technological significance. Proteins 2016; 84:1517-1533. © 2016 Wiley Periodicals, Inc. PMID:27353064

  20. Comparative genomic analysis and phenazine production of Pseudomonas chlororaphis, a plant growth-promoting rhizobacterium.

    PubMed

    Chen, Yawen; Shen, Xuemei; Peng, Huasong; Hu, Hongbo; Wang, Wei; Zhang, Xuehong

    2015-06-01

    Pseudomonas chlororaphis HT66, a plant growth-promoting rhizobacterium that produces phenazine-1-carboxamide at high yield, was compared with three genome-sequenced P. chlororaphis strains, GP72, 30-84 and O6. The genome sizes of the four strains vary from 6.66 to 7.30 Mb. Comparisons of predicted coding sequences indicated 4833 conserved genes among 5869-6455 protein-encoding genes. Phylogenetic analysis showed that the four strains are closely related to each other. Its competitive colonization indicates that P. chlororaphis can adapt well to its environment. No virulence or virulence-related factor was found in P. chlororaphis. All four strains could synthesize antimicrobial metabolites, including different phenazines and the insecticidal protein FitD. Some genes related to the regulation of phenazine biosynthesis were detected among the four strains. It was shown that P. chlororaphis is a safe PGPR for agricultural application and could also be used to produce some phenazine antibiotics at high yield. PMID:26484173

  1. Emergence and Evolutionary Analysis of the Human DDR Network: Implications in Comparative Genomics and Downstream Analyses

    PubMed Central

    Arcas, Aida; Fernández-Capetillo, Oscar; Cases, Ildefonso; Rojas, Ana M.

    2014-01-01

    The DNA damage response (DDR) is a crucial signaling network that preserves the integrity of the genome. This network is an ensemble of distinct but often overlapping subnetworks, where different components fulfill distinct functions in precise spatial and temporal scenarios. To understand how these elements have been assembled together in humans, we performed comparative genomic analyses in 47 selected species to trace back their emergence using systematic phylogenetic analyses and estimated gene ages. The emergence of the contribution of posttranslational modifications to the complex regulation of DDR was also investigated. This is the first time a systematic analysis has focused on the evolution of DDR subnetworks as a whole. Our results indicate that a DDR core, mostly constructed around metabolic activities, appeared soon after the emergence of eukaryotes, and that additional regulatory capacities appeared later through complex evolutionary process. Potential key posttranslational modifications were also in place then, with interacting pairs preferentially appearing at the same evolutionary time, although modifications often led to the subsequent acquisition of new targets afterwards. We also found extensive gene loss in essential modules of the regulatory network in fungi, plants, and arthropods, important for their validation as model organisms for DDR studies. PMID:24441036

  2. Comparative Analysis of Lacinutrix Genomes and Their Association with Bacterial Habitat.

    PubMed

    Lee, Yung Mi; Kim, Mi-Kyeong; Ahn, Do Hwan; Kim, Han-Woo; Park, Hyun; Shin, Seung Chul

    2016-01-01

    The genus Lacinutrix, which belongs to the family Flavobacteriaceae, consists of seven bacterial species that were mainly isolated from marine life and sediments. As most bacteria in the family Flavobacteriaceae favor aerobic conditions, the seven bacterial species in the genus Lacinutrix also showed aerobic growth. We selected four monophyletic bacterial species living in a polar environment. Two of these species were isolated from sediment and two were isolated from algae. In a comparative analysis, we investigated how these different environments were related to genomic features of these four species in the genus Lacinutrix. We found that the gene sets for glycolysis, the Krebs cycle, and oxidative phosphorylation were conserved in these four type strains. However, the presence of nitrous oxide reductase for denitrification and the absence of essential components related to thiamin biosynthesis for aerobic respiration were only found in isolates from sediment. Elevated bacterial metabolism on the surface of marine sediments might limit the oxygen penetration into sediment, and such an environment might affect the genomes of bacteria isolated from these habitats. PMID:26882010

  3. The Mycobacterium DosR regulon structure and diversity revealed by comparative genomic analysis.

    PubMed

    Chen, Tian; He, Liming; Deng, Wanyan; Xie, Jianping

    2013-01-01

    Tuberculosis (TB), caused by Mycobacterium tuberculosis (Mtb), which claims approximately two million lives annually, remains a global health concern. The non-replicating, dormancy-like state of this pathogen, which is impervious to anti-tuberculosis drugs, is widely recognized as the culprit for this scenario. The dormancy survival regulator (DosR) regulon, composed of 48 co-regulated genes, is considered essential for Mtb persistence. The DosR regulon is regulated by a two-component regulatory system consisting of two sensor kinases, DosS (Rv3132c) and DosT (Rv2027c), and a response regulator, DosR (Rv3133c). The underlying regulatory mechanism of DosR regulon expression is very complex. Many factors are involved, particularly the oxygen tension. The DosR regulon enables the pathogen to persist during lengthy hypoxia. Comparative genomic analysis demonstrated that the DosR regulon is widely distributed among the mycobacterial genomes, ranging from the pathogenic strains to the environmental strains. In-depth studies on the DosR response should provide insights into its role in TB latency in vivo and shape new measures to combat this exceedingly recalcitrant pathogen.

  4. Comparative genomic analysis of clinical and environmental Vibrio vulnificus isolates revealed biotype 3 evolutionary relationships

    PubMed Central

    Koton, Yael; Gordon, Michal; Chalifa-Caspi, Vered; Bisharat, Naiel

    2015-01-01

    In 1996 a common-source outbreak of severe soft tissue and bloodstream infections erupted among Israeli fish farmers and fish consumers due to changes in fish marketing policies. The causative pathogen was a new strain of Vibrio vulnificus, named biotype 3, which displayed a unique biochemical and genotypic profile. Initial observations suggested that the pathogen erupted as a result of genetic recombination between two distinct populations. We applied a whole genome shotgun sequencing approach using several V. vulnificus strains from Israel in order to study the pan-genome of V. vulnificus and determine the phylogenetic relationship of biotype 3 with existing populations. The core genome of V. vulnificus based on 16 draft and complete genomes consisted of 3068 genes, representing between 59 and 78% of the whole genome of the 16 strains. The accessory genome varied in size from 781 to 2044 kbp. Phylogenetic analysis based on whole, core, and accessory genomes displayed similar clustering patterns with two main clusters, clinical (C) and environmental (E); all biotype 3 strains formed a distinct group within the E cluster. Annotation of accessory genomic regions found in biotype 3 strains and absent from the core genome yielded 1732 genes, of which the vast majority encoded hypothetical proteins, phage-related proteins, and mobile element proteins. A total of 1916 proteins (including 713 hypothetical proteins) were present in all human pathogenic strains (both biotype 3 and non-biotype 3) and absent from the environmental strains. Clustering analysis of the non-hypothetical proteins revealed 148 protein clusters shared by all human pathogenic strains; these included transcriptional regulators, arylsulfatases, methyl-accepting chemotaxis proteins, acetyltransferases, GGDEF family proteins, transposases, type IV secretory system (T4SS) proteins, and integrases. Our study showed that V. vulnificus biotype 3 evolved from environmental populations and formed a genetically

  5. Gene expression profiles in squamous cell cervical carcinoma using array-based comparative genomic hybridization analysis.

    PubMed

    Choi, Y-W; Bae, S M; Kim, Y-W; Lee, H N; Kim, Y W; Park, T C; Ro, D Y; Shin, J C; Shin, S J; Seo, J-S; Ahn, W S

    2007-01-01

    Our aim was to identify novel genomic regions of interest and to provide highly dynamic range information on the correlation between squamous cell cervical carcinoma and its related gene expression patterns using genome-wide array-based comparative genomic hybridization (array-CGH). We analyzed 15 cases of cervical cancer from KangNam St Mary's Hospital of the Catholic University of Korea. Microdissection was performed to obtain DNA samples from paraffin-embedded cervical cancer tissues as well as from the adjacent normal tissues. The bacterial artificial chromosome (BAC) array used in this study consisted of 1440 human BACs, and the spacing between clones was 2.08 Mb. All 15 cases of cervical cancer showed differential changes in cervical cancer-associated genetic alterations. The analysis limit of average gains and losses was 53%. A significant positive correlation was found in 8q24.3, 1p36.32, 3q27.1, 7p21.1, 11q13.1, and 3p14.2 changes during cervical carcinogenesis. The regions of high-level gain were 1p36.33-1p36.32, 8q24.3, 16p13.3, 1p36.33, 3q27.1, and 7p21.1. The regions of homozygous loss were 2q12.1, 22q11.21, 3p14.2, 6q24.3, 7p15.2, and 11q25. In the high-level gain regions, GSDMDC1, RECQL4, TP73, ABCF3, ALG3, HDAC9, ESRRA, and RPS6KA4 were significantly correlated with cervical cancer. The genes encoded by frequently lost clones were PTPRG, GRM7, ZDHHC3, EXOSC7, LRP1B, and NR3C2. Thus, array-CGH analyses showed that specific genomic alterations critical to the malignant phenotype were maintained in cervical cancer, and these analyses may help identify possible target genes present in the gained or lost clones.

  6. Comparative Analysis of the Complete Chloroplast Genomes of Five Quercus Species

    PubMed Central

    Yang, Yanci; Zhou, Tao; Duan, Dong; Yang, Jia; Feng, Li; Zhao, Guifang

    2016-01-01

    Quercus is considered economically and ecologically one of the most important genera in the Northern Hemisphere. Oaks are taxonomically perplexing because of shared interspecific morphological traits and intraspecific morphological variation, which are mainly attributed to hybridization. Universal plastid markers cannot provide a sufficient number of variable sites to explore the phylogeny of this genus, and chloroplast genome-scale data have proven to be useful in resolving intractable phylogenetic relationships. In this study, the complete chloroplast genomes of four Quercus species were sequenced, and one published chloroplast genome of Quercus baronii was retrieved for comparative analyses. The five chloroplast genomes ranged from 161,072 bp (Q. baronii) to 161,237 bp (Q. dolicholepis) in length, and their gene organization and order, and GC content, were similar to those of other Fagaceae species. We analyzed nucleotide substitutions, indels, and repeats in the chloroplast genomes, and found 19 relatively highly variable regions that will potentially provide plastid markers for further taxonomic and phylogenetic studies within Quercus. We observed that four genes (ndhA, ndhK, petA, and ycf1) were subject to positive selection. The phylogenetic relationships of the Quercus species inferred from the chloroplast genomes obtained moderate-to-high support, indicating that chloroplast genome data may be useful in resolving relationships in this genus. PMID:27446185

  7. Assembly and comparative analysis of transposable elements from low coverage genomic sequence data in Asparagales.

    PubMed

    Hertweck, Kate L

    2013-09-01

    The research field of comparative genomics is moving from a focus on genes to a more holistic view including the repetitive complement. This study aimed to characterize relative proportions of the repetitive fraction of large, complex genomes in a nonmodel system. The monocotyledonous plant order Asparagales (onion, asparagus, agave) comprises some of the largest angiosperm genomes and represents variation in both genome size and structure (karyotype). Anonymous, low coverage, single-end Illumina data from 11 exemplar Asparagales taxa were assembled using a de novo method. Resulting contigs were annotated using a reference library of available monocot repetitive sequences. Mapping reads to contigs provided rough estimates of relative proportions of each type of transposon in the nuclear genome. The results were parsed into general repeat types and synthesized with genome size estimates and a phylogenetic context to describe the pattern of transposable element evolution among these lineages. The major finding is that although some lineages in Asparagales exhibit conservation in repeat proportions, there is generally wide variation in types and frequency of repeats. This approach is an appropriate first step in characterizing repeats in evolutionary lineages with a paucity of genomic resources. PMID:24168669

  8. A three-way comparative genomic analysis of Mannheimia haemolytica isolates

    PubMed Central

    2010-01-01

    Background Mannheimia haemolytica is a Gram-negative bacterium and the principal etiological agent associated with bovine respiratory disease complex. It transforms from a benign commensal into a deadly pathogen during stress, such as viral infection and transportation to feedlots, and causes acute pleuropneumonia, commonly known as shipping fever. The U.S. beef industry alone loses more than one billion dollars annually due to shipping fever. Despite its enormous economic importance, there are no specific and accurate genetic markers that would aid in understanding the pathogenesis and epidemiology of M. haemolytica at the molecular level and assist in devising an effective control strategy. Description During our comparative genomic sequence analysis of three Mannheimia haemolytica isolates, we identified a number of genes that are unique to each strain. These genes are "high value targets" for future studies that attempt to correlate the variable gene pool with phenotype. We also identified a number of high confidence single nucleotide polymorphisms (hcSNPs) spread throughout the genome and focused on non-synonymous SNPs in known virulence genes. These SNPs will be used to design new hcSNP arrays to study variation across strains, and will potentially aid in understanding gene regulation and the mode of action of various virulence factors. Conclusions During our analysis we identified previously unknown possible type III secretion effector proteins, clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated sequences (Cas). The presence of CRISPR regions is indicative of likely co-evolution with an associated phage. If proven functional, the presence of a type III secretion system in M. haemolytica will help us re-evaluate our approach to studying host-pathogen interactions. We also identified various adhesins containing immuno-dominant domains, which may interfere with host-innate immunity and which could potentially serve as effective vaccine

  10. A Comparative Genomic Analysis of Energy Metabolism in Sulfate Reducing Bacteria and Archaea

    PubMed Central

    Pereira, Inês A. Cardoso; Ramos, Ana Raquel; Grein, Fabian; Marques, Marta Coimbra; da Silva, Sofia Marques; Venceslau, Sofia Santos

    2011-01-01

    The number of sequenced genomes of sulfate reducing organisms (SRO) has increased significantly in recent years, providing an opportunity for a broader perspective into their energy metabolism. In this work we carried out a comparative survey of energy metabolism genes found in 25 available genomes of SRO. This analysis revealed a higher diversity of possible energy conserving pathways than classically considered to be present in these organisms, and permitted the identification of new proteins not known to be present in this group. The Deltaproteobacteria (and Thermodesulfovibrio yellowstonii) are characterized by a large number of cytochromes c and cytochrome c-associated membrane redox complexes, indicating that periplasmic electron transfer pathways are important in these bacteria. The Archaea and Clostridia groups contain practically no cytochromes c or associated membrane complexes. However, despite the absence of a periplasmic space, a few extracytoplasmic membrane redox proteins were detected in the Gram-positive bacteria. Several ion-translocating complexes were detected in SRO including H+-pyrophosphatases, complex I homologs, Rnf, and Ech/Coo hydrogenases. Furthermore, we found evidence that cytoplasmic electron bifurcating mechanisms, recently described for other anaerobes, are also likely to play an important role in energy metabolism of SRO. A number of cytoplasmic [NiFe] and [FeFe] hydrogenases, formate dehydrogenases, and heterodisulfide reductase-related proteins are likely candidates to be involved in energy coupling through electron bifurcation, from diverse electron donors such as H2, formate, pyruvate, NAD(P)H, β-oxidation, and others. In conclusion, this analysis indicates that energy metabolism of SRO is far more versatile than previously considered, and that both chemiosmotic and flavin-based electron bifurcating mechanisms provide alternative strategies for energy conservation. PMID:21747791

  11. [Comparative genomics and evolutionary analysis of CRISPR loci in acetic acid bacteria].

    PubMed

    Kai, Xia; Xinle, Liang; Yudong, Li

    2015-12-01

    The clustered regularly interspaced short palindromic repeat (CRISPR) system is a widespread adaptive immunity system that exists in most archaea and many bacteria and acts against foreign DNA, such as phages, viruses and plasmids. In general, a CRISPR system consists of direct repeat, leader, spacer and CRISPR-associated sequences. Acetic acid bacteria (AAB) play an important role in industrial fermentation of vinegar and bioelectrochemistry. To investigate the polymorphism and evolution pattern of CRISPR loci in acetic acid bacteria, bioinformatic analyses were performed on 48 species from three main genera (Acetobacter, Gluconacetobacter and Gluconobacter) with whole genome sequences available from the NCBI database. The results showed that the CRISPR system existed in 32 of the 48 species studied. Most of the CRISPR-Cas systems in AAB belonged to the type I CRISPR-Cas system (subtypes E and C), but the type II CRISPR-Cas system, which contains the cas9 gene, was found only in the genera Acetobacter and Gluconacetobacter. The repeat sequences of some CRISPR loci were highly conserved among species from different genera, and the leader sequences of some CRISPR loci possessed a conserved motif, which was associated with regulated promoters. Moreover, phylogenetic analysis of cas1 demonstrated that cas1 genes were suitable for classification of species. The conservation of cas1 genes was associated with that of repeat sequences among different strains, suggesting they were subjected to similar functional constraints. Moreover, the number of spacers was positively correlated with the number of prophages and insertion sequences, indicating the acetic acid bacteria were continually invaded by new foreign DNA. The comparative analysis of CRISPR loci in acetic acid bacteria provided the basis for investigating the molecular mechanism of different acetic acid tolerance and genome stability in acetic acid bacteria. PMID:26704949

  12. Comparative genomic analysis of the thermophilic biomass-degrading fungi Myceliophthora thermophila and Thielavia terrestris

    SciTech Connect

    Berka, Randy M.; Grigoriev, Igor V.; Otillar, Robert; Salamov, Asaf; Grimwood, Jane; Reid, Ian; Ishmael, Nadeeza; John, Tricia; Darmond, Corinne; Moisan, Marie-Claude; Henrissat, Bernard; Coutinho, Pedro M.; Lombard, Vincent; Natvig, Donald O.; Lindquist, Erika; Schmutz, Jeremy; Lucas, Susan; Harris, Paul; Powlowski, Justin; Bellemare, Annie; Taylor, David; Butler, Gregory; de Vries, Ronald P.; Allijn, Iris E.; van den Brink, Joost; Ushinsky, Sophia; Storms, Reginald; Powell, Amy J.; Paulsen, Ian T.; Elbourne, Liam D. H.; Baker, Scott E.; Magnuson, Jon; LaBoissiere, Sylvie; Clutterbuck, A. John; Martinez, Diego; Wogulis, Mark; Lopez de Leon, Alfredo; Rey, Michael W.; Tsang, Adrian

    2011-05-16

    Thermostable enzymes and thermophilic cell factories may afford economic advantages in the production of many chemicals and biomass-based fuels. Here we describe and compare the genomes of two thermophilic fungi, Myceliophthora thermophila and Thielavia terrestris. To our knowledge, these genomes are the first described for thermophilic eukaryotes and the first complete telomere-to-telomere genomes for filamentous fungi. Genome analyses and experimental data suggest that both thermophiles are capable of hydrolyzing all major polysaccharides found in biomass. Examination of transcriptome data and secreted proteins suggests that the two fungi use shared approaches in the hydrolysis of cellulose and xylan but distinct mechanisms in pectin degradation. Characterization of the biomass-hydrolyzing activity of recombinant enzymes suggests that these organisms are highly efficient in biomass decomposition at both moderate and high temperatures. Furthermore, we present evidence suggesting that aside from representing a potential reservoir of thermostable enzymes, thermophilic fungi are amenable to manipulation using classical and molecular genetics.

  13. Comparative genomic analysis of the thermophilic biomass-degrading fungi Myceliophthora thermophila and Thielavia terrestris

    SciTech Connect

    Berka, Randy; Grigoriev, Igor V.; Otillar, Robert P.; Salamov, Asaf; Grimwood, Jane; Reid, Ian; Ishmael, Nadeeza; John, Tricia; Darmond, Corinne; Moisan, Marie-Claude; Henrissat, Bernard; Coutinho, Pedro M.; Lombard, Vincent; Natvig, Donald O.; Lindquist, Erika; Schmutz, Jeremy; Lucas, Susan; Harris, Paul; Powlowski, Justin; Bellemare, Annie; Taylor, David; Butler, Gregory; de Vries, Ronald P.; Allijn, Iris E.; van den Brink, Joost; Ushinsky, Sophia; Storms, Reginald; Powell, Amy J.; Paulsen, Ian T.; Elbourne, Liam D. H.; Baker, Scott E.; Magnuson, Jon K.; LaBoissiere, Sylvie; Martinez, Diego; Wogulis, Mark; Lopez de Leon, Alfredo; Rey, Michael; Tsang, Adrian

    2011-10-02

    Thermostable enzymes and thermophilic cell factories may afford economic advantages in the production of many chemicals and biomass-based fuels. Here we describe and compare the genomes of two thermophilic fungi, Myceliophthora thermophila and Thielavia terrestris. To our knowledge, these genomes are the first described for thermophilic eukaryotes and the first complete telomere-to-telomere genomes for filamentous fungi. Genome analyses and experimental data suggest that both thermophiles are capable of hydrolyzing all major polysaccharides found in biomass. Examination of transcriptome data and secreted proteins suggests that the two fungi use shared approaches in the hydrolysis of cellulose and xylan but distinct mechanisms in pectin degradation. Characterization of the biomass-hydrolyzing activity of recombinant enzymes suggests that these organisms are highly efficient in biomass decomposition at both moderate and high temperatures. Furthermore, we present evidence suggesting that aside from representing a potential reservoir of thermostable enzymes, thermophilic fungi are amenable to manipulation using classical and molecular genetics.

  14. Comparative genomic analysis of the thermophilic biomass-degrading fungi Myceliophthora thermophila and Thielavia terrestris.

    PubMed

    Berka, Randy M; Grigoriev, Igor V; Otillar, Robert; Salamov, Asaf; Grimwood, Jane; Reid, Ian; Ishmael, Nadeeza; John, Tricia; Darmond, Corinne; Moisan, Marie-Claude; Henrissat, Bernard; Coutinho, Pedro M; Lombard, Vincent; Natvig, Donald O; Lindquist, Erika; Schmutz, Jeremy; Lucas, Susan; Harris, Paul; Powlowski, Justin; Bellemare, Annie; Taylor, David; Butler, Gregory; de Vries, Ronald P; Allijn, Iris E; van den Brink, Joost; Ushinsky, Sophia; Storms, Reginald; Powell, Amy J; Paulsen, Ian T; Elbourne, Liam D H; Baker, Scott E; Magnuson, Jon; Laboissiere, Sylvie; Clutterbuck, A John; Martinez, Diego; Wogulis, Mark; de Leon, Alfredo Lopez; Rey, Michael W; Tsang, Adrian

    2011-10-01

    Thermostable enzymes and thermophilic cell factories may afford economic advantages in the production of many chemicals and biomass-based fuels. Here we describe and compare the genomes of two thermophilic fungi, Myceliophthora thermophila and Thielavia terrestris. To our knowledge, these genomes are the first described for thermophilic eukaryotes and the first complete telomere-to-telomere genomes for filamentous fungi. Genome analyses and experimental data suggest that both thermophiles are capable of hydrolyzing all major polysaccharides found in biomass. Examination of transcriptome data and secreted proteins suggests that the two fungi use shared approaches in the hydrolysis of cellulose and xylan but distinct mechanisms in pectin degradation. Characterization of the biomass-hydrolyzing activity of recombinant enzymes suggests that these organisms are highly efficient in biomass decomposition at both moderate and high temperatures. Furthermore, we present evidence suggesting that aside from representing a potential reservoir of thermostable enzymes, thermophilic fungi are amenable to manipulation using classical and molecular genetics. PMID:21964414

  15. Comparative genomic analysis of the thermophilic biomass-degrading fungi Myceliophthora thermophila and Thielavia terrestris.

    PubMed

    Berka, Randy M; Grigoriev, Igor V; Otillar, Robert; Salamov, Asaf; Grimwood, Jane; Reid, Ian; Ishmael, Nadeeza; John, Tricia; Darmond, Corinne; Moisan, Marie-Claude; Henrissat, Bernard; Coutinho, Pedro M; Lombard, Vincent; Natvig, Donald O; Lindquist, Erika; Schmutz, Jeremy; Lucas, Susan; Harris, Paul; Powlowski, Justin; Bellemare, Annie; Taylor, David; Butler, Gregory; de Vries, Ronald P; Allijn, Iris E; van den Brink, Joost; Ushinsky, Sophia; Storms, Reginald; Powell, Amy J; Paulsen, Ian T; Elbourne, Liam D H; Baker, Scott E; Magnuson, Jon; Laboissiere, Sylvie; Clutterbuck, A John; Martinez, Diego; Wogulis, Mark; de Leon, Alfredo Lopez; Rey, Michael W; Tsang, Adrian

    2011-10-02

    Thermostable enzymes and thermophilic cell factories may afford economic advantages in the production of many chemicals and biomass-based fuels. Here we describe and compare the genomes of two thermophilic fungi, Myceliophthora thermophila and Thielavia terrestris. To our knowledge, these genomes are the first described for thermophilic eukaryotes and the first complete telomere-to-telomere genomes for filamentous fungi. Genome analyses and experimental data suggest that both thermophiles are capable of hydrolyzing all major polysaccharides found in biomass. Examination of transcriptome data and secreted proteins suggests that the two fungi use shared approaches in the hydrolysis of cellulose and xylan but distinct mechanisms in pectin degradation. Characterization of the biomass-hydrolyzing activity of recombinant enzymes suggests that these organisms are highly efficient in biomass decomposition at both moderate and high temperatures. Furthermore, we present evidence suggesting that aside from representing a potential reservoir of thermostable enzymes, thermophilic fungi are amenable to manipulation using classical and molecular genetics.

  17. Comparative mitochondrial genome analysis reveals the evolutionary rearrangement mechanism in Brassica.

    PubMed

    Yang, J; Liu, G; Zhao, N; Chen, S; Liu, D; Ma, W; Hu, Z; Zhang, M

    2016-05-01

    The genus Brassica has many species that are important for oil, vegetable and other food products. Three mitochondrial genome types (mitotypes) originated from its common ancestor. In this paper, a B. nigra mitochondrial main circle genome of 232,407 bp was generated through de novo assembly. Synteny analysis showed that the mitochondrial genomes of B. rapa and B. oleracea had a better syntenic relationship with each other than with B. nigra. Principal components analysis and development of a phylogenetic tree indicated the maternal ancestors of the three allotetraploid species in U's triangle of Brassica. Diversified mitotypes were found in allotetraploid B. napus, in which napus-type B. napus was derived from B. oleracea, while polima-type B. napus was inherited from B. rapa. In addition, the mitochondrial genome of napus-type B. napus was closer to botrytis-type than capitata-type B. oleracea. The sub-stoichiometric shifting of several mitochondrial genes suggested that mitochondrial genome rearrangement underwent evolutionary selection during domestication and/or plant breeding. Our findings clarify the role of diploid species in the maternal origin of allotetraploid species in Brassica and suggest the possibility of breeding selection of the mitochondrial genome. PMID:27079962

  19. Comparative Genomic Analysis of Phylogenetically Closely Related Hydrogenobaculum sp. Isolates from Yellowstone National Park

    PubMed Central

    Romano, Christine; D'Imperio, Seth; Woyke, Tanja; Mavromatis, Konstantinos; Lasken, Roger; Shock, Everett L.

    2013-01-01

    We describe the complete genome sequences of four closely related Hydrogenobaculum sp. isolates (≥99.7% 16S rRNA gene identity) that were isolated from the outflow channel of Dragon Spring (DS), Norris Geyser Basin, in Yellowstone National Park (YNP), WY. The genomes range in size from 1,552,607 to 1,552,931 bp, contain 1,667 to 1,676 predicted genes, and are highly syntenic. There are subtle differences among the DS isolates, which as a group are different from Hydrogenobaculum sp. strain Y04AAS1 that was previously isolated from a geographically distinct YNP geothermal feature. Genes unique to the DS genomes encode arsenite [As(III)] oxidation, NADH-ubiquinone-plastoquinone (complex I), NADH-ubiquinone oxidoreductase chain, a DNA photolyase, and elements of a type II secretion system. Functions unique to strain Y04AAS1 include thiosulfate metabolism, nitrate respiration, and mercury resistance determinants. DS genomes contain seven CRISPR loci that are almost identical but are different from the single CRISPR locus in strain Y04AAS1. Other differences between the DS and Y04AAS1 genomes include average nucleotide identity (94.764%) and percentage conserved DNA (80.552%). Approximately half of the genes unique to Y04AAS1 are predicted to have been acquired via horizontal gene transfer. Fragment recruitment analysis and marker gene searches demonstrated that the DS metagenome was more similar to the DS genomes than to the Y04AAS1 genome, but that the DS community is likely comprised of a continuum of Hydrogenobaculum genotypes that span from the DS genomes described here to an Y04AAS1-like organism, which appears to represent a distinct ecotype relative to the DS genomes characterized. PMID:23435891

  20. Comparative genomic analysis of phylogenetically closely related Hydrogenobaculum sp. isolates from Yellowstone National Park.

    PubMed

    Romano, Christine; D'Imperio, Seth; Woyke, Tanja; Mavromatis, Konstantinos; Lasken, Roger; Shock, Everett L; McDermott, Timothy R

    2013-05-01

    We describe the complete genome sequences of four closely related Hydrogenobaculum sp. isolates (≥ 99.7% 16S rRNA gene identity) that were isolated from the outflow channel of Dragon Spring (DS), Norris Geyser Basin, in Yellowstone National Park (YNP), WY. The genomes range in size from 1,552,607 to 1,552,931 bp, contain 1,667 to 1,676 predicted genes, and are highly syntenic. There are subtle differences among the DS isolates, which as a group are different from Hydrogenobaculum sp. strain Y04AAS1 that was previously isolated from a geographically distinct YNP geothermal feature. Genes unique to the DS genomes encode arsenite [As(III)] oxidation, NADH-ubiquinone-plastoquinone (complex I), NADH-ubiquinone oxidoreductase chain, a DNA photolyase, and elements of a type II secretion system. Functions unique to strain Y04AAS1 include thiosulfate metabolism, nitrate respiration, and mercury resistance determinants. DS genomes contain seven CRISPR loci that are almost identical but are different from the single CRISPR locus in strain Y04AAS1. Other differences between the DS and Y04AAS1 genomes include average nucleotide identity (94.764%) and percentage conserved DNA (80.552%). Approximately half of the genes unique to Y04AAS1 are predicted to have been acquired via horizontal gene transfer. Fragment recruitment analysis and marker gene searches demonstrated that the DS metagenome was more similar to the DS genomes than to the Y04AAS1 genome, but that the DS community is likely comprised of a continuum of Hydrogenobaculum genotypes that span from the DS genomes described here to an Y04AAS1-like organism, which appears to represent a distinct ecotype relative to the DS genomes characterized. PMID:23435891

  1. The ASAP II database: analysis and comparative genomics of alternative splicing in 15 animal species.

    PubMed

    Kim, Namshin; Alekseyenko, Alexander V; Roy, Meenakshi; Lee, Christopher

    2007-01-01

    We have greatly expanded the Alternative Splicing Annotation Project (ASAP) database: (i) its human alternative splicing data are expanded approximately 3-fold over the previous ASAP database, to nearly 90,000 distinct alternative splicing events; (ii) it now provides genome-wide alternative splicing analyses for 15 vertebrate, insect and other animal species; (iii) it provides comprehensive comparative genomics information for comparing alternative splicing and splice site conservation across 17 aligned genomes, based on UCSC multigenome alignments; (iv) it provides an approximately 2- to 3-fold expansion in detection of tissue-specific alternative splicing events, and of cancer versus normal specific alternative splicing events. We have also constructed a novel database linking orthologous exons and orthologous introns between genomes, based on multigenome alignment of 17 animal species. It can be a valuable resource for studies of gene structure evolution. ASAP II provides a new web interface enabling more detailed exploration of the data, and integrating comparative genomics information with alternative splicing data. We provide a set of tools for advanced data-mining of ASAP II with Pygr (the Python Graph Database Framework for Bioinformatics) including powerful features such as graph query, multigenome alignment query, etc. ASAP II is available at http://www.bioinformatics.ucla.edu/ASAP2.

  2. Genome-wide comparative analysis of the transposable elements in the related species Arabidopsis thaliana and Brassica oleracea.

    PubMed

    Zhang, Xiaoyu; Wessler, Susan R

    2004-04-13

    Transposable elements (TEs) are the major component of plant genomes where they contribute significantly to the >1,000-fold genome size variation. To understand the dynamics of TE-mediated genome expansion, we have undertaken a comparative analysis of the TEs in two related organisms: the weed Arabidopsis thaliana (125 megabases) and Brassica oleracea (approximately 600 megabases), a species with many crop plants. Comparison of the whole genome sequence of A. thaliana with a partial draft of B. oleracea has permitted an estimation of the patterns of TE amplification, diversification, and loss that have occurred in related species since their divergence from a common ancestor. Although we find that nearly all TE lineages are shared, the number of elements in each lineage is almost always greater in B. oleracea. Class 1 (retro) elements are the most abundant TE class in both species with LTR and non-LTR elements comprising the largest fraction of each genome. However, several families of class 2 (DNA) elements have amplified to very high copy number in B. oleracea where they have contributed significantly to genome expansion. Taken together, the results of this analysis indicate that amplification of both class 1 and class 2 TEs is responsible, in part, for B. oleracea genome expansion since divergence from a common ancestor with A. thaliana. In addition, the observation that B. oleracea and A. thaliana share virtually all TE lineages makes it unlikely that wholesale removal of TEs is responsible for the compact genome of A. thaliana. PMID:15064405

  3. Sequencing and comparative genomic analysis of 1227 Felis catus cDNA sequences enriched for developmental, clinical and nutritional phenotypes

    PubMed Central

    2012-01-01

    Background The feline genome is valuable to the veterinary and model organism genomics communities because the cat is an obligate carnivore and a model for endangered felids. The initial public release of the Felis catus genome assembly provided a framework for investigating the genomic basis of feline biology. However, the entire set of protein coding genes has not been elucidated. Results We identified and characterized 1227 protein coding feline sequences, of which 913 map to public sequences and 314 are novel. These sequences have been deposited into NCBI's GenBank database and complement public genomic resources by providing additional protein coding sequences that fill in some of the gaps in the feline genome assembly. Through functional and comparative genomic analyses, we gained an understanding of the role of these sequences in feline development, nutrition and health. Specifically, we identified 104 orthologs of human genes associated with Mendelian disorders. We detected negative selection within sequences with gene ontology annotations associated with intracellular trafficking, cytoskeleton and muscle functions. We detected relatively less negative selection on protein sequences encoding extracellular networks, apoptotic pathways and mitochondrial gene ontology annotations. Additionally, we characterized feline cDNA sequences that have mouse orthologs associated with clinical, nutritional and developmental phenotypes. Together, this analysis provides an overview of the value of our cDNA sequences and enhances our understanding of how the feline genome is similar to, and different from, other mammalian genomes. Conclusions The cDNA sequences reported here expand existing feline genomic resources by providing high-quality sequences annotated with comparative genomic information providing functional, clinical, nutritional and orthologous gene information. PMID:22257742

  4. Comparative analysis of the peanut witches'-broom phytoplasma genome reveals horizontal transfer of potential mobile units and effectors.

    PubMed

    Chung, Wan-Chia; Chen, Ling-Ling; Lo, Wen-Sui; Lin, Chan-Pin; Kuo, Chih-Horng

    2013-01-01

    Phytoplasmas are a group of bacteria that are associated with hundreds of plant diseases. Due to their economic importance and the difficulties involved in the experimental study of these obligate pathogens, genome sequencing and comparative analysis have been utilized as powerful tools to understand phytoplasma biology. To date four complete phytoplasma genome sequences have been published. However, these four strains represent limited phylogenetic diversity. In this study, we report the shotgun sequencing and evolutionary analysis of a peanut witches'-broom (PnWB) phytoplasma genome. The availability of this genome provides the first representative of the 16SrII group and substantially improves the taxon sampling to investigate genome evolution. The draft genome assembly contains 13 chromosomal contigs with a total size of 562,473 bp, covering ∼90% of the chromosome. Additionally, a complete plasmid sequence is included. Comparisons among the five available phytoplasma genomes reveal the differentiations in gene content and metabolic capacity. Notably, phylogenetic inferences of the potential mobile units (PMUs) in these genomes indicate that horizontal transfer may have occurred between divergent phytoplasma lineages. Because many effectors are associated with PMUs, the horizontal transfer of these transposon-like elements can contribute to the adaptation and diversification of these pathogens. In summary, the findings from this study highlight the importance of improving taxon sampling when investigating genome evolution. Moreover, the currently available sequences are inadequate to fully characterize the pan-genome of phytoplasmas. Future genome sequencing efforts to expand phylogenetic diversity are essential in improving our understanding of phytoplasma evolution.

  5. Comparative Analysis of the Peanut Witches'-Broom Phytoplasma Genome Reveals Horizontal Transfer of Potential Mobile Units and Effectors

    PubMed Central

    Lo, Wen-Sui; Lin, Chan-Pin; Kuo, Chih-Horng

    2013-01-01

    Phytoplasmas are a group of bacteria that are associated with hundreds of plant diseases. Due to their economic importance and the difficulties involved in the experimental study of these obligate pathogens, genome sequencing and comparative analysis have been utilized as powerful tools to understand phytoplasma biology. To date four complete phytoplasma genome sequences have been published. However, these four strains represent limited phylogenetic diversity. In this study, we report the shotgun sequencing and evolutionary analysis of a peanut witches'-broom (PnWB) phytoplasma genome. The availability of this genome provides the first representative of the 16SrII group and substantially improves the taxon sampling to investigate genome evolution. The draft genome assembly contains 13 chromosomal contigs with a total size of 562,473 bp, covering ∼90% of the chromosome. Additionally, a complete plasmid sequence is included. Comparisons among the five available phytoplasma genomes reveal the differentiations in gene content and metabolic capacity. Notably, phylogenetic inferences of the potential mobile units (PMUs) in these genomes indicate that horizontal transfer may have occurred between divergent phytoplasma lineages. Because many effectors are associated with PMUs, the horizontal transfer of these transposon-like elements can contribute to the adaptation and diversification of these pathogens. In summary, the findings from this study highlight the importance of improving taxon sampling when investigating genome evolution. Moreover, the currently available sequences are inadequate to fully characterize the pan-genome of phytoplasmas. Future genome sequencing efforts to expand phylogenetic diversity are essential in improving our understanding of phytoplasma evolution. PMID:23626855

  6. Identification of transposon insertion polymorphisms by computational comparative analysis of next generation personal genome data

    NASA Astrophysics Data System (ADS)

    Luo, Xuemei; Dehne, Frank; Liang, Ping

    2011-11-01

    Structural variations (SVs) in a genome are now known as a prominent and important type of genetic variation. Among all types of SVs, the identification of transposon insertion polymorphisms (TIPs) is more challenging due to the highly repetitive nature of transposon sequences. We developed a computational method, TIP-finder, to identify TIPs through analysis of next generation personal genome data and their extremely large copy numbers. We tested the efficiency of TIP-finder with simulated data and are able to detect about 88% of TIPs with precision of ≥91%. Using TIP-finder to analyze the Solexa pair-end sequence data at deep coverage for six genomes representing two trio families, we identified a total of 5569 TIPs, consisting of 4881, 456, 91, and 141 insertions from Alu, L1, SVA and HERV, respectively, representing the most comprehensive analysis of such type of genetic variation.

  7. Cytogenetic analysis of myxoid liposarcoma and myxofibrosarcoma by array‐based comparative genomic hybridisation

    PubMed Central

    Ohguri, T; Hisaoka, M; Kawauchi, S; Sasaki, K; Aoki, T; Kanemitsu, S; Matsuyama, A; Korogi, Y; Hashimoto, H

    2006-01-01

    Aim To investigate overall chromosomal alterations using array‐based comparative genomic hybridisation (CGH) of myxoid liposarcomas (MLSs) and myxofibrosarcomas (MFSs). Materials and methods Genomic DNA extracted from fresh‐frozen tumour tissues was labelled with fluorochromes and then hybridised on to an array consisting of 1440 bacterial artificial chromosome clones representing regions throughout the entire human genome important in cytogenetics and oncology. Results DNA copy number aberrations (CNAs) were found in all the 8 MFSs, but no alterations were found in 7 (70%) of 10 MLSs. In MFSs, the most frequent CNAs were gains at 7p21.1–p22.1 and 12q15–q21.1 and a loss at 13q14.3–q34. The second most frequent CNAs were gains at 7q33–q35, 9q22.31–q22.33, 12p13.32–pter, 17q22–q23, Xp11.2 and Xq12 and losses at 10p13–p14, 10q25, 11p11–p14, 11q23.3–q25, 20p11–p12 and 21q22.13–q22.2, which were detected in 38% of the MFSs examined. In MLSs, only a few CNAs were found in two sarcomas with gains at 8p21.2–p23.3, 8q11.22–q12.2 and 8q23.1–q24.3, and in one with gains at 5p13.2–p14.3 and 5q11.2–5q35.2 and a loss at 21q22.2–qter. Conclusions MFS has more frequent and diverse CNAs than MLS, which reinforces the hypothesis that MFS is genetically different from MLS. Our array CGH analysis may also provide several entry points for the identification of candidate genes associated with oncogenesis and progression in MFS. PMID:16751306

  8. Comparative genomic hybridization array analysis and real-time PCR reveals genomic copy number alteration for lung adenocarcinomas.

    PubMed

    Choi, Jin Soo; Zheng, Long Tai; Ha, Eunyoung; Lim, Yun Jeong; Kim, Yeul Hong; Wang, Young-Pil; Lim, Young

    2006-01-01

    Genomic alterations in lung cancer tissues have been observed in various studies. To analyze the aberrations in the genome of lung cancer patients, we used array comparative genomic hybridization (array CGH) in 15 lung adenocarcinoma (AdC) tissues. Copy number gains and losses in chromosomal regions were detected and corresponding genes were confirmed by real-time polymerase chain reaction (PCR). As for the results, several frequently altered loci, including gain of 16p (46% of samples), were found, and the most common losses were found in 14q32.33 (26% of samples). High-level DNA amplifications (> 0.8 log(2) ratio) were detected at 1p, 5p, 7p, 9p, 11p, 11q, 12q, 14q, 16p, 17q, 19q, 20p, 21q, and 22q. A subset of genes, gained or lost, was checked for over- or underrepresentation by means of real-time PCR. The degree of fold change was highest in ECGF1 (22q13.33), HOXA9 (7p15.2), MAFG (17q25.3), TSC2 (16p13.3), and ICAM1 (19p13.2) genes and the 16p chromosome terminal region (16p13.3pter). Taken together, these results show that array CGH could be used as a powerful tool for identification of genomic alteration for lung cancer, and the above-mentioned genes may represent potential candidate genes in the study of lung cancer pathogenesis and diagnosis.

  9. Comparative cytogenetic analysis of Avena macrostachya and diploid C-genome Avena species.

    PubMed

    Badaeva, Ekaterina D; Shelukhina, Olga Yu; Diederichsen, Axel; Loskutov, Igor G; Pukhalskiy, Vitaly A

    2010-02-01

    The chromosome set of Avena macrostachya Balansa ex Coss. et Durieu was analyzed using C-banding and fluorescence in situ hybridization with 5S and 18S-5.8S-26S rRNA gene probes, and the results were compared with the C-genome diploid Avena L. species. The location of major nucleolar organizer regions and 5S rDNA sites on different chromosomes confirmed the affiliation of A. macrostachya with the C-genome group. However, the symmetric karyotype, the absence of "diffuse heterochromatin" and the location of large C-band complexes in proximal chromosome regions pointed to an isolated position of A. macrostachya from other Avena species. Based on the distribution of rDNA loci on the C-genome chromosomes of diploid and polyploid Avena species, we propose a model of the chromosome alterations that occurred during the evolution of oat species.

  10. Complete genome sequencing and comparative analysis of three dengue virus type 2 Pakistani isolates.

    PubMed

    Akram, Madiha; Idrees, Muhammad

    2016-03-01

    Dengue is currently one of the most important arthropod-borne human viral diseases and is caused by a flavivirus known as dengue virus. It is now endemic in Pakistan, where many dengue fever outbreaks have been observed during the last three decades. The major serotype of dengue virus circulating in Pakistan is serotype 2. Complete genome sequences of three Pakistani dengue virus serotype 2 isolates were generated. Analysis of the complete genome sequences showed that Pakistani isolates of dengue virus serotype 2 belong to the cosmopolitan genotype. This study identifies a number of amino acid substitutions that were introduced in local dengue virus serotype 2 isolates over the years. The study provides a significant insight into the evolution of serotype 2 of dengue virus in Pakistan. This is the first report of complete genome sequence information of dengue virus from the most recent outbreak (2013) in Punjab, Pakistan.

  11. Complete chloroplast genome sequence of Omani lime (Citrus aurantiifolia) and comparative analysis within the rosids.

    PubMed

    Su, Huei-Jiun; Hogenhout, Saskia A; Al-Sadi, Abdullah M; Kuo, Chih-Horng

    2014-01-01

    The genus Citrus contains many economically important fruits that are grown worldwide for their high nutritional and medicinal value. Due to frequent hybridizations among species and cultivars, the exact number of natural species and the taxonomic relationships within this genus are unclear. To compare the differences between the Citrus chloroplast genomes and to develop useful genetic markers, we used a reference-assisted approach to assemble the complete chloroplast genome of Omani lime (C. aurantiifolia). The complete C. aurantiifolia chloroplast genome is 159,893 bp in length; the organization and gene content are similar to most of the rosids lineages characterized to date. Through comparison with the sweet orange (C. sinensis) chloroplast genome, we identified three intergenic regions and 94 simple sequence repeats (SSRs) that are potentially informative markers with resolution for interspecific relationships. These markers can be utilized to better understand the origin of cultivated Citrus. A comparison among 72 species belonging to 10 families of representative rosids lineages also provides new insights into their chloroplast genome evolution.

  12. Complete Chloroplast Genome Sequence of Omani Lime (Citrus aurantiifolia) and Comparative Analysis within the Rosids

    PubMed Central

    Su, Huei-Jiun; Hogenhout, Saskia A.; Al-Sadi, Abdullah M.; Kuo, Chih-Horng

    2014-01-01

    The genus Citrus contains many economically important fruits that are grown worldwide for their high nutritional and medicinal value. Due to frequent hybridizations among species and cultivars, the exact number of natural species and the taxonomic relationships within this genus are unclear. To compare the differences between the Citrus chloroplast genomes and to develop useful genetic markers, we used a reference-assisted approach to assemble the complete chloroplast genome of Omani lime (C. aurantiifolia). The complete C. aurantiifolia chloroplast genome is 159,893 bp in length; the organization and gene content are similar to most of the rosids lineages characterized to date. Through comparison with the sweet orange (C. sinensis) chloroplast genome, we identified three intergenic regions and 94 simple sequence repeats (SSRs) that are potentially informative markers with resolution for interspecific relationships. These markers can be utilized to better understand the origin of cultivated Citrus. A comparison among 72 species belonging to 10 families of representative rosids lineages also provides new insights into their chloroplast genome evolution. PMID:25398081

  13. Sentra : a database of signal transduction proteins for comparative genome analysis.

    SciTech Connect

    D'Souza, M.; Glass, E. M.; Syed, M. H.; Zhang, Y.; Rodriguez, A.; Maltsev, N.; Galperin, M. Y.; Mathematics and Computer Science; Univ. of Chicago; NIH

    2007-01-01

    Sentra (http://compbio.mcs.anl.gov/sentra), a database of signal transduction proteins encoded in completely sequenced prokaryotic genomes, has been updated to reflect recent advances in understanding signal transduction events on a whole-genome scale. Sentra consists of two principal components, a manually curated list of signal transduction proteins in 202 completely sequenced prokaryotic genomes and an automatically generated listing of predicted signaling proteins in 235 sequenced genomes that are awaiting manual curation. In addition to two-component histidine kinases and response regulators, the database now lists manually curated Ser/Thr/Tyr protein kinases and protein phosphatases, as well as adenylate and diguanylate cyclases and c-di-GMP phosphodiesterases, as defined in several recent reviews. All entries in Sentra are extensively annotated with relevant information from public databases (e.g. UniProt, KEGG, PDB and NCBI). Sentra's infrastructure was redesigned to support interactive cross-genome comparisons of signal transduction capabilities of prokaryotic organisms from a taxonomic and phenotypic perspective and in the framework of signal transduction pathways from KEGG. Sentra leverages the PUMA2 system to support interactive analysis and annotation of signal transduction proteins by the users.

  14. Microarray-Based Comparative Genomic and Transcriptome Analysis of Borrelia burgdorferi

    PubMed Central

    Iyer, Radha; Schwartz, Ira

    2016-01-01

    Borrelia burgdorferi, the spirochetal agent of Lyme disease, is maintained in nature in a cycle involving a tick vector and a mammalian host. Adaptation to the diverse conditions of temperature, pH, oxygen tension and nutrient availability in these two environments requires the precise orchestration of gene expression. Over 25 microarray analyses relating to B. burgdorferi genomics and transcriptomics have been published. The majority of these studies has explored the global transcriptome under a variety of conditions and has contributed substantially to the current understanding of B. burgdorferi transcriptional regulation. In this review, we present a summary of these studies with particular focus on those that helped define the roles of transcriptional regulators in modulating gene expression in the tick and mammalian milieus. By performing comparative analysis of results derived from the published microarray expression profiling studies, we identified composite gene lists comprising differentially expressed genes in these two environments. Further, we explored the overlap between the regulatory circuits that function during the tick and mammalian phases of the enzootic cycle. Taken together, the data indicate that there is interplay among the distinct signaling pathways that function in feeding ticks and during adaptation to growth in the mammal. PMID:27600075

  15. Comparative genomic analysis of upstream miRNA regulatory motifs in Caenorhabditis.

    PubMed

    Jovelin, Richard; Krizus, Aldis; Taghizada, Bakhtiyar; Gray, Jeremy C; Phillips, Patrick C; Claycomb, Julie M; Cutter, Asher D

    2016-07-01

    MicroRNAs (miRNAs) comprise a class of short noncoding RNA molecules that play diverse developmental and physiological roles by controlling mRNA abundance and protein output of the vast majority of transcripts. Despite the importance of miRNAs in regulating gene function, we still lack a complete understanding of how miRNAs themselves are transcriptionally regulated. To fill this gap, we predicted regulatory sequences by searching for abundant short motifs located upstream of miRNAs in eight species of Caenorhabditis nematodes. We identified three conserved motifs across the Caenorhabditis phylogeny that show clear signatures of purifying selection from comparative genomics, patterns of nucleotide changes in motifs of orthologous miRNAs, and correlation between motif incidence and miRNA expression. We then validated our predictions with transgenic green fluorescent protein reporters and site-directed mutagenesis for a subset of motifs located in an enhancer region upstream of let-7. We demonstrate that a CT-dinucleotide motif is sufficient for proper expression of GFP in the seam cells of adult C. elegans, and that two other motifs play incremental roles in combination with the CT-rich motif. Thus, functional tests of sequence motifs identified through analysis of molecular evolutionary signatures provide a powerful path for efficiently characterizing the transcriptional regulation of miRNA genes. PMID:27140965

  16. Genome-Wide Comparative Analysis of Flowering-Related Genes in Arabidopsis, Wheat, and Barley

    PubMed Central

    Peng, Fred Y.; Hu, Zhiqiu; Yang, Rong-Cai

    2015-01-01

    Early flowering is an important trait influencing grain yield and quality in wheat (Triticum aestivum L.) and barley (Hordeum vulgare L.) in short-season cropping regions. However, due to large and complex genomes of these species, direct identification of flowering genes and their molecular characterization remain challenging. Here, we used a bioinformatic approach to predict flowering-related genes in wheat and barley from 190 known Arabidopsis (Arabidopsis thaliana (L.) Heynh.) flowering genes. We identified 900 and 275 putative orthologs in wheat and barley, respectively. The annotated flowering-related genes were clustered into 144 orthologous groups with one-to-one, one-to-many, many-to-one, and many-to-many orthology relationships. Our approach was further validated by domain and phylogenetic analyses of flowering-related proteins and comparative analysis of publicly available microarray data sets for in silico expression profiling of flowering-related genes in 13 different developmental stages of wheat and barley. These further analyses showed that orthologous gene pairs in three critical flowering gene families (PEBP, MADS, and BBX) exhibited similar expression patterns among 13 developmental stages in wheat and barley, suggesting similar functions among the orthologous genes with sequence and expression similarities. The predicted candidate flowering genes can be confirmed and incorporated into molecular breeding for early flowering wheat and barley in short-season cropping regions. PMID:26435710

  17. Cytogenetic analysis of trophoblasts by comparative genomic hybridization in embryo-fetal development anomalies.

    PubMed

    Tabet, A C; Aboura, A; Dauge, M C; Audibert, F; Coulomb, A; Batallan, A; Couturier-Turpin, M H; Feldmann, G; Tachdjian, G

    2001-08-01

    Cytogenetic studies of spontaneous abortions or intrauterine fetal death depend on conventional tissue culturing and karyotyping. This technique has limitations such as culture failure and selective growth of maternal cells. Fluorescent in situ hybridization (FISH) using specific probes permits diagnosis of aneuploidies but is limited to one or a few chromosomal regions. Comparative genomic hybridization (CGH) provides an overview of chromosomal gains and losses in a single hybridization directly from DNA samples. In a prospective study, we analyzed by CGH trophoblast cells from 21 fetuses in cases of spontaneous abortions, intrauterine fetal death or polymalformed syndrome. Six numerical chromosomal abnormalities including one trisomy 7, one trisomy 10, three trisomies 18, one trisomy 21 and one monosomy X have been correctly identified by CGH. One structural abnormality of the long arm of chromosome 1 has been characterized by CGH. One triploidy and two balanced pericentromeric inversions of chromosome 9 have not been identified by CGH. Sexual chromosomal constitutions were concordant by both classical cytogenetic technique and CGH. Contribution of trophoblast analysis by CGH in embryo-fetal development anomalies is discussed. PMID:11536256

  18. Microarray-Based Comparative Genomic and Transcriptome Analysis of Borrelia burgdorferi.

    PubMed

    Iyer, Radha; Schwartz, Ira

    2016-01-01

    Borrelia burgdorferi, the spirochetal agent of Lyme disease, is maintained in nature in a cycle involving a tick vector and a mammalian host. Adaptation to the diverse conditions of temperature, pH, oxygen tension and nutrient availability in these two environments requires the precise orchestration of gene expression. Over 25 microarray analyses relating to B. burgdorferi genomics and transcriptomics have been published. The majority of these studies has explored the global transcriptome under a variety of conditions and has contributed substantially to the current understanding of B. burgdorferi transcriptional regulation. In this review, we present a summary of these studies with particular focus on those that helped define the roles of transcriptional regulators in modulating gene expression in the tick and mammalian milieus. By performing comparative analysis of results derived from the published microarray expression profiling studies, we identified composite gene lists comprising differentially expressed genes in these two environments. Further, we explored the overlap between the regulatory circuits that function during the tick and mammalian phases of the enzootic cycle. Taken together, the data indicate that there is interplay among the distinct signaling pathways that function in feeding ticks and during adaptation to growth in the mammal. PMID:27600075

  19. Comparative genomics and functional analysis of niche-specific adaptation in Pseudomonas putida

    SciTech Connect

    Wu, X.; van der Lelie, D.; Monchy, S.; Taghavi, S.; Zhu, W.; Ramos, J.

    2011-03-01

    Pseudomonas putida is a gram-negative rod-shaped gammaproteobacterium that is found throughout various environments. Members of the species P. putida show a diverse spectrum of metabolic activities, which is indicative of their adaptation to various niches, which includes the ability to live in soils and sediments contaminated with high concentrations of heavy metals and organic contaminants. Pseudomonas putida strains are also found as plant growth-promoting rhizospheric and endophytic bacteria. The genome sequences of several P. putida species have become available and provide a unique tool to study the specific niche adaptation of the various P. putida strains. In this review, we compare the genomes of four P. putida strains: the rhizospheric strain KT2440, the endophytic strain W619, the aromatic hydrocarbon-degrading strain F1 and the manganese-oxidizing strain GB-1. Comparative genomics provided a powerful tool to gain new insights into the adaptation of P. putida to specific lifestyles and environmental niches, and clearly demonstrated that horizontal gene transfer played a key role in this adaptation process, as many of the niche-specific functions were found to be encoded on clearly defined genomic islands.

  20. Comparative genomics and functional analysis of niche-specific adaptation in Pseudomonas putida

    PubMed Central

    Wu, Xiao; Monchy, Sébastien; Taghavi, Safiyh; Zhu, Wei; Ramos, Juan; van der Lelie, Daniel

    2011-01-01

    Pseudomonas putida is a gram-negative rod-shaped gammaproteobacterium that is found throughout various environments. Members of the species P. putida show a diverse spectrum of metabolic activities, which is indicative of their adaptation to various niches, which includes the ability to live in soils and sediments contaminated with high concentrations of heavy metals and organic contaminants. Pseudomonas putida strains are also found as plant growth-promoting rhizospheric and endophytic bacteria. The genome sequences of several P. putida species have become available and provide a unique tool to study the specific niche adaptation of the various P. putida strains. In this review, we compare the genomes of four P. putida strains: the rhizospheric strain KT2440, the endophytic strain W619, the aromatic hydrocarbon-degrading strain F1 and the manganese-oxidizing strain GB-1. Comparative genomics provided a powerful tool to gain new insights into the adaptation of P. putida to specific lifestyles and environmental niches, and clearly demonstrated that horizontal gene transfer played a key role in this adaptation process, as many of the niche-specific functions were found to be encoded on clearly defined genomic islands. PMID:20796030

  1. Comparative genomic analysis of the swine pathogen Bordetella bronchiseptica strain KM22.

    PubMed

    Nicholson, Tracy L; Shore, Sarah M; Register, Karen B; Bayles, Darrell O; Kingsley, Robert A; Brunelle, Brain W

    2016-01-15

    The well-characterized Bordetella bronchiseptica strain KM22, originally isolated from a pig with atrophic rhinitis, has been used to develop a reproducible swine respiratory disease model. The goal of this study was to identify genetic features unique to KM22 by comparing the genome sequence of KM22 to the laboratory reference strain RB50. To gain a broader perspective of the genetic relationship of KM22 among other B. bronchiseptica strains, selected genes of KM22 were then compared to five other B. bronchiseptica strains isolated from different hosts. Overall, the KM22 genome sequence is more similar to the genome sequences of the strains isolated from animals than the strains isolated from humans. The majority of virulence gene expression in Bordetella is positively regulated by the two-component sensory transduction system BvgAS. bopN, bvgA, fimB, and fimC were the most highly conserved BvgAS-regulated genes present in all seven strains analyzed. In contrast, the BvgAS-regulated genes present in all seven strains with the highest sequence divergence were fimN, fim2, fhaL, and fhaS. A total of eight major fimbrial subunit genes were identified in KM22. Quantitative real-time PCR data demonstrated that seven of the eight fimbrial subunit genes identified in KM22 are expressed and regulated by BvgAS. The annotation of the KM22 genome sequence, coupled with the comparative genomic analyses reported in this study, can be used to facilitate the development of vaccines with improved efficacy towards B. bronchiseptica in swine to decrease the prevalence and disease burden caused by this pathogen. PMID:26711033

  2. Comparative genomic analysis of the swine pathogen Bordetella bronchiseptica strain KM22.

    PubMed

    Nicholson, Tracy L; Shore, Sarah M; Register, Karen B; Bayles, Darrell O; Kingsley, Robert A; Brunelle, Brain W

    2016-01-01

    The well-characterized Bordetella bronchiseptica strain KM22, originally isolated from a pig with atrophic rhinitis, has been used to develop a reproducible swine respiratory disease model. The goal of this study was to identify genetic features unique to KM22 by comparing the genome sequence of KM22 to the laboratory reference strain RB50. To gain a broader perspective of the genetic relationship of KM22 among other B. bronchiseptica strains, selected genes of KM22 were then compared to five other B. bronchiseptica strains isolated from different hosts. Overall, the KM22 genome sequence is more similar to the genome sequences of the strains isolated from animals than the strains isolated from humans. The majority of virulence gene expression in Bordetella is positively regulated by the two-component sensory transduction system BvgAS. bopN, bvgA, fimB, and fimC were the most highly conserved BvgAS-regulated genes present in all seven strains analyzed. In contrast, the BvgAS-regulated genes present in all seven strains with the highest sequence divergence were fimN, fim2, fhaL, and fhaS. A total of eight major fimbrial subunit genes were identified in KM22. Quantitative real-time PCR data demonstrated that seven of the eight fimbrial subunit genes identified in KM22 are expressed and regulated by BvgAS. The annotation of the KM22 genome sequence, coupled with the comparative genomic analyses reported in this study, can be used to facilitate the development of vaccines with improved efficacy towards B. bronchiseptica in swine to decrease the prevalence and disease burden caused by this pathogen.

  3. [Comparative analysis of variable regions in the genomes of variola virus].

    PubMed

    Babkin, I V; Nepomniashchikh, T S; Maksiutov, R A; Gutorov, V V; Babkina, I N; Shchelkunov, S N

    2008-01-01

    Nucleotide sequences of two extended segments of the terminal variable regions in the variola virus genome were determined. The size of the left segment was 13.5 kbp and that of the right, 10.5 kbp. In total, over 540 kbp were sequenced for 22 variola virus strains. The phylogenetic analysis conducted, together with data published earlier, allowed us to find the interrelations between 70 variola virus isolates, the character of their clustering, and the degree of intergroup and intragroup variations of the clusters of variola virus strains. The most polymorphic loci of the genome segments studied were determined. It was demonstrated that these loci are localized either to noncoding genome regions or to the regions of destroyed open reading frames characteristic of the ancestor virus. These loci are promising for development of a strategy for genotyping variola virus strains. Analysis of recombination using various methods demonstrated that, with a single exception, no statistically significant recombinational events were detectable in the genomes of the variola virus strains studied.

  4. Comparative Analysis of Codon Usage Bias and Codon Context Patterns between Dipteran and Hymenopteran Sequenced Genomes

    PubMed Central

    Behura, Susanta K.; Severson, David W.

    2012-01-01

    Background Codon bias is a phenomenon of non-uniform usage of codons whereas codon context generally refers to a sequential pair of codons in a gene. Although genome sequencing of multiple species of dipteran and hymenopteran insects has been completed, only a few of these species have been analyzed for codon usage bias. Methods and Principal Findings Here, we use bioinformatics approaches to analyze codon usage bias and codon context patterns in a genome-wide manner among 15 dipteran and 7 hymenopteran insect species. Results show that GAA is the most frequent codon in the dipteran species whereas GAG is the most frequent codon in the hymenopteran species. The data reveal that codons ending with C or G are frequently used in the dipteran genomes whereas codons ending with A or T are frequently used in the hymenopteran genomes. Synonymous codon usage orders (SCUO) vary within genomes in a pattern that seems to be distinct for each species. Based on comparison of 30 one-to-one orthologous genes among 17 species, the fruit fly Drosophila willistoni shows the least codon usage bias whereas the honey bee (Apis mellifera) shows the highest bias. Analysis of codon context patterns of these insects shows that specific codons are frequently used as the 3′- and 5′-context of start and stop codons, respectively. Conclusions Codon bias pattern is distinct between dipteran and hymenopteran insects. While codon bias is favored by high GC content of dipteran genomes, high AT content of genes favors biased usage of synonymous codons in the hymenopteran insects. Also, codon context patterns vary among these species largely according to their phylogeny. PMID:22912801

  5. Genome-Wide Comparative Analysis Reveals Similar Types of NBS Genes in Hybrid Citrus sinensis Genome and Original Citrus clementine Genome and Provides New Insights into Non-TIR NBS Genes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    In this study, we identified and compared nucleotide-binding site (NBS) domain-containing genes from three Citrus genomes (C. clementina, C. sinensis from USA and C. sinensis from China). Phylogenetic analysis of all Citrus NBS genes across these three genomes revealed that there are three approxima...

  6. Putative drug and vaccine target protein identification using comparative genomic analysis of KEGG annotated metabolic pathways of Mycoplasma hyopneumoniae.

    PubMed

    Damte, Dereje; Suh, Joo-Won; Lee, Seung-Jin; Yohannes, Sileshi Belew; Hossain, Md Akil; Park, Seung-Chun

    2013-07-01

    In the present study, a computational comparative and subtractive genomic/proteomic analysis aimed at the identification of putative therapeutic target and vaccine candidate proteins from Kyoto Encyclopedia of Genes and Genomes (KEGG) annotated metabolic pathways of Mycoplasma hyopneumoniae was performed for drug design and vaccine production pipelines against M. hyopneumoniae. The employed comparative genomic and metabolic pathway analysis with a predefined computational systemic workflow extracted a total of 41 annotated metabolic pathways from KEGG, among which five were unique to M. hyopneumoniae. A total of 234 proteins were identified to be involved in these metabolic pathways. Although 125 non-homologous and predicted essential proteins were found from the total that could serve as potential drug targets and vaccine candidates, additional prioritizing parameters characterized 21 proteins as vaccine candidates, while the druggability of each of the identified proteins, evaluated by the DrugBank database, prioritized 42 proteins as suitable drug targets.

  7. Comparative analysis of two phenotypically-similar but genomically-distinct Burkholderia cenocepacia-specific bacteriophages

    PubMed Central

    2012-01-01

    Background Genomic analysis of bacteriophages infecting the Burkholderia cepacia complex (BCC) is an important preliminary step in the development of a phage therapy protocol for these opportunistic pathogens. The objective of this study was to characterize KL1 (vB_BceS_KL1) and AH2 (vB_BceS_AH2), two novel Burkholderia cenocepacia-specific siphoviruses isolated from environmental samples. Results KL1 and AH2 exhibit several unique phenotypic similarities: they infect the same B. cenocepacia strains, they require prolonged incubation at 30°C for the formation of plaques at low titres, and they do not form plaques at similar titres following incubation at 37°C. However, despite these similarities, we have determined using whole-genome pyrosequencing that these phages show minimal relatedness to one another. The KL1 genome is 42,832 base pairs (bp) in length and is most closely related to Pseudomonas phage 73 (PA73). In contrast, the AH2 genome is 58,065 bp in length and is most closely related to Burkholderia phage BcepNazgul. Using both BLASTP and HHpred analysis, we have identified and analyzed the putative virion morphogenesis, lysis, DNA binding, and MazG proteins of these two phages. Notably, MazG homologs identified in cyanophages have been predicted to facilitate infection of stationary phase cells and may contribute to the unique plaque phenotype of KL1 and AH2. Conclusions The nearly indistinguishable phenotypes but distinct genomes of KL1 and AH2 provide further evidence of both vast diversity and convergent evolution in the BCC-specific phage population. PMID:22676492

  8. Identification of genetic bases of Vibrio fluvialis species-specific biochemical pathways and potential virulence factors by comparative genomic analysis.

    PubMed

    Lu, Xin; Liang, Weili; Wang, Yunduan; Xu, Jialiang; Zhu, Jun; Kan, Biao

    2014-03-01

    Vibrio fluvialis is an important food-borne pathogen that causes diarrheal illness and sometimes extraintestinal infections in humans. In this study, we sequenced the genome of a clinical V. fluvialis strain and determined its phylogenetic relationships with other Vibrio species by comparative genomic analysis. We found that the closest relationship was between V. fluvialis and V. furnissii, followed by those with V. cholerae and V. mimicus. Moreover, based on genome comparisons and gene complementation experiments, we revealed genetic mechanisms of the biochemical tests that differentiate V. fluvialis from closely related species. Importantly, we identified a variety of genes encoding potential virulence factors, including multiple hemolysins, transcriptional regulators, and environmental survival and adaptation apparatuses, and the type VI secretion system, which is indicative of complex regulatory pathways modulating pathogenesis in this organism. The availability of V. fluvialis genome sequences may promote our understanding of pathogenic mechanisms for this emerging pathogen.

  9. Cloud computing for comparative genomics

    PubMed Central

    2010-01-01

    Background Large comparative genomics studies and tools are becoming increasingly compute-expensive as the number of available genome sequences continues to rise. The capacity and cost of local computing infrastructures are likely to become prohibitive with the increase, especially as the breadth of questions continues to rise. Alternative computing architectures, in particular cloud computing environments, may help alleviate this increasing pressure and enable fast, large-scale, and cost-effective comparative genomics strategies going forward. To test this, we redesigned a typical comparative genomics algorithm, the reciprocal smallest distance algorithm (RSD), to run within Amazon's Elastic Computing Cloud (EC2). We then employed the RSD-cloud for ortholog calculations across a wide selection of fully sequenced genomes. Results We ran more than 300,000 RSD-cloud processes within the EC2. These jobs were farmed simultaneously to 100 high capacity compute nodes using the Amazon Web Service Elastic Map Reduce and included a wide mix of large and small genomes. The total computation took just under 70 hours and cost a total of $6,302 USD. Conclusions The effort to transform existing comparative genomics algorithms from local compute infrastructures is not trivial. However, the speed and flexibility of cloud computing environments provide a substantial boost with manageable cost. The procedure designed to transform the RSD algorithm into a cloud-ready application is readily adaptable to similar comparative genomics problems. PMID:20482786
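
The reciprocal step that gives RSD its name can be sketched in a few lines of Python. This is only an illustration of the reciprocal-nearest idea, assuming evolutionary distances between genes of two genomes have already been estimated; the abstract does not describe how distances are obtained, and the function name, gene labels and distance values below are hypothetical.

```python
from typing import Dict, List, Tuple

# Maps (gene in genome A, gene in genome B) to a precomputed distance.
Distances = Dict[Tuple[str, str], float]


def reciprocal_smallest_distance(dist: Distances) -> List[Tuple[str, str]]:
    """Return pairs (a, b) where b is a's closest gene and a is b's closest gene.

    Ties are resolved in favour of the pair seen first in `dist`.
    """
    best_for_a: Dict[str, Tuple[str, float]] = {}
    best_for_b: Dict[str, Tuple[str, float]] = {}
    for (a, b), d in dist.items():
        if a not in best_for_a or d < best_for_a[a][1]:
            best_for_a[a] = (b, d)
        if b not in best_for_b or d < best_for_b[b][1]:
            best_for_b[b] = (a, d)
    # Keep a pair only when the nearest-neighbour relation holds in both directions.
    return sorted(
        (a, b) for a, (b, _) in best_for_a.items() if best_for_b[b][0] == a
    )


# Hypothetical toy distances, not results from the study.
toy_distances: Distances = {
    ("a1", "b1"): 0.1,
    ("a1", "b2"): 0.5,
    ("a2", "b1"): 0.3,
    ("a2", "b2"): 0.2,
    ("a3", "b2"): 0.4,
}
putative_orthologs = reciprocal_smallest_distance(toy_distances)
```

Because each genome pair can be processed independently of the others, a computation of this shape splits naturally into many separate jobs, which fits the description of farming processes out to compute nodes.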

  10. Study of Modern Human Evolution via Comparative Analysis with the Neanderthal Genome

    PubMed Central

    Ahmed, Musaddeque

    2013-01-01

    Many other human species appeared in evolution in the last 6 million years that have not been able to survive to modern times and are broadly known as archaic humans, as opposed to the extant modern humans. It has always been considered fascinating to compare the modern human genome with that of archaic humans to identify modern human-specific sequence variants and figure out those that made modern humans different from their predecessors or cousin species. Neanderthals are the latest humans to become extinct, and many factors made them the best representatives of archaic humans. Even though a number of comparisons have been made sporadically between Neanderthals and modern humans, mostly following a candidate gene approach, the major breakthrough took place with the sequencing of the Neanderthal genome. The initial genome-wide comparison, based on the first draft of the Neanderthal genome, has generated some interesting inferences regarding variations in functional elements that are not shared by the two species and the debated admixture question. However, there are certain other genetic elements that were not included or included at a smaller scale in those studies, and they should be compared comprehensively to better understand the molecular make-up of modern humans and their phenotypic characteristics. Besides briefly discussing the important outcomes of the comparative analyses made so far between modern humans and Neanderthals, we propose that future comparative studies may include retrotransposons, pseudogenes, and conserved non-coding regions, all of which might have played significant roles during the evolution of modern humans. PMID:24465235

  11. Microsatellites in Brassica unigenes: relative abundance, marker design, and use in comparative physical mapping and genome analysis.

    PubMed

    Parida, Swarup K; Yadava, Devendra K; Mohapatra, Trilochan

    2010-01-01

    Microsatellites present in the transcribed regions of the genome have the potential to reveal functional diversity. Unigene sequence databases are the sources of such genic microsatellites with unique flanking sequences and genomic locations even in complex polyploids. The present study was designed to assay the unigenes of Brassica napus and B. rapa for various microsatellite repeats, and to design markers and use them in comparative genome analysis and study of evolution. The average frequency of microsatellites in Brassica unigenes was one in every 7.25 kb of sequence, as compared with one in every 8.57 kb of sequence in Arabidopsis thaliana. Trinucleotide motifs coding for serine and the dinucleotide motif GA were most abundant. We designed 2374 and 347 unigene-based microsatellite (UGMS) markers including 541 and 58 class I types in B. napus and B. rapa, respectively, and evaluated their use across diverse species and genera. Most of these markers (93.3%) gave successful amplification of target microsatellite motifs, which was confirmed by sequencing. Interspecific polymorphism between B. napus and B. rapa detected in silico for the UGMS markers was 4.16 times higher in 5' untranslated regions than in coding sequences. Physical anchoring of Brassica UGMS markers on the A. thaliana genome indicated their significance in studying the evolutionary history of A. thaliana genomic duplications in relation to speciation. Comparative physical mapping identified 85% of Brassica unigenes as single copy and gave clues for the presence of conserved primordial gene order. Complex chromosomal rearrangements such as inversions, tandem and segmental duplications, and insertions/deletions were evident between A. thaliana and B. rapa genomes. The results obtained have encouraging implications for the use of UGMS markers in comparative genome analysis and for understanding evolutionary complexities in the family Brassicaceae.
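
Scanning sequences for microsatellite repeats, as this abstract describes for unigenes, can be sketched with a regular-expression backreference. The example below is a minimal illustration, not the authors' tool: the function name `find_ssrs`, the repeat thresholds and the toy sequence are assumptions, and motifs that are themselves internal repeats (for example GAGA) are not filtered out.

```python
import re
from typing import List, Tuple


def find_ssrs(seq: str, motif_len: int, min_repeats: int) -> List[Tuple[int, str, int]]:
    """Find perfect tandem repeats of a motif of the given length.

    Returns (0-based start, motif, number of repeats) for each hit.
    """
    seq = seq.upper()
    # ([ACGT]{k}) captures a motif; \1{n-1,} requires at least n-1 further copies.
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq):
        motif = match.group(1)
        if motif_len > 1 and len(set(motif)) == 1:
            continue  # e.g. "AA" repeated is really a mononucleotide run
        hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return hits


# Hypothetical toy sequence: a (GA)5 dinucleotide run and a (TCT)4 trinucleotide run.
toy_seq = "TT" + "GA" * 5 + "CC" + "TCT" * 4 + "AA"
di_hits = find_ssrs(toy_seq, motif_len=2, min_repeats=4)
tri_hits = find_ssrs(toy_seq, motif_len=3, min_repeats=4)
```

Dividing the total sequence length scanned by the number of hits gives a density figure of the same kind as the "one in every 7.25 kb" reported above.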

  12. Comparative Genomics of Helicobacter pylori: Analysis of the Outer Membrane Protein Families

    PubMed Central

    Alm, Richard A.; Bina, James; Andrews, Beth M.; Doig, Peter; Hancock, Robert E. W.; Trust, Trevor J.

    2000-01-01

    The two complete genomic sequences of Helicobacter pylori J99 and 26695 were used to compare the paralogous families (related genes within one genome, likely to have related function) of genes predicted to encode outer membrane proteins which were present in each strain. We identified five paralogous gene families ranging in size from 3 to 33 members; two of these families contained members specific for either H. pylori J99 or H. pylori 26695. Most orthologous protein pairs (equivalent genes between two genomes, same function) shared considerable identity between the two strains. The unusual set of outer membrane proteins and the specialized outer membrane may be a reflection of the adaptation of H. pylori to the unique gastric environment where it is found. One subfamily of proteins, which contains both channel-forming and adhesin molecules, is extremely highly related at the sequence level and has likely arisen due to ancestral gene duplication. In addition, the largest paralogous family contained two essentially identical pairs of genes in both strains. The presence and genomic organization of these two pairs of duplicated genes were analyzed in a panel of independent H. pylori isolates. While one pair was present in every strain examined, one allele of the other pair appeared partially deleted in several isolates. PMID:10858232

  13. Comparative analysis of the domestic cat genome reveals genetic signatures underlying feline biology and domestication.

    PubMed

    Montague, Michael J; Li, Gang; Gandolfi, Barbara; Khan, Razib; Aken, Bronwen L; Searle, Steven M J; Minx, Patrick; Hillier, LaDeana W; Koboldt, Daniel C; Davis, Brian W; Driscoll, Carlos A; Barr, Christina S; Blackistone, Kevin; Quilez, Javier; Lorente-Galdos, Belen; Marques-Bonet, Tomas; Alkan, Can; Thomas, Gregg W C; Hahn, Matthew W; Menotti-Raymond, Marilyn; O'Brien, Stephen J; Wilson, Richard K; Lyons, Leslie A; Murphy, William J; Warren, Wesley C

    2014-12-01

    Little is known about the genetic changes that distinguish domestic cat populations from their wild progenitors. Here we describe a high-quality domestic cat reference genome assembly and comparative inferences made with other cat breeds, wildcats, and other mammals. Based upon these comparisons, we identified positively selected genes enriched for genes involved in lipid metabolism that underpin adaptations to a hypercarnivorous diet. We also found positive selection signals within genes underlying sensory processes, especially those affecting vision and hearing in the carnivore lineage. We observed an evolutionary tradeoff between functional olfactory and vomeronasal receptor gene repertoires in the cat and dog genomes, with an expansion of the feline chemosensory system for detecting pheromones at the expense of odorant detection. Genomic regions harboring signatures of natural selection that distinguish domestic cats from their wild congeners are enriched in neural crest-related genes associated with behavior and reward in mouse models, as predicted by the domestication syndrome hypothesis. Our description of a previously unidentified allele for the gloving pigmentation pattern found in the Birman breed supports the hypothesis that cat breeds experienced strong selection on specific mutations drawn from random bred populations. Collectively, these findings provide insight into how the process of domestication altered the ancestral wildcat genome and build a resource for future disease mapping and phylogenomic studies across all members of the Felidae. PMID:25385592

  15. Comparative analysis of field-isolate and monkey-adapted Plasmodium vivax genomes.

    PubMed

    Chan, Ernest R; Barnwell, John W; Zimmerman, Peter A; Serre, David

    2015-03-01

    Significant insights into the biology of Plasmodium vivax have been gained from the ability to successfully adapt human infections to non-human primates. P. vivax strains grown in monkeys serve as a renewable source of parasites for in vitro and ex vivo experimental studies and functional assays, or for studying in vivo the relapse characteristics, mosquito species compatibilities, drug susceptibility profiles or immune responses towards potential vaccine candidates. Despite the importance of these studies, little is known as to how adaptation to a different host species may influence the genome of P. vivax. In addition, it is unclear whether these monkey-adapted strains consist of a single clonal population of parasites or if they retain the multiclonal complexity commonly observed in field isolates. Here we compare the genome sequences of seven P. vivax strains adapted to New World monkeys with those of six human clinical isolates collected directly in the field. We show that the adaptation of P. vivax parasites to monkey hosts, and their subsequent propagation, did not result in significant modifications of their genome sequence and that these monkey-adapted strains recapitulate the genomic diversity of field isolates. Our analyses also reveal that these strains are not always genetically homogeneous and should be analyzed cautiously. Overall, our study provides a framework to better leverage this important research material and fully utilize this resource for improving our understanding of P. vivax biology.

  16. Comparative analysis of the domestic cat genome reveals genetic signatures underlying feline biology and domestication

    PubMed Central

    Li, Gang; Gandolfi, Barbara; Khan, Razib; Aken, Bronwen L.; Searle, Steven M. J.; Minx, Patrick; Hillier, LaDeana W.; Koboldt, Daniel C.; Davis, Brian W.; Driscoll, Carlos A.; Barr, Christina S.; Blackistone, Kevin; Quilez, Javier; Lorente-Galdos, Belen; Marques-Bonet, Tomas; Alkan, Can; Thomas, Gregg W. C.; Hahn, Matthew W.; Menotti-Raymond, Marilyn; O’Brien, Stephen J.; Wilson, Richard K.; Lyons, Leslie A.; Murphy, William J.; Warren, Wesley C.

    2014-01-01

    Little is known about the genetic changes that distinguish domestic cat populations from their wild progenitors. Here we describe a high-quality domestic cat reference genome assembly and comparative inferences made with other cat breeds, wildcats, and other mammals. Based upon these comparisons, we identified positively selected genes enriched for genes involved in lipid metabolism that underpin adaptations to a hypercarnivorous diet. We also found positive selection signals within genes underlying sensory processes, especially those affecting vision and hearing in the carnivore lineage. We observed an evolutionary tradeoff between functional olfactory and vomeronasal receptor gene repertoires in the cat and dog genomes, with an expansion of the feline chemosensory system for detecting pheromones at the expense of odorant detection. Genomic regions harboring signatures of natural selection that distinguish domestic cats from their wild congeners are enriched in neural crest-related genes associated with behavior and reward in mouse models, as predicted by the domestication syndrome hypothesis. Our description of a previously unidentified allele for the gloving pigmentation pattern found in the Birman breed supports the hypothesis that cat breeds experienced strong selection on specific mutations drawn from random bred populations. Collectively, these findings provide insight into how the process of domestication altered the ancestral wildcat genome and build a resource for future disease mapping and phylogenomic studies across all members of the Felidae. PMID:25385592

  17. The complete chloroplast genome sequence of Gentiana lawrencei var. farreri (Gentianaceae) and comparative analysis with its congeneric species

    PubMed Central

    Fu, Peng-Cheng; Zhang, Yan-Zhao; Geng, Hui-Min

    2016-01-01

    Background The chloroplast (cp) genome is useful in plant systematics, genetic diversity analysis, molecular identification and divergence dating. The genus Gentiana contains 362 species, but there are only two valuable complete cp genomes. The purpose of this study is to report the characterization of the complete cp genome of G. lawrencei var. farreri, which is endemic to the Qinghai-Tibetan Plateau (QTP). Methods Using high throughput sequencing technology, we obtained the complete nucleotide sequence of the G. lawrencei var. farreri cp genome. The comparative analysis, including genome difference and gene divergence, was performed with its congeneric species G. straminea. The simple sequence repeats (SSRs) and phylogenetics were studied as well. Results The cp genome of G. lawrencei var. farreri is a circular molecule of 138,750 bp, containing a pair of 24,653 bp inverted repeats which are separated by small and large single-copy regions of 11,365 and 78,082 bp, respectively. The cp genome contains 130 known genes, including 85 protein coding genes (PCGs), eight ribosomal RNA genes and 37 tRNA genes. Comparative analyses indicated that G. lawrencei var. farreri is 10,241 bp shorter than its congeneric species G. straminea. Four large gaps were detected that are responsible for 85% of the total sequence loss. Further detailed analyses revealed that 10 PCGs were included in the four gaps that encode nine NADH dehydrogenase subunits. The cp gene content, order and orientation are similar to those of its congeneric species, but with some variation among the PCGs. Three genes, ndhB, ndhF and clpP, have high nonsynonymous to synonymous values. There are 34 SSRs in the G. lawrencei var. farreri cp genome, of which 25 are mononucleotide repeats; no dinucleotide repeats were detected. Comparison with the G. straminea cp genome indicated that five SSRs have length polymorphisms and 23 SSRs are species-specific.
The phylogenetic analysis of 48 PCGs from 12 Gentianales taxa cp genomes

  18. Comparative Reannotation of 21 Aspergillus Genomes

    SciTech Connect

    Salamov, Asaf; Riley, Robert; Kuo, Alan; Grigoriev, Igor

    2013-03-08

    We used comparative gene modeling to reannotate 21 Aspergillus genomes. Initial automatic annotation of individual genomes may contain errors of different kinds, e.g. missing genes, incorrect exon-intron structures, and 'chimeras' that fuse two or more real genes or, alternatively, models that split a real gene into two or more. The main premise behind the comparative modeling approach is that for closely related genomes most orthologous families have the same conserved gene structure. The algorithm maps all gene models predicted in each individual Aspergillus genome to the other genomes and, for each locus, selects from potentially many competing models the one which most closely resembles the orthologous genes from other genomes. This procedure is iterated until no further change in gene models is observed. For Aspergillus genomes we predicted in total 4503 new gene models (~2% per genome), supported by comparative analysis, additionally correcting ~18% of old gene models. This resulted in a total of 4065 more genes with annotated PFAM domains (~3% increase per genome). Analysis of a few genomes with EST/transcriptomics data shows that the new annotation sets also have a higher number of EST-supported splice sites at exon-intron boundaries.

  19. Comparative genome-scale analysis of niche-based stress-responsive genes in Lactobacillus helveticus strains.

    PubMed

    Senan, Suja; Prajapati, Jashbhai B; Joshi, Chaitanya G

    2014-04-01

    Next generation sequencing technologies with advanced bioinformatic tools present a unique opportunity to compare genomes from diverse niches. The identification of niche-specific stress-responsive genes can help in characterizing robust strains for multiple applications. In this study, we attempted to compare the stress-responsive genes of a potential probiotic strain, Lactobacillus helveticus MTCC 5463, and a cheese starter strain, Lactobacillus helveticus DPC 4571, from a gut and dairy niche, respectively. Sequencing of MTCC 5463 was done using 454 GS FLX, and contigs were assembled using GS Assembler software. Genome analysis was done using BLAST hits and the prokaryotic annotation server RAST. The MTCC 5463 genome carried multiple orthologs of genes governing stress responses, whereas the DPC 4571 genome lacked in the number of major stress-response proteins. The absence of the bile salt hydrolase gene in DPC 4571 and its presence in MTCC 5463 clearly indicated niche adaptation. Further, MTCC 5463 carried higher copy numbers of genes contributing towards heat, cold, osmotic, and oxidative stress resistance as compared with DPC 4571. Through comparative genomics, we could thus identify stress-responsive gene sets required to adapt to gut and dairy niches.

  20. Comparative genomic analysis reveals a distant liver enhancer upstream of the COUP-TFII gene

    SciTech Connect

    Baroukh, Nadine; Ahituv, Nadav; Chang, Jessie; Shoukry, Malak; Afzal, Veena; Rubin, Edward M.; Pennacchio, Len A.

    2004-08-20

    COUP-TFII is a central nuclear hormone receptor that tightly regulates the expression of numerous target lipid metabolism genes in vertebrates. However, it remains unclear how COUP-TFII itself is transcriptionally controlled since studies with its promoter and upstream region fail to recapitulate the gene's liver expression. In an attempt to identify liver enhancers in the vicinity of COUP-TFII, we employed a comparative genomic approach. Initial comparisons between humans and mice of the 3,470 kb gene-poor region surrounding COUP-TFII revealed 2,023 conserved non-coding elements. To prioritize a subset of these elements for functional studies, we performed further genomic comparisons with the orthologous pufferfish (Fugu rubripes) locus and uncovered two anciently conserved non-coding sequences (CNS) upstream of COUP-TFII (CNS-62kb and CNS-66kb). Testing these two elements using reporter constructs in liver (HepG2) cells revealed that CNS-66kb, but not CNS-62kb, yielded robust in vitro enhancer activity. In addition, an in vivo reporter assay using naked DNA transfer with CNS-66kb linked to luciferase displayed strong reproducible liver expression in adult mice, further supporting its role as a liver enhancer. Together, these studies further support the utility of comparative genomics to uncover gene regulatory sequences based on evolutionary conservation and provide the substrates to better understand the regulation and expression of COUP-TFII.

  1. Map-based comparative genomic analysis of virulent Haemophilus parasuis serovars 4 and 5.

    PubMed

    Lawrence, Paulraj; Bey, Russell

    2015-01-01

    Haemophilus parasuis is a commensal bacterium of the upper respiratory tract of healthy pigs. However, in conjunction with viral infections in immunocompromised animals H. parasuis can transform into a pathogen that is responsible for causing Glasser's disease which is typically characterized by fibrinous polyserositis, polyarthritis, meningitis and sometimes acute pneumonia and septicemia in pigs. Haemophilus parasuis serovar 5 is highly virulent and more frequently isolated from respiratory and systemic infection in pigs. Recently a highly virulent H. parasuis serovar 4 was isolated from the tissues of diseased pigs. To understand the differences in virulence and virulence-associated genes between H. parasuis serovar 5 and highly virulent H. parasuis serovar 4 strains, a genomic library was generated by TruSeq preparation and sequenced on Illumina HiSeq 2000 obtaining 50 bp PE reads. A three-way comparative genomic analysis was conducted between two highly virulent H. parasuis serovar 4 strains and H. parasuis serovar 5. Haemophilus parasuis serovar 5 GenBank isolate SH0165 (GenBank accession number CP001321.1) was used as reference strain for assembly. Results of these analyses revealed that the highly virulent H. parasuis serovar 4 lacks genes encoding glycosyl transferases, polysaccharide biosynthesis protein capD, spore coat polysaccharide biosynthesis protein C, polysaccharide export protein and sialyltransferase which can modify the lipopolysaccharide forming a short-chain LPS lacking O-specific polysaccharide chains often referred to as lipooligosaccharide (LOS). In addition, it can modify the outer membrane protein (OMP) structure. The lack of sialyltransferase significantly reduced the amount of sialic acid incorporated into LOS, a major and essential component of the cell wall and an important virulence determinant.
These molecules may be involved in various stages of pathogenesis through molecular mimicry and by causing host cell cytotoxicity, reduced

  2. Map-Based Comparative Genomic Analysis of Virulent Haemophilus Parasuis Serovars 4 and 5

    PubMed Central

    Lawrence, Paulraj; Bey, Russell

    2015-01-01

    Haemophilus parasuis is a commensal bacterium of the upper respiratory tract of healthy pigs. However, in conjunction with viral infections in immunocompromised animals H. parasuis can transform into a pathogen that is responsible for causing Glasser's disease which is typically characterized by fibrinous polyserositis, polyarthritis, meningitis and sometimes acute pneumonia and septicemia in pigs. Haemophilus parasuis serovar 5 is highly virulent and more frequently isolated from respiratory and systemic infection in pigs. Recently a highly virulent H. parasuis serovar 4 was isolated from the tissues of diseased pigs. To understand the differences in virulence and virulence-associated genes between H. parasuis serovar 5 and highly virulent H. parasuis serovar 4 strains, a genomic library was generated by TruSeq preparation and sequenced on Illumina HiSeq 2000 obtaining 50 bp PE reads. A three-way comparative genomic analysis was conducted between two highly virulent H. parasuis serovar 4 strains and H. parasuis serovar 5. Haemophilus parasuis serovar 5 GenBank isolate SH0165 (GenBank accession number CP001321.1) was used as reference strain for assembly. Results of these analyses revealed that the highly virulent H. parasuis serovar 4 lacks genes encoding glycosyl transferases, polysaccharide biosynthesis protein capD, spore coat polysaccharide biosynthesis protein C, polysaccharide export protein and sialyltransferase which can modify the lipopolysaccharide forming a short-chain LPS lacking O-specific polysaccharide chains often referred to as lipooligosaccharide (LOS). In addition, it can modify the outer membrane protein (OMP) structure. The lack of sialyltransferase significantly reduced the amount of sialic acid incorporated into LOS, a major and essential component of the cell wall and an important virulence determinant.
These molecules may be involved in various stages of pathogenesis through molecular mimicry and by causing host cell cytotoxicity, reduced

  3. Comparative genomic analysis reveals 2-oxoacid dehydrogenase complex lipoylation correlation with aerobiosis in archaea.

    PubMed

    Borziak, Kirill; Posner, Mareike G; Upadhyay, Abhishek; Danson, Michael J; Bagby, Stefan; Dorus, Steve

    2014-01-01

    Metagenomic analyses have advanced our understanding of ecological microbial diversity, but to what extent can metagenomic data be used to predict the metabolic capacity of difficult-to-study organisms and their abiotic environmental interactions? We tackle this question, using a comparative genomic approach, by considering the molecular basis of aerobiosis within archaea. Lipoylation, the covalent attachment of lipoic acid to 2-oxoacid dehydrogenase multienzyme complexes (OADHCs), is essential for metabolism in aerobic bacteria and eukarya. Lipoylation is catalysed either by lipoate protein ligase (LplA), which in archaea is typically encoded by two genes (LplA-N and LplA-C), or by a lipoyl(octanoyl) transferase (LipB or LipM) plus a lipoic acid synthetase (LipA). Does the genomic presence of lipoylation and OADHC genes across archaea from diverse habitats correlate with aerobiosis? First, analyses of 11,826 biotin protein ligase (BPL)-LplA-LipB transferase family members and 147 archaeal genomes identified 85 species with lipoylation capabilities and provided support for multiple ancestral acquisitions of lipoylation pathways during archaeal evolution. Second, with the exception of the Sulfolobales order, the majority of species possessing lipoylation systems exclusively retain LplA, or either LipB or LipM, consistent with archaeal genome streamlining. Third, obligate anaerobic archaea display widespread loss of lipoylation and OADHC genes. Conversely, a high level of correspondence is observed between aerobiosis and the presence of LplA/LipB/LipM, LipA and OADHC E2, consistent with the role of lipoylation in aerobic metabolism. This correspondence between OADHC lipoylation capacity and aerobiosis indicates that genomic pathway profiling in archaea is informative and that well characterized pathways may be predictive in relation to abiotic conditions in difficult-to-study extremophiles. Given the highly variable retention of gene repertoires across the archaea

  4. Comparative Genomic Analysis Reveals a Possible Novel Non-Tuberculous Mycobacterium Species with High Pathogenic Potential.

    PubMed

    Choo, Siew Woh; Dutta, Avirup; Wong, Guat Jah; Wee, Wei Yee; Ang, Mia Yang; Siow, Cheuk Chuen

    2016-01-01

    Mycobacteria have been reported to cause a wide range of human diseases. We present the first whole-genome study of a Non-Tuberculous Mycobacterium, Mycobacterium sp. UM_CSW (referred to hereafter as UM_CSW), isolated from a patient diagnosed with bronchiectasis. Our data suggest that this clinical isolate is likely a novel mycobacterial species, supported by clear evidence from molecular phylogenetic, comparative genomic, ANI and AAI analyses. UM_CSW is closely related to the Mycobacterium avium complex. While it has characteristic features of an environmental bacterium, it also shows a high pathogenic potential with the presence of a wide variety of putative genes related to bacterial virulence and shares very similar pathogenomic profiles with the known pathogenic mycobacterial species. Thus, we conclude that this possible novel Mycobacterium species should be tightly monitored for its possible causative role in human infections. PMID:27035710

  5. Complete sequence and comparative analysis of the chloroplast genome of coconut palm (Cocos nucifera).

    PubMed

    Huang, Ya-Yi; Matzke, Antonius J M; Matzke, Marjori

    2013-01-01

    Coconut, a member of the palm family (Arecaceae), is one of the most economically important trees used by mankind. Despite its diverse morphology, coconut is recognized taxonomically as only a single species (Cocos nucifera L.). There are two major coconut varieties, tall and dwarf, the latter of which displays traits resulting from selection by humans. We report here the complete chloroplast (cp) genome of a dwarf coconut plant, and describe the gene content and organization, inverted repeat fluctuations, repeated sequence structure, and occurrence of RNA editing. Phylogenetic relationships of monocots were inferred based on 47 chloroplast protein-coding genes. Potential nodes for events of gene duplication and pseudogenization related to inverted repeat fluctuation were mapped onto the tree using parsimony criteria. We compare our findings with those from other palm species for which complete cp genome sequences are available.

  6. Comparative genomic analysis of human Chlamydia pneumoniae isolates from respiratory, brain and cardiac tissues.

    PubMed

    Roulis, Eileen; Bachmann, Nathan L; Myers, Garry S A; Huston, Wilhelmina; Summersgill, James; Hudson, Alan; Dreses-Werringloer, Ute; Polkinghorne, Adam; Timms, Peter

    2015-12-01

    Chlamydia pneumoniae is an obligate intracellular bacterium implicated in a wide range of human diseases including atherosclerosis and Alzheimer's disease. Efforts to understand the relationships between C. pneumoniae detected in these diseases have been hindered by the limited availability of sequence data for non-respiratory strains. In this study, we sequenced the whole genomes for C. pneumoniae isolates from atherosclerosis and Alzheimer's disease, and compared these to previously published C. pneumoniae genomes. Phylogenetic analyses of these new C. pneumoniae strains indicate two sub-groups within human C. pneumoniae, and suggest that both recombination and mutation events have driven the evolution of human C. pneumoniae. Further fine-detailed analyses of these new C. pneumoniae sequences show several genetically variable loci. This suggests that similar strains of C. pneumoniae are found in the brain, lungs and cardiovascular system and that only minor genetic differences may contribute to the adaptation of particular strains in human disease.

  7. Complete Sequence and Comparative Analysis of the Chloroplast Genome of Coconut Palm (Cocos nucifera)

    PubMed Central

    Huang, Ya-Yi; Matzke, Antonius J. M.; Matzke, Marjori

    2013-01-01

    Coconut, a member of the palm family (Arecaceae), is one of the most economically important trees used by mankind. Despite its diverse morphology, coconut is recognized taxonomically as only a single species (Cocos nucifera L.). There are two major coconut varieties, tall and dwarf, the latter of which displays traits resulting from selection by humans. We report here the complete chloroplast (cp) genome of a dwarf coconut plant, and describe the gene content and organization, inverted repeat fluctuations, repeated sequence structure, and occurrence of RNA editing. Phylogenetic relationships of monocots were inferred based on 47 chloroplast protein-coding genes. Potential nodes for events of gene duplication and pseudogenization related to inverted repeat fluctuation were mapped onto the tree using parsimony criteria. We compare our findings with those from other palm species for which complete cp genome sequences are available. PMID:24023703

  9. Comparative genomic analysis of aspartic proteases in eight parasitic platyhelminths: insights into functions and evolution.

    PubMed

    Wang, Shuai; Wei, Wei; Luo, Xuenong; Wang, Sen; Hu, Songnian; Cai, Xuepeng

    2015-03-15

    We performed genome-wide identification and comparative genomic analyses of the predicted aspartic proteases (APs) from eight parasitic flatworms, focusing on their evolution, potential as drug targets and expression patterns. The results revealed that: i) more members of family A01 were identified from the schistosomes than from the cestodes; some evidence implied gene loss events along the class Cestoda, which may be related to the different ways of ingesting host nutrition; ii) members of family A22 were evolutionarily highly conserved among all the parasites; iii) one retroviral-like AP in family A28 shared a highly similar predicted 3D structure with the HIV protease, implying its potential to be inhibited by HIV inhibitor-like molecules; and iv) retrotransposon-associated APs were extensively expanded among these parasites. These results implied that the evolutionary histories of some APs in these parasites might relate to adaptations to their parasitism and that some APs might have potential as intervention targets.

  10. Comparative Genomic Analysis Reveals a Possible Novel Non-Tuberculous Mycobacterium Species with High Pathogenic Potential

    PubMed Central

    Choo, Siew Woh; Dutta, Avirup; Wong, Guat Jah; Wee, Wei Yee; Ang, Mia Yang; Siow, Cheuk Chuen

    2016-01-01

    Mycobacteria have been reported to cause a wide range of human diseases. We present the first whole-genome study of a Non-Tuberculous Mycobacterium, Mycobacterium sp. UM_CSW (referred to hereafter as UM_CSW), isolated from a patient diagnosed with bronchiectasis. Our data suggest that this clinical isolate is likely a novel mycobacterial species, supported by clear evidence from molecular phylogenetic, comparative genomic, ANI and AAI analyses. UM_CSW is closely related to the Mycobacterium avium complex. While it has characteristic features of an environmental bacterium, it also shows a high pathogenic potential with the presence of a wide variety of putative genes related to bacterial virulence and shares very similar pathogenomic profiles with the known pathogenic mycobacterial species. Thus, we conclude that this possible novel Mycobacterium species should be tightly monitored for its possible causative role in human infections. PMID:27035710

  11. Gramene: a growing plant comparative genomics resource

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Gramene (www.gramene.org) is a curated genetic, genomic and comparative genome analysis resource for the major crop species, such as rice, maize, wheat and many other plant (mainly grass) species. Gramene is an open-source project, with all data and software freely downloadable through the ftp site ...

  12. Comparative Genomic Analysis of Rapid Evolution of an Extreme-Drug-Resistant Acinetobacter baumannii Clone

    PubMed Central

    Tan, Sean Yang-Yi; Chua, Song Lin; Liu, Yang; Høiby, Niels; Andersen, Leif Percival; Givskov, Michael; Song, Zhijun; Yang, Liang

    2013-01-01

    The emergence of extreme-drug-resistant (EDR) bacterial strains in hospital and nonhospital clinical settings is a big and growing public health threat. Understanding the antibiotic resistance mechanisms at the genomic levels can facilitate the development of next-generation agents. Here, comparative genomics has been employed to analyze the rapid evolution of an EDR Acinetobacter baumannii clone from the intensive care unit (ICU) of Rigshospitalet at Copenhagen. Two resistant A. baumannii strains, 48055 and 53264, were sequentially isolated from two individuals who had been admitted to ICU within a 1-month interval. Multilocus sequence typing indicates that these two isolates belonged to ST208. The A. baumannii 53264 strain gained colistin resistance compared with the 48055 strain and became an EDR strain. Genome sequencing indicates that A. baumannii 53264 and 48055 have almost identical genomes—61 single-nucleotide polymorphisms (SNPs) were found between them. The A. baumannii 53264 strain was assembled into 130 contigs, with a total length of 3,976,592 bp with 38.93% GC content. The A. baumannii 48055 strain was assembled into 135 contigs, with a total length of 4,049,562 bp with 39.00% GC content. Genome comparisons showed that this A. baumannii clone is classified as an International clone II strain and has 94% synteny with the A. baumannii ACICU strain. The ResFinder server identified a total of 14 antibiotic resistance genes in the A. baumannii clone. Proteomic analyses revealed that a putative porin protein was down-regulated when A. baumannii 53264 was exposed to antimicrobials, which may reduce the entry of antibiotics into the bacterial cell. PMID:23538992
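
    The assembly statistics quoted in this abstract (GC content, and the number of SNPs separating two nearly identical genomes) are simple quantities once sequences are available. The following is a minimal Python sketch, assuming toy sequences that are already aligned to equal length. The function names gc_content and count_snps and the example strings are hypothetical, and this is not the pipeline used by the authors.

```python
def gc_content(seq: str) -> float:
    """Return the percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


def count_snps(a: str, b: str) -> int:
    """Count mismatched positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a.upper(), b.upper()) if x != y)


# Toy data for illustration only; not taken from the A. baumannii genomes.
strain_a = "ATGCATTAAT"
strain_b = "ATGCATTAGT"

print(f"GC content of strain_a: {gc_content(strain_a):.2f}%")
print(f"SNPs between strains: {count_snps(strain_a, strain_b)}")
```

    Real draft assemblies are split across many contigs and must be aligned before SNPs can be counted, so a position-by-position comparison like this one only applies after that step.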

  13. Association between chromosomal aberration of COX8C and tethered spinal cord syndrome: array-based comparative genomic hybridization analysis

    PubMed Central

    Zhao, Qiu-jiong; Bai, Shao-cong; Cheng, Cheng; Tao, Ben-zhang; Wang, Le-kai; Liang, Shuang; Yin, Ling; Hang, Xing-yi; Shang, Ai-jia

    2016-01-01

    Copy number variations have been found in patients with neural tube abnormalities. In this study, we performed genome-wide screening using high-resolution array-based comparative genomic hybridization in three children with tethered spinal cord syndrome and two healthy parents. Of eight copy number variations, four were non-polymorphic. These non-polymorphic copy number variations were associated with Angelman and Prader-Willi syndromes, and microcephaly. Gene function enrichment analysis revealed that COX8C, a gene associated with metabolic disorders of the nervous system, was located in the copy number variation region of Patient 1. Our results indicate that array-based comparative genomic hybridization can be used to diagnose tethered spinal cord syndrome. Our results may help determine the pathogenesis of tethered spinal cord syndrome and prevent occurrence of this disease.

  15. Association between chromosomal aberration of COX8C and tethered spinal cord syndrome: array-based comparative genomic hybridization analysis.

    PubMed

    Zhao, Qiu-Jiong; Bai, Shao-Cong; Cheng, Cheng; Tao, Ben-Zhang; Wang, Le-Kai; Liang, Shuang; Yin, Ling; Hang, Xing-Yi; Shang, Ai-Jia

    2016-08-01

    Copy number variations have been found in patients with neural tube abnormalities. In this study, we performed genome-wide screening using high-resolution array-based comparative genomic hybridization in three children with tethered spinal cord syndrome and two healthy parents. Of eight copy number variations, four were non-polymorphic. These non-polymorphic copy number variations were associated with Angelman and Prader-Willi syndromes, and microcephaly. Gene function enrichment analysis revealed that COX8C, a gene associated with metabolic disorders of the nervous system, was located in the copy number variation region of Patient 1. Our results indicate that array-based comparative genomic hybridization can be used to diagnose tethered spinal cord syndrome. Our results may help determine the pathogenesis of tethered spinal cord syndrome and prevent occurrence of this disease. PMID:27651783

  16. Comparative genomics of mycobacterial proteases.

    PubMed

    Ribeiro-Guimarães, Michelle Lopes; Pessolani, Maria Cristina Vidal

    2007-01-01

    Although proteases are recognized as important virulence factors in pathogenic microorganisms, little information is available so far regarding the potential role of these enzymes in diseases caused by mycobacteria. Here we use bioinformatic tools to compare the protease-coding genes present in the genomes of Mycobacterium leprae, Mycobacterium tuberculosis, Mycobacterium bovis and Mycobacterium avium paratuberculosis. This analysis allowed a review of the nomenclature of the protease family present in mycobacteria. Special attention was devoted to the 'decaying genome' of M. leprae, where a relatively high level of conservation of protease-coding genes was observed when compared to other gene families. A total of 39 genes out of the 49 found in M. bovis were identified in M. leprae. Of relevance, a core of 38 well-conserved protease genes shared by the four species was defined. This set of proteases is probably essential for survival in the host and disease outcome and may constitute novel targets for drug development leading to a more effective control of mycobacterial diseases.
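
    A shared "core" of genes, as described in this abstract, amounts to the intersection of the per-species gene sets, and the 39-of-49 figure is a retained fraction. The sketch below illustrates both ideas in Python. The species labels and gene identifiers (protA, protB, ...) are hypothetical placeholders, not real mycobacterial protease names, and orthology assignment, which is the hard part in practice, is assumed to have been done already.

```python
def core_genes(gene_sets: dict) -> set:
    """Return the genes present in every species (the shared core)."""
    sets = list(gene_sets.values())
    if not sets:
        return set()
    return set.intersection(*sets)


def retained_fraction(retained: int, total: int) -> float:
    """Fraction of a reference gene set that is also found in another genome."""
    if total <= 0:
        raise ValueError("total must be positive")
    return retained / total


# Hypothetical ortholog identifiers for illustration only.
toy_sets = {
    "species_1": {"protA", "protB", "protC"},
    "species_2": {"protA", "protB", "protD"},
    "species_3": {"protA", "protB", "protC", "protD"},
}

print("core:", sorted(core_genes(toy_sets)))
# Counts quoted in the abstract: 39 of the 49 M. bovis genes found in M. leprae.
print(f"retained: {retained_fraction(39, 49):.1%}")
```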

  17. Comparative genomic hybridization analysis of newly established retinoblastoma cell lines of adherent growth compared with Y79 of nonadherent growth.

    PubMed

    Kim, Jeong Hun; Kim, Jin Hyoung; Yu, Young Suk; Kim, Dong Hun; Kim, Yong Kyu; Kim, Kyu-Won

    2008-08-01

    Retinoblastoma (RB) shows cytogenetic aberrations involving genes other than the RB gene located on 13q14. We analyzed genomic aberrations in the newly established RB cell lines SNUOT-RB1 and SNUOT-RB4, of adherent growth, and in the Y79 cell line, of nonadherent growth, by microarray comparative genomic hybridization. SNUOT-RB1 showed 44 significant copy number changes (gain in 11 and loss in 33, P<0.0005). SNUOT-RB4 showed 42 significant copy number changes (gain in 8 and loss in 34, P<0.0005). The Y79 cell line had the greatest gain, 19.65-fold, at the locus of the MYCN gene, 2p24.1, whereas SNUOT-RB1 and SNUOT-RB4 showed no significant gain. SNUOT-RB1 and SNUOT-RB4 both gained copy number on chromosome 11, especially at locus 11q13, which harbors cancer-related genes such as CCND1, MEN1, and FGF3. Losses of copy number occurred on chromosomes 3, 9, 10, 11, 16, and 17. In summary, SNUOT-RB1 and SNUOT-RB4 showed similar patterns of chromosomal copy number gain and loss, which differed from those of Y79. The tumor suppressor gene CYLD, at 16q12-q13, was the only locus lost in all 3 cell lines. PMID:18799932
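
    Array CGH signals are commonly expressed as log2 ratios of test to reference intensity, so a fold change such as the 19.65-fold MYCN gain reported here can be converted directly. The Python sketch below shows that conversion and a simple gain/loss call. The 0.3 cutoff and the function names are assumptions chosen for illustration; the abstract does not state which thresholds or software the authors used.

```python
import math


def log2_ratio(fold_change: float) -> float:
    """Convert a test/reference fold change to a log2 ratio."""
    if fold_change <= 0:
        raise ValueError("fold change must be positive")
    return math.log2(fold_change)


def classify(log2r: float, threshold: float = 0.3) -> str:
    """Label a locus as gain, loss or neutral; the 0.3 cutoff is an assumed value."""
    if log2r >= threshold:
        return "gain"
    if log2r <= -threshold:
        return "loss"
    return "neutral"


# Fold change quoted in the abstract for the MYCN locus in Y79.
mycn = log2_ratio(19.65)
print(f"log2 ratio: {mycn:.2f} -> {classify(mycn)}")
```

    On this scale a single-copy loss from a diploid background (fold change 0.5) maps to -1, which is why symmetric cutoffs around zero are a common design choice.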

  18. Whole-Genome Sequencing and Comparative Analysis of Mycobacterium brisbanense Reveals a Possible Soil Origin and Capability in Fertiliser Synthesis.

    PubMed

    Wee, Wei Yee; Tan, Tze King; Jakubovics, Nicholas S; Choo, Siew Woh

    2016-01-01

    Mycobacterium brisbanense is a member of Mycobacterium fortuitum third biovariant complex, which includes rapidly growing Mycobacterium spp. that normally inhabit soil, dust and water, and can sometimes cause respiratory tract infections in humans. We present the first whole-genome analysis of M. brisbanense UM_WWY which was isolated from a 70-year-old Malaysian patient. Molecular phylogenetic analyses confirmed the identification of this strain as M. brisbanense and showed that it has an unusually large genome compared with related mycobacteria. The large genome size of M. brisbanense UM_WWY (~7.7Mbp) is consistent with further findings that this strain has a highly variable genome structure that contains many putative horizontally transferred genomic islands and prophage. Comparative analysis showed that M. brisbanense UM_WWY is the only Mycobacterium species that possesses a complete set of genes encoding enzymes involved in the urea cycle, suggesting that this soil bacterium is able to synthesize urea for use as plant fertilizers. It is likely that M. brisbanense UM_WWY is adapted to live in soil as its primary habitat since the genome contains many genes associated with nitrogen metabolism. Nevertheless, a large number of predicted virulence genes were identified in M. brisbanense UM_WWY that are mostly shared with well-studied mycobacterial pathogens such as Mycobacterium tuberculosis and Mycobacterium abscessus. These findings are consistent with the role of M. brisbanense as an opportunistic pathogen of humans. The whole-genome study of UM_WWY has provided the basis for future work of M. brisbanense. PMID:27031249

  19. Whole-Genome Sequencing and Comparative Analysis of Mycobacterium brisbanense Reveals a Possible Soil Origin and Capability in Fertiliser Synthesis

    PubMed Central

    Wee, Wei Yee; Tan, Tze King; Jakubovics, Nicholas S.; Choo, Siew Woh

    2016-01-01

    Mycobacterium brisbanense is a member of Mycobacterium fortuitum third biovariant complex, which includes rapidly growing Mycobacterium spp. that normally inhabit soil, dust and water, and can sometimes cause respiratory tract infections in humans. We present the first whole-genome analysis of M. brisbanense UM_WWY which was isolated from a 70-year-old Malaysian patient. Molecular phylogenetic analyses confirmed the identification of this strain as M. brisbanense and showed that it has an unusually large genome compared with related mycobacteria. The large genome size of M. brisbanense UM_WWY (~7.7Mbp) is consistent with further findings that this strain has a highly variable genome structure that contains many putative horizontally transferred genomic islands and prophage. Comparative analysis showed that M. brisbanense UM_WWY is the only Mycobacterium species that possesses a complete set of genes encoding enzymes involved in the urea cycle, suggesting that this soil bacterium is able to synthesize urea for use as plant fertilizers. It is likely that M. brisbanense UM_WWY is adapted to live in soil as its primary habitat since the genome contains many genes associated with nitrogen metabolism. Nevertheless, a large number of predicted virulence genes were identified in M. brisbanense UM_WWY that are mostly shared with well-studied mycobacterial pathogens such as Mycobacterium tuberculosis and Mycobacterium abscessus. These findings are consistent with the role of M. brisbanense as an opportunistic pathogen of humans. The whole-genome study of UM_WWY has provided the basis for future work of M. brisbanense. PMID:27031249

  1. Comparative genomic analysis of the Hsp70s from five diverse photosynthetic eukaryotes

    PubMed Central

    Renner, Tanya; Waters, Elizabeth R.

    2007-01-01

    We have identified 24 members of the DnaK subfamily of heat shock 70 proteins (Hsp70s) in the complete genomes of 5 diverse photosynthetic eukaryotes. The Hsp70s are a ubiquitous protein family that is highly conserved across all domains of life. Eukaryotic Hsp70s are found in a number of subcellular compartments in the cell: cytoplasm, mitochondrion (MT), chloroplast (CP), and endoplasmic reticulum (ER). Although the Hsp70s have been the subject of intense study in model organisms, very little is known of the Hsp70s from early diverging photosynthetic lineages. The sequencing of the complete genomes of Thalassiosira pseudonana (a diatom), Cyanidioschyzon merolae (a red alga), and 3 green algae (Chlamydomonas reinhardtii, Ostreococcus lucimarinus, Ostreococcus tauri) allows us to conduct comparative genomics of the Hsp70s present in these diverse photosynthetic eukaryotes. We have found that the distinct lineages of Hsp70s (MT, CP, ER, and cytoplasmic) each have different evolutionary histories. In general, evolutionary patterns of the mitochondrial and endoplasmic reticulum Hsp70s are relatively stable even among very distantly related organisms. This is not true of the chloroplast Hsp70s and we discuss the distinct evolutionary patterns between “green” and “red” plastids. Finally, we find that, in contrast to the angiosperms Arabidopsis thaliana and Oryza sativa that have numerous cytoplasmic Hsp70s, the 5 algal species have only 1 cytoplasmic Hsp70 each. The evolutionary and functional implications of these differences are discussed. PMID:17688196

  2. Comparative genome analysis reveals the molecular basis of nicotine degradation and survival capacities of Arthrobacter

    PubMed Central

    Yao, Yuxiang; Tang, Hongzhi; Su, Fei; Xu, Ping

    2015-01-01

    Arthrobacter is one of the most prevalent genera of nicotine-degrading bacteria; however, studies of nicotine degradation in Arthrobacter species remain at the plasmid level (plasmid pAO1). Here, we report the bioinformatic analysis of a nicotine-degrading Arthrobacter aurescens M2012083, and show that the moeB and mogA genes that are essential for nicotine degradation in Arthrobacter are absent from plasmid pAO1. Homologues of all the nicotine degradation-related genes of plasmid pAO1 were found to be located on a 68,622-bp DNA segment (nic segment-1) in the M2012083 genome, showing 98.1% nucleotide sequence identity to the 69,252-bp nic segment of plasmid pAO1. However, the remainder of plasmid pAO1, outside the nic segment, shows no significant similarity to the genome sequence of strain M2012083. Taken together, our data suggest that the nicotine degradation-related genes of strain M2012083 are located on the chromosome or a plasmid other than pAO1. Based on the genomic sequence comparison of strain M2012083 and six other Arthrobacter strains, we have identified 17 σ70 transcription factors reported to be involved in stress responses and 109 genes involved in environmental adaptability of strain M2012083. These results reveal the molecular basis of nicotine degradation and survival capacities of Arthrobacter species. PMID:25721465

  3. The use of clustering software for the classification of comparative genomic hybridization data: an analysis of 109 malignant fibrous histiocytomas.

    PubMed

    Chibon, Frédéric; Mariani, Odette; Mairal, Aline; Derré, Josette; Coindre, Jean-Michel; Terrier, Philippe; Lagacé, Réal; Sastre, Xavier; Aurias, Alain

    2003-02-01

    Malignant fibrous histiocytoma (MFH) is considered the most frequent soft-tissue sarcoma of late adult life. Nevertheless, the validity of this entity has been recurrently questioned by pathologists. Preliminary analyses by comparative genomic hybridization (CGH) of series of MFH have suggested that this tumor group is heterogeneous at the genomic level, and that at least two main genetic subgroups exist. We report a CGH analysis of a large series of 109 MFH and the use of clustering software for an objective classification of these tumors. We confirm our preliminary CGH results and demonstrate that two main clusters of tumors are present in the series analyzed. PMID:12581902
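
    The abstract does not name the clustering software or algorithm used. As a generic illustration of how CGH profiles can be grouped objectively, the Python sketch below applies average-linkage agglomerative clustering to toy profiles coded as +1 (gain), 0 (no change) and -1 (loss) per chromosomal region. The tumor labels, the profiles, the Hamming distance and the choice of two clusters are all assumptions made for the example.

```python
from itertools import combinations


def hamming(p, q):
    """Number of regions at which two copy-number profiles differ."""
    return sum(1 for a, b in zip(p, q) if a != b)


def agglomerate(profiles, k):
    """Merge the closest clusters (average linkage) until k clusters remain."""
    clusters = [[name] for name in profiles]

    def linkage(c1, c2):
        dists = [hamming(profiles[a], profiles[b]) for a in c1 for b in c2]
        return sum(dists) / len(dists)

    while len(clusters) > k:
        # Pick the pair of clusters with the smallest average distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)


# Toy profiles: +1 gain, 0 no change, -1 loss across six hypothetical regions.
toy_profiles = {
    "T1": [1, 1, 0, 0, -1, 0],
    "T2": [1, 1, 0, 0, -1, -1],
    "T3": [0, 0, -1, 1, 0, 0],
    "T4": [0, 0, -1, 1, 1, 0],
}

print(agglomerate(toy_profiles, 2))
```

    With real data the number of clusters is usually read off a dendrogram rather than fixed in advance; fixing k keeps the sketch short.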

  4. Comparative genomic analysis of Lactobacillus rhamnosus GG reveals pili containing a human-mucus binding protein

    PubMed Central

    Kankainen, Matti; Paulin, Lars; Tynkkynen, Soile; von Ossowski, Ingemar; Reunanen, Justus; Partanen, Pasi; Satokari, Reetta; Vesterlund, Satu; Hendrickx, Antoni P. A.; Lebeer, Sarah; De Keersmaecker, Sigrid C. J.; Vanderleyden, Jos; Hämäläinen, Tuula; Laukkanen, Suvi; Salovuori, Noora; Ritari, Jarmo; Alatalo, Edward; Korpela, Riitta; Mattila-Sandholm, Tiina; Lassig, Anna; Hatakka, Katja; Kinnunen, Katri T.; Karjalainen, Heli; Saxelin, Maija; Laakso, Kati; Surakka, Anu; Palva, Airi; Salusjärvi, Tuomas; Auvinen, Petri; de Vos, Willem M.

    2009-01-01

    To unravel the biological function of the widely used probiotic bacterium Lactobacillus rhamnosus GG, we compared its 3.0-Mbp genome sequence with the similarly sized genome of L. rhamnosus LC705, an adjunct starter culture exhibiting reduced binding to mucus. Both genomes demonstrated high sequence identity and synteny. However, for both strains, genomic islands, 5 in GG and 4 in LC705, punctuated the colinearity. A significant number of strain-specific genes were predicted in these islands (80 in GG and 72 in LC705). The GG-specific islands included genes coding for bacteriophage components, sugar metabolism and transport, and exopolysaccharide biosynthesis. One island only found in L. rhamnosus GG contained genes for 3 secreted LPXTG-like pilins (spaCBA) and a pilin-dedicated sortase. Using anti-SpaC antibodies, the physical presence of cell wall-bound pili was confirmed by immunoblotting. Immunogold electron microscopy showed that the SpaC pilin is located at the pilus tip but also sporadically throughout the structure. Moreover, the adherence of strain GG to human intestinal mucus was blocked by SpaC antiserum and abolished in a mutant carrying an inactivated spaC gene. Similarly, binding to mucus was demonstrated for the purified SpaC protein. We conclude that the presence of SpaC is essential for the mucus interaction of L. rhamnosus GG and likely explains its ability to persist in the human intestinal tract longer than LC705 during an intervention trial. The presence of mucus-binding pili on the surface of a nonpathogenic Gram-positive bacterial strain reveals a previously undescribed mechanism for the interaction of selected probiotic lactobacilli with host tissues. PMID:19805152

  5. Comparative genomic analysis of Lactobacillus rhamnosus GG reveals pili containing a human-mucus binding protein.

    PubMed

    Kankainen, Matti; Paulin, Lars; Tynkkynen, Soile; von Ossowski, Ingemar; Reunanen, Justus; Partanen, Pasi; Satokari, Reetta; Vesterlund, Satu; Hendrickx, Antoni P A; Lebeer, Sarah; De Keersmaecker, Sigrid C J; Vanderleyden, Jos; Hämäläinen, Tuula; Laukkanen, Suvi; Salovuori, Noora; Ritari, Jarmo; Alatalo, Edward; Korpela, Riitta; Mattila-Sandholm, Tiina; Lassig, Anna; Hatakka, Katja; Kinnunen, Katri T; Karjalainen, Heli; Saxelin, Maija; Laakso, Kati; Surakka, Anu; Palva, Airi; Salusjärvi, Tuomas; Auvinen, Petri; de Vos, Willem M

    2009-10-01

    To unravel the biological function of the widely used probiotic bacterium Lactobacillus rhamnosus GG, we compared its 3.0-Mbp genome sequence with the similarly sized genome of L. rhamnosus LC705, an adjunct starter culture exhibiting reduced binding to mucus. Both genomes demonstrated high sequence identity and synteny. However, for both strains, genomic islands, 5 in GG and 4 in LC705, punctuated the colinearity. A significant number of strain-specific genes were predicted in these islands (80 in GG and 72 in LC705). The GG-specific islands included genes coding for bacteriophage components, sugar metabolism and transport, and exopolysaccharide biosynthesis. One island only found in L. rhamnosus GG contained genes for 3 secreted LPXTG-like pilins (spaCBA) and a pilin-dedicated sortase. Using anti-SpaC antibodies, the physical presence of cell wall-bound pili was confirmed by immunoblotting. Immunogold electron microscopy showed that the SpaC pilin is located at the pilus tip but also sporadically throughout the structure. Moreover, the adherence of strain GG to human intestinal mucus was blocked by SpaC antiserum and abolished in a mutant carrying an inactivated spaC gene. Similarly, binding to mucus was demonstrated for the purified SpaC protein. We conclude that the presence of SpaC is essential for the mucus interaction of L. rhamnosus GG and likely explains its ability to persist in the human intestinal tract longer than LC705 during an intervention trial. The presence of mucus-binding pili on the surface of a nonpathogenic Gram-positive bacterial strain reveals a previously undescribed mechanism for the interaction of selected probiotic lactobacilli with host tissues.

  6. Sequencing and Comparative Analysis of the Straw Mushroom (Volvariella volvacea) Genome

    PubMed Central

    Bao, Dapeng; Gong, Ming; Zheng, Huajun; Chen, Mingjie; Zhang, Liang; Wang, Hong; Jiang, Jianping; Wu, Lin; Zhu, Yongqiang; Zhu, Gang; Zhou, Yan; Li, Chuanhua; Wang, Shengyue; Zhao, Yan; Zhao, Guoping; Tan, Qi

    2013-01-01

    Volvariella volvacea, the edible straw mushroom, is a highly nutritious food source that is widely cultivated on a commercial scale in many parts of Asia using agricultural wastes (rice straw, cotton wastes) as growth substrates. However, developments in V. volvacea cultivation have been limited due to a low biological efficiency (i.e. conversion of growth substrate to mushroom fruit bodies), sensitivity to low temperatures, and an unclear sexuality pattern that has restricted the breeding of improved strains. We have now sequenced the genome of V. volvacea and assembled it into 62 scaffolds with a total genome size of 35.7 megabases (Mb), containing 11,084 predicted gene models. Comparative analyses with model basidiomycete species were performed on the mating type system, carbohydrate-active enzymes, and fungal oxidative lignin enzymes. We also studied transcriptional regulation of the response to low temperature (4°C). We found that the genome of V. volvacea has many genes that code for enzymes involved in the degradation of cellulose, hemicellulose, and pectin. The molecular genetics of the mating type system in V. volvacea was also found to be similar to the bipolar system in basidiomycetes, suggesting that it is secondary homothallism. Sensitivity to low temperatures could be due to a failure to initiate the biosynthesis of unsaturated fatty acids, trehalose, and glycogen in this mushroom. Genome sequencing of V. volvacea has improved our understanding, at the molecular level, of the biological characteristics related to the degradation of the cultivating compost consisting of agricultural waste, the sexual reproduction mechanism, and the sensitivity to low temperatures, which in turn will enable us to increase the industrial production of this mushroom. PMID:23526973

  7. Comparative Genomic and Functional Analysis of 100 Lactobacillus rhamnosus Strains and Their Comparison with Strain GG

    PubMed Central

    Pietilä, Taija E.; Järvinen, Hanna M.; Messing, Marcel; Randazzo, Cinzia L.; Paulin, Lars; Laine, Pia; Ritari, Jarmo; Caggia, Cinzia; Lähteinen, Tanja; Brouns, Stan J. J.; Satokari, Reetta; von Ossowski, Ingemar; Reunanen, Justus; Palva, Airi; de Vos, Willem M.

    2013-01-01

    Lactobacillus rhamnosus is a lactic acid bacterium that is found in a large variety of ecological habitats, including artisanal and industrial dairy products, the oral cavity, intestinal tract or vagina. To gain insights into the genetic complexity and ecological versatility of the species L. rhamnosus, we examined the genomes and phenotypes of 100 L. rhamnosus strains isolated from diverse sources. The genomes of 100 L. rhamnosus strains were mapped onto the L. rhamnosus GG reference genome. These strains were phenotypically characterized for a wide range of metabolic, antagonistic, signalling and functional properties. Phylogenomic analysis showed multiple groupings of the species that could partly be associated with their ecological niches. We identified 17 highly variable regions that encode functions related to lifestyle, i.e. carbohydrate transport and metabolism, production of mucus-binding pili, bile salt resistance, prophages and CRISPR adaptive immunity. Integration of the phenotypic and genomic data revealed that some L. rhamnosus strains possibly resided in multiple niches, illustrating the dynamics of bacterial habitats. The present study showed two distinctive geno-phenotypes in the L. rhamnosus species. The geno-phenotype A suggests an adaptation to stable nutrient-rich niches, i.e. milk-derivative products, reflected by the alteration or loss of biological functions associated with antimicrobial activity spectrum, stress resistance, adaptability and fitness to a distinctive range of habitats. In contrast, the geno-phenotype B displays adequate traits to a variable environment, such as the intestinal tract, in terms of nutrient resources, bacterial population density and host effects. PMID:23966868

  8. Comparative Genomics Analysis of Streptococcus Isolates from the Human Small Intestine Reveals their Adaptation to a Highly Dynamic Ecosystem

    PubMed Central

    Van den Bogert, Bartholomeus; Boekhorst, Jos; Herrmann, Ruth; Smid, Eddy J.; Zoetendal, Erwin G.; Kleerebezem, Michiel

    2013-01-01

    The human small-intestinal microbiota is characterised by relatively large and dynamic Streptococcus populations. In this study, genome sequences of small-intestinal streptococci from S. mitis, S. bovis, and S. salivarius species-groups were determined and compared with those from 58 Streptococcus strains in public databases. The Streptococcus pangenome consists of 12,403 orthologous groups of which 574 are shared among all sequenced streptococci and are defined as the Streptococcus core genome. Genome mining of the small-intestinal streptococci focused on functions playing an important role in the interaction of these streptococci in the small-intestinal ecosystem, including natural competence and nutrient-transport and metabolism. Analysis of the small-intestinal Streptococcus genomes predicts a high capacity to synthesize amino acids and various vitamins as well as substantial divergence in their carbohydrate transport and metabolic capacities, which is in agreement with observed physiological differences between these Streptococcus strains. Gene-specific PCR-strategies enabled evaluation of conservation of Streptococcus populations in intestinal samples from different human individuals, revealing that the S. salivarius strains were frequently detected in the small-intestine microbiota, supporting the representative value of the genomes provided in this study. Finally, the Streptococcus genomes allow prediction of the effect of dietary substances on Streptococcus population dynamics in the human small-intestine. PMID:24386196
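
    The core-genome and pangenome counts quoted above can be illustrated with a minimal Python sketch. It assumes each genome has already been reduced to a set of orthologous-group identifiers; the function name core_and_pan, the strain names, and the toy group IDs are hypothetical and are not taken from the study, whose actual clustering pipeline is not described in this abstract.

```python
def core_and_pan(genomes):
    """Return (core, pan) sets of orthologous-group IDs.

    genomes: dict mapping a genome name to the set of orthologous-group
    IDs detected in that genome (hypothetical input format).
    """
    group_sets = list(genomes.values())
    if not group_sets:
        return set(), set()
    pan = set().union(*group_sets)           # groups seen in at least one genome
    core = set.intersection(*group_sets)     # groups shared by every genome
    return core, pan


# Toy, made-up data purely for illustration.
toy = {
    "strain_A": {"og1", "og2", "og3"},
    "strain_B": {"og1", "og2", "og4"},
    "strain_C": {"og1", "og2", "og5"},
}
core, pan = core_and_pan(toy)
print(len(core), len(pan))

# Share of the pangenome that is core, using the counts from the abstract.
core_fraction = 574 / 12403
print(f"{core_fraction:.1%}")
```

    With the figures reported in the abstract (574 core groups out of 12,403), the core genome corresponds to roughly 4.6% of the pangenome under this simple definition.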

  9. Comparative genomics analysis of Streptococcus isolates from the human small intestine reveals their adaptation to a highly dynamic ecosystem.

    PubMed

    Van den Bogert, Bartholomeus; Boekhorst, Jos; Herrmann, Ruth; Smid, Eddy J; Zoetendal, Erwin G; Kleerebezem, Michiel

    2013-01-01

    The human small-intestinal microbiota is characterised by relatively large and dynamic Streptococcus populations. In this study, genome sequences of small-intestinal streptococci from S. mitis, S. bovis, and S. salivarius species-groups were determined and compared with those from 58 Streptococcus strains in public databases. The Streptococcus pangenome consists of 12,403 orthologous groups of which 574 are shared among all sequenced streptococci and are defined as the Streptococcus core genome. Genome mining of the small-intestinal streptococci focused on functions playing an important role in the interaction of these streptococci in the small-intestinal ecosystem, including natural competence and nutrient-transport and metabolism. Analysis of the small-intestinal Streptococcus genomes predicts a high capacity to synthesize amino acids and various vitamins as well as substantial divergence in their carbohydrate transport and metabolic capacities, which is in agreement with observed physiological differences between these Streptococcus strains. Gene-specific PCR-strategies enabled evaluation of conservation of Streptococcus populations in intestinal samples from different human individuals, revealing that the S. salivarius strains were frequently detected in the small-intestine microbiota, supporting the representative value of the genomes provided in this study. Finally, the Streptococcus genomes allow prediction of the effect of dietary substances on Streptococcus population dynamics in the human small-intestine.

  10. Theobroma cacao: A genetically integrated physical map and genome-scale comparative synteny analysis

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A comprehensive integrated genomic framework is considered a centerpiece of genomic research. In collaboration with the USDA-ARS (SHRS) and Mars Inc., the Clemson University Genomics Institute (CUGI) has developed a genetically anchored physical map of the T. cacao genome. Three BAC libraries contai...

  11. Comparative genomic analysis of primary tumors and metastases in breast cancer

    PubMed Central

    Bertucci, François; Carbuccia, Nadine; Monneur, Audrey; Charafe-Jauffret, Emmanuelle; Goncalves, Anthony; Viens, Patrice; Birnbaum, Daniel; Chaffanet, Max

    2016-01-01

    Personalized medicine uses genomic information for selecting therapy in patients with metastatic cancer. An issue is the optimal tissue source (primary tumor or metastasis) for testing. We compared the DNA copy number and mutational profiles of primary breast cancers and paired metastases from 23 patients using whole-genome array-comparative genomic hybridization and next-generation sequencing of 365 “cancer-associated” genes. Primary tumors and metastases harbored copy number alterations (CNAs) and mutations common in breast cancer and showed concordant profiles. The global concordance regarding CNAs was shown by clustering and correlation matrix, which showed that each metastasis correlated more strongly with its paired tumor than with other samples. Genes with recurrent amplifications in breast cancer showed 100% (ERBB2, FGFR1), 96% (CCND1), and 88% (MYC) concordance for the amplified/non-amplified status. Among all samples, 499 mutations were identified, including 39 recurrent (AKT1, ERBB2, PIK3CA, TP53) and 460 non-recurrent variants. The tumors/metastases concordance of variants was 75%, higher for recurrent (92%) than for non-recurrent (73%) variants. Further mutational discordance came from very different variant allele frequencies for some variants. We showed that the chosen targeted therapy in two clinical trials of personalized medicine would be concordant in all but one patient (96%) when based on the molecular profiling of tumor and paired metastasis. Our results suggest that the genotyping of primary tumor may be acceptable to guide systemic treatment if the metastatic sample is not obtainable. However, given the rare but potentially relevant divergences for some actionable driver genes, the profiling of metastatic sample is recommended. PMID:27028851
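
    As a rough consistency check on the concordance figures above, the overall rate can be approximated as a count-weighted average of the two per-class rates. This is only a sketch: it assumes the 92% and 73% rates were computed over the same 39 recurrent and 460 non-recurrent variants, which the abstract does not state explicitly, and the helper name pooled_rate is hypothetical.

```python
def pooled_rate(groups):
    """Count-weighted mean of per-group rates.

    groups: iterable of (count, rate) pairs, with rate given as a fraction.
    """
    groups = list(groups)
    total = sum(n for n, _ in groups)
    return sum(n * r for n, r in groups) / total


recurrent = (39, 0.92)        # recurrent variants and their reported concordance
non_recurrent = (460, 0.73)   # non-recurrent variants and their reported concordance

# The two classes add up to the 499 mutations reported in the abstract.
assert recurrent[0] + non_recurrent[0] == 499

overall = pooled_rate([recurrent, non_recurrent])
print(f"{overall:.3f}")
```

    Under these assumptions the pooled value comes out at about 0.745, in line with the reported overall concordance of 75%; the small gap is expected because the published percentages are rounded.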

  12. Comparative Genomic Analysis of Pseudomonas chlororaphis PCL1606 Reveals New Insight into Antifungal Compounds Involved in Biocontrol.

    PubMed

    Calderón, Claudia E; Ramos, Cayo; de Vicente, Antonio; Cazorla, Francisco M

    2015-03-01

    Pseudomonas chlororaphis PCL1606 is a rhizobacterium that has biocontrol activity against many soilborne phytopathogenic fungi. The whole genome sequence of this strain was obtained using the Illumina HiSeq 2000 sequencing platform and was assembled using SOAPdenovo software. The resulting 6.66-Mb complete sequence of the PCL1606 genome was further analyzed. A comparative genomic analysis using 10 plant-associated strains within the fluorescent Pseudomonas group, including the complete genome of P. chlororaphis PCL1606, revealed a diverse spectrum of traits involved in multitrophic interactions with plants and microbes as well as biological control. Phylogenetic analysis of these strains using eight housekeeping genes clearly placed strain PCL1606 into the P. chlororaphis group. The genome sequence of P. chlororaphis PCL1606 revealed the presence of sequences that were homologous to biosynthetic genes for the antifungal compounds 2-hexyl, 5-propyl resorcinol (HPR), hydrogen cyanide, and pyrrolnitrin; this is the first report of pyrrolnitrin-encoding genes in this P. chlororaphis strain. Single-, double-, and triple-insertional mutants in the biosynthetic genes of each antifungal compound were used to test their roles in the production of these antifungal compounds and in antagonism and biocontrol of two fungal pathogens. The results confirmed the function of HPR in the antagonistic phenotype and in the biocontrol activity of P. chlororaphis PCL1606.

  13. Comparative Genomic Analysis of Pseudomonas chlororaphis PCL1606 Reveals New Insight into Antifungal Compounds Involved in Biocontrol.

    PubMed

    Calderón, Claudia E; Ramos, Cayo; de Vicente, Antonio; Cazorla, Francisco M

    2015-03-01

    Pseudomonas chlororaphis PCL1606 is a rhizobacterium that has biocontrol activity against many soilborne phytopathogenic fungi. The whole genome sequence of this strain was obtained using the Illumina HiSeq 2000 sequencing platform and was assembled using SOAPdenovo software. The resulting 6.66-Mb complete sequence of the PCL1606 genome was further analyzed. A comparative genomic analysis using 10 plant-associated strains within the fluorescent Pseudomonas group, including the complete genome of P. chlororaphis PCL1606, revealed a diverse spectrum of traits involved in multitrophic interactions with plants and microbes as well as biological control. Phylogenetic analysis of these strains using eight housekeeping genes clearly placed strain PCL1606 into the P. chlororaphis group. The genome sequence of P. chlororaphis PCL1606 revealed the presence of sequences that were homologous to biosynthetic genes for the antifungal compounds 2-hexyl, 5-propyl resorcinol (HPR), hydrogen cyanide, and pyrrolnitrin; this is the first report of pyrrolnitrin-encoding genes in this P. chlororaphis strain. Single-, double-, and triple-insertional mutants in the biosynthetic genes of each antifungal compound were used to test their roles in the production of these antifungal compounds and in antagonism and biocontrol of two fungal pathogens. The results confirmed the function of HPR in the antagonistic phenotype and in the biocontrol activity of P. chlororaphis PCL1606. PMID:25679537

  14. Comparative analysis of the complete genome of an Acinetobacter calcoaceticus strain adapted to a phenol-polluted environment.

    PubMed

    Zhan, Yuhua; Yan, Yongliang; Zhang, Wei; Chen, Ming; Lu, Wei; Ping, Shuzhen; Lin, Min

    2012-01-01

    The complete genome sequence of Acinetobacter calcoaceticus PHEA-2, a non-pathogenic phenol-degrading bacterium previously isolated from industrial wastewater of an oil refinery in China, has been established. This is the first sequence of an A. calcoaceticus strain. We report here a comparative genomic analysis of PHEA-2 with two other Acinetobacter species having different lifestyles, Acinetobacter baumannii AYE, a pathogenic human-adapted strain, and Acinetobacter baylyi ADP1, a soil-living strain. For a long time, A. calcoaceticus could not be easily distinguished from A. baumannii strains. Indeed, whole-genome comparison revealed high synteny between A. calcoaceticus and A. baumannii genomes, but most genes for multiple drug resistance as well as those presumably involved in pathogenicity were not present in the PHEA-2 genome and phylogenetic analysis showed that A. calcoaceticus differed from A. baumannii antibiotic-susceptible strains. It also revealed that many genes associated with environmental adaptation were acquired by horizontal gene transfer, including an 8-kb phenol degradation gene cluster. A relatively higher proportion of transport-related proteins were found in PHEA-2 than in ADP1 and AYE. Overall, these findings highlight the remarkable capacity of A. calcoaceticus PHEA-2 to effectively adapt to a phenol-polluted wastewater environment.

  15. Comparative genome sequence analysis underscores mycoparasitism as the ancestral life style of Trichoderma

    PubMed Central

    2011-01-01

    Background: Mycoparasitism, a lifestyle where one fungus is parasitic on another fungus, has special relevance when the prey is a plant pathogen, providing a strategy for biological control of pests for plant protection. Probably, the most studied biocontrol agents are species of the genus Hypocrea/Trichoderma. Results: Here we report an analysis of the genome sequences of the two biocontrol species Trichoderma atroviride (teleomorph Hypocrea atroviridis) and Trichoderma virens (formerly Gliocladium virens, teleomorph Hypocrea virens), and a comparison with Trichoderma reesei (teleomorph Hypocrea jecorina). These three Trichoderma species display a remarkable conservation of gene order (78 to 96%), and a lack of active mobile elements probably due to repeat-induced point mutation. Several gene families are expanded in the two mycoparasitic species relative to T. reesei or other ascomycetes, and are overrepresented in non-syntenic genome regions. A phylogenetic analysis shows that T. reesei and T. virens are derived relative to T. atroviride. The mycoparasitism-specific genes thus arose in a common Trichoderma ancestor but were subsequently lost in T. reesei. Conclusions: The data offer a better understanding of mycoparasitism, and thus enforce the development of improved biocontrol strains for efficient and environmentally friendly protection of plants. PMID:21501500

  16. Genome-wide Comparative Analysis of Atopic Dermatitis and Psoriasis Gives Insight into Opposing Genetic Mechanisms

    PubMed Central

    Baurecht, Hansjörg; Hotze, Melanie; Brand, Stephan; Büning, Carsten; Cormican, Paul; Corvin, Aiden; Ellinghaus, David; Ellinghaus, Eva; Esparza-Gordillo, Jorge; Fölster-Holst, Regina; Franke, Andre; Gieger, Christian; Hubner, Norbert; Illig, Thomas; Irvine, Alan D.; Kabesch, Michael; Lee, Young A.E.; Lieb, Wolfgang; Marenholz, Ingo; McLean, W.H. Irwin; Morris, Derek W.; Mrowietz, Ulrich; Nair, Rajan; Nöthen, Markus M.; Novak, Natalija; O’Regan, Grainne M.; Schreiber, Stefan; Smith, Catherine; Strauch, Konstantin; Stuart, Philip E.; Trembath, Richard; Tsoi, Lam C.; Weichenthal, Michael; Barker, Jonathan; Elder, James T.; Weidinger, Stephan; Cordell, Heather J.; Brown, Sara J.

    2015-01-01

    Atopic dermatitis and psoriasis are the two most common immune-mediated inflammatory disorders affecting the skin. Genome-wide studies demonstrate a high degree of genetic overlap, but these diseases have mutually exclusive clinical phenotypes and opposing immune mechanisms. Despite their prevalence, atopic dermatitis and psoriasis very rarely co-occur within one individual. By utilizing genome-wide association study and ImmunoChip data from >19,000 individuals and methodologies developed from meta-analysis, we have identified opposing risk alleles at shared loci as well as independent disease-specific loci within the epidermal differentiation complex (chromosome 1q21.3), the Th2 locus control region (chromosome 5q31.1), and the major histocompatibility complex (chromosome 6p21–22). We further identified previously unreported pleiotropic alleles with opposing effects on atopic dermatitis and psoriasis risk in PRKRA and ANXA6/TNIP1. In contrast, there was no evidence for shared loci with effects operating in the same direction on both diseases. Our results show that atopic dermatitis and psoriasis have distinct genetic mechanisms with opposing effects in shared pathways influencing epidermal differentiation and immune response. The statistical analysis methods developed in the conduct of this study have produced additional insight from previously published data sets. The approach is likely to be applicable to the investigation of the genetic basis of other complex traits with overlapping and distinct clinical features. PMID:25574825

  17. Comparative genomic analysis of a neurotoxigenic Clostridium species using partial genome sequence: Phylogenetic analysis of a few conserved proteins involved in cellular processes and metabolism.

    PubMed

    Alam, Syed Imteyaz; Dixit, Aparna; Tomar, Arvind; Singh, Lokendra

    2010-04-01

    Clostridial organisms produce neurotoxins, which are generally regarded as the most potent toxic substances of biological origin and potential biological warfare agents. Clostridium tetani produces tetanus neurotoxin and is responsible for the fatal tetanus disease. In spite of the extensive immunization regimen, the disease is an important cause of death especially among neonates. Strains of C. tetani have not been genetically characterized except the complete genome sequencing of strain E88. The present study reports the genetic makeup and phylogenetic affiliations of an environmental strain of this bacterium with respect to C. tetani E88 and other clostridia. A shotgun library was constructed from the genomic DNA of C. tetani drde, isolated from a decaying fish sample. Unique clones were sequenced and the sequences compared with its closest relative C. tetani E88. A total of 275 clones were obtained and 32,457 bases of non-redundant sequence were generated. A total of 150 base changes were observed over the entire length of sequence obtained, including additions, deletions and base substitutions. Of the total 120 ORFs detected, 48 exhibited closest similarity to E88 proteins of which three are hypothetical proteins. Eight of the ORFs exhibited similarity with hypothetical proteins from other organisms and 10 aligned with other proteins from unrelated organisms. There is an overall conservation of protein sequences among the two strains of C. tetani. Selected ORFs involved in cellular processes and metabolism were subjected to phylogenetic analysis. PMID:19527791

  18. A comparative analysis of the complete mitochondrial genome of the Eurasian otter Lutra lutra (Carnivora; Mustelidae).

    PubMed

    Ki, Jang-Seu; Hwang, Dae-Sik; Park, Tae-Jin; Han, Sang-Hoon; Lee, Jae-Seong

    2010-04-01

    Otter populations are declining throughout the world and most otter species are considered endangered. Molecular methods are suitable tools for population genetic research on endangered species. In the present study, we analyzed the complete mitochondrial genome (mitogenome) sequence of the Eurasian otter Lutra lutra. The mitochondrial DNA sequence of the Eurasian otter is 16,505 bp in length and consists of 13 protein-coding genes, 22 tRNAs, 2 rRNAs, and a control region (CR). The CR sequence of otters from Europe and Asia showed nearly identical numbers and nucleotide sequences of minisatellites. Phylogenetic analysis of Mustelidae mitogenomes, including individual genes, revealed that Lutrinae and Mustelinae form a clade, and that L. lutra and Enhydra lutris are sister taxa within the Lutrinae. Phylogenetic analyses revealed that of the 13 mitochondrial protein-coding genes, ND5 is the most reliable marker for analysis of phylogenetic relationships within the Mustelidae.

  19. Comparative genomics for biodiversity conservation.

    PubMed

    Grueber, Catherine E

    2015-01-01

    Genomic approaches are gathering momentum in biology and emerging opportunities lie in the creative use of comparative molecular methods for revealing the processes that influence diversity of wildlife. However, few comparative genomic studies are performed with explicit and specific objectives to aid conservation of wild populations. Here I provide a brief overview of comparative genomic approaches that offer specific benefits to biodiversity conservation. Because conservation examples are few, I draw on research from other areas to demonstrate how comparing genomic data across taxa may be used to inform the characterisation of conservation units and studies of hybridisation, as well as studies that provide conservation outcomes from a better understanding of the drivers of divergence. A comparative approach can also provide valuable insight into the threatening processes that impact rare species, such as emerging diseases and their management in conservation. In addition to these opportunities, I note areas where additional research is warranted. Overall, comparing and contrasting the genomic composition of threatened and other species provide several useful tools for helping to preserve the molecular biodiversity of the global ecosystem.

  20. Comparative genomics for biodiversity conservation

    PubMed Central

    Grueber, Catherine E.

    2015-01-01

    Genomic approaches are gathering momentum in biology and emerging opportunities lie in the creative use of comparative molecular methods for revealing the processes that influence diversity of wildlife. However, few comparative genomic studies are performed with explicit and specific objectives to aid conservation of wild populations. Here I provide a brief overview of comparative genomic approaches that offer specific benefits to biodiversity conservation. Because conservation examples are few, I draw on research from other areas to demonstrate how comparing genomic data across taxa may be used to inform the characterisation of conservation units and studies of hybridisation, as well as studies that provide conservation outcomes from a better understanding of the drivers of divergence. A comparative approach can also provide valuable insight into the threatening processes that impact rare species, such as emerging diseases and their management in conservation. In addition to these opportunities, I note areas where additional research is warranted. Overall, comparing and contrasting the genomic composition of threatened and other species provide several useful tools for helping to preserve the molecular biodiversity of the global ecosystem. PMID:26106461

  1. A Multi-Platform Draft de novo Genome Assembly and Comparative Analysis for the Scarlet Macaw (Ara macao)

    PubMed Central

    Seabury, Christopher M.; Dowd, Scot E.; Seabury, Paul M.; Raudsepp, Terje; Brightsmith, Donald J.; Liboriussen, Poul; Halley, Yvette; Fisher, Colleen A.; Owens, Elaine; Viswanathan, Ganesh; Tizard, Ian R.

    2013-01-01

    Data deposition to NCBI Genomes: This Whole Genome Shotgun project has been deposited at DDBJ/EMBL/GenBank under the accession AMXX00000000 (SMACv1.0, unscaffolded genome assembly). The version described in this paper is the first version (AMXX01000000). The scaffolded assembly (SMACv1.1) has been deposited at DDBJ/EMBL/GenBank under the accession AOUJ00000000, and is also the first version (AOUJ01000000). Strong biological interest in traits such as the acquisition and utilization of speech, cognitive abilities, and longevity catalyzed the utilization of two next-generation sequencing platforms to provide the first-draft de novo genome assembly for the large, New World parrot Ara macao (Scarlet Macaw). Despite the challenges associated with genome assembly for an outbred avian species, including 951,507 high-quality putative single nucleotide polymorphisms, the final genome assembly (>1.035 Gb) includes more than 997 Mb of unambiguous sequence data (excluding N’s). Cytogenetic analyses including ZooFISH revealed complex rearrangements associated with two scarlet macaw macrochromosomes (AMA6, AMA7), which supports the hypothesis that translocations, fusions, and intragenomic rearrangements are key factors associated with karyotype evolution among parrots. In silico annotation of the scarlet macaw genome provided robust evidence for 14,405 nuclear gene annotation models, their predicted transcripts and proteins, and a complete mitochondrial genome. Comparative analyses involving the scarlet macaw, chicken, and zebra finch genomes revealed high levels of nucleotide-based conservation as well as evidence for overall genome stability among the three highly divergent species. Application of a new whole-genome analysis of divergence involving all three species yielded prioritized candidate genes and noncoding regions for parrot traits of interest (i.e., speech, intelligence, longevity) which were independently supported by the results of previous human GWAS studies. We

  2. What Makes a Bacterial Species Pathogenic?: Comparative Genomic Analysis of the Genus Leptospira.

    PubMed

    Fouts, Derrick E; Matthias, Michael A; Adhikarla, Haritha; Adler, Ben; Amorim-Santos, Luciane; Berg, Douglas E; Bulach, Dieter; Buschiazzo, Alejandro; Chang, Yung-Fu; Galloway, Renee L; Haake, David A; Haft, Daniel H; Hartskeerl, Rudy; Ko, Albert I; Levett, Paul N; Matsunaga, James; Mechaly, Ariel E; Monk, Jonathan M; Nascimento, Ana L T; Nelson, Karen E; Palsson, Bernhard; Peacock, Sharon J; Picardeau, Mathieu; Ricaldi, Jessica N; Thaipandungpanit, Janjira; Wunder, Elsio A; Yang, X Frank; Zhang, Jun-Jie; Vinetz, Joseph M

    2016-02-01

    Leptospirosis, caused by spirochetes of the genus Leptospira, is a globally widespread, neglected and emerging zoonotic disease. While whole genome analysis of individual pathogenic, intermediately pathogenic and saprophytic Leptospira species has been reported, comprehensive cross-species genomic comparison of all known species of infectious and non-infectious Leptospira, with the goal of identifying genes related to pathogenesis and mammalian host adaptation, remains a key gap in the field. Infectious Leptospira, comprised of pathogenic and intermediately pathogenic Leptospira, evolutionarily diverged from non-infectious, saprophytic Leptospira, as demonstrated by the following computational biology analyses: 1) the definitive taxonomy and evolutionary relatedness among all known Leptospira species; 2) genomically-predicted metabolic reconstructions that indicate novel adaptation of infectious Leptospira to mammals, including sialic acid biosynthesis, pathogen-specific porphyrin metabolism and the first-time demonstration of cobalamin (B12) autotrophy as a bacterial virulence factor; 3) CRISPR/Cas systems demonstrated only to be present in pathogenic Leptospira, suggesting a potential mechanism for this clade's refractoriness to gene targeting; 4) finding Leptospira pathogen-specific specialized protein secretion systems; 5) novel virulence-related genes/gene families such as the Virulence Modifying (VM) (PF07598 paralogs) proteins and pathogen-specific adhesins; 6) discovery of novel, pathogen-specific protein modification and secretion mechanisms including unique lipoprotein signal peptide motifs, Sec-independent twin arginine protein secretion motifs, and the absence of certain canonical signal recognition particle proteins from all Leptospira; and 7) demonstration of infectious Leptospira-specific signal-responsive gene expression, motility and chemotaxis systems. By identifying large scale changes in infectious (pathogenic and intermediately pathogenic

  3. What Makes a Bacterial Species Pathogenic?:Comparative Genomic Analysis of the Genus Leptospira.

    PubMed

    Fouts, Derrick E; Matthias, Michael A; Adhikarla, Haritha; Adler, Ben; Amorim-Santos, Luciane; Berg, Douglas E; Bulach, Dieter; Buschiazzo, Alejandro; Chang, Yung-Fu; Galloway, Renee L; Haake, David A; Haft, Daniel H; Hartskeerl, Rudy; Ko, Albert I; Levett, Paul N; Matsunaga, James; Mechaly, Ariel E; Monk, Jonathan M; Nascimento, Ana L T; Nelson, Karen E; Palsson, Bernhard; Peacock, Sharon J; Picardeau, Mathieu; Ricaldi, Jessica N; Thaipandungpanit, Janjira; Wunder, Elsio A; Yang, X Frank; Zhang, Jun-Jie; Vinetz, Joseph M

    2016-02-01

    Leptospirosis, caused by spirochetes of the genus Leptospira, is a globally widespread, neglected and emerging zoonotic disease. While whole genome analysis of individual pathogenic, intermediately pathogenic and saprophytic Leptospira species has been reported, comprehensive cross-species genomic comparison of all known species of infectious and non-infectious Leptospira, with the goal of identifying genes related to pathogenesis and mammalian host adaptation, remains a key gap in the field. Infectious Leptospira, comprised of pathogenic and intermediately pathogenic Leptospira, evolutionarily diverged from non-infectious, saprophytic Leptospira, as demonstrated by the following computational biology analyses: 1) the definitive taxonomy and evolutionary relatedness among all known Leptospira species; 2) genomically-predicted metabolic reconstructions that indicate novel adaptation of infectious Leptospira to mammals, including sialic acid biosynthesis, pathogen-specific porphyrin metabolism and the first-time demonstration of cobalamin (B12) autotrophy as a bacterial virulence factor; 3) CRISPR/Cas systems demonstrated only to be present in pathogenic Leptospira, suggesting a potential mechanism for this clade's refractoriness to gene targeting; 4) finding Leptospira pathogen-specific specialized protein secretion systems; 5) novel virulence-related genes/gene families such as the Virulence Modifying (VM) (PF07598 paralogs) proteins and pathogen-specific adhesins; 6) discovery of novel, pathogen-specific protein modification and secretion mechanisms including unique lipoprotein signal peptide motifs, Sec-independent twin arginine protein secretion motifs, and the absence of certain canonical signal recognition particle proteins from all Leptospira; and 7) demonstration of infectious Leptospira-specific signal-responsive gene expression, motility and chemotaxis systems.
By identifying large scale changes in infectious (pathogenic and intermediately pathogenic

  4. What Makes a Bacterial Species Pathogenic?:Comparative Genomic Analysis of the Genus Leptospira

    PubMed Central

    Fouts, Derrick E.; Matthias, Michael A.; Adhikarla, Haritha; Adler, Ben; Amorim-Santos, Luciane; Berg, Douglas E.; Bulach, Dieter; Buschiazzo, Alejandro; Chang, Yung-Fu; Galloway, Renee L.; Haake, David A.; Haft, Daniel H.; Hartskeerl, Rudy; Ko, Albert I.; Levett, Paul N.; Matsunaga, James; Mechaly, Ariel E.; Monk, Jonathan M.; Nascimento, Ana L. T.; Nelson, Karen E.; Palsson, Bernhard; Peacock, Sharon J.; Picardeau, Mathieu; Ricaldi, Jessica N.; Thaipandungpanit, Janjira; Wunder, Elsio A.; Yang, X. Frank; Zhang, Jun-Jie; Vinetz, Joseph M.

    2016-01-01

    Leptospirosis, caused by spirochetes of the genus Leptospira, is a globally widespread, neglected and emerging zoonotic disease. While whole genome analysis of individual pathogenic, intermediately pathogenic and saprophytic Leptospira species has been reported, comprehensive cross-species genomic comparison of all known species of infectious and non-infectious Leptospira, with the goal of identifying genes related to pathogenesis and mammalian host adaptation, remains a key gap in the field. Infectious Leptospira, comprised of pathogenic and intermediately pathogenic Leptospira, evolutionarily diverged from non-infectious, saprophytic Leptospira, as demonstrated by the following computational biology analyses: 1) the definitive taxonomy and evolutionary relatedness among all known Leptospira species; 2) genomically-predicted metabolic reconstructions that indicate novel adaptation of infectious Leptospira to mammals, including sialic acid biosynthesis, pathogen-specific porphyrin metabolism and the first-time demonstration of cobalamin (B12) autotrophy as a bacterial virulence factor; 3) CRISPR/Cas systems demonstrated only to be present in pathogenic Leptospira, suggesting a potential mechanism for this clade’s refractoriness to gene targeting; 4) finding Leptospira pathogen-specific specialized protein secretion systems; 5) novel virulence-related genes/gene families such as the Virulence Modifying (VM) (PF07598 paralogs) proteins and pathogen-specific adhesins; 6) discovery of novel, pathogen-specific protein modification and secretion mechanisms including unique lipoprotein signal peptide motifs, Sec-independent twin arginine protein secretion motifs, and the absence of certain canonical signal recognition particle proteins from all Leptospira; and 7) demonstration of infectious Leptospira-specific signal-responsive gene expression, motility and chemotaxis systems.
By identifying large scale changes in infectious (pathogenic and intermediately pathogenic

  5. Genome-wide comparative analysis of ABC systems in the Bdellovibrio-and-like organisms

    PubMed Central

    Li, Nan; Chen, Huan; Williams, Henry N.

    2015-01-01

    Bdellovibrio-and-like organisms (BALOs) are gram-negative, predatory bacteria with wide variations in genome size, GC content and ecological habitat. ATP-binding cassette (ABC) systems have been identified in several prokaryotes, fungi and plants and have a role in the transport of materials into and out of cells and in cellular processes. However, knowledge of the ABC systems of BALOs remains limited. A total of 269 putative ABC proteins were identified in BALOs. The genes encoding these ABC systems occupy nearly 1.3% of the gene content in freshwater Bdellovibrio strains and about 0.7% in their saltwater counterparts. The proteins found belong to 25 ABC system families based on their structural characteristics and functions. Among these, 16 families function as importers, 6 as exporters and 3 are involved in various cellular processes. Eight of these 25 ABC system families were deduced to be the core set of ABC systems conserved in all BALOs. All Bacteriovorax strains have 28 or fewer ABC systems. In contrast, the freshwater Bdellovibrio strains have more ABC systems, typically around 51. In the genome of Bdellovibrio exovorus JSS (CP003537.1), 53 putative ABC systems were detected, representing the highest number among all the BALO genomes examined in this study. Unexpectedly high numbers of ABC systems involved in cellular processes were found in all BALOs. Phylogenetic analysis suggests that the majority of ABC proteins can be assigned into many separate families with high bootstrap supports (>50%). In this study, a general framework of sequence–structure–function connections for the ABC systems in BALOs was revealed, providing novel insights for future investigations. PMID:25707746

  7. Comparative genomic in situ hybridization (cGISH) analysis of the genomic relationships among Sinapis arvensis, Brassica rapa and Brassica nigra.

    PubMed

    Mao, Shufang; Han, Yonghua; Wu, Xiaoming; An, Tingting; Tang, Jiali; Shen, Junjun; Li, Zongyun

    2012-06-01

    To further understand the relationships between the SS genome of Sinapis arvensis and the AA and BB genomes in Brassica, genomic DNA of Sinapis arvensis was hybridized to the metaphase chromosomes of Brassica nigra (BB genome), and to the metaphase chromosomes and interphase nucleus of Brassica rapa (AA genome), by comparative genomic in situ hybridization (cGISH). As a result, every chromosome of B. nigra had signals along the whole chromosomal length. However, only half of the condensed heterochromatic areas in the interphase nucleus and the chromosomes showed rich signals in Brassica rapa. The interphase nucleus and metaphase chromosomes of S. arvensis were simultaneously hybridized with digoxigenin-labeled genomic DNA of B. nigra and biotin-labeled genomic DNA of B. rapa. Signals of genomic DNA of B. nigra hybridized throughout the length of all chromosomes and all the condensed heterochromatic areas in the interphase nucleus, except chromosome 4, on which signals were weak in centromeric regions. Signals of the genomic DNA of B. rapa patterned most areas of ten chromosomes and ten condensed heterochromatic areas; the others had fewer signals. The results showed that the SS genome had homology with the AA and BB genomes, but the homology between the SS genome and the AA genome was clearly lower than that between the SS genome and the BB genome. PMID:22804340

  8. Genome-Wide Comparative Analysis of 20 Miniature Inverted-Repeat Transposable Element Families in Brassica rapa and B. oleracea

    PubMed Central

    Sampath, Perumal; Murukarthick, Jayakodi; Izzah, Nur Kholilatul; Lee, Jonghoon; Choi, Hong-Il; Shirasawa, Kenta; Choi, Beom-Soon; Liu, Shengyi; Nou, Ill-Sup; Yang, Tae-Jin

    2014-01-01

    Miniature inverted-repeat transposable elements (MITEs) are ubiquitous, non-autonomous class II transposable elements. Here, we conducted genome-wide comparative analysis of 20 MITE families in B. rapa, B. oleracea, and Arabidopsis thaliana. A total of 5894 and 6026 MITE members belonging to the 20 families were found in the whole genome pseudo-chromosome sequences of B. rapa and B. oleracea, respectively. Meanwhile, only four of the 20 families, comprising 573 members, were identified in the Arabidopsis genome, indicating that most of the families were activated in the Brassica genus after divergence from Arabidopsis. Copy numbers varied from 4 to 1459 for each MITE family, and there was up to 6-fold variation between B. rapa and B. oleracea. In particular, analysis of intact members showed that whereas eleven families were present in similar copy numbers in B. rapa and B. oleracea, nine families showed copy number variation ranging from 2- to 16-fold. Four of those families (BraSto-3, BraTo-3, 4, 5) were more abundant in B. rapa, and the other five (BraSto-1, BraSto-4, BraTo-1, 7 and BraHAT-1) were more abundant in B. oleracea. Overall, 54% and 51% of the MITEs resided in or within 2 kb of a gene in the B. rapa and B. oleracea genomes, respectively. Notably, 92 MITEs were found within the CDS of annotated genes, suggesting that MITEs might play roles in diversification of genes in the recently triplicated Brassica genome. MITE insertion polymorphism (MIP) analysis of 289 MITE members showed that 52% and 23% were polymorphic at the inter- and intra-species levels, respectively, indicating that there has been recent MITE activity in the Brassica genome. These recently activated MITE families with abundant MIP will provide useful resources for molecular breeding and identification of novel functional genes arising from MITE insertion. PMID:24747717

  10. Genome-wide comparative analysis of 20 miniature inverted-repeat transposable element families in Brassica rapa and B. oleracea.

    PubMed

    Sampath, Perumal; Murukarthick, Jayakodi; Izzah, Nur Kholilatul; Lee, Jonghoon; Choi, Hong-Il; Shirasawa, Kenta; Choi, Beom-Soon; Liu, Shengyi; Nou, Ill-Sup; Yang, Tae-Jin

    2014-01-01

    Miniature inverted-repeat transposable elements (MITEs) are ubiquitous, non-autonomous class II transposable elements. Here, we conducted genome-wide comparative analysis of 20 MITE families in B. rapa, B. oleracea, and Arabidopsis thaliana. A total of 5894 and 6026 MITE members belonging to the 20 families were found in the whole genome pseudo-chromosome sequences of B. rapa and B. oleracea, respectively. Meanwhile, only four of the 20 families, comprising 573 members, were identified in the Arabidopsis genome, indicating that most of the families were activated in the Brassica genus after divergence from Arabidopsis. Copy numbers varied from 4 to 1459 for each MITE family, and there was up to 6-fold variation between B. rapa and B. oleracea. In particular, analysis of intact members showed that whereas eleven families were present in similar copy numbers in B. rapa and B. oleracea, nine families showed copy number variation ranging from 2- to 16-fold. Four of those families (BraSto-3, BraTo-3, 4, 5) were more abundant in B. rapa, and the other five (BraSto-1, BraSto-4, BraTo-1, 7 and BraHAT-1) were more abundant in B. oleracea. Overall, 54% and 51% of the MITEs resided in or within 2 kb of a gene in the B. rapa and B. oleracea genomes, respectively. Notably, 92 MITEs were found within the CDS of annotated genes, suggesting that MITEs might play roles in diversification of genes in the recently triplicated Brassica genome. MITE insertion polymorphism (MIP) analysis of 289 MITE members showed that 52% and 23% were polymorphic at the inter- and intra-species levels, respectively, indicating that there has been recent MITE activity in the Brassica genome. These recently activated MITE families with abundant MIP will provide useful resources for molecular breeding and identification of novel functional genes arising from MITE insertion. PMID:24747717
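
    The "in or within 2 kb of a gene" statistic above is an interval-proximity count. The following is only a minimal sketch of how such a fraction could be computed, not the authors' pipeline: the function names, the 0-based half-open (chromosome, start, end) tuples and the toy coordinates are all assumptions made for illustration.

```python
def is_near_gene(mite, genes, window=2000):
    """True if the MITE overlaps a gene or lies within `window` bp of one."""
    chrom, m_start, m_end = mite
    for g_chrom, g_start, g_end in genes:
        if g_chrom != chrom:
            continue
        # Overlap test after padding the gene by `window` bp on both sides.
        if m_start < g_end + window and m_end > g_start - window:
            return True
    return False


def fraction_near_genes(mites, genes, window=2000):
    """Fraction of MITEs located in or within `window` bp of any gene."""
    if not mites:
        return 0.0
    hits = sum(is_near_gene(m, genes, window) for m in mites)
    return hits / len(mites)


# Hypothetical coordinates, not data from the study.
genes = [("A01", 10_000, 12_000), ("A01", 50_000, 53_000)]
mites = [
    ("A01", 12_500, 12_800),  # 500 bp downstream of the first gene
    ("A01", 30_000, 30_300),  # far from both genes
    ("A01", 51_000, 51_200),  # inside the second gene
    ("A02", 10_500, 10_700),  # chromosome with no annotated gene
]
near_fraction = fraction_near_genes(mites, genes)
```

    The nested loop is quadratic in the number of features, which is fine for a toy example; for whole-genome annotations a sorted or indexed interval structure would be the more usual choice.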

  11. Comparative Genome Sequence Analysis of Multidrug-Resistant Acinetobacter baumannii

    PubMed Central

    Adams, Mark D.; Goglin, Karrie; Molyneaux, Neil; Hujer, Kristine M.; Lavender, Heather; Jamison, Jennifer J.; MacDonald, Ian J.; Martin, Kristienna M.; Russo, Thomas; Campagnari, Anthony A.; Hujer, Andrea M.; Bonomo, Robert A.; Gill, Steven R.

    2008-01-01

    The recent emergence of multidrug resistance (MDR) in Acinetobacter baumannii has raised concern in health care settings worldwide. In order to understand the repertoire of resistance determinants and their organization and origins, we compared the genome sequences of three MDR and three drug-susceptible A. baumannii isolates. The entire MDR phenotype can be explained by the acquisition of discrete resistance determinants distributed throughout the genome. A comparison of closely related MDR and drug-susceptible isolates suggests that drug efflux may be a less significant contributor to resistance to certain classes of antibiotics than inactivation enzymes are. A resistance island with a variable composition of resistance determinants interspersed with transposons, integrons, and other mobile genetic elements is a significant but not universal contributor to the MDR phenotype. Four hundred seventy-five genes are shared among all six clinical isolates but absent from the related environmental species Acinetobacter baylyi ADP1. These genes are enriched for transcription factors and transporters and suggest physiological features of A. baumannii that are related to adaptation for growth in association with humans. PMID:18931120

  12. Microdeletion and Microduplication Analysis of Chinese Conotruncal Defects Patients with Targeted Array Comparative Genomic Hybridization

    PubMed Central

    Ma, Xiaojing; Wu, Dandan; Zhang, Ting; He, Li; Qin, Shengying; Li, Xiaotian

    2013-01-01

    Objective: The current study aimed to develop a reliable targeted array comparative genomic hybridization (aCGH) to detect microdeletions and microduplications in congenital conotruncal defects (CTDs), especially in the 22q11.2 region, and for some other chromosomal aberrations, such as 5p15-5p, 7q11.23 and 4p16.3. Methods: Twenty-seven patients with CTDs, including 12 with pulmonary atresia (PA), 10 with double-outlet right ventricle (DORV), 3 with transposition of great arteries (TGA), 1 with tetralogy of Fallot (TOF) and 1 with ventricular septal defect (VSD), were enrolled in this study and screened for pathogenic copy number variations (CNVs), using Agilent 8 x 15K targeted aCGH. Real-time quantitative polymerase chain reaction (qPCR) was performed to test the molecular results of targeted aCGH. Results: Four of 27 patients (14.8%) had 22q11.2 CNVs: 1 microdeletion and 3 microduplications. The qPCR test confirmed the microdeletion and microduplications detected by the targeted aCGH. Conclusion: Chromosomal abnormalities are a well-known cause of multiple congenital anomalies (MCA). This aCGH, using arrays with high-density coverage in the targeted regions, can detect genomic imbalances, including 22q11.2 and 10 other kinds of CNVs, effectively and quickly. This approach has the potential to be applied to detect aneuploidy and common microdeletion/microduplication syndromes on a single microarray. PMID:24098474

  13. Characterization and comparative analysis of a simian foamy virus complete genome isolated from Brazilian capuchin monkeys.

    PubMed

    Troncoso, Lian L; Muniz, Cláudia P; Siqueira, Juliana D; Curty, Gislaine; Schrago, Carlos G; Augusto, Anderson; Fedullo, Luiz; Soares, Marcelo A; Santos, André F

    2015-10-01

    Foamy viruses infect a wide range of placental mammals, including primates. However, despite the great diversity of New World primates, only three strains of neotropical simian foamy viruses (SFV) have been described. Only 40 years after its serological characterization was the complete sequence of an SFVcap strain infecting a family of six capuchin monkeys (Sapajus xanthosternos) obtained. Co-culture of primate peripheral blood mononuclear cells with Cf2Th canine cells was established and monitored for the appearance of cytopathic effects, PCR amplification of integrated SFV proviral genome and viral reverse transcriptase activity. The novel SFVcap was fully sequenced through a next-generation sequencing protocol. Phylogenetic analysis of the complete genome grouped SFVcap and SFVmar, both infecting primate species of the Cebidae family, with a genetic similarity of approximately 85%. Similar ORF sizes were observed among SFV from neotropical primates, and env and pol genes were the most conserved. Neotropical SFV presented the smallest LTRs among exogenous mammalians. The novel SFVcap strain provides a valuable research tool for the FV community.

  14. The LINEs and SINEs of Entamoeba histolytica: comparative analysis and genomic distribution.

    PubMed

    Bakre, Abhijeet A; Rawal, Kamal; Ramaswamy, Ram; Bhattacharya, Alok; Bhattacharya, Sudha

    2005-07-01

    Autonomous non-long terminal repeat retrotransposons are commonly referred to as long interspersed elements (LINEs). Short non-autonomous elements that borrow the LINE machinery are called SINEs. The Entamoeba histolytica genome contains three classes of LINEs and SINEs. Together the EhLINEs/SINEs account for about 6% of the genome. The recognizable functional domains in all three EhLINEs included reverse transcriptase and endonuclease. A novel feature was the presence of two types of members in both EhLINE1 and 2: some with a single long ORF (less frequent) and some with two ORFs (more frequent). The two ORFs were generated by conserved changes leading to a stop codon. Computational analysis of the immediate flanking sequences for each element showed that they inserted in AT-rich sequences, with a preponderance of Ts in the upstream site. The elements were very frequently located close to protein-coding genes and other EhLINEs/SINEs. The possible influence of these elements on expression of neighboring genes needs to be determined.

  15. The Sinorhizobium fredii HH103 Genome: A Comparative Analysis With S. fredii Strains Differing in Their Symbiotic Behavior With Soybean.

    PubMed

    Vinardell, José-María; Acosta-Jurado, Sebastián; Zehner, Susanne; Göttfert, Michael; Becker, Anke; Baena, Irene; Blom, Jochem; Crespo-Rivas, Juan Carlos; Goesmann, Alexander; Jaenicke, Sebastian; Krol, Elizaveta; McIntosh, Matthew; Margaret, Isabel; Pérez-Montaño, Francisco; Schneiker-Bekel, Susanne; Serranía, Javier; Szczepanowski, Rafael; Buendía, Ana-María; Lloret, Javier; Bonilla, Ildefonso; Pühler, Alfred; Ruiz-Sainz, José-Enrique; Weidner, Stefan

    2015-07-01

    Sinorhizobium fredii HH103 is a fast-growing rhizobial strain infecting a broad range of legumes including both American and Asiatic soybeans. In this work, we present the sequencing and annotation of the HH103 genome (7.25 Mb), consisting of one chromosome and six plasmids and representing the structurally most complex sinorhizobial genome sequenced so far. Comparative genomic analyses of S. fredii HH103 with strains USDA257 and NGR234 showed that the core genome of these three strains contains 4,212 genes (61.7% of the HH103 genes). Synteny plot analysis revealed that the much larger chromosome of USDA257 (6.48 Mb) is colinear to the HH103 (4.3 Mb) and NGR234 chromosomes (3.9 Mb). An additional region of the USDA257 chromosome of about 2 Mb displays similarity to plasmid pSfHH103e. Remarkable differences exist between HH103 and NGR234 concerning nod genes, flavonoid effect on surface polysaccharide production, and quorum-sensing systems. Furthermore, a number of protein secretion systems have been found. Two genes coding for putative type III-secreted effectors not previously described in S. fredii, nopI and gunA, have been located on the HH103 genome. These differences could be important to understand the different symbiotic behavior of S. fredii strains HH103, USDA257, and NGR234 with soybean.
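
    The core-genome figure above (4,212 genes, 61.7% of the HH103 genes) is the kind of number obtained by intersecting per-strain sets of orthologous genes. The following is a minimal sketch under stated assumptions, not the method used in the study: the ortholog-group identifiers (og1, og2, ...) and the toy strain contents are hypothetical, and real analyses first need an orthology assignment step that is not shown here.

```python
def core_genome(gene_sets):
    """Ortholog groups present in every strain (set intersection)."""
    sets = [set(groups) for groups in gene_sets.values()]
    return set.intersection(*sets) if sets else set()


# Hypothetical ortholog-group IDs per strain, not data from the study.
strains = {
    "HH103": {"og1", "og2", "og3", "og4", "og5"},
    "USDA257": {"og1", "og2", "og3", "og6"},
    "NGR234": {"og1", "og2", "og4", "og7"},
}

core = core_genome(strains)
# Share of the reference strain's genes that belong to the core genome.
core_share = len(core) / len(strains["HH103"])

# Back-of-the-envelope check on the abstract's own figures: 4,212 core genes
# at 61.7% imply a total of roughly 6,800 HH103 genes (approximate, because
# the percentage is rounded).
implied_hh103_genes = round(4212 / 0.617)
```

    In the toy data only og1 and og2 occur in all three strains, so the core share of the reference strain is 2 of 5 genes.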

  16. Structural RNAs of known and unknown function identified in malaria parasites by comparative genomics and RNA analysis

    PubMed Central

    Chakrabarti, Kausik; Pearson, Michael; Grate, Leslie; Sterne-Weiler, Timothy; Deans, Jonathan; Donohue, John Paul; Ares, Manuel

    2007-01-01

    As the genomes of more eukaryotic pathogens are sequenced, understanding how molecular differences between parasite and host might be exploited to provide new therapies has become a major focus. Central to cell function are RNA-containing complexes involved in gene expression, such as the ribosome, the spliceosome, snoRNAs, RNase P, and telomerase, among others. In this article we identify by comparative genomics and validate by RNA analysis numerous previously unknown structural RNAs encoded by the Plasmodium falciparum genome, including the telomerase RNA, U3, 31 snoRNAs, as well as previously predicted spliceosomal snRNAs, SRP RNA, MRP RNA, and RNAse P RNA. Furthermore, we identify six new RNA coding genes of unknown function. To investigate the relationships of the RNA coding genes to other genomic features in related parasites, we developed a genome browser for P. falciparum (http://areslab.ucsc.edu/cgi-bin/hgGateway). Additional experiments provide evidence supporting the prediction that snoRNAs guide methylation of a specific position on U4 snRNA, as well as predicting an snRNA promoter element particular to Plasmodium sp. These findings should allow detailed structural comparisons between the RNA components of the gene expression machinery of the parasite and its vertebrate hosts. PMID:17901154

  17. The mitochondrial genome of the red alga Kappaphycus striatus ("Green Sacol" variety): complete nucleotide sequence, genome structure and organization, and comparative analysis.

    PubMed

    Tablizo, Francis A; Lluisma, Arturo O

    2014-12-01

    The complete mitochondrial (mt) DNA sequence of the rhodophyte Kappaphycus striatus ("Green Sacol" variety) was determined. The mtDNA is circular, 25,242 bases long (A+T content: 69.94%), and contains 50 densely packed genes comprising 93.22% of the mitochondrial genome, with genes encoded on both strands. Through comparative analysis, the overall sequence, genome structure, and organization of K. striatus mtDNA were seen to be highly similar to those of other fully sequenced mitochondrial genomes of the class Florideophyceae. On the other hand, certain degrees of genome rearrangement and greater sequence dissimilarities were observed for the mtDNAs of other evolutionarily distant red algae, such as those from the classes Bangiophyceae and Cyanidiophyceae, compared to that of K. striatus. Furthermore, a trend was observed wherein the red algal mtDNAs tend to encode fewer protein-coding genes, although the mtDNAs are not necessarily shorter, as the organism becomes more morphologically complex. This trend is supported by the phylogenetic tree inferred from the concatenated amino acid sequences of the deduced protein products of cytochrome c oxidase subunit genes (cox1, 2, and 3).

  18. Genome-wide comparative analysis reveals similar types of NBS genes in hybrid Citrus sinensis genome and original Citrus clementine genome and provides new insights into non-TIR NBS genes.

    PubMed

    Wang, Yunsheng; Zhou, Lijuan; Li, Dazhi; Dai, Liangying; Lawton-Rauh, Amy; Srimani, Pradip K; Duan, Yongping; Luo, Feng

    2015-01-01

    In this study, we identified and compared nucleotide-binding site (NBS) domain-containing genes from three Citrus genomes (C. clementina, C. sinensis from USA and C. sinensis from China). Phylogenetic analysis of all Citrus NBS genes across these three genomes revealed that there are three approximately evenly numbered groups: one group containing the Toll-Interleukin receptor (TIR) domain and two different non-TIR groups in which most of the proteins contain the Coiled Coil (CC) domain. Motif analysis confirmed that the two groups of CC-containing NBS genes are from different evolutionary origins. We partitioned NBS genes into clades using NBS domain sequence distances and found that most clades include NBS genes from all three Citrus genomes. This suggests that the three Citrus genomes have similar numbers and types of NBS genes. We also mapped the re-sequenced reads of three pomelo and three mandarin genomes onto the C. sinensis genome. We found that most NBS genes of the hybrid C. sinensis genome have corresponding homologous genes in both pomelo and mandarin genomes. The homologous NBS genes in pomelo and mandarin suggest that the parental species of C. sinensis may contain similar types of NBS genes. This explains why the hybrid C. sinensis and original C. clementina have similar types of NBS genes in this study. Furthermore, we found that sequence variation amongst Citrus NBS genes was shaped by multiple independent and shared accelerated mutation accumulation events among different groups of NBS genes and in different Citrus genomes. Our comparative analyses yield valuable insight into the structure, organization and evolution of NBS genes in Citrus genomes. Furthermore, our comprehensive analysis showed that the non-TIR NBS genes can be divided into two groups that come from different evolutionary origins. This provides new insights into non-TIR genes, which have not received much attention.

  19. Comparative Genomic Analysis Indicates that Niche Adaptation of Terrestrial Flavobacteria Is Strongly Linked to Plant Glycan Metabolism

    PubMed Central

    Kolton, Max; Sela, Noa; Elad, Yigal; Cytryn, Eddie

    2013-01-01

    Flavobacteria are important members of aquatic and terrestrial bacterial communities, displaying extreme variations in lifestyle, geographical distribution and genome size. They are ubiquitous in soil, but are often strongly enriched in the rhizosphere and phyllosphere of plants. In this study, we compared the genome of a root-associated Flavobacterium that we recently isolated, physiologically characterized and sequenced, to 14 additional Flavobacterium genomes, in order to pinpoint characteristics associated with its high abundance in the rhizosphere. Interestingly, flavobacterial genomes vary in size by approximately two-fold, with terrestrial isolates having predominantly larger genomes than those from aquatic environments. Comparative functional gene analysis revealed that terrestrial and aquatic Flavobacteria generally segregated into two distinct clades. Members of the aquatic clade had a higher ratio of peptide and protein utilization genes, whereas members of the terrestrial clade were characterized by a significantly higher abundance and diversity of genes involved in metabolism of carbohydrates such as xylose, arabinose and pectin. Interestingly, genes encoding glycoside hydrolase (GH) families GH78 and GH106, responsible for rhamnogalacturonan utilization (exclusively associated with terrestrial plant hemicelluloses), were only present in terrestrial clade genomes, suggesting adaptation of the terrestrial strains to plant-related carbohydrate metabolism. The Peptidase/GH ratio of aquatic clade Flavobacteria was significantly higher than that of terrestrial strains (9.7±4.7 and 1.7±0.7, respectively), supporting the concept that this relation can be used to infer Flavobacterium lifestyles. Collectively, our research suggests that terrestrial Flavobacteria are highly adapted to plant carbohydrate metabolism, which appears to be a key to their profusion in plant environments. PMID:24086761
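
    The Peptidase/GH ratio mentioned above is a per-genome quotient that is then summarized per clade. The following is a minimal sketch of that calculation, assuming gene counts per genome are already available from annotation; the function names, genome labels and counts are hypothetical and are not data from the study.

```python
from statistics import mean


def peptidase_gh_ratio(n_peptidases, n_glycoside_hydrolases):
    """Ratio of peptidase genes to glycoside hydrolase (GH) genes in one genome."""
    if n_glycoside_hydrolases <= 0:
        raise ValueError("GH gene count must be positive")
    return n_peptidases / n_glycoside_hydrolases


def mean_ratio_by_clade(genomes):
    """Average the per-genome Peptidase/GH ratio within each clade."""
    ratios = {}
    for clade, n_pep, n_gh in genomes.values():
        ratios.setdefault(clade, []).append(peptidase_gh_ratio(n_pep, n_gh))
    return {clade: mean(values) for clade, values in ratios.items()}


# Hypothetical (clade, peptidase count, GH count) per genome.
genomes = {
    "terrestrial_1": ("terrestrial", 120, 80),
    "terrestrial_2": ("terrestrial", 100, 50),
    "aquatic_1": ("aquatic", 90, 10),
    "aquatic_2": ("aquatic", 110, 10),
}

clade_means = mean_ratio_by_clade(genomes)
```

    With these toy counts the aquatic genomes come out with the higher mean ratio, which is the direction the abstract describes; a real comparison would also report the spread and a significance test.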

  20. The complete nucleotide sequence and RNA editing content of the mitochondrial genome of rapeseed (Brassica napus L.): comparative analysis of the mitochondrial genomes of rapeseed and Arabidopsis thaliana.

    PubMed

    Handa, Hirokazu

    2003-10-15

    The entire mitochondrial genome of rapeseed (Brassica napus L.) was sequenced and compared with that of Arabidopsis thaliana. The 221 853 bp genome contains 34 protein-coding genes, three rRNA genes and 17 tRNA genes. This gene content is almost identical to that of Arabidopsis. However, the rps14 gene, which is a pseudo-gene in Arabidopsis, is intact in rapeseed. On the other hand, five tRNA genes are missing in rapeseed compared to Arabidopsis, although the set of mitochondrially encoded tRNA species is identical in the two Cruciferae. RNA editing events were systematically investigated on the basis of the sequence of the rapeseed mitochondrial genome. A total of 427 C to U conversions were identified in ORFs, which is nearly identical to the number in Arabidopsis (441 sites). The gene sequences and intron structures are mostly conserved (more than 99% similarity for protein-coding regions); however, only 358 editing sites (83% of total editings) are shared by rapeseed and Arabidopsis. Non-coding regions are mostly divergent between the two plants. One-third (about 78.7 kb) and two-thirds (about 223.8 kb) of the rapeseed and Arabidopsis mitochondrial genomes, respectively, cannot be aligned with each other and most of these regions do not show any homology to sequences registered in the DNA databases. The results of the comparative analysis between the rapeseed and Arabidopsis mitochondrial genomes suggest that higher plant mitochondria are extremely conservative with respect to coding sequences and somewhat conservative with respect to RNA editing, but that non-coding parts of plant mitochondrial DNA are extraordinarily dynamic with respect to structural changes, sequence acquisition and/or sequence loss.
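
    The editing-site overlap quoted in this record can be checked with simple arithmetic. The Python sketch below uses only the three counts given in the abstract (427, 441 and 358). The function names are illustrative, and the Jaccard index is an extra summary added here as an assumption, not a statistic reported by the author.

```python
def percent_shared(shared: int, total: int) -> float:
    """Percentage of one species' editing sites that are shared."""
    return 100.0 * shared / total


def jaccard(shared: int, total_a: int, total_b: int) -> float:
    """Shared sites divided by the union of both site sets."""
    return shared / (total_a + total_b - shared)


# Counts taken from the abstract.
rapeseed_sites = 427
arabidopsis_sites = 441
shared_sites = 358

print(f"shared, relative to rapeseed:    {percent_shared(shared_sites, rapeseed_sites):.1f}%")
print(f"shared, relative to Arabidopsis: {percent_shared(shared_sites, arabidopsis_sites):.1f}%")
print(f"Jaccard index:                   {jaccard(shared_sites, rapeseed_sites, arabidopsis_sites):.3f}")
```

    The first figure corresponds to the "83% of total editings" in the abstract, which appears to be expressed relative to the rapeseed total.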

  1. Array comparative genomic hybridization analysis of small supernumerary marker chromosomes in human infertility.

    PubMed

    Guediche, N; Tosca, L; Kara Terki, A; Bas, C; Lecerf, L; Young, J; Briand-Suleau, A; Tou, B; Bouligand, J; Brisset, S; Misrahi, M; Guiochon-Mantel, A; Goossens, M; Tachdjian, G

    2012-01-01

    Small supernumerary marker chromosomes (sSMC) are structurally abnormal chromosomes that cannot be unambiguously identified by conventional banding cytogenetics. This study describes four patients with sSMC in relation to infertility. Patient 1 had primary infertility. His fertile brother (patient 2) carried the same sSMC. Patient 3 presented with polycystic ovary syndrome and patient 4 with primary ovarian insufficiency. Cytogenetic studies, array comparative genomic hybridization (CGH) and sperm analyses were compared with cases previously reported. sSMC corresponded to the 15q11.2 region (patients 1 and 2), the centromeric chromosome 15 region (patient 3) and the 21p11.2 region (patient 4). Array CGH showed 3.6-Mb gain for patients 1 and 2 and 0.266-Mb gain for patient 4. Sperm fluorescent in-situ hybridization analyses found ratios of 0.37 and 0.30 of sperm nuclei with sSMC(15) for patients 1 and 2, respectively (P < 0.001). An increase of sperm nuclei with disomy X, Y and 18 was noted for patient 1 compared with control and patient 2 (P < 0.001). Among the genes mapped in the unbalanced chromosomal regions, POTE B and BAGE are related to the testis and ovary, respectively. The implication of sSMC in infertility could be due to duplication, but also to mechanical effects perturbing meiosis.

  2. Genomicus: five genome browsers for comparative genomics in eukaryota.

    PubMed

    Louis, Alexandra; Muffato, Matthieu; Roest Crollius, Hugues

    2013-01-01

    Genomicus (http://www.dyogen.ens.fr/genomicus/) is a database and an online tool that allows easy comparative genomic visualization in >150 eukaryote genomes. It provides a way to explore spatial information related to gene organization within and between genomes and temporal relationships related to gene and genome evolution. For the specific vertebrate phylum, it also provides access to ancestral gene order reconstructions and conserved non-coding elements information. We extended the Genomicus database, originally dedicated to vertebrates, to four new clades, including plants, non-vertebrate metazoa, protists and fungi. This visualization tool allows evolutionary phylogenomics analysis and exploration. Here, we describe the graphical modules of Genomicus and show how it can reveal differential gene loss and gain and segmental or genome duplications, and how it can be used to study the evolution of a locus through homology relationships.

  3. Comparative genomic analysis of the swine pathogen Bordetella bronchiseptica strain KM22 to other B. bronchiseptica sequenced genomes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    B. bronchiseptica is pervasive in swine and plays multiple roles in respiratory disease as well as enhancing colonization by other bacterial pathogens and increasing the severity of disease associated with both viral and bacterial pathogens. The goal of this study was to use the genome sequence of K...

  5. Novel technologies applied to the nucleotide sequencing and comparative sequence analysis of the genomes of infectious agents in veterinary medicine.

    PubMed

    Granberg, F; Bálint, Á; Belák, S

    2016-04-01

    Next-generation sequencing (NGS), also referred to as deep, high-throughput or massively parallel sequencing, is a powerful new tool that can be used for the complex diagnosis and intensive monitoring of infectious disease in veterinary medicine. NGS technologies are also being increasingly used to study the aetiology, genomics, evolution and epidemiology of infectious disease, as well as host-pathogen interactions and other aspects of infection biology. This review briefly summarises recent progress and achievements in this field by first introducing a range of novel techniques and then presenting examples of NGS applications in veterinary infection biology. Various work steps and processes for sampling and sample preparation, sequence analysis and comparative genomics, and improving the accuracy of genomic prediction are discussed, as are bioinformatics requirements. Examples of sequencing-based applications and comparative genomics in veterinary medicine are then provided. This review is based on novel references selected from the literature and on experiences of the World Organisation for Animal Health (OIE) Collaborating Centre for the Biotechnology-based Diagnosis of Infectious Diseases in Veterinary Medicine, Uppsala, Sweden. PMID:27217166

  7. Comparative Genomics and Metabolic Analysis Reveals Peculiar Characteristics of Rhodococcus opacus Strain M213 Particularly for Naphthalene Degradation.

    PubMed

    Pathak, Ashish; Chauhan, Ashvini; Blom, Jochen; Indest, Karl J; Jung, Carina M; Stothard, Paul; Bera, Gopal; Green, Stefan J; Ogram, Andrew

    2016-01-01

    The genome of Rhodococcus opacus strain M213, isolated from a fuel-oil contaminated soil, was sequenced and annotated, which revealed a genome size of 9,194,165 bp encoding 8680 putative genes and a G+C content of 66.72%. Among the protein coding genes, 71.77% were annotated as clusters of orthologous groups of proteins (COGs); 55% of the COGs were present as paralog clusters. Pulsed field gel electrophoresis (PFGE) analysis of M213 revealed the presence of three different sized replicons: a circular chromosome and two megaplasmids (pNUO1 and pNUO2) estimated to be 750 kb and 350 kb in size, respectively. Conversely, using an alternative approach of optical mapping, the plasmid replicons appeared as a circular ~1.2 Mb megaplasmid and a linear, ~0.7 Mb megaplasmid. Genome-wide comparative analysis of M213 with a cohort of sequenced Rhodococcus species revealed low syntenic affiliation with other R. opacus species including strains B4 and PD630. Conversely, a closer affiliation of M213, at the functional (COG) level, was observed with the catabolically versatile R. jostii strain RHA1 and other Rhodococcii such as R. wratislaviensis strain IFP 2016, R. imtechensis strain RKJ300, Rhodococcus sp. strain JVH1, and Rhodococcus sp. strain DK17, respectively. An in-depth, genome-wide comparison between these functional relatives revealed 971 unique genes in M213 representing 11% of its total genome; many associating with catabolic functions. Of major interest was the identification of as many as 154 genomic islands (GEIs), many with duplicated catabolic genes, in particular for PAHs; a trait that was confirmed by PCR-based identification of naphthalene dioxygenase (NDO) as a representative gene, across PFGE-resolved replicons of strain M213. Interestingly, several plasmid/GEI-encoded genes, that likely participate in degrading naphthalene (NAP) via a peculiar pathway, were also identified in strain M213 using a combination of bioinformatics, metabolic analysis and gene

  9. Comparative mapping of Raphanus sativus genome using Brassica markers and quantitative trait loci analysis for the Fusarium wilt resistance trait.

    PubMed

    Yu, Xiaona; Choi, Su Ryun; Ramchiary, Nirala; Miao, Xinyang; Lee, Su Hee; Sun, Hae Jeong; Kim, Sunggil; Ahn, Chun Hee; Lim, Yong Pyo

    2013-10-01

    Fusarium wilt (FW), caused by the soil-borne fungal pathogen Fusarium oxysporum, is a serious disease in cruciferous plants, including the radish (Raphanus sativus). To identify quantitative trait loci (QTL) or gene(s) conferring resistance to FW, we constructed a genetic map of R. sativus using an F2 mapping population derived by crossing the inbred lines '835' (susceptible) and 'B2' (resistant). A total of 220 markers distributed in 9 linkage groups (LGs) were mapped in the Raphanus genome, covering a distance of 1,041.5 cM with an average distance between adjacent markers of 4.7 cM. Comparative analysis of the R. sativus genome with that of Arabidopsis thaliana and Brassica rapa revealed 21 and 22 conserved syntenic regions, respectively. QTL mapping detected a total of 8 loci conferring FW resistance that were distributed on 4 LGs, namely, 2, 3, 6, and 7 of the Raphanus genome. Of the detected QTL, 3 QTLs (2 on LG 3 and 1 on LG 7) were constitutively detected throughout the 2-year experiment. QTL analysis of LG 3, flanked by ACMP0609 and cnu_mBRPGM0085, showed a comparatively higher logarithm of the odds (LOD) value and percentage of phenotypic variation. Synteny analysis using the linked markers to this QTL showed homology to A. thaliana chromosome 3, which contains disease-resistance gene clusters, suggesting conservation of resistance genes between them. PMID:23864230
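
    The average marker spacing quoted in this record follows from the map length and marker count. The Python sketch below reproduces it from the numbers in the abstract. The function name and the optional per-interval convention are assumptions added for illustration; the abstract does not state which convention the authors used.

```python
def mean_marker_spacing(map_length_cm: float, n_markers: int,
                        n_linkage_groups: int = 0) -> float:
    """Average spacing in cM.

    With n_linkage_groups=0 this is map length divided by marker count.
    Passing the number of linkage groups divides by the number of
    marker-to-marker intervals instead (markers minus linkage groups).
    """
    intervals = n_markers - n_linkage_groups
    if intervals <= 0:
        raise ValueError("need more markers than linkage groups")
    return map_length_cm / intervals


# Values taken from the abstract: 1,041.5 cM, 220 markers, 9 linkage groups.
per_marker = mean_marker_spacing(1041.5, 220)
per_interval = mean_marker_spacing(1041.5, 220, n_linkage_groups=9)

print(f"length / markers:   {per_marker:.2f} cM")
print(f"length / intervals: {per_interval:.2f} cM")
```

    The length-per-marker convention gives a value that rounds to the 4.7 cM reported; the per-interval convention gives a slightly larger figure.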

  10. Copy number analysis of the low-copy repeats at the primate NPHP1 locus by array comparative genomic hybridization.

    PubMed

    Yuan, Bo; Liu, Pengfei; Rogers, Jeffrey; Lupski, James R

    2016-06-01

    Array comparative genomic hybridization (aCGH) has been widely used to detect copy number variants (CNVs) in both research and clinical settings. A customizable aCGH platform may greatly facilitate copy number analyses in genomic regions with higher-order complexity, such as low-copy repeats (LCRs). Here we present the aCGH analyses focusing on the 45 kb LCRs [1] at the NPHP1 region with diverse copy numbers in humans. Also, the interspecies aCGH analysis comparing human and nonhuman primates revealed dynamic copy number transitions of the human 45 kb LCR orthologues during primate evolution and therefore shed light on the origin of complexity at this locus. The original aCGH data are available at GEO under GSE73962. PMID:27222811

  11. Comparative genomic analyses in Asparagus.

    PubMed

    Kuhl, Joseph C; Havey, Michael J; Martin, William J; Cheung, Foo; Yuan, Qiaoping; Landherr, Lena; Hu, Yi; Leebens-Mack, James; Town, Christopher D; Sink, Kenneth C

    2005-12-01

    Garden asparagus (Asparagus officinalis L.) belongs to the monocot family Asparagaceae in the order Asparagales. Onion (Allium cepa L.) and Asparagus officinalis are 2 of the most economically important plants of the core Asparagales, a well supported monophyletic group within the Asparagales. Coding regions in onion have lower GC contents than the grasses. We compared the GC content of 3374 unique expressed sequence tags (ESTs) from A. officinalis with Lycoris longituba and onion (both members of the core Asparagales), Acorus americanus (sister to all other monocots), the grasses, and Arabidopsis. Although ESTs in A. officinalis and Acorus had a higher average GC content than Arabidopsis, Lycoris, and onion, all were clearly lower than the grasses. The Asparagaceae have the smallest nuclear genomes among all plants in the core Asparagales, which typically have huge genomes. Within the Asparagaceae, European Asparagus species have approximately twice the nuclear DNA of that of southern African Asparagus species. We cloned and sequenced 20 genomic amplicons from European A. officinalis and the southern African species Asparagus plumosus and observed no clear evidence for a recent genome doubling in A. officinalis relative to A. plumosus. These results indicate that members of the genus Asparagus with smaller genomes may be useful genomic models for plants in the core Asparagales. PMID:16391674
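
    The GC-content comparison in this record rests on a per-sequence base count averaged over each species' ESTs. A minimal Python sketch of that calculation follows. The sequences and species labels are made-up toy data, not ESTs from the study, and ignoring ambiguous bases such as N is an assumption of this sketch.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C among the unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return gc / (gc + at)


def mean_gc(seqs) -> float:
    """Unweighted mean GC fraction over a collection of sequences."""
    values = [gc_content(s) for s in seqs]
    return sum(values) / len(values)


# Toy sequences, invented for illustration only.
toy_ests = {
    "species_x": ["ATGCGC", "GGCCAT"],
    "species_y": ["ATATGC", "AATTGN"],
}

for species, seqs in toy_ests.items():
    print(f"{species}: mean GC = {100 * mean_gc(seqs):.1f}%")
```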

  12. Evolution of a Cellular Immune Response in Drosophila: A Phenotypic and Genomic Comparative Analysis

    PubMed Central

    Salazar-Jaramillo, Laura; Paspati, Angeliki; van de Zande, Louis; Vermeulen, Cornelis Joseph; Schwander, Tanja; Wertheim, Bregje

    2014-01-01

    Understanding the genomic basis of evolutionary adaptation requires insight into the molecular basis underlying phenotypic variation. However, even changes in molecular pathways associated with extreme variation, gains and losses of specific phenotypes, remain largely uncharacterized. Here, we investigate the large interspecific differences in the ability to survive infection by parasitoids across 11 Drosophila species and identify genomic changes associated with gains and losses of parasitoid resistance. We show that a cellular immune defense, encapsulation, and the production of a specialized blood cell, lamellocytes, are restricted to a sublineage of Drosophila, but that encapsulation is absent in one species of this sublineage, Drosophila sechellia. Our comparative analyses of hemopoiesis pathway genes and of genes differentially expressed during the encapsulation response revealed that hemopoiesis-associated genes are highly conserved and present in all species independently of their resistance. In contrast, 11 genes that are differentially expressed during the response to parasitoids are novel genes, specific to the Drosophila sublineage capable of lamellocyte-mediated encapsulation. These novel genes, which are predominantly expressed in hemocytes, arose via duplications, whereby five of them also showed signatures of positive selection, as expected if they were recruited for new functions. Three of these novel genes further showed large-scale and presumably loss-of-function sequence changes in D. sechellia, consistent with the loss of resistance in this species. In combination, these convergent lines of evidence suggest that co-option of duplicated genes in existing pathways and subsequent neofunctionalization are likely to have contributed to the evolution of the lamellocyte-mediated encapsulation in Drosophila. PMID:24443439

  13. Enhancer Identification through Comparative Genomics

    SciTech Connect

    Visel, Axel; Bristow, James; Pennacchio, Len A.

    2006-10-01

    With the availability of genomic sequence from numerous vertebrates, a paradigm shift has occurred in the identification of distant-acting gene regulatory elements. In contrast to traditional gene-centric studies in which investigators randomly scanned genomic fragments that flank genes of interest in functional assays, the modern approach begins electronically with publicly available comparative sequence datasets that provide investigators with prioritized lists of putative functional sequences based on their evolutionary conservation. However, although a large number of tools and resources are now available, application of comparative genomic approaches remains far from trivial. In particular, it requires users to dynamically consider the species and methods for comparison depending on the specific biological question under investigation. While there is currently no single general rule to this end, it is clear that when applied appropriately, comparative genomic approaches exponentially increase our power in generating biological hypotheses for subsequent experimental testing.

  14. Comparative (Meta)genomic Analysis and Ecological Profiling of Human Gut-Specific Bacteriophage φB124-14

    PubMed Central

    Ogilvie, Lesley A.; Caplin, Jonathan; Dedi, Cinzia; Diston, David; Cheek, Elizabeth; Bowler, Lucas; Taylor, Huw; Ebdon, James; Jones, Brian V.

    2012-01-01

    Bacteriophage associated with the human gut microbiome are likely to have an important impact on community structure and function, and provide a wealth of biotechnological opportunities. Despite this, knowledge of the ecology and composition of bacteriophage in the gut bacterial community remains poor, with few well characterized gut-associated phage genomes currently available. Here we describe the identification and in-depth (meta)genomic, proteomic, and ecological analysis of a human gut-specific bacteriophage (designated φB124-14). In doing so we illuminate a fraction of the biological dark matter extant in this ecosystem and its surrounding eco-genomic landscape, identifying a novel and uncharted bacteriophage gene-space in this community. φB124-14 infects only a subset of closely related gut-associated Bacteroides fragilis strains, and the circular genome encodes functions previously found to be rare in viral genomes and human gut viral metagenome sequences, including those which potentially confer advantages upon phage and/or host bacteria. Comparative genomic analyses revealed φB124-14 is most closely related to φB40-8, the only other publicly available Bacteroides sp. phage genome, whilst comparative metagenomic analysis of both phage failed to identify any homologous sequences in 136 non-human gut metagenomic datasets searched, supporting the human gut-specific nature of this phage. Moreover, a potential geographic variation in the carriage of these and related phage was revealed by analysis of their distribution and prevalence within 151 human gut microbiomes and viromes from Europe, America and Japan. Finally, ecological profiling of φB124-14 and φB40-8, using both gene-centric alignment-driven phylogenetic analyses, as well as alignment-free gene-independent approaches was undertaken. This not only verified the human gut-specific nature of both phage, but also indicated that these phage populate a distinct and unexplored ecological landscape

  15. Comparative Analysis of Chlamydia psittaci Genomes Reveals the Recent Emergence of a Pathogenic Lineage with a Broad Host Range

    PubMed Central

    Read, Timothy D.; Joseph, Sandeep J.; Didelot, Xavier; Liang, Brooke; Patel, Lisa; Dean, Deborah

    2013-01-01

    Chlamydia psittaci is an obligate intracellular bacterium. Interest in Chlamydia stems from its high degree of virulence as an intestinal and pulmonary pathogen across a broad range of animals, including humans. C. psittaci human pulmonary infections, referred to as psittacosis, can be life-threatening, which is why the organism was developed as a bioweapon in the 20th century and is listed as a CDC biothreat agent. One remarkable recent result from comparative genomics is the finding of frequent homologous recombination across the genome of the sexually transmitted and trachoma pathogen Chlamydia trachomatis. We sought to determine if similar evolutionary dynamics occurred in C. psittaci. We analyzed 20 C. psittaci genomes from diverse strains representing the nine known serotypes of the organism as well as infections in a range of birds and mammals, including humans. Genome annotation revealed a core genome in all strains of 911 genes. Our analyses showed that C. psittaci has a history of frequently switching hosts and undergoing recombination more often than C. trachomatis. Evolutionary history reconstructions showed genome-wide homologous recombination and evidence of whole-plasmid exchange. Tracking the origins of recombinant segments revealed that some strains have imported DNA from as-yet-unsampled or -unsequenced C. psittaci lineages or other Chlamydiaceae species. Three ancestral populations of C. psittaci were predicted, explaining the current population structure. Molecular clock analysis found that certain strains are part of a clonal epidemic expansion likely introduced into North America by South American bird traders, suggesting that psittacosis is a recently emerged disease originating in New World parrots. PMID:23532978
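
    A core genome such as the one reported in this record is, conceptually, the set of gene families present in every strain. The Python sketch below shows that idea as a set intersection. It assumes gene families have already been assigned by some ortholog-clustering step, which is not shown; the strain and family names are hypothetical and the authors' actual method is not described here.

```python
def core_and_pan(genomes: dict[str, set[str]]) -> tuple[set[str], set[str]]:
    """Return (core, pan) gene-family sets for a collection of genomes.

    core: families found in every genome
    pan:  families found in at least one genome
    """
    gene_sets = list(genomes.values())
    if not gene_sets:
        raise ValueError("at least one genome is required")
    core = set.intersection(*gene_sets)
    pan = set.union(*gene_sets)
    return core, pan


# Hypothetical strains and gene-family identifiers, for illustration only.
toy_genomes = {
    "strain_1": {"famA", "famB", "famC", "famD"},
    "strain_2": {"famA", "famB", "famC", "famE"},
    "strain_3": {"famA", "famB", "famF"},
}

core, pan = core_and_pan(toy_genomes)
print("core families:", sorted(core))
print("pan-genome size:", len(pan))
```

    Real analyses usually also handle paralogs, fragmented assemblies and annotation differences before taking the intersection, so counts from this naive version should be read as an upper-level illustration only.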

  16. Comparative Genomic Analysis Reveals a Diverse Repertoire of Genes Involved in Prokaryote-Eukaryote Interactions within the Pseudovibrio Genus

    PubMed Central

    Romano, Stefano; Fernàndez-Guerra, Antonio; Reen, F. Jerry; Glöckner, Frank O.; Crowley, Susan P.; O'Sullivan, Orla; Cotter, Paul D.; Adams, Claire; Dobson, Alan D. W.; O'Gara, Fergal

    2016-01-01

    Strains of the Pseudovibrio genus have been detected worldwide, mainly as part of bacterial communities associated with marine invertebrates, particularly sponges. This recurrent association has been considered as an indication of a symbiotic relationship between these microbes and their host. Until recently, the availability of only two genomes, belonging to closely related strains, has limited the knowledge on the genomic and physiological features of the genus to a single phylogenetic lineage. Here we present 10 newly sequenced genomes of Pseudovibrio strains isolated from marine sponges from the west coast of Ireland, and including the other two publicly available genomes we performed an extensive comparative genomic analysis. Homogeneity was apparent in terms of both the orthologous genes and the metabolic features shared amongst the 12 strains. At the genomic level, a key physiological difference observed amongst the isolates was the presence only in strain P. axinellae AD2 of genes encoding proteins involved in assimilatory nitrate reduction, which was then proved experimentally. We then focused on studying those systems known to be involved in the interactions with eukaryotic and prokaryotic cells. This analysis revealed that the genus harbors a large diversity of toxin-like proteins, secretion systems and their potential effectors. Their distribution in the genus was not always consistent with the phylogenetic relationship of the strains. Finally, our analyses identified new genomic islands encoding potential toxin-immunity systems, previously unknown in the genus. Our analyses shed new light on the Pseudovibrio genus, indicating a large diversity of both metabolic features and systems for interacting with the host. The diversity in both distribution and abundance of these systems amongst the strains underlines how metabolically and phylogenetically similar bacteria may use different strategies to interact with the host and find a niche within its

  18. Comparative genomics analysis of the companion mechanisms of Bacillus thuringiensis Bc601 and Bacillus endophyticus Hbe603 in bacterial consortium.

    PubMed

    Jia, Nan; Ding, Ming-Zhu; Gao, Feng; Yuan, Ying-Jin

    2016-01-01

    Bacillus thuringiensis and Bacillus endophyticus both act as companion bacteria that cooperate with Ketogulonigenium vulgare in the two-step fermentation of vitamin C. The two Bacillus species differ in morphology, swarming motility and 2-keto-L-gulonic acid productivity when co-cultured with K. vulgare. Here, we report the complete genome sequence of B. thuringiensis Bc601 and eight plasmids of B. endophyticus Hbe603, and carry out a comparative genomics analysis. B. thuringiensis Bc601, which has a greater ability to respond to the external environment, was found to have more proteins related to two-component systems, the sporulation coat and peptidoglycan biosynthesis than B. endophyticus Hbe603. B. endophyticus Hbe603, which has a greater ability to biosynthesize nutrients, was found to have more proteins related to alpha-galactosidase, propanoate, glutathione and inositol phosphate metabolism, and amino acid degradation than B. thuringiensis Bc601. These differences in swarming motility, response to the external environment and nutrient biosynthesis may reflect different companion mechanisms in the two Bacillus species. Comparative genomic analysis of B. endophyticus and B. thuringiensis enables us to further understand their cooperative mechanisms with K. vulgare and facilitates the optimization of the bacterial consortium. PMID:27353048

  19. Comparative genomics analysis of the companion mechanisms of Bacillus thuringiensis Bc601 and Bacillus endophyticus Hbe603 in bacterial consortium

    PubMed Central

    Jia, Nan; Ding, Ming-Zhu; Gao, Feng; Yuan, Ying-Jin

    2016-01-01

    Bacillus thuringiensis and Bacillus endophyticus both act as companion bacteria that cooperate with Ketogulonigenium vulgare in the two-step fermentation of vitamin C. The two Bacillus species differ in morphology, swarming motility and 2-keto-L-gulonic acid productivity when co-cultured with K. vulgare. Here, we report the complete genome sequence of B. thuringiensis Bc601 and eight plasmids of B. endophyticus Hbe603, and carry out a comparative genomics analysis. B. thuringiensis Bc601, which has a greater ability to respond to the external environment, was found to have more proteins related to two-component systems, the sporulation coat and peptidoglycan biosynthesis than B. endophyticus Hbe603. B. endophyticus Hbe603, which has a greater capacity for nutrient biosynthesis, was found to have more proteins related to alpha-galactosidase, propanoate, glutathione and inositol phosphate metabolism, and amino acid degradation than B. thuringiensis Bc601. These differences in swarming motility, response to the external environment and nutrient biosynthesis may reflect different companion mechanisms in the two Bacillus species. Comparative genomic analysis of B. endophyticus and B. thuringiensis enables us to better understand their cooperative mechanisms with K. vulgare and facilitates optimization of the bacterial consortium. PMID:27353048

  20. Genome-wide analysis of the cyclin family in Arabidopsis and comparative phylogenetic analysis of plant cyclin-like proteins.

    PubMed

    Wang, Guanfang; Kong, Hongzhi; Sun, Yujin; Zhang, Xiaohong; Zhang, Wei; Altman, Naomi; DePamphilis, Claude W; Ma, Hong

    2004-06-01

    Cyclins are primary regulators of the activity of cyclin-dependent kinases, which are known to play critical roles in controlling eukaryotic cell cycle progression. While there has been extensive research on cell cycle mechanisms and cyclin function in animals and yeasts, only a small number of plant cyclins have been characterized functionally. In this paper, we describe an exhaustive search for cyclin genes in the Arabidopsis genome and among available sequences from other vascular plants. Based on phylogenetic analysis, we define 10 classes of plant cyclins, four of which are plant-specific, and a fifth is shared between plants and protists but not animals. Microarray and reverse transcriptase-polymerase chain reaction analyses further provide expression profiles of cyclin genes in different tissues of wild-type Arabidopsis plants. Comparative phylogenetic studies of 174 plant cyclins were also performed. The phylogenetic results imply that the cyclin gene family in plants has experienced more gene duplication events than in animals. Expression patterns and phylogenetic analyses of Arabidopsis cyclin genes suggest potential gene redundancy among members belonging to the same group. We discuss possible divergence and conservation of some plant cyclins. Our study provides an opportunity to rapidly assess the position of plant cyclin genes in terms of evolution and classification, serving as a guide for further functional study of plant cyclins.

  1. Comparative Genomic Hybridization Analysis of Yersinia enterocolitica and Yersinia pseudotuberculosis Identifies Genetic Traits to Elucidate Their Different Ecologies

    PubMed Central

    Jaakkola, Kaisa; Somervuo, Panu; Korkeala, Hannu

    2015-01-01

    Enteropathogenic Yersinia enterocolitica and Yersinia pseudotuberculosis are both etiological agents of the intestinal infection known as yersiniosis, but their epidemiology and ecology differ in many respects. Swine are the only known reservoir for Y. enterocolitica 4/O:3 strains, which are the most common cause of human disease, while Y. pseudotuberculosis has been isolated from a variety of sources, including vegetables and wild animals. Infections caused by Y. enterocolitica mainly originate from swine, but fresh produce has been the source for widespread Y. pseudotuberculosis outbreaks within recent decades. A comparative genomic hybridization analysis with a DNA microarray based on three Yersinia enterocolitica and four Yersinia pseudotuberculosis genomes was conducted to shed light on the genomic differences between enteropathogenic Yersinia. The hybridization results showed that Y. pseudotuberculosis strains carry operons linked with the uptake and utilization of substances not found in living animal tissues but present in soil, plants, and rotting flesh. Y. pseudotuberculosis also harbors a selection of type VI secretion systems targeting other bacteria and eukaryotic cells. These genetic traits are not found in Y. enterocolitica, and it appears that while Y. pseudotuberculosis has many tools beneficial for survival in varied environments, the Y. enterocolitica genome is more streamlined and adapted to its preferred animal reservoir. PMID:26605338

  2. Comparative Genomic Analysis Reveals a Critical Role of De Novo Nucleotide Biosynthesis for Saccharomyces cerevisiae Virulence

    PubMed Central

    Pérez-Torrado, Roberto; Llopis, Silvia; Perrone, Benedetta; Gómez-Pastor, Rocío; Hube, Bernhard; Querol, Amparo

    2015-01-01

    In recent years, the number of human infection cases produced by the food related species Saccharomyces cerevisiae has increased. Whereas many strains of this species are considered safe, other ‘opportunistic’ strains show a high degree of potential virulence attributes and can cause infections in immunocompromised patients. Here we studied the genetic characteristics of selected opportunistic strains isolated from dietary supplements and also from patients by array comparative genomic hybridization. Our results show increased copy numbers of IMD genes in opportunistic strains, which are implicated in the de novo biosynthesis of the purine nucleotides pathway. The importance of this pathway for virulence of S. cerevisiae was confirmed by infections in immunodeficient murine models using a GUA1 mutant, a key gene of this pathway. We show that exogenous guanine, an end product of this pathway in its triphosphorylated form, increases the survival of yeast strains in ex vivo blood infections. Finally, we show the importance of the DNA damage response that activates dNTP biosynthesis in yeast cells during ex vivo blood infections. We conclude that opportunistic yeasts may use an enhanced de novo biosynthesis of the purine nucleotides pathway to increase survival and favor infections in the host. PMID:25816288

  3. Comparative transcriptome analysis reveals insights into the streamlined genomes of haplosclerid demosponges

    PubMed Central

    Guzman, Christine; Conaco, Cecilia

    2016-01-01

    Sponges (Porifera) are one of the most ancestral metazoan groups. They are characterized by a simple body plan lacking the true tissues and organ systems found in other animals. Members of this phylum display a remarkable diversity of form and function and yet little is known about the composition and complexity of their genomes. In this study, we sequenced the transcriptomes of two marine haplosclerid sponges belonging to Demospongiae, the largest and most diverse class within phylum Porifera, and compared their gene content with members of other sponge classes. We recovered 44,693 and 50,067 transcripts expressed in adult tissues of Haliclona amboinensis and Haliclona tubifera, respectively. These transcripts translate into 20,280 peptides in H. amboinensis and 18,000 peptides in H. tubifera. Genes associated with important signaling and metabolic pathways, regulatory networks, as well as genes that may be important in the organismal stress response, were identified in the transcriptomes. Furthermore, lineage-specific innovations were identified that may be correlated with observed sponge characters and ecological adaptations. The core gene complement expressed within the tissues of adult haplosclerid demosponges may represent a streamlined and flexible genetic toolkit that underlies the ecological success and resilience of sponges to environmental stress. PMID:26738846

  4. Comparative Analysis of Wolbachia Genomes Reveals Streamlining and Divergence of Minimalist Two-Component Systems

    PubMed Central

    Christensen, Steen; Serbus, Laura Renee

    2015-01-01

    Two-component regulatory systems are commonly used by bacteria to coordinate intracellular responses with environmental cues. These systems are composed of functional protein pairs consisting of a sensor histidine kinase and cognate response regulator. In contrast to the well-studied Caulobacter crescentus system, which carries dozens of these pairs, the streamlined bacterial endosymbiont Wolbachia pipientis encodes only two pairs: CckA/CtrA and PleC/PleD. Here, we used bioinformatic tools to compare characterized two-component system relays from C. crescentus, the related Anaplasmataceae species Anaplasma phagocytophilum and Ehrlichia chaffeensis, and 12 sequenced Wolbachia strains. We found the core protein pairs and a subset of interacting partners to be highly conserved within Wolbachia and these other Anaplasmataceae. Genes involved in two-component signaling were positioned differently within the various Wolbachia genomes, whereas the local context of each gene was conserved. Unlike Anaplasma and Ehrlichia, Wolbachia two-component genes were more consistently found clustered with metabolic genes. The domain architecture and key functional residues standard for two-component system proteins were well-conserved in Wolbachia, although residues that specify cognate pairing diverged substantially from other Anaplasmataceae. These findings indicate that Wolbachia two-component signaling pairs share considerable functional overlap with other α-proteobacterial systems, whereas their divergence suggests the potential for regulatory differences and cross-talk. PMID:25809075

  5. Epidermodysplasia verruciformis-associated human papillomavirus 8: genomic sequence and comparative analysis.

    PubMed Central

    Fuchs, P G; Iftner, T; Weninger, J; Pfister, H

    1986-01-01

    Human papillomavirus (HPV) 8 induces skin tumors which are at high risk for malignant conversion. The nucleotide sequence of HPV8 has been determined and compared to sequences of papillomaviruses with different oncogenic potential. The general organization of the HPV8 genome is similar to that of other types. Highly conserved, genus-specific sequences were found in open reading frames (ORFs) E1, E2, and L1. In ORFs E6, E7, and L2, HPV8 is more distantly related, but it was possible to differentiate subgenera in which HPV8 belonged to the HPV1-cottontail rabbit papillomavirus group. Sequences within ORF E4 and part of ORF L2 are rather type specific. HPV8 stands out by several unique features: the considerably reduced size of the noncoding region (397 base pairs), with a seemingly low potential for forming complex secondary structures; a cluster of putative promoter elements in the 3' half of ORF E1; an RNA polymerase III promoter-like sequence close to the C terminus of ORF E2; and of particular interest, the homology between the putative protein encoded by ORF E4 and the Epstein-Barr virus nuclear antigen 2 protein, which may reflect similar mechanisms in virus-mediated transformation. PMID:3009874

  6. Comparative transcriptome analysis reveals insights into the streamlined genomes of haplosclerid demosponges.

    PubMed

    Guzman, Christine; Conaco, Cecilia

    2016-01-01

    Sponges (Porifera) are one of the most ancestral metazoan groups. They are characterized by a simple body plan lacking the true tissues and organ systems found in other animals. Members of this phylum display a remarkable diversity of form and function and yet little is known about the composition and complexity of their genomes. In this study, we sequenced the transcriptomes of two marine haplosclerid sponges belonging to Demospongiae, the largest and most diverse class within phylum Porifera, and compared their gene content with members of other sponge classes. We recovered 44,693 and 50,067 transcripts expressed in adult tissues of Haliclona amboinensis and Haliclona tubifera, respectively. These transcripts translate into 20,280 peptides in H. amboinensis and 18,000 peptides in H. tubifera. Genes associated with important signaling and metabolic pathways, regulatory networks, as well as genes that may be important in the organismal stress response, were identified in the transcriptomes. Furthermore, lineage-specific innovations were identified that may be correlated with observed sponge characters and ecological adaptations. The core gene complement expressed within the tissues of adult haplosclerid demosponges may represent a streamlined and flexible genetic toolkit that underlies the ecological success and resilience of sponges to environmental stress. PMID:26738846

  7. Comparative genomic analysis reveals the evolutionary conservation of Pax gene family.

    PubMed

    Wang, Wei; Zhong, Jing; Wang, Yi-Quan

    2010-01-01

    The Pax gene family encodes a group of transcription factors whose evolution has accompanied the major morphological and functional innovations of vertebrate species. Its evolutionary conservation across diverse metazoan lineages and its functional importance in development make the Pax family an ideal system for addressing relationships within the phylum Chordata. In the present study, we sequenced and annotated four genomic regions containing Chinese amphioxus (Branchiostoma belcheri) Pax genes, and retrieved homologous sequences from public databases. In comparison with vertebrate homologues, the predicted amphioxus Pax proteins display high sequence conservation. Evidence from the molecular phylogenetic studies and gene organization analyses supports a much closer relationship between cephalochordates and vertebrates than between tunicates and vertebrates, in contrast to the urochordate-relatives hypothesis proposed by several recent studies. Analysis of the phylogenetic topology derived from concatenated subfamily datasets uncovered a potential statistical bias of the supermatrix approach. Furthermore, we deduced an evolutionary scenario for the Pax gene family that provides a plausible explanation for the origin and dynamics of its members.

  8. Genome-wide comparative analysis reveals possible common ancestors of nucleotide-binding sites domain containing genes in hybrid Citrus sinensis genome and original Citrus clementina genome

    Technology Transfer Automated Retrieval System (TEKTRAN)

    We identified and re-annotated candidate disease resistance (R) genes with nucleotide-binding sites (NBS) domain from a Citrus clementina genome and two complete Citrus sinensis genome sequences (one from the USA and one from China). We found similar numbers of NBS genes from three citrus genomes, r...

  9. Comparative genomic analysis of nine Sphingobium strains: Insights into their evolution and hexachlorocyclohexane (HCH) degradation pathways

    DOE PAGESBeta

    Verma, Helianthous; Kumar, Roshan; Oldach, Phoebe; Sangwan, Naseer; Khurana, Jitendra P.; Gilbert, Jack A.; Lal, Rup

    2014-11-23

    Background: Sphingobium spp. are efficient degraders of a wide range of chlorinated and aromatic hydrocarbons. In particular, strains which harbour the lin pathway genes mediating the degradation of hexachlorocyclohexane (HCH) isomers are of interest due to the widespread persistence of this contaminant. Here, we examined the evolution and diversification of the lin pathway under the selective pressure of HCH, by comparing the draft genomes of six newly-sequenced Sphingobium spp. (strains LL03, DS20, IP26, HDIPO4, P25 and RL3) isolated from HCH dumpsites, with three existing genomes (S. indicum B90A, S. japonicum UT26S and Sphingobium sp. SYK6). Results: Efficient HCH degraders phylogenetically clustered in a closely related group comprising UT26S, B90A, HDIPO4 and IP26, where HDIPO4 and IP26 were classified as subspecies with ANI value >98%. Less than 10% of the total gene content was shared among all nine strains, but among the eight HCH-associated strains, that is all except SYK6, the shared gene content jumped to nearly 25%. Genes associated with nitrogen stress response and two-component systems were found to be enriched. The strains also housed many xenobiotic degradation pathways other than HCH, despite the absence of these xenobiotics from isolation sources. In addition, these strains, although non-motile, possess flagellar assembly genes. While strains HDIPO4 and IP26 contained the complete set of lin genes, DS20 was entirely devoid of lin genes (except linKLMN), whereas LL03, P25 and RL3 were identified as lin-deficient strains, as they housed incomplete lin pathways. Further, in HDIPO4, linA was found as a hybrid of two natural variants, i.e., linA1 and linA2, known for their different enantioselectivity. In conclusion, the bacteria isolated from HCH dumpsites provide a natural testing ground to study variations in the lin system and their effects on degradation efficacy. Further, the diversity in the lin gene sequences and copy number, their

  10. Complete Plastid Genome Sequencing of Four Tilia Species (Malvaceae): A Comparative Analysis and Phylogenetic Implications

    PubMed Central

    Cai, Jie; Ma, Peng-Fei; Li, Hong-Tao; Li, De-Zhu

    2015-01-01

    Tilia is an ecologically and economically important genus in the family Malvaceae. However, no complete plastid genome of Tilia has been sequenced to date, and the taxonomy of Tilia is difficult owing to frequent hybridization and polyploidization. Well-supported interspecific relationships for this genus are not available because the commonly used molecular markers provide few informative sites. We report here the complete plastid genome sequences of four Tilia species determined by Illumina technology. The Tilia plastid genome is 162,653 bp to 162,796 bp in length, encoding 113 unique genes and a total of 130 genes. The gene order and organization of the Tilia plastid genome exhibit the general structure of angiosperms and are very similar to other published plastid genomes of Malvaceae. As in other long-lived tree genera, the sequence divergence among the four Tilia plastid genomes is very low. We also analyzed the nucleotide substitution patterns and the evolution of insertions and deletions in the Tilia plastid genomes. Finally, we built a phylogeny of the four sampled Tilia species with high support using plastid phylogenomics, suggesting that this is an efficient way to resolve the phylogenetic relationships of this genus. PMID:26566230

  11. The complete mitochondrial genomes sequences of Asio flammeus and Asio otus and comparative analysis.

    PubMed

    Sun, Yi; Ma, Fei; Xiao, Bing; Zheng, Junjie; Yuan, Xiaodong; Tang, Minqian; Wang, Li; Yu, Yefei; Li, Qingwei

    2004-12-01

    The complete mitochondrial genomes of Asio flammeus and Asio otus were sequenced and found to span 18858 bp and 18493 bp, respectively. Surprisingly, the former is the largest among all avian mitochondrial genomes sequenced so far. The gene order of both genomes is very similar to that of Gallus gallus; neither contains the pseudo control region, but both have a single extra base, namely cytidine, at position 174 in the ND3 gene. The control regions of the Asio flammeus and Asio otus mitochondrial genomes span 3288 bp and 2926 bp, respectively, which are the longest among vertebrates except for Myxine glutinosa and contribute to the large size of the two genomes. The 3' end of the control region of Asio flammeus and Asio otus contains many tandemly repeated sequences, which are highly similar to a putative control element, i.e. Mt5, and may form stable stem-loop secondary structures. Such repeated sequences probably play an important role in regulating transcription and replication of the mitochondrial genome. Our results may provide important clues for uncovering the origin and evolutionary mechanisms of the mitochondrial genome.
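
    The sizes reported in this abstract allow a quick check of how much of each genome the control region accounts for. The following is a minimal sketch in Python that uses only the four base-pair figures given in the abstract; the function and variable names are illustrative assumptions and are not taken from the paper.

```python
# Sizes in base pairs, copied from the abstract of the Asio record.
GENOMES = {
    "Asio flammeus": {"genome_bp": 18858, "control_region_bp": 3288},
    "Asio otus": {"genome_bp": 18493, "control_region_bp": 2926},
}


def control_region_share(genome_bp: int, control_region_bp: int) -> float:
    """Return the control region length as a fraction of the whole genome."""
    if genome_bp <= 0 or not 0 <= control_region_bp <= genome_bp:
        raise ValueError("lengths must satisfy 0 <= control region <= genome")
    return control_region_bp / genome_bp


def summarize(genomes: dict) -> dict:
    """Map each species to (control region share, bp outside the control region)."""
    rows = {}
    for species, sizes in genomes.items():
        share = control_region_share(sizes["genome_bp"], sizes["control_region_bp"])
        remainder = sizes["genome_bp"] - sizes["control_region_bp"]
        rows[species] = (share, remainder)
    return rows


if __name__ == "__main__":
    for species, (share, remainder) in summarize(GENOMES).items():
        print(f"{species}: control region = {share:.1%} of genome; "
              f"outside control region = {remainder} bp")
```

    With these figures the control region makes up roughly 17.4% of the Asio flammeus genome and 15.8% of the Asio otus genome. The sequence outside the control region is 15570 bp and 15567 bp, respectively, so nearly all of the 365 bp difference between the two genomes lies in the control region.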

  12. A comparative genomic analysis of the oxidative enzymes potentially involved in lignin degradation by Agaricus bisporus.

    PubMed

    Doddapaneni, Harshavardhan; Subramanian, Venkataramanan; Fu, Bolei; Cullen, Dan

    2013-06-01

    The oxidative enzymatic machinery for degradation of organic substrates in Agaricus bisporus (Ab) is at the core of the carbon recycling mechanisms in this fungus. To date, 156 genes have been tentatively identified as part of this oxidative enzymatic machinery, which includes 26 peroxidase encoding genes, nine copper radical oxidases [including three putative glyoxal oxidase-encoding genes (GLXs)], 12 laccases sensu stricto and 109 cytochrome P450 monooxygenases. Comparative analyses of these enzymes in Ab with those of the white-rot fungus, Phanerochaete chrysosporium, the brown-rot fungus, Postia placenta, the coprophilic litter fungus, Coprinopsis cinerea and the ectomycorrhizal fungus, Laccaria bicolor, revealed enzyme diversity consistent with adaptation to substrates rich in humic substances and partially degraded plant material. For instance, relative to wood decay fungi, Ab cytochrome P450 genes were less numerous (109 gene models), distributed among distinctive families, and lacked extensive duplication and clustering. Viewed together with P450 transcript accumulation patterns in three tested growth conditions, these observations were consistent with the unique Ab lifestyle. Based on tandem gene arrangements, a certain degree of gene duplication seems to have occurred in this fungus in the copper radical oxidase (CRO) and the laccase gene families. In Ab, high transcript levels and regulation of the heme-thiolate peroxidases, two manganese peroxidases and the three GLX-like genes are likely in response to complex natural substrates, including lignocellulose and its derivatives, thereby suggesting an important role in lignin degradation. On the other hand, the expression patterns of the related CROs suggest a developmental role in this fungus. Based on these observations, a brief comparative genomic overview of the Ab oxidative enzyme machinery is presented. PMID:23583597

  13. Comparative genome analysis of cortactin and HS1: the significance of the F-actin binding repeat domain

    PubMed Central

    van Rossum, Agnes GSH; Schuuring-Scholtes, Ellen; Seggelen, Vera van Buuren-van; Kluin, Philip M; Schuuring, Ed

    2005-01-01

    Background: In human carcinomas, overexpression of cortactin correlates with poor prognosis. Cortactin is an F-actin-binding protein involved in cytoskeletal rearrangements and cell migration by promoting actin-related protein (Arp)2/3 mediated actin polymerization. It shares high amino acid sequence and structural similarity with hematopoietic lineage cell-specific protein 1 (HS1), although their functions differ considerably. In this manuscript we describe the genomic organization of these two genes in a variety of species by a combination of cloning and database searches. Based on our analysis, we predict the genesis of the actin-binding repeat domain during evolution. Results: Cortactin homologues exist in sponges, worms, shrimps, insects, urochordates, fishes, amphibians, birds and mammals, whereas HS1 exists in vertebrates only, suggesting that both genes have been derived from an ancestral cortactin gene by duplication. In agreement with this, comparative genome analysis revealed very similar exon-intron structures and sequence homologies, especially over the regions that encode the characteristic highly conserved F-actin-binding repeat domain. Cortactin splice variants affecting this F-actin-binding domain were identified not only in mammals, but also in amphibians, fishes and birds. In mammals, cortactin is ubiquitously expressed except in hematopoietic cells, whereas HS1 is mainly expressed in hematopoietic cells. In accordance with their distinct tissue specificity, the putative promoter region of cortactin is different from that of HS1. Conclusions: Comparative analysis of the genomic organization and amino acid sequences of cortactin and HS1 provides insight into their origin and evolution. Our analysis shows that both genes originated from a gene duplication event and subsequently HS1 lost two repeats, whereas cortactin gained one repeat. Our analysis genetically underscores the significance of the F-actin binding domain in cytoskeletal remodeling, which

  14. Comparative Analysis of Begonia Plastid Genomes and Their Utility for Species-Level Phylogenetics

    PubMed Central

    Harrison, Nicola; Harrison, Richard J.

    2016-01-01

    Recent, rapid radiations make species-level phylogenetics difficult to resolve. We used a multiplexed, high-throughput sequencing approach to identify informative genomic regions to resolve phylogenetic relationships at low taxonomic levels in Begonia from a survey of sixteen species. A long-range PCR method was used to generate draft plastid genomes to provide a strong phylogenetic backbone, identify fast evolving regions and provide informative molecular markers for species-level phylogenetic studies in Begonia. PMID:27058864

  15. Comparative Analysis of Begonia Plastid Genomes and Their Utility for Species-Level Phylogenetics.

    PubMed

    Harrison, Nicola; Harrison, Richard J; Kidner, Catherine A

    2016-01-01

    Recent, rapid radiations make species-level phylogenetics difficult to resolve. We used a multiplexed, high-throughput sequencing approach to identify informative genomic regions to resolve phylogenetic relationships at low taxonomic levels in Begonia from a survey of sixteen species. A long-range PCR method was used to generate draft plastid genomes to provide a strong phylogenetic backbone, identify fast evolving regions and provide informative molecular markers for species-level phylogenetic studies in Begonia.

  16. Comparative Analysis of Begonia Plastid Genomes and Their Utility for Species-Level Phylogenetics.

    PubMed

    Harrison, Nicola; Harrison, Richard J; Kidner, Catherine A

    2016-01-01

    Recent, rapid radiations make species-level phylogenetics difficult to resolve. We used a multiplexed, high-throughput sequencing approach to identify informative genomic regions to resolve phylogenetic relationships at low taxonomic levels in Begonia from a survey of sixteen species. A long-range PCR method was used to generate draft plastid genomes to provide a strong phylogenetic backbone, identify fast evolving regions and provide informative molecular markers for species-level phylogenetic studies in Begonia. PMID:27058864

  17. Reptile genomes open the frontier for comparative analysis of amniote development and regeneration.

    PubMed

    Tollis, Marc; Hutchins, Elizabeth D; Kusumi, Kenro

    2014-01-01

    Developmental genetic studies of vertebrates have focused primarily on zebrafish, frog and mouse models, which have clear application to medicine and well-developed genomic resources. In contrast, reptiles represent the most diverse amniote group, but have only recently begun to gather the attention of genome sequencing efforts. Extant reptilian groups last shared a common ancestor ~280 million years ago and include lepidosaurs, turtles and crocodilians. This phylogenetic diversity is reflected in great morphological and behavioral diversity capturing the attention of biologists interested in mechanisms regulating developmental processes such as somitogenesis and spinal patterning, regeneration, the evolution of "snake-like" morphology, the formation of the unique turtle shell, and the convergent evolution of the four-chambered heart shared by mammals and archosaurs. The complete genome of the first non-avian reptile, the green anole lizard, was published in 2011 and has provided insights into the origin and evolution of amniotes. Since then, the genomes of multiple snakes, turtles, and crocodilians have also been completed. Here we will review the current diversity of available reptile genomes, with an emphasis on their evolutionary relationships, and will highlight how these genomes have and will continue to facilitate research in developmental and regenerative biology.

  18. Comparative genomic analysis reveals species-dependent complexities that explain difficulties with microsatellite marker development in molluscs.

    PubMed

    McInerney, C E; Allcock, A L; Johnson, M P; Bailie, D A; Prodöhl, P A

    2011-01-01

    Reliable population DNA molecular markers are difficult to develop for molluscs, the reasons for which are largely unknown. Identical protocols for microsatellite marker development were implemented in three gastropods. Success rates were lower for Gibbula cineraria compared to Littorina littorea and L. saxatilis. Comparative genomic analysis of 47.2 kb of microsatellite-containing sequences (MCS) revealed a high incidence of cryptic repetitive DNA in their flanking regions. The majority of these were novel, and could be grouped into DNA families based upon sequence similarities. Significant inter-specific variation in abundance of cryptic repetitive DNA and DNA families was observed. Repbase scans show that a large proportion of cryptic repetitive DNA was identified as transposable elements (TEs). We argue that a large number of TEs and their transpositional activity may be linked to differential rates of DNA multiplication and recombination. This is likely to be an important factor explaining inter-specific variation in genome stability and hence microsatellite marker development success rates. Gastropods also differed significantly in the types of TE classes (autonomous vs non-autonomous) observed. We propose that dissimilar transpositional mechanisms differentiate the TE classes in terms of their propensity for transposition, fixation and/or silencing. Consequently, the phylogenetic conservation of non-autonomous TEs, such as CvA, suggests that dispersal of these elements may have behaved as microsatellite-inducing elements. Results seem to indicate that, compared to autonomous TEs, non-autonomous TEs may have a more active role in genome rearrangement processes. The implications of the findings for genomic rearrangement, stability and marker development are discussed. PMID:20424639

  19. Comparative genomic analysis reveals species-dependent complexities that explain difficulties with microsatellite marker development in molluscs

    PubMed Central

    McInerney, C E; Allcock, A L; Johnson, M P; Bailie, D A; Prodöhl, P A

    2011-01-01

    Reliable population DNA molecular markers are difficult to develop for molluscs, the reasons for which are largely unknown. Identical protocols for microsatellite marker development were implemented in three gastropods. Success rates were lower for Gibbula cineraria compared to Littorina littorea and L. saxatilis. Comparative genomic analysis of 47.2 kb of microsatellite-containing sequences (MCS) revealed a high incidence of cryptic repetitive DNA in their flanking regions. The majority of these were novel, and could be grouped into DNA families based upon sequence similarities. Significant inter-specific variation in abundance of cryptic repetitive DNA and DNA families was observed. Repbase scans show that a large proportion of cryptic repetitive DNA was identified as transposable elements (TEs). We argue that a large number of TEs and their transpositional activity may be linked to differential rates of DNA multiplication and recombination. This is likely to be an important factor explaining inter-specific variation in genome stability and hence microsatellite marker development success rates. Gastropods also differed significantly in the types of TE classes (autonomous vs non-autonomous) observed. We propose that dissimilar transpositional mechanisms differentiate the TE classes in terms of their propensity for transposition, fixation and/or silencing. Consequently, the phylogenetic conservation of non-autonomous TEs, such as CvA, suggests that dispersal of these elements may have behaved as microsatellite-inducing elements. Results seem to indicate that, compared to autonomous TEs, non-autonomous TEs may have a more active role in genome rearrangement processes. The implications of the findings for genomic rearrangement, stability and marker development are discussed. PMID:20424639

  20. Comparative genomics and functional analysis of rhamnose catabolic pathways and regulons in bacteria

    PubMed Central

    Rodionova, Irina A.; Li, Xiaoqing; Thiel, Vera; Stolyar, Sergey; Stanton, Krista; Fredrickson, James K.; Bryant, Donald A.; Osterman, Andrei L.; Best, Aaron A.; Rodionov, Dmitry A.

    2013-01-01

    L-rhamnose (L-Rha) is a deoxy-hexose sugar commonly found in nature. L-Rha catabolic pathways were previously characterized in various bacteria including Escherichia coli. Nevertheless, homology searches failed to recognize all the genes for the complete L-Rha utilization pathways in diverse microbial species involved in biomass decomposition. Moreover, the regulatory mechanisms of L-Rha catabolism have remained unclear in most species. A comparative genomics approach was used to reconstruct the L-Rha catabolic pathways and transcriptional regulons in the phyla Actinobacteria, Bacteroidetes, Chloroflexi, Firmicutes, Proteobacteria, and Thermotogae. The reconstructed pathways include multiple novel enzymes and transporters involved in the utilization of L-Rha and L-Rha-containing polymers. Large-scale regulon inference using bioinformatics revealed remarkable variations in transcriptional regulators for L-Rha utilization genes among bacteria. A novel bifunctional enzyme, L-rhamnulose-phosphate aldolase (RhaE) fused to L-lactaldehyde dehydrogenase (RhaW), which is not homologous to previously characterized L-Rha catabolic enzymes, was identified in diverse bacteria including Chloroflexi, Bacilli, and Alphaproteobacteria. By using in vitro biochemical assays we validated both enzymatic activities of the purified recombinant RhaEW proteins from Chloroflexus aurantiacus and Bacillus subtilis. Another novel enzyme of the L-Rha catabolism, L-lactaldehyde reductase (RhaZ), was identified in Gammaproteobacteria and experimentally validated by in vitro enzymatic assays using the recombinant protein from Salmonella typhimurium. C. aurantiacus induced transcription of the predicted L-Rha utilization genes when L-Rha was present in the growth medium and consumed L-Rha from the medium. This study provided comprehensive insights into L-Rha catabolism and its regulation in diverse Bacteria. PMID:24391637

  1. Comparative genomic and functional analysis reveal conservation of plant growth promoting traits in Paenibacillus polymyxa and its closely related species

    PubMed Central

    Xie, Jianbo; Shi, Haowen; Du, Zhenglin; Wang, Tianshu; Liu, Xiaomeng; Chen, Sanfeng

    2016-01-01

    Paenibacillus polymyxa has been widely studied as a model of plant-growth promoting rhizobacteria (PGPR). Here, the genome sequences of 9 P. polymyxa strains, together with 26 other sequenced Paenibacillus spp., were comparatively studied. Phylogenetic analysis of the concatenated 244 single-copy core genes suggests that the 9 P. polymyxa strains and 5 other Paenibacillus spp., isolated from diverse geographic regions and ecological niches, formed a closely related clade (here called the Poly-clade). Analysis of single nucleotide polymorphisms (SNPs) reveals local diversification of the 14 Poly-clade genomes. SNPs were not evenly distributed throughout the 14 genomes and the regions with high SNP density contain the genes related to secondary metabolism, including genes coding for polyketide. Recombination played an important role in the genetic diversity of this clade, although the rate of recombination was clearly lower than that of mutation. Some genes relevant to plant-growth promoting traits, i.e. phosphate solubilization and IAA production, are well conserved, while some genes relevant to nitrogen fixation and antibiotics synthesis have diversified within this Poly-clade. This study reveals that both P. polymyxa and its closely related species have plant growth promoting traits and they have great potential uses in agriculture and horticulture as PGPR. PMID:26856413
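
    The uneven SNP distribution described in this abstract is the kind of pattern a windowed density scan can expose. The following is only a minimal sketch, not the authors' pipeline: the function name, the 10 kb window, the 1.5x-mean hotspot cutoff and the toy coordinates are all hypothetical choices made for illustration.

```python
from collections import Counter

def snp_density(positions, genome_length, window=10_000):
    """Return SNPs per kb for each non-overlapping window of the genome."""
    n_windows = -(-genome_length // window)  # ceiling division
    counts = Counter(pos // window for pos in positions)
    densities = []
    for i in range(n_windows):
        start = i * window
        end = min(start + window, genome_length)  # last window may be shorter
        densities.append(counts.get(i, 0) / ((end - start) / 1000))
    return densities

# Hypothetical 0-based SNP coordinates on a 30 kb toy genome (not real data).
toy_snps = [120, 950, 10_500, 10_700, 10_900, 11_200, 25_000]
density = snp_density(toy_snps, genome_length=30_000)

# Flag windows whose density exceeds 1.5x the genome-wide mean (arbitrary cutoff).
mean_density = len(toy_snps) / 30  # SNPs per kb over the whole toy genome
hotspots = [i for i, d in enumerate(density) if d > 1.5 * mean_density]
```

    Windows flagged this way could then be intersected with gene annotations, for example to check whether they overlap secondary-metabolism loci.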

  2. Comparative analysis of Klebsiella pneumoniae genomes identifies a phospholipase D family protein as a novel virulence factor

    PubMed Central

    2014-01-01

    Background Klebsiella pneumoniae strains are pathogenic to animals and humans, in which they are both a frequent cause of nosocomial infections and a re-emerging cause of severe community-acquired infections. K. pneumoniae isolates of the capsular serotype K2 are among the most virulent. In order to identify novel putative virulence factors that may account for the severity of K2 infections, the genome sequence of the K2 reference strain Kp52.145 was determined and compared to two K1 and K2 strains of low virulence and to the reference strains MGH 78578 and NTUH-K2044. Results In addition to diverse functions related to host colonization and virulence encoded in genomic regions common to the four strains, four genomic islands specific for Kp52.145 were identified. These regions encoded genes for the synthesis of colibactin toxin, a putative cytotoxin outer membrane protein, secretion systems, nucleases and eukaryotic-like proteins. In addition, an insertion within a type VI secretion system locus included sel1 domain containing proteins and a phospholipase D family protein (PLD1). The pld1 mutant was avirulent in a pneumonia model in mouse. The pld1 mRNA was expressed in vivo and the pld1 gene was associated with K. pneumoniae isolates from severe infections. Analysis of lipid composition of a defective E. coli strain complemented with pld1 suggests an involvement of PLD1 in cardiolipin metabolism. Conclusions Determination of the complete genome of the K2 reference strain identified several genomic islands comprising putative elements of pathogenicity. The role of PLD1 in pathogenesis was demonstrated for the first time and suggests that lipid metabolism is a novel virulence mechanism of K. pneumoniae. PMID:24885329

  3. The Complete Chloroplast Genome Sequences of Three Veroniceae Species (Plantaginaceae): Comparative Analysis and Highly Divergent Regions

    PubMed Central

    Choi, Kyoung Su; Chung, Myong Gi; Park, SeonJoo

    2016-01-01

    Previous studies of Veronica and related genera were weakly supported by molecular and paraphyletic taxa. Here, we report the complete chloroplast genome sequence of Veronica nakaiana and the related species Veronica persica and Veronicastrum sibiricum. The chloroplast genome length of V. nakaiana, V. persica, and V. sibiricum ranged from 150,198 bp to 152,930 bp. A total of 112 genes comprising 79 protein coding genes, 29 tRNA genes, and 4 rRNA genes were observed in the three chloroplast genomes. The total number of SSRs was 48, 51, and 53 in V. nakaiana, V. persica, and V. sibiricum, respectively. Two SSRs (10 bp of AT and 12 bp of AATA) were observed in the same regions (rpoC2 and ndhD) in the three chloroplast genomes. A comparison of coding genes and non-coding regions between V. nakaiana and V. persica revealed divergent sites, with the greatest variation occurring in the petD-rpoA region. The complete chloroplast genome sequence information regarding the three Veroniceae will be helpful for elucidating Veroniceae phylogenetic relationships. PMID:27047524
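
    SSR counts like those reported above are typically obtained by scanning a sequence for perfect tandem repeats. The sketch below shows one simple regex-based way to do that. It is an illustration under stated assumptions, not the tool used in the study: the minimum repeat thresholds in MIN_REPEATS are hypothetical, chosen only so that a 10 bp AT run and a 12 bp AATA run qualify, and the test sequence is made up.

```python
import re

# Hypothetical minimum repeat counts per motif length (assumption, not from the paper).
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}

def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, repeat_count) tuples for perfect tandem repeats."""
    thresholds = MIN_REPEATS if min_repeats is None else min_repeats
    seq = seq.upper()
    hits = []
    for k, n in thresholds.items():
        # A k-mer followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "ATAT" is really "AT").
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)

# Made-up sequence carrying a 10 bp AT repeat and a 12 bp AATA repeat.
toy_seq = "GGC" + "AT" * 5 + "CCG" + "AATA" * 3 + "GG"
ssrs = find_ssrs(toy_seq)
```

    Only perfect repeats are reported here; compound or interrupted SSRs would need additional logic.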

  4. Genome‐scale diversity and niche adaptation analysis of Lactococcus lactis by comparative genome hybridization using multi‐strain arrays

    PubMed Central

    Siezen, Roland J.; Bayjanov, Jumamurat R.; Felis, Giovanna E.; van der Sijde, Marijke R.; Starrenburg, Marjo; Molenaar, Douwe; Wels, Michiel; van Hijum, Sacha A. F. T.; van Hylckama Vlieg, Johan E. T.

    2011-01-01

    Summary Lactococcus lactis produces lactic acid and is widely used in the manufacturing of various fermented dairy products. However, the species is also frequently isolated from non‐dairy niches, such as fermented plant material. Recently, these non‐dairy strains have gained increasing interest, as they have been described to possess flavour‐forming activities that are rarely found in dairy isolates and have diverse metabolic properties. We performed an extensive whole‐genome diversity analysis on 39 L. lactis strains, isolated from dairy and plant sources. Comparative genome hybridization analysis with multi‐strain microarrays was used to assess presence or absence of genes and gene clusters in these strains, relative to all L. lactis sequences in public databases, whereby chromosomal and plasmid‐encoded genes were computationally analysed separately. Nearly 3900 chromosomal orthologous groups (chrOGs) were defined on basis of four sequenced chromosomes of L. lactis strains (IL1403, KF147, SK11, MG1363). Of these, 1268 chrOGs are present in at least 35 strains and represent the presently known core genome of L. lactis, and 72 chrOGs appear to be unique for L. lactis. Nearly 600 and 400 chrOGs were found to be specific for either the subspecies lactis or subspecies cremoris respectively. Strain variability was found in presence or absence of gene clusters related to growth on plant substrates, such as genes involved in the consumption of arabinose, xylan, α‐galactosides and galacturonate. Further niche‐specific differences were found in gene clusters for exopolysaccharides biosynthesis, stress response (iron transport, osmotolerance) and bacterial defence mechanisms (nisin biosynthesis). Strain variability of functions encoded on known plasmids included proteolysis, lactose fermentation, citrate uptake, metal ion resistance and exopolysaccharides biosynthesis. The present study supports the view of L. lactis as a species with a very flexible

  5. VISTA - computational tools for comparative genomics

    SciTech Connect

    Frazer, Kelly A.; Pachter, Lior; Poliakov, Alexander; Rubin, Edward M.; Dubchak, Inna

    2004-01-01

    Comparison of DNA sequences from different species is a fundamental method for identifying functional elements in genomes. Here we describe the VISTA family of tools created to assist biologists in carrying out this task. Our first VISTA server at http://www-gsd.lbl.gov/VISTA/ was launched in the summer of 2000 and was designed to align long genomic sequences and visualize these alignments with associated functional annotations. Currently the VISTA site includes multiple comparative genomics tools and provides users with rich capabilities to browse pre-computed whole-genome alignments of large vertebrate genomes and other groups of organisms with VISTA Browser, submit their own sequences of interest to several VISTA servers for various types of comparative analysis, and obtain detailed comparative analysis results for a set of cardiovascular genes. We illustrate capabilities of the VISTA site by the analysis of a 180 kilobase (kb) interval on human chromosome 5 that encodes the kinesin family member 3A (KIF3A) protein.

  6. VISTA: computational tools for comparative genomics.

    PubMed

    Frazer, Kelly A; Pachter, Lior; Poliakov, Alexander; Rubin, Edward M; Dubchak, Inna

    2004-07-01

    Comparison of DNA sequences from different species is a fundamental method for identifying functional elements in genomes. Here, we describe the VISTA family of tools created to assist biologists in carrying out this task. Our first VISTA server at http://www-gsd.lbl.gov/vista/ was launched in the summer of 2000 and was designed to align long genomic sequences and visualize these alignments with associated functional annotations. Currently the VISTA site includes multiple comparative genomics tools and provides users with rich capabilities to browse pre-computed whole-genome alignments of large vertebrate genomes and other groups of organisms with VISTA Browser, to submit their own sequences of interest to several VISTA servers for various types of comparative analysis and to obtain detailed comparative analysis results for a set of cardiovascular genes. We illustrate capabilities of the VISTA site by the analysis of a 180 kb interval on human chromosome 5 that encodes for the kinesin family member 3A (KIF3A) protein.

  7. Genome sequencing and comparative analysis of Saccharomyces cerevisiae strain YJM789.

    PubMed

    Wei, Wu; McCusker, John H; Hyman, Richard W; Jones, Ted; Ning, Ye; Cao, Zhiwei; Gu, Zhenglong; Bruno, Dan; Miranda, Molly; Nguyen, Michelle; Wilhelmy, Julie; Komp, Caridad; Tamse, Raquel; Wang, Xiaojing; Jia, Peilin; Luedi, Philippe; Oefner, Peter J; David, Lior; Dietrich, Fred S; Li, Yixue; Davis, Ronald W; Steinmetz, Lars M

    2007-07-31

    We sequenced the genome of Saccharomyces cerevisiae strain YJM789, which was derived from a yeast isolated from the lung of an AIDS patient with pneumonia. The strain is used for studies of fungal infections and quantitative genetics because of its extensive phenotypic differences to the laboratory reference strain, including growth at high temperature and deadly virulence in mouse models. Here we show that the approximately 12-Mb genome of YJM789 contains approximately 60,000 SNPs and approximately 6,000 indels with respect to the reference S288c genome, leading to protein polymorphisms with a few known cases of phenotypic changes. Several ORFs are found to be unique to YJM789, some of which might have been acquired through horizontal transfer. Localized regions of high polymorphism density are scattered over the genome, in some cases spanning multiple ORFs and in others concentrated within single genes. The sequence of YJM789 contains clues to pathogenicity and spurs the development of more powerful approaches to dissecting the genetic basis of complex hereditary traits.

  8. Genome sequencing and comparative analysis of Saccharomyces cerevisiae strain YJM789

    PubMed Central

    Wei, Wu; McCusker, John H.; Hyman, Richard W.; Jones, Ted; Ning, Ye; Cao, Zhiwei; Gu, Zhenglong; Bruno, Dan; Miranda, Molly; Nguyen, Michelle; Wilhelmy, Julie; Komp, Caridad; Tamse, Raquel; Wang, Xiaojing; Jia, Peilin; Luedi, Philippe; Oefner, Peter J.; David, Lior; Dietrich, Fred S.; Li, Yixue; Davis, Ronald W.; Steinmetz, Lars M.

    2007-01-01

    We sequenced the genome of Saccharomyces cerevisiae strain YJM789, which was derived from a yeast isolated from the lung of an AIDS patient with pneumonia. The strain is used for studies of fungal infections and quantitative genetics because of its extensive phenotypic differences to the laboratory reference strain, including growth at high temperature and deadly virulence in mouse models. Here we show that the ≈12-Mb genome of YJM789 contains ≈60,000 SNPs and ≈6,000 indels with respect to the reference S288c genome, leading to protein polymorphisms with a few known cases of phenotypic changes. Several ORFs are found to be unique to YJM789, some of which might have been acquired through horizontal transfer. Localized regions of high polymorphism density are scattered over the genome, in some cases spanning multiple ORFs and in others concentrated within single genes. The sequence of YJM789 contains clues to pathogenicity and spurs the development of more powerful approaches to dissecting the genetic basis of complex hereditary traits. PMID:17652520

  9. Comparative analysis highlights variable genome content of wheat rusts and divergence of the mating loci.

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Wheat is grown around the world and has been plagued by three rust fungi for centuries. Leaf rust, stripe rust, and stem rust each cause significant damage and can adapt quickly to overcome resistance that is present in wheat cultivars. Using advanced DNA sequencing technology, the genomes of leaf ...

  10. Dissecting the fungal biology of Bipolaris papendorfii: from phylogenetic to comparative genomic analysis

    PubMed Central

    Kuan, Chee Sian; Yew, Su Mei; Toh, Yue Fen; Chan, Chai Ling; Ngeow, Yun Fong; Lee, Kok Wei; Na, Shiang Ling; Yee, Wai-Yan; Hoh, Chee-Choong; Ng, Kee Peng

    2015-01-01

    Bipolaris papendorfii has been reported as a fungal plant pathogen that rarely causes opportunistic infection in humans. Secondary metabolites isolated from this fungus possess medicinal and anticancer properties. However, its genetic fundamentals and basic biology are largely unknown. In this study, we report the first draft genome sequence of B. papendorfii UM 226 isolated from the skin scraping of a patient. The assembled 33.4 Mb genome encodes 11,015 putative coding DNA sequences, of which 2.49% are predicted transposable elements. Multilocus phylogenetic and phylogenomic analyses showed B. papendorfii UM 226 clustering with Curvularia species, apart from other plant pathogenic Bipolaris species. Its genomic features suggest that it is a heterothallic fungus with a putative unique gene encoding the LysM-containing protein which might be involved in fungal virulence on host plants, as well as a wide array of enzymes involved in carbohydrate metabolism, degradation of polysaccharides and lignin in the plant cell wall, secondary metabolite biosynthesis (including dimethylallyl tryptophan synthase, non-ribosomal peptide synthetase, polyketide synthase), the terpenoid pathway and caffeine metabolism. This first genomic characterization of B. papendorfii provides the basis for further studies on its biology, pathogenicity and medicinal potential. PMID:25922537

  11. Comparative analysis between homoeologous genome segments of Brassica napus and its progenitor species reveals extensive sequence-level divergence.

    PubMed

    Cheung, Foo; Trick, Martin; Drou, Nizar; Lim, Yong Pyo; Park, Jee-Young; Kwon, Soo-Jin; Kim, Jin-A; Scott, Rod; Pires, J Chris; Paterson, Andrew H; Town, Chris; Bancroft, Ian

    2009-07-01

    Homoeologous regions of Brassica genomes were analyzed at the sequence level. These represent segments of the Brassica A genome as found in Brassica rapa and Brassica napus and the corresponding segments of the Brassica C genome as found in Brassica oleracea and B. napus. Analysis of synonymous base substitution rates within modeled genes revealed a relatively broad range of times (0.12 to 1.37 million years ago) since the divergence of orthologous genome segments as represented in B. napus and the diploid species. Similar, and consistent, ranges were also identified for single nucleotide polymorphism and insertion-deletion variation. Genes conserved across the Brassica genomes and the homoeologous segments of the genome of Arabidopsis thaliana showed almost perfect collinearity. Numerous examples of apparent transduplication of gene fragments, as previously reported in B. oleracea, were observed in B. rapa and B. napus, indicating that this phenomenon is widespread in Brassica species. In the majority of the regions studied, the C genome segments were expanded in size relative to their A genome counterparts. The considerable variation that we observed, even between the different versions of the same Brassica genome, for gene fragments and annotated putative genes suggests that the concept of the pan-genome might be particularly appropriate when considering Brassica genomes.
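
    Divergence times such as those quoted above are commonly derived from synonymous substitutions per site (Ks) with the molecular-clock relation T = Ks / (2r), where r is the synonymous substitution rate per site per year and the factor 2 accounts for both lineages accumulating changes. The sketch below applies that relation. The abstract does not state which rate was used, so the default of 1.5e-8 is an assumption, and the Ks value is hypothetical.

```python
def divergence_time_mya(ks, rate_per_site_per_year=1.5e-8):
    """Estimate divergence time in million years from Ks, using T = Ks / (2r).

    The default rate is an assumed value for illustration, not one taken
    from the study summarized above.
    """
    if ks < 0 or rate_per_site_per_year <= 0:
        raise ValueError("ks must be non-negative and the rate positive")
    years = ks / (2 * rate_per_site_per_year)
    return years / 1e6

# Hypothetical Ks between a B. napus gene and its diploid ortholog.
example_time = divergence_time_mya(0.03)  # about 1 million years under the assumed rate
```

    Because the estimate scales inversely with r, any uncertainty in the assumed rate carries over directly to the inferred time.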

  12. Comparative Genome Analysis of Two Isolates of the Fish Pathogen Piscirickettsia salmonis from Different Hosts Reveals Major Differences in Virulence-Associated Secretion Systems

    PubMed Central

    Bohle, Harry; Henríquez, Patricio; Grothusen, Horst; Navas, Esteban; Sandoval, Alvaro; Bustamante, Fernando; Bustos, Patricio

    2014-01-01

    Outbreaks caused by Piscirickettsia salmonis are one of the major threats to the sustainability of the Chilean salmon industry. We report here the annotated draft genomes of two P. salmonis isolates recovered from different salmonid species. A comparative analysis showed that the number of virulence-associated secretion systems constitutes a main genomic difference. PMID:25523762

  13. Genome-Wide Phylogenetic Comparative Analysis of Plant Transcriptional Regulation: A Timeline of Loss, Gain, Expansion, and Correlation with Complexity

    PubMed Central

    Lang, Daniel; Weiche, Benjamin; Timmerhaus, Gerrit; Richardt, Sandra; Riaño-Pachón, Diego M.; Corrêa, Luiz G. G.; Reski, Ralf; Mueller-Roeber, Bernd; Rensing, Stefan A.

    2010-01-01

    Evolutionary retention of duplicated genes encoding transcription-associated proteins (TAPs, comprising transcription factors and other transcriptional regulators) has been hypothesized to be positively correlated with increasing morphological complexity and paleopolyploidizations, especially within the plant kingdom. Here, we present the most comprehensive set of classification rules for TAPs and its application for genome-wide analyses of plants and algae. Using a dated species tree and phylogenetic comparative (PC) analyses, we define the timeline of TAP loss, gain, and expansion among Viridiplantae and find that two major bursts of gain/expansion occurred, coinciding with the water-to-land transition and the radiation of flowering plants. For the first time, we provide PC proof for the long-standing hypothesis that TAPs are major driving forces behind the evolution of morphological complexity, the latter in Plantae being shaped significantly by polyploidization and subsequent biased paleolog retention. Principal component analysis incorporating the number of TAPs per genome provides an alternate and significant proxy for complexity, ideally suited for PC genomics. Our work lays the ground for further interrogation of the shaping of gene regulatory networks underlying the evolution of organism complexity. PMID:20644220

  14. Genome-wide phylogenetic comparative analysis of plant transcriptional regulation: a timeline of loss, gain, expansion, and correlation with complexity.

    PubMed

    Lang, Daniel; Weiche, Benjamin; Timmerhaus, Gerrit; Richardt, Sandra; Riaño-Pachón, Diego M; Corrêa, Luiz G G; Reski, Ralf; Mueller-Roeber, Bernd; Rensing, Stefan A

    2010-07-19

    Evolutionary retention of duplicated genes encoding transcription-associated proteins (TAPs, comprising transcription factors and other transcriptional regulators) has been hypothesized to be positively correlated with increasing morphological complexity and paleopolyploidizations, especially within the plant kingdom. Here, we present the most comprehensive set of classification rules for TAPs and its application for genome-wide analyses of plants and algae. Using a dated species tree and phylogenetic comparative (PC) analyses, we define the timeline of TAP loss, gain, and expansion among Viridiplantae and find that two major bursts of gain/expansion occurred, coinciding with the water-to-land transition and the radiation of flowering plants. For the first time, we provide PC proof for the long-standing hypothesis that TAPs are major driving forces behind the evolution of morphological complexity, the latter in Plantae being shaped significantly by polyploidization and subsequent biased paleolog retention. Principal component analysis incorporating the number of TAPs per genome provides an alternate and significant proxy for complexity, ideally suited for PC genomics. Our work lays the ground for further interrogation of the shaping of gene regulatory networks underlying the evolution of organism complexity.

  15. Comparative Analysis of Subtyping Methods against a Whole-Genome-Sequencing Standard for Salmonella enterica Serotype Enteritidis

    PubMed Central

    Shariat, Nikki; Driebe, Elizabeth M.; Roe, Chandler C.; Tolar, Beth; Trees, Eija; Keim, Paul; Zhang, Wei; Dudley, Edward G.; Fields, Patricia I.; Engelthaler, David M.

    2014-01-01

    A retrospective investigation was performed to evaluate whole-genome sequencing as a benchmark for comparing molecular subtyping methods for Salmonella enterica serotype Enteritidis and survey the population structure of commonly encountered S. enterica serotype Enteritidis outbreak isolates in the United States. A total of 52 S. enterica serotype Enteritidis isolates representing 16 major outbreaks and three sporadic cases collected between 2001 and 2012 were sequenced and subjected to subtyping by four different methods: (i) whole-genome single-nucleotide-polymorphism typing (WGST), (ii) multiple-locus variable-number tandem-repeat (VNTR) analysis (MLVA), (iii) clustered regularly interspaced short palindromic repeats combined with multi-virulence-locus sequence typing (CRISPR-MVLST), and (iv) pulsed-field gel electrophoresis (PFGE). WGST resolved all outbreak clusters and provided useful robust phylogenetic inference results with high epidemiological correlation. While both MLVA and CRISPR-MVLST yielded higher discriminatory power than PFGE, MLVA outperformed the other methods in delineating outbreak clusters whereas CRISPR-MVLST showed the potential to trace major lineages and ecological origins of S. enterica serotype Enteritidis. Our results suggested that whole-genome sequencing makes a viable platform for the evaluation and benchmarking of molecular subtyping methods. PMID:25378576
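
    Discriminatory power of a typing method is often summarized with the Hunter-Gaston index (Simpson's index of diversity), D = 1 - sum(n_j(n_j - 1)) / (N(N - 1)), where n_j is the number of isolates assigned to subtype j and N is the total number of isolates. The abstract does not say which statistic the authors used, so the sketch below is one plausible way to compare methods, and the subtype labels are hypothetical.

```python
from collections import Counter

def discriminatory_index(subtypes):
    """Hunter-Gaston discriminatory index for a list of subtype labels."""
    n = len(subtypes)
    if n < 2:
        raise ValueError("at least two isolates are required")
    counts = Counter(subtypes).values()
    # Probability that two isolates drawn without replacement share a subtype.
    same_type = sum(c * (c - 1) for c in counts) / (n * (n - 1))
    return 1 - same_type

# Hypothetical subtype calls for six isolates under two typing methods.
coarse_method = ["A", "A", "A", "B", "B", "C"]
fine_method = ["A1", "A2", "A3", "B1", "B2", "C1"]
d_coarse = discriminatory_index(coarse_method)  # about 0.73
d_fine = discriminatory_index(fine_method)      # every isolate distinct
```

    A higher index only means isolates are split more finely; whether the splits agree with outbreak clusters is a separate question that needs epidemiological data.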

  16. Comparative genomics analysis of NtcA regulons in cyanobacteria: regulation of nitrogen assimilation and its coupling to photosynthesis

    PubMed Central

    Su, Zhengchang; Olman, Victor; Mao, Fenglou; Xu, Ying

    2005-01-01

    We have developed a new method for prediction of cis-regulatory binding sites and applied it to predicting NtcA regulated genes in cyanobacteria. The algorithm rigorously utilizes concurrence information of multiple binding sites in the upstream region of a gene and that in the upstream regions of its orthologues in related genomes. A probabilistic model was developed for the evaluation of prediction reliability so that the prediction false positive rate could be well controlled. Using this method, we have predicted multiple new members of the NtcA regulons in nine sequenced cyanobacterial genomes, and showed that the false positive rates of the predictions have been reduced by an average of 40-fold compared to the conventional methods. A detailed analysis of the predictions in each genome showed that a significant portion of our predictions are consistent with previously published results about individual genes. Intriguingly, NtcA promoters are found for many genes involved in various stages of photosynthesis. Although photosynthesis is known to be tightly coordinated with nitrogen assimilation, very little is known about the underlying mechanism. We postulate for the first time that these genes serve as the regulatory points to orchestrate these two important processes in a cyanobacterial cell. PMID:16157864
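
    The core idea in this abstract, checking whether a candidate site also occurs upstream of orthologous genes, can be illustrated with a much simpler stand-in than the authors' probabilistic model. The sketch below scans for the commonly cited NtcA consensus GTA-N8-TAC with a regular expression and reports the fraction of genomes whose upstream region carries a match. The function names, genome labels and sequences are hypothetical, and exact consensus matching is a deliberate simplification.

```python
import re

# Lookahead so that overlapping consensus matches are all reported.
NTCA_CONSENSUS = re.compile(r"(?=(GTA[ACGT]{8}TAC))")

def site_positions(upstream):
    """Return 0-based start positions of GTA-N8-TAC matches in one upstream region."""
    return [m.start() for m in NTCA_CONSENSUS.finditer(upstream.upper())]

def conservation(upstream_by_genome):
    """Return (fraction of genomes with a site, list of those genomes)."""
    with_site = [g for g, s in upstream_by_genome.items() if site_positions(s)]
    return len(with_site) / len(upstream_by_genome), with_site

# Made-up upstream regions of one orthologous gene in three genomes.
toy_upstream = {
    "genome_a": "TTGAC" + "GTA" + "ACGTTCGA" + "TAC" + "TTTA",
    "genome_b": "CC" + "GTA" + "TTTTGGGG" + "TAC" + "AA",
    "genome_c": "ACCCTTTGGGAAACCCTTTG",  # no consensus match
}
fraction, genomes_with_site = conservation(toy_upstream)
```

    A real predictor would score degenerate sites and estimate how often such co-occurrence arises by chance, which is what the probabilistic model in the abstract addresses.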

  17. Comparative analysis of genome-wide Mlo gene family in Cajanus cajan and Phaseolus vulgaris.

    PubMed

    Deshmukh, Reena; Singh, V K; Singh, B D

    2016-04-01

    The Mlo gene was discovered in barley because the mutant 'mlo' allele conferred broad-spectrum, non-race-specific resistance to powdery mildew caused by Blumeria graminis f. sp. hordei. The Mlo genes also play important roles in growth and development of plants, and in responses to biotic and abiotic stresses. The Mlo gene family has been characterized in several crop species, but only a single legume species, soybean (Glycine max L.), has been investigated so far. The present report describes in silico identification of 18 CcMlo and 20 PvMlo genes in the important legume crops Cajanus cajan (L.) Millsp. and Phaseolus vulgaris L., respectively. In silico analysis of gene organization, protein properties and conserved domains revealed that the C. cajan and P. vulgaris Mlo gene paralogs are more divergent from each other than from their orthologous pairs. The comparative phylogenetic analysis classified CcMlo and PvMlo genes into three major clades. A comparative analysis of CcMlo and PvMlo proteins with the G. max Mlo proteins indicated close association of one CcMlo, one PvMlo with two GmMlo genes, indicating that there was no further expansion of the Mlo gene family after the separation of these species. Thus, most of the diploid species of eudicots might be expected to contain 15-20 Mlo genes. The genes CcMlo12 and 14, and PvMlo11 and 12 are predicted to participate in powdery mildew resistance. If this prediction were verified, these genes could be targeted by TILLING or CRISPR to isolate powdery mildew resistant mutants. PMID:26961357

  19. Comparative genomics of biotechnologically important yeasts

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Ascomycete yeasts are metabolically diverse, with great potential for biotechnology. Here, we report the comparative genome analysis of 29 taxonomically and biotechnologically important yeasts, including 16 newly sequenced. We identify a genetic code change, CUG-Ala, in Pachysolen tannophilus in the...

  20. The complete mitochondrial genome of Pseudoterranova azarasi and comparative analysis with other anisakid nematodes.

    PubMed

    Liu, Shan-Shan; Liu, Guo-Hua; Zhu, Xing-Quan; Weng, Ya-Biao

    2015-07-01

    Anisakiasis/anisakidosis caused by anisakid nematodes is an emerging infectious disease that can cause a wide range of clinical syndromes and is difficult to diagnose and treat in humans. In spite of their significance as pathogens, the systematics, genetics, epidemiology and biology of these parasites remain poorly understood. In the present study, we sequenced the complete mitochondrial (mt) genome of Pseudoterranova azarasi, which is one of the most important zoonotic anisakid parasites. The circular mt genome is 13,954 bp in size and encodes 36 genes, including 12 protein-coding, 2 ribosomal RNA and 22 transfer RNA genes. The mt gene order of P. azarasi is the same as those of Ascaris spp. (Ascarididae), Toxocara spp. (Toxocaridae) and Anisakis simplex (Anisakidae), but distinct from those of Ascaridia spp. (Ascaridiidae) and Cucullanus robustus (Cucullanidae). Phylogenetic analyses based on concatenated amino acid sequences of 12 protein-coding genes by Bayesian inference (BI) showed that Pseudoterranova was more closely related to Anisakis than to Contracaecum, with strong posterior probability support. This mt genome provides novel genetic markers for exploring cryptic/sibling species and host affiliations, and should have implications for the diagnosis, prevention and control of anisakidosis in humans. PMID:25998795

  1. Comparative Analysis of the Complete Genome Sequence of the California MSW Strain of Myxoma Virus Reveals Potential Host Adaptations

    PubMed Central

    Kerr, Peter J.; Rogers, Matthew B.; Fitch, Adam; DePasse, Jay V.; Cattadori, Isabella M.; Hudson, Peter J.; Tscharke, David C.; Holmes, Edward C.

    2013-01-01

    Myxomatosis is a rapidly lethal disease of European rabbits that is caused by myxoma virus (MYXV). The introduction of a South American strain of MYXV into the European rabbit population of Australia is the classic case of host-pathogen coevolution following cross-species transmission. The most virulent strains of MYXV for European rabbits are the Californian viruses, found in the Pacific states of the United States and the Baja Peninsula, Mexico. The natural host of Californian MYXV is the brush rabbit, Sylvilagus bachmani. We determined the complete sequence of the MSW strain of Californian MYXV and performed a comparative analysis with other MYXV genomes. The MSW genome is larger than that of the South American Lausanne (type) strain of MYXV due to an expansion of the terminal inverted repeats (TIRs) of the genome, with duplication of the M156R, M154L, M153R, M152R, and M151R genes and part of the M150R gene from the right-hand (RH) end of the genome at the left-hand (LH) TIR. Despite the extreme virulence of MSW, no novel genes were identified; five genes were disrupted by multiple indels or mutations to the ATG start codon, including two genes, M008.1L/R and M152R, with major virulence functions in European rabbits, and a sixth gene, M000.5L/R, was absent. The loss of these gene functions suggests that S. bachmani is a relatively recent host for MYXV and that duplication of virulence genes in the TIRs, gene loss, or sequence variation in other genes can compensate for the loss of M008.1L/R and M152R in infections of European rabbits. PMID:23986601

  2. The Xenopus alcohol dehydrogenase gene family: characterization and comparative analysis incorporating amphibian and reptilian genomes

    PubMed Central

    2014-01-01

    Background The alcohol dehydrogenase (ADH) gene family uniquely illustrates the concept of enzymogenesis. In vertebrates, tandem duplications gave rise to a multiplicity of forms that have been classified in eight enzyme classes, according to primary structure and function. Some of these classes appear to be exclusive of particular organisms, such as the frog ADH8, a unique NADP+-dependent ADH enzyme. This work describes the ADH system of Xenopus, as a model organism, and explores the first amphibian and reptilian genomes released in order to contribute towards a better knowledge of the vertebrate ADH gene family. Results Xenopus cDNA and genomic sequences along with expressed sequence tags (ESTs) were used in phylogenetic analyses and structure-function correlations of amphibian ADHs. Novel ADH sequences identified in the genomes of Anolis carolinensis (anole lizard) and Pelodiscus sinensis (turtle) were also included in these studies. Tissue and stage-specific libraries provided expression data, which has been supported by mRNA detection in Xenopus laevis tissues and regulatory elements in promoter regions. Exon-intron boundaries, position and orientation of ADH genes were deduced from the amphibian and reptilian genome assemblies, thus revealing syntenic regions and gene rearrangements with respect to the human genome. Our results reveal the high complexity of the ADH system in amphibians, with eleven genes, coding for seven enzyme classes in Xenopus tropicalis. Frogs possess the amphibian-specific ADH8 and the novel ADH1-derived forms ADH9 and ADH10. In addition, they exhibit ADH1, ADH2, ADH3 and ADH7, also present in reptiles and birds. Class-specific signatures have been assigned to ADH7, and ancestral ADH2 is predicted to be a mixed-class as the ostrich enzyme, structurally close to mammalian ADH2 but with class-I kinetic properties. Remarkably, many ADH1 and ADH7 forms are observed in the lizard, probably due to lineage-specific duplications. ADH4 is not

  3. Comparative genomic hybridization with single cells after whole genome amplification

    SciTech Connect

    Haddad, B.R.; Baldini, A.; Hughes, M.R.

    1994-09-01

    Conventional karyotype analysis is the ideal way to diagnose chromosomal imbalances. However, it requires cell culture and chromosome preparation. There are instances where a very small number of cells are available for cytogenetic evaluation and chromosomes cannot be obtained. Comparative genomic hybridization (CGH) is a novel molecular cytogenetic technique that provides information about genetic imbalances affecting the genome. The power of this technique lies in its ability to detect genetic imbalances using total genomic DNA. We have previously demonstrated the feasibility of whole genome amplification from single cells for subsequent analysis of multiple genetic loci by PCR. In the present work, we combine whole genome amplification with CGH to detect chromosomal imbalances from small numbers of cells. Both cytogenetically normal and abnormal cells were individually picked by micromanipulation and subjected to whole genome amplification using random oligonucleotide primers. Amplified test and control DNA were differentially labeled by incorporation of digoxigenin or biotin, mixed together and hybridized to normal male metaphase spreads. Hybridization was detected with two fluorochromes, rhodamine-anti-digoxigenin and FITC-avidin. The ratio of intensities of the two fluorochromes along the target chromosomes was analyzed using locally developed computer imaging software. Using the combination of whole genome amplification and CGH, we were able to detect different chromosomal aneuploidies from 30, 20, and 10 cells. This approach can also be applied to the analysis of fetal cells sorted from maternal circulation, or to tumor cells obtained from needle biopsies or from different body fluids and effusions. Finally, its successful application to single cells will have a great impact on preimplantation diagnosis.
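
    The ratio analysis described in this abstract can be illustrated with a minimal Python sketch. The authors' imaging software is not described, so everything here is an assumption for illustration: the function names, the made-up intensity values, and the gain/loss cut-offs of 1.25 and 0.75.

```python
from statistics import mean


def ratio_profile(test, control):
    """Per-position test/control fluorescence ratio along one chromosome."""
    if len(test) != len(control):
        raise ValueError("intensity profiles must have the same length")
    # Assumes control intensities are non-zero (background already handled).
    return [t / c for t, c in zip(test, control)]


def call_imbalance(ratios, gain=1.25, loss=0.75):
    """Classify a chromosome by its mean ratio.

    The thresholds are illustrative assumptions, not values from the abstract.
    """
    m = mean(ratios)
    if m >= gain:
        return "gain"
    if m <= loss:
        return "loss"
    return "balanced"


# Made-up intensities, for illustration only.
test_signal = [150.0, 160.0, 145.0, 155.0]
control_signal = [100.0, 104.0, 98.0, 101.0]
ratios = ratio_profile(test_signal, control_signal)
status = call_imbalance(ratios)
```

    A real analysis would work on many positions per chromosome and would normalize for overall labeling efficiency before applying any threshold.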

  4. Comparative Genome Analysis of Trichophyton rubrum and Related Dermatophytes Reveals Candidate Genes Involved in Infection

    PubMed Central

    Martinez, Diego A.; Oliver, Brian G.; Gräser, Yvonne; Goldberg, Jonathan M.; Li, Wenjun; Martinez-Rossi, Nilce M.; Monod, Michel; Shelest, Ekaterina; Barton, Richard C.; Birch, Elizabeth; Brakhage, Axel A.; Chen, Zehua; Gurr, Sarah J.; Heiman, David; Heitman, Joseph; Kosti, Idit; Rossi, Antonio; Saif, Sakina; Samalova, Marketa; Saunders, Charles W.; Shea, Terrance; Summerbell, Richard C.; Xu, Jun; Young, Sarah; Zeng, Qiandong; Birren, Bruce W.; Cuomo, Christina A.; White, Theodore C.

    2012-01-01

    The major cause of athlete’s foot is Trichophyton rubrum, a dermatophyte or fungal pathogen of human skin. To facilitate molecular analyses of the dermatophytes, we sequenced T. rubrum and four related species, Trichophyton tonsurans, Trichophyton equinum, Microsporum canis, and Microsporum gypseum. These species differ in host range, mating, and disease progression. The dermatophyte genomes are highly colinear yet contain gene family expansions not found in other human-associated fungi. Dermatophyte genomes are enriched for gene families containing the LysM domain, which binds chitin and potentially related carbohydrates. These LysM domains differ in sequence from those in other species in regions of the peptide that could affect substrate binding. The dermatophytes also encode novel sets of fungus-specific kinases with unknown specificity, including nonfunctional pseudokinases, which may inhibit phosphorylation by competing for kinase sites within substrates, acting as allosteric effectors, or acting as scaffolds for signaling. The dermatophytes are also enriched for a large number of enzymes that synthesize secondary metabolites, including dermatophyte-specific genes that could synthesize novel compounds. Finally, dermatophytes are enriched in several classes of proteases that are necessary for fungal growth and nutrient acquisition on keratinized tissues. Despite differences in mating ability, genes involved in mating and meiosis are conserved across species, suggesting the possibility of cryptic mating in species where it has not been previously detected. These genome analyses identify gene families that are important to our understanding of how dermatophytes cause chronic infections, how they interact with epithelial cells, and how they respond to the host immune response. PMID:22951933

  5. A comparative analysis of methods for predicting clinical outcomes using high-dimensional genomic datasets

    PubMed Central

    Jiang, Xia; Cai, Binghuang; Xue, Diyang; Lu, Xinghua; Cooper, Gregory F; Neapolitan, Richard E

    2014-01-01

    Objective The objective of this investigation is to evaluate binary prediction methods for predicting disease status using high-dimensional genomic data. The central hypothesis is that the Bayesian network (BN)-based method called efficient Bayesian multivariate classifier (EBMC) will do well at this task because EBMC builds on BN-based methods that have performed well at learning epistatic interactions. Method We evaluate how well eight methods perform binary prediction using high-dimensional discrete genomic datasets containing epistatic interactions. The methods are as follows: naive Bayes (NB), model averaging NB (MANB), feature selection NB (FSNB), EBMC, logistic regression (LR), support vector machines (SVM), Lasso, and extreme learning machines (ELM). We use a hundred 1000-single nucleotide polymorphism (SNP) simulated datasets, ten 10 000-SNP datasets, six semi-synthetic sets, and two real genome-wide association studies (GWAS) datasets in our evaluation. Results In fivefold cross-validation studies, the SVM performed best on the 1000-SNP dataset, while the BN-based methods performed best on the other datasets, with EBMC exhibiting the best overall performance. In-sample testing indicates that LR, SVM, Lasso, ELM, and NB tend to overfit the data. Discussion EBMC performed better than NB when there are several strong predictors, whereas NB performed better when there are many weak predictors. Furthermore, for all BN-based methods, prediction capability did not degrade as the dimension increased. Conclusions Our results support the hypothesis that EBMC performs well at binary outcome prediction using high-dimensional discrete datasets containing epistatic-like interactions. Future research using more GWAS datasets is needed to further investigate the potential of EBMC. PMID:24737607
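
    The fivefold cross-validation mentioned in this abstract can be sketched in plain Python. This is a sketch under stated assumptions, not the authors' pipeline: all function names are hypothetical, and the majority-class "model" is only a placeholder, not one of the eight methods evaluated.

```python
import random
from collections import Counter


def k_fold_indices(n, k=5, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    if n < k:
        raise ValueError("need at least k samples")
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_majority(X, y):
    """Placeholder model (an assumption): remember the most common label."""
    return Counter(y).most_common(1)[0][0]


def predict_majority(model, X):
    """Predict the remembered label for every sample."""
    return [model] * len(X)


def cross_val_accuracy(X, y, fit, predict, k=5):
    """Mean accuracy on held-out folds; each sample is held out exactly once."""
    scores = []
    for held_out in k_fold_indices(len(y), k):
        held = set(held_out)
        train = [i for i in range(len(y)) if i not in held]
        model = fit([X[i] for i in train], [y[i] for i in train])
        preds = predict(model, [X[i] for i in held_out])
        correct = sum(p == y[i] for p, i in zip(preds, held_out))
        scores.append(correct / len(held_out))
    return sum(scores) / k
```

    In-sample testing, by contrast, would fit and score on the same samples, which is why it can reveal overfitting when compared with the held-out accuracy.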

  6. Comprehensive analysis of amino acid and nucleotide composition in eukaryotic genomes, comparing genes and pseudogenes.

    PubMed

    Echols, Nathaniel; Harrison, Paul; Balasubramanian, Suganthi; Luscombe, Nicholas M; Bertone, Paul; Zhang, Zhaolei; Gerstein, Mark

    2002-06-01

    Based on searches for disabled homologs to known proteins, we have identified a large population of pseudogenes in four sequenced eukaryotic genomes: the worm, yeast, fly and human (chromosomes 21 and 22 only). Each of our nearly 2500 pseudogenes is characterized by one or more disablements mid-domain, such as premature stops and frameshifts. Here, we perform a comprehensive survey of the amino acid and nucleotide composition of these pseudogenes in comparison to that of functional genes and intergenic DNA. We show that pseudogenes invariably have an amino acid composition intermediate between genes and translated intergenic DNA. Although the degree of intermediacy varies among the four organisms, in all cases, it is most evident for amino acid types that differ most in occurrence between genes and intergenic regions. The same intermediacy also applies to codon frequencies, especially in the worm and human. Moreover, the intermediate composition of pseudogenes applies even though the composition of the genes in the four organisms is markedly different, showing a strong correlation with the overall A/T content of the genomic sequence. Pseudogenes can be divided into 'ancient' and 'modern' subsets, based on the level of sequence identity with their closest matching homolog (within the same genome). Modern pseudogenes usually have a much closer sequence composition to genes than ancient pseudogenes. Collectively, our results indicate that the composition of pseudogenes that are under no selective constraints progressively drifts from that of coding DNA towards non-coding DNA. Therefore, we propose that the degree to which pseudogenes approach a random sequence composition may be useful in dating different sets of pseudogenes, as well as in assessing the rate at which intergenic DNA accumulates mutations. Our compositional analyses with the interactive viewer are available over the web at http://genecensus.org/pseudogene.
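
    The idea of an "intermediate" composition can be made concrete with a small Python sketch. The `intermediacy` index below is a hypothetical measure introduced here for illustration; the abstract does not state which statistic the authors used, and the function names are assumptions.

```python
from collections import Counter


def composition(seq):
    """Fraction of each symbol (amino acid or nucleotide) in a sequence."""
    if not seq:
        raise ValueError("empty sequence")
    counts = Counter(seq.upper())
    total = sum(counts.values())
    return {sym: n / total for sym, n in counts.items()}


def at_content(dna):
    """Combined fraction of A and T in a DNA sequence."""
    comp = composition(dna)
    return comp.get("A", 0.0) + comp.get("T", 0.0)


def intermediacy(gene_freq, pseudo_freq, intergenic_freq):
    """Where a pseudogene frequency sits between genes and intergenic DNA.

    Returns 0.0 when it equals the gene value and 1.0 when it equals the
    intergenic value. This index is an illustrative assumption.
    """
    span = intergenic_freq - gene_freq
    if span == 0:
        raise ValueError("gene and intergenic frequencies are identical")
    return (pseudo_freq - gene_freq) / span
```

    Under this index, a value between 0 and 1 for a given amino acid would correspond to the intermediate composition the abstract reports.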

  7. Comparative Genomic Analysis of Drechmeria coniospora Reveals Core and Specific Genetic Requirements for Fungal Endoparasitism of Nematodes

    PubMed Central

    Thakur, Nishant; Arguel, Marie-Jeanne; Polanowska, Jolanta; Henrissat, Bernard; Record, Eric; Magdelenat, Ghislaine; Barbe, Valérie; Raffaele, Sylvain; Barbry, Pascal

    2016-01-01

    Drechmeria coniospora is an obligate fungal pathogen that infects nematodes via the adhesion of specialized spores to the host cuticle. D. coniospora is frequently found associated with Caenorhabditis elegans in environmental samples. It is used in the study of the nematode’s response to fungal infection. Full understanding of this bi-partite interaction requires knowledge of the pathogen’s genome, analysis of its gene expression program and a capacity for genetic engineering. The acquisition of all three is reported here. A phylogenetic analysis placed D. coniospora close to the truffle parasite Tolypocladium ophioglossoides, and Hirsutella minnesotensis, another nematophagous fungus. Ascomycete nematopathogenicity is polyphyletic; D. coniospora represents a branch that has not been molecularly characterized. A detailed in silico functional analysis, comparing D. coniospora to 11 fungal species, revealed genes and gene families potentially involved in virulence and showed it to be a highly specialized pathogen. A targeted comparison with nematophagous fungi highlighted D. coniospora-specific genes and a core set of genes associated with nematode parasitism. A comparative gene expression analysis of samples from fungal spores and mycelia, and infected C. elegans, gave a molecular view of the different stages of the D. coniospora lifecycle. Transformation of D. coniospora allowed targeted gene knock-out and the production of fungus that expresses fluorescent reporter genes. It also permitted the initial characterisation of a potential fungal counter-defensive strategy, involving interference with a host antimicrobial mechanism. This high-quality annotated genome for D. coniospora gives insights into the evolution and virulence of nematode-destroying fungi. Coupled with genetic transformation, it opens the way for molecular dissection of D. coniospora physiology, and will allow both sides of the interaction between D. coniospora and C. elegans, as well as the

  8. Comparative Genome and Proteome Analysis of Anopheles gambiae and Drosophila melanogaster

    NASA Astrophysics Data System (ADS)

    Zdobnov, Evgeny M.; von Mering, Christian; Letunic, Ivica; Torrents, David; Suyama, Mikita; Copley, Richard R.; Christophides, George K.; Thomasova, Dana; Holt, Robert A.; Subramanian, G. Mani; Mueller, Hans-Michael; Dimopoulos, George; Law, John H.; Wells, Michael A.; Birney, Ewan; Charlab, Rosane; Halpern, Aaron L.; Kokoza, Elena; Kraft, Cheryl L.; Lai, Zhongwu; Lewis, Suzanna; Louis, Christos; Barillas-Mury, Carolina; Nusskern, Deborah; Rubin, Gerald M.; Salzberg, Steven L.; Sutton, Granger G.; Topalis, Pantelis; Wides, Ron; Wincker, Patrick; Yandell, Mark; Collins, Frank H.; Ribeiro, Jose; Gelbart, William M.; Kafatos, Fotis C.; Bork, Peer

    2002-10-01

    Comparison of the genomes and proteomes of the two dipterans Anopheles gambiae and Drosophila melanogaster, which diverged about 250 million years ago, reveals considerable similarities. However, numerous differences are also observed; some of these must reflect the selection and subsequent adaptation associated with different ecologies and life strategies. Almost half of the genes in both genomes are interpreted as orthologs and show an average sequence identity of about 56%, which is slightly lower than that observed between the orthologs of the pufferfish and human (diverged about 450 million years ago). This indicates that these two insects diverged considerably faster than vertebrates. Aligned sequences reveal that orthologous genes have retained only half of their intron/exon structure, indicating that intron gains or losses have occurred at a rate of about one per gene per 125 million years. Chromosomal arms exhibit significant remnants of homology between the two species, although only 34% of the genes colocalize in small "microsyntenic" clusters, and major interarm transfers as well as intra-arm shuffling of gene order are detected.

  9. Comparative genomic analysis of coffee-infecting Xylella fastidiosa strains isolated from Brazil.

    PubMed

    Barbosa, Deibs; Alencar, Valquíria Campos; Santos, Daiene Souza; de Freitas Oliveira, Ana Cláudia; de Souza, Alessandra A; Coletta-Filho, Helvecio D; de Oliveira, Regina Souza; Nunes, Luiz R

    2015-05-01

    Strains of Xylella fastidiosa constitute a complex group of bacteria that develop within the xylem of many plant hosts, causing diseases of significant economic importance, such as Pierce's disease in North American grapevines and citrus variegated chlorosis in Brazil. X. fastidiosa has also been obtained from other host plants, in direct correlation with the development of diseases, as in the case of coffee leaf scorch (CLS), a disease with potential to cause severe economic losses to the Brazilian coffee industry. This paper describes a thorough genomic characterization of coffee-infecting X. fastidiosa strains, initially performed through a microarray-based approach, which demonstrated that CLS strains could be subdivided into two phylogenetically distinct subgroups. Whole-genomic sequencing of two of these bacteria (one from each subgroup) allowed identification of ORFs and horizontally transferred elements (HTEs) that were specific to CLS-related X. fastidiosa strains. Such analyses confirmed the size and importance of HTEs as major mediators of chromosomal evolution amongst these bacteria, and allowed identification of differences in gene content, after comparisons were made with previously sequenced X. fastidiosa strains, isolated from alternative hosts. Although direct experimentation still needs to be performed to elucidate the biological consequences associated with such differences, it was interesting to verify that CLS-related bacteria display variations in genes that produce toxins, as well as surface-related factors (such as fimbrial adhesins and LPS) that have been shown to be involved with recognition of specific host factors in different pathogenic bacteria. PMID:25737482

  10. Comparative analysis of American Dengue virus type 1 full-genome sequences.

    PubMed

    Carvalho, S E S; Martin, D P; Oliveira, L M; Ribeiro, B M; Nagata, T

    2010-02-01

    Dengue virus (DENV; Genus Flavivirus, Family Flaviviridae) has been circulating in Brazil since at least the mid-1980s and continues to be responsible for sporadic cases of Dengue fever and Dengue hemorrhagic fever throughout this country. Here, we describe the full genomes of two new Brazilian DENV-serotype 1 (DENV-1) variants and analyze these together with all other available American DENV-1 full-genome sequences. Besides confirming the existence of various country-specific DENV-1 founder effects that have produced a high degree of geographical structure in the American DENV-1 population, we also identify that one of the new viruses is one of only three detectable intra-American DENV-1 recombinants. Although such obvious evidence of genetic exchange among epidemiologically unlinked Latin American DENV-1 sequences is relatively rare, we find that at the population-scale there exists substantial evidence of pervasive recombination that most likely occurs between viruses that are so genetically similar that it is not possible to reliably distinguish and characterize individual recombination events.

  11. Comparative Genomic and Sequence Analysis Provides Insight into the Molecular Functionality of NOD1 and NOD2

    PubMed Central

    Boyle, Joseph P.; Mayle, Sophie; Parkhouse, Rhiannon; Monie, Tom P.

    2013-01-01

    Amino acids with functional or key structural roles display higher degrees of conservation through evolution. The comparative analysis of protein sequences from multiple species and/or between homologous proteins can be highly informative in the identification of key structural and functional residues, which in turn provide insight into the molecular mechanisms of protein function. We have explored the genomic and amino acid conservation of the prototypic innate immune genes NOD1 and NOD2. NOD1 orthologs were found in all vertebrate species analyzed, whilst NOD2 was absent from the genomes of avian, reptilian and amphibian species. Evolutionary trace analysis was used to identify highly conserved regions of NOD1 and NOD2 across multiple species. Consistent with the known functions of NOD1 and NOD2, highly conserved patches were identified that matched the Walker A and B motifs and provided interaction surfaces for the adaptor protein RIP2. Other patches of high conservation reflect key structural functions as predicted by homology models. In addition, the pattern of residue conservation within the leucine-rich repeat (LRR) region of NOD1 and NOD2 is indicative of a conserved mechanism of ligand recognition involving the concave surface of the LRRs. PMID:24109482

  12. Sequence-level comparative analysis of the Brassica napus genome around two stearoyl-ACP desaturase loci.

    PubMed

    Cho, Kwangsoo; O'Neill, Carmel M; Kwon, Soo-Jin; Yang, Tae-Jin; Smooker, Andrew M; Fraser, Fiona; Bancroft, Ian

    2010-02-01

    We conducted sequence-level comparative analyses, at the scale of complete bacterial artificial chromosome (BAC) clones, between the genome of the most economically important Brassica species, Brassica napus (oilseed rape), and those of Brassica rapa, the genome of which is currently being sequenced, and Arabidopsis thaliana. We constructed a new B. napus BAC library and identified and sequenced clones that contain homoeologous regions of the genome including stearoyl-ACP desaturase-encoding genes. We sequenced the orthologous region of the genome of B. rapa and conducted comparative analyses between the Brassica sequences and those of the orthologous region of the genome of A. thaliana. The proportion of genes conserved (approximately 56%) is lower than has been reported previously between A. thaliana and Brassica (approximately 66%). The gene models for sets of conserved genes were used to determine the extent of nucleotide conservation of coding regions. This was found to be 84.2 +/- 3.9% and 85.8 +/- 3.7% between the B. napus A and C genomes, respectively, and that of A. thaliana, which is consistent with previous results for other Brassica species, and 97.5 +/- 3.1% between the B. napus A genome and B. rapa, and 93.1 +/- 4.9% between the B. napus C genome and B. rapa. The divergence of the B. napus genes from the A genome and the B. rapa genes was greater than anticipated and indicates that the A genome ancestor of the B. napus cultivar studied was relatively distantly related to the cultivar of B. rapa selected for genome sequencing.
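
    Figures of the form "mean +/- spread" for nucleotide conservation can be obtained from per-gene identities. The following Python sketch shows one way to do this. The abstract does not say how gaps were treated or whether the spread is a standard deviation, so skipping gapped columns and using the sample standard deviation are assumptions, as are the function names.

```python
from statistics import mean, stdev


def percent_identity(a, b, gap="-"):
    """Percent identical positions between two aligned coding sequences.

    Columns with a gap in either sequence are skipped (an assumption).
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x != gap and y != gap]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


def summarize(identities):
    """Mean and sample standard deviation over a set of gene pairs."""
    return mean(identities), stdev(identities)
```

    For example, `summarize([percent_identity(a, b) for a, b in gene_pairs])` would give one such pair of numbers for a hypothetical list `gene_pairs` of aligned coding sequences.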

  13. Comparative analysis of mitochondrial genomes in Diplura (Hexapoda, Arthropoda): taxon sampling is crucial for phylogenetic inferences.

    PubMed

    Chen, Wan-Jun; Koch, Markus; Mallatt, Jon M; Luan, Yun-Xia

    2014-01-01

    Two-pronged bristletails (Diplura) are traditionally classified into three major superfamilies: Campodeoidea, Projapygoidea, and Japygoidea. The interrelationships of these three superfamilies and the monophyly of Diplura have been much debated. Few previous studies included Projapygoidea in their phylogenetic considerations, and its position within Diplura still is a puzzle from both morphological and molecular points of view. Until now, no mitochondrial genome has been sequenced for any projapygoid species. To fill in this gap, we determined and annotated the complete mitochondrial genome of Octostigma sinensis (Octostigmatidae, Projapygoidea), and of three more dipluran species, one each from the Campodeidae, Parajapygidae, and Japygidae. All four newly sequenced dipluran mtDNAs encode the same set of genes in the same gene order as shared by most crustaceans and hexapods. Secondary structure truncations have occurred in trnR, trnC, trnS1, and trnS2, and the reduction of transfer RNA D-arms was found to be taxonomically correlated, with Campodeoidea having experienced the most reduction. Partitioned phylogenetic analyses, based on both amino acids and nucleotides of the protein-coding genes plus the ribosomal RNA genes, retrieve significant support for a monophyletic Diplura within Pancrustacea, with Projapygoidea more closely related to Campodeoidea than to Japygoidea. Another key finding is that monophyly of Diplura cannot be recovered unless Projapygoidea is included in the phylogenetic analyses; this explains the dipluran polyphyly found by past mitogenomic studies. Including Projapygoidea increased the sampling density within Diplura and probably helped by breaking up a long-branch-attraction artifact. This finding provides an example of how proper sampling is significant for phylogenetic inference.

  14. Comparative primate genomics: emerging patterns of genome content and dynamics.

    PubMed

    Rogers, Jeffrey; Gibbs, Richard A

    2014-05-01

    Advances in genome sequencing technologies have created new opportunities for comparative primate genomics. Genome assemblies have been published for various primate species, and analyses of several others are underway. Whole-genome assemblies for the great apes provide remarkable new information about the evolutionary origins of the human genome and the processes involved. Genomic data for macaques and other non-human primates offer valuable insights into genetic similarities and differences among species that are used as models for disease-related research. This Review summarizes current knowledge regarding primate genome content and dynamics, and proposes a series of goals for the near future.

  16. Comparative primate genomics: emerging patterns of genome content and dynamics

    PubMed Central

    Rogers, Jeffrey; Gibbs, Richard A.

    2014-01-01

    Preface Advances in genome sequencing technologies have created new opportunities for comparative primate genomics. Genome assemblies have been published for several primates, with analyses of several others underway. Whole genome assemblies for the great apes provide remarkable new information about the evolutionary origins of the human genome and the processes involved. Genomic data for macaques and other nonhuman primates provide valuable insight into genetic similarities and differences among species used as models for disease-related research. This review summarizes current knowledge regarding primate genome content and dynamics and offers a series of goals for the near future. PMID:24709753

  17. Ebolavirus comparative genomics

    SciTech Connect

    Jun, Se-Ran; Leuze, Michael R.; Nookaew, Intawat; Uberbacher, Edward C.; Land, Miriam; Zhang, Qian; Wanchai, Visanu; Chai, Juanjuan; Nielsen, Morten; Trolle, Thomas; Lund, Ole; Buzard, Gregory S.; Pedersen, Thomas D.; Ussery, David W.

    2015-07-14

    The 2014 Ebola outbreak in West Africa is the largest documented for this virus. We examine the dynamics of this genome, comparing more than one hundred currently available ebolavirus genomes to each other and to other viral genomes. Based on oligomer frequency analysis, the family Filoviridae forms a distinct group from all other sequenced viral genomes. All filovirus genomes sequenced to date encode proteins with similar functions and gene order, although there is considerable divergence in sequences between the three genera Ebolavirus, Cuevavirus, and Marburgvirus within the family Filoviridae. Whereas all ebolavirus genomes are quite similar (multiple sequences of the same strain are often identical), variation is most common in the intergenic regions and within specific areas of the genes encoding the glycoprotein (GP), nucleoprotein (NP), and polymerase (L). We predict regions that could contain epitope-binding sites, which might be good vaccine targets. In conclusion, this information, combined with glycosylation sites and experimentally determined epitopes, can identify the most promising regions for the development of therapeutic strategies.
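
    Oligomer frequency analysis, as mentioned in this abstract, compares genomes by how often each short word (k-mer) occurs. The Python sketch below is a minimal version under stated assumptions: the word length k=4 and the Euclidean distance are illustrative choices, not parameters reported by the authors, and the function names are hypothetical.

```python
from collections import Counter
from math import sqrt


def kmer_frequencies(seq, k=4):
    """Relative frequency of every overlapping k-mer in a sequence."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence is shorter than k")
    return {kmer: n / total for kmer, n in counts.items()}


def euclidean_distance(f1, f2):
    """Distance between two k-mer profiles; absent k-mers count as 0."""
    keys = set(f1) | set(f2)
    return sqrt(sum((f1.get(x, 0.0) - f2.get(x, 0.0)) ** 2 for x in keys))
```

    Pairwise distances computed this way could then be clustered to see which genomes group together.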

  18. Comparative genomic analysis of the Haloferax volcanii DS2 and Halobacterium salinarium GRB contig maps reveals extensive rearrangement.

    PubMed Central

    St Jean, A; Charlebois, R L

    1996-01-01

    Anonymous probes from the genome of Halobacterium salinarium GRB and 12 gene probes were hybridized to the cosmid clones representing the chromosome and plasmids of Halobacterium salinarium GRB and Haloferax volcanii DS2. The order of and pairwise distances between 35 loci uniquely cross-hybridizing to both chromosomes were analyzed in a search for conservation. No conservation between the genomes could be detected at the 15-kbp resolution used in this study. We found distinct sets of low-copy-number repeated sequences in the chromosome and plasmids of Halobacterium salinarium GRB, indicating some degree of partitioning between these replicons. We propose alternative courses for the evolution of the haloarchaeal genome: (i) that the majority of genomic differences that exist between genera came about at the inception of this group or (ii) that the differences have accumulated over the lifetime of the lineage. The strengths and limitations of investigating these models through comparative genomic studies are discussed. PMID:8682791

  19. Comparative phylogenetic analysis of genome-wide Mlo gene family members from Glycine max and Arabidopsis thaliana.

    PubMed

    Deshmukh, Reena; Singh, V K; Singh, B D

    2014-06-01

    Powdery mildew locus O (Mlo) gene family is one of the largest seven transmembrane protein-encoding gene families. The Mlo proteins act as negative regulators of powdery mildew resistance and a loss-of-function mutation in Mlo is known to confer broad-spectrum resistance to powdery mildew. In addition, the Mlo gene family members are known to participate in various developmental and biotic and abiotic stress response-related pathways. Therefore, a genome-wide similarity search using the characterized Mlo protein sequences of Arabidopsis thaliana was carried out to identify putative Mlo genes in soybean (Glycine max) genome. This search identified 39 Mlo domain containing protein-encoding genes that were distributed on 15 of the 20 G. max chromosomes. The putative promoter regions of these Mlo genes contained response elements for different external stimuli, including different hormones and abiotic stresses. Of the 39 GmMlo proteins, 35 were rich (8.7-13.1 %) in leucine, while five were serine-rich (9.2-11.9 %). Furthermore, all the GmMlo members were localized in the plasma membrane. Phylogenetic analysis of the GmMlo and the AtMlo proteins classified them into three main clusters, and the cluster I comprised two sub-clusters. Multiple sequence alignment visualized the location of seven transmembrane domains, and a conserved CaM-binding domain. Some of the GmMlo proteins (GmMlo10, 20, 22, 23, 32, 36, 37) contained less than seven transmembrane domains. The motif analysis yielded 27 motifs; out of these, motif 2, the only motif present in all the GmMlos, was highly conserved and three amino acid residues were essentially invariant. Five of the GmMlo members were much smaller in size; presumably they originated through deletion following a gene duplication event. The presence of a large number of GmMlo members in the G. max genome may be due to its paleopolyploid nature and the large genome size as compared to that of Arabidopsis. The findings of this study may

  20. A Comparative Map of the Zebrafish Genome

    PubMed Central

    Woods, Ian G.; Kelly, Peter D.; Chu, Felicia; Ngo-Hazelett, Phuong; Yan, Yi-Lin; Huang, Hui; Postlethwait, John H.; Talbot, William S.

    2000-01-01

    Zebrafish mutations define the functions of hundreds of essential genes in the vertebrate genome. To accelerate the molecular analysis of zebrafish mutations and to facilitate comparisons among the genomes of zebrafish and other vertebrates, we used a homozygous diploid meiotic mapping panel to localize polymorphisms in 691 previously unmapped genes and expressed sequence tags (ESTs). Together with earlier efforts, this work raises the total number of markers scored in the mapping panel to 2119, including 1503 genes and ESTs and 616 previously characterized simple-sequence length polymorphisms. Sequence analysis of zebrafish genes mapped in this study and in prior work identified putative human orthologs for 804 zebrafish genes and ESTs. Map comparisons revealed 139 new conserved syntenies, in which two or more genes are on the same chromosome in zebrafish and human. Although some conserved syntenies are quite large, there were changes in gene order within conserved groups, apparently reflecting the relatively frequent occurrence of inversions and other intrachromosomal rearrangements since the divergence of teleost and tetrapod ancestors. Comparative mapping also shows that there is not a one-to-one correspondence between zebrafish and human chromosomes. Mapping of duplicate gene pairs identified segments of 20 linkage groups that may have arisen during a genome duplication that occurred early in the evolution of teleosts after the divergence of teleost and mammalian ancestors. This comparative map will accelerate the molecular analysis of zebrafish mutations and enhance the understanding of the evolution of the vertebrate genome. PMID:11116086
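
    The abstract above defines a conserved synteny as two or more genes that lie on the same chromosome in both zebrafish and human. A minimal Python sketch of that definition follows; the function name, the tuple layout, and the toy gene names are illustrative assumptions, not the authors' mapping pipeline.

```python
from collections import defaultdict


def conserved_syntenies(ortholog_pairs):
    """Group orthologous genes by (zebrafish linkage group, human chromosome).

    ortholog_pairs: iterable of (gene, zebrafish_lg, human_chr) tuples.
    Returns a dict mapping each (zebrafish_lg, human_chr) pair that shares
    two or more genes to the sorted list of those genes.
    """
    groups = defaultdict(set)
    for gene, zf_lg, hs_chr in ortholog_pairs:
        groups[(zf_lg, hs_chr)].add(gene)
    # Keep only chromosome pairs that meet the "two or more genes" criterion.
    return {pair: sorted(genes) for pair, genes in groups.items() if len(genes) >= 2}


# Hypothetical toy data, not taken from the study.
toy_pairs = [
    ("geneA", "LG1", "chr4"),
    ("geneB", "LG1", "chr4"),
    ("geneC", "LG1", "chr7"),
    ("geneD", "LG2", "chr7"),
    ("geneE", "LG2", "chr7"),
]
syntenies = conserved_syntenies(toy_pairs)
```

    In this toy input, geneC is the only gene shared by LG1 and chr7, so that pair does not qualify. The sketch ignores gene order, which the abstract notes can differ within conserved groups.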

  1. A comparative phenotypic and genomic analysis of C57BL/6J and C57BL/6N mouse strains

    PubMed Central

    2013-01-01

    Background The mouse inbred line C57BL/6J is widely used in mouse genetics and its genome has been incorporated into many genetic reference populations. More recently large initiatives such as the International Knockout Mouse Consortium (IKMC) are using the C57BL/6N mouse strain to generate null alleles for all mouse genes. Hence both strains are now widely used in mouse genetics studies. Here we perform a comprehensive genomic and phenotypic analysis of the two strains to identify differences that may influence their underlying genetic mechanisms. Results We undertake genome sequence comparisons of C57BL/6J and C57BL/6N to identify SNPs, indels and structural variants, with a focus on identifying all coding variants. We annotate 34 SNPs and 2 indels that distinguish C57BL/6J and C57BL/6N coding sequences, as well as 15 structural variants that overlap a gene. In parallel we assess the comparative phenotypes of the two inbred lines utilizing the EMPReSSslim phenotyping pipeline, a broad based assessment encompassing diverse biological systems. We perform additional secondary phenotyping assessments to explore other phenotype domains and to elaborate phenotype differences identified in the primary assessment. We uncover significant phenotypic differences between the two lines, replicated across multiple centers, in a number of physiological, biochemical and behavioral systems. Conclusions Comparison of C57BL/6J and C57BL/6N demonstrates a range of phenotypic differences that have the potential to impact upon penetrance and expressivity of mutational effects in these strains. Moreover, the sequence variants we identify provide a set of candidate genes for the phenotypic differences observed between the two strains. PMID:23902802

  2. Comparative genomic analysis of Saccharomyces cerevisiae yeasts isolated from fermentations of traditional beverages unveils different adaptive strategies.

    PubMed

    Ibáñez, Clara; Pérez-Torrado, Roberto; Chiva, Rosana; Guillamón, José Manuel; Barrio, Eladio; Querol, Amparo

    2014-02-01

    Saccharomyces cerevisiae strains are chiefly responsible for most traditional alcohol fermentation processes performed around the world. The characteristics of the diverse traditional fermentations are very different according to their sugar composition, temperature, pH or nitrogen sources. During the adaptation of yeasts to these new environments provided by human activity, their different compositions likely imposed selective pressures that shaped the S. cerevisiae genome. In the present work we performed a comparative genomic hybridization analysis to explore the genome constitution of six S. cerevisiae strains isolated from different traditional fermentations (masato, mescal, cachaça, sake, wine, and sherry wine) and one natural strain. Our results indicate that gene copy numbers (GCN) are very variable among strains, and most of them were observed in subtelomeric and intrachromosomal gene families involved in metabolic functions related to cellular homeostasis, cell-to-cell interactions, and transport of solutes such as ions, sugars and metals. In many cases, these genes are not essential but they can play an important role in the adaptation to new environmental conditions. However, the most interesting result is the association observed between GCN changes in genes involved in nitrogen metabolism and the availability of nitrogen sources in the different traditional fermentation processes. This is clearly illustrated by the differences in copy numbers not only in gene PUT1, the main player in the assimilation of proline as a nitrogen source, but also in CAR2, involved in arginine catabolism. Strains isolated from fermentations where proline is more abundant contain a higher number of PUT1 copies and are more efficient in assimilating this amino acid as a nitrogen source. A strain isolated from sugarcane juice fermentations, in which arginine is a rare amino acid, contains fewer copies of CAR2 and showed low efficiency in arginine assimilation. These

  3. Comparative Analysis of Human B Cell Epitopes Based on BCG Genomes.

    PubMed

    Li, Machao; Liu, Haican; Zhao, Xiuqin; Wan, Kanglin

    2016-01-01

    Background. Tuberculosis is a huge global health problem. BCG has been the only vaccine used against TB for about 100 years, but the reasons for protection variability in populations remain unclear. To improve BCG efficacy and develop a strategy for new vaccines, the underlying genetic differences among BCG subtypes urgently need to be understood. Methods and Findings. Human B cell epitope data were collected from the Immune Epitope Database. Epitope sequences were mapped to those of 15 genomes, including 13 BCGs, M. bovis AF2122/97, and M. tuberculosis H37Rv, to determine epitope distribution. Among 398 experimentally verified B cell epitopes, 321 (80.7%) were conserved, while the remaining 77 (19.3%) were lost to varying degrees in BCGs. The variable protective efficacy of BCGs may result from the degree of B cell epitope deficiency. Conclusions. Here we analyzed, for the first time, the genetic characteristics of BCGs based on B cell epitopes and found that B cell epitope distribution may contribute to vaccine efficacy. Restoration of important antigens or effective B cell epitopes in BCG could be a useful strategy for vaccine development. PMID:27382565
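
    The abstract above sorts epitopes into those conserved across all genomes and those lost in some. A minimal Python sketch of that bookkeeping follows. It assumes exact substring matching against protein sequences, which is an illustrative simplification (the abstract does not state the mapping method); the function name and toy sequences are hypothetical.

```python
def classify_epitopes(epitopes, proteomes):
    """Split epitopes into those found in every proteome and those missing from some.

    epitopes: list of peptide strings.
    proteomes: dict mapping a strain name to a list of protein sequences.
    An epitope counts as present in a strain if it is an exact substring of
    at least one protein (an assumption made for this sketch).
    """
    conserved, lost = [], []
    for epitope in epitopes:
        present_everywhere = all(
            any(epitope in protein for protein in proteins)
            for proteins in proteomes.values()
        )
        (conserved if present_everywhere else lost).append(epitope)
    return conserved, lost


# Hypothetical toy sequences, not real mycobacterial proteins.
toy_proteomes = {
    "strain_1": ["MKTAYIAKQR", "GGLVPRGSHM"],
    "strain_2": ["MKTAYIAKQR"],
}
conserved, lost = classify_epitopes(["TAYIAK", "LVPRGS"], toy_proteomes)

# Percentages reported in the abstract, recomputed from its counts.
pct_conserved = round(100 * 321 / 398, 1)
pct_lost = round(100 * 77 / 398, 1)
```

    The recomputed percentages agree with the 80.7% and 19.3% given in the abstract.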

  4. Comparative analysis of contextual bias around the translation initiation sites in plant genomes.

    PubMed

    Gupta, Paras; Rangan, Latha; Ramesh, T Venkata; Gupta, Mudit

    2016-09-01

    Nucleotide distribution around the translation initiation site (TIS) is thought to play an important role in determining translation efficiency. Kozak in vertebrates and later Joshi et al. in plants identified a context sequence with a key role in translation efficiency, but great variation in this context sequence has been observed among different taxa. The present study aims to refine the context sequence around the initiation codon in plants and addresses the sampling error problem by using complete genomes of 7 monocots and 7 dicots separately. Besides positions -3 and +4, significant conservation at the -2 and +5 positions was also found, and nucleotide bias at the latter two positions was shown to directly influence translation efficiency in the taxon studied. About 1.8% (monocots) and 2.4% (dicots) of the total sequences fit the context sequence from positions -3 to +5, which might be indicative of a lower number of housekeeping genes in the transcriptome. A three base periodicity was observed in the 5' UTR and CDS of monocots and only in the CDS of dicots, as confirmed against random occurrence and annotation errors. Deterministic enrichment of GCNAUGGC in monocots, AANAUGGC in dicots and GCNAUGGC in plants around TIS was also established (where AUG denotes the start codon), which can serve as an arbiter of putative TIS with efficient translation in plants. PMID:27316311
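
    The context sequences quoted above span positions -3 to +5, with the A of AUG at +1 and N standing for any nucleotide. A minimal Python sketch of testing a candidate start codon against such a consensus follows; the function name, its parameters, and the example transcript are illustrative assumptions rather than the authors' method.

```python
def matches_tis_context(mrna, aug_index, consensus="GCNAUGGC"):
    """Check whether the window from -3 to +5 around a start codon fits a consensus.

    mrna: transcript sequence (DNA or RNA alphabet; T is read as U).
    aug_index: 0-based index of the A in the AUG start codon (position +1).
    consensus: 8-letter pattern covering -3..+5, where N matches any base.
    """
    start = aug_index - 3          # position -3
    end = aug_index + 5            # one past position +5
    if start < 0 or end > len(mrna):
        return False               # not enough flanking sequence
    window = mrna[start:end].upper().replace("T", "U")
    return all(c == "N" or c == base for c, base in zip(consensus, window))


# Hypothetical transcript; the AUG begins at index 5.
toy_mrna = "AAGCAAUGGCUU"
fits_monocot = matches_tis_context(toy_mrna, 5, "GCNAUGGC")
fits_dicot = matches_tis_context(toy_mrna, 5, "AANAUGGC")
```

    Here the window is GCAAUGGC, so it fits the monocot pattern but not the dicot one. Counting such matches over all annotated start codons would give a fraction comparable in kind to the 1.8% and 2.4% figures in the abstract.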

  5. Genomic organization of the mouse dystrobrevin gene: Comparative analysis with the dystrophin gene

    SciTech Connect

    Ambrose, H.J.; Blake, D.J.; Nawrotzki, R.A.; Davies, K.E.

    1997-02-01

    Dystrobrevin, the mammalian orthologue of the Torpedo 87-kDa postsynaptic protein, is a member of the dystrophin gene family with homology to the cysteine-rich carboxy-terminal domain of dystrophin. Torpedo dystrobrevin copurifies with the acetylcholine receptors and is thought to form a complex with dystrophin and syntrophin. This complex is also found at the sarcolemma in vertebrates and defines the cytoplasmic component of the dystrophin-associated protein complex. Previously we have cloned several dystrobrevin isoforms from mouse brain and muscle. Here we show that these transcripts are the products of a single gene located on proximal mouse chromosome 18. To investigate the diversity of dystrobrevin transcripts we have determined that the mouse dystrobrevin gene is organized into 24 coding exons that span between 130 and 170 kb at the genomic level. The gene encodes at least three distinct protein isoforms that are expressed in a tissue-specific manner. Interestingly, although there is only 27% amino acid identity between the homologous regions of dystrobrevin and dystrophin, the positions of 8 of the 15 exon-intron junctions are identical. 47 refs., 4 figs., 2 tabs.

  6. Comparative Analysis of Human B Cell Epitopes Based on BCG Genomes

    PubMed Central

    Li, Machao; Liu, Haican; Zhao, Xiuqin; Wan, Kanglin

    2016-01-01

    Background. Tuberculosis is a huge global health problem. BCG has been the only vaccine used against TB for about 100 years, but the reasons for protection variability in populations remain unclear. To improve BCG efficacy and develop a strategy for new vaccines, the underlying genetic differences among BCG subtypes urgently need to be understood. Methods and Findings. Human B cell epitope data were collected from the Immune Epitope Database. Epitope sequences were mapped to those of 15 genomes, including 13 BCGs, M. bovis AF2122/97, and M. tuberculosis H37Rv, to determine epitope distribution. Among 398 experimentally verified B cell epitopes, 321 (80.7%) were conserved, while the remaining 77 (19.3%) were lost to varying degrees in BCGs. The variable protective efficacy of BCGs may result from the degree of B cell epitope deficiency. Conclusions. Here we analyzed, for the first time, the genetic characteristics of BCGs based on B cell epitopes and found that B cell epitope distribution may contribute to vaccine efficacy. Restoration of important antigens or effective B cell epitopes in BCG could be a useful strategy for vaccine development. PMID:27382565

  7. Unraveling novel broad-spectrum antibacterial targets in food and waterborne pathogens using comparative genomics and protein interaction network analysis.

    PubMed

    Jadhav, Ankush; Shanmugham, Buvaneswari; Rajendiran, Anjana; Pan, Archana

    2014-10-01

    Food and waterborne diseases are a growing concern in terms of human morbidity and mortality worldwide, even in the 21st century, emphasizing the need for new therapeutic interventions for these diseases. The current study aims at prioritizing broad-spectrum antibacterial targets, present in multiple food and waterborne bacterial pathogens, through a comparative genomics strategy coupled with a protein interaction network analysis. The pathways unique and common to all the pathogens under study (viz., methane metabolism, d-alanine metabolism, peptidoglycan biosynthesis, bacterial secretion system, two-component system, C5-branched dibasic acid metabolism), identified by comparative metabolic pathway analysis, were considered for the analysis. The proteins/enzymes involved in these pathways were prioritized following host non-homology analysis, essentiality analysis, gut flora non-homology analysis and protein interaction network analysis. The analyses revealed a set of promising broad-spectrum antibacterial targets, present in multiple food and waterborne pathogens, which are essential for bacterial survival, non-homologous to host and gut flora, and functionally important in the metabolic network. The identified broad-spectrum candidates, namely, integral membrane protein/virulence factor (MviN), preprotein translocase subunits SecB and SecG, carbon storage regulator (CsrA), and nitrogen regulatory protein P-II 1 (GlnB), contributed by the peptidoglycan pathway, bacterial secretion systems and two-component systems, were also found to be present in a wide range of other disease-causing bacteria. Cytoplasmic proteins SecG, CsrA and GlnB were considered as drug targets, while membrane proteins MviN and SecB were classified as vaccine targets. The identified broad-spectrum targets can aid in the design and development of antibacterial agents not only against food and waterborne pathogens but also against other pathogens. PMID:25128740

  9. Comparative genomics of Listeria species.

    PubMed

    Glaser, P; Frangeul, L; Buchrieser, C; Rusniok, C; Amend, A; Baquero, F; Berche, P; Bloecker, H; Brandt, P; Chakraborty, T; Charbit, A; Chetouani, F; Couvé, E; de Daruvar, A; Dehoux, P; Domann, E; Domínguez-Bernal, G; Duchaud, E; Durant, L; Dussurget, O; Entian, K D; Fsihi, H; García-del Portillo, F; Garrido, P; Gautier, L; Goebel, W; Gómez-López, N; Hain, T; Hauf, J; Jackson, D; Jones, L M; Kaerst, U; Kreft, J; Kuhn, M; Kunst, F; Kurapkat, G; Madueno, E; Maitournam, A; Vicente, J M; Ng, E; Nedjari, H; Nordsiek, G; Novella, S; de Pablos, B; Pérez-Diaz, J C; Purcell, R; Remmel, B; Rose, M; Schlueter, T; Simoes, N; Tierrez, A; Vázquez-Boland, J A; Voss, H; Wehland, J; Cossart, P

    2001-10-26

    Listeria monocytogenes is a food-borne pathogen with a high mortality rate that has also emerged as a paradigm for intracellular parasitism. We present and compare the genome sequences of L. monocytogenes (2,944,528 base pairs) and a nonpathogenic species, L. innocua (3,011,209 base pairs). We found a large number of predicted genes encoding surface and secreted proteins, transporters, and transcriptional regulators, consistent with the ability of both species to adapt to diverse environments. The presence of 270 L. monocytogenes and 149 L. innocua strain-specific genes (clustered in 100 and 63 islets, respectively) suggests that virulence in Listeria results from multiple gene acquisition and deletion events.

  10. Complete Genome Sequence of a High Lipid-Producing Strain of Mucor circinelloides WJ11 and Comparative Genome Analysis with a Low Lipid-Producing Strain CBS 277.49

    PubMed Central

    Tang, Xin; Zhao, Lina; Chen, Haiqin; Chen, Yong Q.; Chen, Wei; Song, Yuanda; Ratledge, Colin

    2015-01-01

    The genome of a high lipid-producing fungus Mucor circinelloides WJ11 (36% w/w lipid, cell dry weight, CDW) was sequenced and compared with that of the low lipid-producing strain, CBS 277.49 (15% w/w lipid, CDW), which had been sequenced by Joint Genome Institute. The WJ11 genome assembly size was 35.4 Mb with a G+C content of 39.7%. The general features of WJ11 and CBS 277.49 indicated that they have close similarity at the level of gene order and gene identity. Whole genome alignments with MAUVE revealed the presence of numerous blocks of homologous regions and MUMmer analysis showed that the genomes of these two strains were mostly co-linear. The central carbon and lipid metabolism pathways of these two strains were reconstructed and the numbers of genes encoding the enzymes related to lipid accumulation were compared. Many unique genes coding for proteins involved in cell growth, carbohydrate metabolism and lipid metabolism were identified for each strain. In conclusion, our study on the genome sequence of WJ11 and the comparative genomic analysis between WJ11 and CBS 277.49 elucidated the general features of the genome and the potential mechanism of high lipid accumulation in strain WJ11 at the genomic level. The different numbers of genes and unique genes involved in lipid accumulation may play a role in the high oleaginicity of strain WJ11. PMID:26352831

  11. Comparative genomic analysis of Brucella abortus vaccine strain 104M reveals a set of candidate genes associated with its virulence attenuation.

    PubMed

    Yu, Dong; Hui, Yiming; Zai, Xiaodong; Xu, Junjie; Liang, Long; Wang, Bingxiang; Yue, Junjie; Li, Shanhu

    2015-01-01

    The Brucella abortus strain 104M, a spontaneously attenuated strain, has been used as a vaccine strain in humans against brucellosis for 6 decades in China. Despite many studies, the molecular mechanisms that cause the attenuation are still unclear. Here, we determined the whole-genome sequence of 104M and conducted a comprehensive comparative analysis against the whole-genome sequences of the virulent strain, A13334, and other reference strains. This analysis revealed a highly similar genome structure between 104M and A13334. Further comparative genomic analysis between 104M and A13334 revealed a set of genes missing in 104M. Some of these genes were identified to be directly or indirectly associated with virulence. Similarly, a set of mutations in the virulence-related genes was also identified, which may be related to virulence alteration. This study provides a set of candidate genes associated with virulence attenuation in B. abortus vaccine strain 104M.

  12. Comparative genomic analysis of Brucella abortus vaccine strain 104M reveals a set of candidate genes associated with its virulence attenuation

    PubMed Central

    Yu, Dong; Hui, Yiming; Zai, Xiaodong; Xu, Junjie; Liang, Long; Wang, Bingxiang; Yue, Junjie; Li, Shanhu

    2015-01-01

    The Brucella abortus strain 104M, a spontaneously attenuated strain, has been used as a vaccine strain in humans against brucellosis for 6 decades in China. Despite many studies, the molecular mechanisms that cause the attenuation are still unclear. Here, we determined the whole-genome sequence of 104M and conducted a comprehensive comparative analysis against the whole-genome sequences of the virulent strain, A13334, and other reference strains. This analysis revealed a highly similar genome structure between 104M and A13334. Further comparative genomic analysis between 104M and A13334 revealed a set of genes missing in 104M. Some of these genes were identified to be directly or indirectly associated with virulence. Similarly, a set of mutations in the virulence-related genes was also identified, which may be related to virulence alteration. This study provides a set of candidate genes associated with virulence attenuation in B. abortus vaccine strain 104M. PMID:26039674

  14. Comparative genomics of BCG vaccines.

    PubMed

    Behr, M A

    2001-01-01

    Bacille Calmette-Guérin (BCG) vaccines have been given to more people than any other vaccine. They have also probably resulted in as much controversy as any other vaccine. In clinical trials, the efficacy of BCG vaccination against pulmonary TB has been widely variable. At the same time, a number of investigators have observed phenotypic differences between BCG daughter strains, raising the possibility that differences between BCG products may in some way translate into different outcomes. With recent genomic analysis of BCG strains, it has become possible to piece together the molecular events that have resulted in current BCG vaccines. Between the derivation of BCG in 1921 and the lyophilization of BCG Pasteur 1173 in 1961, there have been at least seven genetic events, including deletions, duplications and a single nucleotide polymorphism. The phenotypic relevance of these changes in BCG vaccines remains to be explored. PMID:11463238

  15. Evolution of a distinct genomic domain in Drosophila: comparative analysis of the dot chromosome in Drosophila melanogaster and Drosophila virilis.

    PubMed

    Leung, Wilson; Shaffer, Christopher D; Cordonnier, Taylor; Wong, Jeannette; Itano, Michelle S; Slawson Tempel, Elizabeth E; Kellmann, Elmer; Desruisseau, David Michael; Cain, Carolyn; Carrasquillo, Robert; Chusak, Tien M; Falkowska, Katazyna; Grim, Kelli D; Guan, Rui; Honeybourne, Jacquelyn; Khan, Sana; Lo, Louis; McGaha, Rebecca; Plunkett, Jevon; Richner, Justin M; Richt, Ryan; Sabin, Leah; Shah, Anita; Sharma, Anushree; Singhal, Sonal; Song, Fine; Swope, Christopher; Wilen, Craig B; Buhler, Jeremy; Mardis, Elaine R; Elgin, Sarah C R

    2010-08-01

    The distal arm of the fourth ("dot") chromosome of Drosophila melanogaster is unusual in that it exhibits an amalgamation of heterochromatic properties (e.g., dense packaging, late replication) and euchromatic properties (e.g., gene density similar to euchromatic domains, replication during polytenization). To examine the evolution of this unusual domain, we undertook a comparative study by generating high-quality sequence data and manually curating gene models for the dot chromosome of D. virilis (Tucson strain 15010-1051.88). Our analysis shows that the dot chromosomes of D. melanogaster and D. virilis have higher repeat density, larger gene size, lower codon bias, and a higher rate of gene rearrangement compared to a reference euchromatic domain. Analysis of eight "wanderer" genes (present in a euchromatic chromosome arm in one species and on the dot chromosome in the other) shows that their characteristics are similar to other genes in the same domain, which suggests that these characteristics are features of the domain and are not required for these genes to function. Comparison of this strain of D. virilis with the strain sequenced by the Drosophila 12 Genomes Consortium (Tucson strain 15010-1051.87) indicates that most genes on the dot are under weak purifying selection. Collectively, despite the heterochromatin-like properties of this domain, genes on the dot evolve to maintain function while being responsive to changes in their local environment.

  16. A comparative genomic analysis of targets of Hox protein Ultrabithorax amongst distant insect species.

    PubMed

    Prasad, Naveen; Tarikere, Shreeharsha; Khanale, Dhanashree; Habib, Farhat; Shashidhara, L S

    2016-01-01

    In the fruitfly Drosophila melanogaster, the differential development of wing and haltere is dependent on the function of the Hox protein Ultrabithorax (Ubx). Here we compare Ubx-mediated regulation of wing patterning genes between the honeybee, Apis mellifera, the silkmoth, Bombyx mori and Drosophila. Orthologues of Ubx are expressed in the third thoracic segment of Apis and Bombyx, although they make functional hindwings. When over-expressed in transgenic Drosophila, Ubx derived from Apis or Bombyx could suppress wing development, suggesting evolutionary changes at the level of co-factors and/or targets of Ubx. To gain further insights into such events, we identified direct targets of Ubx from Apis and Bombyx by ChIP-seq and compared them with those of Drosophila. While the majority of the putative targets of Ubx are species-specific, a considerable number of wing-patterning genes have been retained, over the past 300 million years, as targets in all three species. Interestingly, many of these are differentially expressed only between wing and haltere in Drosophila but not between forewing and hindwing in Apis or Bombyx. Detailed bioinformatics and experimental validation of enhancer sequences suggest that, perhaps along with other factors, changes in the cis-regulatory sequences of earlier targets contribute to diversity in Ubx function. PMID:27296678

  17. A comparative genomic analysis of targets of Hox protein Ultrabithorax amongst distant insect species

    PubMed Central

    Prasad, Naveen; Tarikere, Shreeharsha; Khanale, Dhanashree; Habib, Farhat; Shashidhara, L. S.

    2016-01-01

    In the fruitfly Drosophila melanogaster, the differential development of wing and haltere is dependent on the function of the Hox protein Ultrabithorax (Ubx). Here we compare Ubx-mediated regulation of wing patterning genes between the honeybee, Apis mellifera, the silkmoth, Bombyx mori and Drosophila. Orthologues of Ubx are expressed in the third thoracic segment of Apis and Bombyx, although they make functional hindwings. When over-expressed in transgenic Drosophila, Ubx derived from Apis or Bombyx could suppress wing development, suggesting evolutionary changes at the level of co-factors and/or targets of Ubx. To gain further insights into such events, we identified direct targets of Ubx from Apis and Bombyx by ChIP-seq and compared them with those of Drosophila. While the majority of the putative targets of Ubx are species-specific, a considerable number of wing-patterning genes have been retained, over the past 300 million years, as targets in all three species. Interestingly, many of these are differentially expressed only between wing and haltere in Drosophila but not between forewing and hindwing in Apis or Bombyx. Detailed bioinformatics and experimental validation of enhancer sequences suggest that, perhaps along with other factors, changes in the cis-regulatory sequences of earlier targets contribute to diversity in Ubx function. PMID:27296678

  18. Integration of the Rat Recombination and EST Maps in the Rat Genomic Sequence and Comparative Mapping Analysis With the Mouse Genome

    PubMed Central

    Wilder, Steven P.; Bihoreau, Marie-Thérèse; Argoud, Karène; Watanabe, Takeshi K.; Lathrop, Mark; Gauguier, Dominique

    2004-01-01

    Inbred strains of the laboratory rat are widely used for identifying genetic regions involved in the control of complex quantitative phenotypes of biomedical importance. The draft genomic sequence of the rat now provides essential information for annotating rat quantitative trait locus (QTL) maps. Following the survey of unique rat microsatellite (11,585 including 1648 new markers) and EST (10,067) markers currently available, we have incorporated a selection of 7952 rat EST sequences in an improved version of the integrated linkage-radiation hybrid map of the rat containing 2058 microsatellite markers which provided over 10,000 potential anchor points between rat QTL and the genomic sequence of the rat. A total of 996 genetic positions were resolved (avg. spacing 1.77 cM) in a single large intercross and anchored in the rat genomic sequence (avg. spacing 1.62 Mb). Comparative genome maps between rat and mouse were constructed by successful computational alignment of 6108 mapped rat ESTs in the mouse genome. The integration of rat linkage maps in the draft genomic sequence of the rat and that of other species represents an essential step for translating rat QTL intervals into human chromosomal targets. PMID:15060020

  19. Comparative genomic analysis of single-molecule sequencing and hybrid approaches for finishing the Clostridium autoethanogenum JA1-1 strain DSM 10061 genome

    SciTech Connect

    Brown, Steven D; Nagaraju, Shilpa; Utturkar, Sagar M; De Tissera, Sashini; Segovia, Simón; Mitchell, Wayne; Land, Miriam L; Dassanayake, Asela; Köpke, Michael

    2014-01-01

    Background Clostridium autoethanogenum strain JA1-1 (DSM 10061) is an acetogen capable of fermenting CO, CO2 and H2 (e.g. from syngas or waste gases) into biofuel ethanol and commodity chemicals such as 2,3-butanediol. A draft genome sequence consisting of 100 contigs has been published. Results A closed, high-quality genome sequence for C. autoethanogenum DSM10061 was generated using only the latest single-molecule DNA sequencing technology and without the need for manual finishing. It is assigned to the most complex genome classification based upon genome features such as repeats, prophage, and nine copies of the rRNA gene operons. It has a low G + C content of 31.1%. Illumina, 454, and Illumina/454 hybrid assemblies were generated and then compared to the draft and PacBio assemblies using summary statistics, CGAL, QUAST and REAPR bioinformatics tools and comparative genomic approaches. Assemblies based upon shorter read DNA technologies were confounded by the large number of repeats and their size, which in the case of the rRNA gene operons were ~5 kb. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) systems among biotechnologically relevant Clostridia were classified and related to plasmid content and prophages. Potential associations between plasmid content and CRISPR systems may have implications for historical industrial scale Acetone-Butanol-Ethanol (ABE) fermentation failures and future large scale bacterial fermentations. While C. autoethanogenum contains an active CRISPR system, no such system is present in the closely related Clostridium ljungdahlii DSM 13528. A common prophage inserted into the Arg-tRNA shared between the strains suggests a common ancestor. However, C. ljungdahlii contains several additional putative prophages and it has more than double the amount of prophage DNA compared to C. autoethanogenum. Other differences include important metabolic genes for central metabolism (as an additional hydrogenase and the absence of a

  20. Comparative genomic analysis of duplicated homoeologous regions involved in the resistance of Brassica napus to stem canker.

    PubMed

    Fopa Fomeju, Berline; Falentin, Cyril; Lassalle, Gilles; Manzanares-Dauleux, Maria J; Delourme, Régine

    2015-01-01

    All crop species are current or ancient polyploids. Following whole genome duplication, structural and functional modifications result in differential gene content or regulation in the duplicated regions, which can play a fundamental role in the diversification of genes underlying complex traits. We have investigated this issue in Brassica napus, a species with a highly duplicated genome, with the aim of studying the structural and functional organization of duplicated regions involved in quantitative resistance to stem canker, a disease caused by the fungal pathogen Leptosphaeria maculans. Genome-wide association analysis on two oilseed rape panels confirmed that duplicated regions of ancestral blocks E, J, R, U, and W were involved in resistance to stem canker. The structural analysis of the duplicated genomic regions showed a higher gene density on the A genome than on the C genome and a better collinearity between homoeologous regions than paralogous regions, as is the case overall in the whole B. napus genome. The three ancestral sub-genomes were involved in the resistance to stem canker and the fractionation profile of the duplicated regions corresponded to what was expected from results on the B. napus progenitors. About 60% of the genes identified in these duplicated regions were single-copy genes while less than 5% were retained in all the duplicated copies of a given ancestral block. Genes retained in several copies were mainly involved in response to stress, signaling, or transcription regulation. Genes with resistance-associated markers were mainly retained in more than two copies. These results suggested that some genes underlying quantitative resistance to stem canker might be duplicated genes. Genes with a hydrolase activity that were retained in one copy or R-like genes might also account for resistance in some regions. Further analyses need to be conducted to indicate to what extent duplicated genes contribute to the expression of the resistance phenotype.

  1. Comparative genomic analysis of duplicated homoeologous regions involved in the resistance of Brassica napus to stem canker

    PubMed Central

    Fopa Fomeju, Berline; Falentin, Cyril; Lassalle, Gilles; Manzanares-Dauleux, Maria J.; Delourme, Régine

    2015-01-01

    All crop species are current or ancient polyploids. Following whole genome duplication, structural and functional modifications result in differential gene content or regulation in the duplicated regions, which can play a fundamental role in the diversification of genes underlying complex traits. We have investigated this issue in Brassica napus, a species with a highly duplicated genome, with the aim of studying the structural and functional organization of duplicated regions involved in quantitative resistance to stem canker, a disease caused by the fungal pathogen Leptosphaeria maculans. Genome-wide association analysis on two oilseed rape panels confirmed that duplicated regions of ancestral blocks E, J, R, U, and W were involved in resistance to stem canker. The structural analysis of the duplicated genomic regions showed a higher gene density on the A genome than on the C genome and a better collinearity between homoeologous regions than paralogous regions, as is the case overall in the whole B. napus genome. The three ancestral sub-genomes were involved in the resistance to stem canker and the fractionation profile of the duplicated regions corresponded to what was expected from results on the B. napus progenitors. About 60% of the genes identified in these duplicated regions were single-copy genes while less than 5% were retained in all the duplicated copies of a given ancestral block. Genes retained in several copies were mainly involved in response to stress, signaling, or transcription regulation. Genes with resistance-associated markers were mainly retained in more than two copies. These results suggested that some genes underlying quantitative resistance to stem canker might be duplicated genes. Genes with a hydrolase activity that were retained in one copy or R-like genes might also account for resistance in some regions. Further analyses need to be conducted to indicate to what extent duplicated genes contribute to the expression of the resistance phenotype.

  2. Genome-Wide Comparative Analysis of the Phospholipase D Gene Families among Allotetraploid Cotton and Its Diploid Progenitors

    PubMed Central

    Tang, Kai; Dong, Chun-Juan; Liu, Jin-Yuan

    2016-01-01

    In this study, 40 phospholipase D (PLD) genes were identified from allotetraploid cotton Gossypium hirsutum, and 20 PLD genes were examined in diploid cotton Gossypium raimondii. Combined with 19 previously identified Gossypium arboreum PLD genes, a comparative analysis was performed of the PLD gene families among the allotetraploid and two diploid cottons. Based on the orthologous relationships, we found that almost every G. hirsutum PLD had a corresponding homolog in the G. arboreum and G. raimondii genomes, except for GhPLDβ3A, whose homolog GaPLDβ3 may have been lost during the evolution of G. arboreum after the interspecific hybridization. Phylogenetic analysis showed that all of the cotton PLDs were unevenly classified into six numbered subgroups: α, β/γ, δ, ε, ζ and φ. An N-terminal C2 domain was found in the α, β/γ, δ and ε subgroups, while phox homology (PX) and pleckstrin homology (PH) domains were identified in the ζ subgroup. The subgroup φ possessed a single peptide instead of a functional domain. In each phylogenetic subgroup, the PLDs showed high conservation in gene structure and amino acid sequences in functional domains. The expansion of the GhPLD and GrPLD gene families was mainly attributed to segmental duplication and partly attributed to tandem duplication. Furthermore, purifying selection played a critical role in the evolution of PLD genes in cotton. Quantitative RT-PCR documented that allotetraploid cotton PLD genes were broadly expressed and each had a unique spatial and developmental expression pattern, indicating their functional diversification in cotton growth and development. Further analysis of cis-regulatory elements elucidated transcriptional regulations and potential functions. Our comparative analysis provided valuable information for understanding the putative functions of the PLD genes in cotton fiber. PMID:27213891
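
    The abstract above concludes that purifying selection shaped the cotton PLD genes but does not state how this was tested. One common approach, assumed here purely for illustration, is to compare the nonsynonymous (Ka) and synonymous (Ks) substitution rates of a duplicated gene pair: a Ka/Ks ratio below 1 is read as purifying selection, near 1 as neutral evolution, and above 1 as positive selection. The Python sketch below applies that rule; the function name, the tolerance parameter and the example rates are hypothetical and are not values from the study.

```python
def selection_mode(ka: float, ks: float, tol: float = 0.05) -> str:
    """Classify a gene pair by its Ka/Ks ratio.

    ka  : nonsynonymous substitutions per nonsynonymous site
    ks  : synonymous substitutions per synonymous site
    tol : hypothetical half-width of the band around 1 treated as neutral
    """
    if ks <= 0:
        # The ratio is undefined when no synonymous change is observed.
        return "undefined"
    ratio = ka / ks
    if ratio < 1 - tol:
        return "purifying"
    if ratio > 1 + tol:
        return "positive"
    return "neutral"


if __name__ == "__main__":
    # Made-up rates for three gene pairs, for illustration only.
    for ka, ks in [(0.1, 0.5), (0.5, 0.5), (0.6, 0.3)]:
        print(ka, ks, selection_mode(ka, ks))
```

    In practice Ka and Ks are estimated from codon alignments with dedicated software, and a statistical test is normally applied rather than a fixed tolerance band.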

  3. Genome-Wide Comparative Analysis of the Phospholipase D Gene Families among Allotetraploid Cotton and Its Diploid Progenitors.

    PubMed

    Tang, Kai; Dong, Chun-Juan; Liu, Jin-Yuan

    2016-01-01

    In this study, 40 phospholipase D (PLD) genes were identified from allotetraploid cotton Gossypium hirsutum, and 20 PLD genes were examined in diploid cotton Gossypium raimondii. Combined with 19 previously identified Gossypium arboreum PLD genes, a comparative analysis was performed of the PLD gene families among the allotetraploid and two diploid cottons. Based on the orthologous relationships, we found that almost every G. hirsutum PLD had a corresponding homolog in the G. arboreum and G. raimondii genomes, except for GhPLDβ3A, whose homolog GaPLDβ3 may have been lost during the evolution of G. arboreum after the interspecific hybridization. Phylogenetic analysis showed that all of the cotton PLDs were unevenly classified into six numbered subgroups: α, β/γ, δ, ε, ζ and φ. An N-terminal C2 domain was found in the α, β/γ, δ and ε subgroups, while phox homology (PX) and pleckstrin homology (PH) domains were identified in the ζ subgroup. The subgroup φ possessed a single peptide instead of a functional domain. In each phylogenetic subgroup, the PLDs showed high conservation in gene structure and amino acid sequences in functional domains. The expansion of the GhPLD and GrPLD gene families was mainly attributed to segmental duplication and partly attributed to tandem duplication. Furthermore, purifying selection played a critical role in the evolution of PLD genes in cotton. Quantitative RT-PCR documented that allotetraploid cotton PLD genes were broadly expressed and each had a unique spatial and developmental expression pattern, indicating their functional diversification in cotton growth and development. Further analysis of cis-regulatory elements elucidated transcriptional regulations and potential functions. Our comparative analysis provided valuable information for understanding the putative functions of the PLD genes in cotton fiber. PMID:27213891

  4. Comparative Genomic and Transcriptomic Analysis of Wangiella dermatitidis, A Major Cause of Phaeohyphomycosis and a Model Black Yeast Human Pathogen

    PubMed Central

    Chen, Zehua; Martinez, Diego A.; Gujja, Sharvari; Sykes, Sean M.; Zeng, Qiandong; Szaniszlo, Paul J.; Wang, Zheng; Cuomo, Christina A.

    2014-01-01

    Black or dark brown (phaeoid) fungi cause cutaneous, subcutaneous, and systemic infections in humans. Black fungi thrive in stressful conditions such as intense light, high radiation, and very low pH. Wangiella (Exophiala) dermatitidis is arguably the most studied phaeoid fungal pathogen of humans. Here, we report our comparative analysis of the genome of W. dermatitidis and the transcriptional response to low pH stress. This revealed that W. dermatitidis has lost the ability to synthesize alpha-glucan, a cell wall compound many pathogenic fungi use to evade the host immune system. In contrast, W. dermatitidis contains a similar profile of chitin synthase genes as related fungi and strongly induces genes involved in cell wall synthesis in response to pH stress. The large portfolio of transporters may provide W. dermatitidis with an enhanced ability to remove harmful products as well as to survive on diverse nutrient sources. The genome encodes three independent pathways for producing melanin, an ability linked to pathogenesis; these are active during pH stress, potentially to produce a barrier to accumulated oxidative damage that might occur under stress conditions. In addition, a full set of fungal light-sensing genes is present, including as part of a carotenoid biosynthesis gene cluster. Finally, we identify a two-gene cluster involved in nucleotide sugar metabolism conserved with a subset of fungi and characterize a horizontal transfer event of this cluster between fungi and algal viruses. This work reveals how W. dermatitidis has adapted to stress and survives in diverse environments, including during human infections. PMID:24496724

  5. Construction of Global Acyl Lipid Metabolic Map by Comparative Genomics and Subcellular Localization Analysis in the Red Alga Cyanidioschyzon merolae

    PubMed Central

    Mori, Natsumi; Moriyama, Takashi; Toyoshima, Masakazu; Sato, Naoki

    2016-01-01

    Pathways of lipid metabolism have been established in land plants, such as Arabidopsis thaliana, but the information on exact pathways is still under study in microalgae. In contrast with Chlamydomonas reinhardtii, which is currently studied extensively, the pathway information in red algae is still in the state in which enzymes and pathways are estimated by analogy with the knowledge in plants. Here we attempt to construct the entire acyl lipid metabolic pathways in a model red alga, Cyanidioschyzon merolae, as an initial basis for future genetic and biochemical studies, by exploiting comparative genomics and loc