Science.gov

Sample records for comparative genomic analysis

  1. Comparative genomic analysis of esophageal cancers.

    PubMed

    Caygill, Christine P J; Gatenby, Piers A C; Herceg, Zdenko; Lima, Sheila C S; Pinto, Luis F R; Watson, Anthony; Wu, Ming-Shiang

    2014-09-01

    The following, from the 12th OESO World Conference: Cancers of the Esophagus, includes commentaries on comparative genomic analysis of esophageal cancers: genomic polymorphisms, the genetic and epigenetic drivers in esophageal cancers, and the collection of data in the UK Barrett's Oesophagus Registry.

  2. Comparative Genome Analysis of Enterobacter cloacae

    PubMed Central

    Liu, Wing-Yee; Wong, Chi-Fat; Chung, Karl Ming-Kar; Jiang, Jing-Wei; Leung, Frederick Chi-Ching

    2013-01-01

    The Enterobacter cloacae species includes an extremely diverse group of bacteria that are associated with plants, soil and humans. Publication of the complete genome sequence of the plant growth-promoting endophytic E. cloacae subsp. cloacae ENHKU01 provided an opportunity to perform the first comparative genome analysis between strains of this dynamic species. Examination of the pan-genome of E. cloacae showed that the conserved core genome retains the general physiological and survival genes of the species, while genomic factors in plasmids and variable regions determine the virulence of the human pathogenic E. cloacae strain; additionally, the diversity of fimbriae contributes to variation in colonization and host determination of different E. cloacae strains. Comparative genome analysis further illustrated that E. cloacae strains possess multiple mechanisms for antagonistic action against other microorganisms, which involve the production of siderophores and various antimicrobial compounds, such as bacteriocins, chitinases and antibiotic resistance proteins. The presence of Type VI secretion systems is expected to provide further fitness advantages for E. cloacae in microbial competition, thus allowing it to survive in different environments. Competition assays were performed to support our observations in genomic analysis, where E. cloacae subsp. cloacae ENHKU01 demonstrated antagonistic activities against a wide range of plant pathogenic fungal and bacterial species. PMID:24069314
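
    The core/variable split described in this abstract can be pictured as plain set operations over per-strain gene-family lists. The following is a minimal sketch, not the authors' pipeline: the strain names and gene labels are hypothetical, and it assumes genes have already been grouped into families.

```python
# Toy pan-genome partition. Strain names and gene-family labels are
# hypothetical placeholders, not data from the study.

def partition_pangenome(gene_sets):
    """Split gene families into pan, core (all strains) and accessory (some strains)."""
    strains = list(gene_sets.values())
    pan = set().union(*strains)          # every family seen in any strain
    core = set.intersection(*strains)    # families shared by all strains
    accessory = pan - core               # families missing from at least one strain
    return pan, core, accessory

example_strains = {
    "strain_A": {"gyrB", "rpoB", "fimA", "t6ss_1"},
    "strain_B": {"gyrB", "rpoB", "fimH"},
    "strain_C": {"gyrB", "rpoB", "fimA", "bla"},
}

pan, core, accessory = partition_pangenome(example_strains)
print(sorted(core))
print(sorted(accessory))
```

    Real analyses first cluster proteins into orthologous families; that step is assumed to have happened before this function is called.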

  3. Comparative genome analysis of Basidiomycete fungi

    SciTech Connect

    Riley, Robert; Salamov, Asaf; Henrissat, Bernard; Nagy, Laszlo; Brown, Daren; Held, Benjamin; Baker, Scott; Blanchette, Robert; Boussau, Bastien; Doty, Sharon L.; Fagnan, Kirsten; Floudas, Dimitris; Levasseur, Anthony; Manning, Gerard; Martin, Francis; Morin, Emmanuelle; Otillar, Robert; Pisabarro, Antonio; Walton, Jonathan; Wolfe, Ken; Hibbett, David; Grigoriev, Igor

    2013-08-07

    Fungi of the phylum Basidiomycota (basidiomycetes) make up some 37 percent of the described fungi, and are important in forestry, agriculture, medicine, and bioenergy. This diverse phylum includes symbionts, pathogens, and saprotrophs including the majority of wood decaying and ectomycorrhizal species. To better understand the genetic diversity of this phylum we compared the genomes of 35 basidiomycetes including 6 newly sequenced genomes. These genomes span extremes of genome size, gene number, and repeat content. Analysis of core genes reveals that some 48 percent of basidiomycete proteins are unique to the phylum with nearly half of those (22 percent) found in only one organism. Correlations between lifestyle and certain gene families are evident. Phylogenetic patterns of plant biomass-degrading genes in Agaricomycotina suggest a continuum rather than a dichotomy between the white rot and brown rot modes of wood decay. Based on phylogenetically-informed PCA analysis of wood decay genes, we predict that Botryobasidium botryosum and Jaapia argillacea have properties similar to white rot species, although neither has typical ligninolytic class II fungal peroxidases (PODs). This prediction is supported by growth assays in which both fungi exhibit wood decay with white rot-like characteristics. Based on this, we suggest that the white/brown rot dichotomy may be inadequate to describe the full range of wood decaying fungi. Analysis of the rate of discovery of proteins with no or few homologs suggests the value of continued sequencing of basidiomycete fungi.

  4. Comparative Analysis of Genome Sequences with VISTA

    DOE Data Explorer

    Dubchak, Inna

    VISTA is a comprehensive suite of programs and databases developed by and hosted at the Genomics Division of Lawrence Berkeley National Laboratory. They provide information and tools designed to facilitate comparative analysis of genomic sequences. Users have two ways to interact with the suite of applications at the VISTA portal. They can submit their own sequences and alignments for analysis (VISTA servers) or examine pre-computed whole-genome alignments of different species. A key menu option is the Enhancer Browser and Database at http://enhancer.lbl.gov/. The VISTA Enhancer Browser is a central resource for experimentally validated human noncoding fragments with gene enhancer activity as assessed in transgenic mice. Most of these noncoding elements were selected for testing based on their extreme conservation with other vertebrates. The results of this enhancer screen are provided through this publicly available website. The browser also features relevant results by external contributors and a large collection of additional genome-wide conserved noncoding elements which are candidate enhancer sequences. The LBL developers invite external groups to submit computational predictions of developmental enhancers. As of 10/19/2009 the database contains information on 1109 in vivo tested elements - 508 elements with enhancer activity.

  5. Comparative genomic analysis of prion genes

    PubMed Central

    Premzl, Marko; Gamulin, Vera

    2007-01-01

    Background The homologues of human disease genes are expected to contribute to better understanding of physiological and pathogenic processes. We made use of the present availability of vertebrate genomic sequences, and we have conducted the most comprehensive comparative genomic analysis of the prion protein gene PRNP and its homologues, shadow of prion protein gene SPRN and doppel gene PRND, and prion testis-specific gene PRNT so far. Results While the SPRN and PRNP homologues are present in all vertebrates, PRND is known in tetrapods, and PRNT is present in primates. PRNT could be viewed as a TE-associated gene. Using human as the base sequence for genomic sequence comparisons (VISTA), we annotated numerous potential cis-elements. The conserved regions in SPRNs harbour the potential Sp1 sites in promoters (mammals, birds), C-rich intron splicing enhancers and PTB intron splicing silencers in introns (mammals, birds), and hsa-miR-34a sites in 3'-UTRs (eutherians). We showed the conserved PRNP upstream regions, which may be potential enhancers or silencers (primates, dog). In the PRNP 3'-UTRs, there are conserved cytoplasmic polyadenylation element sites (mammals, birds). The PRND core promoters include highly conserved CCAAT, CArG and TATA boxes (mammals). We deduced 42 new protein primary structures, and performed the first phylogenetic analysis of all vertebrate prion genes. Using the protein alignment which included 122 sequences, we constructed the neighbour-joining tree which showed four major clusters, including shadoos, shadoo2s and prion protein-likes (cluster 1), fish prion proteins (cluster 2), tetrapode prion proteins (cluster 3) and doppels (cluster 4). We showed that the entire prion protein conformationally plastic region is well conserved between eutherian prion proteins and shadoos (18–25% identity and 28–34% similarity), and there could be a potential structural compatibility between shadoos and the left-handed parallel beta-helical fold

  6. Comparative Genome Analysis in the Integrated Microbial Genomes (IMG) System

    SciTech Connect

    Kyrpides, Nikos C.; Markowitz, Victor M.

    2006-03-01

    Comparative genome analysis is critical for the effective exploration of a rapidly growing number of complete and draft sequences for microbial genomes. The Integrated Microbial Genomes (IMG) system (img.jgi.doe.gov) has been developed as a community resource that provides support for comparative analysis of microbial genomes in an integrated context. IMG allows users to navigate the multidimensional microbial genome data space and focus their analysis on a subset of genes, genomes, and functions of interest. IMG provides graphical viewers, summaries and occurrence profile tools for comparing genes, pathways and functions (terms) across specific genomes. Genes can be further examined using gene neighborhoods and compared with sequence alignment tools.

  7. Initial sequencing and comparative analysis of the mouse genome

    SciTech Connect

    Waterston, Robert H.; Lindblad-Toh, Kerstin; Birney, Ewan; Rogers, Jane; Abril, Josep F.; Agarwal, Pankaj; Agarwala, Richa; Ainscough, Rachel; Alexandersson, Marina; An, Peter; Antonarakis, Stylianos E.; Attwood, John; Baertsch, Robert; Bailey, Jonathon; Barlow, Karen; Beck, Stephan; Berry, Eric; Birren, Bruce; Bloom, Toby; Bork, Peer; Botcherby, Marc; Bray, Nicolas; Brent, Michael R.; Brown, Daniel G.; Brown, Stephen D.; Bult, Carol; Burton, John; Butler, Jonathan; Campbell, Robert D.; Carninci, Piero; Cawley, Simon; Chiaromonte, Francesca; Chinwalla, Asif T.; Church, Deanna M.; Clamp, Michele; Clee, Christopher; Collins, Francis S.; Cook, Lisa L.; Copley, Richard R.; Coulson, Alan; Couronne, Olivier; Cuff, James; Curwen, Val; Cutts, Tim; Daly, Mark; David, Robert; Davies, Joy; Delehaunty, Kimberly D.; Deri, Justin; Dermitzakis, Emmanouil T.; Dewey, Colin; Dickens, Nicholas J.; Diekhans, Mark; Dodge, Sheila; Dubchak, Inna; Dunn, Diane M.; Eddy, Sean R.; Elnitski, Laura; Emes, Richard D.; Eswara, Pallavi; Eyras, Eduardo; Felsenfeld, Adam; Fewell, Ginger A.; Flicek, Paul; Foley, Karen; Frankel, Wayne N.; Fulton, Lucinda A.; Fulton, Robert S.; Furey, Terrence S.; Gage, Diane; Gibbs, Richard A.; Glusman, Gustavo; Gnerre, Sante; Goldman, Nick; Goodstadt, Leo; Grafham, Darren; Graves, Tina A.; Green, Eric D.; Gregory, Simon; Guigo, Roderic; Guyer, Mark; Hardison, Ross C.; Haussler, David; Hayashizaki, Yoshihide; Hillier, LaDeana W.; Hinrichs, Angela; Hlavina, Wratko; Holzer, Timothy; Hsu, Fan; Hua, Axin; Hubbard, Tim; Hunt, Adrienne; Jackson, Ian; Jaffe, David B.; Johnson, L. Steven; Jones, Matthew; Jones, Thomas A.; Joy, Ann; Kamal, Michael; Karlsson, Elinor K.; Karolchik, Donna; Kasprzyk, Arkadiusz; Kawai, Jun; Keibler, Evan; Kells, Cristyn; Kent, W. James; Kirby, Andrew; Kolbe, Diana L.; Korf, Ian; Kucherlapati, Raju S.; Kulbokas III, Edward J.; Kulp, David; Landers, Tom; Leger, J.P.; Leonard, Steven; Letunic, Ivica; Levine, Rosie; et al.

    2002-12-15

    The sequence of the mouse genome is a key informational tool for understanding the contents of the human genome and a key experimental tool for biomedical research. Here, we report the results of an international collaboration to produce a high-quality draft sequence of the mouse genome. We also present an initial comparative analysis of the mouse and human genomes, describing some of the insights that can be gleaned from the two sequences. We discuss topics including the analysis of the evolutionary forces shaping the size, structure and sequence of the genomes; the conservation of large-scale synteny across most of the genomes; the much lower extent of sequence orthology covering less than half of the genomes; the proportions of the genomes under selection; the number of protein-coding genes; the expansion of gene families related to reproduction and immunity; the evolution of proteins; and the identification of intraspecies polymorphism.

  8. Computational Methods for the Analysis of Array Comparative Genomic Hybridization

    PubMed Central

    Chari, Raj; Lockwood, William W.; Lam, Wan L.

    2006-01-01

    Array comparative genomic hybridization (array CGH) is a technique for assaying the copy number status of cancer genomes. The widespread use of this technology has led to a rapid accumulation of high throughput data, which in turn has prompted the development of computational strategies for the analysis of array CGH data. Here we explain the principles behind array image processing, data visualization and genomic profile analysis, review currently available software packages, and raise considerations for future software development. PMID:17992253
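
    As a rough picture of the genomic profile analysis mentioned above, array CGH data are commonly expressed as log2 ratios of test to reference signal per probe and then thresholded into gains and losses. The sketch below assumes already-normalized intensities and uses an arbitrary cutoff of 0.3; both function names and the cutoff are illustrative assumptions, not taken from any package reviewed in the paper.

```python
import math

def log2_ratios(test, reference):
    """Per-probe log2(test / reference); inputs are positive intensities."""
    return [math.log2(t / r) for t, r in zip(test, reference)]

def call_copy_number(ratios, threshold=0.3):
    """Label each probe as gain, loss or neutral using a symmetric cutoff.

    The default threshold is an illustrative assumption.
    """
    calls = []
    for ratio in ratios:
        if ratio >= threshold:
            calls.append("gain")
        elif ratio <= -threshold:
            calls.append("loss")
        else:
            calls.append("neutral")
    return calls

# Hypothetical intensities for three probes.
ratios = log2_ratios([200.0, 100.0, 50.0], [100.0, 100.0, 100.0])
calls = call_copy_number(ratios)
print(ratios, calls)
```

    Production tools add normalization and segmentation steps before calling; this sketch omits them.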

  9. Comparative genomic analysis of sixty mycobacteriophage genomes: Genome clustering, gene acquisition and gene size

    PubMed Central

    Hatfull, Graham F.; Jacobs-Sera, Deborah; Lawrence, Jeffrey G.; Pope, Welkin H.; Russell, Daniel A.; Ko, Ching-Chung; Weber, Rebecca J.; Patel, Manisha C.; Germane, Katherine L.; Edgar, Robert H.; Hoyte, Natasha N.; Bowman, Charles A.; Tantoco, Anthony T.; Paladin, Elizabeth C.; Myers, Marlana S.; Smith, Alexis L.; Grace, Molly S.; Pham, Thuy T.; O'Brien, Matthew B.; Vogelsberger, Amy M.; Hryckowian, Andrew J.; Wynalek, Jessica L.; Donis-Keller, Helen; Bogel, Matt W.; Peebles, Craig L.; Cresawn, Steve G.; Hendrix, Roger W.

    2010-01-01

    Mycobacteriophages are viruses that infect mycobacterial hosts. Expansion of a collection of sequenced phage genomes to a total of sixty – all infecting a common bacterial host – provides further insight into their diversity and evolution. Of the sixty phage genomes, 55 can be grouped into nine clusters according to their nucleotide sequence similarities, five of which can be further divided into subclusters; five genomes do not cluster with other phages. The sequence diversity between genomes within a cluster varies greatly; for example, the six genomes in cluster D share more than 97.5% average nucleotide similarity with each other. In contrast, similarity between the two genomes in Cluster I is barely detectable by diagonal plot analysis. The total of 6,858 predicted ORFs have been grouped into 1523 phamilies (phams) of related sequences, 46% of which possess only a single member. Only 18.8% of the phams have sequence similarity to non-mycobacteriophage database entries and fewer than 10% of all phams can be assigned functions based on database searching or synteny. Genome clustering facilitates the identification of genes that are in greatest genetic flux and are more likely to have been exchanged horizontally in relatively recent evolutionary time. Although mycobacteriophage genes exhibit smaller average size than genes of their host (205 residues compared to 315), phage genes in higher flux average only ∼100 amino acids, suggesting that the primary units of genetic exchange correspond to single protein domains. PMID:20064525

  10. Comparative Genome Analysis of Basidiomycete Fungi

    SciTech Connect

    Riley, Robert; Salamov, Asaf; Morin, Emmanuelle; Nagy, Laszlo; Manning, Gerard; Baker, Scott; Brown, Daren; Henrissat, Bernard; Levasseur, Anthony; Hibbett, David; Martin, Francis; Grigoriev, Igor

    2012-03-19

    Fungi of the phylum Basidiomycota (basidiomycetes) make up some 37 percent of the described fungi, and are important in forestry, agriculture, medicine, and bioenergy. This diverse phylum includes the mushrooms, wood rots, symbionts, and plant and animal pathogens. To better understand the diversity of phenotypes in basidiomycetes, we performed a comparative analysis of 35 basidiomycete fungi spanning the diversity of the phylum. Phylogenetic patterns of lignocellulose degrading genes suggest a continuum rather than a sharp dichotomy between the white rot and brown rot modes of wood decay. Patterns of secondary metabolic enzymes give additional insight into the broad array of phenotypes found in the basidiomycetes. We suggest that the profile of an organism in lignocellulose-targeting genes can be used to predict its nutritional mode, and predict Dacryopinax sp. as a brown rot; Botryobasidium botryosum and Jaapia argillacea as white rots.

  11. Comparative genomic analysis of eutherian interferon-γ-inducible GTPases.

    PubMed

    Premzl, Marko

    2012-11-01

    The interferon-γ-inducible GTPases, IFGGs, are intracellular proteins involved in immune response against pathogens. A comprehensive comparative genomic review and analysis of eutherian IFGGs was carried out using public genomic sequences. The 64 eutherian IFGG genes were examined in detail and annotated. The eutherian IFGG promoter types were first catalogued followed by a phylogenetic analysis of eutherian IFGGs, which described five major IFGG clusters. The patterns of differential gene expansions and protein regions that may regulate IFGG catalytic features suggested a new classification of eutherian IFGGs. This mini-review has also provided new tests of reliability of public genomic sequences as well as tests of protein molecular evolution.

  12. Analysis of the allohexaploid bread wheat genome (Triticum aestivum) using comparative whole genome shotgun sequencing

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The large 17 Gb allopolyploid genome of bread wheat is a major challenge for genome analysis because it is composed of three closely- related and independently maintained genomes, with genes dispersed as small “islands” separated by vast tracts of repetitive DNA. We used a novel comparative genomi...

  13. Comparative Genomics via Wavelet Analysis for Closely Related Bacteria

    NASA Astrophysics Data System (ADS)

    Song, Jiuzhou; Ware, Tony; Liu, Shu-Lin; Surette, M.

    2004-12-01

    Comparative genomics has been a valuable method for extracting and extrapolating genome information among closely related bacteria. The efficiency of traditional methods depends strongly on the software used. To overcome this problem, we propose using wavelet analysis to perform comparative genomics. First, global comparison using wavelet analysis gives the difference at a quantitative level. Then local comparison using keto-excess or purine-excess plots shows precise positions of inversions, translocations, and horizontally transferred DNA fragments. We first found that the level of energy spectra difference is related to the similarity of bacterial strains; it could be a quantitative index to describe the similarities of genomes. The strategy is described in detail by comparisons of closely related strains: S. typhi CT18, S. typhi Ty2, S. typhimurium LT2, H. pylori 26695, and H. pylori J99.
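
    The purine-excess and keto-excess plots used for local comparison are cumulative walks along the sequence. A minimal sketch follows, assuming the usual definitions (purine excess counts A and G as +1 and C and T as -1; keto excess counts G and T as +1 and A and C as -1). The function names are illustrative, and the wavelet step is not shown.

```python
def excess_walk(sequence, plus, minus):
    """Cumulative walk: +1 for bases in `plus`, -1 for bases in `minus`."""
    total = 0
    walk = []
    for base in sequence.upper():
        if base in plus:
            total += 1
        elif base in minus:
            total -= 1
        # Any other symbol (e.g. N) leaves the running total unchanged.
        walk.append(total)
    return walk

def purine_excess(sequence):
    """Purines (A, G) minus pyrimidines (C, T), accumulated along the sequence."""
    return excess_walk(sequence, "AG", "CT")

def keto_excess(sequence):
    """Keto bases (G, T) minus amino bases (A, C), accumulated along the sequence."""
    return excess_walk(sequence, "GT", "AC")

print(purine_excess("AGCT"))
print(keto_excess("AGCT"))
```

    Plotting such walks for two genomes on shared axes shows where their slopes diverge or flip sign, which is how rearrangements such as inversions become visible.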

  14. Arabidopsis transcription factors: genome-wide comparative analysis among eukaryotes.

    PubMed

    Riechmann, J L; Heard, J; Martin, G; Reuber, L; Jiang, C; Keddie, J; Adam, L; Pineda, O; Ratcliffe, O J; Samaha, R R; Creelman, R; Pilgrim, M; Broun, P; Zhang, J Z; Ghandehari, D; Sherman, B K; Yu, G

    2000-12-15

    The completion of the Arabidopsis thaliana genome sequence allows a comparative analysis of transcriptional regulators across the three eukaryotic kingdoms. Arabidopsis dedicates over 5% of its genome to code for more than 1500 transcription factors, about 45% of which are from families specific to plants. Arabidopsis transcription factors that belong to families common to all eukaryotes do not share significant similarity with those of the other kingdoms beyond the conserved DNA binding domains, many of which have been arranged in combinations specific to each lineage. The genome-wide comparison reveals the evolutionary generation of diversity in the regulation of transcription.

  15. AcCNET (Accessory Genome Constellation Network): comparative genomics software for accessory genome analysis using bipartite networks.

    PubMed

    Lanza, Val F; Baquero, Fernando; de la Cruz, Fernando; Coque, Teresa M

    2017-01-15

    AcCNET (Accessory genome Constellation Network) is a Perl application that aims to compare accessory genomes of a large number of genomic units, both at qualitative and quantitative levels. Using the proteomes extracted from the analysed genomes, AcCNET creates a bipartite network compatible with standard network analysis platforms. AcCNET allows merging phylogenetic and functional information about the concerned genomes, thus improving the capability of current methods of network analysis. The AcCNET bipartite network opens a new perspective to explore the pangenome of bacterial species, focusing on the accessory genome behind the idiosyncrasy of a particular strain and/or population.

  16. OGRe: a relational database for comparative analysis of mitochondrial genomes

    PubMed Central

    Jameson, Daniel; Gibson, Andrew P.; Hudelot, Cendrine; Higgs, Paul G.

    2003-01-01

    Organellar Genome Retrieval (OGRe) is a relational database of complete mitochondrial genome sequences for over 250 Metazoan species. OGRe provides a resource for the comparative analysis of mitochondrial genomes at several levels. At the sequence level, OGRe allows the retrieval of any selected set of mitochondrial genes from any selected set of species. Species are classified using a taxonomic system that allows easy selection of related groups of species. Sequence alignments are also available for some species. At the level of individual nucleotides, the system contains information on base frequencies and codon usage frequencies that can be compared between organisms. At the level of whole genomes, OGRe provides several ways of visualizing information on gene order. Diagrams illustrating the genome arrangement can be generated for any selected set of species automatically from the information in the database. Searches can be done based on gene arrangement to find sets of species that have the same order as one another. Diagrams for pairwise comparison of species can be produced that show the positions of break-points in the gene order and use colour to highlight the sections of the genome that have moved. OGRe is available from http://www.bioinf.man.ac.uk/ogre. PMID:12519982
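
    The break-point comparison of gene order described above can be approximated by comparing neighbouring gene pairs between two genomes. This is a simplified sketch, not OGRe's implementation: it assumes circular genomes with identical gene content, ignores strand, and uses a shortened, hypothetical gene list.

```python
def adjacencies(order):
    """Unordered neighbouring gene pairs of a circular gene order."""
    n = len(order)
    return {frozenset((order[i], order[(i + 1) % n])) for i in range(n)}

def count_breakpoints(order_a, order_b):
    """Number of adjacencies in genome A that are not adjacent in genome B."""
    return len(adjacencies(order_a) - adjacencies(order_b))

# Hypothetical gene orders for two species.
species_a = ["cox1", "cox2", "atp8", "atp6", "cox3"]
species_b = ["cox1", "atp8", "cox2", "atp6", "cox3"]

print(count_breakpoints(species_a, species_b))
```

    Two species with the same arrangement give zero break-points, which corresponds to the database search for species sharing one gene order.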

  17. Comparative analysis of methods for genome-wide nucleosome cartography.

    PubMed

    Quintales, Luis; Vázquez, Enrique; Antequera, Francisco

    2015-07-01

    Nucleosomes contribute to compacting the genome into the nucleus and regulate the physical access of regulatory proteins to DNA either directly or through the epigenetic modifications of the histone tails. Precise mapping of nucleosome positioning across the genome is, therefore, essential to understanding the genome regulation. In recent years, several experimental protocols have been developed for this purpose that include the enzymatic digestion, chemical cleavage or immunoprecipitation of chromatin followed by next-generation sequencing of the resulting DNA fragments. Here, we compare the performance and resolution of these methods from the initial biochemical steps through the alignment of the millions of short-sequence reads to a reference genome to the final computational analysis to generate genome-wide maps of nucleosome occupancy. Because of the lack of a unified protocol to process data sets obtained through the different approaches, we have developed a new computational tool (NUCwave), which facilitates their analysis, comparison and assessment and will enable researchers to choose the most suitable method for any particular purpose. NUCwave is freely available at http://nucleosome.usal.es/nucwave along with a step-by-step protocol for its use.

  18. Phylogeny and comparative genome analysis of a Basidiomycete fungi

    SciTech Connect

    Riley, Robert W.; Salamov, Asaf; Grigoriev, Igor; Hibbett, David

    2011-03-14

    Fungi of the phylum Basidiomycota make up some 37 percent of the described fungi, and are important from the perspectives of forestry, agriculture, medicine, and bioenergy. This diverse phylum includes the mushrooms, wood rots, plant pathogenic rusts and smuts, and some human pathogens. To better understand these important fungi, we have undertaken a comparative genomic analysis of the Basidiomycetes with available sequenced genomes. We report a phylogeny that sheds light on previously unclear evolutionary relationships among the Basidiomycetes. We also define a 'core proteome' based on protein families conserved in all Basidiomycetes. We identify key expansions and contractions in protein families that may be responsible for the degradation of plant biomass such as cellulose, hemicellulose, and lignin. Finally, we speculate as to the genomic changes that drove such expansions and contractions.

  19. Comparative Analysis of Genome Sequences Covering the Seven Cronobacter Species

    PubMed Central

    Cummings, Craig A.; Shih, Rita; Degoricija, Lovorka; Rico, Alain; Brzoska, Pius; Hamby, Stephen E.; Masood, Naqash; Hariri, Sumyya; Sonbol, Hana; Chuzhanova, Nadia; McClelland, Michael; Furtado, Manohar R.; Forsythe, Stephen J.

    2012-01-01

    Background Species of Cronobacter are widespread in the environment and are occasional food-borne pathogens associated with serious neonatal diseases, including bacteraemia, meningitis, and necrotising enterocolitis. The genus is composed of seven species: C. sakazakii, C. malonaticus, C. turicensis, C. dublinensis, C. muytjensii, C. universalis, and C. condimenti. Clinical cases are associated with three species, C. malonaticus, C. turicensis and, in particular, with C. sakazakii multilocus sequence type 4. Thus, it is plausible that virulence determinants have evolved in certain lineages. Methodology/Principal Findings We generated high quality sequence drafts for eleven Cronobacter genomes representing the seven Cronobacter species, including an ST4 strain of C. sakazakii. Comparative analysis of these genomes together with the two publicly available genomes revealed Cronobacter has over 6,000 genes in one or more strains and over 2,000 genes shared by all Cronobacter. Considerable variation in the presence of traits such as type six secretion systems, metal resistance (tellurite, copper and silver), and adhesins was found. C. sakazakii is unique in the Cronobacter genus in encoding genes enabling the utilization of exogenous sialic acid which may have clinical significance. The C. sakazakii ST4 strain 701 contained additional genes as compared to other C. sakazakii but none of them were known specific virulence-related genes. Conclusions/Significance Genome comparison revealed that pair-wise DNA sequence identity varies between 89 and 97% in the seven Cronobacter species, and also suggested various degrees of divergence. Sets of universal core genes and accessory genes unique to each strain were identified. These gene sequences can be used for designing genus/species specific detection assays. Genes encoding adhesins, T6SS, and metal resistance genes as well as prophages are found in only subsets of genomes and have contributed considerably to the variation of

  20. Comparative analysis of genomic signal processing for microarray data clustering.

    PubMed

    Istepanian, Robert S H; Sungoor, Ala; Nebel, Jean-Christophe

    2011-12-01

    Genomic signal processing is a new area of research that combines advanced digital signal processing methodologies for enhanced genetic data analysis. It has many promising applications in bioinformatics and next generation of healthcare systems, in particular, in the field of microarray data clustering. In this paper we present a comparative performance analysis of enhanced digital spectral analysis methods for robust clustering of gene expression across multiple microarray data samples. Three digital signal processing methods: linear predictive coding, wavelet decomposition, and fractal dimension are studied to provide a comparative evaluation of the clustering performance of these methods on several microarray datasets. The results of this study show that the fractal approach provides the best clustering accuracy compared to other digital signal processing and well known statistical methods.

  1. Sequencing and comparative analysis of the gorilla MHC genomic sequence.

    PubMed

    Wilming, Laurens G; Hart, Elizabeth A; Coggill, Penny C; Horton, Roger; Gilbert, James G R; Clee, Chris; Jones, Matt; Lloyd, Christine; Palmer, Sophie; Sims, Sarah; Whitehead, Siobhan; Wiley, David; Beck, Stephan; Harrow, Jennifer L

    2013-01-01

    Major histocompatibility complex (MHC) genes play a critical role in vertebrate immune response and because the MHC is linked to a significant number of auto-immune and other diseases it is of great medical interest. Here we describe the clone-based sequencing and subsequent annotation of the MHC region of the gorilla genome. Because the MHC is subject to extensive variation, both structural and sequence-wise, it is not readily amenable to study in whole genome shotgun sequence such as the recently published gorilla genome. The variation of the MHC also makes it of evolutionary interest and therefore we analyse the sequence in the context of human and chimpanzee. In our comparisons with human and re-annotated chimpanzee MHC sequence we find that gorilla has a trimodular RCCX cluster, versus the reference human bimodular cluster, and additional copies of Class I (pseudo)genes between Gogo-K and Gogo-A (the orthologues of HLA-K and -A). We also find that Gogo-H (and Patr-H) is coding versus the HLA-H pseudogene and, conversely, there is a Gogo-DQB2 pseudogene versus the HLA-DQB2 coding gene. Our analysis, which is freely available through the VEGA genome browser, provides the research community with a comprehensive dataset for comparative and evolutionary research of the MHC.

  2. Malignant canine mammary tumours: Preliminary genomic insights using oligonucleotide array comparative genomic hybridisation analysis.

    PubMed

    Santos, Marta; Dias-Pereira, Patrícia; Williams, Christina; Lopes, Carlos; Breen, Matthew

    2017-03-28

    Neoplastic mammary disease in female dogs represents a major health concern for dog owners and veterinarians, but the genomic basis of the disease is poorly understood. In this study, we performed high resolution oligonucleotide array comparative genomic hybridisation (oaCGH) to assess genome wide DNA copy number changes in 10 malignant canine mammary tumours from seven female dogs, including multiple tumours collected at one time from each of three female dogs. In all but two tumours, genomic imbalances were detected, with losses being more common than gains. Canine chromosomes 9, 22, 26, 27, 34 and X were most frequently affected. Dissimilar oaCGH ratio profiles were observed in multiple tumours from the same dogs, providing preliminary evidence for probable independent pathogenesis. Analysis of adjacent samples of one tumour revealed regional differences in the number of genomic imbalances, suggesting heterogeneity within tumours.

  3. Comparative analysis of essential genes in prokaryotic genomic islands

    PubMed Central

    Zhang, Xi; Peng, Chong; Zhang, Ge; Gao, Feng

    2015-01-01

    Essential genes are thought to encode proteins that carry out the basic functions to sustain a cellular life, and genomic islands (GIs) usually contain clusters of horizontally transferred genes. It has been assumed that essential genes are not likely to be located in GIs, but systematical analysis of essential genes in GIs has not been explored before. Here, we have analyzed the essential genes in 28 prokaryotes by statistical method and reached a conclusion that essential genes in GIs are significantly fewer than those outside GIs. The function of 362 essential genes found in GIs has been explored further by BLAST against the Virulence Factor Database (VFDB) and the phage/prophage sequence database of PHAge Search Tool (PHAST). Consequently, 64 and 60 eligible essential genes are found to share the sequence similarity with the virulence factors and phage/prophages-related genes, respectively. Meanwhile, we find several toxin-related proteins and repressors encoded by these essential genes in GIs. The comparative analysis of essential genes in genomic islands will not only shed new light on the development of the prediction algorithm of essential genes, but also give a clue to detect the functionality of essential genes in genomic islands. PMID:26223387
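
    The abstract does not name the statistical method used. One common way to ask whether essential genes are under-represented inside genomic islands is a hypergeometric lower-tail probability, sketched below. The choice of test, the function name and the toy counts are all assumptions for illustration and are not results from the study.

```python
from math import comb

def hypergeom_lower_tail(k, essential_total, gi_genes, genome_genes):
    """P(X <= k): chance of seeing at most k essential genes among gi_genes
    drawn without replacement from genome_genes, of which essential_total
    are essential."""
    non_essential = genome_genes - essential_total
    favourable = sum(
        comb(essential_total, i) * comb(non_essential, gi_genes - i)
        for i in range(k + 1)
    )
    return favourable / comb(genome_genes, gi_genes)

# Toy counts (hypothetical): 10 genes, 5 essential, 2 of them inside islands.
p_none = hypergeom_lower_tail(0, 5, 2, 10)
p_all = hypergeom_lower_tail(2, 5, 2, 10)
print(round(p_none, 4), p_all)
```

    A small lower-tail probability would indicate fewer essential genes inside islands than expected by chance under this model.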

  4. Industrial Acetogenic Biocatalysts: A Comparative Metabolic and Genomic Analysis

    PubMed Central

    Bengelsdorf, Frank R.; Poehlein, Anja; Linder, Sonja; Erz, Catarina; Hummel, Tim; Hoffmeister, Sabrina; Daniel, Rolf; Dürre, Peter

    2016-01-01

    Synthesis gas (syngas) fermentation by anaerobic acetogenic bacteria employing the Wood–Ljungdahl pathway is a bioprocess for production of biofuels and biocommodities. The major fermentation products of the most relevant biocatalytic strains (Clostridium ljungdahlii, C. autoethanogenum, C. ragsdalei, and C. coskatii) are acetic acid and ethanol. A comparative metabolic and genomic analysis using the mentioned biocatalysts might offer targets for metabolic engineering and thus improve the production of compounds apart from ethanol. Autotrophic growth and product formation of the four wild type (WT) strains were compared in uncontrolled batch experiments. The genomes of C. ragsdalei and C. coskatii were sequenced and the genome sequences of all four biocatalytic strains analyzed in comparative manner. Growth and product spectra (acetate, ethanol, 2,3-butanediol) of C. autoethanogenum, C. ljungdahlii, and C. ragsdalei were rather similar. In contrast, C. coskatii produced significantly less ethanol and its genome sequence lacks two genes encoding aldehyde:ferredoxin oxidoreductases (AOR). Comparative genome sequence analysis of the four WT strains revealed high average nucleotide identity (ANI) of C. ljungdahlii and C. autoethanogenum (99.3%) and C. coskatii (98.3%). In contrast, C. ljungdahlii WT and C. ragsdalei WT showed an ANI-based similarity of only 95.8%. Additionally, recombinant C. ljungdahlii strains were constructed that harbor an artificial acetone synthesis operon (ASO) consisting of the following genes: adc, ctfA, ctfB, and thlA (encoding acetoacetate decarboxylase, acetoacetyl-CoA:acetate/butyrate:CoA-transferase subunits A and B, and thiolase) under the control of thlA promoter (PthlA) from C. acetobutylicum or native pta-ack promoter (Ppta-ack) from C. ljungdahlii. Respective recombinant strains produced 2-propanol rather than acetone, due to the presence of a NADPH-dependent primary-secondary alcohol dehydrogenase that converts acetone to 2

  5. A Comparative Analysis of Mitochondrial Genomes in Eustigmatophyte Algae

    PubMed Central

    Ševčíková, Tereza; Klimeš, Vladimír; Zbránková, Veronika; Strnad, Hynek; Hroudová, Miluše; Vlček, Čestmír; Eliáš, Marek

    2016-01-01

    Eustigmatophyceae (Ochrophyta, Stramenopiles) is a small algal group with species of the genus Nannochloropsis being its best studied representatives. Nuclear and organellar genomes have been recently sequenced for several Nannochloropsis spp., but phylogenetically wider genomic studies are missing for eustigmatophytes. We sequenced mitochondrial genomes (mitogenomes) of three species representing most major eustigmatophyte lineages, Monodopsis sp. MarTras21, Vischeria sp. CAUP Q 202 and Trachydiscus minutus, and carried out their comparative analysis in the context of available data from Nannochloropsis and other stramenopiles, revealing a number of noticeable findings. First, mitogenomes of most eustigmatophytes are highly collinear and similar in the gene content, but extensive rearrangements and loss of three otherwise ubiquitous genes happened in the Vischeria lineage; this correlates with an accelerated evolution of mitochondrial gene sequences in this lineage. Second, eustigmatophytes appear to be the only ochrophyte group with the Atp1 protein encoded by the mitogenome. Third, eustigmatophyte mitogenomes uniquely share a truncated nad11 gene encoding only the C-terminal part of the Nad11 protein, while the N-terminal part is encoded by a separate gene in the nuclear genome. Fourth, UGA as a termination codon and the cognate release factor mRF2 were lost from mitochondria independently by the Nannochloropsis and T. minutus lineages. Finally, the rps3 gene in the mitogenome of Vischeria sp. is interrupted by the UAG codon, but the genome includes a gene for an unusual tRNA with an extended anticodon loop that we speculate may serve as a suppressor tRNA to properly decode the rps3 gene. PMID:26872774

  6. Comparative genomic analysis of Chlamydia trachomatis oculotropic and genitotropic strains.

    PubMed

    Carlson, John H; Porcella, Stephen F; McClarty, Grant; Caldwell, Harlan D

    2005-10-01

    Chlamydia trachomatis infection is an important cause of preventable blindness and sexually transmitted disease (STD) in humans. C. trachomatis exists as multiple serovariants that exhibit distinct organotropism for the eye or urogenital tract. We previously reported tissue-tropic correlations with the presence or absence of a functional tryptophan synthase and a putative GTPase-inactivating domain of the chlamydial toxin gene. This suggested that these genes may be the primary factors responsible for chlamydial disease organotropism. To test this hypothesis, the genome of an oculotropic trachoma isolate (A/HAR-13) was sequenced and compared to the genome of a genitotropic (D/UW-3) isolate. Remarkably, the genomes share 99.6% identity, supporting the conclusion that a functional tryptophan synthase enzyme and toxin might be the principal virulence factors underlying disease organotropism. Tarp (translocated actin-recruiting phosphoprotein) was identified to have variable numbers of repeat units within the N and C portions of the protein. A correlation exists between lymphogranuloma venereum serovars and the number of N-terminal repeats. Single-nucleotide polymorphism (SNP) analysis between the two genomes highlighted the minimal genetic variation. A disproportionate number of SNPs were observed within some members of the polymorphic membrane protein (pmp) autotransporter gene family that corresponded to predicted T-cell epitopes that bind HLA class I and II alleles. These results implicate Pmps as novel immune targets, which could advance future chlamydial vaccine strategies. Lastly, a novel target for PCR diagnostics was discovered that can discriminate between ocular and genital strains. This discovery will enhance epidemiological investigations in nations where both trachoma and chlamydial STD are endemic.

  7. Comparative Genomic Analysis of Mannheimia haemolytica from Bovine Sources

    PubMed Central

    Klima, Cassidy L.; Cook, Shaun R.; Zaheer, Rahat; Laing, Chad; Gannon, Vick P.; Xu, Yong; Rasmussen, Jay; Potter, Andrew; Hendrick, Steve; Alexander, Trevor W.; McAllister, Tim A.

    2016-01-01

    Bovine respiratory disease is a common health problem in beef production. The primary bacterial agent involved, Mannheimia haemolytica, is a target for antimicrobial therapy and at risk for associated antimicrobial resistance development. The role of M. haemolytica in pathogenesis is linked to serotype, with serotypes 1 (S1) and 6 (S6) isolated from pneumonic lesions and serotype 2 (S2) found in the upper respiratory tract of healthy animals. Here, we sequenced the genomes of 11 strains of M. haemolytica, representing all three serotypes, and performed comparative genomics analysis to identify genetic features that may contribute to pathogenesis. Possible virulence-associated genes were identified within 14 distinct prophages, including a periplasmic chaperone, a lipoprotein, peptidoglycan glycosyltransferase and a stress response protein. Prophage content ranged from 2 to 8 per genome, but was higher in S1 and S6 strains. A type I-C CRISPR-Cas system was identified in each strain, with spacer diversity and organization conserved among serotypes. The majority of spacers occur in S1 and S6 strains and originate from phage, suggesting that serotypes 1 and 6 may be more resistant to phage predation. However, two spacers complementary to the host chromosome, targeting a UDP-N-acetylglucosamine 2-epimerase and a glycosyl transferase group 1 gene, are present in S1 and S6 strains only, indicating these serotypes may employ CRISPR-Cas to regulate gene expression to avoid host immune responses or enhance adhesion during infection. Integrative conjugative elements are present in nine of the eleven genomes. Three of these harbor extensive multi-drug resistance cassettes encoding resistance against the majority of drugs used to combat infection in beef cattle, including macrolides and tetracyclines used in human medicine. The findings here identify key features that are likely contributing to serotype related pathogenesis and specific targets for vaccine design intended to reduce the

  8. Establishing a framework for comparative analysis of genome sequences

    SciTech Connect

    Bansal, A.K.

    1995-06-01

    This paper describes a framework and a high-level language toolkit for comparative analysis of genome sequence alignments. The framework integrates the information derived from a multiple sequence alignment and a phylogenetic tree (hypothetical tree of evolution) to derive new properties about sequences. Multiple sequence alignments are treated as an abstract data type. Abstract operations have been described to manipulate a multiple sequence alignment and to derive mutation-related information from a phylogenetic tree by superimposing parsimonious analysis. The framework has been applied to protein alignments to derive constrained columns (in a multiple sequence alignment) that exhibit evolutionary pressure to preserve a common property in a column despite mutation. A Prolog toolkit based on the framework has been implemented and demonstrated on alignments containing 3000 sequences and 3904 columns.
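
    The idea of a constrained column can be illustrated with a short sketch. This is not the Prolog toolkit described above and it ignores the phylogenetic tree; it only flags alignment columns whose residues vary but all fall into one property class. The residue classes and the function name are simplified assumptions made for illustration.

```python
# Rough, hypothetical residue property classes (the paper's actual sets are not given).
PROPERTY_CLASSES = {
    "hydrophobic": set("AVLIMFWC"),
    "positive": set("KRH"),
    "negative": set("DE"),
    "polar": set("STNQYGP"),
}


def constrained_columns(alignment):
    """Return (column_index, property) for columns that mutate within one class."""
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("alignment must be non-empty with equal-length sequences")
    hits = []
    for index, column in enumerate(zip(*alignment)):
        residues = set(column)
        # Skip gapped columns and fully invariant columns (no mutation observed).
        if "-" in residues or len(residues) < 2:
            continue
        for name, members in PROPERTY_CLASSES.items():
            if residues <= members:
                hits.append((index, name))
                break
    return hits


# Toy alignment (illustration only).
toy_alignment = ["MKVLD", "MRILE", "MKLLD", "MRVLE"]
print(constrained_columns(toy_alignment))
```

    In the toy alignment, columns 1, 2 and 4 change residue identity while staying positive, hydrophobic and negative respectively, which is the kind of preserved property the abstract refers to.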

  9. Comparative genome analysis of Pseudomonas genomes including Populus-associated isolates

    DOE PAGES

    Jun, Se Ran; Wassenaar, Trudy; Nookaew, Intawat; ...

    2016-01-01

    The Pseudomonas genus contains a metabolically versatile group of organisms that are known to occupy numerous ecological niches including the rhizosphere and endosphere of many plants influencing phylogenetic diversity and heterogeneity. In this study, comparative genome analysis was performed on over one thousand Pseudomonas genomes, including 21 Pseudomonas strains isolated from the roots of native Populus deltoides. Based on average amino acid identity, genomic clusters were identified within the Pseudomonas genus, which showed agreements with clades by NCBI and cliques by IMG. The P. fluorescens group was organized into 20 distinct genomic clusters, representing enormous diversity and heterogeneity. The species P. aeruginosa showed clear distinction in their genomic relatedness compared to other Pseudomonas species groups based on the pan and core genome analysis. The 19 isolates of our 21 Populus-associated isolates formed three distinct subgroups within the P. fluorescens major group, supported by pathway profiles analysis, while two isolates were more closely related to P. chlororaphis and P. putida. The specific genes to Populus-associated subgroups were identified where genes specific to subgroup 1 include several sensory systems such as proteins which act in two-component signal transduction, a TonB-dependent receptor, and a phosphorelay sensor; specific genes to subgroup 2 contain unique hypothetical genes; and genes specific to subgroup 3 organisms have a different hydrolase activity. IMPORTANCE The comparative genome analyses of the genus Pseudomonas that included Populus-associated isolates resulted in novel insights into high diversity of Pseudomonas. Consistent and robust genomic clusters with phylogenetic homogeneity were identified, which resolved species-clades that are not clearly defined by 16S rRNA gene sequence analysis alone. The genomic clusters may be reflective of distinct ecological niches to which the organisms have adapted, but

  10. Comparative genome analysis of Pseudomonas genomes including Populus-associated isolates

    SciTech Connect

    Jun, Se Ran; Wassenaar, Trudy; Nookaew, Intawat; Hauser, Loren John; Wanchai, Visanu; Land, Miriam L.; Timm, Collin M.; Lu, Tse-Yuan S.; Schadt, Christopher Warren; Doktycz, Mitchel John; Pelletier, Dale A; Ussery, David W

    2016-01-01

    The Pseudomonas genus contains a metabolically versatile group of organisms that are known to occupy numerous ecological niches including the rhizosphere and endosphere of many plants influencing phylogenetic diversity and heterogeneity. In this study, comparative genome analysis was performed on over one thousand Pseudomonas genomes, including 21 Pseudomonas strains isolated from the roots of native Populus deltoides. Based on average amino acid identity, genomic clusters were identified within the Pseudomonas genus, which showed agreements with clades by NCBI and cliques by IMG. The P. fluorescens group was organized into 20 distinct genomic clusters, representing enormous diversity and heterogeneity. The species P. aeruginosa showed clear distinction in their genomic relatedness compared to other Pseudomonas species groups based on the pan and core genome analysis. The 19 isolates of our 21 Populus-associated isolates formed three distinct subgroups within the P. fluorescens major group, supported by pathway profiles analysis, while two isolates were more closely related to P. chlororaphis and P. putida. The specific genes to Populus-associated subgroups were identified where genes specific to subgroup 1 include several sensory systems such as proteins which act in two-component signal transduction, a TonB-dependent receptor, and a phosphorelay sensor; specific genes to subgroup 2 contain unique hypothetical genes; and genes specific to subgroup 3 organisms have a different hydrolase activity. IMPORTANCE The comparative genome analyses of the genus Pseudomonas that included Populus-associated isolates resulted in novel insights into high diversity of Pseudomonas. Consistent and robust genomic clusters with phylogenetic homogeneity were identified, which resolved species-clades that are not clearly defined by 16S rRNA gene sequence analysis alone. The genomic clusters may be reflective of distinct ecological niches to which the organisms have adapted, but this

  11. The genome sequence of Blochmannia floridanus: Comparative analysis of reduced genomes

    PubMed Central

    Gil, Rosario; Silva, Francisco J.; Zientz, Evelyn; Delmotte, François; González-Candelas, Fernando; Latorre, Amparo; Rausell, Carolina; Kamerbeek, Judith; Gadau, Jürgen; Hölldobler, Bert; van Ham, Roeland C. H. J.; Gross, Roy; Moya, Andrés

    2003-01-01

    Bacterial symbioses are widespread among insects, probably being one of the key factors of their evolutionary success. We present the complete genome sequence of Blochmannia floridanus, the primary endosymbiont of carpenter ants. Although these ants feed on a complex diet, this symbiosis very likely has a nutritional basis: Blochmannia is able to supply nitrogen and sulfur compounds to the host while it takes advantage of the host metabolic machinery. Remarkably, these bacteria lack all known genes involved in replication initiation (dnaA, priA, and recA). The phylogenetic analysis of a set of conserved protein-coding genes shows that Bl. floridanus is phylogenetically related to Buchnera aphidicola and Wigglesworthia glossinidia, the other endosymbiotic bacteria whose complete genomes have been sequenced so far. Comparative analysis of the five known genomes from insect endosymbiotic bacteria reveals they share only 313 genes, a number that may be close to the minimum gene set necessary to sustain endosymbiotic life. PMID:12886019
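
    The shared gene set mentioned above is, computationally, an intersection of per-genome gene sets. A minimal sketch follows, assuming genes have already been grouped into ortholog identifiers; the genome labels and gene identifiers are placeholders, not data from the study.

```python
def core_and_pan(genomes):
    """Return (core, pan): genes present in every genome, and in at least one."""
    gene_sets = [set(genes) for genes in genomes.values()]
    if not gene_sets:
        return set(), set()
    core = set.intersection(*gene_sets)
    pan = set.union(*gene_sets)
    return core, pan


# Placeholder ortholog identifiers for three hypothetical genomes.
toy_genomes = {
    "genome_1": {"og1", "og2", "og3", "og4"},
    "genome_2": {"og1", "og2", "og5"},
    "genome_3": {"og1", "og2", "og3", "og6"},
}
core, pan = core_and_pan(toy_genomes)
print(len(core), len(pan))
```

    In practice the difficult step is the ortholog assignment that precedes this set operation; the intersection itself is straightforward.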

  12. Genomic characteristics and comparative genomics analysis of Penicillium chrysogenum KF-25

    PubMed Central

    2014-01-01

    Background Penicillium chrysogenum has been used in producing penicillin and derived β-lactam antibiotics for many years. Although the genome of the mutant strain P. chrysogenum Wisconsin 54-1255 has already been sequenced, the versatility and genetic diversity of this species still needs to be intensively studied. In this study, the genome of the wild-type P. chrysogenum strain KF-25, which has high activity against Ustilaginoidea virens, was sequenced and characterized. Results The genome of KF-25 was about 29.9 Mb in size and contained 9,804 putative open reading frames (orfs). Thirteen genes were predicted to encode two-component system proteins, of which six were putatively involved in osmolarity adaption. There were 33 putative secondary metabolism pathways and numerous genes that were essential in metabolite biosynthesis. Several P. chrysogenum virus untranslated region sequences were found in the KF-25 genome, suggesting that there might be a relationship between the virus and P. chrysogenum in evolution. Comparative genome analysis showed that the genomes of KF-25 and Wisconsin 54-1255 were highly similar, except that KF-25 was 2.3 Mb smaller. Three hundred and fifty-five KF-25 specific genes were found and the biological functions of the proteins encoded by these genes were mainly unknown (232, representing 65%), except for some orfs encoding proteins with predicted functions in transport, metabolism, and signal transduction. Numerous KF-25-specific genes were found to be associated with the pathogenicity and virulence of the strains, which were identical to those of wild-type P. chrysogenum NRRL 1951. Conclusion Genome sequencing and comparative analysis are helpful in further understanding the biology, evolution, and environment adaption of P. chrysogenum, and provide a new tool for identifying further functional metabolites. PMID:24555742

  13. Genome Sequence and Comparative Genome Analysis of Lactobacillus casei: Insights into Their Niche-Associated Evolution

    PubMed Central

    Cai, Hui; Thompson, Rebecca; Budinich, Mateo F.; Broadbent, Jeff R.

    2009-01-01

    Lactobacillus casei is remarkably adaptable to diverse habitats and widely used in the food industry. To reveal the genomic features that contribute to its broad ecological adaptability and examine the evolution of the species, the genome sequence of L. casei ATCC 334 is analyzed and compared with other sequenced lactobacilli. This analysis reveals that ATCC 334 contains a high number of coding sequences involved in carbohydrate utilization and transcriptional regulation, reflecting its requirement for dealing with diverse environmental conditions. A comparison of the genome sequences of ATCC 334 to L. casei BL23 reveals 12 and 19 genomic islands, respectively. For a broader assessment of the genetic variability within L. casei, gene content of 21 L. casei strains isolated from various habitats (cheeses, n = 7; plant materials, n = 8; and human sources, n = 6) was examined by comparative genome hybridization with an ATCC 334-based microarray. This analysis resulted in identification of 25 hypervariable regions. One of these regions contains an overrepresentation of genes involved in carbohydrate utilization and transcriptional regulation and was thus proposed as a lifestyle adaptation island. Differences in L. casei genome inventory reveal both gene gain and gene decay. Gene gain, via acquisition of genomic islands, likely confers a fitness benefit in specific habitats. Gene decay, that is, loss of unnecessary ancestral traits, is observed in the cheese isolates and likely results in enhanced fitness in the dairy niche. This study gives the first picture of the stable versus variable regions in L. casei and provides valuable insights into evolution, lifestyle adaptation, and metabolic diversity of L. casei. PMID:20333194

  14. Using comparative genome analysis to identify problems in annotated microbial genomes.

    PubMed

    Poptsova, Maria S; Gogarten, J Peter

    2010-07-01

    Genome annotation is a tedious task that is mostly done by automated methods; however, the accuracy of these approaches has been questioned since the beginning of the sequencing era. Genome annotation is a multilevel process, and errors can emerge at different stages: during sequencing, as a result of gene-calling procedures, and in the process of assigning gene functions. Missed or wrongly annotated genes differentially impact different types of analyses. Here we discuss and demonstrate how the methods of comparative genome analysis can refine annotations by locating missing orthologues. We also discuss possible reasons for errors and show that the second-generation annotation systems, which combine multiple gene-calling programs with similarity-based methods, perform much better than the first annotation tools. Since old errors may propagate to the newly sequenced genomes, we emphasize that the problem of continuously updating popular public databases is an urgent and unresolved one. Due to the progress in genome-sequencing technologies, automated annotation techniques will remain the main approach in the future. Researchers need to be aware of the existing errors in the annotation of even well-studied genomes, such as Escherichia coli, and consider additional quality control for their results.

  15. A Mitochondrial Genome of Rhyparochromidae (Hemiptera: Heteroptera) and a Comparative Analysis of Related Mitochondrial Genomes

    PubMed Central

    Li, Teng; Yang, Jie; Li, Yinwan; Cui, Ying; Xie, Qiang; Bu, Wenjun; Hillis, David M.

    2016-01-01

    The Rhyparochromidae, the largest family of Lygaeoidea, encompasses more than 1,850 described species, but no mitochondrial genome has been sequenced to date. Here we describe the first mitochondrial genome for Rhyparochromidae: a complete mitochondrial genome of Panaorus albomaculatus (Scott, 1874). This mitochondrial genome is comprised of 16,345 bp, and contains the expected 37 genes and control region. The majority of the control region is made up of a large tandem-repeat region, which has a novel pattern not previously observed in other insects. The tandem-repeats region of P. albomaculatus consists of 53 tandem duplications (including one partial repeat), which is the largest number of tandem repeats among all the known insect mitochondrial genomes. Slipped-strand mispairing during replication is likely to have generated this novel pattern of tandem repeats. Comparative analysis of tRNA gene families in sequenced Pentatomomorpha and Lygaeoidea species shows that the pattern of nucleotide conservation is markedly higher on the J-strand. Phylogenetic reconstruction based on mitochondrial genomes suggests that Rhyparochromidae is not the sister group to all the remaining Lygaeoidea, and supports the monophyly of Lygaeoidea. PMID:27756915

  16. MGcV: the microbial genomic context viewer for comparative genome analysis

    PubMed Central

    2013-01-01

    Background Conserved gene context is used in many types of comparative genome analyses. It is used to provide leads on gene function, to guide the discovery of regulatory sequences, but also to aid in the reconstruction of metabolic networks. We present the Microbial Genomic context Viewer (MGcV), an interactive, web-based application tailored to strengthen the practice of manual comparative genome context analysis for bacteria. Results MGcV is a versatile, easy-to-use tool that renders a visualization of the genomic context of any set of selected genes, genes within a phylogenetic tree, genomic segments, or regulatory elements. It is tailored to facilitate laborious tasks such as the interactive annotation of gene function, the discovery of regulatory elements, or the sequence-based reconstruction of gene regulatory networks. We illustrate that MGcV can be used in gene function annotation by visually integrating information on prokaryotic genes, like their annotation as available from NCBI with other annotation data such as Pfam domains, sub-cellular location predictions and gene-sequence characteristics such as GC content. We also illustrate the usefulness of the interactive features that allow the graphical selection of genes to facilitate data gathering (e.g. upstream regions, ID’s or annotation), in the analysis and reconstruction of transcription regulation. Moreover, putative regulatory elements and their corresponding scores or data from RNA-seq and microarray experiments can be uploaded, visualized and interpreted in (ranked-) comparative context maps. The ranked maps allow the interpretation of predicted regulatory elements and experimental data in light of each other. Conclusion MGcV advances the manual comparative analysis of genes and regulatory elements by providing fast and flexible integration of gene related data combined with straightforward data retrieval. MGcV is available at http://mgcv.cmbi.ru.nl. PMID:23547764

  17. Genome analysis and comparative genomics of a Giardia intestinalis assemblage E isolate

    PubMed Central

    2010-01-01

    Background Giardia intestinalis is a protozoan parasite that causes diarrhea in a wide range of mammalian species. To further understand the genetic diversity between the Giardia intestinalis species, we have performed genome sequencing and analysis of a wild-type Giardia intestinalis sample from the assemblage E group, isolated from a pig. Results We identified 5012 protein coding genes, the majority of which are conserved compared to the previously sequenced genomes of the WB and GS strains in terms of microsynteny and sequence identity. Despite this, there is an unexpectedly large number of chromosomal rearrangements and several smaller structural changes that are present in all chromosomes. Novel members of the VSP, NEK Kinase and HCMP gene families were identified, which may reveal possible mechanisms for host specificity and new avenues for antigenic variation. We used comparative genomics of the three diverse Giardia intestinalis isolates P15, GS and WB to define a core proteome for this species complex and to identify lineage-specific genes. Extensive analyses of polymorphisms in the core proteome of Giardia revealed differential rates of divergence among cellular processes. Conclusions Our results indicate that despite a well conserved core of genes there is significant genome variation between Giardia isolates, both in terms of gene content, gene polymorphisms, structural chromosomal variations and surface molecule repertoires. This study improves the annotation of the Giardia genomes and enables the identification of functionally important variation. PMID:20929575

  18. Diversity of Pseudomonas Genomes, Including Populus-Associated Isolates, as Revealed by Comparative Genome Analysis.

    PubMed

    Jun, Se-Ran; Wassenaar, Trudy M; Nookaew, Intawat; Hauser, Loren; Wanchai, Visanu; Land, Miriam; Timm, Collin M; Lu, Tse-Yuan S; Schadt, Christopher W; Doktycz, Mitchel J; Pelletier, Dale A; Ussery, David W

    2015-10-30

    The Pseudomonas genus contains a metabolically versatile group of organisms that are known to occupy numerous ecological niches, including the rhizosphere and endosphere of many plants. Their diversity influences the phylogenetic diversity and heterogeneity of these communities. On the basis of average amino acid identity, comparative genome analysis of >1,000 Pseudomonas genomes, including 21 Pseudomonas strains isolated from the roots of native Populus deltoides (eastern cottonwood) trees resulted in consistent and robust genomic clusters with phylogenetic homogeneity. All Pseudomonas aeruginosa genomes clustered together, and these were clearly distinct from other Pseudomonas species groups on the basis of pangenome and core genome analyses. In contrast, the genomes of Pseudomonas fluorescens were organized into 20 distinct genomic clusters, representing enormous diversity and heterogeneity. Most of our 21 Populus-associated isolates formed three distinct subgroups within the major P. fluorescens group, supported by pathway profile analysis, while two isolates were more closely related to Pseudomonas chlororaphis and Pseudomonas putida. Genes specific to Populus-associated subgroups were identified. Genes specific to subgroup 1 include several sensory systems that act in two-component signal transduction, a TonB-dependent receptor, and a phosphorelay sensor. Genes specific to subgroup 2 contain hypothetical genes, and genes specific to subgroup 3 were annotated with hydrolase activity. This study justifies the need to sequence multiple isolates, especially from P. fluorescens, which displays the most genetic variation, in order to study functional capabilities from a pangenomic perspective. This information will prove useful when choosing Pseudomonas strains for use to promote growth and increase disease resistance in plants.

  19. Assigning protein functions by comparative genome analysis protein phylogenetic profiles

    DOEpatents

    Pellegrini, Matteo; Marcotte, Edward M.; Thompson, Michael J.; Eisenberg, David; Grothe, Robert; Yeates, Todd O.

    2003-05-13

    A computational method, system, and computer program are provided for inferring functional links from genome sequences. One method is based on the observation that some pairs of proteins A' and B' have homologs in another organism fused into a single protein chain AB. A trans-genome comparison of sequences can reveal these AB sequences, which are Rosetta Stone sequences because they decipher an interaction between A' and B'. Another method compares the genomic sequences of two or more organisms to create a phylogenetic profile for each protein, indicating its presence or absence across all the genomes. The profile provides information regarding functional links between different families of proteins. In yet another method, a combination of the above two methods is used to predict functional links.
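
    The phylogenetic profile idea can be sketched in a few lines: each protein gets a presence/absence vector across genomes, and proteins with matching vectors become candidates for a functional link. This is an illustrative simplification, not the patented implementation; the function names, protein labels and genome labels are hypothetical, and homolog detection is assumed to have been done beforehand.

```python
from collections import defaultdict


def phylogenetic_profiles(presence, genomes):
    """Map each protein to a tuple of 0/1 flags, one per genome, in the given order."""
    return {
        protein: tuple(int(genome in found_in) for genome in genomes)
        for protein, found_in in presence.items()
    }


def group_by_profile(profiles):
    """Group proteins that share an identical profile; keep groups of two or more."""
    groups = defaultdict(list)
    for protein, profile in profiles.items():
        groups[profile].append(protein)
    return {profile: sorted(members) for profile, members in groups.items() if len(members) > 1}


# Hypothetical input: genomes in which a homolog of each protein was detected.
genome_order = ["G1", "G2", "G3", "G4"]
homolog_presence = {
    "P1": {"G1", "G3"},
    "P2": {"G1", "G3"},
    "P3": {"G2"},
    "P4": {"G1", "G2", "G3", "G4"},
}
profiles = phylogenetic_profiles(homolog_presence, genome_order)
print(group_by_profile(profiles))
```

    Exact matching is the strictest choice; a tolerance based on Hamming distance between profiles would be a natural relaxation, at the cost of more false links.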

  20. Comparative Analysis of Six Lagerstroemia Complete Chloroplast Genomes

    PubMed Central

    Xu, Chao; Dong, Wenpan; Li, Wenqing; Lu, Yizeng; Xie, Xiaoman; Jin, Xiaobai; Shi, Jipu; He, Kaihong; Suo, Zhili

    2017-01-01

    Crape myrtles are economically important ornamental trees of the genus Lagerstroemia L. (Lythraceae), with a distribution from tropical to northern temperate zones. They are positioned phylogenetically to a large subclade of rosids (in the eudicots) which contain more than 25% of all the angiosperms. They commonly bloom from summer till fall and are of significant value in city landscape and environmental protection. Morphological traits are shared inter-specifically among plants of Lagerstroemia to certain extent and are also influenced by environmental conditions and different developmental stages. Thus, classification of plants in Lagerstroemia at species and cultivar levels is still a challenging task. Chloroplast (cp) genome sequences have been proven to be an informative and valuable source of cp DNA markers for genetic diversity evaluation. In this study, the complete cp genomes of three Lagerstroemia species were newly sequenced, and three other published cp genome sequences of Lagerstroemia were retrieved for comparative analyses in order to obtain an upgraded understanding of the application value of genetic information from the cp genomes. The six cp genomes ranged from 152,049 bp (L. subcostata) to 152,526 bp (L. speciosa) in length. We analyzed nucleotide substitutions, insertions/deletions, and simple sequence repeats in the cp genomes, and discovered 12 relatively highly variable regions that will potentially provide plastid markers for further taxonomic, phylogenetic, and population genetics studies in Lagerstroemia. The phylogenetic relationships of the Lagerstroemia taxa inferred from the datasets from the cp genomes obtained high support, indicating that cp genome data may be useful in resolving relationships in this genus. PMID:28154574
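
    The abstract does not say which tool was used to detect simple sequence repeats (SSRs). As a rough sketch of what such a scan does, the following uses a regular expression with a back-reference to find perfect tandem repeats of short motifs. The function name, thresholds and test sequence are assumptions chosen for illustration.

```python
import re


def find_ssrs(sequence, max_motif=3, min_repeats=4):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    The lazy quantifier makes the regex try the shortest motif first,
    so a run such as AAAAAA is reported with motif 'A' rather than 'AA'.
    """
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif, min_repeats - 1))
    return [
        (match.start(), match.group(1), len(match.group(0)) // len(match.group(1)))
        for match in pattern.finditer(sequence.upper())
    ]


# Made-up test sequence containing an (AT)4 and an (A)5 repeat.
toy_sequence = "GGATATATATCCAAAAAG"
print(find_ssrs(toy_sequence))
```

    Dedicated SSR tools typically use different minimum repeat counts for each motif length and can also report compound or imperfect repeats, which this sketch does not attempt.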

  1. Comparative and demographic analysis of orang-utan genomes.

    PubMed

    Locke, Devin P; Hillier, LaDeana W; Warren, Wesley C; Worley, Kim C; Nazareth, Lynne V; Muzny, Donna M; Yang, Shiaw-Pyng; Wang, Zhengyuan; Chinwalla, Asif T; Minx, Pat; Mitreva, Makedonka; Cook, Lisa; Delehaunty, Kim D; Fronick, Catrina; Schmidt, Heather; Fulton, Lucinda A; Fulton, Robert S; Nelson, Joanne O; Magrini, Vincent; Pohl, Craig; Graves, Tina A; Markovic, Chris; Cree, Andy; Dinh, Huyen H; Hume, Jennifer; Kovar, Christie L; Fowler, Gerald R; Lunter, Gerton; Meader, Stephen; Heger, Andreas; Ponting, Chris P; Marques-Bonet, Tomas; Alkan, Can; Chen, Lin; Cheng, Ze; Kidd, Jeffrey M; Eichler, Evan E; White, Simon; Searle, Stephen; Vilella, Albert J; Chen, Yuan; Flicek, Paul; Ma, Jian; Raney, Brian; Suh, Bernard; Burhans, Richard; Herrero, Javier; Haussler, David; Faria, Rui; Fernando, Olga; Darré, Fleur; Farré, Domènec; Gazave, Elodie; Oliva, Meritxell; Navarro, Arcadi; Roberto, Roberta; Capozzi, Oronzo; Archidiacono, Nicoletta; Della Valle, Giuliano; Purgato, Stefania; Rocchi, Mariano; Konkel, Miriam K; Walker, Jerilyn A; Ullmer, Brygg; Batzer, Mark A; Smit, Arian F A; Hubley, Robert; Casola, Claudio; Schrider, Daniel R; Hahn, Matthew W; Quesada, Victor; Puente, Xose S; Ordoñez, Gonzalo R; López-Otín, Carlos; Vinar, Tomas; Brejova, Brona; Ratan, Aakrosh; Harris, Robert S; Miller, Webb; Kosiol, Carolin; Lawson, Heather A; Taliwal, Vikas; Martins, André L; Siepel, Adam; Roychoudhury, Arindam; Ma, Xin; Degenhardt, Jeremiah; Bustamante, Carlos D; Gutenkunst, Ryan N; Mailund, Thomas; Dutheil, Julien Y; Hobolth, Asger; Schierup, Mikkel H; Ryder, Oliver A; Yoshinaga, Yuko; de Jong, Pieter J; Weinstock, George M; Rogers, Jeffrey; Mardis, Elaine R; Gibbs, Richard A; Wilson, Richard K

    2011-01-27

    'Orang-utan' is derived from a Malay term meaning 'man of the forest' and aptly describes the southeast Asian great apes native to Sumatra and Borneo. The orang-utan species, Pongo abelii (Sumatran) and Pongo pygmaeus (Bornean), are the most phylogenetically distant great apes from humans, thereby providing an informative perspective on hominid evolution. Here we present a Sumatran orang-utan draft genome assembly and short read sequence data from five Sumatran and five Bornean orang-utan genomes. Our analyses reveal that, compared to other primates, the orang-utan genome has many unique features. Structural evolution of the orang-utan genome has proceeded much more slowly than other great apes, evidenced by fewer rearrangements, less segmental duplication, a lower rate of gene family turnover and surprisingly quiescent Alu repeats, which have played a major role in restructuring other primate genomes. We also describe a primate polymorphic neocentromere, found in both Pongo species, emphasizing the gradual evolution of orang-utan genome structure. Orang-utans have extremely low energy usage for a eutherian mammal, far lower than their hominid relatives. Adding their genome to the repertoire of sequenced primates illuminates new signals of positive selection in several pathways including glycolipid metabolism. From the population perspective, both Pongo species are deeply diverse; however, Sumatran individuals possess greater diversity than their Bornean counterparts, and more species-specific variation. Our estimate of Bornean/Sumatran speciation time, 400,000 years ago, is more recent than most previous studies and underscores the complexity of the orang-utan speciation process. Despite a smaller modern census population size, the Sumatran effective population size (N(e)) expanded exponentially relative to the ancestral N(e) after the split, while Bornean N(e) declined over the same period. Overall, the resources and analyses presented here offer new

  2. Comparative genomics analysis in Prunoideae to identify biologically relevant polymorphisms.

    PubMed

    Koepke, Tyson; Schaeffer, Scott; Harper, Artemus; Dicenta, Federico; Edwards, Mark; Henry, Robert J; Møller, Birger L; Meisel, Lee; Oraguzie, Nnadozie; Silva, Herman; Sánchez-Pérez, Raquel; Dhingra, Amit

    2013-09-01

    Prunus is an economically important genus with a wide range of physiological and biological variability. Using the peach genome as a reference, sequencing reads from four almond accessions and one sweet cherry cultivar were used for comparative analysis of these three Prunus species. Reference mapping enabled the identification of many biologically relevant polymorphisms within the individuals. Examining the depth of the polymorphisms and the overall scaffold coverage, we identified many potentially interesting regions including hundreds of small scaffolds with no coverage from any individual. Nonsense mutations account for about 70,000 of the 13 million identified single nucleotide polymorphisms (SNPs). Blast2GO analyses on these nonsense SNPs revealed several interesting results. First, nonsense SNPs were not evenly distributed across all gene ontology terms. Specifically, in comparison with peach, sweet cherry is found to have nonsense SNPs in two 1-aminocyclopropane-1-carboxylate synthase (ACS) genes and two 1-aminocyclopropane-1-carboxylate oxidase (ACO) genes. These polymorphisms may be at the root of the nonclimacteric ripening of sweet cherry. A set of candidate genes associated with bitterness in almond was identified by comparing sweet and bitter almond sequences. To the best of our knowledge, this is the first report in plants of nonsense SNP abundance in a genus being linked to specific GO terms.
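
    To make the notion of a nonsense SNP concrete, here is a minimal Python sketch that checks whether a single-base substitution turns a sense codon into a stop codon under the standard genetic code. The function name is_nonsense_snp and the example codons are hypothetical; the abstract does not describe the authors' actual annotation pipeline.

```python
# Stop codons of the standard genetic code (DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_nonsense_snp(codon, offset, alt_base):
    """Return True if substituting alt_base at codon[offset] creates a stop codon.

    Hypothetical helper for illustration; offset is 0, 1 or 2.
    """
    codon = codon.upper()
    if len(codon) != 3 or offset not in (0, 1, 2):
        raise ValueError("expected a 3-base codon and an offset of 0, 1 or 2")
    if codon in STOP_CODONS:
        # The reference is already a stop codon, so no stop is gained.
        return False
    mutated = codon[:offset] + alt_base.upper() + codon[offset + 1:]
    return mutated in STOP_CODONS


print(is_nonsense_snp("CAG", 0, "T"))  # CAG (Gln) -> TAG (stop)
print(is_nonsense_snp("CAG", 2, "A"))  # CAG (Gln) -> CAA (Gln), synonymous
```

    In practice the codon and offset would come from a gene annotation and the strand of the transcript, both of which are outside the scope of this sketch.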

  3. Comparative Genomics Analysis in Prunoideae to Identify Biologically Relevant Polymorphisms

    PubMed Central

    Koepke, Tyson; Schaeffer, Scott; Harper, Artemus; Dicenta, Federico; Edwards, Mark; Henry, Robert J.; Møller, Birger Lindberg; Meisel, Lee; Oraguzie, Nnadozie; Silva, Herman; Sánchez-Pérez, Raquel; Dhingra, Amit

    2013-01-01

    Prunus is an economically important genus with a wide range of physiological and biological variability. Using the peach genome as a reference, sequencing reads from four almond accessions and one sweet cherry cultivar were used for comparative analysis of these three Prunus species. Reference mapping enabled the identification of many biologically relevant polymorphisms within the individuals. Examining the depth of the polymorphisms and the overall scaffold coverage, we identified many potentially interesting regions including hundreds of small scaffolds with no coverage from any individual. Nonsense mutations account for about 70,000 of the 13 million identified single nucleotide polymorphisms (SNPs). Blast2GO analyses on these nonsense SNPs revealed several interesting results. First, nonsense SNPs were not evenly distributed across all gene ontology terms. Specifically, in comparison to peach, sweet cherry is found to have nonsense SNPs in two 1-aminocyclopropane-1-carboxylate synthase (ACS) genes and two 1-aminocyclopropane-1-carboxylate oxidase (ACO) genes. These polymorphisms may be at the root of the non-climacteric ripening of sweet cherry. A set of candidate genes associated with bitterness in almond was identified by comparing sweet and bitter almond sequences. To the best of our knowledge, this is the first report in plants of nonsense SNP abundance in a genus being linked to specific GO terms. PMID:23763653

  4. Comparative Analysis of Genome Diversity in Bullmastiff Dogs

    PubMed Central

    Mortlock, Sally-Anne; Khatkar, Mehar S.; Williamson, Peter

    2016-01-01

    Management and preservation of genomic diversity in dog breeds is a major objective for maintaining health. The present study was undertaken to characterise genomic diversity in Bullmastiff dogs using both genealogical and molecular analysis. Genealogical analysis of diversity was conducted using a database consisting of 16,378 Bullmastiff pedigrees from 1980 to 2013. Additionally, a total of 188 Bullmastiff dogs were genotyped using the 170,000 SNP Illumina CanineHD Beadchip. Genealogical parameters revealed a mean inbreeding coefficient of 0.047; 142 total founders (f); an effective number of founders (fe) of 79; an effective number of ancestors (fa) of 62; and an effective population size of the reference population of 41. Genetic diversity and the degree of genome-wide homogeneity within the breed were also investigated using molecular data. Multiple-locus heterozygosity (MLH) was equal to 0.206; runs of homozygosity (ROH), as a proportion of the genome, averaged 16.44%; and effective population size was 29.1, with an average inbreeding coefficient of 0.035, all estimated using SNP data. Fine-scale population structure was analysed using NETVIEW, a population analysis pipeline. Visualisation of the high definition network captured relationships among individuals within and between subpopulations. Effects of unequal founder use, and ancestral inbreeding and selection, were evident. While current levels of Bullmastiff heterozygosity, inbreeding and homozygosity are not unusual, a relatively small effective population size indicates that a breeding strategy to reduce the inbreeding rate may be beneficial. PMID:26824579
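
    The gap between total founders (142) and the effective number of founders (79) reported above reflects unequal founder use. A minimal Python sketch of the commonly used definition, fe = 1 / sum(p_i^2), where p_i is the proportional genetic contribution of founder i, is given below. The function name and the toy contributions are assumptions for illustration; the abstract does not specify the software or exact formula the authors applied.

```python
def effective_founders(contributions):
    """Effective number of founders, fe = 1 / sum(p_i ** 2).

    contributions: non-negative genetic contributions of each founder to the
    reference population (any scale; they are normalised to proportions here).
    """
    total = sum(contributions)
    if total <= 0:
        raise ValueError("contributions must sum to a positive value")
    proportions = [c / total for c in contributions]
    return 1.0 / sum(p * p for p in proportions)


# Toy data (hypothetical): four founders used equally vs. one dominant founder.
equal_use = effective_founders([1, 1, 1, 1])
skewed_use = effective_founders([7, 1, 1, 1])
print(equal_use, skewed_use)
```

    With equal contributions fe equals the actual number of founders; the more skewed the contributions, the further fe falls below it.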

  5. Survey Sequencing and Comparative Analysis of the Elephant Shark (Callorhinchus milii) Genome

    PubMed Central

    Venkatesh, Byrappa; Kirkness, Ewen F; Loh, Yong-Hwee; Halpern, Aaron L; Lee, Alison P; Johnson, Justin; Dandona, Nidhi; Viswanathan, Lakshmi D; Tay, Alice; Venter, J. Craig; Strausberg, Robert L; Brenner, Sydney

    2007-01-01

    Owing to their phylogenetic position, cartilaginous fishes (sharks, rays, skates, and chimaeras) provide a critical reference for our understanding of vertebrate genome evolution. The relatively small genome of the elephant shark, Callorhinchus milii, a chimaera, makes it an attractive model cartilaginous fish genome for whole-genome sequencing and comparative analysis. Here, the authors describe survey sequencing (1.4× coverage) and comparative analysis of the elephant shark genome, one of the first cartilaginous fish genomes to be sequenced to this depth. Repetitive sequences, represented mainly by a novel family of short interspersed element–like and long interspersed element–like sequences, account for about 28% of the elephant shark genome. Fragments of approximately 15,000 elephant shark genes reveal specific examples of genes that have been lost differentially during the evolution of tetrapod and teleost fish lineages. Interestingly, the degree of conserved synteny and of conserved sequence between the human and elephant shark genomes is higher than that between human and teleost fish genomes. The elephant shark contains four putative Hox clusters, indicating that, unlike teleost fish genomes, the elephant shark genome has not experienced an additional whole-genome duplication. These findings underscore the importance of the elephant shark as a critical reference vertebrate genome for comparative analysis of the human and other vertebrate genomes. This study also demonstrates that a survey-sequencing approach can be applied productively for comparative analysis of distantly related vertebrate genomes. PMID:17407382

  6. Hidden Markov models for evolution and comparative genomics analysis.

    PubMed

    Bykova, Nadezda A; Favorov, Alexander V; Mironov, Andrey A

    2013-01-01

    The problem of reconstruction of ancestral states given a phylogeny and data from extant species arises in a wide range of biological studies. The continuous-time Markov model for the evolution of discrete states is generally used for the reconstruction of ancestral states. We modify this model to account for a case when the states of the extant species are uncertain. This situation appears, for example, if the states for extant species are predicted by some program and thus are known only with some level of reliability; this is common in bioinformatics. The main idea is to formulate the problem as a hidden Markov model on a tree (tree HMM, tHMM), where the basic continuous-time Markov model is expanded with the introduction of emission probabilities of observed data (e.g. prediction scores) for each underlying discrete state. Our tHMM decoding algorithm allows us to predict states at the ancestral nodes as well as to refine states at the leaves on the basis of quantitative comparative genomics. The test on the simulated data shows that the tHMM approach applied to the continuous variable reflecting the probabilities of the states (i.e. prediction score) appears to be more accurate than the reconstruction from the discrete states assignment defined by the best score threshold. We provide examples of applying our model to the evolutionary analysis of N-terminal signal peptides and transcription factor binding sites in bacteria. The program is freely available at http://bioinf.fbb.msu.ru/~nadya/tHMM and via web-service at http://bioinf.fbb.msu.ru/treehmmweb.
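
    The idea of combining a continuous-time Markov model with emission probabilities at the leaves can be sketched in a few lines of Python. This is not the authors' tHMM program: it is a simplified upward (pruning) pass for a symmetric two-state model, and the tree encoding, the function names, the rate value and the emission numbers are all assumptions chosen for illustration.

```python
import math


def transition(rate, t):
    """2x2 transition matrix of a symmetric two-state continuous-time Markov chain."""
    same = 0.5 + 0.5 * math.exp(-2.0 * rate * t)
    return [[same, 1.0 - same], [1.0 - same, same]]


def partial_likelihood(node, rate):
    """Likelihood of the observations below node, for each state at node."""
    kind, payload = node
    if kind == "leaf":
        # payload = (P(observation | state 0), P(observation | state 1)),
        # i.e. soft evidence instead of a hard 0/1 state call.
        return list(payload)
    result = [1.0, 1.0]
    for child, branch_length in payload:
        child_like = partial_likelihood(child, rate)
        p = transition(rate, branch_length)
        for s in (0, 1):
            result[s] *= sum(p[s][c] * child_like[c] for c in (0, 1))
    return result


def root_posterior(tree, rate, prior=(0.5, 0.5)):
    """Posterior probability of each state at the root."""
    like = partial_likelihood(tree, rate)
    joint = [prior[s] * like[s] for s in (0, 1)]
    total = sum(joint)
    return [j / total for j in joint]


# Hypothetical star tree: two leaves favour state 1, one weakly favours state 0.
tree = ("node", [
    (("leaf", (0.1, 0.9)), 0.1),
    (("leaf", (0.2, 0.8)), 0.1),
    (("leaf", (0.6, 0.4)), 0.1),
])
posterior = root_posterior(tree, rate=1.0)
print(posterior)
```

    A full decoder would add a downward pass to refine the leaf states as well, which the abstract describes but this sketch omits.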

  7. Marine invertebrate lipases: Comparative and functional genomic analysis.

    PubMed

    Rivera-Perez, Crisalejandra

    2015-09-01

    Lipases are key enzymes involved in lipid digestion, storage and mobilization of reserves during fasting or heightened metabolic demand. This is a highly conserved process, essential for survival. The genomes of five marine invertebrate species with distinctive digestive systems were screened for the six major lipase families. The two most common families in marine invertebrates, the neutral and acid lipases, are also the main families in mammals and insects. The number of lipases varies two-fold across analyzed genomes. A high degree of orthology with mammalian lipases was observed. Interestingly, 19% of the marine invertebrate lipases have lost motifs required for catalysis. Analysis of the lid and loop regions of the neutral lipases suggests that many marine invertebrates have a functional triacylglycerol hydrolytic activity as well as some acid lipases. A revision of the expression profiles and functional activity on sequences in databases and scientific literature provided information regarding the function of these families of enzymes in marine invertebrates.

  8. Ensembl comparative genomics resources.

    PubMed

    Herrero, Javier; Muffato, Matthieu; Beal, Kathryn; Fitzgerald, Stephen; Gordon, Leo; Pignatelli, Miguel; Vilella, Albert J; Searle, Stephen M J; Amode, Ridwan; Brent, Simon; Spooner, William; Kulesha, Eugene; Yates, Andrew; Flicek, Paul

    2016-01-01

    Evolution provides the unifying framework with which to understand biology. The coherent investigation of genic and genomic data often requires comparative genomics analyses based on whole-genome alignments, sets of homologous genes and other relevant datasets in order to evaluate and answer evolutionary-related questions. However, the complexity and computational requirements of producing such data are substantial: this has led to only a small number of reference resources that are used for most comparative analyses. The Ensembl comparative genomics resources are one such reference set that facilitates comprehensive and reproducible analysis of chordate genome data. Ensembl computes pairwise and multiple whole-genome alignments from which large-scale synteny, per-base conservation scores and constrained elements are obtained. Gene alignments are used to define Ensembl Protein Families, GeneTrees and homologies for both protein-coding and non-coding RNA genes. These resources are updated frequently and have a consistent informatics infrastructure and data presentation across all supported species. Specialized web-based visualizations are also available including synteny displays, collapsible gene tree plots, a gene family locator and different alignment views. The Ensembl comparative genomics infrastructure is extensively reused for the analysis of non-vertebrate species by other projects including Ensembl Genomes and Gramene and much of the information here is relevant to these projects. The consistency of the annotation across species and the focus on vertebrates makes Ensembl an ideal system to perform and support vertebrate comparative genomic analyses. We use robust software and pipelines to produce reference comparative data and make it freely available. Database URL: http://www.ensembl.org.

  9. Delineation of Steroid-Degrading Microorganisms through Comparative Genomic Analysis

    PubMed Central

    Bergstrand, Lee H.; Cardenas, Erick; Holert, Johannes; Van Hamme, Jonathan D.

    2016-01-01

    Steroids are ubiquitous in natural environments and are a significant growth substrate for microorganisms. Microbial steroid metabolism is also important for some pathogens and for biotechnical applications. This study delineated the distribution of aerobic steroid catabolism pathways among over 8,000 microorganisms whose genomes are available in the NCBI RefSeq database. Combined analysis of bacterial, archaeal, and fungal genomes with both hidden Markov models and reciprocal BLAST identified 265 putative steroid degraders within only Actinobacteria and Proteobacteria, which mainly originated from soil, eukaryotic host, and aquatic environments. These bacteria include members of 17 genera not previously known to contain steroid degraders. A pathway for cholesterol degradation was conserved in many actinobacterial genera, particularly in members of the Corynebacterineae, and a pathway for cholate degradation was conserved in members of the genus Rhodococcus. A pathway for testosterone and, sometimes, cholate degradation had a patchy distribution among Proteobacteria. The steroid degradation genes tended to occur within large gene clusters. Growth experiments confirmed bioinformatic predictions of steroid metabolism capacity in nine bacterial strains. The results indicate there was a single ancestral 9,10-seco-steroid degradation pathway. Gene duplication, likely in a progenitor of Rhodococcus, later gave rise to a cholate degradation pathway. Proteobacteria and additional Actinobacteria subsequently obtained a cholate degradation pathway via horizontal gene transfer, in some cases facilitated by plasmids. Catabolism of steroids appears to be an important component of the ecological niches of broad groups of Actinobacteria and individual species of Proteobacteria. PMID:26956583
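
    The reciprocal BLAST step mentioned above is often implemented as a reciprocal-best-hit filter. The Python sketch below shows that filter on already-parsed hit tuples; the function names, the gene labels and the scores are hypothetical, and the abstract does not say exactly how the authors combined BLAST results with their hidden Markov model searches.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    hits: iterable of (query, subject, bit_score) tuples.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(forward, reverse):
    """Return sorted (reference_gene, target_gene) pairs that are mutual best hits."""
    fwd = best_hits(forward)
    rev = best_hits(reverse)
    return sorted((q, s) for q, s in fwd.items() if rev.get(s) == q)


# Hypothetical search results: reference genes vs. a target genome and back.
forward = [
    ("ref_gene_1", "target_a", 250.0),
    ("ref_gene_1", "target_b", 90.0),
    ("ref_gene_2", "target_c", 180.0),
]
reverse = [
    ("target_a", "ref_gene_1", 245.0),
    ("target_c", "other_gene", 200.0),
    ("target_c", "ref_gene_2", 150.0),
]
pairs = reciprocal_best_hits(forward, reverse)
print(pairs)
```

    Here ref_gene_2 is dropped because its best target hits another gene better in the reverse search, which is the situation the reciprocal check is meant to catch.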

  10. Comparative genomic analysis of teleost fish bmal genes.

    PubMed

    Wang, Han

    2009-05-01

    The Bmal1 (brain and muscle ARNT-like 1) gene is a key circadian clock gene. Tetrapods also have a second Bmal gene, Bmal2. The fruit fly has only one bmal1/cycle gene. Interrogation of five teleost fish genome sequences, coupled with phylogenetic and splice site analyses, found that zebrafish have two bmal1 genes, bmal1a and bmal1b, and bmal2a; Japanese pufferfish (fugu), green spotted pufferfish (tetraodon) and Japanese medaka fish each have two bmal2 genes, bmal2a and bmal2b, and bmal1a; and three-spine stickleback have bmal1a and bmal2b. Syntenic analysis further indicated that zebrafish bmal1a/bmal1b, and fugu, tetraodon and medaka bmal2a/bmal2b are ancient duplicates. Although the dN/dS ratios of these four fish bmal duplicates are all <1, implying that they have been under purifying selection, the Tajima relative rate test showed that fugu, tetraodon and medaka bmal2a/bmal2b have asymmetric evolutionary rates, suggesting that one of these duplicates has been subject to positive selection or relaxed functional constraint. These results support the notion that teleost fish bmal genes were derived from the fish-specific genome duplication (FSGD) and that divergent resolution following the duplication led to the retention of different ancient bmal duplicates in different fishes, which could have shaped the evolution of the complex teleost fish timekeeping mechanisms.
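
    For readers unfamiliar with the Tajima relative rate test, the following Python sketch computes its statistic from three aligned sequences: two duplicates and an outgroup. It counts sites where only one of the two duplicates differs from the outgroup (m1 and m2) and returns chi-square = (m1 - m2)^2 / (m1 + m2), which is compared against a chi-square distribution with one degree of freedom. The function name and the toy sequences are hypothetical; they are not the bmal data analysed in the paper.

```python
def tajima_relative_rate(seq1, seq2, outgroup):
    """Return (m1, m2, chi_square) for Tajima's relative rate test.

    m1: sites where seq1 differs while seq2 matches the outgroup.
    m2: sites where seq2 differs while seq1 matches the outgroup.
    Sites containing a gap character '-' are skipped.
    """
    if not (len(seq1) == len(seq2) == len(outgroup)):
        raise ValueError("sequences must be aligned to the same length")
    m1 = m2 = 0
    for a, b, o in zip(seq1, seq2, outgroup):
        if "-" in (a, b, o):
            continue
        if b == o and a != o:
            m1 += 1
        elif a == o and b != o:
            m2 += 1
    if m1 + m2 == 0:
        return m1, m2, 0.0
    return m1, m2, (m1 - m2) ** 2 / (m1 + m2)


# Toy alignment (hypothetical): the first duplicate carries more unique changes.
result = tajima_relative_rate("CCCCCCAAAA", "AAAAAAAAGG", "AAAAAAAAAA")
print(result)
```

    A chi-square value above 3.841 would reject equal rates at the 5% level; the toy example, with m1 = 6 and m2 = 2, does not reach that threshold.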

  11. e-Fungi: a data resource for comparative analysis of fungal genomes

    PubMed Central

    Hedeler, Cornelia; Wong, Han Min; Cornell, Michael J; Alam, Intikhab; Soanes, Darren M; Rattray, Magnus; Hubbard, Simon J; Talbot, Nicholas J; Oliver, Stephen G; Paton, Norman W

    2007-01-01

    Background: The number of sequenced fungal genomes is ever increasing, with about 200 genomes already fully sequenced or in progress. Only a small percentage of those genomes have been comprehensively studied, for example using techniques from functional genomics. Comparative analysis has proven to be a useful strategy for enhancing our understanding of evolutionary biology and of the less well understood genomes. However, the data required for these analyses tend to be distributed in various heterogeneous data sources, making systematic comparative studies a cumbersome task. Furthermore, comparative analyses benefit from close integration of derived data sets that cluster genes or organisms in a way that eases the expression of requests that clarify points of similarity or difference between species. Description: To support systematic comparative analyses of fungal genomes we have developed the e-Fungi database, which integrates a variety of data for more than 30 fungal genomes. Publicly available genome data, functional annotations, and pathway information have been integrated into a single data repository and complemented with results of comparative analyses, such as MCL and OrthoMCL cluster analysis, and predictions of signaling proteins and the sub-cellular localisation of proteins. To access the data, a library of analysis tasks is available through a web interface. The analysis tasks are motivated by recent comparative genomics studies, and aim to support the study of evolutionary biology as well as community efforts for improving the annotation of genomes. Web services for each query are also available, enabling the tasks to be incorporated into workflows. Conclusion: The e-Fungi database provides fungal biologists with a resource for comparative studies of a large range of fungal genomes. Its analysis library supports the comparative study of genome data, functional annotation, and results of large scale analyses over all the genomes stored in the database

  12. The tiger genome and comparative analysis with lion and snow leopard genomes.

    PubMed

    Cho, Yun Sung; Hu, Li; Hou, Haolong; Lee, Hang; Xu, Jiaohui; Kwon, Soowhan; Oh, Sukhun; Kim, Hak-Min; Jho, Sungwoong; Kim, Sangsoo; Shin, Young-Ah; Kim, Byung Chul; Kim, Hyunmin; Kim, Chang-Uk; Luo, Shu-Jin; Johnson, Warren E; Koepfli, Klaus-Peter; Schmidt-Küntzel, Anne; Turner, Jason A; Marker, Laurie; Harper, Cindy; Miller, Susan M; Jacobs, Wilhelm; Bertola, Laura D; Kim, Tae Hyung; Lee, Sunghoon; Zhou, Qian; Jung, Hyun-Ju; Xu, Xiao; Gadhvi, Priyvrat; Xu, Pengwei; Xiong, Yingqi; Luo, Yadan; Pan, Shengkai; Gou, Caiyun; Chu, Xiuhui; Zhang, Jilin; Liu, Sanyang; He, Jing; Chen, Ying; Yang, Linfeng; Yang, Yulan; He, Jiaju; Liu, Sha; Wang, Junyi; Kim, Chul Hong; Kwak, Hwanjong; Kim, Jong-Soo; Hwang, Seungwoo; Ko, Junsu; Kim, Chang-Bae; Kim, Sangtae; Bayarlkhagva, Damdin; Paek, Woon Kee; Kim, Seong-Jin; O'Brien, Stephen J; Wang, Jun; Bhak, Jong

    2013-01-01

    Tigers and their close relatives (Panthera) are some of the world's most endangered species. Here we report the de novo assembly of an Amur tiger whole-genome sequence as well as the genomic sequences of a white Bengal tiger, African lion, white African lion and snow leopard. Through comparative genetic analyses of these genomes, we find genetic signatures that may reflect molecular adaptations consistent with the big cats' hypercarnivorous diet and muscle strength. We report a snow leopard-specific genetic determinant in EGLN1 (Met39>Lys39), which is likely to be associated with adaptation to high altitude. We also detect a TYR260G>A mutation likely responsible for the white lion coat colour. Tiger and cat genomes show similar repeat composition and an appreciably conserved synteny. Genomic data from the five big cats provide an invaluable resource for resolving easily identifiable phenotypes evident in very close, but distinct, species.

  13. The tiger genome and comparative analysis with lion and snow leopard genomes

    PubMed Central

    Cho, Yun Sung; Hu, Li; Hou, Haolong; Lee, Hang; Xu, Jiaohui; Kwon, Soowhan; Oh, Sukhun; Kim, Hak-Min; Jho, Sungwoong; Kim, Sangsoo; Shin, Young-Ah; Kim, Byung Chul; Kim, Hyunmin; Kim, Chang-uk; Luo, Shu-Jin; Johnson, Warren E.; Koepfli, Klaus-Peter; Schmidt-Küntzel, Anne; Turner, Jason A.; Marker, Laurie; Harper, Cindy; Miller, Susan M.; Jacobs, Wilhelm; Bertola, Laura D.; Kim, Tae Hyung; Lee, Sunghoon; Zhou, Qian; Jung, Hyun-Ju; Xu, Xiao; Gadhvi, Priyvrat; Xu, Pengwei; Xiong, Yingqi; Luo, Yadan; Pan, Shengkai; Gou, Caiyun; Chu, Xiuhui; Zhang, Jilin; Liu, Sanyang; He, Jing; Chen, Ying; Yang, Linfeng; Yang, Yulan; He, Jiaju; Liu, Sha; Wang, Junyi; Kim, Chul Hong; Kwak, Hwanjong; Kim, Jong-Soo; Hwang, Seungwoo; Ko, Junsu; Kim, Chang-Bae; Kim, Sangtae; Bayarlkhagva, Damdin; Paek, Woon Kee; Kim, Seong-Jin; O’Brien, Stephen J.; Wang, Jun; Bhak, Jong

    2013-01-01

    Tigers and their close relatives (Panthera) are some of the world’s most endangered species. Here we report the de novo assembly of an Amur tiger whole-genome sequence as well as the genomic sequences of a white Bengal tiger, African lion, white African lion and snow leopard. Through comparative genetic analyses of these genomes, we find genetic signatures that may reflect molecular adaptations consistent with the big cats’ hypercarnivorous diet and muscle strength. We report a snow leopard-specific genetic determinant in EGLN1 (Met39>Lys39), which is likely to be associated with adaptation to high altitude. We also detect a TYR260G>A mutation likely responsible for the white lion coat colour. Tiger and cat genomes show similar repeat composition and an appreciably conserved synteny. Genomic data from the five big cats provide an invaluable resource for resolving easily identifiable phenotypes evident in very close, but distinct, species. PMID:24045858

  14. Comparative genomic analysis of hyperthermophilic archaeal fuselloviridae viruses

    SciTech Connect

    B. Wiedenheft; K. Stedman; F. Roberto; D. Willits; A. K. Gleske; L. Zoeller; J. Snyder; T. Douglas; M. Young

    2004-02-01

    The complete genome sequences of two Sulfolobus spindle-shaped viruses (SSVs) from acidic hot springs in Kamchatka (Russia) and Yellowstone National Park (United States) have been determined. These nonlytic temperate viruses were isolated from hyperthermophilic Sulfolobus hosts, and both viruses share the spindle-shaped morphology characteristic of the Fuselloviridae family. These two genomes, in combination with the previously determined SSV1 genome from Japan and the SSV2 genome from Iceland, have allowed us to carry out a phylogenetic comparison of these geographically distributed hyperthermal viruses. Each virus contains a circular double-stranded DNA genome of ~15 kbp with approximately 34 open reading frames (ORFs). These Fusellovirus ORFs show little or no similarity to genes in the public databases. In contrast, 18 ORFs are common to all four isolates and may represent the minimal gene set defining this viral group. In general, ORFs on one half of the genome are colinear and highly conserved, while ORFs on the other half are not. One shared ORF among all four genomes is an integrase of the tyrosine recombinase family. All four viral genomes integrate into their host tRNA genes. The specific tRNA gene used for integration varies, and one genome integrates into multiple loci. Several unique ORFs are found in the genome of each isolate.

  15. Whole genome comparative analysis of channel catfish (Ictalurus punctatus) with four model fish species

    PubMed Central

    2013-01-01

    Background: Comparative mapping is a powerful tool to study evolution of genomes. It allows transfer of genome information from the well-studied model species to non-model species. Catfish is an economically important aquaculture species in the United States. A large number of genome resources have been developed from catfish including genetic linkage maps, physical maps, BAC end sequences (BES), integrated linkage and physical maps using BES-derived markers, physical map contig-specific sequences, and draft genome sequences. Application of such genome resources should allow comparative analysis at the genome scale with several other model fish species. Results: In this study, we conducted whole genome comparative analysis between channel catfish and four model fish species with fully sequenced genomes, zebrafish, medaka, stickleback and Tetraodon. A total of 517 Mb draft genome sequences of catfish were anchored to its genetic linkage map, which accounted for 62% of the total draft genome sequences. Based on the location of homologous genes, homologous chromosomes were determined among catfish and the four model fish species. A large number of conserved syntenic blocks were identified. Analysis of the syntenic relationships between catfish and the four model fishes supported that the catfish genome is most similar to the genome of zebrafish. Conclusion: The organization of the catfish genome is similar to that of the four teleost species, zebrafish, medaka, stickleback, and Tetraodon such that homologous chromosomes can be identified. Within each chromosome, extended syntenic blocks were evident, but the conserved syntenies at the chromosome level involve extensive inter-chromosomal and intra-chromosomal rearrangements. This whole genome comparative map should facilitate the whole genome assembly and annotation in catfish, and will be useful for genomic studies of various other fish species. PMID:24215161

  16. Comparative Genomic Analysis of Meningitis- and Bacteremia-Causing Pneumococci Identifies a Common Core Genome

    PubMed Central

    Cornick, Jennifer E.; Chaguza, Chrispin; Yalcin, Feyruz; Harris, Simon R.; Gray, Katherine J.; Kiran, Anmol M.; Molyneux, Elizabeth; French, Neil; Faragher, Brian E.; Everett, Dean B.; Bentley, Stephen D.

    2015-01-01

    Streptococcus pneumoniae is a nasopharyngeal commensal that occasionally invades normally sterile sites to cause bloodstream infection and meningitis. Although the pneumococcal population structure and evolutionary genetics are well defined, it is not clear whether pneumococci that cause meningitis are genetically distinct from those that do not. Here, we used whole-genome sequencing of 140 isolates of S. pneumoniae recovered from bloodstream infection (n = 70) and meningitis (n = 70) to compare their genetic contents. By fitting a double-exponential decaying-function model, we show that these isolates share a core of 1,427 genes (95% confidence interval [CI], 1,425 to 1,435 genes) and that there is no difference in the core genome or accessory gene content from these disease manifestations. Gene presence/absence alone therefore does not explain the virulence behavior of pneumococci that reach the meninges. Our analysis, however, supports the requirement of a range of previously described virulence factors and vaccine candidates for both meningitis- and bacteremia-causing pneumococci. This high-resolution view suggests that, despite considerable competency for genetic exchange, all pneumococci are under considerable pressure to retain key components advantageous for colonization and transmission and that these components are essential for access to and survival in sterile sites. PMID:26259813
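
    The distinction between core and accessory gene content can be illustrated with plain set operations in Python. Note that this is a deliberately simpler technique than the double-exponential decay model the authors fitted: it just intersects gene presence/absence sets. The function name, isolate labels and gene names are hypothetical.

```python
def core_and_accessory(gene_sets):
    """Split a pan-genome into core and accessory genes.

    gene_sets: dict mapping isolate name -> iterable of gene identifiers.
    Core genes are present in every isolate; accessory genes are the rest.
    """
    sets = [set(genes) for genes in gene_sets.values()]
    if not sets:
        return set(), set()
    core = set.intersection(*sets)
    pan = set.union(*sets)
    return core, pan - core


# Toy presence/absence data (hypothetical isolates and genes).
isolates = {
    "blood_1": {"geneA", "geneB", "geneC"},
    "blood_2": {"geneA", "geneB", "geneD"},
    "csf_1": {"geneA", "geneB", "geneC", "geneE"},
}
core, accessory = core_and_accessory(isolates)
print(sorted(core), sorted(accessory))
```

    A strict intersection shrinks as isolates are added or when assemblies are incomplete, which is one reason a fitted model can give a more stable estimate of core genome size than this direct count.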

  17. Comparative Genomic Analysis of Meningitis- and Bacteremia-Causing Pneumococci Identifies a Common Core Genome.

    PubMed

    Kulohoma, Benard W; Cornick, Jennifer E; Chaguza, Chrispin; Yalcin, Feyruz; Harris, Simon R; Gray, Katherine J; Kiran, Anmol M; Molyneux, Elizabeth; French, Neil; Parkhill, Julian; Faragher, Brian E; Everett, Dean B; Bentley, Stephen D; Heyderman, Robert S

    2015-10-01

    Streptococcus pneumoniae is a nasopharyngeal commensal that occasionally invades normally sterile sites to cause bloodstream infection and meningitis. Although the pneumococcal population structure and evolutionary genetics are well defined, it is not clear whether pneumococci that cause meningitis are genetically distinct from those that do not. Here, we used whole-genome sequencing of 140 isolates of S. pneumoniae recovered from bloodstream infection (n = 70) and meningitis (n = 70) to compare their genetic contents. By fitting a double-exponential decaying-function model, we show that these isolates share a core of 1,427 genes (95% confidence interval [CI], 1,425 to 1,435 genes) and that there is no difference in the core genome or accessory gene content from these disease manifestations. Gene presence/absence alone therefore does not explain the virulence behavior of pneumococci that reach the meninges. Our analysis, however, supports the requirement of a range of previously described virulence factors and vaccine candidates for both meningitis- and bacteremia-causing pneumococci. This high-resolution view suggests that, despite considerable competency for genetic exchange, all pneumococci are under considerable pressure to retain key components advantageous for colonization and transmission and that these components are essential for access to and survival in sterile sites.

  18. IMG 4 version of the integrated microbial genomes comparative analysis system

    SciTech Connect

    Markowitz, Victor M.; Chen, I-Min A.; Palaniappan, Krishna; Chu, Ken; Szeto, Ernest; Pillay, Manoj; Ratner, Anna; Huang, Jinghua; Woyke, Tanja; Huntemann, Marcel; Anderson, Iain; Billis, Konstantinos; Varghese, Neha; Mavromatis, Konstantinos; Pati, Amrita; Ivanova, Natalia N.; Kyrpides, Nikos C.

    2013-10-27

    The Integrated Microbial Genomes (IMG) data warehouse integrates genomes from all three domains of life, as well as plasmids, viruses and genome fragments. IMG provides tools for analyzing and reviewing the structural and functional annotations of genomes in a comparative context. IMG’s data content and analytical capabilities have increased continuously since its first version released in 2005. Since the last report published in the 2012 NAR Database Issue, IMG’s annotation and data integration pipelines have evolved while new tools have been added for recording and analyzing single cell genomes, RNA Seq and biosynthetic cluster data. Finally, different IMG datamarts provide support for the analysis of publicly available genomes (IMG/W: http://img.jgi.doe.gov/w), expert review of genome annotations (IMG/ER: http://img.jgi.doe.gov/er) and teaching and training in the area of microbial genome analysis (IMG/EDU: http://img.jgi.doe.gov/edu).

  19. IMG 4 version of the integrated microbial genomes comparative analysis system

    PubMed Central

    Markowitz, Victor M.; Chen, I-Min A.; Palaniappan, Krishna; Chu, Ken; Szeto, Ernest; Pillay, Manoj; Ratner, Anna; Huang, Jinghua; Woyke, Tanja; Huntemann, Marcel; Anderson, Iain; Billis, Konstantinos; Varghese, Neha; Mavromatis, Konstantinos; Pati, Amrita; Ivanova, Natalia N.; Kyrpides, Nikos C.

    2014-01-01

    The Integrated Microbial Genomes (IMG) data warehouse integrates genomes from all three domains of life, as well as plasmids, viruses and genome fragments. IMG provides tools for analyzing and reviewing the structural and functional annotations of genomes in a comparative context. IMG’s data content and analytical capabilities have increased continuously since its first version released in 2005. Since the last report published in the 2012 NAR Database Issue, IMG’s annotation and data integration pipelines have evolved while new tools have been added for recording and analyzing single cell genomes, RNA Seq and biosynthetic cluster data. Different IMG datamarts provide support for the analysis of publicly available genomes (IMG/W: http://img.jgi.doe.gov/w), expert review of genome annotations (IMG/ER: http://img.jgi.doe.gov/er) and teaching and training in the area of microbial genome analysis (IMG/EDU: http://img.jgi.doe.gov/edu). PMID:24165883

  20. IMG 4 version of the integrated microbial genomes comparative analysis system.

    PubMed

    Markowitz, Victor M; Chen, I-Min A; Palaniappan, Krishna; Chu, Ken; Szeto, Ernest; Pillay, Manoj; Ratner, Anna; Huang, Jinghua; Woyke, Tanja; Huntemann, Marcel; Anderson, Iain; Billis, Konstantinos; Varghese, Neha; Mavromatis, Konstantinos; Pati, Amrita; Ivanova, Natalia N; Kyrpides, Nikos C

    2014-01-01

    The Integrated Microbial Genomes (IMG) data warehouse integrates genomes from all three domains of life, as well as plasmids, viruses and genome fragments. IMG provides tools for analyzing and reviewing the structural and functional annotations of genomes in a comparative context. IMG's data content and analytical capabilities have increased continuously since its first version released in 2005. Since the last report published in the 2012 NAR Database Issue, IMG's annotation and data integration pipelines have evolved while new tools have been added for recording and analyzing single cell genomes, RNA Seq and biosynthetic cluster data. Different IMG datamarts provide support for the analysis of publicly available genomes (IMG/W: http://img.jgi.doe.gov/w), expert review of genome annotations (IMG/ER: http://img.jgi.doe.gov/er) and teaching and training in the area of microbial genome analysis (IMG/EDU: http://img.jgi.doe.gov/edu).

  1. Sequencing and Comparative Genome Analysis of Two Pathogenic Streptococcus gallolyticus Subspecies: Genome Plasticity, Adaptation and Virulence

    PubMed Central

    Teng, Yu-Ting; Wu, Hui-Lun; Liu, Yen-Ming; Wu, Keh-Ming; Chang, Chuan-Hsiung; Hsu, Ming-Ta

    2011-01-01

    Streptococcus gallolyticus infections in humans are often associated with bacteremia, infective endocarditis and colon cancers. The disease manifestations are different depending on the subspecies of S. gallolyticus causing the infection. Here, we present the complete genomes of S. gallolyticus ATCC 43143 (biotype I) and S. pasteurianus ATCC 43144 (biotype II.2). The genomic differences between the two biotypes were characterized with comparative genomic analyses. The chromosomes of ATCC 43143 and ATCC 43144 are 2.36 and 2.10 Mb in length and encode 2246 and 1869 CDS, respectively. The organization and genomic contents of both genomes were most similar to the recently published S. gallolyticus UCN34, where 2073 (92%) and 1607 (86%) of the ATCC 43143 and ATCC 43144 CDS, respectively, were conserved in UCN34. There are around 600 CDS conserved in all Streptococcus genomes, indicating that the Streptococcus genus has a small core genome (constituting around 30% of total CDS) and substantial evolutionary plasticity. We identified eight and five regions of genome plasticity in ATCC 43143 and ATCC 43144, respectively. Within these regions, several proteins were recognized to contribute to the fitness and virulence of each of the two subspecies. We have also predicted putative cell-surface associated proteins that could play a role in adherence to host tissues, leading to persistent infections causing sub-acute and chronic diseases in humans. This study showed evidence that S. gallolyticus still possesses genes making it suited to a rumen environment, whereas the ability of S. pasteurianus to live in the rumen is reduced. The genome heterogeneity and genetic diversity between the two biotypes, especially in membrane proteins and lipoproteins, most likely contribute to the differences in the pathogenesis of the two S. gallolyticus biotypes and the type of disease an infected patient eventually develops. PMID:21633709
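The percentages quoted in this abstract can be checked with simple arithmetic. The sketch below uses only the CDS counts given in the text; the helper name `percent` is introduced here for illustration.

```python
def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# CDS counts quoted in the abstract.
total_cds = {"ATCC 43143": 2246, "ATCC 43144": 1869}
conserved_in_ucn34 = {"ATCC 43143": 2073, "ATCC 43144": 1607}
genus_core = 600  # "around 600 CDS" conserved in all Streptococcus genomes

for strain, total in total_cds.items():
    shared = percent(conserved_in_ucn34[strain], total)
    core = percent(genus_core, total)
    print(f"{strain}: {shared:.0f}% conserved in UCN34, core is {core:.0f}% of CDS")
```

The conserved fractions round to 92% and 86%, matching the abstract, and a 600-CDS core is roughly 27% and 32% of the two CDS sets, consistent with "around 30%".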

  2. Sequencing and comparative genome analysis of two pathogenic Streptococcus gallolyticus subspecies: genome plasticity, adaptation and virulence.

    PubMed

    Lin, I-Hsuan; Liu, Tze-Tze; Teng, Yu-Ting; Wu, Hui-Lun; Liu, Yen-Ming; Wu, Keh-Ming; Chang, Chuan-Hsiung; Hsu, Ming-Ta

    2011-01-01

    Streptococcus gallolyticus infections in humans are often associated with bacteremia, infective endocarditis and colon cancers. The disease manifestations are different depending on the subspecies of S. gallolyticus causing the infection. Here, we present the complete genomes of S. gallolyticus ATCC 43143 (biotype I) and S. pasteurianus ATCC 43144 (biotype II.2). The genomic differences between the two biotypes were characterized with comparative genomic analyses. The chromosomes of ATCC 43143 and ATCC 43144 are 2.36 and 2.10 Mb in length and encode 2246 and 1869 CDS, respectively. The organization and genomic contents of both genomes were most similar to the recently published S. gallolyticus UCN34, where 2073 (92%) and 1607 (86%) of the ATCC 43143 and ATCC 43144 CDS, respectively, were conserved in UCN34. There are around 600 CDS conserved in all Streptococcus genomes, indicating that the Streptococcus genus has a small core genome (constituting around 30% of total CDS) and substantial evolutionary plasticity. We identified eight and five regions of genome plasticity in ATCC 43143 and ATCC 43144, respectively. Within these regions, several proteins were recognized to contribute to the fitness and virulence of each of the two subspecies. We have also predicted putative cell-surface associated proteins that could play a role in adherence to host tissues, leading to persistent infections causing sub-acute and chronic diseases in humans. This study showed evidence that S. gallolyticus still possesses genes making it suited to a rumen environment, whereas the ability of S. pasteurianus to live in the rumen is reduced. The genome heterogeneity and genetic diversity between the two biotypes, especially in membrane proteins and lipoproteins, most likely contribute to the differences in the pathogenesis of the two S. gallolyticus biotypes and the type of disease an infected patient eventually develops.

  3. Comparative genomic analysis of novel Acinetobacter symbionts: A combined systems biology and genomics approach

    PubMed Central

    Gupta, Vipin; Haider, Shazia; Sood, Utkarsh; Gilbert, Jack A.; Ramjee, Meenakshi; Forbes, Ken; Singh, Yogendra; Lopes, Bruno S.; Lal, Rup

    2016-01-01

    The increasing trend of antibiotic resistance in Acinetobacter drastically limits the range of therapeutic agents available to treat multidrug resistant (MDR) infections. This study focused on analysis of novel Acinetobacter strains using a genomics and systems biology approach. Here we used a network theory method for pathogenic and non-pathogenic Acinetobacter spp. to identify the key regulatory proteins (hubs) in each strain. We identified nine key regulatory proteins, guaA, guaB, rpsB, rpsI, rpsL, rpsE, rpsC, rplM and trmD, which have functional roles as hubs in a hierarchical scale-free fractal protein-protein interaction network. Two key hubs (guaA and guaB) were important for insect-associated strains, and comparative analysis identified guaA as more important than guaB due to its role in effective module regulation. rpsI played a significant role in all the novel strains, while rplM was unique to sheep-associated strains. rpsM, rpsB and rpsI were involved in the regulation of overall network topology across all Acinetobacter strains analyzed in this study. Future analysis will investigate whether these hubs are useful as drug targets for treating Acinetobacter infections. PMID:27378055
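The abstract does not say how hubs were scored. As a minimal illustration, assuming the simplest criterion (node degree in an undirected interaction network), hub candidates could be ranked as follows. The function name, the toy edges and the placeholder node names are all hypothetical and do not represent real interactions.

```python
from collections import defaultdict


def degree_hubs(edges, top_n=2):
    """Rank nodes of an undirected network by degree and return the top_n."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-interactions
            neighbors[a].add(b)
            neighbors[b].add(a)
    # Sort by descending degree, then by name for a stable result.
    ranked = sorted(neighbors, key=lambda node: (-len(neighbors[node]), node))
    return [(node, len(neighbors[node])) for node in ranked[:top_n]]


# Hypothetical toy network; node names are placeholders.
toy_edges = [("p1", "p2"), ("p1", "p3"), ("p1", "p4"), ("p2", "p3"), ("p4", "p5")]
hubs = degree_hubs(toy_edges)
```

Degree is only one possible centrality measure; a study of hierarchical or modular networks may well weight betweenness or module membership instead.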

  4. Array comparative genomic hybridization analysis of olfactory neuroblastoma.

    PubMed

    Guled, Mohamed; Myllykangas, Samuel; Frierson, Henry F; Mills, Stacey E; Knuutila, Sakari; Stelow, Edward B

    2008-06-01

    Olfactory neuroblastoma is an unusual neuroectodermal malignancy, which is thought to arise at the olfactory membrane of the sinonasal tract. Due to its rarity, little is understood regarding its molecular and cytogenetic abnormalities. The aim of the current study is to identify specific DNA copy number changes in olfactory neuroblastoma. Thirteen dissected tissue samples were analyzed using array comparative genomic hybridization. Our results show that gene copy number profiles of olfactory neuroblastoma samples are complex. The most frequent changes included gains at 7q11.22-q21.11, 9p13.3, 13q, 20p/q, and Xp/q, and losses at 2q31.1, 2q33.3, 2q37.1, 6q16.3, 6q21.33, 6q22.1, 22q11.23, 22q12.1, and Xp/q. Gains were more frequent than losses, and high-stage tumors showed more alterations than low-stage olfactory neuroblastoma. Frequent changes in high-stage tumors were gains at 13q14.2-q14.3, 13q31.1, and 20q11.21-q11.23, and loss of Xp21.1 (in 66% of cases). Gains at 5q35, 13q, and 20q, and losses at 2q31.1, 2q33.3, and 6q16-q22, were present in 50% of cases. The identified regions of gene copy number change have been implicated in a variety of tumors, especially carcinomas. In addition, our results indicate that gains in 20q and 13q may be important in the progression of this cancer, and that these regions possibly harbor genes with functional relevance in olfactory neuroblastoma.

  5. Ebolavirus comparative genomics

    DOE PAGES

    Jun, Se-Ran; Leuze, Michael R.; Nookaew, Intawat; ...

    2015-07-14

    The 2014 Ebola outbreak in West Africa is the largest documented for this virus. We examine the dynamics of this genome, comparing more than one hundred currently available ebolavirus genomes to each other and to other viral genomes. Based on oligomer frequency analysis, the family Filoviridae forms a distinct group from all other sequenced viral genomes. All filovirus genomes sequenced to date encode proteins with similar functions and gene order, although there is considerable divergence in sequences between the three genera Ebolavirus, Cuevavirus, and Marburgvirus within the family Filoviridae. Whereas all ebolavirus genomes are quite similar (multiple sequences of the same strain are often identical), variation is most common in the intergenic regions and within specific areas of the genes encoding the glycoprotein (GP), nucleoprotein (NP), and polymerase (L). We predict regions that could contain epitope-binding sites, which might be good vaccine targets. In conclusion, this information, combined with glycosylation sites and experimentally determined epitopes, can identify the most promising regions for the development of therapeutic strategies.
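Oligomer frequency analysis compares genomes by how often each short word (k-mer) occurs. The sketch below shows one simple way to build and compare such profiles; the choice of Euclidean distance and both function names are assumptions made for illustration, not details taken from the study.

```python
from collections import Counter


def kmer_frequencies(sequence, k):
    """Relative frequency of each overlapping k-mer in a nucleotide sequence."""
    seq = sequence.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


def profile_distance(p, q):
    """Euclidean distance between two k-mer frequency profiles."""
    keys = set(p) | set(q)
    return sum((p.get(x, 0.0) - q.get(x, 0.0)) ** 2 for x in keys) ** 0.5
```

Genomes with small pairwise distances would cluster together, which is the sense in which a family can form a distinct group under this kind of analysis.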

  6. Comparative Analysis of Alu Repeats in Primate Genomes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Background: Alu repeats are SINEs (short interspersed repetitive elements) that have been applied successfully in studies of genome evolution, population biology, phylogenetics and forensics. Human Alu consensus sequences were widely used as surrogates in nonhuman primate studies with an assumption that all p...

  7. Comparative genome analysis of Bacillus cereus group genomes with Bacillus subtilis

    SciTech Connect

    Anderson, Iain; Sorokin, Alexei; Kapatral, Vinayak; Reznik, Gary; Bhattacharya, Anamitra; Mikhailova, Natalia; Burd, Henry; Joukov, Victor; Kaznadzey, Denis; Walunas, Theresa; D'Souza, Mark; Larsen, Niels; Pusch, Gordon; Liolios, Konstantinos; Grechkin, Yuri; Lapidus, Alla; Goltsman, Eugene; Chu, Lien; Fonstein, Michael; Ehrlich, S. Dusko; Overbeek, Ross; Kyrpides, Nikos; Ivanova, Natalia

    2005-09-14

    Genome features of the Bacillus cereus group genomes (representative strains of Bacillus cereus, Bacillus anthracis and Bacillus thuringiensis subsp. israelensis) were analyzed and compared with the Bacillus subtilis genome. A core set of 1,381 protein families among the four Bacillus genomes, with an additional set of 933 families common to the B. cereus group, was identified. Differences in signal transduction pathways, membrane transporters, cell surface structures, cell wall, and S-layer proteins, suggesting differences in their phenotypes, were identified. The B. cereus group has signal transduction systems including a tyrosine kinase related to two-component system histidine kinases from B. subtilis. A model for regulation of the stress-responsive sigma factor sigmaB in the B. cereus group, different from the well-studied regulation in B. subtilis, has been proposed. Despite a high degree of chromosomal synteny among these genomes, significant differences in cell wall and spore coat proteins that contribute to survival and adaptation in specific hosts have been identified.

  8. Genome sequence and comparative analysis of Avibacterium paragallinarum

    PubMed Central

    Requena, David; Chumbe, Ana; Torres, Michael; Alzamora, Ofelia; Ramirez, Manuel; Valdivia-Olarte, Hugo; Gutierrez, Andres Hazaet; Izquierdo-Lara, Ray; Saravia, Luis Enrique; Zavaleta, Milagros; Tataje-Lavanda, Luis; Best, Ivan; Fernández-Sánchez, Manolo; Icochea, Eliana; Zimic, Mirko; Fernández-Díaz, Manolo

    2013-01-01

    Background: Avibacterium paragallinarum is the causative agent of infectious coryza, a highly contagious acute respiratory disease of poultry that affects commercial chickens, laying hens and broilers worldwide. Methodology: In this study, we performed whole genome sequencing, assembly and annotation of a Peruvian isolate of A. paragallinarum. The genome was sequenced on a 454 GS FLX Titanium system. De novo assembly was performed and annotation was completed with GS De Novo Assembler 2.6 using the H. influenzae str. F3031 gene model. Manual curation of the genome was performed with Artemis. Putative functions of genes were predicted with Blast2GO. Virulence factors were identified by comparison with the Virulence Factor Database. Results: The genome obtained has a length of 2.47 Mb with a GC content of 40.66%. Seventy-five large contigs (>500 nt) were obtained, which comprised 1,204 predicted genes. All the contigs are available in GenBank [GenBank: PRJNA64665]. A total of 103 virulence factors reported in the Virulence Factor Database were found in A. paragallinarum. Forty-four of them are present in 7 species of Haemophilus and are related to pathogenesis, virulence and host immune system evasion. A tetracycline-resistance associated transposon (Tn10) was found in A. paragallinarum, possibly acting as a defense mechanism. Discussion and conclusion: The availability of the A. paragallinarum genome represents an important source of information for the development of diagnostic tests, genotyping, and novel antigens for potential vaccines against infectious coryza. Identification of virulence factors contributes to a better understanding of the pathogenesis, and to planning efforts for prevention and control of the disease. PMID:23861570
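GC content, reported above as a summary statistic of the assembly, is straightforward to compute from contig sequences. This is a minimal sketch under the assumption that ambiguous bases such as N are ignored; the function names are illustrative and are not part of any tool named in the abstract.

```python
def gc_content(sequence):
    """Percentage of G and C among unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * gc / (gc + at)


def assembly_gc(contigs):
    """GC percentage over a set of contigs, weighted by contig length."""
    return gc_content("".join(contigs))
```

Concatenating the contigs before counting weights each contig by its length, which is usually what a whole-assembly GC figure means.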

  9. Whole genomic DNA sequencing and comparative genomic analysis of Arthrospira platensis: high genome plasticity and genetic diversity

    PubMed Central

    Xu, Teng; Qin, Song; Hu, Yongwu; Song, Zhijian; Ying, Jianchao; Li, Peizhen; Dong, Wei; Zhao, Fangqing; Yang, Huanming; Bao, Qiyu

    2016-01-01

    Arthrospira platensis is a multi-cellular and filamentous non-N2-fixing cyanobacterium that is capable of performing oxygenic photosynthesis. In this study, we determined the nearly complete genome sequence of A. platensis YZ. The A. platensis YZ genome is a single, circular chromosome of 6.62 Mb in size. Phylogenetic and comparative genomic analyses revealed that A. platensis YZ was more closely related to A. platensis NIES-39 than to Arthrospira sp. PCC 8005 and A. platensis C1. Broad gene gains were identified between A. platensis YZ and three other Arthrospira species; some of the gained genes, such as those coding for restriction-modification systems, have previously been shown to be laterally transferable among different species. Moreover, unprecedented extensive chromosomal rearrangements among different strains were observed. The chromosomal rearrangements, particularly the chromosomal inversions, were analysed and estimated to be closely related to palindromes that involved long inverted repeat sequences and the extensively distributed type IIR restriction enzyme in the Arthrospira genome. In addition, species from the genus Arthrospira unanimously contained the highest rate of repetitive sequence compared with the other species of the order Oscillatoriales, suggesting that sequence duplication significantly contributed to Arthrospira genome phylogeny. These results provide in-depth views into the genomic phylogeny and structural variation of A. platensis, as well as a valuable resource for functional genomics studies. PMID:27330141

  10. c-GAMMA: Comparative Genome Analysis of Molecular Markers

    NASA Astrophysics Data System (ADS)

    Peterlongo, Pierre; Nicolas, Jacques; Lavenier, Dominique; Vorc'h, Raoul; Querellou, Joël

    Discovery of molecular markers for efficient identification of living organisms remains a challenge of high interest. The diversity of species can now be observed in detail with low-cost genomic sequences produced by the new generation of sequencers. A method, called c-GAMMA, is proposed. It formalizes the design of new markers for such data. It is based on a series of filters on forbidden pairs of words, followed by an optimization step on the discriminative power of candidate markers.

  11. The Integrated Microbial Genomes (IMG) System: An Expanding Comparative Analysis Resource

    SciTech Connect

    Markowitz, Victor M.; Chen, I-Min A.; Palaniappan, Krishna; Chu, Ken; Szeto, Ernest; Grechkin, Yuri; Ratner, Anna; Anderson, Iain; Lykidis, Athanasios; Mavromatis, Konstantinos; Ivanova, Natalia N.; Kyrpides, Nikos C.

    2009-09-13

    The integrated microbial genomes (IMG) system serves as a community resource for comparative analysis of publicly available genomes in a comprehensive integrated context. IMG contains both draft and complete microbial genomes integrated with other publicly available genomes from all three domains of life, together with a large number of plasmids and viruses. IMG provides tools and viewers for analyzing and reviewing the annotations of genes and genomes in a comparative context. Since its first release in 2005, IMG's data content and analytical capabilities have been constantly expanded through regular releases. Several companion IMG systems have been set up in order to serve domain specific needs, such as expert review of genome annotations. IMG is available at .

  12. Comparative Genomic Analysis Reveals Ecological Differentiation in the Genus Carnobacterium

    PubMed Central

    Iskandar, Christelle F.; Borges, Frédéric; Taminiau, Bernard; Daube, Georges; Zagorec, Monique; Remenant, Benoît; Leisner, Jørgen J.; Hansen, Martin A.; Sørensen, Søren J.; Mangavel, Cécile; Cailliez-Grimal, Catherine; Revol-Junelles, Anne-Marie

    2017-01-01

    Lactic acid bacteria (LAB) differ in their ability to colonize food and animal-associated habitats: while some species are specialized and colonize a limited number of habitats, others are generalists and are able to colonize multiple animal-linked habitats. In the current study, Carnobacterium was used as a model genus to elucidate the genetic basis of these colonization differences. Analyses of 16S rRNA gene meta-barcoding data showed that C. maltaromaticum followed by C. divergens are the most prevalent species in foods derived from animals (meat, fish, dairy products), and in the gut. According to phylogenetic analyses, these two animal-adapted species belong to one of two deeply branched lineages. The second lineage contains species isolated from habitats where contact with animals is rare. Genome analyses revealed that members of the animal-adapted lineage harbor a larger secretome than members of the other lineage. The predicted cell-surface proteome is highly diversified in C. maltaromaticum and C. divergens with genes involved in adaptation to the animal milieu such as those encoding biopolymer hydrolytic enzymes, a heme uptake system, and biopolymer-binding adhesins. These species also exhibit genes for gut adaptation and respiration. In contrast, Carnobacterium species belonging to the second lineage encode a poorly diversified cell-surface proteome, lack genes for gut adaptation and are unable to respire. These results shed light on the important genomic traits required for adaptation to animal-linked habitats in generalist Carnobacterium. PMID:28337181

  13. Comparative Genomic Analysis of Human Fungal Pathogens Causing Paracoccidioidomycosis

    PubMed Central

    Desjardins, Christopher A.; Champion, Mia D.; Holder, Jason W.; Muszewska, Anna; Goldberg, Jonathan; Bailão, Alexandre M.; Brigido, Marcelo Macedo; Ferreira, Márcia Eliana da Silva; Garcia, Ana Maria; Grynberg, Marcin; Gujja, Sharvari; Heiman, David I.; Henn, Matthew R.; Kodira, Chinnappa D.; León-Narváez, Henry; Longo, Larissa V. G.; Ma, Li-Jun; Malavazi, Iran; Matsuo, Alisson L.; Morais, Flavia V.; Pereira, Maristela; Rodríguez-Brito, Sabrina; Sakthikumar, Sharadha; Salem-Izacc, Silvia M.; Sykes, Sean M.; Teixeira, Marcus Melo; Vallejo, Milene C.; Walter, Maria Emília Machado Telles; Yandava, Chandri; Young, Sarah; Zeng, Qiandong; Zucker, Jeremy; Felipe, Maria Sueli; Goldman, Gustavo H.; Haas, Brian J.; McEwen, Juan G.; Nino-Vega, Gustavo; Puccia, Rosana; San-Blas, Gioconda; Soares, Celia Maria de Almeida; Birren, Bruce W.; Cuomo, Christina A.

    2011-01-01

    Paracoccidioides is a fungal pathogen and the cause of paracoccidioidomycosis, a health-threatening human systemic mycosis endemic to Latin America. Infection by Paracoccidioides, a dimorphic fungus in the order Onygenales, is coupled with a thermally regulated transition from a soil-dwelling filamentous form to a yeast-like pathogenic form. To better understand the genetic basis of growth and pathogenicity in Paracoccidioides, we sequenced the genomes of two strains of Paracoccidioides brasiliensis (Pb03 and Pb18) and one strain of Paracoccidioides lutzii (Pb01). These genomes range in size from 29.1 Mb to 32.9 Mb and encode 7,610 to 8,130 genes. To enable genetic studies, we mapped 94% of the P. brasiliensis Pb18 assembly onto five chromosomes. We characterized gene family content across Onygenales and related fungi, and within Paracoccidioides we found expansions of the fungal-specific kinase family FunK1. Additionally, the Onygenales have lost many genes involved in carbohydrate metabolism and fewer genes involved in protein metabolism, resulting in a higher ratio of proteases to carbohydrate active enzymes in the Onygenales than their relatives. To determine if gene content correlated with growth on different substrates, we screened the non-pathogenic onygenale Uncinocarpus reesii, which has orthologs for 91% of Paracoccidioides metabolic genes, for growth on 190 carbon sources. U. reesii showed growth on a limited range of carbohydrates, primarily basic plant sugars and cell wall components; this suggests that Onygenales, including dimorphic fungi, can degrade cellulosic plant material in the soil. In addition, U. reesii grew on gelatin and a wide range of dipeptides and amino acids, indicating a preference for proteinaceous growth substrates over carbohydrates, which may enable these fungi to also degrade animal biomass. These capabilities for degrading plant and animal substrates suggest a duality in lifestyle that could enable pathogenic species of

  14. Comparative analysis of microsatellites in chloroplast genomes of lower and higher plants.

    PubMed

    George, Biju; Bhatt, Bhavin S; Awasthi, Mayur; George, Binu; Singh, Achuit K

    2015-11-01

    Microsatellites, or simple sequence repeats (SSRs), are repetitive DNA sequences in which a motif of one to six base pairs is repeated in tandem a number of times. Chloroplast genome sequences have been shown to possess extensive variations in the length, number and distribution of SSRs. However, a comparative analysis of chloroplast microsatellites is not available. Considering their potential importance in generating genomic diversity, we have systematically analysed the abundance and distribution of simple and compound microsatellites in 164 sequenced chloroplast genomes from a wide range of plants. The key findings of these studies are (1) a large number of mononucleotide repeats as compared to SSR(2-6) (di-, tri-, tetra-, penta- and hexanucleotide repeats) are present in all chloroplast genomes investigated, (2) lower plants such as algae show wide variation in relative abundance, density and distribution of microsatellite repeats as compared to flowering plants, (3) longer SSRs are excluded from coding regions of most chloroplast genomes, (4) GC content has a weak influence on number, relative abundance and relative density of mononucleotide repeats as well as SSR(2-6). However, GC content showed a strong negative correlation with relative density (R² = 0.5, P < 0.05) and relative abundance (R² = 0.6, P < 0.05) of cSSRs. In summary, our comparative studies of chloroplast genomes illustrate the variable distribution of microsatellites and reveal that chloroplast genomes of smaller plants possess relatively more genomic diversity compared to higher plants.
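The definition of an SSR given above (a motif of one to six base pairs repeated in tandem) maps naturally onto a regular-expression search. The following is a simplified sketch: the minimum repeat thresholds are hypothetical, the helper names are introduced here, and the study's own software and cut-offs are not stated in the abstract.

```python
import re

# Hypothetical minimum repeat counts per motif length (1-6 bp); tune as needed.
DEFAULT_MIN_REPEATS = {1: 8, 2: 4, 3: 3, 4: 3, 5: 3, 6: 3}


def is_redundant(motif):
    """True if the motif is itself a repeat of a shorter unit (e.g. 'ATAT')."""
    n = len(motif)
    return any(n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n))


def find_ssrs(sequence, min_repeats=DEFAULT_MIN_REPEATS):
    """Return (start, motif, repeat_count) for simple perfect tandem repeats."""
    seq = sequence.upper()
    hits = []
    for unit_len, min_rep in min_repeats.items():
        # One captured motif followed by at least (min_rep - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if not is_redundant(motif):
                hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)
```

This sketch reports only perfect simple repeats. Compound microsatellites (cSSRs) would need a further step that merges hits separated by a short gap, and a repeat that begins inside a homopolymer run can be missed because regex matches do not overlap.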

  15. Comparative Genome Analysis Provides Insights into the Pathogenicity of Flavobacterium psychrophilum

    PubMed Central

    Castillo, Daniel; Christiansen, Rói Hammershaimb; Dalsgaard, Inger; Madsen, Lone; Espejo, Romilio

    2016-01-01

    Flavobacterium psychrophilum is a fish pathogen in salmonid aquaculture worldwide that causes cold water disease (CWD) and rainbow trout fry syndrome (RTFS). Comparative genome analyses of 11 F. psychrophilum isolates representing temporally and geographically distant populations were used to describe the F. psychrophilum pan-genome and to examine virulence factors, prophages, CRISPR arrays, and genomic islands present in the genomes. Analysis of the genomic DNA sequences was complemented with selected phenotypic characteristics of the strains. The pan-genome analysis showed that F. psychrophilum could hold at least 3373 genes, while the core genome contained 1743 genes. On average, 67 new genes were detected for every new genome added to the analysis, indicating that F. psychrophilum possesses an open pan-genome. The putative virulence factors were equally distributed among isolates, independent of geographic location, year of isolation and source of isolates. Only one prophage-related sequence was found, which corresponded to the previously described prophage 6H and appeared in 5 out of 11 isolates. CRISPR array analysis revealed two different loci with dissimilar spacer content, which only matched one sequence in the database, the temperate bacteriophage 6H. Genomic islands (GIs) were identified in F. psychrophilum isolates 950106-1/1 and CSF 259–93, associated with toxins and antibiotic resistance. Finally, phenotypic characterization revealed a high degree of similarity among the strains with respect to biofilm formation and secretion of extracellular enzymes. Global scale dispersion of virulence factors in the genomes and the abilities for biofilm formation, hemolytic activity and secretion of extracellular enzymes among the strains suggested that F. psychrophilum isolates have a similar mode of action on adhesion, colonization and destruction of fish tissues across large spatial and temporal scales of occurrence. Overall, the genomic characterization and
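The pan-genome, core genome and "new genes per added genome" figures in this abstract are all set operations over per-isolate gene inventories. Here is a minimal sketch, assuming each genome has already been reduced to a set of gene-family identifiers; the function name and toy data are hypothetical.

```python
def pan_core_profile(genomes):
    """For each genome added in order, return (pan size, core size, new genes)."""
    pan, core = set(), None
    profile = []
    for genes in genomes:
        genes = set(genes)
        new_genes = len(genes - pan)  # genes not seen in any earlier genome
        pan |= genes
        core = genes if core is None else core & genes
        profile.append((len(pan), len(core), new_genes))
    return profile


# Toy inventories (hypothetical gene-family identifiers).
toy = [{"g1", "g2", "g3"}, {"g1", "g2", "g4"}, {"g1", "g3", "g5", "g6"}]
profile = pan_core_profile(toy)
```

A pan-genome is usually described as open when the new-gene count stays well above zero as more genomes are added. The counts depend on the order in which genomes are added, so reported averages are normally taken over many orderings.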

  16. Investigating hookworm genomes by comparative analysis of two Ancylostoma species

    PubMed Central

    Mitreva, Makedonka; McCarter, James P; Arasu, Prema; Hawdon, John; Martin, John; Dante, Mike; Wylie, Todd; Xu, Jian; Stajich, Jason E; Kapulkin, Wadim; Clifton, Sandra W; Waterston, Robert H; Wilson, Richard K

    2005-01-01

    Background: Hookworms, infecting over one billion people, are the most closely related major human parasites to the model nematode Caenorhabditis elegans. Applying genomics techniques to these species, we analyzed 3,840 and 3,149 genes from Ancylostoma caninum and A. ceylanicum. Results: Transcripts originated from libraries representing infective L3 larva, stimulated L3, arrested L3, and adults. Most genes are represented in single stages including abundant transcripts like hsp-20 in infective L3 and vit-3 in adults. Over 80% of the genes have homologs in C. elegans, and nearly 30% of these had observable RNA interference phenotypes. Homologies were identified to nematode-specific and clade V specific gene families. To study the evolution of hookworm genes, 574 A. caninum / A. ceylanicum orthologs were identified, all of which were found to be under purifying selection with distribution ratios of nonsynonymous to synonymous amino acid substitutions similar to that reported for C. elegans / C. briggsae orthologs. The phylogenetic distance between A. caninum and A. ceylanicum is almost identical to that for C. elegans / C. briggsae. Conclusion: The genes discovered should substantially accelerate research toward better understanding of the parasites' basic biology as well as new therapies including vaccines and novel anthelmintics. PMID:15854223

  17. Comparative genomic analysis of Acinetobacter oleivorans DR1 to determine strain-specific genomic regions and gentisate biodegradation.

    PubMed

    Jung, Jaejoon; Madsen, Eugene L; Jeon, Che Ok; Park, Woojun

    2011-10-01

    The comparative genomics of Acinetobacter oleivorans DR1 assayed with A. baylyi ADP1, A. calcoaceticus PHEA-2, and A. baumannii ATCC 17978 revealed that the incorporation of phage-related genomic regions and the absence of transposable elements have contributed to the large size (4.15 Mb) of the DR1 genome. A horizontally transferred genomic region and a higher proportion of transcriptional regulator- and signal peptide-coding genes were identified as characteristics of the DR1 genome. Incomplete glucose metabolism, metabolic pathways of aromatic compounds, biofilm formation, antibiotics and metal resistance, and natural competence genes were conserved in four compared genomes. Interestingly, only strain DR1 possesses gentisate 1,2-dioxygenase (nagI) and grows on gentisate, whereas other species cannot. Expression of the nagI gene was upregulated during gentisate utilization, and four downstream open reading frames (ORFs) were cotranscribed, supporting the notion that gentisate metabolism is a unique characteristic of strain DR1. The genomic analysis of strain DR1 provides additional insights into the function, ecology, and evolution of Acinetobacter species.

  18. Genome Sequence of Cronobacter sakazakii BAA-894 and Comparative Genomic Hybridization Analysis with Other Cronobacter Species

    PubMed Central

    Kucerova, Eva; Clifton, Sandra W.; Xia, Xiao-Qin; Long, Fred; Porwollik, Steffen; Fulton, Lucinda; Fronick, Catrina; Minx, Patrick; Kyung, Kim; Warren, Wesley; Fulton, Robert; Feng, Dongyan; Wollam, Aye; Shah, Neha; Bhonagiri, Veena; Nash, William E.; Hallsworth-Pepin, Kymberlie; Wilson, Richard K.

    2010-01-01

    Background The genus Cronobacter (formerly called Enterobacter sakazakii) is composed of five species; C. sakazakii, C. malonaticus, C. turicensis, C. muytjensii, and C. dublinensis. The genus includes opportunistic human pathogens, and the first three species have been associated with neonatal infections. The most severe diseases are caused in neonates and include fatal necrotizing enterocolitis and meningitis. The genetic basis of the diversity within the genus is unknown, and few virulence traits have been identified. Methodology/Principal Findings We report here the first sequence of a member of this genus, C. sakazakii strain BAA-894. The genome of Cronobacter sakazakii strain BAA-894 comprises a 4.4 Mb chromosome (57% GC content) and two plasmids; 31 kb (51% GC) and 131 kb (56% GC). The genome was used to construct a 387,000 probe oligonucleotide tiling DNA microarray covering the whole genome. Comparative genomic hybridization (CGH) was undertaken on five other C. sakazakii strains, and representatives of the four other Cronobacter species. Among 4,382 annotated genes inspected in this study, about 55% of genes were common to all C. sakazakii strains and 43% were common to all Cronobacter strains, with 10–17% absence of genes. Conclusions/Significance CGH highlighted 15 clusters of genes in C. sakazakii BAA-894 that were divergent or absent in more than half of the tested strains; six of these are of probable prophage origin. Putative virulence factors were identified in these prophage and in other variable regions. A number of genes unique to Cronobacter species associated with neonatal infections (C. sakazakii, C. malonaticus and C. turicensis) were identified. These included a copper and silver resistance system known to be linked to invasion of the blood-brain barrier by neonatal meningitic strains of Escherichia coli. In addition, genes encoding for multidrug efflux pumps and adhesins were identified that were unique to C. sakazakii strains from

  19. Ebolavirus comparative genomics.

    PubMed

    Jun, Se-Ran; Leuze, Michael R; Nookaew, Intawat; Uberbacher, Edward C; Land, Miriam; Zhang, Qian; Wanchai, Visanu; Chai, Juanjuan; Nielsen, Morten; Trolle, Thomas; Lund, Ole; Buzard, Gregory S; Pedersen, Thomas D; Wassenaar, Trudy M; Ussery, David W

    2015-09-01

    The 2014 Ebola outbreak in West Africa is the largest documented for this virus. To examine the dynamics of this genome, we compare more than 100 currently available ebolavirus genomes to each other and to other viral genomes. Based on oligomer frequency analysis, the family Filoviridae forms a distinct group from all other sequenced viral genomes. All filovirus genomes sequenced to date encode proteins with similar functions and gene order, although there is considerable divergence in sequences between the three genera Ebolavirus, Cuevavirus and Marburgvirus within the family Filoviridae. Whereas all ebolavirus genomes are quite similar (multiple sequences of the same strain are often identical), variation is most common in the intergenic regions and within specific areas of the genes encoding the glycoprotein (GP), nucleoprotein (NP) and polymerase (L). We predict regions that could contain epitope-binding sites, which might be good vaccine targets. This information, combined with glycosylation sites and experimentally determined epitopes, can identify the most promising regions for the development of therapeutic strategies. This manuscript has been authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).

  20. Ebolavirus comparative genomics

    PubMed Central

    Jun, Se-Ran; Leuze, Michael R.; Nookaew, Intawat; Uberbacher, Edward C.; Land, Miriam; Zhang, Qian; Wanchai, Visanu; Chai, Juanjuan; Nielsen, Morten; Trolle, Thomas; Lund, Ole; Buzard, Gregory S.; Pedersen, Thomas D.; Wassenaar, Trudy M.; Ussery, David W.

    2015-01-01

    The 2014 Ebola outbreak in West Africa is the largest documented for this virus. To examine the dynamics of this genome, we compare more than 100 currently available ebolavirus genomes to each other and to other viral genomes. Based on oligomer frequency analysis, the family Filoviridae forms a distinct group from all other sequenced viral genomes. All filovirus genomes sequenced to date encode proteins with similar functions and gene order, although there is considerable divergence in sequences between the three genera Ebolavirus, Cuevavirus and Marburgvirus within the family Filoviridae. Whereas all ebolavirus genomes are quite similar (multiple sequences of the same strain are often identical), variation is most common in the intergenic regions and within specific areas of the genes encoding the glycoprotein (GP), nucleoprotein (NP) and polymerase (L). We predict regions that could contain epitope-binding sites, which might be good vaccine targets. This information, combined with glycosylation sites and experimentally determined epitopes, can identify the most promising regions for the development of therapeutic strategies. This manuscript has been authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan). PMID:26175035
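
    The oligomer frequency analysis mentioned above compares genomes by how often each short word (k-mer) occurs in them. The sketch below illustrates the general idea only. The choice of k = 3, the Euclidean distance, the DNA-style ACGT alphabet (RNA genomes are assumed to be stored as cDNA), and the toy sequences are all assumptions for illustration and are not taken from the study.

```python
import math
from collections import Counter
from itertools import product
from typing import List


def kmer_profile(seq: str, k: int = 3) -> List[float]:
    """Normalized frequencies of all 4**k k-mers over ACGT, in a fixed order."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # Only k-mers made purely of A, C, G, T are counted; others (e.g. with N) are ignored.
    total = sum(counts[m] for m in kmers)
    if total == 0:
        raise ValueError("sequence contains no valid k-mers")
    return [counts[m] / total for m in kmers]


def euclidean(p: List[float], q: List[float]) -> float:
    """Euclidean distance between two equally long frequency vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Toy sequences (hypothetical, far shorter than any real genome).
seq_a = "ACGTACGTACGT"
seq_b = "ACGTACGTACGA"  # one substitution relative to seq_a
seq_c = "TTTTTTTTTTTT"  # very different composition

d_ab = euclidean(kmer_profile(seq_a), kmer_profile(seq_b))
d_ac = euclidean(kmer_profile(seq_a), kmer_profile(seq_c))
```

    Sequences with similar composition end up close together (d_ab is smaller than d_ac here), which is the property that lets a whole family separate from other genomes in such an analysis.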

  1. Comparative analysis of CRISPR-Cas systems in Klebsiella genomes.

    PubMed

    Shen, Juntao; Lv, Li; Wang, Xudong; Xiu, Zhilong; Chen, Guoqiang

    2017-02-03

    The prokaryotic CRISPR-Cas system provides adaptive immunity against invasive genetic elements. Bacteria of the genus Klebsiella are important nosocomial opportunistic pathogens. However, information on CRISPR-Cas systems in Klebsiella remains scarce. Here, we analyzed the CRISPR-Cas systems of 68 complete genomes of Klebsiella representing four species. All elements of the CRISPR-Cas systems (cas genes, repeats, leader sequences, and PAMs) were characterized. Besides the typical Type I-E and I-F CRISPR-Cas systems, a new Subtype I system located in the ABC transport system-glyoxalase region was found. The conservation of the new subtype CRISPR system between different species provided new evidence for CRISPR horizontal transfer. CRISPR polymorphism was strongly correlated both with species and multilocus sequence types. Some results indicated the function of adaptive immunity: most spacers (112 of 124) matched prophages and plasmids and none matched housekeeping genes; new spacer acquisition was observed within the same sequence type (ST) and same clonal complex; identical spacers were observed only in the ancient position (far from the leader) between different STs and clonal complexes. Interestingly, a high ratio of self-targeting spacers (7.5%, 31 of 416) was found in CRISPR-bearing Klebsiella pneumoniae (61%, 11 of 18). In some strains, there were even multiple fully matching self-targeting spacers. Some self-targeting spacers were conserved even between different STs. These results indicated that some unknown mechanisms exist to compromise the function of self-targeting by CRISPR-Cas systems in K. pneumoniae.

  2. Bacillus anthracis comparative genome analysis in support of the Amerithrax investigation

    PubMed Central

    Rasko, David A.; Worsham, Patricia L.; Abshire, Terry G.; Stanley, Scott T.; Bannan, Jason D.; Wilson, Mark R.; Langham, Richard J.; Decker, R. Scott; Jiang, Lingxia; Read, Timothy D.; Phillippy, Adam M.; Salzberg, Steven L.; Pop, Mihai; Van Ert, Matthew N.; Kenefic, Leo J.; Keim, Paul S.; Fraser-Liggett, Claire M.; Ravel, Jacques

    2011-01-01

    Before the anthrax letter attacks of 2001, the developing field of microbial forensics relied on microbial genotyping schemes based on a small portion of a genome sequence. Amerithrax, the investigation into the anthrax letter attacks, applied high-resolution whole-genome sequencing and comparative genomics to identify key genetic features of the letters’ Bacillus anthracis Ames strain. During systematic microbiological analysis of the spore material from the letters, we identified a number of morphological variants based on phenotypic characteristics and the ability to sporulate. The genomes of these morphological variants were sequenced and compared with that of the B. anthracis Ames ancestor, the progenitor of all B. anthracis Ames strains. Through comparative genomics, we identified four distinct loci with verifiable genetic mutations. Three of the four mutations could be directly linked to sporulation pathways in B. anthracis and more specifically to the regulation of the phosphorylation state of Spo0F, a key regulatory protein in the initiation of the sporulation cascade, thus linking phenotype to genotype. None of these variant genotypes were identified in single-colony environmental B. anthracis Ames isolates associated with the investigation. These genotypes were identified only in B. anthracis morphotypes isolated from the letters, indicating that the variants were not prevalent in the environment, not even the environments associated with the investigation. This study demonstrates the forensic value of systematic microbiological analysis combined with whole-genome sequencing and comparative genomics. PMID:21383169

  3. Comparative genomic analysis as a tool for biological discovery

    SciTech Connect

    Nobrega, Marcelo A.; Pennacchio, Len A.

    2003-03-30

    Biology is a discipline rooted in comparisons. Comparative physiology has assembled a detailed catalogue of the biological similarities and differences between species, revealing insights into how life has adapted to fill a wide range of environmental niches. For example, the oxygen and carbon dioxide carrying capacity of vertebrates has evolved to provide strong advantages for species respiring at sea level, at high elevation or within water. Comparative anatomy, biochemistry, pharmacology, immunology and cell biology have provided the fundamental paradigms from which each discipline has grown.

  4. Comparative genomic analysis of Brucella melitensis vaccine strain M5 provides insights into virulence attenuation.

    PubMed

    Jiang, Hai; Du, Pengcheng; Zhang, Wen; Wang, Heng; Zhao, Hongyan; Piao, Dongri; Tian, Guozhong; Chen, Chen; Cui, Buyun

    2013-01-01

    The Brucella melitensis vaccine strain M5 is widely used to prevent and control brucellosis in animals. In this study, we determined the whole-genome sequence of M5, and conducted a comprehensive comparative analysis against the whole-genome sequence of the virulent strain 16M and other reference strains. This analysis revealed 11 regions of deletion (RDs) and 2 regions of insertion (RIs) within the M5 genome. Among these regions, the sequences encompassed in 5 RDs and 1 RI showed consistent variation, with a large deletion between the M5 and the 16M genomes. RD4 and RD5 showed large diversity among all Brucella genomes, both in RD length and RD copy number. Thus, RD4 and RD5 are potential sites for typing different Brucella strains. Other RD and RI regions exhibited multiple single nucleotide polymorphisms (SNPs). In addition, a genome fragment with a 56 kb rearrangement was determined to be consistent with previous studies. Comparative genomic analysis indicated that genomic island inversion in Brucella was widely present. With the genetic pattern common among all strains analyzed, these 2 RDs, 1 RI, and one inversion region are potential sites for detection of genomic differences. Several SNPs of important virulence-related genes (motB, dhbC, sfuB, dsbAB, aidA, aroC, and lysR) were also detected, and may be used to determine the mechanism of virulence attenuation. Collectively, this study reveals that comparative analysis between wild-type and vaccine strains can provide resources for the study of virulence and microevolution of Brucella.

  5. Comparative Genomic Analysis of Brucella melitensis Vaccine Strain M5 Provides Insights into Virulence Attenuation

    PubMed Central

    Zhang, Wen; Wang, Heng; Zhao, Hongyan; Piao, Dongri; Tian, Guozhong; Chen, Chen; Cui, Buyun

    2013-01-01

    The Brucella melitensis vaccine strain M5 is widely used to prevent and control brucellosis in animals. In this study, we determined the whole-genome sequence of M5, and conducted a comprehensive comparative analysis against the whole-genome sequence of the virulent strain 16M and other reference strains. This analysis revealed 11 regions of deletion (RDs) and 2 regions of insertion (RIs) within the M5 genome. Among these regions, the sequences encompassed in 5 RDs and 1 RI showed consistent variation, with a large deletion between the M5 and the 16M genomes. RD4 and RD5 showed large diversity among all Brucella genomes, both in RD length and RD copy number. Thus, RD4 and RD5 are potential sites for typing different Brucella strains. Other RD and RI regions exhibited multiple single nucleotide polymorphisms (SNPs). In addition, a genome fragment with a 56 kb rearrangement was determined to be consistent with previous studies. Comparative genomic analysis indicated that genomic island inversion in Brucella was widely present. With the genetic pattern common among all strains analyzed, these 2 RDs, 1 RI, and one inversion region are potential sites for detection of genomic differences. Several SNPs of important virulence-related genes (motB, dhbC, sfuB, dsbAB, aidA, aroC, and lysR) were also detected, and may be used to determine the mechanism of virulence attenuation. Collectively, this study reveals that comparative analysis between wild-type and vaccine strains can provide resources for the study of virulence and microevolution of Brucella. PMID:23967122

  6. Comparative Genomics Analysis of Streptomyces Species Reveals Their Adaptation to the Marine Environment and Their Diversity at the Genomic Level

    PubMed Central

    Tian, Xinpeng; Zhang, Zhewen; Yang, Tingting; Chen, Meili; Li, Jie; Chen, Fei; Yang, Jin; Li, Wenjie; Zhang, Bing; Zhang, Zhang; Wu, Jiayan; Zhang, Changsheng; Long, Lijuan; Xiao, Jingfa

    2016-01-01

    Over 200 genomes of streptomycete strains that were isolated from various environments are available from the NCBI. However, little is known about the characteristics that are linked to marine adaptation in marine-derived streptomycetes. The particularity and complexity of the marine environment suggest that marine streptomycetes are genetically diverse. Here, we sequenced nine strains from the Streptomyces genus that were isolated from different longitudes, latitudes, and depths of the South China Sea. Then we compared these strains to 22 NCBI downloaded streptomycete strains. Thirty-one streptomycete strains are clearly grouped into a marine-derived subgroup and multiple source subgroup-based phylogenetic tree. The phylogenetic analyses have revealed the dynamic process underlying streptomycete genome evolution, and lateral gene transfer is an important driving force during the process. Pan-genomics analyses have revealed that streptomycetes have an open pan-genome, which reflects the diversity of these streptomycetes and guarantees the species a quick and economical response to diverse environments. Functional and comparative genomics analyses indicate that the marine-derived streptomycetes subgroup possesses some common characteristics of marine adaptation. Our findings have expanded our knowledge of how ocean isolates of streptomycete strains adapt to marine environments. The availability of streptomycete genomes from the South China Sea will be beneficial for further analysis on marine streptomycetes and will enrich the South China Sea’s genetic data sources. PMID:27446038
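
    Pan-genome analyses of the kind described above partition gene families by how many strains carry them. The following is a minimal sketch of that partition, assuming gene families have already been clustered and each strain is represented as a set of family identifiers. The function name, strain labels, and gene identifiers are hypothetical and do not come from the study.

```python
from typing import Dict, Set


def partition_pan_genome(strain_genes: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Split gene families into pan, core, accessory, and strain-unique sets."""
    if not strain_genes:
        raise ValueError("need at least one strain")
    gene_sets = [set(genes) for genes in strain_genes.values()]
    pan = set().union(*gene_sets)         # families present in any strain
    core = set.intersection(*gene_sets)   # families present in every strain
    # Unique: non-core families carried by exactly one strain.
    unique = {g for g in pan - core if sum(g in s for s in gene_sets) == 1}
    accessory = pan - core - unique       # shared by some, but not all, strains
    return {"pan": pan, "core": core, "accessory": accessory, "unique": unique}


# Toy presence data (hypothetical identifiers, not from the study).
strains = {
    "marine_1": {"g1", "g2", "g3"},
    "marine_2": {"g1", "g2", "g4"},
    "soil_1": {"g1", "g3", "g5"},
}
parts = partition_pan_genome(strains)
```

    A pan-genome is usually called open when the size of the "pan" set keeps growing as more strains are added; repeating this partition over increasing subsets of strains is one simple way to look at that trend.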

  7. Comparative Genomics Analysis of Streptomyces Species Reveals Their Adaptation to the Marine Environment and Their Diversity at the Genomic Level.

    PubMed

    Tian, Xinpeng; Zhang, Zhewen; Yang, Tingting; Chen, Meili; Li, Jie; Chen, Fei; Yang, Jin; Li, Wenjie; Zhang, Bing; Zhang, Zhang; Wu, Jiayan; Zhang, Changsheng; Long, Lijuan; Xiao, Jingfa

    2016-01-01

    Over 200 genomes of streptomycete strains that were isolated from various environments are available from the NCBI. However, little is known about the characteristics that are linked to marine adaptation in marine-derived streptomycetes. The particularity and complexity of the marine environment suggest that marine streptomycetes are genetically diverse. Here, we sequenced nine strains from the Streptomyces genus that were isolated from different longitudes, latitudes, and depths of the South China Sea. Then we compared these strains to 22 NCBI downloaded streptomycete strains. Thirty-one streptomycete strains are clearly grouped into a marine-derived subgroup and multiple source subgroup-based phylogenetic tree. The phylogenetic analyses have revealed the dynamic process underlying streptomycete genome evolution, and lateral gene transfer is an important driving force during the process. Pan-genomics analyses have revealed that streptomycetes have an open pan-genome, which reflects the diversity of these streptomycetes and guarantees the species a quick and economical response to diverse environments. Functional and comparative genomics analyses indicate that the marine-derived streptomycetes subgroup possesses some common characteristics of marine adaptation. Our findings have expanded our knowledge of how ocean isolates of streptomycete strains adapt to marine environments. The availability of streptomycete genomes from the South China Sea will be beneficial for further analysis on marine streptomycetes and will enrich the South China Sea's genetic data sources.

  8. Analysis of the Complete Mitochondrial Genome Sequence of the Diploid Cotton Gossypium raimondii by Comparative Genomics Approaches

    PubMed Central

    Paterson, Andrew H.; Wang, Xuelin; Xu, Yiqing; Wu, Dongyang; Qu, Yanshu; Jiang, Anna; Ye, Qiaolin

    2016-01-01

    Cotton is one of the most important economic crops and the primary source of natural fiber and is an important protein source for animal feed. The complete nuclear and chloroplast (cp) genome sequences of G. raimondii are already available, but the mitochondrial genome sequence is not. Here, we assembled the complete mitochondrial (mt) DNA sequence of G. raimondii into a circular genome of 676,078 bp and performed comparative analyses with other higher plants. The genome contains 39 protein-coding genes, 6 rRNA genes, and 25 tRNA genes. We also identified four larger repeats (63.9 kb, 10.6 kb, 9.1 kb, and 2.5 kb) in this mt genome, which may be active in intramolecular recombination in the evolution of cotton. Strikingly, nearly all of the G. raimondii mt genome has been transferred to the nucleus on Chr1, and the transfer event must be very recent. Phylogenetic analysis reveals that G. raimondii, as a member of Malvaceae, is much closer to another cotton (G. barbadense) than other rosids, and the clade formed by two Gossypium species is sister to Brassicales. The G. raimondii mt genome may provide a crucial foundation for evolutionary analysis, molecular biology, and cytoplasmic male sterility in cotton and other higher plants. PMID:27847816

  9. Analysis of the Complete Mitochondrial Genome Sequence of the Diploid Cotton Gossypium raimondii by Comparative Genomics Approaches.

    PubMed

    Bi, Changwei; Paterson, Andrew H; Wang, Xuelin; Xu, Yiqing; Wu, Dongyang; Qu, Yanshu; Jiang, Anna; Ye, Qiaolin; Ye, Ning

    2016-01-01

    Cotton is one of the most important economic crops and the primary source of natural fiber and is an important protein source for animal feed. The complete nuclear and chloroplast (cp) genome sequences of G. raimondii are already available, but the mitochondrial genome sequence is not. Here, we assembled the complete mitochondrial (mt) DNA sequence of G. raimondii into a circular genome of 676,078 bp and performed comparative analyses with other higher plants. The genome contains 39 protein-coding genes, 6 rRNA genes, and 25 tRNA genes. We also identified four larger repeats (63.9 kb, 10.6 kb, 9.1 kb, and 2.5 kb) in this mt genome, which may be active in intramolecular recombination in the evolution of cotton. Strikingly, nearly all of the G. raimondii mt genome has been transferred to the nucleus on Chr1, and the transfer event must be very recent. Phylogenetic analysis reveals that G. raimondii, as a member of Malvaceae, is much closer to another cotton (G. barbadense) than other rosids, and the clade formed by two Gossypium species is sister to Brassicales. The G. raimondii mt genome may provide a crucial foundation for evolutionary analysis, molecular biology, and cytoplasmic male sterility in cotton and other higher plants.

  10. Comparative genomics of Brassicaceae crops

    PubMed Central

    Sharma, Ashutosh; Li, Xiaonan; Lim, Yong Pyo

    2014-01-01

    The family Brassicaceae is one of the major groups of the plant kingdom and comprises diverse species of great economic, agronomic and scientific importance, including the model plant Arabidopsis. The sequencing of the Arabidopsis genome has revolutionized our knowledge in the field of plant biology and provides a foundation in genomics and comparative biology. Genomic resources have been utilized in Brassica for diversity analyses, construction of genetic maps and identification of agronomic traits. In Brassicaceae, comparative sequence analysis across the species has been utilized to understand genome structure, evolution and the detection of conserved genomic segments. In this review, we focus on the progress made in genetic resource development, genome sequencing and comparative mapping in Brassica and related species. The utilization of genomic resources and next-generation sequencing approaches in improvement of Brassica crops is also discussed. PMID:24987286

  11. Comparative analysis of Salmonella genomes identifies a metabolic network for escalating growth in the inflamed gut.

    PubMed

    Nuccio, Sean-Paul; Bäumler, Andreas J

    2014-03-18

    The Salmonella genus comprises a group of pathogens associated with illnesses ranging from gastroenteritis to typhoid fever. We performed an in silico analysis of comparatively reannotated Salmonella genomes to identify genomic signatures indicative of disease potential. By removing numerous annotation inconsistencies and inaccuracies, the process of reannotation identified a network of 469 genes involved in central anaerobic metabolism, which was intact in genomes of gastrointestinal pathogens but degrading in genomes of extraintestinal pathogens. This large network contained pathways that enable gastrointestinal pathogens to utilize inflammation-derived nutrients as well as many of the biochemical reactions used for the enrichment and biochemical discrimination of Salmonella serovars. Thus, comparative genome analysis identifies a metabolic network that provides clues about the strategies for nutrient acquisition and utilization that are characteristic of gastrointestinal pathogens. IMPORTANCE While some Salmonella serovars cause infections that remain localized to the gut, others disseminate throughout the body. Here, we compared Salmonella genomes to identify characteristics that distinguish gastrointestinal from extraintestinal pathogens. We identified a large metabolic network that is functional in gastrointestinal pathogens but decaying in extraintestinal pathogens. While taxonomists have used traits from this network empirically for many decades for the enrichment and biochemical discrimination of Salmonella serovars, our findings suggest that it is part of a "business plan" for growth in the inflamed gastrointestinal tract. By identifying a large metabolic network characteristic of Salmonella serovars associated with gastroenteritis, our in silico analysis provides a blueprint for potential strategies to utilize inflammation-derived nutrients and edge out competing gut microbes.

  12. Comparative genome analysis of Spiroplasma melliferum IPMB4A, a honeybee-associated bacterium

    PubMed Central

    2013-01-01

    Background The genus Spiroplasma contains a group of helical, motile, and wall-less bacteria in the class Mollicutes. Similar to other members of this class, such as the animal-pathogenic Mycoplasma and the plant-pathogenic ‘Candidatus Phytoplasma’, all characterized Spiroplasma species were found to be associated with eukaryotic hosts. While most of the Spiroplasma species appeared to be harmless commensals of insects, a small number of species have evolved pathogenicity toward various arthropods and plants. In this study, we isolated a novel strain of honeybee-associated S. melliferum and investigated its genetic composition and evolutionary history by whole-genome shotgun sequencing and comparative analysis with other Mollicutes genomes. Results The whole-genome shotgun sequencing of S. melliferum IPMB4A produced a draft assembly that was ~1.1 Mb in size and covered ~80% of the chromosome. Similar to other Spiroplasma genomes that have been studied to date, we found that this genome contains abundant repetitive sequences that originated from plectrovirus insertions. These phage fragments represented a major obstacle in obtaining a complete genome sequence of Spiroplasma with the current sequencing technology. Comparative analysis of S. melliferum IPMB4A with other Spiroplasma genomes revealed that these phages may have facilitated extensive genome rearrangements in these bacteria and contributed to horizontal gene transfers that led to species-specific adaptation to different eukaryotic hosts. In addition, comparison of gene content with other Mollicutes suggested that the common ancestor of the SEM (Spiroplasma, Entomoplasma, and Mycoplasma) clade may have had a relatively large genome and flexible metabolic capacity; the extremely reduced genomes of present day Mycoplasma and ‘Candidatus Phytoplasma’ species are likely to be the result of independent gene losses in these lineages. Conclusions The findings in this study highlighted the significance of

  13. Comparative Genomics Analysis of Rice and Pineapple Contributes to Understand the Chromosome Number Reduction and Genomic Changes in Grasses.

    PubMed

    Wang, Jinpeng; Yu, Jiaxiang; Sun, Pengchuan; Li, Yuxian; Xia, Ruiyan; Liu, Yinzhe; Ma, Xuelian; Yu, Jigao; Yang, Nanshan; Lei, Tianyu; Wang, Zhenyi; Wang, Li; Ge, Weina; Song, Xiaoming; Liu, Xiaojian; Sun, Sangrong; Liu, Tao; Jin, Dianchuan; Pan, Yuxin; Wang, Xiyin

    2016-01-01

    Rice is one of the most researched model plants, and has a genome structure most resembling that of the grass common ancestor after a grass-common tetraploidization ∼100 million years ago. There has been a standing controversy over whether there were five or seven basic chromosomes before the tetraploidization, which has been tackled but could not be resolved for lack of a sequenced and assembled outgroup plant with a conserved genome structure. Recently, the availability of the pineapple genome, which has not been subjected to the grass-common tetraploidization, provides a valuable opportunity to resolve this controversy and to investigate genome changes in rice and other grasses. Here, we performed a comparative genomics analysis of pineapple and rice, and found solid evidence that the grass-common ancestor had 2n = 2x = 14 basic chromosomes before the tetraploidization and duplicated to 2n = 4x = 28 after the event. Moreover, we proposed that the enormous number of genes missing from duplicated regions in rice should be explained by an allotetraploid produced by prominently divergent parental lines, rather than gene losses after their divergence. This means that genome fractionation might have occurred before the formation of the allotetraploid grass ancestor.

  14. Comparative Genomics Analysis of Rice and Pineapple Contributes to Understand the Chromosome Number Reduction and Genomic Changes in Grasses

    PubMed Central

    Wang, Jinpeng; Yu, Jiaxiang; Sun, Pengchuan; Li, Yuxian; Xia, Ruiyan; Liu, Yinzhe; Ma, Xuelian; Yu, Jigao; Yang, Nanshan; Lei, Tianyu; Wang, Zhenyi; Wang, Li; Ge, Weina; Song, Xiaoming; Liu, Xiaojian; Sun, Sangrong; Liu, Tao; Jin, Dianchuan; Pan, Yuxin; Wang, Xiyin

    2016-01-01

    Rice is one of the most researched model plants, and has a genome structure most resembling that of the grass common ancestor after a grass-common tetraploidization ∼100 million years ago. There has been a standing controversy over whether there were five or seven basic chromosomes before the tetraploidization, which has been tackled but could not be resolved for lack of a sequenced and assembled outgroup plant with a conserved genome structure. Recently, the availability of the pineapple genome, which has not been subjected to the grass-common tetraploidization, provides a valuable opportunity to resolve this controversy and to investigate genome changes in rice and other grasses. Here, we performed a comparative genomics analysis of pineapple and rice, and found solid evidence that the grass-common ancestor had 2n = 2x = 14 basic chromosomes before the tetraploidization and duplicated to 2n = 4x = 28 after the event. Moreover, we proposed that the enormous number of genes missing from duplicated regions in rice should be explained by an allotetraploid produced by prominently divergent parental lines, rather than gene losses after their divergence. This means that genome fractionation might have occurred before the formation of the allotetraploid grass ancestor. PMID:27757123

  15. Genetic Characterization and Comparative Genome Analysis of Brucella melitensis Isolates from India.

    PubMed

    Azam, Sarwar; Rao, Sashi Bhushan; Jakka, Padmaja; NarasimhaRao, Veera; Bhargavi, Bindu; Gupta, Vivek Kumar; Radhakrishnan, Girish

    2016-01-01

    Brucellosis is the most frequent zoonotic disease worldwide, with over 500,000 new human infections every year. Brucella melitensis, the most virulent species in humans, primarily affects goats and the zoonotic transmission occurs by ingestion of unpasteurized milk products or through direct contact with fetal tissues. Brucellosis is endemic in India but no information is available on population structure and genetic diversity of Brucella spp. in India. We performed multilocus sequence typing of four B. melitensis strains isolated from naturally infected goats from India. For more detailed genetic characterization, we carried out whole genome sequencing and comparative genome analysis of one of the B. melitensis isolates, Bm IND1. Genome analysis identified 141 unique SNPs, 78 VNTRs, 51 Indels, and 2 putative prophage integrations in the Bm IND1 genome. Our data may help to develop improved epidemiological typing tools and efficient preventive strategies to control brucellosis.

  16. Genetic Characterization and Comparative Genome Analysis of Brucella melitensis Isolates from India

    PubMed Central

    Azam, Sarwar; Rao, Sashi Bhushan; Jakka, Padmaja; NarasimhaRao, Veera; Bhargavi, Bindu; Gupta, Vivek Kumar

    2016-01-01

    Brucellosis is the most frequent zoonotic disease worldwide, with over 500,000 new human infections every year. Brucella melitensis, the most virulent species in humans, primarily affects goats and the zoonotic transmission occurs by ingestion of unpasteurized milk products or through direct contact with fetal tissues. Brucellosis is endemic in India but no information is available on population structure and genetic diversity of Brucella spp. in India. We performed multilocus sequence typing of four B. melitensis strains isolated from naturally infected goats from India. For more detailed genetic characterization, we carried out whole genome sequencing and comparative genome analysis of one of the B. melitensis isolates, Bm IND1. Genome analysis identified 141 unique SNPs, 78 VNTRs, 51 Indels, and 2 putative prophage integrations in the Bm IND1 genome. Our data may help to develop improved epidemiological typing tools and efficient preventive strategies to control brucellosis. PMID:27525259

  17. Genome Sequence, Comparative Analysis, and Evolutionary Insights into Chitinases of Entomopathogenic Fungus Hirsutella thompsonii

    PubMed Central

    Agrawal, Yamini; Khatri, Indu; Subramanian, Srikrishna; Shenoy, Belle Damodara

    2015-01-01

    Hirsutella thompsonii (Ht) is a fungal pathogen of acarines and the primary cause of epizootics among mites. The draft genomes of two isolates of Ht (MTCC 3556: Ht3, 34.6 Mb and MTCC 6686: Ht6, 34.7 Mb) are presented and compared with the genomes of Beauveria bassiana (Bb) ARSEF 2860 and Ophiocordyceps sinensis (Os) CO18. Comparative analysis of carbohydrate active enzymes, pathogen–host interaction genes, metabolism-associated genes, and genes involved in biosynthesis of secondary metabolites in the four genomes was carried out. Reduction in gene family sizes in Ht3 and Os as compared with Ht6 and Bb is observed. Analysis of the mating type genes in Ht reveals the presence of MAT idiomorphs which is suggestive of cryptic sexual traits in Ht. We further identify and classify putative chitinases that may function as virulence factors in fungal entomopathogens due to their role in degradation of arthropod cuticle. PMID:25716828

  18. Comparative analysis of plastid genomes of non-photosynthetic Ericaceae and their photosynthetic relatives

    PubMed Central

    Logacheva, Maria D.; Schelkunov, Mikhail I.; Shtratnikova, Victoria Y.; Matveeva, Maria V.; Penin, Aleksey A.

    2016-01-01

    Although plastid genomes of flowering plants are typically highly conserved regarding their size, gene content and order, there are some exceptions. Ericaceae, a large and diverse family of flowering plants, warrants special attention within the context of plastid genome evolution because it includes both non-photosynthetic and photosynthetic species with rearranged plastomes and putative losses of “essential” genes. We characterized plastid genomes of three species of Ericaceae, non-photosynthetic Monotropa uniflora and Hypopitys monotropa and photosynthetic Pyrola rotundifolia, using high-throughput sequencing. As expected for non-photosynthetic plants, M. uniflora and H. monotropa have small plastid genomes (46 kb and 35 kb, respectively) lacking genes related to photosynthesis, whereas P. rotundifolia has a larger genome (169 kb) with a gene set similar to other photosynthetic plants. The examined genomes contain an unusually high number of repeats and translocations. Comparative analysis of the expanded set of Ericaceae plastomes suggests that the genes clpP and accD that are present in the plastid genomes of almost all plants have not been lost in this family (as was previously thought) but rather persist in these genomes in unusual forms. Also we found a new gene in P. rotundifolia that emerged as a result of duplication of rps4 gene. PMID:27452401

  19. Genome Information Broker for Viruses (GIB-V): database for comparative analysis of virus genomes

    PubMed Central

    Hirahata, Masaki; Abe, Takashi; Tanaka, Naoto; Kuwana, Yoshikazu; Shigemoto, Yasumasa; Miyazaki, Satoru; Suzuki, Yoshiyuki; Sugawara, Hideaki

    2007-01-01

    Genome Information Broker for Viruses (GIB-V) is a comprehensive virus genome/segment database. We extracted 18 418 complete virus genomes/segments from the International Nucleotide Sequence Database Collaboration (INSDC) by DNA Data Bank of Japan (DDBJ), EMBL and GenBank and stored them in our system. The list of registered viruses is arranged hierarchically according to taxonomy. Keyword searches can be performed for genome/segment data or biological features of any virus stored in GIB-V. GIB-V is equipped with a BLAST search function, and search results are displayed graphically or in list form. Moreover, the BLAST results can be used online with the ClustalW feature of the DDBJ. All available virus genome/segment data can be collected by the GIB-V download function. GIB-V can be accessed at no charge. PMID:17158166

  20. Comparative Genomics Analysis of Two Different Virulent Bovine Pasteurella multocida Isolates

    PubMed Central

    Pan, Tingting; Li, Tian; Wu, Rui

    2016-01-01

    The Pasteurella multocida capsular type A isolates can cause pneumonia and bovine respiratory disease (BRD). In this study, comparative genomics analysis was carried out to identify the virulence genes in two different virulent P. multocida capsular type A isolates (high virulent PmCQ2 and low virulent PmCQ6). The draft genome sequence of PmCQ2 is 2.32 Mbp and contains 2,002 protein-coding genes, 9 insertion sequence (IS) elements, and 1 prophage region. The draft genome sequence of PmCQ6 is 2.29 Mbp and contains 1,970 protein-coding genes, 2 IS elements, and 3 prophage regions. The genome alignment analysis revealed that the genome similarity between PmCQ2 and PmCQ6 is 99% with high colinearity. To identify the candidate genes responsible for virulence, the PmCQ2 and PmCQ6 were compared together with that of the published genomes of high virulent Pm36950 and PmHN06 and avirulent Pm3480 and Pm70 (capsular type F). Five genes and two insertion sequences are identified in high virulent strains but not in low virulent or avirulent strains. These results indicated that these genes or insertion sequences might be responsible for the virulence of P. multocida, providing prospective candidates for further studies on the pathogenesis and the host-pathogen interactions of P. multocida. PMID:28070502

  1. Analysis of the Complete Chloroplast Genome of a Medicinal Plant, Dianthus superbus var. longicalyncinus, from a Comparative Genomics Perspective

    PubMed Central

    Raman, Gurusamy; Park, SeonJoo

    2015-01-01

    Dianthus superbus var. longicalycinus is an economically important traditional Chinese medicinal plant that is also used for ornamental purposes. In this study, D. superbus was compared to its closely related family of Caryophyllaceae chloroplast (cp) genomes such as Lychnis chalcedonica and Spinacia oleracea. D. superbus had the longest large single copy (LSC) region (82,805 bp), with some variations in the inverted repeat region A (IRA)/LSC regions. The IRs underwent both expansion and constriction during evolution of the Caryophyllaceae family; however, intense variations were not identified. The pseudogene ribosomal protein subunit S19 (rps19) was identified at the IRA/LSC junction, but was not present in the cp genome of other Caryophyllaceae family members. The translation initiation factor IF-1 (infA) and ribosomal protein subunit L23 (rpl23) genes were absent from the Dianthus cp genome. When the cp genome of Dianthus was compared with 31 other angiosperm lineages, the infA gene was found to have been lost in most members of rosids, solanales of asterids and Lychnis of Caryophyllales, whereas rpl23 gene loss or pseudogenization had occurred exclusively in Caryophyllales. Nevertheless, the cp genome of Dianthus and Spinacia has two introns in the proteolytic subunit of ATP-dependent protease (clpP) gene, but Lychnis has lost introns from the clpP gene. Furthermore, phylogenetic analysis of individual protein-coding genes infA and rpl23 revealed that gene loss or pseudogenization occurred independently in the cp genome of Dianthus. Molecular phylogenetic analysis also demonstrated a sister relationship between Dianthus and Lychnis based on 78 protein-coding sequences. The results presented herein will contribute to studies of the evolution, molecular biology and genetic engineering of the medicinal and ornamental plant, D. superbus var. longicalycinus. PMID:26513163

  2. Analysis of the Complete Chloroplast Genome of a Medicinal Plant, Dianthus superbus var. longicalyncinus, from a Comparative Genomics Perspective.

    PubMed

    Raman, Gurusamy; Park, SeonJoo

    2015-01-01

    Dianthus superbus var. longicalycinus is an economically important traditional Chinese medicinal plant that is also used for ornamental purposes. In this study, D. superbus was compared to its closely related family of Caryophyllaceae chloroplast (cp) genomes such as Lychnis chalcedonica and Spinacia oleracea. D. superbus had the longest large single copy (LSC) region (82,805 bp), with some variations in the inverted repeat region A (IRA)/LSC regions. The IRs underwent both expansion and constriction during evolution of the Caryophyllaceae family; however, intense variations were not identified. The pseudogene ribosomal protein subunit S19 (rps19) was identified at the IRA/LSC junction, but was not present in the cp genome of other Caryophyllaceae family members. The translation initiation factor IF-1 (infA) and ribosomal protein subunit L23 (rpl23) genes were absent from the Dianthus cp genome. When the cp genome of Dianthus was compared with 31 other angiosperm lineages, the infA gene was found to have been lost in most members of rosids, solanales of asterids and Lychnis of Caryophyllales, whereas rpl23 gene loss or pseudogenization had occurred exclusively in Caryophyllales. Nevertheless, the cp genome of Dianthus and Spinacia has two introns in the proteolytic subunit of ATP-dependent protease (clpP) gene, but Lychnis has lost introns from the clpP gene. Furthermore, phylogenetic analysis of individual protein-coding genes infA and rpl23 revealed that gene loss or pseudogenization occurred independently in the cp genome of Dianthus. Molecular phylogenetic analysis also demonstrated a sister relationship between Dianthus and Lychnis based on 78 protein-coding sequences. The results presented herein will contribute to studies of the evolution, molecular biology and genetic engineering of the medicinal and ornamental plant, D. superbus var. longicalycinus.

  3. IMG/M: integrated genome and metagenome comparative data analysis system.

    PubMed

    Chen, I-Min A; Markowitz, Victor M; Chu, Ken; Palaniappan, Krishna; Szeto, Ernest; Pillay, Manoj; Ratner, Anna; Huang, Jinghua; Andersen, Evan; Huntemann, Marcel; Varghese, Neha; Hadjithomas, Michalis; Tennessen, Kristin; Nielsen, Torben; Ivanova, Natalia N; Kyrpides, Nikos C

    2017-01-04

    The Integrated Microbial Genomes with Microbiome Samples (IMG/M: https://img.jgi.doe.gov/m/) system contains annotated DNA and RNA sequence data of (i) archaeal, bacterial, eukaryotic and viral genomes from cultured organisms, (ii) single cell genomes (SCG) and genomes from metagenomes (GFM) from uncultured archaea, bacteria and viruses and (iii) metagenomes from environmental, host associated and engineered microbiome samples. Sequence data are generated by DOE's Joint Genome Institute (JGI), submitted by individual scientists, or collected from public sequence data archives. Structural and functional annotation is carried out by JGI's genome and metagenome annotation pipelines. A variety of analytical and visualization tools provide support for examining and comparing IMG/M's datasets. IMG/M allows open access interactive analysis of publicly available datasets, while manual curation, submission and access to private datasets and computationally intensive workspace-based analysis require login/password access to its expert review (ER) companion system (IMG/M ER: https://img.jgi.doe.gov/mer/). Since the last report published in the 2014 NAR Database Issue, IMG/M's dataset content has tripled in terms of number of datasets and overall protein coding genes, while its analysis tools have been extended to cope with the rapid growth in the number and size of datasets handled by the system.

  4. IMG/M: integrated genome and metagenome comparative data analysis system

    PubMed Central

    Chen, I-Min A.; Markowitz, Victor M.; Chu, Ken; Palaniappan, Krishna; Szeto, Ernest; Pillay, Manoj; Ratner, Anna; Huang, Jinghua; Andersen, Evan; Huntemann, Marcel; Varghese, Neha; Hadjithomas, Michalis; Tennessen, Kristin; Nielsen, Torben; Ivanova, Natalia N.; Kyrpides, Nikos C.

    2017-01-01

    The Integrated Microbial Genomes with Microbiome Samples (IMG/M: https://img.jgi.doe.gov/m/) system contains annotated DNA and RNA sequence data of (i) archaeal, bacterial, eukaryotic and viral genomes from cultured organisms, (ii) single cell genomes (SCG) and genomes from metagenomes (GFM) from uncultured archaea, bacteria and viruses and (iii) metagenomes from environmental, host associated and engineered microbiome samples. Sequence data are generated by DOE's Joint Genome Institute (JGI), submitted by individual scientists, or collected from public sequence data archives. Structural and functional annotation is carried out by JGI's genome and metagenome annotation pipelines. A variety of analytical and visualization tools provide support for examining and comparing IMG/M's datasets. IMG/M allows open access interactive analysis of publicly available datasets, while manual curation, submission and access to private datasets and computationally intensive workspace-based analysis require login/password access to its expert review (ER) companion system (IMG/M ER: https://img.jgi.doe.gov/mer/). Since the last report published in the 2014 NAR Database Issue, IMG/M's dataset content has tripled in terms of number of datasets and overall protein coding genes, while its analysis tools have been extended to cope with the rapid growth in the number and size of datasets handled by the system. PMID:27738135

  5. Comparative Analysis of Subtyping Methods against a Whole-Genome-Sequencing Standard for Salmonella enterica Serotype Enteritidis

    DTIC Science & Technology

    2015-01-01

    A retrospective investigation was performed to evaluate whole-genome sequencing as a benchmark for comparing molecular...cases collected between 2001 and 2012 were sequenced and subjected to subtyping by four different methods: (i) whole-genome single-nucleotide...

  6. Comparative Genomic Analysis of Malaria Mosquito Vector-Associated Novel Pathogen Elizabethkingia anophelis

    PubMed Central

    Teo, Jeanette; Tan, Sean Yang-Yi; Liu, Yang; Tay, Martin; Ding, Yichen; Li, Yingying; Kjelleberg, Staffan; Givskov, Michael; Lin, Raymond T.P.; Yang, Liang

    2014-01-01

    Acquisition of Elizabethkingia infections in intensive care units (ICUs) has risen in the past decade. Treatment of Elizabethkingia infections is challenging due to the lack of effective therapeutic regimens, leading to a high mortality rate. Elizabethkingia infections have long been attributed to Elizabethkingia meningoseptica. Recently, we used whole-genome sequencing to reveal that E. anophelis is the pathogenic agent for an Elizabethkingia outbreak at two ICUs. We performed comparative genomic analysis of seven hospital-isolated E. anophelis strains with five available Elizabethkingia spp. genomes deposited in the National Center for Biotechnology Information Database. A pan-genomic approach was applied to identify the core- and pan-genome for the Elizabethkingia genus. We showed that unlike the hospital-isolated pathogen E. meningoseptica ATCC 12535 strain, the hospital-isolated E. anophelis strains have genome content and organization similar to the E. anophelis Ag1 and R26 strains isolated from the midgut microbiota of the malaria mosquito vector Anopheles gambiae. Both the core- and accessory genomes of Elizabethkingia spp. possess genes conferring antibiotic resistance and virulence. Our study highlights that E. anophelis is an emerging bacterial pathogen for hospital environments. PMID:24803570

  7. Comparative genomic analysis of malaria mosquito vector-associated novel pathogen Elizabethkingia anophelis.

    PubMed

    Teo, Jeanette; Tan, Sean Yang-Yi; Liu, Yang; Tay, Martin; Ding, Yichen; Li, Yingying; Kjelleberg, Staffan; Givskov, Michael; Lin, Raymond T P; Yang, Liang

    2014-05-06

    Acquisition of Elizabethkingia infections in intensive care units (ICUs) has risen in the past decade. Treatment of Elizabethkingia infections is challenging due to the lack of effective therapeutic regimens, leading to a high mortality rate. Elizabethkingia infections have long been attributed to Elizabethkingia meningoseptica. Recently, we used whole-genome sequencing to reveal that E. anophelis is the pathogenic agent for an Elizabethkingia outbreak at two ICUs. We performed comparative genomic analysis of seven hospital-isolated E. anophelis strains with five available Elizabethkingia spp. genomes deposited in the National Center for Biotechnology Information Database. A pan-genomic approach was applied to identify the core- and pan-genome for the Elizabethkingia genus. We showed that unlike the hospital-isolated pathogen E. meningoseptica ATCC 12535 strain, the hospital-isolated E. anophelis strains have genome content and organization similar to the E. anophelis Ag1 and R26 strains isolated from the midgut microbiota of the malaria mosquito vector Anopheles gambiae. Both the core- and accessory genomes of Elizabethkingia spp. possess genes conferring antibiotic resistance and virulence. Our study highlights that E. anophelis is an emerging bacterial pathogen for hospital environments.

  8. Five Complete Chloroplast Genome Sequences from Diospyros: Genome Organization and Comparative Analysis

    PubMed Central

    Hu, Jingjing; Liang, Yuqin; Liang, Jinjun; Wuyun, Tana; Tan, Xiaofeng

    2016-01-01

    Diospyros is the largest genus in Ebenaceae, comprising more than 500 species with remarkable economic value, especially Diospyros kaki Thunb., which has traditionally been an important food resource in China, Korea, and Japan. Complete chloroplast (cp) genomes from D. kaki, D. lotus L., D. oleifera Cheng., D. glaucifolia Metc., and Diospyros ‘Jinzaoshi’ were sequenced using Illumina sequencing technology. This is the first cp genome reported in Ebenaceae. The cp genome sequences of Diospyros ranged from 157,300 to 157,784 bp in length, presenting a typical quadripartite structure with two inverted repeats each separated by one large and one small single-copy region. For each cp genome, 134 genes were annotated, including 80 protein-coding, 31 tRNA, and 4 rRNA unique genes. In all, 179 repeats and 283 single sequence repeats were identified. Four hypervariable regions, namely, intergenic region of trnQ_rps16, trnV_ndhC, and psbD_trnT, and intron of ndhA, were identified in the Diospyros genomes. Phylogenetic analyses based on the whole cp genome, protein-coding, and intergenic and intron sequences indicated that D. oleifera is closely related to D. kaki and could be used as a model plant for future research on D. kaki; to our knowledge, this is proposed for the first time. Further, these analyses together with two large deletions (301 and 140 bp) in the cp genome of D. ‘Jinzaoshi’, support its placement as a new species in Diospyros. Both maximum parsimony and likelihood analyses for 19 taxa indicated the basal position of Ericales in asterids and suggested that Ebenaceae is monophyletic in Ericales. PMID:27442423

  9. Complete Genome Sequence of Borrelia afzelii K78 and Comparative Genome Analysis

    PubMed Central

    Schüler, Wolfgang; Bunikis, Ignas; Weber-Lehman, Jacqueline; Comstedt, Pär; Kutschan-Bunikis, Sabrina; Stanek, Gerold; Huber, Jutta; Meinke, Andreas; Bergström, Sven; Lundberg, Urban

    2015-01-01

    The main Borrelia species causing Lyme borreliosis in Europe and Asia are Borrelia afzelii, B. garinii, B. burgdorferi and B. bavariensis. This is in contrast to the United States, where infections are exclusively caused by B. burgdorferi. To date, the genome sequences of four B. afzelii strains, of which only two include the numerous plasmids, are available. In order to further assess the genetic diversity of B. afzelii, the most common species in Europe, responsible for the large variety of clinical manifestations of Lyme borreliosis, we have determined the full genome sequence of the B. afzelii strain K78, a clinical isolate from Austria. The K78 genome contains a linear chromosome (905,949 bp) and 13 plasmids (8 linear and 5 circular) together presenting 1,309 open reading frames of which 496 are located on plasmids. With the exception of lp28-8, all linear replicons in their full length including their telomeres have been sequenced. The comparison with the genomes of the four other B. afzelii strains, ACA-1, PKo, HLJ01 and Tom3107, as well as the one of B. burgdorferi strain B31, confirmed a high degree of conservation within the linear chromosome of B. afzelii, whereas plasmid encoded genes showed a much larger diversity. Since some plasmids present in B. burgdorferi are missing in the B. afzelii genomes, the corresponding virulence factors of B. burgdorferi are found in B. afzelii on other unrelated plasmids. In addition, we have identified a species specific region in the circular plasmid, cp26, which could be used for species determination. Different non-coding RNAs have been located on the B. afzelii K78 genome, which have not previously been annotated in any of the published Borrelia genomes. PMID:25798594

  10. Comparative genome analysis: selection pressure on the Borrelia vls cassettes is essential for infectivity

    PubMed Central

    Glöckner, Gernot; Schulte-Spechtel, Ulrike; Schilhabel, Markus; Felder, Marius; Sühnel, Jürgen; Wilske, Bettina; Platzer, Matthias

    2006-01-01

    Background: At least three species of Borrelia burgdorferi sensu lato (Bbsl) cause tick-borne Lyme disease. Previous work including the genome analysis of B. burgdorferi B31 and B. garinii PBi suggested a highly variable plasmid part. The frequent occurrence of duplicated sequence stretches, the observed plasmid redundancy, as well as the mainly unknown function and variability of plasmid encoded genes rendered the relationships between plasmids within and between species largely unresolvable. Results: To gain further insight into Borreliae genome properties we completed the plasmid sequences of B. garinii PBi, added the genome of a further species, B. afzelii PKo, to our analysis, and compared for both species the genomes of pathogenic and apathogenic strains. The core of all Bbsl genomes consists of the chromosome and two plasmids collinear between all species. We also found additional groups of plasmids, which share large parts of their sequences. This makes it very likely that these plasmids are relatively stable and share common ancestors before the diversification of Borrelia species. The analysis of the differences between B. garinii PBi and B. afzelii PKo genomes of low and high passages revealed that the loss of infectivity is accompanied in both species by a loss of similar genetic material. Whereas B. garinii PBi suffered only from the break-off of a plasmid end, B. afzelii PKo lost more material, probably an entire plasmid. In both cases the vls gene locus encoding for variable surface proteins is affected. Conclusion: The complete genome sequences of a B. garinii and a B. afzelii strain facilitate further comparative studies within the genus Borrelia. Our study shows that loss of infectivity can be traced back to only one single event in B. garinii PBi: the loss of the vls cassettes possibly due to error prone gene conversion. Similar albeit extended losses in B. afzelii PKo support the hypothesis that infectivity of Borrelia species depends heavily on

  11. Genome Sequencing and Comparative Genomics Analysis Revealed Pathogenic Potential in Penicillium capsulatum as a Novel Fungal Pathogen Belonging to Eurotiales

    PubMed Central

    Yang, Ying; Chen, Min; Li, Zongwei; Al-Hatmi, Abdullah M. S.; de Hoog, Sybren; Pan, Weihua; Ye, Qiang; Bo, Xiaochen; Li, Zhen; Wang, Shengqi; Wang, Junzhi; Chen, Huipeng; Liao, Wanqing

    2016-01-01

    Penicillium capsulatum is a rare Penicillium species used in paper manufacturing, but recently it has been reported to cause invasive infection. To research the pathogenicity of the clinical Penicillium strain, we sequenced the genomes and transcriptomes of the clinical and environmental strains of P. capsulatum. Comparative analyses of these two P. capsulatum strains and closely related strains belonging to Eurotiales were performed. The assembled genome sizes of P. capsulatum are approximately 34.4 Mbp in length and encode 11,080 predicted genes. The different isolates of P. capsulatum are highly similar, with the exception of several unique genes, INDELs or SNPs in the genes coding for glycosyl hydrolases, amino acid transporters and circumsporozoite protein. A phylogenomic analysis was performed based on the whole genome data of 38 strains belonging to Eurotiales. By comparing the whole genome sequences and the virulence-related genes from 20 important related species, including fungal pathogens and non-human pathogens belonging to Eurotiales, we found meaningful pathogenicity characteristics between P. capsulatum and its closely related species. Our research indicated that P. capsulatum may be a neglected opportunistic pathogen. This study is beneficial for mycologists, geneticists and epidemiologists to achieve a deeper understanding of the genetic basis of the role of P. capsulatum as a newly reported fungal pathogen. PMID:27761131

  12. The First Complete Chloroplast Genome Sequences in Actinidiaceae: Genome Structure and Comparative Analysis

    PubMed Central

    Yao, Xiaohong; Tang, Ping; Li, Zuozhou; Li, Dawei; Liu, Yifei; Huang, Hongwen

    2015-01-01

    Actinidia chinensis is an important economic plant belonging to the basal lineage of the asterids. Availability of a complete Actinidia chloroplast genome sequence is crucial to understanding phylogenetic relationships among major lineages of angiosperms and facilitates kiwifruit genetic improvement. We report here the complete nucleotide sequences of the chloroplast genomes for Actinidia chinensis and A. chinensis var deliciosa obtained through de novo assembly of Illumina paired-end reads produced by total DNA sequencing. The total genome size ranges from 155,446 to 157,557 bp, with an inverted repeat (IR) of 24,013 to 24,391 bp, a large single copy region (LSC) of 87,984 to 88,337 bp and a small single copy region (SSC) of 20,332 to 20,336 bp. The genome encodes 113 different genes, including 79 unique protein-coding genes, 30 tRNA genes and 4 ribosomal RNA genes, with 16 duplicated in the inverted repeats, and a tRNA gene (trnfM-CAU) duplicated once in the LSC region. Comparisons of IR boundaries among four asterid species showed that IR/LSC borders were extended into the 5’ portion of the psbA gene and IR contraction occurred in Actinidia. The clpP gene has been lost from the chloroplast genome in Actinidia, and may have been transferred to the nucleus during chloroplast evolution. Twenty-seven polymorphic simple sequence repeat (SSR) loci were identified in the Actinidia chloroplast genome. Maximum parsimony analyses of a 72-gene, 16 taxa angiosperm dataset strongly support the placement of Actinidiaceae in Ericales within the basal asterids. PMID:26046631

  13. Comparative genomic analysis of Brazilian Leptospira kirschneri serogroup Pomona serovar Mozdok

    PubMed Central

    Moreno, Luisa Z; Kremer, Frederico S; Miraglia, Fabiana; Loureiro, Ana P; Eslabao, Marcus R; Dellagostin, Odir A; Lilenbaum, Walter; Moreno, Andrea M

    2016-01-01

    Leptospira kirschneri is one of the pathogenic species of the Leptospira genus. Human and animal infection from L. kirschneri gained further attention over the last few decades. Here we present the isolation and characterisation of Brazilian L. kirschneri serogroup Pomona serovar Mozdok strain M36/05 and the comparative genomic analysis with Brazilian human strain 61H. The M36/05 strain caused pulmonary hemorrhagic lesions in the hamster model, showing high virulence. The studied genomes presented high symmetrical identity and the in silico multilocus sequence typing analysis resulted in a new allelic profile (ST101) that so far has only been associated with the Brazilian L. kirschneri serogroup Pomona serovar Mozdok strains. Considering the environmental conditions and high genomic similarity observed between strains, we suggest the existence of a Brazilian L. kirschneri serogroup Pomona serovar Mozdok lineage that could represent a high public health risk; further studies are necessary to confirm the lineage significance and distribution. PMID:27581124

  14. Comparative genomic analysis of Brazilian Leptospira kirschneri serogroup Pomona serovar Mozdok.

    PubMed

    Moreno, Luisa Z; Kremer, Frederico S; Miraglia, Fabiana; Loureiro, Ana P; Eslabao, Marcus R; Dellagostin, Odir A; Lilenbaum, Walter; Moreno, Andrea M

    2016-08-01

    Leptospira kirschneri is one of the pathogenic species of the Leptospira genus. Human and animal infection from L. kirschneri gained further attention over the last few decades. Here we present the isolation and characterisation of Brazilian L. kirschneri serogroup Pomona serovar Mozdok strain M36/05 and the comparative genomic analysis with Brazilian human strain 61H. The M36/05 strain caused pulmonary hemorrhagic lesions in the hamster model, showing high virulence. The studied genomes presented high symmetrical identity and the in silico multilocus sequence typing analysis resulted in a new allelic profile (ST101) that so far has only been associated with the Brazilian L. kirschneri serogroup Pomona serovar Mozdok strains. Considering the environmental conditions and high genomic similarity observed between strains, we suggest the existence of a Brazilian L. kirschneri serogroup Pomona serovar Mozdok lineage that could represent a high public health risk; further studies are necessary to confirm the lineage significance and distribution.

  15. Comparative Genome Analysis of Filamentous Fungi Reveals Gene Family Expansions Associated with Fungal Pathogenesis

    PubMed Central

    Soanes, Darren M.; Alam, Intikhab; Cornell, Mike; Wong, Han Min; Hedeler, Cornelia; Paton, Norman W.; Rattray, Magnus; Hubbard, Simon J.; Oliver, Stephen G.; Talbot, Nicholas J.

    2008-01-01

    Fungi and oomycetes are the causal agents of many of the most serious diseases of plants. Here we report a detailed comparative analysis of the genome sequences of thirty-six species of fungi and oomycetes, including seven plant pathogenic species, that aims to explore the common genetic features associated with plant disease-causing species. The predicted translational products of each genome have been clustered into groups of potential orthologues using Markov Chain Clustering and the data integrated into the e-Fungi object-oriented data warehouse (http://www.e-fungi.org.uk/). Analysis of the species distribution of members of these clusters has identified proteins that are specific to filamentous fungal species and a group of proteins found only in plant pathogens. By comparing the gene inventories of filamentous, ascomycetous phytopathogenic and free-living species of fungi, we have identified a set of gene families that appear to have expanded during the evolution of phytopathogens and may therefore serve important roles in plant disease. We have also characterised the predicted set of secreted proteins encoded by each genome and identified a set of protein families which are significantly over-represented in the secretomes of plant pathogenic fungi, including putative effector proteins that might perturb host cell biology during plant infection. The results demonstrate the potential of comparative genome analysis for exploring the evolution of eukaryotic microbial pathogenesis. PMID:18523684

  16. Large-Scale Comparative Genomics Meta-Analysis of Campylobacter jejuni Isolates Reveals Low Level of Genome Plasticity

    PubMed Central

    Taboada, Eduardo N.; Acedillo, Rey R.; Carrillo, Catherine D.; Findlay, Wendy A.; Medeiros, Diane T.; Mykytczuk, Oksana L.; Roberts, Michael J.; Valencia, C. Alexander; Farber, Jeffrey M.; Nash, John H. E.

    2004-01-01

    We have used comparative genomic hybridization (CGH) on a full-genome Campylobacter jejuni microarray to examine genome-wide gene conservation patterns among 51 strains isolated from food and clinical sources. These data have been integrated with data from three previous C. jejuni CGH studies to perform a meta-analysis that included 97 strains from the four separate data sets. Although many genes were found to be divergent across multiple strains (n = 350), many genes (n = 249) were uniquely variable in single strains. Thus, the strains in each data set comprise strains with a unique genetic diversity not found in the strains in the other data sets. Despite the large increase in the collective number of variable C. jejuni genes (n = 599) found in the meta-analysis data set, nearly half of these (n = 276) mapped to previously defined variable loci, and it therefore appears that large regions of the C. jejuni genome are genetically stable. A detailed analysis of the microarray data revealed that divergent genes could be differentiated on the basis of the amplitudes of their differential microarray signals. Of 599 variable genes, 122 could be classified as highly divergent on the basis of CGH data. Nearly all highly divergent genes (117 of 122) had divergent neighbors and showed high levels of intraspecies variability. The approach outlined here has enabled us to distinguish global trends of gene conservation in C. jejuni and has enabled us to define this group of genes as a robust set of variable markers that can become the cornerstone of a new generation of genotyping methods that use genome-wide C. jejuni gene variability data. PMID:15472310

  17. Comparative genomic analysis of four representative plant growth-promoting rhizobacteria in Pseudomonas

    PubMed Central

    2013-01-01

    Background Some Pseudomonas strains function as predominant plant growth-promoting rhizobacteria (PGPR). Within this group, Pseudomonas chlororaphis and Pseudomonas fluorescens are non-pathogenic biocontrol agents, and some Pseudomonas aeruginosa and Pseudomonas stutzeri strains are PGPR. P. chlororaphis GP72 is a plant growth-promoting rhizobacterium with a fully sequenced genome. We conducted a genomic analysis comparing GP72 with three other pseudomonad PGPR: P. fluorescens Pf-5, P. aeruginosa M18, and the nitrogen-fixing strain P. stutzeri A1501. Our aim was to identify the similarities and differences among these strains using a comparative genomic approach to clarify the mechanisms of plant growth-promoting activity. Results The genome sizes of GP72, Pf-5, M18, and A1501 ranged from 4.6 to 7.1 M, and the number of protein-coding genes varied among the four species. Clusters of Orthologous Groups (COGs) analysis assigned functions to predicted proteins. The COGs distributions were similar among the four species. However, the percentage of genes encoding transposases and their inactivated derivatives (COG L) was 1.33% of the total genes with COGs classifications in A1501, 0.21% in GP72, 0.02% in Pf-5, and 0.11% in M18. A phylogenetic analysis indicated that GP72 and Pf-5 were the most closely related strains, consistent with the genome alignment results. Comparisons of predicted coding sequences (CDSs) between GP72 and Pf-5 revealed 3544 conserved genes. There were fewer conserved genes when GP72 CDSs were compared with those of A1501 and M18. Comparisons among the four Pseudomonas species revealed 603 conserved genes in GP72, illustrating common plant growth-promoting traits shared among these PGPR. Conserved genes were related to catabolism, transport of plant-derived compounds, stress resistance, and rhizosphere colonization. Some strain-specific CDSs were related to different kinds of biocontrol activities or plant growth promotion. The GP72 genome

  18. Comparative genome analysis across a kingdom of eukaryotic organisms: Specialization and diversification in the Fungi

    PubMed Central

    Cornell, Michael J.; Alam, Intikhab; Soanes, Darren M.; Wong, Han Min; Hedeler, Cornelia; Paton, Norman W.; Rattray, Magnus; Hubbard, Simon J.; Talbot, Nicholas J.; Oliver, Stephen G.

    2007-01-01

    The recent proliferation of genome sequencing in diverse fungal species has provided the first opportunity for comparative genome analysis across a eukaryotic kingdom. Here, we report a comparative study of 34 complete fungal genome sequences, representing a broad diversity of Ascomycete, Basidiomycete, and Zygomycete species. We have clustered all predicted protein-encoding gene sequences from these species to provide a means of investigating gene innovations, gene family expansions, protein family diversification, and the conservation of essential gene functions—empirically determined in Saccharomyces cerevisiae—among the fungi. The results are presented with reference to a phylogeny of the 34 fungal species, based on 29 universally conserved protein-encoding gene sequences. We contrast this phylogeny with one based on gene presence and absence and show that, while the two phylogenies are largely in agreement, there are differences in the positioning of some species. We have investigated levels of gene duplication and demonstrate that this varies greatly between fungal species, although there are instances of coduplication in distantly related fungi. We have also investigated the extent of orthology for protein families and demonstrate unexpectedly high levels of diversity among genes involved in lipid metabolism. These analyses have been collated in the e-Fungi data warehouse, providing an online resource for comparative genomic analysis of the fungi. PMID:17984228

  19. Comparative Analysis of Genomics and Proteomics in Bacillus thuringiensis 4.0718

    PubMed Central

    Rang, Jie; He, Hao; Wang, Ting; Ding, Xuezhi; Zuo, Mingxing; Quan, Meifang; Sun, Yunjun; Yu, Ziquan; Hu, Shengbiao; Xia, Liqiu

    2015-01-01

    Bacillus thuringiensis is a widely used biopesticide that produces various insecticidal active substances during its life cycle. Separation and purification of numerous insecticidal active substances have been difficult because of the relatively short half-life of such substances. In addition, substances can be synthesized at different times during development, so samples at different stages have to be studied, further complicating the analysis. A dual genomic and proteomic approach, particularly one using mass spectrometry-based proteomic methods, would enhance our ability to identify such substances. The comparative analysis of genomic and proteomic data showed that not all of the products deduced from the annotated genome could be identified among the proteomic data. For instance, genome annotation results showed that 39 coding sequences in the whole genome were related to insect pathogenicity, including five cry genes. However, Cry2Ab, Cry1Ia, Cytotoxin K, Bacteriocin, Exoenzyme C3 and Alveolysin could not be detected in the proteomic data obtained. The sporulation-related proteins were also compared, and the results showed that the great majority of sporulation-related proteins could be detected by mass spectrometry. This analysis revealed that Spo0A~P, SigF, SigE(+), SigK(+) and SigG(+), all known to play an important role in the regulatory network of spore formation, were also present in the proteomic data. Through the comparison of the two data sets, it was possible to infer that some genes were silenced or were expressed at very low levels. For instance, cry2Ab seems to lack a functional promoter, while cry1Ia may not be expressed due to the presence of transposons. With this comparative study a relatively complete database can be constructed and used to transform hereditary material, thereby prompting the high expression of toxic proteins. A theoretical basis is provided for constructing highly virulent engineered bacteria and for

  20. Complete genomic sequences and comparative analysis of Mamestra brassicae nucleopolyhedrovirus isolated in Korea.

    PubMed

    Choi, Jae Bang; Heo, Won Il; Shin, Tae Young; Bae, Sung Min; Kim, Woo Jin; Kim, Ju Il; Kwon, Min; Choi, Jae Young; Je, Yeon Ho; Jin, Byung Rae; Woo, Soo Dong

    2013-08-01

    Mamestra brassicae nucleopolyhedrovirus-K1 (MabrNPV-K1) was isolated from naturally infected M. brassicae (Lepidoptera: Noctuidae) larvae in Korea. The full genome sequence of MabrNPV-K1 was determined, analysed and compared to those of other baculoviruses. The MabrNPV-K1 genome consisted of 152,710 bp and had an overall G + C content of 39.9%. Computer-assisted analysis predicted 158 open reading frames (ORFs) of 150 nucleotides or greater that showed minimal overlap. Two inhibitor of apoptosis (iap) and six baculovirus repeated ORFs were interspersed in the MabrNPV-K1 genome. A unique ORF, ORF133, not previously reported in baculoviruses, was identified in the MabrNPV-K1 genome. The gene content and arrangement in MabrNPV-K1 had the highest similarity with those of Helicoverpa armigera MNPV (HearMNPV) and Mamestra configurata NPV-B (MacoNPV-B), and their shared homologous genes were 99% collinear. The MabrNPV-K1 genome contained four homologous repeat regions (hr1, hr2, hr3 and hr4) that accounted for 3.3% of the genome. The genomic positions of the four MabrNPV-K1 hr regions were conserved among those of HearMNPV and MacoNPV-B. The gene parity plot, percent identity of the gene homologues and a phylogenetic analysis suggested that these three viruses are closely related to each other and are strains of the same virus rather than different virus species.

  1. Comparative analysis of genome maintenance genes in naked mole rat, mouse, and human.

    PubMed

    MacRae, Sheila L; Zhang, Quanwei; Lemetre, Christophe; Seim, Inge; Calder, Robert B; Hoeijmakers, Jan; Suh, Yousin; Gladyshev, Vadim N; Seluanov, Andrei; Gorbunova, Vera; Vijg, Jan; Zhang, Zhengdong D

    2015-04-01

    Genome maintenance (GM) is an essential defense system against aging and cancer, as both are characterized by increased genome instability. Here, we compared the copy number variation and mutation rate of 518 GM-associated genes in the naked mole rat (NMR), mouse, and human genomes. GM genes appeared to be strongly conserved, with copy number variation in only four genes. Interestingly, we found NMR to have a higher copy number of CEBPG, a regulator of DNA repair, and TINF2, a protector of telomere integrity. NMR, as well as human, was also found to have a lower rate of germline nucleotide substitution than the mouse. Together, the data suggest that the long-lived NMR, as well as human, has more robust GM than mouse and identify new targets for the analysis of the exceptional longevity of the NMR.

  2. Complete genome sequence and comparative genome analysis of a new special Yersinia enterocolitica.

    PubMed

    Shi, Guoxiang; Su, Mingming; Liang, Junrong; Duan, Ran; Gu, Wenpeng; Xiao, Yuchun; Zhang, Zhewen; Qiu, Haiyan; Zhang, Zheng; Li, Yi; Zhang, Xiaohe; Ling, Yunchao; Song, Lai; Chen, Meili; Zhao, Yongbing; Wu, Jiayan; Jing, Huaiqi; Xiao, Jingfa; Wang, Xin

    2016-09-01

    Yersinia enterocolitica is the most diverse species in the genus Yersinia and shows more polymorphism, especially among the non-pathogenic strains. Individual non-pathogenic Y. enterocolitica strains are wrongly identified because of atypical phenotypes. In this study, we isolated an unusual Y. enterocolitica strain, LC20, from Rattus norvegicus. The strain did not utilize urea and could not be assigned to any biotype. API 20E identified it as Escherichia coli; however, it grew well at 25 °C, whereas E. coli grows well at 37 °C. We analyzed the genome of LC20 and found that the whole chromosome of LC20 was collinear with that of Y. enterocolitica 8081, and that the urease gene was absent from the genome, which is consistent with the result of API 20E. Also, the 16S and 23S rRNA genes of LC20 lay on a branch of Y. enterocolitica. Furthermore, the core-based and pan-based phylogenetic trees showed that LC20 was classified into the Y. enterocolitica cluster. Two plasmids (80 and 50 k) from LC20 shared low genetic homology with pYV from the Yersinia genus; one was an ancestral Yersinia plasmid and the other was novel, encoding a number of transposases. Some pathogenic and non-pathogenic Y. enterocolitica-specific genes coexisted in LC20. Thus, although it could not be classified into any Y. enterocolitica biotype due to its special biochemical metabolism, we concluded that LC20 was a Y. enterocolitica strain because its genome was similar to those of other Y. enterocolitica strains, and that it might be a strain with many mutations and combinations that emerged during its evolution.

  3. The Genome of Nosema sp. Isolate YNPr: A Comparative Analysis of Genome Evolution within the Nosema/Vairimorpha Clade

    PubMed Central

    Ma, Zhenggang; Li, Tian; Zhang, Xiaoyan; Debrunner-Vossbrinck, Bettina A.; Zhou, Zeyang; Vossbrinck, Charles R.

    2016-01-01

    The microsporidian parasite designated here as Nosema sp. Isolate YNPr was isolated from the cabbage butterfly Pieris rapae collected in Honghe Prefecture, Yunnan Province, China. The genome was sequenced by Illumina sequencing and compared to those of two related members of the Nosema/Vairimorpha clade, Nosema ceranae and Nosema apis. Based upon assembly statistics, the Nosema sp. YNPr genome is 3.36 x 10^6 bp with a G+C content of 23.18% and 2,075 protein coding sequences. An “ACCCTT” motif is present approximately 50-bp upstream of the start codon, as reported from other members of the clade and from Encephalitozoon cuniculi, a sister taxon. Comparative small subunit ribosomal DNA (SSU rDNA) analysis as well as genome-wide phylogenetic analysis confirms a closer relationship between N. ceranae and Nosema sp. YNPr than between the two honeybee parasites N. ceranae and N. apis. The more closely related N. ceranae and Nosema sp. YNPr show similarities in a number of structural characteristics such as gene synteny, gene length, gene number, transposon composition and gene reduction. Based on transposable element content of the assemblies, the transposon content of Nosema sp. YNPr is 4.8%, that of N. ceranae is 3.7%, and that of N. apis is 2.5%, with large differences in the types of transposons present among these 3 species. Gene function annotation indicates that the number of genes participating in most metabolic activities is similar in all three species. However, the number of genes in the transcription, general function, and cysteine protease categories is greater in N. apis than in the other two species. Our studies further characterize the evolution of the Nosema/Vairimorpha clade of microsporidia. These organisms maintain variable but very reduced genomes. We are interested in understanding the effects of genetic drift versus natural selection on genome size in the microsporidia and in developing a testable hypothesis for further studies on the genomic

  4. The Methanosarcina barkeri genome: comparative analysis with Methanosarcina acetivorans and Methanosarcina mazei reveals extensive rearrangement within methanosarcinal genomes

    SciTech Connect

    Maeder, Dennis L.; Anderson, Iain; Brettin, Thomas S.; Bruce, David C.; Gilna, Paul; Han, Cliff S.; Lapidus, Alla; Metcalf, William W.; Saunders, Elizabeth; Tapia, Roxanne; Sowers, Kevin R.

    2006-05-19

    We report here a comparative analysis of the genome sequence of Methanosarcina barkeri with those of Methanosarcina acetivorans and Methanosarcina mazei. All three genomes share a conserved double origin of replication and many gene clusters. M. barkeri is distinguished by having an organization that is well conserved with respect to the other Methanosarcinae in the region proximal to the origin of replication with interspecies gene similarities as high as 95%. However it is disordered and marked by increased transposase frequency and decreased gene synteny and gene density in the proximal semi-genome. Of the 3680 open reading frames in M. barkeri, 678 had paralogs with better than 80% similarity to both M. acetivorans and M. mazei while 128 nonhypothetical orfs were unique (non-paralogous) amongst these species including a complete formate dehydrogenase operon, two genes required for N-acetylmuramic acid synthesis, a 14 gene gas vesicle cluster and a bacterial P450-specific ferredoxin reductase cluster not previously observed or characterized in this genus. A cryptic 36 kbp plasmid sequence was detected in M. barkeri that contains an orc1 gene flanked by a presumptive origin of replication consisting of 38 tandem repeats of a 143 nt motif. Three-way comparison of these genomes reveals differing mechanisms for the accrual of changes. Elongation of the large M. acetivorans is the result of multiple gene-scale insertions and duplications uniformly distributed in that genome, while M. barkeri is characterized by localized inversions associated with the loss of gene content. In contrast, the relatively short M. mazei most closely approximates the ancestral organizational state.

  5. Comparative genomic analysis of Lactobacillus plantarum ZJ316 reveals its genetic adaptation and potential probiotic profiles

    PubMed Central

    Li, Ping; Li, Xuan; Gu, Qing; Lou, Xiu-yu; Zhang, Xiao-mei; Song, Da-feng; Zhang, Chen

    2016-01-01

    Objective: In previous studies, Lactobacillus plantarum ZJ316 showed probiotic properties, such as antimicrobial activity against various pathogens and the capacity to significantly improve pig growth and pork quality. The purpose of this study was to reveal the genes potentially related to its genetic adaptation and probiotic profiles based on comparative genomic analysis. Methods: The genome sequence of L. plantarum ZJ316 was compared with those of eight L. plantarum strains deposited in GenBank. BLASTN, Mauve, and MUMmer programs were used for genome alignment and comparison. CRISPRFinder was applied for searching the clustered regularly interspaced short palindromic repeats (CRISPRs). Results: We identified genes that encode proteins related to genetic adaptation and probiotic profiles, including carbohydrate transport and metabolism, proteolytic enzyme systems and amino acid biosynthesis, CRISPR adaptive immunity, stress responses, bile salt resistance, ability to adhere to the host intestinal wall, exopolysaccharide (EPS) biosynthesis, and bacteriocin biosynthesis. Conclusions: Comparative characterization of the L. plantarum ZJ316 genome provided the genetic basis for further elucidating the functional mechanisms of its probiotic properties. ZJ316 could be considered a potential probiotic candidate. PMID:27487802

  6. Comparative Analysis of 35 Basidiomycete Genomes Reveals Diversity and Uniqueness of the Phylum

    SciTech Connect

    Riley, Robert; Salamov, Asaf; Otillar, Robert; Fagnan, Kirsten; Boussau, Bastien; Brown, Daren; Henrissat, Bernard; Levasseur, Anthony; Held, Benjamin; Nagy, Laszlo; Floudas, Dimitris; Morin, Emmanuelle; Manning, Gerard; Baker, Scott; Martin, Francis; Blanchette, Robert; Hibbett, David; Grigoriev, Igor V.

    2013-03-11

    Fungi of the phylum Basidiomycota (basidiomycetes) make up some 37% of the described fungi and are important in forestry, agriculture, medicine, and bioenergy. This diverse phylum includes symbionts, pathogens, and saprobes including wood decaying fungi. To better understand the diversity of this phylum we compared the genomes of 35 basidiomycete fungi including 6 newly sequenced genomes. The genomes of basidiomycetes span extremes of genome size, gene number, and repeat content. A phylogenetic tree of Basidiomycota was generated using the Phyldog software, which uses all available protein sequence data to simultaneously infer gene and species trees. Analysis of core genes reveals that some 48% of basidiomycete proteins are unique to the phylum with nearly half of those (22%) comprising proteins found in only one organism. Phylogenetic patterns of plant biomass-degrading genes suggest a continuum rather than a sharp dichotomy between the white rot and brown rot modes of wood decay among the members of Agaricomycotina subphylum. There is a correlation of the profile of certain gene families to nutritional mode in Agaricomycotina. Based on phylogenetically-informed PCA analysis of such profiles, we predict that Botryobasidium botryosum and Jaapia argillacea have properties similar to white rot species, although neither has ligninolytic class II fungal peroxidases. Furthermore, we find that both fungi exhibit wood decay with white rot-like characteristics in growth assays. Analysis of the rate of discovery of proteins with no or few homologs suggests the high value of continued sequencing of basidiomycete fungi.

  7. Comparative genomic analysis of multiple strains of two unusual plant pathogens: Pseudomonas corrugata and Pseudomonas mediterranea.

    PubMed

    Trantas, Emmanouil A; Licciardello, Grazia; Almeida, Nalvo F; Witek, Kamil; Strano, Cinzia P; Duxbury, Zane; Ververidis, Filippos; Goumas, Dimitrios E; Jones, Jonathan D G; Guttman, David S; Catara, Vittoria; Sarris, Panagiotis F

    2015-01-01

    The non-fluorescent pseudomonads, Pseudomonas corrugata (Pcor) and P. mediterranea (Pmed), are closely related species that cause pith necrosis, a disease of tomato that causes severe crop losses. However, they also show strong antagonistic effects against economically important pathogens, demonstrating their potential for utilization as biological control agents. In addition, their metabolic versatility makes them attractive for the production of commercial biomolecules and bioremediation. An extensive comparative genomics study is required to dissect the mechanisms that Pcor and Pmed employ to cause disease, prevent disease caused by other pathogens, and to mine their genomes for genes that encode proteins involved in commercially important chemical pathways. Here, we present the draft genomes of nine Pcor and Pmed strains from different geographical locations. This analysis covered significant genetic heterogeneity and allowed in-depth genomic comparison. All examined strains were able to trigger symptoms in tomato plants but not all induced a hypersensitive-like response in Nicotiana benthamiana. Genome-mining revealed the absence of type III secretion system and known type III effector-encoding genes from all examined Pcor and Pmed strains. The lack of a type III secretion system appears to be unique among the plant pathogenic pseudomonads. Several gene clusters coding for type VI secretion system were detected in all genomes. Genome-mining also revealed the presence of gene clusters for biosynthesis of siderophores, polyketides, non-ribosomal peptides, and hydrogen cyanide. A highly conserved quorum sensing system was detected in all strains, although species specific differences were observed. Our study provides the basis for in-depth investigations regarding the molecular mechanisms underlying virulence strategies in the battle between plants and microbes.

  8. Genome Sequencing and Comparative Analysis of the Biocontrol Agent Trichoderma harzianum sensu stricto TR274

    SciTech Connect

    Steindorff, Andrei S.; Noronha, Elilane F.; Ulhoa, Cirano J.; Kuo, Alan; Salamov, Asaf A.; Haridas, Sajeet; Riley, Robert W.; Druzhinina, Irina S.; Kubicek, Christian P.; Grigoriev, Igor V.

    2015-03-17

    Biological control is a complex process which requires many mechanisms and a high diversity of biochemical pathways. The species of Trichoderma harzianum are well known for their biocontrol activity against many plant pathogens. To gain new insights into the biocontrol mechanism used by T. harzianum, we sequenced the isolate TR274 genome using Illumina. The assembly was performed using AllPaths-LG with a maximum coverage of 100x. The assembly resulted in 2282 contigs with an N50 of 37,033 bp. The genome size generated was 40.8 Mb and the GC content was 47.7%, similar to other Trichoderma genomes. Using the JGI Annotation Pipeline we predicted 13,932 genes with a high transcriptome support. CEGMA tests suggested 100% genome completeness and 97.9% of RNA-SEQ reads were mapped to the genome. The phylogenetic comparison using orthologous proteins with all Trichoderma genomes sequenced at JGI corroborates the Trichoderma (T. asperellum and T. atroviride), Longibrachiatum (T. reesei and T. longibrachiatum) and Pachybasium (T. harzianum and T. virens) section division described previously. The comparison between two Trichoderma harzianum species suggests a high genome similarity but some strain-specific expansions. Analyses of the secondary metabolites, CAZymes, transporters, proteases, and transcription factors were performed. The Pachybasium section expanded virtually all categories analyzed compared with the other sections, especially the Longibrachiatum section, which shows a clear contraction. These results suggest that these protein families have an important role in their respective phenotypes. Future analysis will improve the understanding of this complex genus and give some insights about its lifestyle and the interactions with the environment.

  9. Comparative genomic analysis of multiple strains of two unusual plant pathogens: Pseudomonas corrugata and Pseudomonas mediterranea

    PubMed Central

    Trantas, Emmanouil A.; Licciardello, Grazia; Almeida, Nalvo F.; Witek, Kamil; Strano, Cinzia P.; Duxbury, Zane; Ververidis, Filippos; Goumas, Dimitrios E.; Jones, Jonathan D. G.; Guttman, David S.; Catara, Vittoria; Sarris, Panagiotis F.

    2015-01-01

    The non-fluorescent pseudomonads, Pseudomonas corrugata (Pcor) and P. mediterranea (Pmed), are closely related species that cause pith necrosis, a disease of tomato that causes severe crop losses. However, they also show strong antagonistic effects against economically important pathogens, demonstrating their potential for utilization as biological control agents. In addition, their metabolic versatility makes them attractive for the production of commercial biomolecules and bioremediation. An extensive comparative genomics study is required to dissect the mechanisms that Pcor and Pmed employ to cause disease, prevent disease caused by other pathogens, and to mine their genomes for genes that encode proteins involved in commercially important chemical pathways. Here, we present the draft genomes of nine Pcor and Pmed strains from different geographical locations. This analysis covered significant genetic heterogeneity and allowed in-depth genomic comparison. All examined strains were able to trigger symptoms in tomato plants but not all induced a hypersensitive-like response in Nicotiana benthamiana. Genome-mining revealed the absence of type III secretion system and known type III effector-encoding genes from all examined Pcor and Pmed strains. The lack of a type III secretion system appears to be unique among the plant pathogenic pseudomonads. Several gene clusters coding for type VI secretion system were detected in all genomes. Genome-mining also revealed the presence of gene clusters for biosynthesis of siderophores, polyketides, non-ribosomal peptides, and hydrogen cyanide. A highly conserved quorum sensing system was detected in all strains, although species specific differences were observed. Our study provides the basis for in-depth investigations regarding the molecular mechanisms underlying virulence strategies in the battle between plants and microbes. PMID:26300874

  10. Comparative genomic analysis of Drosophila melanogaster and vector mosquito developmental genes.

    PubMed

    Behura, Susanta K; Haugen, Morgan; Flannery, Ellen; Sarro, Joseph; Tessier, Charles R; Severson, David W; Duman-Scheel, Molly

    2011-01-01

    Genome sequencing projects have presented the opportunity for analysis of developmental genes in three vector mosquito species: Aedes aegypti, Culex quinquefasciatus, and Anopheles gambiae. A comparative genomic analysis of developmental genes in Drosophila melanogaster and these three important vectors of human disease was performed in this investigation. While the study was comprehensive, special emphasis centered on genes that 1) are components of developmental signaling pathways, 2) regulate fundamental developmental processes, 3) are critical for the development of tissues of vector importance, 4) function in developmental processes known to have diverged within insects, and 5) encode microRNAs (miRNAs) that regulate developmental transcripts in Drosophila. While most fruit fly developmental genes are conserved in the three vector mosquito species, several genes known to be critical for Drosophila development were not identified in one or more mosquito genomes. In other cases, mosquito lineage-specific gene gains with respect to D. melanogaster were noted. Sequence analyses also revealed that numerous repetitive sequences are a common structural feature of Drosophila and mosquito developmental genes. Finally, analysis of predicted miRNA binding sites in fruit fly and mosquito developmental genes suggests that the repertoire of developmental genes targeted by miRNAs is species-specific. The results of this study provide insight into the evolution of developmental genes and processes in dipterans and other arthropods, serve as a resource for those pursuing analysis of mosquito development, and will promote the design and refinement of functional analysis experiments.

  11. Comparative 3D genome structure analysis of the fission and the budding yeast.

    PubMed

    Gong, Ke; Tjong, Harianto; Zhou, Xianghong Jasmine; Alber, Frank

    2015-01-01

    We studied the 3D structural organization of the fission yeast genome, which emerges from the tethering of heterochromatic regions in otherwise randomly configured chromosomes represented as flexible polymer chains in a nuclear environment. This model is sufficient to explain in a statistical manner many experimentally determined distinctive features of the fission yeast genome, including chromatin interaction patterns from Hi-C experiments and the co-locations of functionally related and co-expressed genes, such as genes expressed by Pol-III. Our findings demonstrate that some previously described structure-function correlations can be explained as a consequence of random chromatin collisions driven by a few geometric constraints (mainly due to centromere-SPB and telomere-NE tethering) combined with the specific gene locations in the chromosome sequence. We also performed a comparative analysis between the fission and budding yeast genome structures, for which we previously detected a similar organizing principle. However, due to the different chromosome sizes and numbers, substantial differences are observed in the 3D structural genome organization between the two species, most notably in the nuclear locations of orthologous genes, and the extent of nuclear territories for genes and chromosomes. However, despite those differences, remarkably, functional similarities are maintained, which is evident when comparing spatial clustering of functionally related genes in both yeasts. Functionally related genes show a similar spatial clustering behavior in both yeasts, even though their nuclear locations are largely different between the yeast species.

  12. Comparative Genomic Analysis among Four Representative Isolates of Phytophthora sojae Reveals Genes under Evolutionary Selection

    PubMed Central

    Ye, Wenwu; Wang, Yang; Tyler, Brett M.; Wang, Yuanchao

    2016-01-01

    Comparative genomic analysis is useful for identifying genes affected by evolutionary selection and for studying adaptive variation in gene functions. Such studies are lacking for Phytophthora sojae, a model oomycete plant pathogen. We compared sequence data among four isolates of P. sojae, which represent its four major genotypes. These isolates exhibited >99.688%, >99.864%, and >98.981% sequence identities at genome, gene, and non-gene regions, respectively. One hundred and fifty-three positive selection and 139 negative selection candidate genes were identified. Between the two categories of genes, the positive selection genes were flanked by larger intergenic regions, poorly annotated in function, and less conserved; they had relatively lower transcription levels but many genes had increased transcripts during infection. Genes coding for predicted secreted proteins, particularly effectors, were overrepresented in positive selection. Several RxLR effector genes were identified as positive selection genes, exhibiting much stronger positive selection levels. In addition, candidate genes with presence/absence polymorphism were analyzed. This study provides a landscape of genomic variation among four representative P. sojae isolates and characterizes several evolutionary selection-affected gene candidates. The results suggest a relatively covert two-speed genome evolution pattern in P. sojae and will provide clues for identification of new virulence factors in the oomycete plant pathogens. PMID:27746768

  13. Comparative genome analysis of Prevotella ruminicola and Prevotella bryantii: insights into their environmental niche.

    PubMed

    Purushe, Janaki; Fouts, Derrick E; Morrison, Mark; White, Bryan A; Mackie, Roderick I; Coutinho, Pedro M; Henrissat, Bernard; Nelson, Karen E

    2010-11-01

    The Prevotellas comprise a diverse group of bacteria that has received surprisingly limited attention at the whole genome-sequencing level. In this communication, we present the comparative analysis of the genomes of Prevotella ruminicola 23 (GenBank: CP002006) and Prevotella bryantii B(1)4 (GenBank: ADWO00000000), two gastrointestinal isolates. Both P. ruminicola and P. bryantii have acquired an extensive repertoire of glycoside hydrolases that are targeted towards non-cellulosic polysaccharides, especially GH43 bifunctional enzymes. Our analysis demonstrates the diversity of this genus. The results from these analyses highlight their role in the gastrointestinal tract, and provide a template for additional work on genetic characterization of these species.

  14. Whole-Genome Sequencing and Comparative Genome Analysis of Bacillus subtilis Strains Isolated from Non-Salted Fermented Soybean Foods.

    PubMed

    Kamada, Mayumi; Hase, Sumitaka; Fujii, Kazushi; Miyake, Masato; Sato, Kengo; Kimura, Keitarou; Sakakibara, Yasubumi

    2015-01-01

    Bacillus subtilis is the main component in the fermentation of soybeans. To investigate the genetics of the soybean-fermenting B. subtilis strains and its relationship with the productivity of extracellular poly-γ-glutamic acid (γPGA), we sequenced the whole genome of eight B. subtilis strains isolated from non-salted fermented soybean foods in Southeast Asia. Assembled nucleotide sequences were compared with those of a natto (fermented soybean food) starter strain B. subtilis BEST195 and the laboratory standard strain B. subtilis 168 that is incapable of γPGA production. Detected variants were investigated in terms of insertion sequences, biotin synthesis, production of subtilisin NAT, and regulatory genes for γPGA synthesis, which were related to fermentation process. Comparing genome sequences, we found that the strains that produce γPGA have a deletion in a protein that constitutes the flagellar basal body, and this deletion was not found in the non-producing strains. We further identified diversity in variants of the bio operon, which is responsible for the biotin auxotrophism of the natto starter strains. Phylogenetic analysis using multilocus sequencing typing revealed that the B. subtilis strains isolated from the non-salted fermented soybeans were not clustered together, while the natto-fermenting strains were tightly clustered; this analysis also suggested that the strain isolated from "Tua Nao" of Thailand traces a different evolutionary process from other strains.

  15. Whole-Genome Sequencing and Comparative Genome Analysis of Bacillus subtilis Strains Isolated from Non-Salted Fermented Soybean Foods

    PubMed Central

    Kamada, Mayumi; Hase, Sumitaka; Fujii, Kazushi; Miyake, Masato; Sato, Kengo; Kimura, Keitarou; Sakakibara, Yasubumi

    2015-01-01

    Bacillus subtilis is the main component in the fermentation of soybeans. To investigate the genetics of the soybean-fermenting B. subtilis strains and its relationship with the productivity of extracellular poly-γ-glutamic acid (γPGA), we sequenced the whole genome of eight B. subtilis strains isolated from non-salted fermented soybean foods in Southeast Asia. Assembled nucleotide sequences were compared with those of a natto (fermented soybean food) starter strain B. subtilis BEST195 and the laboratory standard strain B. subtilis 168 that is incapable of γPGA production. Detected variants were investigated in terms of insertion sequences, biotin synthesis, production of subtilisin NAT, and regulatory genes for γPGA synthesis, which were related to fermentation process. Comparing genome sequences, we found that the strains that produce γPGA have a deletion in a protein that constitutes the flagellar basal body, and this deletion was not found in the non-producing strains. We further identified diversity in variants of the bio operon, which is responsible for the biotin auxotrophism of the natto starter strains. Phylogenetic analysis using multilocus sequencing typing revealed that the B. subtilis strains isolated from the non-salted fermented soybeans were not clustered together, while the natto-fermenting strains were tightly clustered; this analysis also suggested that the strain isolated from “Tua Nao” of Thailand traces a different evolutionary process from other strains. PMID:26505996

  16. Evolution of Carbapenem-Resistant Acinetobacter baumannii Revealed through Whole-Genome Sequencing and Comparative Genomic Analysis

    PubMed Central

    Li, Henan; Liu, Fei; Zhang, Yawei; Wang, Xiaojuan; Zhao, Chunjiang; Chen, Hongbin; Zhang, Feifei; Zhu, Baoli

    2014-01-01

    Acinetobacter baumannii is a globally important nosocomial pathogen characterized by an evolving multidrug resistance. A total of 35 representative clinical A. baumannii strains isolated from 13 hospitals in nine cities in China from 1999 to 2011, including 32 carbapenem-resistant and 3 carbapenem-susceptible A. baumannii strains, were selected for whole-genome sequencing and comparative genomic analysis. Phylogenetic analysis revealed that the earliest strain, strain 1999BJAB11, and two strains isolated in Zhejiang Province in 2004 were the founder strains of carbapenem-resistant A. baumannii. Ten types of AbaR resistance islands were identified, and a previously unreported AbaR island, which comprised a two-component response regulator, resistance-related proteins, and RND efflux system proteins, was identified in two strains isolated in Zhejiang in 2004. Multiple transposons or insertion sequences (ISs) existed in each strain, and these gradually tended to diversify with evolution. Some of these IS elements or transposons were the first to be reported, and most of them were mainly found in strains from two provinces. Genome feature analysis illustrated diversified resistance genes, surface polysaccharides, and a restriction-modification system, even in strains that were phylogenetically and epidemiologically very closely related. IS-mediated deletions were identified in the type VI secretion system region, the csuE region, and core lipooligosaccharide (LOS) loci. Recombination occurred in the heme utilization region, and intrinsic resistance genes (blaADC and blaOXA-51-like variants) and three novel blaOXA-51-like variants (blaOXA-424, blaOXA-425, and blaOXA-426) were identified. Our results could improve the understanding of the evolutionary processes that contribute to the emergence of carbapenem-resistant A. baumannii strains and help elucidate the molecular evolutionary mechanism in A. baumannii. PMID:25487793

  17. Genome wide characterization of simple sequence repeats in watermelon genome and their application in comparative mapping and genetic diversity analysis

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Simple sequence repeats (SSR) or microsatellite markers are one of the most informative and versatile DNA-based markers. The use of next-generation sequencing technologies allow whole genome sequencing and make it possible to develop large numbers of SSRs through bioinformatic analysis of genome da...

  18. A Comparative Analysis of the Lyve-SET Phylogenomics Pipeline for Genomic Epidemiology of Foodborne Pathogens.

    PubMed

    Katz, Lee S; Griswold, Taylor; Williams-Newkirk, Amanda J; Wagner, Darlene; Petkau, Aaron; Sieffert, Cameron; Van Domselaar, Gary; Deng, Xiangyu; Carleton, Heather A

    2017-01-01

    Modern epidemiology of foodborne bacterial pathogens in industrialized countries relies increasingly on whole genome sequencing (WGS) techniques. As opposed to profiling techniques such as pulsed-field gel electrophoresis, WGS requires a variety of computational methods. Since 2013, United States agencies responsible for food safety including the CDC, FDA, and USDA, have been performing whole-genome sequencing (WGS) on all Listeria monocytogenes found in clinical, food, and environmental samples. Each year, more genomes of other foodborne pathogens such as Escherichia coli, Campylobacter jejuni, and Salmonella enterica are being sequenced. Comparing thousands of genomes across an entire species requires a fast method with coarse resolution; however, capturing the fine details of highly related isolates requires a computationally heavy and sophisticated algorithm. Most L. monocytogenes investigations employing WGS depend on being able to identify an outbreak clade whose inter-genomic distances are less than an empirically determined threshold. When the difference between a few single nucleotide polymorphisms (SNPs) can help distinguish between genomes that are likely outbreak-associated and those that are less likely to be associated, we require a fine-resolution method. To achieve this level of resolution, we have developed Lyve-SET, a high-quality SNP pipeline. We evaluated Lyve-SET by retrospectively investigating 12 outbreak data sets along with four other SNP pipelines that have been used in outbreak investigation or similar scenarios. To compare these pipelines, several distance and phylogeny-based comparison methods were applied, which collectively showed that multiple pipelines were able to identify most outbreak clusters and strains. Currently in the US PulseNet system, whole genome multi-locus sequence typing (wgMLST) is the preferred primary method for foodborne WGS cluster detection and outbreak investigation due to its ability to name standardized
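
    The idea of an outbreak clade whose inter-genomic distances fall below an empirically determined threshold can be illustrated with a minimal sketch. This is not Lyve-SET or any of the pipelines evaluated in the record above; the isolate names, toy sequences, and the `max_snps` value are hypothetical, and the sequences are assumed to be already aligned to equal length.

```python
from itertools import combinations


def snp_distance(a: str, b: str) -> int:
    """Count positions where two aligned sequences differ.

    Positions where either sequence has a non-ACGT character
    (for example N or a gap) are ignored.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for x, y in zip(a.upper(), b.upper())
        if x != y and x in "ACGT" and y in "ACGT"
    )


def pairs_within_threshold(alignment: dict[str, str], max_snps: int):
    """Return (name1, name2, distance) for pairs at or below max_snps."""
    hits = []
    for (n1, s1), (n2, s2) in combinations(sorted(alignment.items()), 2):
        d = snp_distance(s1, s2)
        if d <= max_snps:
            hits.append((n1, n2, d))
    return hits


# Hypothetical toy alignment, not real isolate data.
toy = {
    "isolate_A": "ACGTACGT",
    "isolate_B": "ACGTACGA",
    "isolate_C": "TCGAACTA",
}
print(pairs_within_threshold(toy, max_snps=2))
# [('isolate_A', 'isolate_B', 1)]
```

    In practice the threshold is chosen empirically per organism and investigation, as the abstract notes, so the value used here carries no biological meaning.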

  19. A Comparative Analysis of the Lyve-SET Phylogenomics Pipeline for Genomic Epidemiology of Foodborne Pathogens

    PubMed Central

    Katz, Lee S.; Griswold, Taylor; Williams-Newkirk, Amanda J.; Wagner, Darlene; Petkau, Aaron; Sieffert, Cameron; Van Domselaar, Gary; Deng, Xiangyu; Carleton, Heather A.

    2017-01-01

    Modern epidemiology of foodborne bacterial pathogens in industrialized countries relies increasingly on whole genome sequencing (WGS) techniques. As opposed to profiling techniques such as pulsed-field gel electrophoresis, WGS requires a variety of computational methods. Since 2013, United States agencies responsible for food safety including the CDC, FDA, and USDA, have been performing whole-genome sequencing (WGS) on all Listeria monocytogenes found in clinical, food, and environmental samples. Each year, more genomes of other foodborne pathogens such as Escherichia coli, Campylobacter jejuni, and Salmonella enterica are being sequenced. Comparing thousands of genomes across an entire species requires a fast method with coarse resolution; however, capturing the fine details of highly related isolates requires a computationally heavy and sophisticated algorithm. Most L. monocytogenes investigations employing WGS depend on being able to identify an outbreak clade whose inter-genomic distances are less than an empirically determined threshold. When the difference between a few single nucleotide polymorphisms (SNPs) can help distinguish between genomes that are likely outbreak-associated and those that are less likely to be associated, we require a fine-resolution method. To achieve this level of resolution, we have developed Lyve-SET, a high-quality SNP pipeline. We evaluated Lyve-SET by retrospectively investigating 12 outbreak data sets along with four other SNP pipelines that have been used in outbreak investigation or similar scenarios. To compare these pipelines, several distance and phylogeny-based comparison methods were applied, which collectively showed that multiple pipelines were able to identify most outbreak clusters and strains. Currently in the US PulseNet system, whole genome multi-locus sequence typing (wgMLST) is the preferred primary method for foodborne WGS cluster detection and outbreak investigation due to its ability to name standardized

  20. Plasmodium vivax apicoplast genome: a comparative analysis of major genes from Indian field isolates.

    PubMed

    Saxena, Vishal; Garg, Shilpi; Tripathi, Jyotsna; Sharma, Sonal; Pakalapati, Deepak; Subudhi, Amit K; Boopathi, P A; Saggu, Gagandeep S; Kochar, Sanjay K; Kochar, Dhanpat K; Das, Ashis

    2012-04-01

    The apicomplexan parasite Plasmodium vivax is responsible for causing more than 70% of human malaria cases in Central and South America, Southeastern Asia and the Indian subcontinent. The rising severity of the disease and the increasing incidences of resistance shown by this parasite towards usual therapeutic regimens have necessitated investigation of putative novel drug targets to combat this disease. The apicoplast, an organelle of procaryotic origin, and its circular genome carrying genes of possible functional importance, are being looked upon as potential drug targets. The genes on this circular genome are believed to be highly conserved among all Plasmodium species. To date, the plastid genomes of P. falciparum, P. berghei and P. chabaudi have been detailed while partial sequences of some genes from other parasites including P. vivax have been studied for identifying evolutionary positions of these parasites. The functional aspects and significance of most of these genes are still hypothetical. In one of our previous reports, we have detailed the complete sequence, as well as structural and functional characteristics of the Elongation factor encoding tufA gene from the plastid genome of P. vivax. We present here the sequences of large and small subunit rRNA (lsu and ssu rRNA) genes, sufB (ORF470) gene, RNA polymerase (rpo B, C) subunit genes and clpC (caseinolytic protease) gene from the plastid genome of P. vivax. A comparative analysis of these genes between P. vivax and P. falciparum reveals approximately 5-16% differences. A codon usage analysis of major plastid genes has shown a high frequency of codons rich in A/T at any or all of the three positions in all the species. TTA, AAT, AAA, TAT, and ATA are the major preferred codons. The sequences, functional domains and structural analysis of respective proteins do not show any variations in the active sites. A comparative analysis of these Indian P. vivax plastid genome encoded genes has also been done
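
    A codon usage tally of the kind mentioned in the record above can be sketched as follows. This is only an illustration under stated assumptions: the input is assumed to be an in-frame coding sequence, and the toy string is made up rather than taken from any P. vivax gene.

```python
from collections import Counter


def codon_counts(cds: str) -> Counter:
    """Count in-frame codons in a coding sequence.

    Any trailing bases that do not fill a complete codon are ignored.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3
    return Counter(seq[i:i + 3] for i in range(0, usable, 3))


def at_fraction(codon: str) -> float:
    """Fraction of a codon's three positions occupied by A or T."""
    return sum(base in "AT" for base in codon) / 3


# Hypothetical toy sequence: TTA AAT AAA TTA
toy = "TTAAATAAATTA"
counts = codon_counts(toy)
print(counts.most_common(1))  # [('TTA', 2)]
print(at_fraction("TTA"))     # 1.0
```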

  1. Array comparative genomic hybridization and cytogenetic analysis in pediatric acute leukemias.

    PubMed

    Dawson, A J; Yanofsky, R; Vallente, R; Bal, S; Schroedter, I; Liang, L; Mai, S

    2011-10-01

    Most patients with acute lymphocytic leukemia (ALL) are reported to have acquired chromosomal abnormalities in their leukemic bone marrow cells. Many established chromosome rearrangements have been described, and their associations with specific clinical, biologic, and prognostic features are well defined. However, approximately 30% of pediatric and 50% of adult patients with ALL do not have cytogenetic abnormalities of clinical significance. Despite significant improvements in outcome for pediatric ALL, therapy fails in approximately 25% of patients, and these failures often occur unpredictably in patients with a favorable prognosis and "good" cytogenetics at diagnosis. It is well known that karyotype analysis in hematologic malignancies, although genome-wide, is limited because of altered cell kinetics (mitotic rate), a propensity of leukemic blasts to undergo apoptosis in culture, overgrowth by normal cells, and chromosomes of poor quality in the abnormal clone. Array comparative genomic hybridization (aCGH-"microarray") has a greatly increased genomic resolution over classical cytogenetics. Cytogenetic microarray, which uses genomic DNA, is a powerful tool in the analysis of unbalanced chromosome rearrangements, such as copy number gains and losses, and it is the method of choice when the mitotic index is low and the quality of metaphases is suboptimal. The copy number profile obtained by microarray is often called a "molecular karyotype." In the present study, microarray was applied to 9 retrospective cases of pediatric ALL either with initial high-risk features or with at least 1 relapse. The conventional karyotype was compared to the "molecular karyotype" to assess abnormalities as interpreted by classical cytogenetics. Not only were previously undetected chromosome losses and gains identified by microarray, but several karyotypes interpreted by classical cytogenetics were shown to be discordant with the microarray results. The complementary use of microarray

  2. Comparative and phylogenetic analysis of the mitochondrial genomes in basal hymenopterans

    PubMed Central

    Song, Sheng-Nan; Tang, Pu; Wei, Shu-Jun; Chen, Xue-Xin

    2016-01-01

    The Symphyta is traditionally accepted as a paraphyletic group located in a basal position of the order Hymenoptera. Herein, we conducted a comparative analysis of the mitochondrial genomes in the Symphyta by describing two newly sequenced ones, from Trichiosoma anthracinum, representing the first mitochondrial genome in family Cimbicidae, and Asiemphytus rufocephalus, from family Tenthredinidae. The sequenced lengths of these two mitochondrial genomes were 15,392 and 14,864 bp, respectively. Within the sequenced region, trnC and trnY were rearranged to the upstream of trnI-nad2 in T. anthracinum, while in A. rufocephalus all sequenced genes were arranged in the putative insect ancestral gene arrangement. Rearrangement of the tRNA genes is common in the Symphyta. The rearranged genes are mainly from trnL1 and two tRNA clusters of trnI-trnQ-trnM and trnW-trnC-trnY. The mitochondrial genomes of Symphyta show a biased usage of A and T rather than G and C. Protein-coding genes in Symphyta species show a lower evolutionary rate than those of Apocrita. The Ka/Ks ratios were all less than 1, indicating purifying selection of Symphyta species. Phylogenetic analyses supported the paraphyly and basal position of Symphyta in Hymenoptera. The well-supported phylogenetic relationship in the study is Tenthredinoidea + (Cephoidea + (Orussoidea + Apocrita)). PMID:26879745

  3. Genetic linkage map and comparative genome analysis for the estuarine Atlantic killifish (Fundulus heteroclitus)

    EPA Pesticide Factsheets

    Genetic linkage maps are valuable tools in evolutionary biology; however, their availability for wild populations is extremely limited. Fundulus heteroclitus (Atlantic killifish) is a non-migratory estuarine fish that exhibits high allelic and phenotypic diversity partitioned among subpopulations that reside in disparate environmental conditions. An ideal candidate model organism for studying gene-environment interactions, the molecular toolbox for F. heteroclitus is limited. We identified hundreds of novel microsatellites which, when combined with existing microsatellites and single nucleotide polymorphisms (SNPs), were used to construct the first genetic linkage map for this species. By integrating independent linkage maps from three genetic crosses, we developed a consensus map containing 24 linkage groups, consistent with the number of chromosomes reported for this species. These linkage groups span 2300 centimorgans (cM) of recombinant genomic space, intermediate in size relative to the current linkage maps for the teleosts, medaka and zebrafish. Comparisons between fish genomes support a high degree of synteny between the consensus F. heteroclitus linkage map and the medaka and (to a lesser extent) zebrafish physical genome assemblies. This dataset is associated with the following publication: Waits, E., J. Martinson, B. Rinner, S. Morris, D. Proestou, D. Champlin, and D. Nacci. Genetic linkage map and comparative genome analysis for the estuarine Atlanti

  4. Comparative analysis of the complete chloroplast genome sequences in psammophytic Haloxylon species (Amaranthaceae)

    PubMed Central

    Dong, Wenpan; Xu, Chao; Li, Delu; Jin, Xiaobai; Li, Ruili

    2016-01-01

    The Haloxylon genus belongs to the Amaranthaceae (formerly Chenopodiaceae) family. The small trees or shrubs in this genus are referred to as the King of psammophytic plants, and perform important functions in environmental protection, including wind control and sand fixation in deserts. To better understand these beneficial plants, we sequenced the chloroplast (cp) genomes of Haloxylon ammodendron (HA) and Haloxylon persicum (HP) and conducted comparative genomic analyses on these and two other representative Amaranthaceae species. Similar to other higher plants, we found that the Haloxylon cp genome is a quadripartite, double-stranded, circular DNA molecule of 151,570 bp in HA and 151,586 bp in HP. It contains a pair of inverted repeats (24,171 bp in HA and 24,177 bp in HP) that separate the genome into a large single copy region of 84,214 bp in HA and 84,217 bp in HP, and a small single copy region of 19,014 bp in HA and 19,015 bp in HP. Each Haloxylon cp genome contains 112 genes, including 78 coding, 30 tRNA, and four ribosomal RNA genes. We detected 59 different simple sequence repeat loci, including 44 mono-nucleotide, three di-nucleotide, one tri-nucleotide, and 11 tetra-nucleotide repeats. Comparative analysis revealed only 67 mutations between the two species, including 44 substitutions, 23 insertions/deletions, and two micro-inversions. The two inversions, with lengths of 14 and 3 bp, occur in the petA-psbJ intergenic region and rpl16 intron, respectively, and are predicted to form hairpin structures with repeat sequences of 27 and 19 bp, respectively, at the two ends. The ratio of transitions to transversions was 0.76. These results are valuable for future studies on Haloxylon genetic diversity and will enhance our understanding of the phylogenetic evolution of Amaranthaceae. PMID:27867769
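
    The quadripartite structure described above implies that each total genome length should equal the large single copy region plus the small single copy region plus two copies of the inverted repeat. The short check below applies that arithmetic to the figures reported in the abstract; the dictionary keys and function name are illustrative choices, not taken from the paper.

```python
# Region lengths in bp, as reported in the abstract above.
GENOMES = {
    "H. ammodendron": {"total": 151_570, "lsc": 84_214, "ssc": 19_014, "ir": 24_171},
    "H. persicum": {"total": 151_586, "lsc": 84_217, "ssc": 19_015, "ir": 24_177},
}


def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Sum one LSC region, one SSC region, and two inverted-repeat copies."""
    return lsc + ssc + 2 * ir


for name, g in GENOMES.items():
    computed = quadripartite_length(g["lsc"], g["ssc"], g["ir"])
    print(name, computed, computed == g["total"])
# H. ammodendron 151570 True
# H. persicum 151586 True
```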

  5. Correction: Comparative analysis of fungal genomes reveals different plant cell wall degrading capacity in fungi

    PubMed Central

    2014-01-01

    Abstract: The version of this article published in BMC Genomics 2013, 14: 274, contains 9 unpublished genomes (Botryobasidium botryosum, Gymnopus luxurians, Hypholoma sublateritium, Jaapia argillacea, Hebeloma cylindrosporum, Conidiobolus coronatus, Laccaria amethystina, Paxillus involutus, and P. rubicundulus) downloaded from JGI website. In this correction, we removed these genomes after discussion with editors and data producers whom we should have contacted before downloading these genomes. Removing these data did not alter the principal results and conclusions of our original work. The relevant Figures 1, 2, 3, 4 and 6; and Table 1 have been revised. Additional files 1, 3, 4, and 5 were also revised. We would like to apologize for any confusion or inconvenience this may have caused. Background: Fungi produce a variety of carbohydrate activity enzymes (CAZymes) for the degradation of plant polysaccharide materials to facilitate infection and/or gain nutrition. Identifying and comparing CAZymes from fungi with different nutritional modes or infection mechanisms may provide information for better understanding of their life styles and infection models. To date, hundreds of fungal genomes are publicly available. However, a systematic comparative analysis of fungal CAZymes across the entire fungal kingdom has not been reported. Results: In this study, we systematically identified glycoside hydrolases (GHs), polysaccharide lyases (PLs), carbohydrate esterases (CEs), and glycosyltransferases (GTs) as well as carbohydrate-binding modules (CBMs) in the predicted proteomes of 94 representative fungi from Ascomycota, Basidiomycota, Chytridiomycota, and Zygomycota. Comparative analysis of these CAZymes that play major roles in plant polysaccharide degradation revealed that fungi exhibit tremendous diversity in the number and variety of CAZymes. Among them, some families of GHs and CEs are the most prevalent CAZymes that are distributed in all of the fungi analyzed

  6. Complete genome sequences and comparative genome analysis of Lactobacillus plantarum strain 5-2 isolated from fermented soybean.

    PubMed

    Liu, Chen-Jian; Wang, Rui; Gong, Fu-Ming; Liu, Xiao-Feng; Zheng, Hua-Jun; Luo, Yi-Yong; Li, Xiao-Ran

    2015-12-01

    Lactobacillus plantarum is an important probiotic and is mostly isolated from fermented foods. We sequenced the genome of L. plantarum strain 5-2, which was derived from fermented soybean isolated from Yunnan province, China. The strain was determined to contain 3114 genes. Fourteen complete insertion sequence (IS) elements were found in 5-2 chromosome. There were 24 DNA replication proteins and 76 DNA repair proteins in the 5-2 genome. Consistent with the classification of L. plantarum as a facultative heterofermentative lactobacillus, the 5-2 genome encodes key enzymes required for the EMP (Embden-Meyerhof-Parnas) and phosphoketolase (PK) pathways. Several components of the secretion machinery are found in the 5-2 genome, which was compared with L. plantarum ST-III, JDM1 and WCFS1. Most of the specific proteins in the four genomes appeared to be related to their prophage elements.

  7. Gramene database: navigating plant comparative genomics resources

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Gramene (http://www.gramene.org) is an online, open source, curated resource for plant comparative genomics and pathway analysis designed to support researchers working in plant genomics, breeding, evolutionary biology, system biology, and metabolic engineering. It exploits phylogenetic relationship...

  8. Comparative Mitochondrial Genome Analysis of Eligma narcissus and other Lepidopteran Insects Reveals Conserved Mitochondrial Genome Organization and Phylogenetic Relationships

    PubMed Central

    Dai, Li-Shang; Zhu, Bao-Jian; Zhao, Yue; Zhang, Cong-Fen; Liu, Chao-Liang

    2016-01-01

    In this study, we sequenced the complete mitochondrial genome of Eligma narcissus and compared it with 18 other lepidopteran species. The mitochondrial genome (mitogenome) was a circular molecule of 15,376 bp containing 13 protein-coding genes (PCGs), 22 transfer RNA (tRNA) genes, two ribosomal RNA (rRNA) genes and an adenine (A) + thymine (T)-rich region. The positive AT skew (0.007) indicated the occurrence of more As than Ts. The arrangement of 13 PCGs was similar to that of other sequenced lepidopterans. All PCGs were initiated by ATN codons, except for the cytochrome c oxidase subunit 1 (cox1) gene, which was initiated by the CGA sequence, as observed in other lepidopterans. The results of the codon usage analysis indicated that Asn, Ile, Leu, Tyr and Phe were the five most frequent amino acids. All tRNA genes were shown to be folded into the expected typical cloverleaf structure observed for mitochondrial tRNA genes. Phylogenetic relationships were analyzed based on the nucleotide sequences of 13 PCGs from other insect mitogenomes, which confirmed that E. narcissus is a member of the Noctuidae superfamily. PMID:27222440
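
    AT skew is conventionally computed as (A - T) / (A + T), so a positive value means more As than Ts on the strand analyzed. The sketch below assumes that standard definition; the abstract does not spell out the formula, and the toy sequence is hypothetical rather than E. narcissus data.

```python
from collections import Counter


def at_skew(sequence: str) -> float:
    """Return AT skew, (A - T) / (A + T), for a DNA sequence."""
    counts = Counter(sequence.upper())
    a, t = counts["A"], counts["T"]
    if a + t == 0:
        raise ValueError("sequence contains no A or T bases")
    return (a - t) / (a + t)


# Hypothetical toy sequence with 6 As and 4 Ts.
toy = "AAAAAATTTTGC"
print(at_skew(toy))  # 0.2
```

    A value as small as the 0.007 reported above corresponds to a near-balanced A and T count, with As only slightly in excess.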

  9. Comparative genome sequence analysis of Choristoneura occidentalis Freeman and C. rosaceana Harris (Lepidoptera: Tortricidae) alphabaculoviruses.

    PubMed

    Thumbi, David K; Béliveau, Catherine; Cusson, Michel; Lapointe, Renée; Lucarotti, Christopher J

    2013-01-01

    The complete genome sequences of Choristoneura occidentalis and C. rosaceana nucleopolyhedroviruses (ChocNPV and ChroNPV, respectively) (Baculoviridae: Alphabaculovirus) were determined and compared with each other and with those of other baculoviruses, including the genome of the closely related C. fumiferana NPV (CfMNPV). The ChocNPV genome was 128,446 bp in length (1147 bp smaller than that of CfMNPV), had a G+C content of 50.1%, and contained 148 open reading frames (ORFs). In comparison, the ChroNPV genome was 129,052 bp in length, had a G+C content of 48.6% and contained 149 ORFs. ChocNPV and ChroNPV shared 144 ORFs in common, and had a 77% sequence identity with each other and 96.5% and 77.8% sequence identity, respectively, with CfMNPV. Five homologous regions (hrs), with sequence similarities to those of CfMNPV, were identified in ChocNPV, whereas the ChroNPV genome contained three hrs featuring up to 14 repeats. Both genomes encoded three inhibitors of apoptosis (IAP-1, IAP-2, and IAP-3), as reported for CfMNPV, and the ChocNPV IAP-3 gene represented the most divergent functional region of this genome relative to CfMNPV. Two ORFs were unique to ChocNPV, and four were unique to ChroNPV. ChroNPV ORF chronpv38 is a eukaryotic initiation factor 5 (eIF-5) homolog that has also been identified in the C. occidentalis granulovirus (ChocGV) and is believed to be the product of horizontal gene transfer from the host. Based on levels of sequence identity and phylogenetic analysis, both ChocNPV and ChroNPV fall within group I alphabaculoviruses, where ChocNPV appears to be more closely related to CfMNPV than does ChroNPV. Our analyses suggest that it may be appropriate to consider ChocNPV and CfMNPV as variants of the same virus species.
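
    G+C content, reported above as 50.1% and 48.6% for the two genomes, is the percentage of G and C among the unambiguous bases of a sequence. The following is a minimal sketch of that calculation; the function name and toy input are illustrative, and ambiguous bases such as N are simply excluded from the denominator, which is one common convention among several.

```python
def gc_content(sequence: str) -> float:
    """Return G+C content as a percentage of unambiguous (ACGT) bases."""
    seq = sequence.upper()
    unambiguous = sum(seq.count(base) for base in "ACGT")
    if unambiguous == 0:
        raise ValueError("sequence contains no A, C, G, or T bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / unambiguous


# Hypothetical toy sequence: 2 of 4 unambiguous bases are G or C.
print(gc_content("ATGCNN"))  # 50.0
```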

  10. Comparative Genome Sequence Analysis of Choristoneura occidentalis Freeman and C. rosaceana Harris (Lepidoptera: Tortricidae) Alphabaculoviruses

    PubMed Central

    Thumbi, David K.; Béliveau, Catherine; Cusson, Michel; Lapointe, Renée; Lucarotti, Christopher J.

    2013-01-01

    The complete genome sequences of Choristoneura occidentalis and C. rosaceana nucleopolyhedroviruses (ChocNPV and ChroNPV, respectively) (Baculoviridae: Alphabaculovirus) were determined and compared with each other and with those of other baculoviruses, including the genome of the closely related C. fumiferana NPV (CfMNPV). The ChocNPV genome was 128,446 bp in length (1147 bp smaller than that of CfMNPV), had a G+C content of 50.1%, and contained 148 open reading frames (ORFs). In comparison, the ChroNPV genome was 129,052 bp in length, had a G+C content of 48.6% and contained 149 ORFs. ChocNPV and ChroNPV shared 144 ORFs in common, and had a 77% sequence identity with each other and 96.5% and 77.8% sequence identity, respectively, with CfMNPV. Five homologous regions (hrs), with sequence similarities to those of CfMNPV, were identified in ChocNPV, whereas the ChroNPV genome contained three hrs featuring up to 14 repeats. Both genomes encoded three inhibitors of apoptosis (IAP-1, IAP-2, and IAP-3), as reported for CfMNPV, and the ChocNPV IAP-3 gene represented the most divergent functional region of this genome relative to CfMNPV. Two ORFs were unique to ChocNPV, and four were unique to ChroNPV. ChroNPV ORF chronpv38 is a eukaryotic initiation factor 5 (eIF-5) homolog that has also been identified in the C. occidentalis granulovirus (ChocGV) and is believed to be the product of horizontal gene transfer from the host. Based on levels of sequence identity and phylogenetic analysis, both ChocNPV and ChroNPV fall within group I alphabaculoviruses, where ChocNPV appears to be more closely related to CfMNPV than does ChroNPV. Our analyses suggest that it may be appropriate to consider ChocNPV and CfMNPV as variants of the same virus species. PMID:23861954

  11. Comparative genome analysis of 19 Ureaplasma urealyticum and Ureaplasma parvum strains

    PubMed Central

    2012-01-01

    Background: Ureaplasma urealyticum (UUR) and Ureaplasma parvum (UPA) are sexually transmitted bacteria among humans implicated in a variety of disease states including but not limited to nongonococcal urethritis, infertility, adverse pregnancy outcomes, chorioamnionitis, and bronchopulmonary dysplasia in neonates. There are 10 distinct serotypes of UUR and 4 of UPA. Efforts to determine whether differences in pathogenic potential exist at the ureaplasma serovar level have been hampered by limitations of antibody-based typing methods, multiple cross-reactions and poor discriminating capacity in clinical samples containing two or more serovars. Results: We determined the genome sequences of the American Type Culture Collection (ATCC) type strains of all UUR and UPA serovars as well as four clinical isolates of UUR for which we were not able to determine serovar designation. UPA serovars had 0.75−0.78 Mbp genomes and UUR serovars were 0.84−0.95 Mbp. The original classification of ureaplasma isolates into distinct serovars was largely based on differences in the major ureaplasma surface antigen called the multiple banded antigen (MBA) and reactions of human and animal sera to the organisms. Whole genome analysis of the 14 serovars and the 4 clinical isolates showed the mba gene was part of a large superfamily, which is a phase variable gene system, and that some serovars have identical sets of mba genes. Most of the differences among serovars are hypothetical genes, and in general the two species and 14 serovars are extremely similar at the genome level. Conclusions: Comparative genome analysis suggests UUR is more capable of acquiring genes horizontally, which may contribute to its greater virulence for some conditions. The overwhelming evidence of extensive horizontal gene transfer among these organisms from our previous studies combined with our comparative analysis indicates that ureaplasmas exist as quasi-species rather than as stable serovars in their native

  12. Significance of genomic instability in breast cancer in atomic bomb survivors: analysis of microarray-comparative genomic hybridization

    PubMed Central

    2011-01-01

    Background: It has been postulated that ionizing radiation induces breast cancers among atomic bomb (A-bomb) survivors. We have reported a higher incidence of HER2 and C-MYC oncogene amplification in breast cancers from A-bomb survivors. The purpose of this study was to clarify the effect of A-bomb radiation exposure on genomic instability (GIN), which is an important hallmark of carcinogenesis, in archival formalin-fixed paraffin-embedded (FFPE) tissues of breast cancer by using microarray-comparative genomic hybridization (aCGH). Methods: Tumor DNA was extracted from FFPE tissues of invasive ductal cancers from 15 survivors who were exposed at 1.5 km or less from the hypocenter and 13 calendar year-matched non-exposed patients, followed by aCGH analysis using a high-density oligonucleotide microarray. The total length of copy number aberrations (CNA) was used as an indicator of GIN, and its correlation with clinicopathological factors was statistically tested. Results: The mean of the derivative log ratio spread (DLRSpread), which estimates the noise by calculating the spread of log ratio differences between consecutive probes for all chromosomes, was 0.54 (range, 0.26 to 1.05). The concordance of results between aCGH and fluorescence in situ hybridization (FISH) for HER2 gene amplification was 88%. The incidence of HER2 amplification and the histological grade were significantly higher in the A-bomb survivors than in the control group (P = 0.04, respectively). The total length of CNA tended to be larger in the A-bomb survivors (P = 0.15). Correlation analysis of CNA and clinicopathological factors revealed that DLRSpread was significantly and negatively correlated with the total length of CNA (P = 0.034, r = -0.40). Multivariate analysis with covariance revealed that exposure to the A-bomb was a significant (P = 0.005) independent factor associated with a larger total length of CNA in breast cancers. Conclusions: Thus, archival FFPE tissues from A-bomb survivors are useful for genome-wide a

  13. Genome sequence of the model sulfate reducer Desulfovibrio gigas: a comparative analysis within the Desulfovibrio genus.

    PubMed

    Morais-Silva, Fabio O; Rezende, Antonio Mauro; Pimentel, Catarina; Santos, Catia I; Clemente, Carla; Varela-Raposo, Ana; Resende, Daniela M; da Silva, Sofia M; de Oliveira, Luciana Márcia; Matos, Marcia; Costa, Daniela A; Flores, Orfeu; Ruiz, Jerónimo C; Rodrigues-Pousada, Claudina

    2014-08-01

    Desulfovibrio gigas is a model organism of sulfate-reducing bacteria whose energy metabolism and stress response have been extensively studied. The complete genomic context of this organism was, however, not yet available. The sequencing of the D. gigas genome provides insights into the integrated network of energy-conserving complexes and structures present in this bacterium. Comparison with genomes of other Desulfovibrio spp. reveals the presence of two different CRISPR/Cas systems in D. gigas. Phylogenetic analysis using conserved protein sequences (encoded by rpoB and gyrB) indicates two main groups of Desulfovibrio spp., with D. gigas being more closely related to D. vulgaris and D. desulfuricans strains. Gene duplications were found, such as those encoding fumarate reductase, formate dehydrogenase, and superoxide dismutase. Complexes not yet described within the Desulfovibrio genus were identified: the Mnh complex, a V-type ATP synthase, as well as genes encoding the MinCDE system that could be responsible for the larger size of D. gigas when compared to other members of the genus. A low number of hydrogenases and the absence of the codh/acs and pfl genes, both present in D. vulgaris strains, indicate that intermediate cycling mechanisms may contribute substantially less to the energy gain in D. gigas compared to other Desulfovibrio spp. This might be compensated by the presence of other unique genomic arrangements of complexes such as the Rnf and the Hdr/Flox, or by the presence of NAD(P)H-related complexes, like the Nuo, NfnAB or Mnh.

  14. Genome sequence of the model sulfate reducer Desulfovibrio gigas: a comparative analysis within the Desulfovibrio genus

    PubMed Central

    Morais-Silva, Fabio O; Rezende, Antonio Mauro; Pimentel, Catarina; Santos, Catia I; Clemente, Carla; Varela–Raposo, Ana; Resende, Daniela M; da Silva, Sofia M; de Oliveira, Luciana Márcia; Matos, Marcia; Costa, Daniela A; Flores, Orfeu; Ruiz, Jerónimo C; Rodrigues-Pousada, Claudina

    2014-01-01

    Desulfovibrio gigas is a model organism of sulfate-reducing bacteria whose energy metabolism and stress response have been extensively studied. The complete genomic context of this organism was, however, not yet available. The sequencing of the D. gigas genome provides insights into the integrated network of energy-conserving complexes and structures present in this bacterium. Comparison with genomes of other Desulfovibrio spp. reveals the presence of two different CRISPR/Cas systems in D. gigas. Phylogenetic analysis using conserved protein sequences (encoded by rpoB and gyrB) indicates two main groups of Desulfovibrio spp., with D. gigas being more closely related to D. vulgaris and D. desulfuricans strains. Gene duplications were found, such as those encoding fumarate reductase, formate dehydrogenase, and superoxide dismutase. Complexes not yet described within the Desulfovibrio genus were identified: the Mnh complex, a V-type ATP synthase, as well as genes encoding the MinCDE system that could be responsible for the larger size of D. gigas when compared to other members of the genus. A low number of hydrogenases and the absence of the codh/acs and pfl genes, both present in D. vulgaris strains, indicate that intermediate cycling mechanisms may contribute substantially less to the energy gain in D. gigas compared to other Desulfovibrio spp. This might be compensated by the presence of other unique genomic arrangements of complexes such as the Rnf and the Hdr/Flox, or by the presence of NAD(P)H-related complexes, like the Nuo, NfnAB or Mnh. PMID:25055974

  15. Comparative Genomic Analysis Reveals Organization, Function and Evolution of ars Genes in Pantoea spp.

    PubMed Central

    Wang, Liying; Wang, Jin; Jing, Chuanyong

    2017-01-01

    Numerous genes are involved in various strategies to resist toxic arsenic (As). However, the As resistance strategy in the genus Pantoea is poorly understood. In this study, a comparative genome analysis of 23 Pantoea genomes was conducted. Two vertically inherited arsC-like genes that make no contribution to As resistance were found in all 23 Pantoea strains. Besides the two arsC-like genes, the As resistance gene clusters arsRBC or arsRBCH were found in 15 Pantoea genomes. These ars clusters were found to have been acquired by horizontal gene transfer (HGT) from sources related to Franconibacter helveticus, Serratia marcescens, and Citrobacter freundii. Over the course of evolution, the ars clusters were acquired more than once in some species and were lost in some strains, producing strains without As resistance capability. This study revealed the organization, distribution and complex evolutionary history of As resistance genes in Pantoea spp. The insights gained in this study improve our understanding of the As resistance strategy of Pantoea spp. and its role in the biogeochemical cycling of As. PMID:28377759

  16. The Complete Chloroplast Genome Sequence of a Relict Conifer Glyptostrobus pensilis: Comparative Analysis and Insights into Dynamics of Chloroplast Genome Rearrangement in Cupressophytes and Pinaceae

    PubMed Central

    Zheng, Renhua; Xu, Haibin; Zhou, Yanwei; Li, Meiping; Lu, Fengjuan; Dong, Yini; Liu, Xin; Chen, Jinhui; Shi, Jisen

    2016-01-01

    Glyptostrobus pensilis, belonging to the monotypic genus Glyptostrobus (Family: Cupressaceae), is an ancient conifer that is naturally distributed in low-lying wet areas. Here, we report the complete chloroplast (cp) genome sequence (132,239 bp) of G. pensilis. The G. pensilis cp genome is similar in gene content, organization and genome structure to the sequenced cp genomes from other cupressophytes, especially with respect to the loss of the inverted repeat region A (IRA). Through phylogenetic analysis, we demonstrated that the genus Glyptostrobus is closely related to the genus Cryptomeria, supporting previous findings based on physiological characteristics. Since IRs play an important role in stabilizing the cp genome, and conifer cp genomes lost different IR regions after splitting into two clades (cupressophytes and Pinaceae), we performed cp genome rearrangement analysis and found more extensive cp genome rearrangements among the species of cupressophytes relative to Pinaceae. Additional repeat analysis indicated that cupressophyte cp genomes contained fewer potentially functional repeats, especially in Cupressaceae, compared with Pinaceae. These results suggested that the dynamics of cp genome rearrangement in conifers differed because the two clades, Pinaceae and cupressophytes, lost IR copies independently and developed different repeats to complement the residual IRs. In addition, we identified 170 perfect simple sequence repeats that will be useful in future research focusing on the evolution of genetic diversity and conservation of genetic variation for this endangered species in the wild. PMID:27560965

  17. The Complete Chloroplast Genome Sequence of a Relict Conifer Glyptostrobus pensilis: Comparative Analysis and Insights into Dynamics of Chloroplast Genome Rearrangement in Cupressophytes and Pinaceae.

    PubMed

    Hao, Zhaodong; Cheng, Tielong; Zheng, Renhua; Xu, Haibin; Zhou, Yanwei; Li, Meiping; Lu, Fengjuan; Dong, Yini; Liu, Xin; Chen, Jinhui; Shi, Jisen

    2016-01-01

    Glyptostrobus pensilis, belonging to the monotypic genus Glyptostrobus (Family: Cupressaceae), is an ancient conifer that is naturally distributed in low-lying wet areas. Here, we report the complete chloroplast (cp) genome sequence (132,239 bp) of G. pensilis. The G. pensilis cp genome is similar in gene content, organization and genome structure to the sequenced cp genomes from other cupressophytes, especially with respect to the loss of the inverted repeat region A (IRA). Through phylogenetic analysis, we demonstrated that the genus Glyptostrobus is closely related to the genus Cryptomeria, supporting previous findings based on physiological characteristics. Since IRs play an important role in stabilizing the cp genome, and conifer cp genomes lost different IR regions after splitting into two clades (cupressophytes and Pinaceae), we performed cp genome rearrangement analysis and found more extensive cp genome rearrangements among the species of cupressophytes relative to Pinaceae. Additional repeat analysis indicated that cupressophyte cp genomes contained fewer potentially functional repeats, especially in Cupressaceae, compared with Pinaceae. These results suggested that the dynamics of cp genome rearrangement in conifers differed because the two clades, Pinaceae and cupressophytes, lost IR copies independently and developed different repeats to complement the residual IRs. In addition, we identified 170 perfect simple sequence repeats that will be useful in future research focusing on the evolution of genetic diversity and conservation of genetic variation for this endangered species in the wild.

  18. Phytozome Comparative Plant Genomics Portal

    SciTech Connect

    Goodstein, David; Batra, Sajeev; Carlson, Joseph; Hayes, Richard; Phillips, Jeremy; Shu, Shengqiang; Schmutz, Jeremy; Rokhsar, Daniel

    2014-09-09

    The Dept. of Energy Joint Genome Institute is a genomics user facility supporting DOE mission science in the areas of Bioenergy, Carbon Cycling, and Biogeochemistry. The Plant Program at the JGI applies genomic, analytical, computational and informatics platforms and methods to: (1) understand and accelerate the improvement (domestication) of bioenergy crops; (2) characterize and moderate plant response to climate change; (3) use comparative genomics to identify constrained elements and infer gene function; (4) build high-quality genomic resource platforms of JGI Plant Flagship genomes for functional and experimental work; and (5) expand functional genomic resources for Plant Flagship genomes.

  19. Comparative Genomics Analysis of Mycobacterium ulcerans for the Identification of Putative Essential Genes and Therapeutic Candidates

    PubMed Central

    Tahir, Shifa; Tong, Yigang

    2012-01-01

    Buruli ulcer, caused by Mycobacterium ulcerans, is the third most common mycobacterial disease after tuberculosis and leprosy. The present treatment options are limited, and the emergence of treatment-resistant isolates represents a serious concern and a need for better therapeutics. Conventional drug discovery methods are time-consuming and labor-intensive. Unfortunately, the slow-growing nature of M. ulcerans in experimental conditions is also a barrier to drug discovery and development. In contrast, recent advancements in complete genome sequencing, in combination with cheminformatics and computational biology, represent an attractive alternative approach for the identification of therapeutic candidates worthy of experimental research. A computational, comparative genomics workflow was defined for the identification of novel therapeutic candidates against M. ulcerans, with the aim that a selected target should be essential to the pathogen and have no homolog in the human host. Initially, a total of 424 genes were predicted as essential from the M. ulcerans genome, via homology searching of essential genome content from 20 different bacteria. Metabolic pathway analysis showed that most of the essential genes are associated with carbohydrate and amino acid metabolism. Among these, 236 proteins were identified as non-host and essential, and could serve as potential drug and vaccine candidates. Several drug target prioritization parameters, including druggability, were also calculated. Enzymes from several pathways are discussed as potential drug targets, including those from cell wall synthesis, thiamine biosynthesis, protein biosynthesis, and histidine biosynthesis. It is expected that our data will facilitate selection of M. ulcerans proteins for successful entry into drug design pipelines. PMID:22912793

  20. Genome Sequencing and Comparative Analysis of Saccharomyces cerevisiae Strains of the Peterhof Genetic Collection

    PubMed Central

    Drozdova, Polina B.; Tarasov, Oleg V.; Matveenko, Andrew G.; Radchenko, Elina A.; Sopova, Julia V.; Polev, Dmitrii E.; Inge-Vechtomov, Sergey G.; Dobrynin, Pavel V.

    2016-01-01

    The Peterhof genetic collection of Saccharomyces cerevisiae strains (PGC) is a large laboratory stock that has accumulated several thousand strains over more than half a century. It originated independently of other common laboratory stocks from a distillery lineage (race XII). Several PGC strains have been extensively used in certain fields of yeast research, but their genomes have not yet been thoroughly explored. Here we employed whole genome sequencing to characterize five selected PGC strains, including one of the closest to the progenitor, 15V-P4, and several strains that have been used to study translation termination and prions in yeast (25-25-2V-P3982, 1B-D1606, 74-D694, and 6P-33G-D373). The genetic distance between the PGC progenitor and S288C is comparable to that between two geographically isolated populations. The PGC seems to be closer to two bakery strains than to S288C-related laboratory stocks or European wine strains. In genomes of the PGC strains, we found several loci which are absent from the S288C genome; 15V-P4 harbors a rare combination of the gene cluster characteristic of wine strains and the RTM1 cluster. We closely examined known and previously uncharacterized gene variants of particular strains and were able to establish the molecular basis for known phenotypes including phenylalanine auxotrophy, clumping behavior and galactose utilization. Finally, we made sequencing data and results of the analysis available to the yeast community. Our data widen the knowledge about genetic variation between Saccharomyces cerevisiae strains and can form the basis for planning future work in PGC-related strains and with PGC-derived alleles. PMID:27152522

  1. Complete mitochondrial genome of the aluminum-tolerant fungus Rhodotorula taiwanensis RS1 and comparative analysis of Basidiomycota mitochondrial genomes

    PubMed Central

    Zhao, Xue Qiang; Aizawa, Tomoko; Schneider, Jessica; Wang, Chao; Shen, Ren Fang; Sunairi, Michio

    2013-01-01

    The complete mitochondrial genome of Rhodotorula taiwanensis RS1, an aluminum-tolerant Basidiomycota fungus, was determined and compared with the known mitochondrial genomes of 12 Basidiomycota species. The mitochondrial genome of R. taiwanensis RS1 is a circular DNA molecule of 40,392 bp and encodes the typical 15 mitochondrial proteins, 23 tRNAs, and small and large rRNAs as well as 10 intronic open reading frames. These genes are apparently transcribed in two directions and do not show syntenies in gene order with other investigated Basidiomycota species. The average G+C content (41%) of the mitochondrial genome of R. taiwanensis RS1 is the highest among the Basidiomycota species. Two introns were detected in the sequence of the atp9 gene of R. taiwanensis RS1, but not in that of other Basidiomycota species. Rhodotorula taiwanensis is the first species of the genus Rhodotorula whose full mitochondrial genome has been sequenced; and the data presented here supply valuable information for understanding the evolution of fungal mitochondrial genomes and researching the mechanism of aluminum tolerance in microorganisms. PMID:23427135

  2. Exploring a Nonmodel Teleost Genome Through RAD Sequencing-Linkage Mapping in Common Pandora, Pagellus erythrinus and Comparative Genomic Analysis.

    PubMed

    Manousaki, Tereza; Tsakogiannis, Alexandros; Taggart, John B; Palaiokostas, Christos; Tsaparis, Dimitris; Lagnel, Jacques; Chatziplis, Dimitrios; Magoulas, Antonios; Papandroulakis, Nikos; Mylonas, Constantinos C; Tsigenopoulos, Costas S

    2015-12-29

    Common pandora (Pagellus erythrinus) is a benthopelagic marine fish belonging to the teleost family Sparidae, and a newly recruited species in Mediterranean aquaculture. The paucity of genetic information relating to sparids, despite their growing economic value for aquaculture, provides the impetus for exploring the genomics of this fish group. Genomic tool development, such as the provision of genetic linkage maps, lays the groundwork for linking genotype to phenotype, allowing fine-mapping of loci responsible for beneficial traits. In this study, we applied ddRAD methodology to identify polymorphic markers in a full-sib family of common pandora. Employing the Illumina MiSeq platform, we sampled and sequenced a size-selected genomic fraction of 99 individuals, which led to the identification of 920 polymorphic loci. Downstream mapping analysis resulted in the construction of 24 robust linkage groups, corresponding to the karyotype of the species. The common pandora linkage map showed varying degrees of conserved synteny with four other teleost genomes, namely the European seabass (Dicentrarchus labrax), Nile tilapia (Oreochromis niloticus), stickleback (Gasterosteus aculeatus), and medaka (Oryzias latipes), suggesting a conserved genomic evolution in Sparidae. Our work exploits the possibilities of genotyping by sequencing to gain novel insights into genome structure and evolution. Such information will boost the study of cultured species and will set the foundation for a deeper understanding of the complex evolutionary history of teleosts.

  3. Exploring a Nonmodel Teleost Genome Through RAD Sequencing—Linkage Mapping in Common Pandora, Pagellus erythrinus and Comparative Genomic Analysis

    PubMed Central

    Manousaki, Tereza; Tsakogiannis, Alexandros; Taggart, John B.; Palaiokostas, Christos; Tsaparis, Dimitris; Lagnel, Jacques; Chatziplis, Dimitrios; Magoulas, Antonios; Papandroulakis, Nikos; Mylonas, Constantinos C.; Tsigenopoulos, Costas S.

    2015-01-01

    Common pandora (Pagellus erythrinus) is a benthopelagic marine fish belonging to the teleost family Sparidae, and a newly recruited species in Mediterranean aquaculture. The paucity of genetic information relating to sparids, despite their growing economic value for aquaculture, provides the impetus for exploring the genomics of this fish group. Genomic tool development, such as the provision of genetic linkage maps, lays the groundwork for linking genotype to phenotype, allowing fine-mapping of loci responsible for beneficial traits. In this study, we applied ddRAD methodology to identify polymorphic markers in a full-sib family of common pandora. Employing the Illumina MiSeq platform, we sampled and sequenced a size-selected genomic fraction of 99 individuals, which led to the identification of 920 polymorphic loci. Downstream mapping analysis resulted in the construction of 24 robust linkage groups, corresponding to the karyotype of the species. The common pandora linkage map showed varying degrees of conserved synteny with four other teleost genomes, namely the European seabass (Dicentrarchus labrax), Nile tilapia (Oreochromis niloticus), stickleback (Gasterosteus aculeatus), and medaka (Oryzias latipes), suggesting a conserved genomic evolution in Sparidae. Our work exploits the possibilities of genotyping by sequencing to gain novel insights into genome structure and evolution. Such information will boost the study of cultured species and will set the foundation for a deeper understanding of the complex evolutionary history of teleosts. PMID:26715088

  4. Use of methylation filtration and C(0)t fractionation for analysis of genome composition and comparative genomics in bread wheat.

    PubMed

    Bandopadhyay, Rajib; Rustgi, Sachin; Chaudhuri, Rajat Kanti; Khurana, Paramjit; Khurana, Jitendra Paul; Tyagi, Akhilesh Kumar; Balyan, Harindra Singh; Houben, Andreas; Gupta, Pushpendra Kumar

    2011-07-20

    We investigated the compositional and structural differences in sequences derived from different fractions of wheat genomic DNA obtained using methylation filtration and C(0)t fractionation. Comparative analysis of these sequences revealed large compositional and structural variations in terms of GC content and different structural elements, including repeat sequences (e.g., transposable elements and simple sequence repeats), protein-coding genes, and non-coding RNA genes. A correlation between the methylation status [determined on the basis of selective inclusion/exclusion in the methylation-filtered (MF) library] of different repeat elements and their expression level was observed. The expression levels were determined by comparing MF sequences with expressed sequence tags (ESTs) available in the public domain. Only a limited overlap among MF, high C(0)t (HC), and ESTs was observed, suggesting that these sequences may largely either represent the low-copy non-transcribed sequences or include genes with low expression levels. Thus, these results indicated a need to study MF and HC sequences along with ESTs to fully appreciate the complexity of the wheat gene space.

  5. Comparative Functional Genomic Analysis of Two Vibrio Phages Reveals Complex Metabolic Interactions with the Host Cell

    PubMed Central

    Skliros, Dimitrios; Kalatzis, Panos G.; Katharios, Pantelis; Flemetakis, Emmanouil

    2016-01-01

    Sequencing and annotation were performed for two large double-stranded DNA bacteriophages, φGrn1 and φSt2 of the Myoviridae family, considered to be of great interest for phage therapy against Vibrios in aquaculture live feeds. In addition, phage–host metabolic interactions and exploitation were studied by transcript profiling of selected viral and host genes. Comparative genomic analysis with other large Vibrio phages was also performed to establish the presence and location of homing endonucleases, highlighting distinct features for both phages. Phylogenetic analysis revealed that they belong to the “schizoT4like” clade. Although many reports of newly sequenced viruses have provided a large set of information, basic research related to the shift of the bacterial metabolism during infection remains stagnant. The function of many viral protein products in the process of infection is still unknown. Genome annotation identified the presence of several viral open reading frames (ORFs) participating in metabolism, including a Sir2/cobB (sirtuin) protein and a number of genes involved in auxiliary NAD+ and nucleotide biosynthesis, necessary for phage DNA replication. Key genes were subsequently selected for detailed study of their expression levels during infection. This work suggests a complex metabolic interaction and exploitation of the host metabolic pathways and biochemical processes, including a possible post-translational protein modification, by the virus during infection. PMID:27895630

  6. Comparative Functional Genomic Analysis of Two Vibrio Phages Reveals Complex Metabolic Interactions with the Host Cell.

    PubMed

    Skliros, Dimitrios; Kalatzis, Panos G; Katharios, Pantelis; Flemetakis, Emmanouil

    2016-01-01

    Sequencing and annotation were performed for two large double-stranded DNA bacteriophages, φGrn1 and φSt2 of the Myoviridae family, considered to be of great interest for phage therapy against Vibrios in aquaculture live feeds. In addition, phage-host metabolic interactions and exploitation were studied by transcript profiling of selected viral and host genes. Comparative genomic analysis with other large Vibrio phages was also performed to establish the presence and location of homing endonucleases, highlighting distinct features for both phages. Phylogenetic analysis revealed that they belong to the "schizoT4like" clade. Although many reports of newly sequenced viruses have provided a large set of information, basic research related to the shift of the bacterial metabolism during infection remains stagnant. The function of many viral protein products in the process of infection is still unknown. Genome annotation identified the presence of several viral open reading frames (ORFs) participating in metabolism, including a Sir2/cobB (sirtuin) protein and a number of genes involved in auxiliary NAD(+) and nucleotide biosynthesis, necessary for phage DNA replication. Key genes were subsequently selected for detailed study of their expression levels during infection. This work suggests a complex metabolic interaction and exploitation of the host metabolic pathways and biochemical processes, including a possible post-translational protein modification, by the virus during infection.

  7. A comparative genomic analysis of the alkalitolerant soil bacterium Bacillus lehensis G1.

    PubMed

    Noor, Yusuf Muhammad; Samsulrizal, Nurul Hidayah; Jema'on, Noor Azah; Low, Kheng Oon; Ramli, Aizi Nor Mazila; Alias, Noor Izawati; Damis, Siti Intan Rosdianah; Fuzi, Siti Fatimah Zaharah Mohd; Isa, Mohd Noor Mat; Murad, Abdul Munir Abdul; Raih, Mohd Firdaus Mohd; Bakar, Farah Diba Abu; Najimudin, Nazalan; Mahadi, Nor Muhammad; Illias, Rosli Md

    2014-07-25

    Bacillus lehensis G1 is a Gram-positive, moderately alkalitolerant bacterium isolated from soil samples. B. lehensis produces cyclodextrin glucanotransferase (CGTase), an enzyme that has enabled the extensive use of cyclodextrin in foodstuffs, chemicals, and pharmaceuticals. The genome sequence of B. lehensis G1 consists of a single circular 3.99 Mb chromosome containing 4017 protein-coding sequences (CDSs), of which 2818 (70.15%) have assigned biological roles, 936 (23.30%) have conserved domains with unknown functions, and 263 (6.55%) have no match with any protein database. Bacillus clausii KSM-K16 was established as the closest relative to B. lehensis G1 based on gene content similarity and 16S rRNA phylogenetic analysis. A total of 2820 proteins from B. lehensis G1 were found to have orthologues in B. clausii, including sodium-proton antiporters, transport proteins, and proteins involved in ATP synthesis. A comparative analysis of these proteins and those in B. clausii and other alkaliphilic Bacillus species was carried out to investigate their contributions towards the alkalitolerance of the microorganism. The similarities and differences in alkalitolerance-related genes among alkalitolerant/alkaliphilic Bacillus species highlight the complex mechanism of pH homeostasis. The B. lehensis G1 genome was also mined for proteins and enzymes with potential viability for industrial and commercial purposes.

  8. Genome-Wide Comparative Analysis of Chemosensory Gene Families in Five Tsetse Fly Species

    PubMed Central

    Macharia, Rosaline; Mireji, Paul; Murungi, Edwin; Murilla, Grace; Christoffels, Alan; Aksoy, Serap; Masiga, Daniel

    2016-01-01

    For decades, odour-baited traps have been used for control of tsetse flies (Diptera: Glossinidae), vectors of African trypanosomes. However, differential responses to known attractants have been reported in different Glossina species, hindering establishment of a universal vector control tool. Availability of full genome sequences of five Glossina species offers an opportunity to compare their chemosensory repertoire and enhance our understanding of their biology in relation to chemosensation. Here, we identified and annotated the major chemosensory gene families in Glossina. We identified a total of 118, 115, 124, and 123 chemosensory genes in Glossina austeni, G. brevipalpis, G. f. fuscipes, and G. pallidipes, respectively, relative to 127 reported in G. m. morsitans. Our results show that tsetse fly genomes have fewer chemosensory genes when compared to other dipterans such as Musca domestica (n>393), Drosophila melanogaster (n = 246) and Anopheles gambiae (n>247). We also found that Glossina chemosensory genes are dispersed across distantly located scaffolds in their respective genomes, in contrast to other insects like D. melanogaster whose genes occur in clusters. Further, Glossina appears to be devoid of sugar receptors and to have expanded CO2-associated receptors, potentially reflecting Glossina's obligate hematophagy and the need to detect hosts that may be out of sight. We also identified, in all species, homologs of Ir84a, a Drosophila-specific ionotropic receptor that promotes male courtship, suggesting that this is a conserved trait in tsetse flies. Notably, our selection analysis revealed that a total of four gene loci (Gr21a, GluRIIA, Gr28b, and Obp83a) were under positive selection, which confers a fitness advantage on species. These findings provide a platform for studies to further define the language of communication of tsetse with their environment, and to influence development of novel approaches for control. PMID:26886411

  9. Characterization and comparative genomic analysis of bacteriophages infecting members of the Bacillus cereus group.

    PubMed

    Lee, Ju-Hoon; Shin, Hakdong; Ryu, Sangryeol

    2014-05-01

    The Bacillus cereus group phages infecting B. cereus, B. anthracis, and B. thuringiensis (Bt) have been studied at the molecular level and, recently, at the genomic level to control the pathogens B. cereus and B. anthracis and to prevent phage contamination of the natural insect pesticide Bt. A comparative phylogenetic analysis has revealed three different major phage groups with different morphologies (Myoviridae for group I, Siphoviridae for group II, and Tectiviridae for group III), genome size (group I > group II > group III), and lifestyle (virulent for group I and temperate for group II and III). A subsequent phage genome comparison using a dot plot analysis showed that phages in each group are highly homologous, substantiating the grouping of B. cereus phages. Endolysin is a host lysis protein that contains two conserved domains: a cell-wall-binding domain (CBD) and an enzymatic activity domain (EAD). In B. cereus sensu lato phage group I, four different endolysin groups have been detected, according to combinations of two types of CBD and four types of EAD. Group I phages have two copies of tail lysins and one copy of endolysin, but the functions of the tail lysins are still unknown. In the B. cereus sensu lato phage group II, the B. anthracis phages have been studied and applied for typing and rapid detection of pathogenic host strains. In the B. cereus sensu lato phage group III, the B. thuringiensis phages Bam35 and GIL01 have been studied to understand phage entry and lytic switch regulation mechanisms. In this review, we suggest that further study of the B. cereus group phages would be useful for various phage applications, such as biocontrol, typing, and rapid detection of the pathogens B. cereus and B. anthracis and for the prevention of phage contamination of the natural insect pesticide Bt.

  10. Comparative Genomic Analysis of Sulfurospirillum cavolei MES Reconstructed from the Metagenome of an Electrosynthetic Microbiome

    PubMed Central

    Ross, Daniel E.; Marshall, Christopher W.; May, Harold D.; Norman, R. Sean

    2016-01-01

    Sulfurospirillum spp. play an important role in sulfur and nitrogen cycling, and contain metabolic versatility that enables reduction of a wide range of electron acceptors, including thiosulfate, tetrathionate, polysulfide, nitrate, and nitrite. Here we describe the assembly of a Sulfurospirillum genome obtained from the metagenome of an electrosynthetic microbiome. The ubiquity and persistence of this organism in microbial electrosynthesis systems suggest it plays an important role in reactor stability and performance. Understanding why this organism is present and elucidating its genetic repertoire provide a genomic and ecological foundation for future studies where Sulfurospirillum are found, especially in electrode-associated communities. Metabolic comparisons and in-depth analysis of unique genes revealed potential ecological niche-specific capabilities within the Sulfurospirillum genus. The functional similarities common to all genomes, i.e., core genome, and unique gene clusters found only in a single genome were identified. Based upon 16S rRNA gene phylogenetic analysis and average nucleotide identity, the Sulfurospirillum draft genome was found to be most closely related to Sulfurospirillum cavolei. Characterization of the draft genome described herein provides pathway-specific details of the metabolic significance of the newly described Sulfurospirillum cavolei MES and, importantly, yields insight to the ecology of the genus as a whole. Comparison of eleven sequenced Sulfurospirillum genomes revealed a total of 6246 gene clusters in the pan-genome. Of the total gene clusters, 18.5% were shared among all eleven genomes and 50% were unique to a single genome. While most Sulfurospirillum spp. reduce nitrate to ammonium, five of the eleven Sulfurospirillum strains encode for a nitrous oxide reductase (nos) cluster with an atypical nitrous-oxide reductase, suggesting a utility for this genus in reduction of the nitrous oxide, and as a potential sink for this
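
The core/unique split quoted in this record (clusters shared by all eleven genomes versus clusters found in only one) can be illustrated with a toy presence/absence table. This is a minimal sketch and not the authors' pipeline: the function name, cluster ids, and genome labels are hypothetical, and real analyses first have to build the clusters from sequence similarity.

```python
def partition_pan_genome(presence):
    """Classify gene clusters by how many genomes carry them.

    presence: dict mapping a cluster id to the set of genome names
    (hypothetical labels) in which that cluster occurs.
    Returns (counts, n_genomes).
    """
    genomes = set().union(*presence.values())
    n_genomes = len(genomes)
    counts = {"core": 0, "accessory": 0, "unique": 0}
    for members in presence.values():
        if len(members) == n_genomes:
            counts["core"] += 1        # present in every genome
        elif len(members) == 1:
            counts["unique"] += 1      # present in exactly one genome
        else:
            counts["accessory"] += 1   # present in some, not all
    return counts, n_genomes


# Toy input: four clusters across three made-up genomes A, B, C.
toy = {
    "c1": {"A", "B", "C"},
    "c2": {"A", "B", "C"},
    "c3": {"A", "B"},
    "c4": {"C"},
}
summary, n_genomes = partition_pan_genome(toy)
total = sum(summary.values())
for label, count in summary.items():
    print(f"{label}: {count} of {total} clusters ({count / total:.1%})")
```

Dividing each count by the total number of clusters gives percentages of the kind reported in the abstract.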

  11. In silico comparative genome analysis of malaria parasite Plasmodium falciparum and Plasmodium vivax chromosome 4.

    PubMed

    Taherian Fard, Atefeh; Salman, Amna; Kazemi, Bahram; Bokhari, Habib

    2009-06-01

    The malarial parasite has long been a subject of research for a large community of scientists and has yet to be conquered. One of the main obstacles to effective control of this disease is the rapidly evolving genetic structure of the Plasmodium parasite itself. In this study, we focused on chromosome 4 of the Plasmodium falciparum and Plasmodium vivax species and carried out comparative studies of genes that are responsible for antigenic variation in the respective species. Comparative analysis of genes responsible for antigenic variation (var and vir genes in P. falciparum and P. vivax, respectively) showed significant differences in their respective nucleotide sequence lengths as well as amino acid composition. The possible association of exon length with the pathogenicity of the respective Plasmodium species was also investigated, and analysis of gene structure showed that, on the whole, exon lengths in P. falciparum are larger than in P. vivax. Analysis of tandem repeats across the genome has shown that the size of repetitive sequences has a direct effect on chromosome length, which can also be a potential reason for P. falciparum's greater variability and hence pathogenicity than P. vivax.

  12. Comparative Genome Analysis of Three Thiocyanate Oxidizing Thioalkalivibrio Species Isolated from Soda Lakes.

    PubMed

    Berben, Tom; Overmars, Lex; Sorokin, Dimitry Y; Muyzer, Gerard

    2017-01-01

    Thiocyanate is a C1 compound containing carbon, nitrogen, and sulfur. It is a (by)product in a number of natural and industrial processes. Because thiocyanate is toxic to many organisms, including humans, its removal from industrial waste streams is an important problem. Although a number of bacteria can use thiocyanate as a nitrogen source, only a few can use it as an electron donor. There are two distinct pathways to use thiocyanate: (i) the "carbonyl sulfide pathway," which has been extensively studied, and (ii) the "cyanate pathway," whose key enzyme, thiocyanate dehydrogenase, was recently purified and studied. Three species of Thioalkalivibrio, a group of haloalkaliphilic sulfur-oxidizing bacteria isolated from soda lakes, have been described as thiocyanate oxidizers: (i) Thioalkalivibrio paradoxus ("cyanate pathway"), (ii) Thioalkalivibrio thiocyanoxidans ("cyanate pathway") and (iii) Thioalkalivibrio thiocyanodenitrificans ("carbonyl sulfide pathway"). In this study we provide a comparative genome analysis of these described thiocyanate oxidizers, with genomes ranging in size from 2.5 to 3.8 million base pairs. While focusing on thiocyanate degradation, we also analyzed the differences in sulfur, carbon, and nitrogen metabolism. We found that the thiocyanate dehydrogenase gene is present in 10 different Thioalkalivibrio strains, in two distinct genomic contexts/genotypes. The first genotype is defined by having genes for flavocytochrome c sulfide dehydrogenase upstream from the thiocyanate dehydrogenase operon (present in two strains including the type strain of Tv. paradoxus), whereas in the second genotype these genes are located downstream, together with two additional genes of unknown function (present in eight strains, including the type strains of Tv. thiocyanoxidans). Additionally, we found differences in the presence/absence of genes for various sulfur oxidation pathways, such as sulfide:quinone oxidoreductase, dissimilatory sulfite reductase, and

  13. Comparative Genome Analysis of Three Thiocyanate Oxidizing Thioalkalivibrio Species Isolated from Soda Lakes

    PubMed Central

    Berben, Tom; Overmars, Lex; Sorokin, Dimitry Y.; Muyzer, Gerard

    2017-01-01

    Thiocyanate is a C1 compound containing carbon, nitrogen, and sulfur. It is a (by)product in a number of natural and industrial processes. Because thiocyanate is toxic to many organisms, including humans, its removal from industrial waste streams is an important problem. Although a number of bacteria can use thiocyanate as a nitrogen source, only a few can use it as an electron donor. There are two distinct pathways to use thiocyanate: (i) the “carbonyl sulfide pathway,” which has been extensively studied, and (ii) the “cyanate pathway,” whose key enzyme, thiocyanate dehydrogenase, was recently purified and studied. Three species of Thioalkalivibrio, a group of haloalkaliphilic sulfur-oxidizing bacteria isolated from soda lakes, have been described as thiocyanate oxidizers: (i) Thioalkalivibrio paradoxus (“cyanate pathway”), (ii) Thioalkalivibrio thiocyanoxidans (“cyanate pathway”) and (iii) Thioalkalivibrio thiocyanodenitrificans (“carbonyl sulfide pathway”). In this study we provide a comparative genome analysis of these described thiocyanate oxidizers, with genomes ranging in size from 2.5 to 3.8 million base pairs. While focusing on thiocyanate degradation, we also analyzed the differences in sulfur, carbon, and nitrogen metabolism. We found that the thiocyanate dehydrogenase gene is present in 10 different Thioalkalivibrio strains, in two distinct genomic contexts/genotypes. The first genotype is defined by having genes for flavocytochrome c sulfide dehydrogenase upstream from the thiocyanate dehydrogenase operon (present in two strains including the type strain of Tv. paradoxus), whereas in the second genotype these genes are located downstream, together with two additional genes of unknown function (present in eight strains, including the type strains of Tv. thiocyanoxidans). Additionally, we found differences in the presence/absence of genes for various sulfur oxidation pathways, such as sulfide:quinone oxidoreductase, dissimilatory

  14. Comparative analysis of genomic data: A global look at structural and regulatory features

    SciTech Connect

    Michaels, G.S.; Taylor, R.; Hagstrom, R.; Price, M.; Overbeek, R.

    1993-12-31

    One of the goals of any large scale DNA sequencing project is to understand the molecular details about the metabolic control sites that will be found in the sequence of the chromosome region being studied. In addition, once an interesting observation has been made, questions will quickly arise concerning the distribution of such sites within the genome and how well the same observations hold between related species. This paper will discuss the authors' approach toward building a flexible analysis environment that facilitates the analysis of genomic sequence data. The Integrated Genomic Database (IGD), developed by Ray Hagstrom, Ross Overbeek, Morgan Price and Dave Zawada at the Argonne National Laboratory, organizes genome mapping and sequencing data to provide a global chromosome view for multiple genomes. The authors describe here their use of the IGD system and how they employ it for relational analysis of sequence features that are found distributed throughout the genome under study. The primary goal of this work is to provide a system to support research on the global organization of genomic regulation patterns.

  15. Genome sequence, comparative analysis and haplotype structure of the domestic dog.

    PubMed

    Lindblad-Toh, Kerstin; Wade, Claire M; Mikkelsen, Tarjei S; Karlsson, Elinor K; Jaffe, David B; Kamal, Michael; Clamp, Michele; Chang, Jean L; Kulbokas, Edward J; Zody, Michael C; Mauceli, Evan; Xie, Xiaohui; Breen, Matthew; Wayne, Robert K; Ostrander, Elaine A; Ponting, Chris P; Galibert, Francis; Smith, Douglas R; DeJong, Pieter J; Kirkness, Ewen; Alvarez, Pablo; Biagi, Tara; Brockman, William; Butler, Jonathan; Chin, Chee-Wye; Cook, April; Cuff, James; Daly, Mark J; DeCaprio, David; Gnerre, Sante; Grabherr, Manfred; Kellis, Manolis; Kleber, Michael; Bardeleben, Carolyne; Goodstadt, Leo; Heger, Andreas; Hitte, Christophe; Kim, Lisa; Koepfli, Klaus-Peter; Parker, Heidi G; Pollinger, John P; Searle, Stephen M J; Sutter, Nathan B; Thomas, Rachael; Webber, Caleb; Baldwin, Jennifer; Abebe, Adal; Abouelleil, Amr; Aftuck, Lynne; Ait-Zahra, Mostafa; Aldredge, Tyler; Allen, Nicole; An, Peter; Anderson, Scott; Antoine, Claudel; Arachchi, Harindra; Aslam, Ali; Ayotte, Laura; Bachantsang, Pasang; Barry, Andrew; Bayul, Tashi; Benamara, Mostafa; Berlin, Aaron; Bessette, Daniel; Blitshteyn, Berta; Bloom, Toby; Blye, Jason; Boguslavskiy, Leonid; Bonnet, Claude; Boukhgalter, Boris; Brown, Adam; Cahill, Patrick; Calixte, Nadia; Camarata, Jody; Cheshatsang, Yama; Chu, Jeffrey; Citroen, Mieke; Collymore, Alville; Cooke, Patrick; Dawoe, Tenzin; Daza, Riza; Decktor, Karin; DeGray, Stuart; Dhargay, Norbu; Dooley, Kimberly; Dooley, Kathleen; Dorje, Passang; Dorjee, Kunsang; Dorris, Lester; Duffey, Noah; Dupes, Alan; Egbiremolen, Osebhajajeme; Elong, Richard; Falk, Jill; Farina, Abderrahim; Faro, Susan; Ferguson, Diallo; Ferreira, Patricia; Fisher, Sheila; FitzGerald, Mike; Foley, Karen; Foley, Chelsea; Franke, Alicia; Friedrich, Dennis; Gage, Diane; Garber, Manuel; Gearin, Gary; Giannoukos, Georgia; Goode, Tina; Goyette, Audra; Graham, Joseph; Grandbois, Edward; Gyaltsen, Kunsang; Hafez, Nabil; Hagopian, Daniel; Hagos, Birhane; Hall, Jennifer; Healy, Claire; Hegarty, Ryan; Honan, Tracey; Horn, Andrea; Houde, Nathan; Hughes, Leanne; Hunnicutt, Leigh; Husby, M; Jester, Benjamin; Jones, Charlien; Kamat, Asha; Kanga, Ben; Kells, Cristyn; Khazanovich, Dmitry; Kieu, Alix Chinh; Kisner, Peter; Kumar, Mayank; Lance, Krista; Landers, Thomas; Lara, Marcia; Lee, William; Leger, Jean-Pierre; Lennon, Niall; Leuper, Lisa; LeVine, Sarah; Liu, Jinlei; Liu, Xiaohong; Lokyitsang, Yeshi; Lokyitsang, Tashi; Lui, Annie; Macdonald, Jan; Major, John; Marabella, Richard; Maru, Kebede; Matthews, Charles; McDonough, Susan; Mehta, Teena; Meldrim, James; Melnikov, Alexandre; Meneus, Louis; Mihalev, Atanas; Mihova, Tanya; Miller, Karen; Mittelman, Rachel; Mlenga, Valentine; Mulrain, Leonidas; Munson, Glen; Navidi, Adam; Naylor, Jerome; Nguyen, Tuyen; Nguyen, Nga; Nguyen, Cindy; Nguyen, Thu; Nicol, Robert; Norbu, Nyima; Norbu, Choe; Novod, Nathaniel; Nyima, Tenchoe; Olandt, Peter; O'Neill, Barry; O'Neill, Keith; Osman, Sahal; Oyono, Lucien; Patti, Christopher; Perrin, Danielle; Phunkhang, Pema; Pierre, Fritz; Priest, Margaret; Rachupka, Anthony; Raghuraman, Sujaa; Rameau, Rayale; Ray, Verneda; Raymond, Christina; Rege, Filip; Rise, Cecil; Rogers, Julie; Rogov, Peter; Sahalie, Julie; Settipalli, Sampath; Sharpe, Theodore; Shea, Terrance; Sheehan, Mechele; Sherpa, Ngawang; Shi, Jianying; Shih, Diana; Sloan, Jessie; Smith, Cherylyn; Sparrow, Todd; Stalker, John; Stange-Thomann, Nicole; Stavropoulos, Sharon; Stone, Catherine; Stone, Sabrina; Sykes, Sean; Tchuinga, Pierre; Tenzing, Pema; Tesfaye, Senait; Thoulutsang, Dawa; Thoulutsang, Yama; Topham, Kerri; Topping, Ira; Tsamla, Tsamla; Vassiliev, Helen; Venkataraman, Vijay; Vo, Andy; Wangchuk, Tsering; Wangdi, Tsering; Weiand, Michael; Wilkinson, Jane; Wilson, Adam; Yadav, Shailendra; Yang, Shuli; Yang, Xiaoping; Young, Geneva; Yu, Qing; Zainoun, Joanne; Zembek, Lisa; Zimmer, Andrew; Lander, Eric S

    2005-12-08

    Here we report a high-quality draft genome sequence of the domestic dog (Canis familiaris), together with a dense map of single nucleotide polymorphisms (SNPs) across breeds. The dog is of particular interest because it provides important evolutionary information and because existing breeds show great phenotypic diversity for morphological, physiological and behavioural traits. We use sequence comparison with the primate and rodent lineages to shed light on the structure and evolution of genomes and genes. Notably, the majority of the most highly conserved non-coding sequences in mammalian genomes are clustered near a small subset of genes with important roles in development. Analysis of SNPs reveals long-range haplotypes across the entire dog genome, and defines the nature of genetic diversity within and across breeds. The current SNP map now makes it possible for genome-wide association studies to identify genes responsible for diseases and traits, with important consequences for human and companion animal health.

  16. Customized Array Comparative Genomic Hybridization Analysis of 25 Phosphatase-encoding Genes in Colorectal Cancer Tissues

    PubMed Central

    Laczmanska, Izabela; Skiba, Pawel; Karpinski, Pawel; Bebenek, Marek; Sasiadek, Maria M.

    2016-01-01

    Background/Aim: Molecular mechanisms of alterations in protein tyrosine phosphatases (PTPs) genes in cancer have been previously described and include chromosomal aberrations, gene mutations, and epigenetic silencing. However, little is known about small intragenic gains and losses that may lead to either changes in expression or enzyme activity and even loss of protein function. Materials and Methods: The aim of this study was to investigate 25 phosphatase genes using customized array comparative genomic hybridization in 16 sporadic colorectal cancer tissues. Results: The analysis revealed two unique small alterations: of 2 kb in PTPN14 intron 1 and of 1 kb in PTPRJ intron 1. We also found gains and losses of whole PTPs gene sequences covered by large chromosome aberrations. Conclusion: In our preliminary studies using high-resolution custom microarray we confirmed that PTPs are frequently subjected to whole-gene rearrangements in colorectal cancer, and we revealed that non-polymorphic intragenic changes are rare. PMID:28031238

  17. Novel proteases from the genome of the carnivorous plant Drosera capensis: Structural prediction and comparative analysis.

    PubMed

    Butts, Carter T; Bierma, Jan C; Martin, Rachel W

    2016-10-01

    In his 1875 monograph on insectivorous plants, Darwin described the feeding reactions of Drosera flypaper traps and predicted that their secretions contained a "ferment" similar to mammalian pepsin, an aspartic protease. Here we report a high-quality draft genome sequence for the cape sundew, Drosera capensis, the first genome of a carnivorous plant from order Caryophyllales, which also includes the Venus flytrap (Dionaea) and the tropical pitcher plants (Nepenthes). This species was selected in part for its hardiness and ease of cultivation, making it an excellent model organism for further investigations of plant carnivory. Analysis of predicted protein sequences yields genes encoding proteases homologous to those found in other plants, some of which display sequence and structural features that suggest novel functionalities. Because the sequence similarity to proteins of known structure is in most cases too low for traditional homology modeling, 3D structures of representative proteases are predicted using comparative modeling with all-atom refinement. Although the overall folds and active residues for these proteins are conserved, we find structural and sequence differences consistent with a diversity of substrate recognition patterns. Finally, we predict differences in substrate specificities using in silico experiments, providing targets for structure/function studies of novel enzymes with biological and technological significance. Proteins 2016; 84:1517-1533. © 2016 Wiley Periodicals, Inc.

  18. The (d)evolution of methanotrophy in the Beijerinckiaceae—a comparative genomics analysis

    PubMed Central

    Tamas, Ivica; Smirnova, Angela V; He, Zhiguo; Dunfield, Peter F

    2014-01-01

    The alphaproteobacterial family Beijerinckiaceae contains generalists that grow on a wide range of substrates, and specialists that grow only on methane and methanol. We investigated the evolution of this family by comparing the genomes of the generalist organotroph Beijerinckia indica, the facultative methanotroph Methylocella silvestris and the obligate methanotroph Methylocapsa acidiphila. Highly resolved phylogenetic construction based on universally conserved genes demonstrated that the Beijerinckiaceae forms a monophyletic cluster with the Methylocystaceae, the only other family of alphaproteobacterial methanotrophs. Phylogenetic analyses also demonstrated a vertical inheritance pattern of methanotrophy and methylotrophy genes within these families. Conversely, many lateral gene transfer (LGT) events were detected for genes encoding carbohydrate transport and metabolism, energy production and conversion, and transcriptional regulation in the genome of B. indica, suggesting that it has recently acquired these genes. A key difference between the generalist B. indica and its specialist methanotrophic relatives was an abundance of transporter elements, particularly periplasmic-binding proteins and major facilitator transporters. The most parsimonious scenario for the evolution of methanotrophy in the Alphaproteobacteria is that it occurred only once, when a methylotroph acquired methane monooxygenases (MMOs) via LGT. This was supported by a compositional analysis suggesting that all MMOs in Alphaproteobacteria methanotrophs are foreign in origin. Some members of the Beijerinckiaceae subsequently lost methanotrophic functions and regained the ability to grow on multicarbon energy substrates. We conclude that B. indica is a recidivist multitroph, the only known example of a bacterium having completely abandoned an evolved lifestyle of specialized methanotrophy. PMID:23985741

  19. Emergence and evolutionary analysis of the human DDR network: implications in comparative genomics and downstream analyses.

    PubMed

    Arcas, Aida; Fernández-Capetillo, Oscar; Cases, Ildefonso; Rojas, Ana M

    2014-04-01

    The DNA damage response (DDR) is a crucial signaling network that preserves the integrity of the genome. This network is an ensemble of distinct but often overlapping subnetworks, where different components fulfill distinct functions in precise spatial and temporal scenarios. To understand how these elements have been assembled together in humans, we performed comparative genomic analyses in 47 selected species to trace back their emergence using systematic phylogenetic analyses and estimated gene ages. The emergence of the contribution of posttranslational modifications to the complex regulation of DDR was also investigated. This is the first time a systematic analysis has focused on the evolution of DDR subnetworks as a whole. Our results indicate that a DDR core, mostly constructed around metabolic activities, appeared soon after the emergence of eukaryotes, and that additional regulatory capacities appeared later through complex evolutionary processes. Potential key posttranslational modifications were also in place then, with interacting pairs preferentially appearing at the same evolutionary time, although modifications often led to the subsequent acquisition of new targets. We also found extensive gene loss in essential modules of the regulatory network in fungi, plants, and arthropods, important for their validation as model organisms for DDR studies.

  20. Comparative genomic analysis and phenazine production of Pseudomonas chlororaphis, a plant growth-promoting rhizobacterium

    PubMed Central

    Chen, Yawen; Shen, Xuemei; Peng, Huasong; Hu, Hongbo; Wang, Wei; Zhang, Xuehong

    2015-01-01

    Pseudomonas chlororaphis HT66, a plant growth-promoting rhizobacterium that produces phenazine-1-carboxamide with high yield, was compared with three genomic sequenced P. chlororaphis strains, GP72, 30–84 and O6. The genome sizes of four strains vary from 6.66 to 7.30 Mb. Comparisons of predicted coding sequences indicated 4833 conserved genes in 5869–6455 protein-encoding genes. Phylogenetic analysis showed that the four strains are closely related to each other. Its competitive colonization indicates that P. chlororaphis can adapt well to its environment. No virulence or virulence-related factor was found in P. chlororaphis. All of the four strains could synthesize antimicrobial metabolites including different phenazines and insecticidal protein FitD. Some genes related to the regulation of phenazine biosynthesis were detected among the four strains. It was shown that P. chlororaphis is a safe PGPR in agricultural application and could also be used to produce some phenazine antibiotics with high-yield. PMID:26484173
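
As a worked illustration of the figures quoted in this record, the share of each genome made up by conserved genes can be estimated from the reported counts. This is a minimal sketch under a stated assumption: the 4833 conserved genes are taken as present once in every strain, and only the two endpoints of the 5869 to 6455 range are used because the abstract does not give per-strain totals. The function name is hypothetical.

```python
def core_fraction(core_genes, total_genes):
    """Return the fraction of a genome's protein-coding genes that are conserved."""
    if total_genes <= 0:
        raise ValueError("total_genes must be positive")
    return core_genes / total_genes


CORE_GENES = 4833        # conserved genes reported in the abstract
SMALLEST_GENOME = 5869   # fewest protein-encoding genes among the four strains
LARGEST_GENOME = 6455    # most protein-encoding genes among the four strains

low = core_fraction(CORE_GENES, LARGEST_GENOME)
high = core_fraction(CORE_GENES, SMALLEST_GENOME)
print(f"Conserved genes make up roughly {low:.1%} to {high:.1%} of each genome")
```

Under that assumption, roughly three quarters or more of each strain's protein-coding genes fall in the conserved set, which is consistent with the abstract's statement that the four strains are closely related.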

  1. Comparative Analysis of Lacinutrix Genomes and Their Association with Bacterial Habitat

    PubMed Central

    Lee, Yung Mi; Kim, Mi-Kyeong; Ahn, Do Hwan; Kim, Han-Woo; Park, Hyun; Shin, Seung Chul

    2016-01-01

    The genus Lacinutrix, which belongs to the family Flavobacteriaceae, consists of seven bacterial species that were mainly isolated from marine life and sediments. As most bacteria in the family Flavobacteriaceae favor aerobic conditions, the seven bacterial species in the genus Lacinutrix also showed aerobic growth. We selected four monophyletic bacterial species living in a polar environment. Two of these species were isolated from sediment and two types were isolated from algae. In a comparative analysis, we investigated how these different environments were related to genomic features of these four species in the genus Lacinutrix. We found that the gene sets for glycolysis, the Krebs cycle, and oxidative phosphorylation were conserved in these four type strains. However, the presence of nitrous oxide reductase for denitrification and the absence of essential components related to thiamin biosynthesis for aerobic respiration were only found in isolates from sediment. Elevated bacterial metabolism on the surface of marine sediments might limit the oxygen penetration into sediment, and such an environment might affect the genomes of bacteria isolated from these habitats. PMID:26882010

  2. Comparative genomics and transcriptome analysis of Aspergillus niger and metabolic engineering for citrate production

    PubMed Central

    Yin, Xian; Shin, Hyun-dong; Li, Jianghua; Du, Guocheng; Liu, Long; Chen, Jian

    2017-01-01

    Despite a long and successful history of citrate production in Aspergillus niger, the molecular mechanism of citrate accumulation is only partially understood. In this study, we used comparative genomics and transcriptome analysis of citrate-producing strains—namely, A. niger H915-1 (citrate titer: 157 g L−1), A1 (117 g L−1), and L2 (76 g L−1)—to gain a genome-wide view of the mechanism of citrate accumulation. Compared with A. niger A1 and L2, A. niger H915-1 contained 92 mutated genes, including a succinate-semialdehyde dehydrogenase in the γ-aminobutyric acid shunt pathway and an aconitase family protein involved in citrate synthesis. Furthermore, transcriptome analysis of A. niger H915-1 revealed that the transcription levels of 479 genes changed between the cell growth stage (6 h) and the citrate synthesis stage (12 h, 24 h, 36 h, and 48 h). In the glycolysis pathway, triosephosphate isomerase was up-regulated, whereas pyruvate kinase was down-regulated. Two cytosol ATP-citrate lyases, which take part in the cycle of citrate synthesis, were up-regulated, and may coordinate with the alternative oxidases in the alternative respiratory pathway for energy balance. Finally, deletion of the oxaloacetate acetylhydrolase gene in H915-1 eliminated oxalate formation, but no effect on the pH decrease and no difference in citrate production were observed. PMID:28106122

  3. Comparative genome analysis of lignin biosynthesis gene families across the plant kingdom

    PubMed Central

    2009-01-01

    Background: As a major component of plant cell wall, lignin plays important roles in mechanical support, water transport, and stress responses. As the main cause for the recalcitrance of plant cell wall, lignin modification has been a major task for bioenergy feedstock improvement. The study of the evolution and function of lignin biosynthesis genes thus has two-fold implications. First, the lignin biosynthesis pathway provides an excellent model to study the coordinative evolution of a biochemical pathway in plants. Second, understanding the function and evolution of lignin biosynthesis genes will guide us to develop better strategies for bioenergy feedstock improvement. Results: We analyzed lignin biosynthesis genes from fourteen plant species and one symbiotic fungal species. Comprehensive comparative genome analysis was carried out to study the distribution, relatedness, and family expansion of the lignin biosynthesis genes across the plant kingdom. In addition, we also analyzed the comparative synteny map between rice and sorghum to study the evolution of lignin biosynthesis genes within the Poaceae family and the chromosome evolution between the two species. Comprehensive lignin biosynthesis gene expression analysis was performed in rice, poplar and Arabidopsis. The representative data from rice indicates that different fates of gene duplications exist for lignin biosynthesis genes. In addition, we also carried out the biomass composition analysis of nine Arabidopsis mutants with both MBMS analysis and traditional wet chemistry methods. The results were analyzed together with the genomics analysis. Conclusion: The research revealed that, among the species analyzed, the complete lignin biosynthesis pathway first appeared in moss; the pathway is absent in green algae. The expansion of lignin biosynthesis gene families correlates with substrate diversity. In addition, we found that the expansion of the gene families mostly occurred after the divergence of monocots

  4. Comparative genomics of nematodes.

    PubMed

    Mitreva, Makedonka; Blaxter, Mark L; Bird, David M; McCarter, James P

    2005-10-01

    Recent transcriptome and genome projects have dramatically expanded the biological data available across the phylum Nematoda. Here we summarize analyses of these sequences, which have revealed multiple unexpected results. Despite a uniform body plan, nematodes are more diverse at the molecular level than was previously recognized, with many species- and group-specific novel genes. In the genus Caenorhabditis, changes in chromosome arrangement, particularly local inversions, are also rapid, with breakpoints occurring at 50-fold the rate in vertebrates. Tylenchid plant parasitic nematode genomes contain several genes closely related to genes in bacteria, implicating horizontal gene transfer events in the origins of plant parasitism. Functional genomics techniques are also moving from Caenorhabditis elegans to application throughout the phylum. Soon, eight more draft nematode genome sequences will be available. This unique resource will underpin both molecular understanding of these most abundant metazoan organisms and aid in the examination of the dynamics of genome evolution in animals.

  5. Comparative genomic analysis of clinical and environmental Vibrio vulnificus isolates revealed biotype 3 evolutionary relationships

    PubMed Central

    Koton, Yael; Gordon, Michal; Chalifa-Caspi, Vered; Bisharat, Naiel

    2015-01-01

    In 1996 a common-source outbreak of severe soft tissue and bloodstream infections erupted among Israeli fish farmers and fish consumers due to changes in fish marketing policies. The causative pathogen was a new strain of Vibrio vulnificus, named biotype 3, which displayed a unique biochemical and genotypic profile. Initial observations suggested that the pathogen erupted as a result of genetic recombination between two distinct populations. We applied a whole genome shotgun sequencing approach using several V. vulnificus strains from Israel in order to study the pan genome of V. vulnificus and determine the phylogenetic relationship of biotype 3 with existing populations. The core genome of V. vulnificus based on 16 draft and complete genomes consisted of 3068 genes, representing between 59 and 78% of the whole genome of 16 strains. The accessory genome varied in size from 781 to 2044 kbp. Phylogenetic analysis based on whole, core, and accessory genomes displayed similar clustering patterns with two main clusters, clinical (C) and environmental (E); all biotype 3 strains formed a distinct group within the E cluster. Annotation of accessory genomic regions found in biotype 3 strains and absent from the core genome yielded 1732 genes, of which the vast majority encoded hypothetical proteins, phage-related proteins, and mobile element proteins. A total of 1916 proteins (including 713 hypothetical proteins) were present in all human pathogenic strains (both biotype 3 and non-biotype 3) and absent from the environmental strains. Clustering analysis of the non-hypothetical proteins revealed 148 protein clusters shared by all human pathogenic strains; these included transcriptional regulators, arylsulfatases, methyl-accepting chemotaxis proteins, acetyltransferases, GGDEF family proteins, transposases, type IV secretory system (T4SS) proteins, and integrases. Our study showed that V. vulnificus biotype 3 evolved from environmental populations and formed a genetically

  6. Comparative analysis of fungal genomes reveals different plant cell wall degrading capacity in fungi

    PubMed Central

    2013-01-01

    Background Fungi produce a variety of carbohydrate-active enzymes (CAZymes) for the degradation of plant polysaccharide materials to facilitate infection and/or gain nutrition. Identifying and comparing CAZymes from fungi with different nutritional modes or infection mechanisms may provide information for better understanding of their lifestyles and infection models. To date, hundreds of fungal genomes are publicly available. However, a systematic comparative analysis of fungal CAZymes across the entire fungal kingdom has not been reported. Results In this study, we systematically identified glycoside hydrolases (GHs), polysaccharide lyases (PLs), carbohydrate esterases (CEs), and glycosyltransferases (GTs) as well as carbohydrate-binding modules (CBMs) in the predicted proteomes of 103 representative fungi from Ascomycota, Basidiomycota, Chytridiomycota, and Zygomycota. Comparative analysis of these CAZymes that play major roles in plant polysaccharide degradation revealed that fungi exhibit tremendous diversity in the number and variety of CAZymes. Among them, some families of GHs and CEs are the most prevalent CAZymes that are distributed in all of the fungi analyzed. Importantly, cellulases of some GH families are present in fungi that are not known to have cellulose-degrading ability. In addition, our results also showed that in general, plant pathogenic fungi have the highest number of CAZymes. Biotrophic fungi tend to have fewer CAZymes than necrotrophic and hemibiotrophic fungi. Pathogens of dicots often contain more pectinases than fungi infecting monocots. Interestingly, besides yeasts, many saprophytic fungi that are highly active in degrading plant biomass contain fewer CAZymes than plant pathogenic fungi. Furthermore, analysis of the gene expression profile of the wheat scab fungus Fusarium graminearum revealed that most of the CAZyme genes related to cell wall degradation were up-regulated during plant infection. Phylogenetic analysis also

  7. Comparative Analysis of the Complete Chloroplast Genomes of Five Quercus Species

    PubMed Central

    Yang, Yanci; Zhou, Tao; Duan, Dong; Yang, Jia; Feng, Li; Zhao, Guifang

    2016-01-01

    Quercus is considered economically and ecologically one of the most important genera in the Northern Hemisphere. Oaks are taxonomically perplexing because of shared interspecific morphological traits and intraspecific morphological variation, which are mainly attributed to hybridization. Universal plastid markers cannot provide a sufficient number of variable sites to explore the phylogeny of this genus, and chloroplast genome-scale data have proven to be useful in resolving intractable phylogenetic relationships. In this study, the complete chloroplast genomes of four Quercus species were sequenced, and one published chloroplast genome of Quercus baronii was retrieved for comparative analyses. The five chloroplast genomes ranged from 161,072 bp (Q. baronii) to 161,237 bp (Q. dolicholepis) in length, and their gene organization and order, and GC content, were similar to those of other Fagaceae species. We analyzed nucleotide substitutions, indels, and repeats in the chloroplast genomes, and found 19 relatively highly variable regions that will potentially provide plastid markers for further taxonomic and phylogenetic studies within Quercus. We observed that four genes (ndhA, ndhK, petA, and ycf1) were subject to positive selection. The phylogenetic relationships of the Quercus species inferred from the chloroplast genomes obtained moderate-to-high support, indicating that chloroplast genome data may be useful in resolving relationships in this genus. PMID:27446185
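
    As an illustration of one of the summary statistics mentioned in this abstract, GC content can be computed directly from a sequence string. The following is a minimal Python sketch, not the authors' pipeline; the function name gc_content and the toy sequence are assumptions introduced here for illustration.

```python
def gc_content(sequence: str) -> float:
    """Return the GC fraction (0.0 to 1.0) of a nucleotide sequence.

    Only unambiguous bases (A, C, G, T) are counted; N and other
    ambiguity codes are ignored in both numerator and denominator.
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return gc / total


# Toy sequence for illustration only (not taken from any Quercus genome).
toy = "ATGCGCNNTA"
print(f"GC content: {gc_content(toy):.2%}")
```

    Ignoring ambiguous bases is one possible convention; other tools may count them in the denominator, so values can differ slightly between programs.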

  8. Comparative Genomic Analysis of Mycobacterium avium subspecies Obtained from Multiple Host Species

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A comparative genomic approach was used to identify large sequence polymorphisms among Mycobacterium avium (M. avium) subspecies obtained from a variety of host animals. DNA microarrays were used as a platform for comparing mycobacterial isolates with the sequenced bovine isolate M. avium subsp. p...

  9. Characterization of the Xanthomonas translucens Complex Using Draft Genomes, Comparative Genomics, Phylogenetic Analysis, and Diagnostic LAMP Assays.

    PubMed

    Langlois, Paul A; Snelling, Jacob; Hamilton, John P; Bragard, Claude; Koebnik, Ralf; Verdier, Valérie; Triplett, Lindsay R; Blom, Jochen; Tisserat, Ned A; Leach, Jan E

    2017-03-21

    Prevalence of Xanthomonas translucens, which causes cereal leaf streak (CLS) in cereal crops and bacterial wilt in forage and turfgrass species, has increased in many regions in recent years. Because the pathogen is seedborne in economically important cereals, it is a concern for international and interstate germplasm exchange and, thus, reliable and robust protocols for its detection in seed are needed. However, historical confusion surrounding the taxonomy within the species has complicated the development of accurate and reliable diagnostic tools for X. translucens. Therefore, we sequenced genomes of 15 X. translucens strains representing six different pathovars and compared them with additional publicly available X. translucens genome sequences to obtain a genome-based phylogeny for robust classification of this species. Our results reveal three main clusters: one consisting of pv. cerealis, one consisting of pvs. undulosa and translucens, and a third consisting of pvs. arrhenatheri, graminis, phlei, and poae. Based on genomic differences, diagnostic loop-mediated isothermal amplification (LAMP) primers were developed that clearly distinguish strains that cause disease on cereals, such as pvs. undulosa, translucens, hordei, and secalis, from strains that cause disease on noncereal hosts, such as pvs. arrhenatheri, cerealis, graminis, phlei, and poae. Additional LAMP assays were developed that selectively amplify strains belonging to pvs. cerealis and poae, distinguishing them from other pathovars. These primers will be instrumental in diagnostics when implementing quarantine regulations to limit further geographic spread of X. translucens pathovars.

  10. A three-way comparative genomic analysis of Mannheimia haemolytica isolates

    PubMed Central

    2010-01-01

    Background Mannheimia haemolytica is a Gram-negative bacterium and the principal etiological agent associated with bovine respiratory disease complex. It transforms from a benign commensal to a deadly pathogen during stress, such as viral infection and transportation to feedlots, and causes acute pleuropneumonia, commonly known as shipping fever. The U.S. beef industry alone loses more than one billion dollars annually due to shipping fever. Despite its enormous economic importance, there are no specific and accurate genetic markers that will aid in understanding the pathogenesis and epidemiology of M. haemolytica at the molecular level and assist in devising an effective control strategy. Description During our comparative genomic sequence analysis of three Mannheimia haemolytica isolates, we identified a number of genes that are unique to each strain. These genes are "high value targets" for future studies that attempt to correlate the variable gene pool with phenotype. We also identified a number of high confidence single nucleotide polymorphisms (hcSNPs) spread throughout the genome and focused on non-synonymous SNPs in known virulence genes. These SNPs will be used to design new hcSNP arrays to study variation across strains, and will potentially aid in understanding gene regulation and the mode of action of various virulence factors. Conclusions During our analysis we identified previously unknown possible type III secretion effector proteins, clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated sequences (Cas). The presence of CRISPR regions is indicative of likely co-evolution with an associated phage. If proven functional, the presence of a type III secretion system in M. haemolytica will help us re-evaluate our approach to study host-pathogen interactions. We also identified various adhesins containing immuno-dominant domains, which may interfere with host-innate immunity and which could potentially serve as effective vaccine

  11. [Comparative genomics and evolutionary analysis of CRISPR loci in acetic acid bacteria].

    PubMed

    Kai, Xia; Xinle, Liang; Yudong, Li

    2015-12-01

    The clustered regularly interspaced short palindromic repeat (CRISPR) is a widespread adaptive immunity system that exists in most archaea and many bacteria against foreign DNA, such as phages, viruses and plasmids. In general, a CRISPR system consists of direct repeats, a leader, spacers and CRISPR-associated sequences. Acetic acid bacteria (AAB) play an important role in industrial fermentation of vinegar and bioelectrochemistry. To investigate the polymorphism and evolution pattern of CRISPR loci in acetic acid bacteria, bioinformatic analyses were performed on 48 species from three main genera (Acetobacter, Gluconacetobacter and Gluconobacter) with whole genome sequences available from the NCBI database. The results showed that the CRISPR system existed in 32 of the 48 species studied. Most of the CRISPR-Cas systems in AAB belonged to the type I CRISPR-Cas system (subtypes E and C), but the type II CRISPR-Cas system, which contains the cas9 gene, was found only in the genera Acetobacter and Gluconacetobacter. The repeat sequences of some CRISPR were highly conserved among species from different genera, and the leader sequences of some CRISPR possessed a conserved motif, which was associated with regulated promoters. Moreover, phylogenetic analysis of cas1 demonstrated that they were suitable for classification of species. The conservation of cas1 genes was associated with that of repeat sequences among different strains, suggesting they were subjected to similar functional constraints. Moreover, the number of spacers was positively correlated with the number of prophages and insertion sequences, indicating the acetic acid bacteria were continually invaded by new foreign DNA. The comparative analysis of CRISPR loci in acetic acid bacteria provided the basis for investigating the molecular mechanism of different acetic acid tolerance and genome stability in acetic acid bacteria.

  12. Comparative genomic analysis of the thermophilic biomass-degrading fungi Myceliophthora thermophila and Thielavia terrestris

    SciTech Connect

    Berka, Randy M.; Grigoriev, Igor V.; Otillar, Robert; Salamov, Asaf; Grimwood, Jane; Reid, Ian; Ishmael, Nadeeza; John, Tricia; Darmond, Corinne; Moisan, Marie-Claude; Henrissat, Bernard; Coutinho, Pedro M.; Lombard, Vincent; Natvig, Donald O.; Lindquist, Erika; Schmutz, Jeremy; Lucas, Susan; Harris, Paul; Powlowski, Justin; Bellemare, Annie; Taylor, David; Butler, Gregory; de Vries, Ronald P.; Allijn, Iris E.; van den Brink, Joost; Ushinsky, Sophia; Storms, Reginald; Powell, Amy J.; Paulsen, Ian T.; Elbourne, Liam D. H.; Baker, Scott. E.; Magnuson, Jon; LaBoissiere, Sylvie; Clutterbuck, A. John; Martinez, Diego; Wogulis, Mark; Lopez de Leon, Alfredo; Rey, Michael W.; Tsang, Adrian

    2011-05-16

    Thermostable enzymes and thermophilic cell factories may afford economic advantages in the production of many chemicals and biomass-based fuels. Here we describe and compare the genomes of two thermophilic fungi, Myceliophthora thermophila and Thielavia terrestris. To our knowledge, these genomes are the first described for thermophilic eukaryotes and the first complete telomere-to-telomere genomes for filamentous fungi. Genome analyses and experimental data suggest that both thermophiles are capable of hydrolyzing all major polysaccharides found in biomass. Examination of transcriptome data and secreted proteins suggests that the two fungi use shared approaches in the hydrolysis of cellulose and xylan but distinct mechanisms in pectin degradation. Characterization of the biomass-hydrolyzing activity of recombinant enzymes suggests that these organisms are highly efficient in biomass decomposition at both moderate and high temperatures. Furthermore, we present evidence suggesting that aside from representing a potential reservoir of thermostable enzymes, thermophilic fungi are amenable to manipulation using classical and molecular genetics.

  13. Comparative genomic analysis of the thermophilic biomass-degrading fungi Myceliophthora thermophila and Thielavia terrestris

    SciTech Connect

    Berka, Randy M.; Grigoriev, Igor V.; Otillar, Robert; Salamov, Asaf; Grimwood, Jane; Reid, Ian; Ishmael, Nadeeza; John, Tricia; Darmond, Corinne; Moisan, Marie-Claude; Henrissat, Bernard; Coutinho, Pedro M.; Lombard, Vincent; Natvig, Donald O.; Lindquist, Erika; Schmutz, Jeremy; Lucas, Susan; Harris, Paul; Powlowski, Justin; Bellemare, Annie; Taylor, David; Butler, Gregory; de Vries, Ronald P.; Allijn, Iris E.; van den Brink, Joost; Ushinsky, Sophia; Storms, Reginald; Powell, Amy J.; Paulsen, Ian T.; Elbourne, Liam D. H.; Baker, Scott E.; Magnuson, Jon; LaBoissiere, Sylvie; Clutterbuck, A. John; Martinez, Diego; Wogulis, Mark; de Leon, Alfredo Lopez; Rey, Michael W.; Tsang, Adrian

    2011-10-02

    Thermostable enzymes and thermophilic cell factories may afford economic advantages in the production of many chemicals and biomass-based fuels. Here we describe and compare the genomes of two thermophilic fungi, Myceliophthora thermophila and Thielavia terrestris. To our knowledge, these genomes are the first described for thermophilic eukaryotes and the first complete telomere-to-telomere genomes for filamentous fungi. Genome analyses and experimental data suggest that both thermophiles are capable of hydrolyzing all major polysaccharides found in biomass. Examination of transcriptome data and secreted proteins suggests that the two fungi use shared approaches in the hydrolysis of cellulose and xylan but distinct mechanisms in pectin degradation. Characterization of the biomass-hydrolyzing activity of recombinant enzymes suggests that these organisms are highly efficient in biomass decomposition at both moderate and high temperatures. Furthermore, we present evidence suggesting that aside from representing a potential reservoir of thermostable enzymes, thermophilic fungi are amenable to manipulation using classical and molecular genetics.

  14. Comparative mitochondrial genome analysis reveals the evolutionary rearrangement mechanism in Brassica.

    PubMed

    Yang, J; Liu, G; Zhao, N; Chen, S; Liu, D; Ma, W; Hu, Z; Zhang, M

    2016-05-01

    The genus Brassica has many species that are important for oil, vegetable and other food products. Three mitochondrial genome types (mitotypes) originated from its common ancestor. In this paper, a B. nigra mitochondrial main circle genome with 232,407 bp was generated through de novo assembly. Synteny analysis showed that the mitochondrial genomes of B. rapa and B. oleracea had a better syntenic relationship than B. nigra. Principal components analysis and development of a phylogenetic tree indicated maternal ancestors of three allotetraploid species in U's triangle of Brassica. Diversified mitotypes were found in allotetraploid B. napus, in which napus-type B. napus was derived from B. oleracea, while polima-type B. napus was inherited from B. rapa. In addition, the mitochondrial genome of napus-type B. napus was closer to botrytis-type than capitata-type B. oleracea. The sub-stoichiometric shifting of several mitochondrial genes suggested that mitochondrial genome rearrangement underwent evolutionary selection during domestication and/or plant breeding. Our findings clarify the role of diploid species in the maternal origin of allotetraploid species in Brassica and suggest the possibility of breeding selection of the mitochondrial genome.

  15. Comparative genomic analysis of phylogenetically closely related Hydrogenobaculum sp. isolates from Yellowstone National Park.

    PubMed

    Romano, Christine; D'Imperio, Seth; Woyke, Tanja; Mavromatis, Konstantinos; Lasken, Roger; Shock, Everett L; McDermott, Timothy R

    2013-05-01

    We describe the complete genome sequences of four closely related Hydrogenobaculum sp. isolates (≥ 99.7% 16S rRNA gene identity) that were isolated from the outflow channel of Dragon Spring (DS), Norris Geyser Basin, in Yellowstone National Park (YNP), WY. The genomes range in size from 1,552,607 to 1,552,931 bp, contain 1,667 to 1,676 predicted genes, and are highly syntenic. There are subtle differences among the DS isolates, which as a group are different from Hydrogenobaculum sp. strain Y04AAS1 that was previously isolated from a geographically distinct YNP geothermal feature. Genes unique to the DS genomes encode arsenite [As(III)] oxidation, NADH-ubiquinone-plastoquinone (complex I), NADH-ubiquinone oxidoreductase chain, a DNA photolyase, and elements of a type II secretion system. Functions unique to strain Y04AAS1 include thiosulfate metabolism, nitrate respiration, and mercury resistance determinants. DS genomes contain seven CRISPR loci that are almost identical but are different from the single CRISPR locus in strain Y04AAS1. Other differences between the DS and Y04AAS1 genomes include average nucleotide identity (94.764%) and percentage conserved DNA (80.552%). Approximately half of the genes unique to Y04AAS1 are predicted to have been acquired via horizontal gene transfer. Fragment recruitment analysis and marker gene searches demonstrated that the DS metagenome was more similar to the DS genomes than to the Y04AAS1 genome, but that the DS community is likely comprised of a continuum of Hydrogenobaculum genotypes that span from the DS genomes described here to an Y04AAS1-like organism, which appears to represent a distinct ecotype relative to the DS genomes characterized.

  16. Comparative Genomic Analysis of Phylogenetically Closely Related Hydrogenobaculum sp. Isolates from Yellowstone National Park

    PubMed Central

    Romano, Christine; D'Imperio, Seth; Woyke, Tanja; Mavromatis, Konstantinos; Lasken, Roger; Shock, Everett L.

    2013-01-01

    We describe the complete genome sequences of four closely related Hydrogenobaculum sp. isolates (≥99.7% 16S rRNA gene identity) that were isolated from the outflow channel of Dragon Spring (DS), Norris Geyser Basin, in Yellowstone National Park (YNP), WY. The genomes range in size from 1,552,607 to 1,552,931 bp, contain 1,667 to 1,676 predicted genes, and are highly syntenic. There are subtle differences among the DS isolates, which as a group are different from Hydrogenobaculum sp. strain Y04AAS1 that was previously isolated from a geographically distinct YNP geothermal feature. Genes unique to the DS genomes encode arsenite [As(III)] oxidation, NADH-ubiquinone-plastoquinone (complex I), NADH-ubiquinone oxidoreductase chain, a DNA photolyase, and elements of a type II secretion system. Functions unique to strain Y04AAS1 include thiosulfate metabolism, nitrate respiration, and mercury resistance determinants. DS genomes contain seven CRISPR loci that are almost identical but are different from the single CRISPR locus in strain Y04AAS1. Other differences between the DS and Y04AAS1 genomes include average nucleotide identity (94.764%) and percentage conserved DNA (80.552%). Approximately half of the genes unique to Y04AAS1 are predicted to have been acquired via horizontal gene transfer. Fragment recruitment analysis and marker gene searches demonstrated that the DS metagenome was more similar to the DS genomes than to the Y04AAS1 genome, but that the DS community is likely comprised of a continuum of Hydrogenobaculum genotypes that span from the DS genomes described here to an Y04AAS1-like organism, which appears to represent a distinct ecotype relative to the DS genomes characterized. PMID:23435891

  17. Evolution of vertebrate genes related to prion and Shadoo proteins--clues from comparative genomic analysis.

    PubMed

    Premzl, Marko; Gready, Jill E; Jermiin, Lars S; Simonic, Tatjana; Marshall Graves, Jennifer A

    2004-12-01

    Recent findings of new genes in fish related to the prion protein (PrP) gene PRNP, including our recent report of SPRN coding for Shadoo (Sho) protein found also in mammals, raise issues of their function and evolution. Here we report additional novel fish genes found in public databases, including a duplicated SPRN gene, SPRNB, in Fugu, Tetraodon, carp, and zebrafish encoding the Sho2 protein, and we use comparative genomic analysis to analyze the evolutionary relationships and to infer evolutionary trajectories of the complete data set. Phylogenetic footprinting performed on aligned human, mouse, and Fugu SPRN genes to define candidate regulatory promoter regions, detected 16 conserved motifs, three of which are known transcription factor-binding sites for a receptor and transcription factors specific to or associated with expression in brain. This result and other homology-based (VISTA global genomic alignment; protein sequence alignment and phylogenetics) and context-dependent (genomic context; relative gene order and orientation) criteria indicate fish and mammalian SPRN genes are orthologous and suggest a strongly conserved basic function in brain. Whereas tetrapod PRNPs share context with the analogous stPrP-2-coding gene in fish, their sequences are diverged, suggesting that the tetrapod and fish genes are likely to have significantly different functions. Phylogenetic analysis predicts the SPRN/SPRNB duplication occurred before divergence of fish from tetrapods, whereas that of stPrP-1 and stPrP-2 occurred in fish. Whereas Sho appears to have a conserved function in vertebrate brain, PrP seems to have an adaptive role fine-tuned in a lineage-specific fashion. An evolutionary model consistent with our findings and literature knowledge is proposed that has an ancestral prevertebrate SPRN-like gene leading to all vertebrate PrP-related and Sho-related genes. This provides a new framework for exploring the evolution of this unusual family of proteins and for

  18. Complete genome sequence of Nitrobacter hamburgensis X14 and comparative genomic analysis of species within the genus Nitrobacter.

    SciTech Connect

    Starkenburg, Shawn R; Larimer, Frank W; Stein, Lisa Y; Klotz, Martin G; Chain, Patrick S. G.; Sayavedra-Soto, LA; Poret-Peterson, Amisha T.; Gentry, ME; Arp, D J; Ward, Bess B.; Bottomley, Peter J

    2008-05-01

    The alphaproteobacterium Nitrobacter hamburgensis X14 is a gram-negative facultative chemolithoautotroph that conserves energy from the oxidation of nitrite to nitrate. Sequencing and analysis of the Nitrobacter hamburgensis X14 genome revealed four replicons comprised of one chromosome (4.4 Mbp) and three plasmids (294, 188, and 121 kbp). Over 20% of the genome is composed of pseudogenes and paralogs. Whole-genome comparisons were conducted between N. hamburgensis and the finished and draft genome sequences of Nitrobacter winogradskyi and Nitrobacter sp. strain Nb-311A, respectively. Most of the plasmid-borne genes were unique to N. hamburgensis and encode a variety of functions (central metabolism, energy conservation, conjugation, and heavy metal resistance), yet approximately 21 kb of an approximately 28-kb "autotrophic" island on the largest plasmid was conserved in the chromosomes of Nitrobacter winogradskyi Nb-255 and Nitrobacter sp. strain Nb-311A. The N. hamburgensis chromosome also harbors many unique genes, including those for heme-copper oxidases, cytochrome b(561), and putative pathways for the catabolism of aromatic, organic, and one-carbon compounds, which help verify and extend its mixotrophic potential. A Nitrobacter "subcore" genome was also constructed by removing homologs found in strains of the closest evolutionary relatives, Bradyrhizobium japonicum and Rhodopseudomonas palustris. Among the Nitrobacter subcore inventory (116 genes), copies of genes or gene clusters for nitrite oxidoreductase (NXR), cytochromes associated with a dissimilatory nitrite reductase (NirK), PII-like regulators, and polysaccharide formation were identified. Many of the subcore genes have diverged significantly from, or have origins outside, the alphaproteobacterial lineage and may indicate some of the unique genetic requirements for nitrite oxidation in Nitrobacter.
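
    The "subcore" construction described in this abstract can be read as a set operation: take the gene families shared by all Nitrobacter genomes, then remove any family with a homolog in the outgroup relatives. The Python sketch below (3.9 or later) illustrates that reading only. It assumes homolog clustering has already been done by some other tool, and the function name subcore_genome, the genome labels, and the family identifiers are all hypothetical, not the authors' method or data.

```python
def subcore_genome(ingroup: dict[str, set[str]],
                   outgroup: dict[str, set[str]]) -> set[str]:
    """Return families present in every ingroup genome and no outgroup genome.

    Each dict maps a genome label to the set of homolog-family IDs it
    contains (assumed to come from a prior clustering step).
    """
    if not ingroup:
        raise ValueError("at least one ingroup genome is required")
    # Core: families shared by all ingroup genomes.
    core = set.intersection(*ingroup.values())
    # Families with a homolog in any outgroup genome.
    outgroup_families = set().union(*outgroup.values())
    return core - outgroup_families


# Hypothetical toy input; labels and family IDs are made up for illustration.
ingroup = {
    "ingroup_1": {"famA", "famB", "famC", "famD"},
    "ingroup_2": {"famA", "famB", "famC"},
    "ingroup_3": {"famA", "famB", "famC", "famE"},
}
outgroup = {
    "outgroup_1": {"famC", "famE"},
    "outgroup_2": {"famC"},
}
print(sorted(subcore_genome(ingroup, outgroup)))
```

    With this toy input, famC is shared by every ingroup genome but is dropped because it also occurs in the outgroup, which is the filtering step the abstract describes.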

  19. Complete Genome Sequence of Nitrobacter hamburgensis X14 and Comparative Genomic Analysis of Species within the Genus Nitrobacter

    PubMed Central

    Starkenburg, Shawn R.; Larimer, Frank W.; Stein, Lisa Y.; Klotz, Martin G.; Chain, Patrick S. G.; Sayavedra-Soto, Luis A.; Poret-Peterson, Amisha T.; Gentry, Mira E.; Arp, Daniel J.; Ward, Bess; Bottomley, Peter J.

    2008-01-01

    The alphaproteobacterium Nitrobacter hamburgensis X14 is a gram-negative facultative chemolithoautotroph that conserves energy from the oxidation of nitrite to nitrate. Sequencing and analysis of the Nitrobacter hamburgensis X14 genome revealed four replicons comprised of one chromosome (4.4 Mbp) and three plasmids (294, 188, and 121 kbp). Over 20% of the genome is composed of pseudogenes and paralogs. Whole-genome comparisons were conducted between N. hamburgensis and the finished and draft genome sequences of Nitrobacter winogradskyi and Nitrobacter sp. strain Nb-311A, respectively. Most of the plasmid-borne genes were unique to N. hamburgensis and encode a variety of functions (central metabolism, energy conservation, conjugation, and heavy metal resistance), yet ∼21 kb of a ∼28-kb “autotrophic” island on the largest plasmid was conserved in the chromosomes of Nitrobacter winogradskyi Nb-255 and Nitrobacter sp. strain Nb-311A. The N. hamburgensis chromosome also harbors many unique genes, including those for heme-copper oxidases, cytochrome b561, and putative pathways for the catabolism of aromatic, organic, and one-carbon compounds, which help verify and extend its mixotrophic potential. A Nitrobacter “subcore” genome was also constructed by removing homologs found in strains of the closest evolutionary relatives, Bradyrhizobium japonicum and Rhodopseudomonas palustris. Among the Nitrobacter subcore inventory (116 genes), copies of genes or gene clusters for nitrite oxidoreductase (NXR), cytochromes associated with a dissimilatory nitrite reductase (NirK), PII-like regulators, and polysaccharide formation were identified. Many of the subcore genes have diverged significantly from, or have origins outside, the alphaproteobacterial lineage and may indicate some of the unique genetic requirements for nitrite oxidation in Nitrobacter. PMID:18326675

  20. The first complete chloroplast genome sequences of Ulmus species by de novo sequencing: Genome comparative and taxonomic position analysis.

    PubMed

    Zuo, Li-Hui; Shang, Ai-Qin; Zhang, Shuang; Yu, Xiao-Yue; Ren, Ya-Chao; Yang, Min-Sheng; Wang, Jin-Mao

    2017-01-01

    further analysis of their nuclear genomes. This study is the first report on Ulmus chloroplast genomes, which has significance for understanding photosynthesis, evolution, and chloroplast transgenic engineering.

  1. The first complete chloroplast genome sequences of Ulmus species by de novo sequencing: Genome comparative and taxonomic position analysis

    PubMed Central

    Zhang, Shuang; Yu, Xiao-Yue; Ren, Ya-Chao; Yang, Min-Sheng; Wang, Jin-Mao

    2017-01-01

    further analysis of their nuclear genomes. This study is the first report on Ulmus chloroplast genomes, which has significance for understanding photosynthesis, evolution, and chloroplast transgenic engineering. PMID:28158318

  2. Physiological and comparative genomic analysis of Acidithiobacillus ferrivorans PQ33 provides psychrotolerant fitness evidence for oxidation at low temperature.

    PubMed

    Ccorahua-Santo, Robert; Eca, Anika; Abanto, Michel; Guerra, Gregory; Ramírez, Pablo

    2017-02-21

    Environmentally friendly hydrometallurgy at low temperatures is principally promoted by Acidithiobacillus ferrivorans. Until recently, the synergy between cold tolerance and the molecular mechanism of ferrous iron (Fe(2+)) oxidation was unknown. In the present paper, we conducted a physiological and comparative genomics analysis of the new strain A. ferrivorans PQ33 to elucidate the oxidation mechanism at low temperatures, with emphasis placed on trehalose and the Rus operon. PQ33 exhibited a doubling time of 66.6 h in Fe(2+) at pH 1.6 and 63.6 h in CuS at 5 °C. Genomic island (GI) identification and comparative genome analysis were performed with four available genomes of Acidithiobacillus sp. The genome comprised 3,298,172 bp and 56.55% GC content. In contrast to ATCC Acidithiobacillus ferrooxidans strains, the genome of A. ferrivorans PQ33 harbors one GI, which contains a RusB gene. Moreover, five genes of peptidyl-prolyl cis-trans isomerase (PPIases) were observed. Furthermore, comparative analysis of the trehalose operon suggested the presence of a horizontal transfer event. In addition, comparison of rusticyanin proteins revealed that RusB has better intrinsic flexibility than RusA. This comparison suggests psychrotolerant fitness and supports the genetic canalization of A. ferrivorans PQ33 for oxidation at low temperature.

  3. Sequencing and comparative genomic analysis of 1227 Felis catus cDNA sequences enriched for developmental, clinical and nutritional phenotypes

    PubMed Central

    2012-01-01

    Background The feline genome is valuable to the veterinary and model organism genomics communities because the cat is an obligate carnivore and a model for endangered felids. The initial public release of the Felis catus genome assembly provided a framework for investigating the genomic basis of feline biology. However, the entire set of protein coding genes has not been elucidated. Results We identified and characterized 1227 protein coding feline sequences, of which 913 map to public sequences and 314 are novel. These sequences have been deposited into NCBI's GenBank database and complement public genomic resources by providing additional protein coding sequences that fill in some of the gaps in the feline genome assembly. Through functional and comparative genomic analyses, we gained an understanding of the role of these sequences in feline development, nutrition and health. Specifically, we identified 104 orthologs of human genes associated with Mendelian disorders. We detected negative selection within sequences with gene ontology annotations associated with intracellular trafficking, cytoskeleton and muscle functions. We detected relatively less negative selection on protein sequences encoding extracellular networks, apoptotic pathways and mitochondrial gene ontology annotations. Additionally, we characterized feline cDNA sequences that have mouse orthologs associated with clinical, nutritional and developmental phenotypes. Together, this analysis provides an overview of the value of our cDNA sequences and enhances our understanding of how the feline genome is similar to, and different from, other mammalian genomes. Conclusions The cDNA sequences reported here expand existing feline genomic resources by providing high-quality sequences annotated with comparative genomic information providing functional, clinical, nutritional and orthologous gene information. PMID:22257742

  4. Comparative analysis of the peanut witches'-broom phytoplasma genome reveals horizontal transfer of potential mobile units and effectors.

    PubMed

    Chung, Wan-Chia; Chen, Ling-Ling; Lo, Wen-Sui; Lin, Chan-Pin; Kuo, Chih-Horng

    2013-01-01

    Phytoplasmas are a group of bacteria that are associated with hundreds of plant diseases. Due to their economic importance and the difficulties involved in the experimental study of these obligate pathogens, genome sequencing and comparative analysis have been utilized as powerful tools to understand phytoplasma biology. To date four complete phytoplasma genome sequences have been published. However, these four strains represent limited phylogenetic diversity. In this study, we report the shotgun sequencing and evolutionary analysis of a peanut witches'-broom (PnWB) phytoplasma genome. The availability of this genome provides the first representative of the 16SrII group and substantially improves the taxon sampling to investigate genome evolution. The draft genome assembly contains 13 chromosomal contigs with a total size of 562,473 bp, covering ∼90% of the chromosome. Additionally, a complete plasmid sequence is included. Comparisons among the five available phytoplasma genomes reveal the differentiations in gene content and metabolic capacity. Notably, phylogenetic inferences of the potential mobile units (PMUs) in these genomes indicate that horizontal transfer may have occurred between divergent phytoplasma lineages. Because many effectors are associated with PMUs, the horizontal transfer of these transposon-like elements can contribute to the adaptation and diversification of these pathogens. In summary, the findings from this study highlight the importance of improving taxon sampling when investigating genome evolution. Moreover, the currently available sequences are inadequate to fully characterize the pan-genome of phytoplasmas. Future genome sequencing efforts to expand phylogenetic diversity are essential in improving our understanding of phytoplasma evolution.
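
    The assembly figures in this abstract (562,473 bp in contigs, covering about 90% of the chromosome) imply a rough full chromosome length. The Python sketch below shows that back-of-the-envelope arithmetic only; the function name estimate_total_length is an assumption introduced here, and because the 90% figure is itself approximate, the result should be read as an order-of-magnitude estimate rather than a value reported by the authors.

```python
def estimate_total_length(assembled_bp: int, coverage_fraction: float) -> float:
    """Estimate full sequence length from assembled length and fraction covered."""
    if not 0.0 < coverage_fraction <= 1.0:
        raise ValueError("coverage_fraction must be in the interval (0, 1]")
    return assembled_bp / coverage_fraction


# Figures taken from the abstract; 0.90 is the approximate coverage it states.
estimate = estimate_total_length(562_473, 0.90)
print(f"Estimated chromosome length: about {estimate / 1000:.0f} kbp")
```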

  5. Comparative Analysis of the Peanut Witches'-Broom Phytoplasma Genome Reveals Horizontal Transfer of Potential Mobile Units and Effectors

    PubMed Central

    Lo, Wen-Sui; Lin, Chan-Pin; Kuo, Chih-Horng

    2013-01-01

    Phytoplasmas are a group of bacteria that are associated with hundreds of plant diseases. Due to their economic importance and the difficulties involved in the experimental study of these obligate pathogens, genome sequencing and comparative analysis have been utilized as powerful tools to understand phytoplasma biology. To date, four complete phytoplasma genome sequences have been published. However, these four strains represent limited phylogenetic diversity. In this study, we report the shotgun sequencing and evolutionary analysis of a peanut witches'-broom (PnWB) phytoplasma genome. The availability of this genome provides the first representative of the 16SrII group and substantially improves the taxon sampling to investigate genome evolution. The draft genome assembly contains 13 chromosomal contigs with a total size of 562,473 bp, covering ∼90% of the chromosome. Additionally, a complete plasmid sequence is included. Comparisons among the five available phytoplasma genomes reveal differences in gene content and metabolic capacity. Notably, phylogenetic inferences of the potential mobile units (PMUs) in these genomes indicate that horizontal transfer may have occurred between divergent phytoplasma lineages. Because many effectors are associated with PMUs, the horizontal transfer of these transposon-like elements can contribute to the adaptation and diversification of these pathogens. In summary, the findings from this study highlight the importance of improving taxon sampling when investigating genome evolution. Moreover, the currently available sequences are inadequate to fully characterize the pan-genome of phytoplasmas. Future genome sequencing efforts to expand phylogenetic diversity are essential in improving our understanding of phytoplasma evolution. PMID:23626855

  6. Sequence and Comparative Genomic Analysis of Actin-related Proteins

    PubMed Central

    Muller, Jean; Oma, Yukako; Vallar, Laurent; Friederich, Evelyne; Poch, Olivier; Winsor, Barbara

    2005-01-01

    Actin-related proteins (ARPs) are key players in cytoskeleton activities and nuclear functions. Two complexes, ARP2/3 and ARP1/11, also known as dynactin, are implicated in actin dynamics and in microtubule-based trafficking, respectively. ARP4 to ARP9 are components of many chromatin-modulating complexes. Conventional actins and ARPs codefine a large family of homologous proteins, the actin superfamily, with a tertiary structure known as the actin fold. Because ARPs and actin share high sequence conservation, clear family definition requires distinct features to easily and systematically identify each subfamily. In this study we performed an in-depth sequence and comparative genomic analysis of ARP subfamilies. A high-quality multiple alignment of ∼700 complete protein sequences homologous to actin, including 148 ARP sequences, allowed us to extend the ARP classification to new organisms. Sequence alignments revealed conserved residues, motifs, and inserted sequence signatures to define each ARP subfamily. These discriminative characteristics allowed us to develop ARPAnno (http://bips.u-strasbg.fr/ARPAnno), a new web server dedicated to the annotation of ARP sequences. Analyses of sequence conservation among actins and ARPs highlight part of the actin fold and suggest interactions between ARPs and actin-binding proteins. Finally, analysis of ARP distribution across eukaryotic phyla emphasizes the central importance of nuclear ARPs, particularly the multifunctional ARP4. PMID:16195354

  7. Stochastic segmentation models for array-based comparative genomic hybridization data analysis.

    PubMed

    Lai, Tze Leung; Xing, Haipeng; Zhang, Nancy

    2008-04-01

    Array-based comparative genomic hybridization (array-CGH) is a high-throughput, high-resolution technique for studying the genetics of cancer. Analysis of array-CGH data typically involves estimating the underlying chromosome copy numbers from the log fluorescence ratios and segmenting the chromosome into regions with the same copy number at each location. For the analysis of array-CGH data, we propose a new stochastic segmentation model and an associated estimation procedure that has attractive statistical and computational properties. An important benefit of this Bayesian segmentation model is that it yields explicit formulas for posterior means, which can be used to estimate the signal directly without performing segmentation. Other quantities relating to the posterior distribution that are useful for providing confidence assessments of any given segmentation can also be estimated by using our method. We propose an approximation method whose computation time is linear in sequence length, which makes our method practically applicable to the new higher-density arrays. Simulation studies and applications to real array-CGH data illustrate the advantages of the proposed approach.
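
    The segmentation task described in this abstract can be illustrated with a deliberately simple stand-in. The sketch below is not the Bayesian stochastic segmentation model of the paper; it is a plain least-squares search for a single change point in a list of log2 ratios. The function name `best_split` and the toy values are assumptions made for illustration only.

```python
def best_split(values):
    """Return (k, cost): the split index k that minimizes the summed
    within-segment squared error of values[:k] and values[k:]."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two probes to place a change point")

    def sse(segment):
        # Sum of squared deviations from the segment mean.
        mean = sum(segment) / len(segment)
        return sum((x - mean) ** 2 for x in segment)

    best_k, best_cost = None, float("inf")
    for k in range(1, n):  # every split leaves both segments non-empty
        cost = sse(values[:k]) + sse(values[k:])
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k, best_cost


# Hypothetical log2 ratios: four probes near 0 (no change), then four
# probes near 0.6 (log2(3/2) is about 0.585, i.e. a one-copy gain).
toy_ratios = [0.0, 0.1, -0.1, 0.0, 0.6, 0.5, 0.7, 0.6]
split_index, split_cost = best_split(toy_ratios)
print(split_index)
```

    This exhaustive search recomputes each segment's error from scratch, so it is quadratic in the number of probes; it is meant only to make the idea of "regions with the same copy number" concrete, not to match the linear-time approximation mentioned in the abstract.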

  8. MicroScope--an integrated microbial resource for the curation and comparative analysis of genomic and metabolic data.

    PubMed

    Vallenet, David; Belda, Eugeni; Calteau, Alexandra; Cruveiller, Stéphane; Engelen, Stefan; Lajus, Aurélie; Le Fèvre, François; Longin, Cyrille; Mornico, Damien; Roche, David; Rouy, Zoé; Salvignol, Gregory; Scarpelli, Claude; Thil Smith, Adam Alexander; Weiman, Marion; Médigue, Claudine

    2013-01-01

    MicroScope is an integrated platform dedicated to both the methodical updating of microbial genome annotation and to comparative analysis. The resource provides data from completed and ongoing genome projects (automatic and expert annotations), together with data sources from post-genomic experiments (i.e. transcriptomics, mutant collections) allowing users to perfect and improve the understanding of gene functions. MicroScope (http://www.genoscope.cns.fr/agc/microscope) combines tools and graphical interfaces to analyse genomes and to perform the manual curation of gene annotations in a comparative context. Since its first publication in January 2006, the system (previously named MaGe for Magnifying Genomes) has been continuously extended both in terms of data content and analysis tools. The last update of MicroScope was published in 2009 in the Database journal. Today, the resource contains data for >1600 microbial genomes, of which ∼300 are manually curated and maintained by biologists (1200 personal accounts today). Expert annotations are continuously gathered in the MicroScope database (∼50 000 a year), contributing to the improvement of the quality of microbial genomes annotations. Improved data browsing and searching tools have been added, original tools useful in the context of expert annotation have been developed and integrated and the website has been significantly redesigned to be more user-friendly. Furthermore, in the context of the European project Microme (Framework Program 7 Collaborative Project), MicroScope is becoming a resource providing for the curation and analysis of both genomic and metabolic data. An increasing number of projects are related to the study of environmental bacterial (meta)genomes that are able to metabolize a large variety of chemical compounds that may be of high industrial interest.

  9. Analysis of Molecular Cytogenetic Alteration in Rhabdomyosarcoma by Array Comparative Genomic Hybridization

    PubMed Central

    Liu, Chunxia; Li, Dongliang; Jiang, Jinfang; Hu, Jianming; Zhang, Wei; Chen, Yunzhao; Cui, Xiaobin; Qi, Yan; Zou, Hong; Zhang, WenJie; Li, Feng

    2014-01-01

    Rhabdomyosarcoma (RMS) is the most common pediatric soft tissue sarcoma with poor prognosis. The genetic etiology underlying the development and progression of RMS remains largely unclear. To identify novel genes and new therapeutic targets associated with RMS more precisely, we used high-resolution array comparative genomic hybridization (aCGH) to explore tumor-associated copy number variations (CNVs) and genes in RMS. We confirmed several important genes by quantitative real-time polymerase chain reaction (QRT-PCR). We then performed bioinformatics-based functional enrichment analysis for genes located in the genomic regions with CNVs. In addition, we identified miRNAs located in the corresponding amplification and deletion regions and performed miRNA functional enrichment analysis. aCGH analyses revealed that all RMS showed specific gains and losses. The amplification regions were 12q13.12, 12q13.3, and 12q13.3–q14.1. The deletion regions were 1p21.1, 2q14.1, 5q13.2, 9p12, and 9q12. The recurrent regions with gains were 12q13.3, 12q13.3–q14.1, 12q14.1, and 17q25.1. The recurrent regions with losses were 9p12–p11.2, 10q11.21–q11.22, 14q32.33, 16p11.2, and 22q11.1. The mean mRNA level of GLI1 in RMS was 6.61-fold higher than that in controls (p = 0.0477) by QRT-PCR. Meanwhile, the mean mRNA level of GEFT in RMS samples was 3.92-fold higher than that in controls (p = 0.0354). Bioinformatic analysis showed that genes were enriched in functions such as immunoglobulin domain, induction of apoptosis, and defensin. Proto-oncogene functions were involved in alveolar RMS. miRNAs located in the amplified regions in RMS tend to be enriched in oncogenic activity (miR-24 and miR-27a). In conclusion, this study identified a number of CNVs in RMS, and functional analyses showed enrichment for genes and miRNAs located in these CNV regions. These findings may potentially help the identification of novel biomarkers and/or drug targets implicated in diagnosis of

  10. Comparative Analysis of Genome and Epigenome in Closely Related Medaka Species Identifies Conserved Sequence Preferences for DNA Hypomethylated Domains.

    PubMed

    Uno, Ayako; Nakamura, Ryohei; Tsukahara, Tatsuya; Qu, Wei; Sugano, Sumio; Suzuki, Yutaka; Morishita, Shinichi; Takeda, Hiroyuki

    2016-08-01

    The genomes of vertebrates are globally methylated, but a small portion of genomic regions are known to be hypomethylated. Although hypomethylated domains (HMDs) have been implicated in transcriptional regulation in various ways, how a HMD is determined in a particular genomic region remains elusive. To search for DNA motifs essential for the formation of HMDs, we performed the genome-wide comparative analysis of genome and DNA methylation patterns of the two medaka inbred lines, Hd-rRII1 and HNI-II, which are derived from northern and southern subpopulations of Japan and exhibit high levels of genetic variations (SNP, ∼ 3%). We successfully mapped > 70% of HMDs in both genomes and found that the majority of those mapped HMDs are conserved between the two lines (common HMDs). Unexpectedly, the average genetic variations are similar in the common HMD and other genome regions. However, we identified short well-conserved motifs that are specifically enriched in HMDs, suggesting that they may play roles in the establishment of HMDs in the medaka genome.

  11. Complete Chloroplast Genome Sequence of Omani Lime (Citrus aurantiifolia) and Comparative Analysis within the Rosids

    PubMed Central

    Su, Huei-Jiun; Hogenhout, Saskia A.; Al-Sadi, Abdullah M.; Kuo, Chih-Horng

    2014-01-01

    The genus Citrus contains many economically important fruits that are grown worldwide for their high nutritional and medicinal value. Due to frequent hybridizations among species and cultivars, the exact number of natural species and the taxonomic relationships within this genus are unclear. To compare the differences between the Citrus chloroplast genomes and to develop useful genetic markers, we used a reference-assisted approach to assemble the complete chloroplast genome of Omani lime (C. aurantiifolia). The complete C. aurantiifolia chloroplast genome is 159,893 bp in length; the organization and gene content are similar to most of the rosids lineages characterized to date. Through comparison with the sweet orange (C. sinensis) chloroplast genome, we identified three intergenic regions and 94 simple sequence repeats (SSRs) that are potentially informative markers with resolution for interspecific relationships. These markers can be utilized to better understand the origin of cultivated Citrus. A comparison among 72 species belonging to 10 families of representative rosids lineages also provides new insights into their chloroplast genome evolution. PMID:25398081
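
    The abstract reports simple sequence repeats (SSRs) without stating how they were detected. As a hedged illustration of what an SSR is, the sketch below finds tandem repeats of a fixed unit length with a regular expression. The function name `find_ssrs`, the thresholds, and the toy sequence are assumptions and do not reproduce the authors' procedure.

```python
import re


def find_ssrs(seq, unit_len, min_repeats):
    """Find tandem repeats of a unit_len-base motif repeated at least
    min_repeats times. Returns (start, motif, repeat_count) tuples."""
    if unit_len < 1 or min_repeats < 2:
        raise ValueError("unit_len must be >= 1 and min_repeats >= 2")
    # Capture one motif, then require min_repeats - 1 further copies of it.
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1))
    return [
        (m.start(), m.group(1), len(m.group(0)) // unit_len)
        for m in pattern.finditer(seq.upper())
    ]


# Hypothetical toy sequence containing (AT) repeated four times.
toy_hits = find_ssrs("ggATATATATcc", unit_len=2, min_repeats=4)
print(toy_hits)
```

    Note that a homopolymer run such as AAAAAAAA also matches a two-base motif ("AA"), so a real screen would search motif lengths in increasing order or filter such cases.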

  12. Complete chloroplast genome sequence of Omani lime (Citrus aurantiifolia) and comparative analysis within the rosids.

    PubMed

    Su, Huei-Jiun; Hogenhout, Saskia A; Al-Sadi, Abdullah M; Kuo, Chih-Horng

    2014-01-01

    The genus Citrus contains many economically important fruits that are grown worldwide for their high nutritional and medicinal value. Due to frequent hybridizations among species and cultivars, the exact number of natural species and the taxonomic relationships within this genus are unclear. To compare the differences between the Citrus chloroplast genomes and to develop useful genetic markers, we used a reference-assisted approach to assemble the complete chloroplast genome of Omani lime (C. aurantiifolia). The complete C. aurantiifolia chloroplast genome is 159,893 bp in length; the organization and gene content are similar to most of the rosids lineages characterized to date. Through comparison with the sweet orange (C. sinensis) chloroplast genome, we identified three intergenic regions and 94 simple sequence repeats (SSRs) that are potentially informative markers with resolution for interspecific relationships. These markers can be utilized to better understand the origin of cultivated Citrus. A comparison among 72 species belonging to 10 families of representative rosids lineages also provides new insights into their chloroplast genome evolution.

  13. Assembly and comparative analysis of complete mitochondrial genome sequence of an economic plant Salix suchowensis

    PubMed Central

    Wang, Xuelin; Li, Juan; Bi, Changwei; Xu, Yiqing; Wu, Dongyang; Ye, Qiaolin

    2017-01-01

    Willow is a widely used dioecious woody plant of the Salicaceae family in China. Due to their high biomass yields, willows are promising sources for bioenergy crops. In this study, we assembled the complete mitochondrial (mt) genome sequence of S. suchowensis, with a length of 644,437 bp, using Roche-454 GS FLX Titanium sequencing technologies. The base composition of the S. suchowensis mt genome is A (27.43%), T (27.59%), C (22.34%), and G (22.64%), giving a GC content in line with that of other angiosperms. This long circular mt genome encodes 58 unique genes (32 protein-coding genes, 23 tRNA genes and 3 rRNA genes), and 9 of the 32 protein-coding genes contain 17 introns. Phylogenetic analysis of 35 species based on 23 protein-coding genes supports Salix as sister to Populus. With the detailed phylogenetic information and the identification of phylogenetic position, some ribosomal protein genes and succinate dehydrogenase genes are found to be frequently lost during evolution. As S. suchowensis is a native shrub willow species, this study of its mt genome will provide useful information for better understanding genomic breeding and the missing pieces of sex determination evolution in the future. PMID:28367378
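
    The GC content implied by the base composition quoted in this abstract follows from simple arithmetic. The sketch below works it out from the four reported percentages; the function names are hypothetical, and the second helper, which counts bases in a raw sequence string, is included only as an assumed illustration.

```python
# Base composition of the S. suchowensis mt genome as reported in the
# abstract (percent of all bases).
composition = {"A": 27.43, "T": 27.59, "C": 22.34, "G": 22.64}


def gc_percent(comp):
    """GC content as a percentage, given per-base percentages or counts."""
    total = sum(comp.values())
    return 100.0 * (comp["G"] + comp["C"]) / total


def gc_from_sequence(seq):
    """GC content of a sequence string, ignoring non-ACGT symbols."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G or T")
    return 100.0 * sum(b in "GC" for b in bases) / len(bases)


gc = gc_percent(composition)
print(f"{gc:.2f}")  # 22.34 + 22.64 = 44.98
```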

  14. Comparative genome analysis and identification of competitive and cooperative interactions in a polymicrobial disease.

    PubMed

    Endo, Akiko; Watanabe, Takayasu; Ogata, Nachiko; Nozawa, Takashi; Aikawa, Chihiro; Arakawa, Shinichi; Maruyama, Fumito; Izumi, Yuichi; Nakagawa, Ichiro

    2015-03-01

    Polymicrobial diseases are caused by combinations of multiple bacteria, which can lead to not only mild but also life-threatening illnesses. Periodontitis represents a polymicrobial disease; Porphyromonas gingivalis, Treponema denticola and Tannerella forsythia, called 'the red complex', have been recognized as the causative agents of periodontitis. Although molecular interactions among the three species could be responsible for progression of periodontitis, the relevant genetic mechanisms are unknown. In this study, we uncovered novel interactions in comparative genome analysis among the red complex species. Clustered regularly interspaced short palindromic repeats (CRISPRs) of T. forsythia might attack the restriction modification system of P. gingivalis, and possibly work as a defense system against DNA invasion from P. gingivalis. On the other hand, gene deficiencies were mutually compensated in metabolic pathways when the genes of all the three species were taken into account, suggesting that there are cooperative relationships among the three species. This notion was supported by the observation that each of the three species had its own virulence factors, which might facilitate persistence and manifestations of virulence of the three species. Here, we propose new mechanisms of bacterial symbiosis in periodontitis; these mechanisms consist of competitive and cooperative interactions. Our results might shed light on the pathogenesis of periodontitis and of other polymicrobial diseases.

  15. Proteomic and comparative genomic analysis of two Brassica napus lines differing in oil content.

    PubMed

    Gan, Lu; Zhang, Chun-yu; Wang, Xiao-dong; Wang, Hao; Long, Yan; Yin, Yong-tai; Li, Dian-rong; Tian, Jian-Hua; Li, Zai-yun; Lin, Zhi-wei; Yu, Long-Jiang; Li, Mao-Teng

    2013-11-01

    Ultrastructural observations, combined with proteomic and comparative genomic analyses, were applied to interpret the differences in protein composition and oil-body characteristics of mature seed of two Brassica napus lines with high and low oil contents of 55.19% and 36.49%, respectively. The results showed that oil bodies were arranged much closer in the high than in the low oil content line, and differences in cell size and thickness of cell walls were also observed. There were 119 and 32 differentially expressed proteins (DEPs) identified among total and oil-body proteins, respectively. The 119 DEPs of total protein were mainly involved in oil-related, dehydration-related, storage, and defense/disease functions, and some of these may be related to oil formation. Dehydration-related DEPs were detected in both total and oil-body proteins for the high and low oil lines and may be correlated with the number and size of oil bodies in the different lines. Some genes that corresponded to DEPs were confirmed by quantitative trait loci (QTL) mapping analysis for oil content. The results revealed that some candidate genes deduced from DEPs were located in the confidence intervals of QTL for oil content. Finally, the function of one gene that encodes a storage protein was verified by using a collection of Arabidopsis lines that can conditionally express the full-length cDNA from developing seeds of B. napus.

  16. Genome-Wide Comparative Analysis of Flowering-Related Genes in Arabidopsis, Wheat, and Barley

    PubMed Central

    Peng, Fred Y.; Hu, Zhiqiu; Yang, Rong-Cai

    2015-01-01

    Early flowering is an important trait influencing grain yield and quality in wheat (Triticum aestivum L.) and barley (Hordeum vulgare L.) in short-season cropping regions. However, due to large and complex genomes of these species, direct identification of flowering genes and their molecular characterization remain challenging. Here, we used a bioinformatic approach to predict flowering-related genes in wheat and barley from 190 known Arabidopsis (Arabidopsis thaliana (L.) Heynh.) flowering genes. We identified 900 and 275 putative orthologs in wheat and barley, respectively. The annotated flowering-related genes were clustered into 144 orthologous groups with one-to-one, one-to-many, many-to-one, and many-to-many orthology relationships. Our approach was further validated by domain and phylogenetic analyses of flowering-related proteins and comparative analysis of publicly available microarray data sets for in silico expression profiling of flowering-related genes in 13 different developmental stages of wheat and barley. These further analyses showed that orthologous gene pairs in three critical flowering gene families (PEBP, MADS, and BBX) exhibited similar expression patterns among 13 developmental stages in wheat and barley, suggesting similar functions among the orthologous genes with sequence and expression similarities. The predicted candidate flowering genes can be confirmed and incorporated into molecular breeding for early flowering wheat and barley in short-season cropping regions. PMID:26435710
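
    The four orthology relationships named in this abstract depend only on how many genes each species contributes to an orthologous group. The sketch below makes that rule explicit; the function name `orthology_type` and the example counts are hypothetical and are not taken from the study's data.

```python
def orthology_type(n_query, n_target):
    """Classify an orthologous group by the number of genes it contains
    from the query species (e.g. Arabidopsis) and the target species
    (e.g. wheat or barley)."""
    if n_query < 1 or n_target < 1:
        raise ValueError("both species must contribute at least one gene")
    left = "one" if n_query == 1 else "many"
    right = "one" if n_target == 1 else "many"
    return f"{left}-to-{right}"


# Hypothetical group sizes: (query genes, target genes).
examples = [(1, 1), (1, 3), (2, 1), (2, 5)]
labels = [orthology_type(q, t) for q, t in examples]
print(labels)
```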

  17. Genome-wide characterization and comparative analysis of the MLO gene family in cotton.

    PubMed

    Wang, Xiaoyan; Ma, Qifeng; Dou, Lingling; Liu, Zhen; Peng, Renhai; Yu, Shuxun

    2016-06-01

    In plants, MLO (Mildew Locus O) gene encodes a plant-specific seven transmembrane (TM) domain protein involved in several cellular processes, including susceptibility to powdery mildew (PM). In this study, a genome-wide characterization of the MLO gene family in G. raimondii L., G. arboreum L. and G. hirsutum L. was performed. In total, 22, 17 and 38 homologous sequences were identified for each species, respectively. Gene organization, including chromosomal location, gene clustering and gene duplication, was investigated. Homologues related to PM susceptibility in upland cotton were inferred by phylogenetic relationships with functionally characterized MLO proteins. To conduct a comparative analysis between MLO candidate genes from G. raimondii L., G. arboreum L. and G. hirsutum L., orthologous relationships and conserved synteny blocks were constructed. The transcriptional variation of 38 GhMLO genes in response to exogenous application of salt, mannitol (Man), abscisic acid (ABA), ethylene (ETH), jasmonic acid (JA) and salicylic acid (SA) was monitored. Further studies should be conducted to elucidate the functions of MLO genes in PM susceptibility and phytohormone signalling pathways.

  18. Identification of mesoderm development (mesd) candidate genes by comparative mapping and genome sequence analysis.

    PubMed

    Wines, M E; Lee, L; Katari, M S; Zhang, L; DeRossi, C; Shi, Y; Perkins, S; Feldman, M; McCombie, W R; Holdener, B C

    2001-02-15

    The proximal albino deletions identify several functional regions on mouse Chromosome 7 critical for differentiation of mesoderm (mesd), development of the hypothalamus neuroendocrine lineage (nelg), and function of the liver (hsdr1). Using comparative mapping and genomic sequence analysis, we have identified four novel genes and Il16 in the mesd deletion interval. Two of the novel genes, mesdc1 and mesdc2, are located within the mesd critical region defined by BAC transgenic rescue. We have investigated the fetal role of genes located outside the mesd critical region using BAC transgenic complementation of the mesd early embryonic lethality. Using human radiation hybrid mapping and BAC contig construction, we have identified a conserved region of human chromosome 15 homologous to the mesd, nelg, and hsdr1 functional regions. Three human diseases cosegregate with microsatellite markers used in construction of the human BAC/YAC physical map, including autosomal dominant nocturnal frontal lobe epilepsy (ENFL2; also known as ADNFLE), a syndrome of mental retardation, spasticity, and tapetoretinal degeneration (MRST); and a pyogenic arthritis, pyoderma gangrenosum, and acne syndrome (PAPA).

  19. Comparative genomic analysis of equilibrative nucleoside transporters suggests conserved protein structure despite limited sequence identity.

    PubMed

    Sankar, Narendra; Machado, Jerry; Abdulla, Parween; Hilliker, Arthur J; Coe, Imogen R

    2002-10-15

    Equilibrative nucleoside transporters (ENTs) are a recently characterized and poorly understood group of membrane proteins that are important in the uptake of endogenous nucleosides required for nucleic acid and nucleoside triphosphate synthesis. Despite their central importance in cellular metabolism and nucleoside analog chemotherapy, no human ENT gene has been described and nothing is known about gene structure and function. To gain insight into the ENT gene family, we used experimental and in silico comparative genomic approaches to identify ENT genes in three evolutionarily diverse organisms with completely (or almost completely) sequenced genomes, Homo sapiens, Caenorhabditis elegans and Drosophila melanogaster. We describe the chromosomal location, the predicted ENT gene structure and putative structural topologies of predicted ENT proteins derived from the open reading frames. Despite variations in genomic layout and limited ortholog protein sequence identity (≤27.45%), predicted topologies of ENT proteins are strikingly similar, suggesting an evolutionary conservation of a prototypic structure. In addition, a similar distribution of protein domains on exons is apparent in all three taxa. These data demonstrate that comparative sequence analyses should be combined with other approaches (such as genomic and proteomic analyses) to fully understand structure, function and evolution of protein families.

  20. Comparative genomics and functional analysis of niche-specific adaptation in Pseudomonas putida

    SciTech Connect

    Wu X.; van der Lelie D.; Monchy, S.; Taghavi, S.; Zhu, W.; Ramos, J.

    2011-03-01

    Pseudomonas putida is a gram-negative rod-shaped gammaproteobacterium that is found throughout various environments. Members of the species P. putida show a diverse spectrum of metabolic activities, which is indicative of their adaptation to various niches, which includes the ability to live in soils and sediments contaminated with high concentrations of heavy metals and organic contaminants. Pseudomonas putida strains are also found as plant growth-promoting rhizospheric and endophytic bacteria. The genome sequences of several P. putida species have become available and provide a unique tool to study the specific niche adaptation of the various P. putida strains. In this review, we compare the genomes of four P. putida strains: the rhizospheric strain KT2440, the endophytic strain W619, the aromatic hydrocarbon-degrading strain F1 and the manganese-oxidizing strain GB-1. Comparative genomics provided a powerful tool to gain new insights into the adaptation of P. putida to specific lifestyles and environmental niches, and clearly demonstrated that horizontal gene transfer played a key role in this adaptation process, as many of the niche-specific functions were found to be encoded on clearly defined genomic islands.

  1. Comparative genomic analysis of the swine pathogen Bordetella bronchiseptica strain KM22.

    PubMed

    Nicholson, Tracy L; Shore, Sarah M; Register, Karen B; Bayles, Darrell O; Kingsley, Robert A; Brunelle, Brain W

    2016-01-01

    The well-characterized Bordetella bronchiseptica strain KM22, originally isolated from a pig with atrophic rhinitis, has been used to develop a reproducible swine respiratory disease model. The goal of this study was to identify genetic features unique to KM22 by comparing the genome sequence of KM22 to the laboratory reference strain RB50. To gain a broader perspective of the genetic relationship of KM22 among other B. bronchiseptica strains, selected genes of KM22 were then compared to five other B. bronchiseptica strains isolated from different hosts. Overall, the KM22 genome sequence is more similar to the genome sequences of the strains isolated from animals than the strains isolated from humans. The majority of virulence gene expression in Bordetella is positively regulated by the two-component sensory transduction system BvgAS. bopN, bvgA, fimB, and fimC were the most highly conserved BvgAS-regulated genes present in all seven strains analyzed. In contrast, the BvgAS-regulated genes present in all seven strains with the highest sequence divergence were fimN, fim2, fhaL, and fhaS. A total of eight major fimbrial subunit genes were identified in KM22. Quantitative real-time PCR data demonstrated that seven of the eight fimbrial subunit genes identified in KM22 are expressed and regulated by BvgAS. The annotation of the KM22 genome sequence, coupled with the comparative genomic analyses reported in this study, can be used to facilitate the development of vaccines with improved efficacy towards B. bronchiseptica in swine to decrease the prevalence and disease burden caused by this pathogen.

  2. [Comparative analysis of variable regions in the genomes of variola virus].

    PubMed

    Babkin, I V; Nepomniashchikh, T S; Maksiutov, R A; Gutorov, V V; Babkina, I N; Shchelkunov, S N

    2008-01-01

    Nucleotide sequences of two extended segments of the terminal variable regions in the variola virus genome were determined. The size of the left segment was 13.5 kbp and of the right, 10.5 kbp. In total, over 540 kbp were sequenced for 22 variola virus strains. The phylogenetic analysis conducted here, together with previously published data, allowed us to find the interrelations between 70 variola virus isolates, the character of their clustering, and the degree of intergroup and intragroup variations of the clusters of variola virus strains. The most polymorphic loci of the genome segments studied were determined. It was demonstrated that these loci are localized either to noncoding genome regions or to regions of destroyed open reading frames, characteristic of the ancestor virus. These loci are promising for development of a strategy for genotyping variola virus strains. Analysis of recombination using various methods demonstrated that, with a single exception, no statistically significant recombinational events in the genomes of the variola virus strains studied were detectable.

  3. Comparative analysis of the Oenococcus oeni pan genome reveals genetic diversity in industrially-relevant pathways

    PubMed Central

    2012-01-01

    Background: Oenococcus oeni, a member of the lactic acid bacteria, is one of a limited number of microorganisms that not only survive, but actively proliferate in wine. It is also unusual as, unlike the majority of bacteria present in wine, it is beneficial to wine quality rather than causing spoilage. These benefits are realised primarily through catalysing malolactic fermentation, but also through imparting other positive sensory properties. However, many of these industrially-important secondary attributes have been shown to be strain-dependent and their genetic basis is yet to be determined. Results: In order to investigate the scale and scope of genetic variation in O. oeni, we have performed whole-genome sequencing on eleven strains of this bacterium, bringing the total number of strains for which genome sequences are available to fourteen. While any single strain of O. oeni was shown to contain around 1800 protein-coding genes, in-depth comparative annotation based on genomic synteny and protein orthology identified over 2800 orthologous open reading frames that comprise the pan genome of this species, and less than 1200 genes that make up the conserved genomic core present in all of the strains. The expansion of the pan genome relative to the coding potential of individual strains was shown to be due to the varied presence and location of multiple distinct bacteriophage sequences and also to variation in metabolic functions with potential impacts on the industrial performance of this species, including cell wall exopolysaccharide biosynthesis, sugar transport and utilisation and amino acid biosynthesis. Conclusions: By providing a large cohort of sequenced strains, this study provides a broad insight into the genetic variation present within O. oeni. This data is vital to understanding and harnessing the phenotypic variation present in this economically-important species. PMID:22863143
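
    The pan genome and core genome contrasted in this abstract correspond to a set union and a set intersection over the ortholog content of each strain. The sketch below shows that relationship on toy data; the strain names, ortholog identifiers, and the function name `pan_and_core` are hypothetical, and the synteny- and orthology-based annotation used in the study is not reproduced here.

```python
def pan_and_core(strain_genes):
    """Given {strain: iterable of ortholog IDs}, return (pan, core):
    pan  = orthologs found in at least one strain (union),
    core = orthologs found in every strain (intersection)."""
    gene_sets = [set(genes) for genes in strain_genes.values()]
    if not gene_sets:
        raise ValueError("at least one strain is required")
    pan = set().union(*gene_sets)
    core = set.intersection(*gene_sets)
    return pan, core


# Hypothetical toy data: three strains sharing two orthologs.
toy_strains = {
    "strain_A": ["og1", "og2", "og3"],
    "strain_B": ["og1", "og2", "og4"],
    "strain_C": ["og1", "og2", "og5", "og6"],
}
pan, core = pan_and_core(toy_strains)
print(len(pan), len(core))
```

    As in the abstract, the pan genome here is larger than any single strain's gene set, while the core is smaller than each of them.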

  4. Genome-wide comparative analysis reveals similar types of NBS genes in hybrid Citrus sinensis genome and original Citrus clementine genome and provides new insights into non-TIR NBS genes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    In this study, we identified and compared nucleotide-binding site (NBS) domain-containing genes from three Citrus genomes (C. clementina, C. sinensis from USA and C. sinensis from China). Phylogenetic analysis of all Citrus NBS genes across these three genomes revealed that there are three approxima...

  5. The Princeton Protein Orthology Database (P-POD): A Comparative Genomics Analysis Tool for Biologists

    PubMed Central

    Kang, Fan; Angiuoli, Samuel V.; White, Owen; Botstein, David; Dolinski, Kara

    2007-01-01

    Many biological databases that provide comparative genomics information and tools are now available on the internet. While certainly quite useful, to our knowledge none of the existing databases combine results from multiple comparative genomics methods with manually curated information from the literature. Here we describe the Princeton Protein Orthology Database (P-POD, http://ortholog.princeton.edu), a user-friendly database system that allows users to find and visualize the phylogenetic relationships among predicted orthologs (based on the OrthoMCL method) to a query gene from any of eight eukaryotic organisms, and to see the orthologs in a wider evolutionary context (based on the Jaccard clustering method). In addition to the phylogenetic information, the database contains experimental results manually collected from the literature that can be compared to the computational analyses, as well as links to relevant human disease and gene information via the OMIM, model organism, and sequence databases. Our aim is for the P-POD resource to be extremely useful to typical experimental biologists wanting to learn more about the evolutionary context of their favorite genes. P-POD is based on the commonly used Generic Model Organism Database (GMOD) schema and can be downloaded in its entirety for installation on one's own system. Thus, bioinformaticians and software developers may also find P-POD useful because they can use the P-POD database infrastructure when developing their own comparative genomics resources and database tools. PMID:17712414

  6. Putative drug and vaccine target protein identification using comparative genomic analysis of KEGG annotated metabolic pathways of Mycoplasma hyopneumoniae.

    PubMed

    Damte, Dereje; Suh, Joo-Won; Lee, Seung-Jin; Yohannes, Sileshi Belew; Hossain, Md Akil; Park, Seung-Chun

    2013-07-01

    In the present study, a computational comparative and subtractive genomic/proteomic analysis aimed at the identification of putative therapeutic target and vaccine candidate proteins from Kyoto Encyclopedia of Genes and Genomes (KEGG) annotated metabolic pathways of Mycoplasma hyopneumoniae was performed for drug design and vaccine production pipelines against M. hyopneumoniae. The employed comparative genomic and metabolic pathway analysis with a predefined computational systemic workflow extracted a total of 41 annotated metabolic pathways from KEGG, among which five were unique to M. hyopneumoniae. A total of 234 proteins were identified to be involved in these metabolic pathways. Although 125 non-homologous and predicted essential proteins were found from the total that could serve as potential drug targets and vaccine candidates, additional prioritizing parameters characterized 21 proteins as vaccine candidates, while druggability of each of the identified proteins evaluated by the DrugBank database prioritized 42 proteins as suitable drug targets.

  7. Comparative analysis of two phenotypically-similar but genomically-distinct Burkholderia cenocepacia-specific bacteriophages

    PubMed Central

    2012-01-01

    Background Genomic analysis of bacteriophages infecting the Burkholderia cepacia complex (BCC) is an important preliminary step in the development of a phage therapy protocol for these opportunistic pathogens. The objective of this study was to characterize KL1 (vB_BceS_KL1) and AH2 (vB_BceS_AH2), two novel Burkholderia cenocepacia-specific siphoviruses isolated from environmental samples. Results KL1 and AH2 exhibit several unique phenotypic similarities: they infect the same B. cenocepacia strains, they require prolonged incubation at 30°C for the formation of plaques at low titres, and they do not form plaques at similar titres following incubation at 37°C. However, despite these similarities, we have determined using whole-genome pyrosequencing that these phages show minimal relatedness to one another. The KL1 genome is 42,832 base pairs (bp) in length and is most closely related to Pseudomonas phage 73 (PA73). In contrast, the AH2 genome is 58,065 bp in length and is most closely related to Burkholderia phage BcepNazgul. Using both BLASTP and HHpred analysis, we have identified and analyzed the putative virion morphogenesis, lysis, DNA binding, and MazG proteins of these two phages. Notably, MazG homologs identified in cyanophages have been predicted to facilitate infection of stationary phase cells and may contribute to the unique plaque phenotype of KL1 and AH2. Conclusions The nearly indistinguishable phenotypes but distinct genomes of KL1 and AH2 provide further evidence of both vast diversity and convergent evolution in the BCC-specific phage population. PMID:22676492

  8. Study of Modern Human Evolution via Comparative Analysis with the Neanderthal Genome

    PubMed Central

    Ahmed, Musaddeque

    2013-01-01

    Many other human species appeared in evolution in the last 6 million years that have not been able to survive to modern times and are broadly known as archaic humans, as opposed to the extant modern humans. It has always been considered fascinating to compare the modern human genome with that of archaic humans to identify modern human-specific sequence variants and figure out those that made modern humans different from their predecessors or cousin species. Neanderthals are the latest humans to become extinct, and many factors made them the best representatives of archaic humans. Even though a number of comparisons have been made sporadically between Neanderthals and modern humans, mostly following a candidate gene approach, the major breakthrough took place with the sequencing of the Neanderthal genome. The initial genome-wide comparison, based on the first draft of the Neanderthal genome, has generated some interesting inferences regarding variations in functional elements that are not shared by the two species and the debated admixture question. However, there are certain other genetic elements that were not included or included at a smaller scale in those studies, and they should be compared comprehensively to better understand the molecular make-up of modern humans and their phenotypic characteristics. Besides briefly discussing the important outcomes of the comparative analyses made so far between modern humans and Neanderthals, we propose that future comparative studies may include retrotransposons, pseudogenes, and conserved non-coding regions, all of which might have played significant roles during the evolution of modern humans. PMID:24465235

  9. The comparative chloroplast genomic analysis of photosynthetic orchids and developing DNA markers to distinguish Phalaenopsis orchids.

    PubMed

    Jheng, Cheng-Fong; Chen, Tien-Chih; Lin, Jhong-Yi; Chen, Ting-Chieh; Wu, Wen-Luan; Chang, Ching-Chun

    2012-07-01

    The chloroplast genome of Phalaenopsis equestris was determined and compared to those of Phalaenopsis aphrodite and Oncidium Gower Ramsey in Orchidaceae. The chloroplast genome of P. equestris is 148,959 bp, and a pair of inverted repeats (25,846 bp) separates the genome into large single-copy (85,967 bp) and small single-copy (11,300 bp) regions. The genome encodes 109 genes, including 4 rRNA, 30 tRNA and 75 protein-coding genes, but loses four ndh genes (ndhA, E, F and H) and seven other ndh genes are pseudogenes. The rate of inter-species variation between the two moth orchids was 0.74% (1107 sites) for single nucleotide substitution and 0.24% for insertions (161 sites; 1388 bp) and deletions (189 sites; 1393 bp). The IR regions have a lower rate of nucleotide substitution (3.5-5.8-fold) and indels (4.3-7.1-fold) than single-copy regions. The intergenic spacers are the most divergent, and based on the length variation of the three intergenic spacers, 11 native Phalaenopsis orchids could be successfully distinguished. The coding genes, IR junction and RNA editing sites are relatively more conserved between the two moth orchids than between those of Phalaenopsis and Oncidium spp.
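
    The region lengths reported in this abstract can be checked against the total genome size, since a quadripartite chloroplast genome consists of one large single-copy region, one small single-copy region and two copies of the inverted repeat. The short Python check below uses only the figures quoted above; the function name is an illustrative choice.

```python
def quadripartite_length(lsc, ssc, ir):
    """Total length of a quadripartite plastid genome: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


# Lengths (bp) quoted in the abstract for Phalaenopsis equestris.
total = quadripartite_length(lsc=85_967, ssc=11_300, ir=25_846)
print(total)  # 148959
```

    The result matches the stated genome size of 148,959 bp.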

  10. Comparative analysis of the domestic cat genome reveals genetic signatures underlying feline biology and domestication

    PubMed Central

    Li, Gang; Gandolfi, Barbara; Khan, Razib; Aken, Bronwen L.; Searle, Steven M. J.; Minx, Patrick; Hillier, LaDeana W.; Koboldt, Daniel C.; Davis, Brian W.; Driscoll, Carlos A.; Barr, Christina S.; Blackistone, Kevin; Quilez, Javier; Lorente-Galdos, Belen; Marques-Bonet, Tomas; Alkan, Can; Thomas, Gregg W. C.; Hahn, Matthew W.; Menotti-Raymond, Marilyn; O’Brien, Stephen J.; Wilson, Richard K.; Lyons, Leslie A.; Murphy, William J.; Warren, Wesley C.

    2014-01-01

    Little is known about the genetic changes that distinguish domestic cat populations from their wild progenitors. Here we describe a high-quality domestic cat reference genome assembly and comparative inferences made with other cat breeds, wildcats, and other mammals. Based upon these comparisons, we identified positively selected genes enriched for genes involved in lipid metabolism that underpin adaptations to a hypercarnivorous diet. We also found positive selection signals within genes underlying sensory processes, especially those affecting vision and hearing in the carnivore lineage. We observed an evolutionary tradeoff between functional olfactory and vomeronasal receptor gene repertoires in the cat and dog genomes, with an expansion of the feline chemosensory system for detecting pheromones at the expense of odorant detection. Genomic regions harboring signatures of natural selection that distinguish domestic cats from their wild congeners are enriched in neural crest-related genes associated with behavior and reward in mouse models, as predicted by the domestication syndrome hypothesis. Our description of a previously unidentified allele for the gloving pigmentation pattern found in the Birman breed supports the hypothesis that cat breeds experienced strong selection on specific mutations drawn from random bred populations. Collectively, these findings provide insight into how the process of domestication altered the ancestral wildcat genome and build a resource for future disease mapping and phylogenomic studies across all members of the Felidae. PMID:25385592

  11. Comparative genomic analysis of three white spot syndrome virus isolates of different virulence.

    PubMed

    Li, Fang; Gao, Meiling; Xu, Limei; Yang, Feng

    2017-04-01

    Three white spot syndrome virus (WSSV) isolates of different virulence were identified in our previous study, the high-virulent strain WSSV-CN01, the moderate-virulent strain WSSV-CN02 and the low-virulent strain WSSV-CN03. In this study, the genomes of these three WSSV isolates were sequenced, annotated and compared. The genome sizes for WSSV-CN01, WSSV-CN02, and WSSV-CN03 are 309,286, 294,261, and 284,148 bp, bearing 177, 164, and 154 putative protein-coding genes, respectively. The genomic variations including insertions, deletions, and substitutions were investigated. Thirty four genes show >20% variation in their sequences in WSSV-CN02 or WSSV-CN03, in comparison with WSSV-CN01, including six envelope protein genes (wsv237/vp41A, wsv238/vp52A, wsv338/vp62, wsv339/vp39, wsv077/vp36A, and wsv242/vp41B), and two immediate-early genes (wsv108 and wsv178). The genomic variations among WSSV isolates of different virulence, especially those in the coding regions, certainly provide new insight into the understanding of the molecular basis of WSSV pathogenesis.

  12. Comparative analysis of field-isolate and monkey-adapted Plasmodium vivax genomes.

    PubMed

    Chan, Ernest R; Barnwell, John W; Zimmerman, Peter A; Serre, David

    2015-03-01

    Significant insights into the biology of Plasmodium vivax have been gained from the ability to successfully adapt human infections to non-human primates. P. vivax strains grown in monkeys serve as a renewable source of parasites for in vitro and ex vivo experimental studies and functional assays, or for studying in vivo the relapse characteristics, mosquito species compatibilities, drug susceptibility profiles or immune responses towards potential vaccine candidates. Despite the importance of these studies, little is known as to how adaptation to a different host species may influence the genome of P. vivax. In addition, it is unclear whether these monkey-adapted strains consist of a single clonal population of parasites or if they retain the multiclonal complexity commonly observed in field isolates. Here we compare the genome sequences of seven P. vivax strains adapted to New World monkeys with those of six human clinical isolates collected directly in the field. We show that the adaptation of P. vivax parasites to monkey hosts, and their subsequent propagation, did not result in significant modifications of their genome sequence and that these monkey-adapted strains recapitulate the genomic diversity of field isolates. Our analyses also reveal that these strains are not always genetically homogeneous and should be analyzed cautiously. Overall, our study provides a framework to better leverage this important research material and fully utilize this resource for improving our understanding of P. vivax biology.

  13. Comparative analysis of the domestic cat genome reveals genetic signatures underlying feline biology and domestication.

    PubMed

    Montague, Michael J; Li, Gang; Gandolfi, Barbara; Khan, Razib; Aken, Bronwen L; Searle, Steven M J; Minx, Patrick; Hillier, LaDeana W; Koboldt, Daniel C; Davis, Brian W; Driscoll, Carlos A; Barr, Christina S; Blackistone, Kevin; Quilez, Javier; Lorente-Galdos, Belen; Marques-Bonet, Tomas; Alkan, Can; Thomas, Gregg W C; Hahn, Matthew W; Menotti-Raymond, Marilyn; O'Brien, Stephen J; Wilson, Richard K; Lyons, Leslie A; Murphy, William J; Warren, Wesley C

    2014-12-02

    Little is known about the genetic changes that distinguish domestic cat populations from their wild progenitors. Here we describe a high-quality domestic cat reference genome assembly and comparative inferences made with other cat breeds, wildcats, and other mammals. Based upon these comparisons, we identified positively selected genes enriched for genes involved in lipid metabolism that underpin adaptations to a hypercarnivorous diet. We also found positive selection signals within genes underlying sensory processes, especially those affecting vision and hearing in the carnivore lineage. We observed an evolutionary tradeoff between functional olfactory and vomeronasal receptor gene repertoires in the cat and dog genomes, with an expansion of the feline chemosensory system for detecting pheromones at the expense of odorant detection. Genomic regions harboring signatures of natural selection that distinguish domestic cats from their wild congeners are enriched in neural crest-related genes associated with behavior and reward in mouse models, as predicted by the domestication syndrome hypothesis. Our description of a previously unidentified allele for the gloving pigmentation pattern found in the Birman breed supports the hypothesis that cat breeds experienced strong selection on specific mutations drawn from random bred populations. Collectively, these findings provide insight into how the process of domestication altered the ancestral wildcat genome and build a resource for future disease mapping and phylogenomic studies across all members of the Felidae.

  14. Gramene: A Resource for Comparative Analysis of Plants Genomes and Pathways.

    PubMed

    Tello-Ruiz, Marcela Karey; Stein, Joshua; Wei, Sharon; Youens-Clark, Ken; Jaiswal, Pankaj; Ware, Doreen

    2016-01-01

    Gramene is an integrated informatics resource for accessing, visualizing, and comparing plant genomes and biological pathways. Originally targeting grasses, Gramene has grown to host annotations for economically important and research model crops, including wheat, potato, tomato, banana, grape, poplar, and Chlamydomonas. Its strength derives from the application of a phylogenetic framework for genome comparison and the use of ontologies to integrate structural and functional annotation data. This chapter outlines system requirements for end users and database hosting, data types and basic navigation within Gramene, and provides examples of how to (1) view a phylogenetic tree for a family of transcription factors, (2) explore genetic variation in the orthologues of a gene with a known trait association, and (3) upload, visualize, and privately share end user data into a new genome browser track. Moreover, this is the first publication describing Gramene's new web interface, intended to provide a simplified portal to the most complete and up-to-date set of plant genome and pathway annotations.

  15. Gene3D: Multi-domain annotations for protein sequence and comparative genome analysis.

    PubMed

    Lees, Jonathan G; Lee, David; Studer, Romain A; Dawson, Natalie L; Sillitoe, Ian; Das, Sayoni; Yeats, Corin; Dessailly, Benoit H; Rentzsch, Robert; Orengo, Christine A

    2014-01-01

    Gene3D (http://gene3d.biochem.ucl.ac.uk) is a database of protein domain structure annotations for protein sequences. Domains are predicted using a library of profile HMMs from 2738 CATH superfamilies. Gene3D assigns domain annotations to Ensembl and UniProt sequence sets including >6000 cellular genomes and >20 million unique protein sequences. This represents an increase of 45% in the number of protein sequences since our last publication. Thanks to improvements in the underlying data and pipeline, we see large increases in the domain coverage of sequences. We have expanded this coverage by integrating Pfam and SUPERFAMILY domain annotations, and we now resolve domain overlaps to provide highly comprehensive composite multi-domain architectures. To make these data more accessible for comparative genome analyses, we have developed novel search algorithms for searching genomes to identify related multi-domain architectures. In addition to providing domain family annotations, we have now developed a pipeline for 3D homology modelling of domains in Gene3D. This has been applied to the human genome and will be rolled out to other major organisms over the next year.

  16. Actin, actin-related proteins and profilin in diatoms: a comparative genomic analysis.

    PubMed

    Aumeier, Charlotte; Polinski, Ellen; Menzel, Diedrik

    2015-10-01

    Diatoms are heterokont unicellular algae with a widespread distribution throughout all aquatic habitats. Research on diatoms has advanced significantly over the last decade due to available genetic transformation methods and publicly available genome databases. Yet up to now, proteins involved in the regulation of the cytoskeleton in diatoms are largely unknown. Consequently, this work focuses on actin and actin-related proteins (ARPs) encoded in the diatom genomes of Thalassiosira pseudonana, Thalassiosira oceanica, Phaeodactylum tricornutum, Fragilariopsis cylindrus and Pseudo-nitzschia multiseries. Our comparative genomic study revealed that most diatoms possess only a single conventional actin and a small set of ARPs. Among these are the highly conserved cytoplasmic Arp1 protein and the nuclear Arp4 as well as Arp6. Diatom genomes contain genes coding for two structurally different homologues of Arp4 that might serve specific functions. All diatom species examined here lack ARP2 and ARP3 proteins, suggesting that diatoms are not capable of forming the Arp2/3 complex, which is essential in most eukaryotes for actin filament branching and plus-end dynamics. Interestingly, none of the sequenced representatives of the Bacillariophyta phylum code for profilin. Profilin is an essential actin-binding protein regulating the monomer actin pool and is involved in filament plus-end dynamics. This is the first report of organisms not containing profilin.

  17. Complete plastid genome of Eriobotrya japonica (Thunb.) Lindl and comparative analysis in Rosaceae.

    PubMed

    Shen, Liqun; Guan, Qijie; Amin, Awais; Zhu, Wei; Li, Mengzhu; Li, Ximin; Zhang, Lin; Tian, Jingkui

    2016-01-01

    Eriobotrya japonica (Thunb.) Lindl (loquat) is an evergreen Rosaceae fruit tree widely distributed in subtropical regions. Its leaves are considered as traditional Chinese medicine and are of high medical value especially for cough and emesis. Thus, we sequenced the complete plastid genome of E. japonica to better utilize this important species. The complete plastid genome of E. japonica is 159,137 bp in length, which contains a typical quadripartite structure with a pair of inverted repeats (IR, 26,326 bp) separated by large (LSC, 89,202 bp) and small (SSC, 19,283 bp) single-copy regions. The E. japonica plastid genome encodes 112 unique genes which consist of 78 protein-coding genes, 30 tRNA genes and 4 rRNA genes. Gene structure and content of E. japonica plastid genome are quite conserved and show similarity among Rosaceous species. Five large indels are unique to E. japonica in comparison with Pyrus pyrifolia and Prunus persica, which could be utilized as molecular markers. A total of 72 simple sequence repeats (SSRs) were detected and most of them are mononucleotide repeats composed of A or T, indicating a strong A or T bias for base composition. The Ka and Ks ratios of most genes are lower than 1, which suggests that most genes are under purifying selection. The phylogenetic analysis described the evolutionary relationship within Rosaceae and fully supported a close relationship between E. japonica and P. pyrifolia.
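
    The abstract interprets Ka/Ks ratios below 1 as evidence of purifying selection. A minimal Python sketch of that reading is given here; the gene labels, the Ka and Ks values, and the tolerance band around 1 are all hypothetical assumptions for illustration, not values from the study.

```python
def classify_selection(ka, ks, tol=0.05):
    """Classify a gene by its Ka/Ks ratio.

    ka, ks: nonsynonymous and synonymous substitution rates.
    tol: arbitrary half-width of the band treated as neutral (an assumption).
    Returns (ratio, label); ratio is None when Ks is zero.
    """
    if ks == 0:
        return None, "undefined (Ks = 0)"
    ratio = ka / ks
    if ratio < 1 - tol:
        label = "purifying"
    elif ratio > 1 + tol:
        label = "positive"
    else:
        label = "neutral"
    return ratio, label


# Hypothetical (Ka, Ks) pairs for three placeholder genes.
genes = {"gene_a": (0.004, 0.050), "gene_b": (0.030, 0.020), "gene_c": (0.010, 0.000)}
for name, (ka, ks) in genes.items():
    print(name, classify_selection(ka, ks))
```

    In practice Ka and Ks would come from a codon-aware estimator applied to aligned coding sequences; the sketch covers only the final interpretation step.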

  18. Comparative Reannotation of 21 Aspergillus Genomes

    SciTech Connect

    Salamov, Asaf; Riley, Robert; Kuo, Alan; Grigoriev, Igor

    2013-03-08

    We used comparative gene modeling to reannotate 21 Aspergillus genomes. Initial automatic annotation of individual genomes may contain errors of various kinds, e.g. missing genes, incorrect exon-intron structures, and 'chimeras' that fuse two or more real genes or, alternatively, split a real gene into two or more models. The main premise behind the comparative modeling approach is that for closely related genomes most orthologous families have the same conserved gene structure. The algorithm maps all gene models predicted in each individual Aspergillus genome to the other genomes and, for each locus, selects from potentially many competing models the one that most closely resembles the orthologous genes from other genomes. This procedure is iterated until no further change in gene models is observed. For Aspergillus genomes we predicted in total 4503 new gene models (~2% per genome), supported by comparative analysis, additionally correcting ~18% of old gene models. This resulted in a total of 4065 more genes with annotated PFAM domains (~3% increase per genome). Analysis of a few genomes with EST/transcriptomics data shows that the new annotation sets also have a higher number of EST-supported splice sites at exon-intron boundaries.
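
    The select-and-iterate procedure described in this abstract can be illustrated with a toy fixed-point loop. The Python sketch below is not the authors' implementation: it assumes that a gene model can be reduced to a tuple of exon lengths, that "resemblance" is simply the number of other genomes carrying an identical model, and that every genome already has a current model for each family. All names and data are hypothetical.

```python
def reannotate(current, candidates, max_rounds=50):
    """Iteratively pick, per genome and family, the candidate model that agrees
    with the most orthologous models in the other genomes.

    current:    {genome: {family: model}}
    candidates: {genome: {family: [model, ...]}}
    A model is a tuple of exon lengths (toy stand-in for gene structure).
    """
    current = {g: dict(fams) for g, fams in current.items()}  # work on a copy
    for _ in range(max_rounds):
        changed = False
        for genome, fams in candidates.items():
            for family, options in fams.items():
                others = [current[g][family] for g in current
                          if g != genome and family in current[g]]
                if not others:
                    continue

                def score(model):
                    # Toy similarity: count of identical orthologous models.
                    return sum(model == other for other in others)

                best = max(options, key=score)
                if score(best) > score(current[genome][family]):
                    current[genome][family] = best
                    changed = True
        if not changed:  # fixed point: no model changed in a full pass
            break
    return current


# Hypothetical data: genome_C carries a fused single-exon model for family f1.
current = {
    "genome_A": {"f1": (100, 200)},
    "genome_B": {"f1": (100, 200)},
    "genome_C": {"f1": (300,)},
}
candidates = {
    "genome_A": {"f1": [(100, 200)]},
    "genome_B": {"f1": [(100, 200)]},
    "genome_C": {"f1": [(300,), (100, 200)]},
}
result = reannotate(current, candidates)
print(result["genome_C"]["f1"])  # (100, 200)
```

    A realistic version would score candidates by alignment-based similarity of exon-intron structure rather than exact equality; the `max_rounds` cap is a safeguard added here and is not mentioned in the abstract.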

  19. The complete chloroplast genome sequence of Gentiana lawrencei var. farreri (Gentianaceae) and comparative analysis with its congeneric species

    PubMed Central

    Fu, Peng-Cheng; Zhang, Yan-Zhao; Geng, Hui-Min

    2016-01-01

    Background The chloroplast (cp) genome is useful in plant systematics, genetic diversity analysis, molecular identification and divergence dating. The genus Gentiana contains 362 species, but there are only two valuable complete cp genomes. The purpose of this study is to report the characterization of complete cp genome of G. lawrencei var. farreri, which is endemic to the Qinghai-Tibetan Plateau (QTP). Methods Using high throughput sequencing technology, we got the complete nucleotide sequence of the G. lawrencei var. farreri cp genome. The comparison analysis including genome difference and gene divergence was performed with its congeneric species G. straminea. The simple sequence repeats (SSRs) and phylogenetics were studied as well. Results The cp genome of G. lawrencei var. farreri is a circular molecule of 138,750 bp, containing a pair of 24,653 bp inverted repeats which are separated by small and large single-copy regions of 11,365 and 78,082 bp, respectively. The cp genome contains 130 known genes, including 85 protein coding genes (PCGs), eight ribosomal RNA genes and 37 tRNA genes. Comparative analyses indicated that G. lawrencei var. farreri is 10,241 bp shorter than its congeneric species G. straminea. Four large gaps were detected that are responsible for 85% of the total sequence loss. Further detailed analyses revealed that 10 PCGs were included in the four gaps that encode nine NADH dehydrogenase subunits. The cp gene content, order and orientation are similar to those of its congeneric species, but with some variation among the PCGs. Three genes, ndhB, ndhF and clpP, have high nonsynonymous to synonymous values. There are 34 SSRs in the G. lawrencei var. farreri cp genome, of which 25 are mononucleotide repeats: no dinucleotide repeats were detected. Comparison with the G. straminea cp genome indicated that five SSRs have length polymorphisms and 23 SSRs are species-specific. The phylogenetic analysis of 48 PCGs from 12 Gentianales taxa cp genomes
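
    Mononucleotide SSRs of the kind counted in this abstract are runs of a single repeated base. A minimal Python sketch of how such runs could be located with a regular expression is shown below; the minimum run length of 10 is an assumed threshold (the abstract does not state the criteria used), and the test sequence is synthetic.

```python
import re


def find_mono_ssrs(seq, min_repeats=10):
    """Return (start, base, run_length) for every homopolymer run of at least
    `min_repeats` bases. Positions are 0-based. The threshold is an assumption."""
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_repeats - 1))
    return [(m.start(), m.group(1), len(m.group(0)))
            for m in pattern.finditer(seq.upper())]


# Synthetic sequence: an 11-base T run and a 10-base A run.
sequence = "GGA" + "T" * 11 + "C" + "A" * 10 + "GC"
print(find_mono_ssrs(sequence))  # [(3, 'T', 11), (15, 'A', 10)]
```

    Di- and trinucleotide repeats would need additional patterns that exclude homopolymer units, which this sketch does not attempt.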

  20. Array-Based Genomic Comparative Hybridization Analysis of Field Strains of Mycoplasma hyopneumoniae

    PubMed Central

    Madsen, Melissa L.; Oneal, Michael J.; Gardner, Stuart W.; Strait, Erin L.; Nettleton, Dan; Thacker, Eileen L.; Minion, F. Chris

    2007-01-01

    Mycoplasma hyopneumoniae is the causative agent of porcine enzootic pneumonia and a major factor in the porcine respiratory disease complex. A clear understanding of the mechanisms of pathogenesis does not exist, although it is clear that M. hyopneumoniae adheres to porcine ciliated epithelium by action of a protein called P97. Previous studies have shown variation in the gene encoding the P97 cilium adhesin in different strains of M. hyopneumoniae, but the extent of genetic variation among field strains across the genome is not known. Since M. hyopneumoniae is a worldwide problem, it is reasonable to expect that a wide range of genetic variability may exist given all of the different breeds and housing conditions. This variation may impact the overall virulence of a single strain. Using microarray technology, this study examined the potential variation of 14 field strains compared to strain 232, on which the array was based. Genomic DNA was obtained, amplified with TempliPhi, and labeled indirectly with Alexa dyes. After genomic hybridization, the arrays were scanned and data were analyzed using a linear statistical model. The results indicated that genetic variation could be detected in all 14 field strains but across different loci, suggesting that variation occurs throughout the genome. Fifty-nine percent of the variable loci were hypothetical genes. Twenty-two percent of the lipoprotein genes showed variation in at least one field strain. A permutation test identified a location in the M. hyopneumoniae genome where there is spatial clustering of variability between the field strains and strain 232. PMID:17873054
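
    The abstract mentions a permutation test for spatial clustering of variable loci but does not give its details. One common way to set up such a test is sketched below in Python: the statistic is the largest number of variable loci in any window of consecutive loci, and its null distribution is obtained by shuffling the variable/invariant labels. The statistic, window size, permutation count, linear (non-circular) genome and toy data are all assumptions made for illustration.

```python
import random


def max_window_count(flags, window):
    """Largest number of variable loci in any run of `window` consecutive loci."""
    count = sum(flags[:window])
    best = count
    for i in range(window, len(flags)):
        count += flags[i] - flags[i - window]  # slide the window by one locus
        best = max(best, count)
    return best


def clustering_p_value(flags, window=10, n_perm=999, seed=0):
    """Permutation p-value for the observed maximum window count.

    flags: list of 0/1 values in genome order (1 = locus varies).
    """
    rng = random.Random(seed)
    observed = max_window_count(flags, window)
    shuffled = list(flags)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if max_window_count(shuffled, window) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical loci: eight variable loci packed together among 48.
flags = [0] * 20 + [1] * 8 + [0] * 20
observed, p_value = clustering_p_value(flags)
print(observed, p_value)
```

    A small p-value indicates that variable loci are packed more tightly than expected if variability were scattered at random along the genome.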

  1. Comparative Genome Analysis Reveals Metabolic Versatility and Environmental Adaptations of Sulfobacillus thermosulfidooxidans Strain ST

    PubMed Central

    Guo, Xue; Yin, Huaqun; Liang, Yili; Hu, Qi; Zhou, Xishu; Xiao, Yunhua; Ma, Liyuan; Zhang, Xian; Qiu, Guanzhou; Liu, Xueduan

    2014-01-01

    The genus Sulfobacillus is a cohort of mildly thermophilic or thermotolerant acidophiles within the phylum Firmicutes and requires extremely acidic environments and hypersalinity for optimal growth. However, our understanding of them is still preliminary partly because few genome sequences are available. Here, the draft genome of Sulfobacillus thermosulfidooxidans strain ST was deciphered to obtain a comprehensive insight into the genetic content and to understand the cellular mechanisms necessary for its survival. Furthermore, the expressions of key genes related to iron and sulfur oxidation were verified by semi-quantitative RT-PCR analysis. The draft genome sequence of Sulfobacillus thermosulfidooxidans strain ST, which encodes 3225 predicted coding genes on a total length of 3,333,554 bp and a 48.35% G+C, revealed the high degree of heterogeneity with other Sulfobacillus species. The presence of numerous transposases, genomic islands and complete CRISPR/Cas defence systems testifies to its dynamic evolution consistent with the genome heterogeneity. As expected, S. thermosulfidooxidans encodes a suite of conserved enzymes required for the oxidation of inorganic sulfur compounds (ISCs). The model of sulfur oxidation in S. thermosulfidooxidans was proposed, which showed some different characteristics from the sulfur oxidation of Gram-negative A. ferrooxidans. Sulfur oxygenase reductase and heterodisulfide reductase were suggested to play important roles in the sulfur oxidation. Although the iron oxidation ability was observed, some key proteins cannot be identified in S. thermosulfidooxidans. Unexpectedly, a predicted sulfocyanin is proposed to transfer electrons in the iron oxidation. Furthermore, its carbon metabolism is rather flexible, can perform the transformation of pentose through the oxidative and non-oxidative pentose phosphate pathways and has the ability to take up small organic compounds. It encodes a multitude of heavy metal resistance systems to

  2. Comparative genome analysis reveals metabolic versatility and environmental adaptations of Sulfobacillus thermosulfidooxidans strain ST.

    PubMed

    Guo, Xue; Yin, Huaqun; Liang, Yili; Hu, Qi; Zhou, Xishu; Xiao, Yunhua; Ma, Liyuan; Zhang, Xian; Qiu, Guanzhou; Liu, Xueduan

    2014-01-01

    The genus Sulfobacillus is a cohort of mildly thermophilic or thermotolerant acidophiles within the phylum Firmicutes and requires extremely acidic environments and hypersalinity for optimal growth. However, our understanding of them is still preliminary partly because few genome sequences are available. Here, the draft genome of Sulfobacillus thermosulfidooxidans strain ST was deciphered to obtain a comprehensive insight into the genetic content and to understand the cellular mechanisms necessary for its survival. Furthermore, the expressions of key genes related to iron and sulfur oxidation were verified by semi-quantitative RT-PCR analysis. The draft genome sequence of Sulfobacillus thermosulfidooxidans strain ST, which encodes 3225 predicted coding genes on a total length of 3,333,554 bp and a 48.35% G+C, revealed the high degree of heterogeneity with other Sulfobacillus species. The presence of numerous transposases, genomic islands and complete CRISPR/Cas defence systems testifies to its dynamic evolution consistent with the genome heterogeneity. As expected, S. thermosulfidooxidans encodes a suite of conserved enzymes required for the oxidation of inorganic sulfur compounds (ISCs). The model of sulfur oxidation in S. thermosulfidooxidans was proposed, which showed some different characteristics from the sulfur oxidation of Gram-negative A. ferrooxidans. Sulfur oxygenase reductase and heterodisulfide reductase were suggested to play important roles in the sulfur oxidation. Although the iron oxidation ability was observed, some key proteins cannot be identified in S. thermosulfidooxidans. Unexpectedly, a predicted sulfocyanin is proposed to transfer electrons in the iron oxidation. Furthermore, its carbon metabolism is rather flexible, can perform the transformation of pentose through the oxidative and non-oxidative pentose phosphate pathways and has the ability to take up small organic compounds. It encodes a multitude of heavy metal resistance systems to

  3. Comparative genome-scale analysis of niche-based stress-responsive genes in Lactobacillus helveticus strains.

    PubMed

    Senan, Suja; Prajapati, Jashbhai B; Joshi, Chaitanya G

    2014-04-01

    Next generation sequencing technologies with advanced bioinformatic tools present a unique opportunity to compare genomes from diverse niches. The identification of niche-specific stress-responsive genes can help in characterizing robust strains for multiple applications. In this study, we attempted to compare the stress-responsive genes of a potential probiotic strain, Lactobacillus helveticus MTCC 5463, and a cheese starter strain, Lactobacillus helveticus DPC 4571, from a gut and dairy niche, respectively. Sequencing of MTCC 5463 was done using 454 GS FLX, and contigs were assembled using GS Assembler software. Genome analysis was done using BLAST hits and the prokaryotic annotation server RAST. The MTCC 5463 genome carried multiple orthologs of genes governing stress responses, whereas the DPC 4571 genome lacked in the number of major stress-response proteins. The absence of the bile salt hydrolase gene in DPC 4571 and its presence in MTCC 5463 clearly indicated niche adaptation. Further, MTCC 5463 carried higher copy numbers of genes contributing towards heat, cold, osmotic, and oxidative stress resistance as compared with DPC 4571. Through comparative genomics, we could thus identify stress-responsive gene sets required to adapt to gut and dairy niches.

  4. Comparative genomic analysis reveals a distant liver enhancer upstream of the COUP-TFII gene

    SciTech Connect

    Baroukh, Nadine; Ahituv, Nadav; Chang, Jessie; Shoukry, Malak; Afzal, Veena; Rubin, Edward M.; Pennacchio, Len A.

    2004-08-20

    COUP-TFII is a central nuclear hormone receptor that tightly regulates the expression of numerous target lipid metabolism genes in vertebrates. However, it remains unclear how COUP-TFII itself is transcriptionally controlled since studies with its promoter and upstream region fail to recapitulate the gene's liver expression. In an attempt to identify liver enhancers in the vicinity of COUP-TFII, we employed a comparative genomic approach. Initial comparisons between humans and mice of the 3,470 kb gene-poor region surrounding COUP-TFII revealed 2,023 conserved non-coding elements. To prioritize a subset of these elements for functional studies, we performed further genomic comparisons with the orthologous pufferfish (Fugu rubripes) locus and uncovered two anciently conserved non-coding sequences (CNS) upstream of COUP-TFII (CNS-62kb and CNS-66kb). Testing these two elements using reporter constructs in liver (HepG2) cells revealed that CNS-66kb, but not CNS-62kb, yielded robust in vitro enhancer activity. In addition, an in vivo reporter assay using naked DNA transfer with CNS-66kb linked to luciferase displayed strong reproducible liver expression in adult mice, further supporting its role as a liver enhancer. Together, these studies further support the utility of comparative genomics to uncover gene regulatory sequences based on evolutionary conservation and provide the substrates to better understand the regulation and expression of COUP-TFII.

  5. Gramene: a growing plant comparative genomics resource

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Gramene (www.gramene.org) is a curated genetic, genomic and comparative genome analysis resource for the major crop species, such as rice, maize, wheat and many other plant (mainly grass) species. Gramene is an open-source project, with all data and software freely downloadable through the ftp site ...

  6. Map-Based Comparative Genomic Analysis of Virulent Haemophilus Parasuis Serovars 4 and 5

    PubMed Central

    Lawrence, Paulraj; Bey, Russell

    2015-01-01

    Haemophilus parasuis is a commensal bacterium of the upper respiratory tract of healthy pigs. However, in conjunction with viral infections in immunocompromised animals, H. parasuis can transform into a pathogen that is responsible for causing Glasser's disease, which is typically characterized by fibrinous polyserositis, polyarthritis, meningitis and sometimes acute pneumonia and septicemia in pigs. Haemophilus parasuis serovar 5 is highly virulent and more frequently isolated from respiratory and systemic infection in pigs. Recently a highly virulent H. parasuis serovar 4 was isolated from the tissues of diseased pigs. To understand the differences in virulence and virulence-associated genes between H. parasuis serovar 5 and highly virulent H. parasuis serovar 4 strains, a genomic library was generated by TruSeq preparation and sequenced on Illumina HiSeq 2000, obtaining 50 bp PE reads. A three-way comparative genomic analysis was conducted between two highly virulent H. parasuis serovar 4 strains and H. parasuis serovar 5. Haemophilus parasuis serovar 5 GenBank isolate SH0165 (GenBank accession number CP001321.1) was used as the reference strain for assembly. Results of these analyses revealed that the highly virulent H. parasuis serovar 4 lacks genes encoding glycosyl transferases, polysaccharide biosynthesis protein capD, spore coat polysaccharide biosynthesis protein C, polysaccharide export protein and sialyltransferase, which can modify the lipopolysaccharide, forming a short-chain LPS lacking O-specific polysaccharide chains, often referred to as lipooligosaccharide (LOS). In addition, it can modify the outer membrane protein (OMP) structure. The lack of sialyltransferase significantly reduced the amount of sialic acid incorporated into LOS, a major and essential component of the cell wall and an important virulence determinant. These molecules may be involved in various stages of pathogenesis through molecular mimicry and by causing host cell cytotoxicity, reduced

  7. Comparative genomic analysis reveals 2-oxoacid dehydrogenase complex lipoylation correlation with aerobiosis in archaea.

    PubMed

    Borziak, Kirill; Posner, Mareike G; Upadhyay, Abhishek; Danson, Michael J; Bagby, Stefan; Dorus, Steve

    2014-01-01

    Metagenomic analyses have advanced our understanding of ecological microbial diversity, but to what extent can metagenomic data be used to predict the metabolic capacity of difficult-to-study organisms and their abiotic environmental interactions? We tackle this question, using a comparative genomic approach, by considering the molecular basis of aerobiosis within archaea. Lipoylation, the covalent attachment of lipoic acid to 2-oxoacid dehydrogenase multienzyme complexes (OADHCs), is essential for metabolism in aerobic bacteria and eukarya. Lipoylation is catalysed either by lipoate protein ligase (LplA), which in archaea is typically encoded by two genes (LplA-N and LplA-C), or by a lipoyl(octanoyl) transferase (LipB or LipM) plus a lipoic acid synthetase (LipA). Does the genomic presence of lipoylation and OADHC genes across archaea from diverse habitats correlate with aerobiosis? First, analyses of 11,826 biotin protein ligase (BPL)-LplA-LipB transferase family members and 147 archaeal genomes identified 85 species with lipoylation capabilities and provided support for multiple ancestral acquisitions of lipoylation pathways during archaeal evolution. Second, with the exception of the Sulfolobales order, the majority of species possessing lipoylation systems exclusively retain LplA, or either LipB or LipM, consistent with archaeal genome streamlining. Third, obligate anaerobic archaea display widespread loss of lipoylation and OADHC genes. Conversely, a high level of correspondence is observed between aerobiosis and the presence of LplA/LipB/LipM, LipA and OADHC E2, consistent with the role of lipoylation in aerobic metabolism. This correspondence between OADHC lipoylation capacity and aerobiosis indicates that genomic pathway profiling in archaea is informative and that well characterized pathways may be predictive in relation to abiotic conditions in difficult-to-study extremophiles. Given the highly variable retention of gene repertoires across the archaea

  8. Complete Sequence and Comparative Analysis of the Chloroplast Genome of Coconut Palm (Cocos nucifera)

    PubMed Central

    Huang, Ya-Yi; Matzke, Antonius J. M.; Matzke, Marjori

    2013-01-01

    Coconut, a member of the palm family (Arecaceae), is one of the most economically important trees used by mankind. Despite its diverse morphology, coconut is recognized taxonomically as only a single species (Cocos nucifera L.). There are two major coconut varieties, tall and dwarf, the latter of which displays traits resulting from selection by humans. We report here the complete chloroplast (cp) genome of a dwarf coconut plant, and describe the gene content and organization, inverted repeat fluctuations, repeated sequence structure, and occurrence of RNA editing. Phylogenetic relationships of monocots were inferred based on 47 chloroplast protein-coding genes. Potential nodes for events of gene duplication and pseudogenization related to inverted repeat fluctuation were mapped onto the tree using parsimony criteria. We compare our findings with those from other palm species for which complete cp genome sequences are available. PMID:24023703

  9. Complete sequence and comparative analysis of the chloroplast genome of coconut palm (Cocos nucifera).

    PubMed

    Huang, Ya-Yi; Matzke, Antonius J M; Matzke, Marjori

    2013-01-01

    Coconut, a member of the palm family (Arecaceae), is one of the most economically important trees used by mankind. Despite its diverse morphology, coconut is recognized taxonomically as only a single species (Cocos nucifera L.). There are two major coconut varieties, tall and dwarf, the latter of which displays traits resulting from selection by humans. We report here the complete chloroplast (cp) genome of a dwarf coconut plant, and describe the gene content and organization, inverted repeat fluctuations, repeated sequence structure, and occurrence of RNA editing. Phylogenetic relationships of monocots were inferred based on 47 chloroplast protein-coding genes. Potential nodes for events of gene duplication and pseudogenization related to inverted repeat fluctuation were mapped onto the tree using parsimony criteria. We compare our findings with those from other palm species for which complete cp genome sequences are available.

  10. Comparative genomic analysis of aspartic proteases in eight parasitic platyhelminths: insights into functions and evolution.

    PubMed

    Wang, Shuai; Wei, Wei; Luo, Xuenong; Wang, Sen; Hu, Songnian; Cai, Xuepeng

    2015-03-15

    We performed genome-wide identifications and comparative genomic analyses of the predicted aspartic proteases (APs) from eight parasitic flatworms, focusing on their evolution, potential as drug targets and expression patterns. The results revealed that: i) more members of family A01 were identified from the schistosomes than from the cestodes; some evidence implied gene loss events along the class Cestoda, which may be related to the different ways to ingest host nutrition; ii) members in family A22 were evolutionarily highly conserved among all the parasites; iii) one retroviral-like AP in family A28 shared a highly similar predicted 3D structure with the HIV protease, implying its potential to be inhibited by HIV inhibitor-like molecules; and iv) retrotransposon-associated APs were extensively expanded among these parasites. These results implied that the evolutionary histories of some APs in these parasites might relate to adaptations to their parasitism and that some APs might have potential as intervention targets.

  11. Comparative Genomic Analysis Reveals a Possible Novel Non-Tuberculous Mycobacterium Species with High Pathogenic Potential

    PubMed Central

    Choo, Siew Woh; Dutta, Avirup; Wong, Guat Jah; Wee, Wei Yee; Ang, Mia Yang; Siow, Cheuk Chuen

    2016-01-01

    Mycobacteria have been reported to cause a wide range of human diseases. We present the first whole-genome study of a Non-Tuberculous Mycobacterium, Mycobacterium sp. UM_CSW (referred to hereafter as UM_CSW), isolated from a patient diagnosed with bronchiectasis. Our data suggest that this clinical isolate is likely a novel mycobacterial species, supported by clear evidence from molecular phylogenetic, comparative genomic, ANI and AAI analyses. UM_CSW is closely related to the Mycobacterium avium complex. While it has characteristic features of an environmental bacterium, it also shows a high pathogenic potential with the presence of a wide variety of putative genes related to bacterial virulence and shares very similar pathogenomic profiles with the known pathogenic mycobacterial species. Thus, we conclude that this possible novel Mycobacterium species should be tightly monitored for its possible causative role in human infections. PMID:27035710

  12. Comparative genome analysis between Agrostis stolonifera and members of the Pooideae subfamily, including Brachypodium distachyon.

    PubMed

    Araneda, Loreto; Sim, Sung-Chur; Bae, Jin-Joo; Chakraborty, Nanda; Curley, Joe; Chang, Taehyun; Inoue, Maiko; Warnke, Scott; Jung, Geunhwa

    2013-01-01

    Creeping bentgrass (Agrostis stolonifera, allotetraploid 2n = 4x = 28) is one of the major cool-season turfgrasses. It is widely used on golf courses due to its tolerance to low mowing and aggressive growth habit. In this study, we investigated genome relationships of creeping bentgrass relative to the Triticeae (a consensus map of Triticum aestivum, T. tauschii, Hordeum vulgare, and H. spontaneum), oat, rice, and ryegrass maps using a common set of 229 EST-RFLP markers. The genome comparisons based on the RFLP markers revealed large-scale chromosomal rearrangements on different numbers of linkage groups (LGs) of creeping bentgrass relative to the Triticeae (3 LGs), oat (4 LGs), and rice (8 LGs). However, we detected no chromosomal rearrangement between creeping bentgrass and ryegrass, suggesting that these recently domesticated species might be closely related, despite their membership in different Pooideae tribes. In addition, the genome of creeping bentgrass was compared with the complete genome sequence of Brachypodium distachyon in the Pooideae subfamily using both sequences of the above-mentioned mapped EST-RFLP markers and sequences of 8,470 publicly available A. stolonifera ESTs (AgEST). We discovered large-scale chromosomal rearrangements on six LGs of creeping bentgrass relative to B. distachyon. Also, a total of 24 syntenic blocks based on 678 orthologous loci were identified between these two grass species. The EST orthologs can be utilized in further comparative mapping of Pooideae species. These results will be useful for genetic improvement of Agrostis species and will provide a better understanding of evolution within Pooideae species.

  13. Comparative Genome Analysis between Agrostis stolonifera and Members of the Pooideae Subfamily, including Brachypodium distachyon

    PubMed Central

    Bae, Jin-Joo; Chakraborty, Nanda; Curley, Joe; Chang, Taehyun; Inoue, Maiko; Warnke, Scott; Jung, Geunhwa

    2013-01-01

    Creeping bentgrass (Agrostis stolonifera, allotetraploid 2n = 4x = 28) is one of the major cool-season turfgrasses. It is widely used on golf courses due to its tolerance to low mowing and aggressive growth habit. In this study, we investigated genome relationships of creeping bentgrass relative to the Triticeae (a consensus map of Triticum aestivum, T. tauschii, Hordeum vulgare, and H. spontaneum), oat, rice, and ryegrass maps using a common set of 229 EST-RFLP markers. The genome comparisons based on the RFLP markers revealed large-scale chromosomal rearrangements on different numbers of linkage groups (LGs) of creeping bentgrass relative to the Triticeae (3 LGs), oat (4 LGs), and rice (8 LGs). However, we detected no chromosomal rearrangement between creeping bentgrass and ryegrass, suggesting that these recently domesticated species might be closely related, despite their membership in different Pooideae tribes. In addition, the genome of creeping bentgrass was compared with the complete genome sequence of Brachypodium distachyon in the Pooideae subfamily using both sequences of the above-mentioned mapped EST-RFLP markers and sequences of 8,470 publicly available A. stolonifera ESTs (AgEST). We discovered large-scale chromosomal rearrangements on six LGs of creeping bentgrass relative to B. distachyon. Also, a total of 24 syntenic blocks based on 678 orthologous loci were identified between these two grass species. The EST orthologs can be utilized in further comparative mapping of Pooideae species. These results will be useful for genetic improvement of Agrostis species and will provide a better understanding of evolution within Pooideae species. PMID:24244501

  14. Association between chromosomal aberration of COX8C and tethered spinal cord syndrome: array-based comparative genomic hybridization analysis

    PubMed Central

    Zhao, Qiu-jiong; Bai, Shao-cong; Cheng, Cheng; Tao, Ben-zhang; Wang, Le-kai; Liang, Shuang; Yin, Ling; Hang, Xing-yi; Shang, Ai-jia

    2016-01-01

    Copy number variations have been found in patients with neural tube abnormalities. In this study, we performed genome-wide screening using high-resolution array-based comparative genomic hybridization in three children with tethered spinal cord syndrome and two healthy parents. Of eight copy number variations, four were non-polymorphic. These non-polymorphic copy number variations were associated with Angelman and Prader-Willi syndromes, and microcephaly. Gene function enrichment analysis revealed that COX8C, a gene associated with metabolic disorders of the nervous system, was located in the copy number variation region of Patient 1. Our results indicate that array-based comparative genomic hybridization can be used to diagnose tethered spinal cord syndrome. Our results may help determine the pathogenesis of tethered spinal cord syndrome and prevent occurrence of this disease. PMID:27651783

  15. Comparative genomic analysis of Aspergillus oryzae strains 3.042 and RIB40 for soy sauce fermentation.

    PubMed

    Zhao, Guozhong; Yao, Yunping; Wang, Chunling; Hou, Lihua; Cao, Xiaohong

    2013-06-17

    The filamentous fungus Aspergillus oryzae 3.042 (Chinese strain) is a close relative of A. oryzae RIB40 (Japanese strain), which is the important agent used for soy sauce fermentation. The genome of A. oryzae 3.042 was sequenced and compared with A. oryzae RIB40 in an attempt to understand why different soy sauce flavors are produced by these strains. The A. oryzae 3.042 chromosome is 36,547,279 bp and contains 11,399 protein-encoding genes. MUMmer analysis revealed that the genomes of A. oryzae 3.042 and RIB40 are mostly collinear. Genome sequence data and comparative analysis of the two strains identified several strain-specific genes that encode putative proteins involved in cell growth, salt tolerance, environmental resistance and flavor formation. A. oryzae 3.042 showed stronger potential for mycelial growth. Some genes unique to A. oryzae RIB40 were related to salt tolerance, especially genes for K(+) transport, while others were associated with ester formation and amino acid metabolism, which likely contribute to flavor formation. In conclusion, comparative genome analysis provided insights into the different genetic traits of the two A. oryzae strains. The unique genes that we found in A. oryzae are likely to be relevant to soy sauce fermentation.

  16. Culex genome is not just another genome for comparative genomics.

    PubMed

    Reddy, B P Niranjan; Labbé, Pierrick; Corbel, Vincent

    2012-03-30

    Formal publication of the Culex genome sequence has closed the human disease vector triangle by meeting the Anopheles gambiae and Aedes aegypti genome sequences. Compared to these other mosquitoes, Culex quinquefasciatus possesses many specific hallmark characteristics, and may thus provide different angles for research that ultimately leads to a practical solution for controlling the ever-increasing burden of insect-vector-borne diseases around the globe. We argue for the special importance of the genome sequence of this cosmopolitan species by raising many interesting questions and discussing the potential of the Culex genome to answer them.

  17. Comparative Genomic Analysis of the Streptococcus dysgalactiae Species Group: Gene Content, Molecular Adaptation, and Promoter Evolution

    PubMed Central

    Suzuki, Haruo; Lefébure, Tristan; Hubisz, Melissa Jane; Pavinski Bitar, Paulina; Lang, Ping; Siepel, Adam; Stanhope, Michael J.

    2011-01-01

    Comparative genomics of closely related bacterial species with different pathogenesis and host preference can provide a means of identifying the specifics of adaptive differences. Streptococcus dysgalactiae (SD) is comprised of two subspecies: S. dysgalactiae subsp. equisimilis is both a human commensal organism and a human pathogen, and S. dysgalactiae subsp. dysgalactiae is strictly an animal pathogen. Here, we present complete genome sequences for both taxa, with analyses involving other species of Streptococcus but focusing on adaptation in the SD species group. We found little evidence for enrichment in biochemical categories of genes carried by each SD strain; however, differences in the virulence gene repertoire were apparent. Some of the differences could be ascribed to prophage and integrative conjugative elements. We identified approximately 9% of the nonrecombinant core genome to be under positive selection, some of which involved known virulence factors in other bacteria. Analyses of proteomes by pooling data across genes, by biochemical category, clade, or branch, provided evidence for increased rates of evolution in several gene categories, as well as external branches of the tree. Promoters were primarily evolving under purifying selection but with certain categories of genes evolving faster. Many of these fast-evolving categories were the same as those associated with rapid evolution in proteins. Overall, these results suggest that adaptation to changing environments and new hosts in the SD species group has involved the acquisition of key virulence genes along with selection of orthologous protein-coding loci and operon promoters. PMID:21282711

  18. Whole-Genome Sequencing and Comparative Analysis of Mycobacterium brisbanense Reveals a Possible Soil Origin and Capability in Fertiliser Synthesis

    PubMed Central

    Wee, Wei Yee; Tan, Tze King; Jakubovics, Nicholas S.; Choo, Siew Woh

    2016-01-01

    Mycobacterium brisbanense is a member of Mycobacterium fortuitum third biovariant complex, which includes rapidly growing Mycobacterium spp. that normally inhabit soil, dust and water, and can sometimes cause respiratory tract infections in humans. We present the first whole-genome analysis of M. brisbanense UM_WWY which was isolated from a 70-year-old Malaysian patient. Molecular phylogenetic analyses confirmed the identification of this strain as M. brisbanense and showed that it has an unusually large genome compared with related mycobacteria. The large genome size of M. brisbanense UM_WWY (~7.7Mbp) is consistent with further findings that this strain has a highly variable genome structure that contains many putative horizontally transferred genomic islands and prophage. Comparative analysis showed that M. brisbanense UM_WWY is the only Mycobacterium species that possesses a complete set of genes encoding enzymes involved in the urea cycle, suggesting that this soil bacterium is able to synthesize urea for use as plant fertilizers. It is likely that M. brisbanense UM_WWY is adapted to live in soil as its primary habitat since the genome contains many genes associated with nitrogen metabolism. Nevertheless, a large number of predicted virulence genes were identified in M. brisbanense UM_WWY that are mostly shared with well-studied mycobacterial pathogens such as Mycobacterium tuberculosis and Mycobacterium abscessus. These findings are consistent with the role of M. brisbanense as an opportunistic pathogen of humans. The whole-genome study of UM_WWY has provided the basis for future work of M. brisbanense. PMID:27031249

  19. Sequencing and comparative analysis of the straw mushroom (Volvariella volvacea) genome.

    PubMed

    Bao, Dapeng; Gong, Ming; Zheng, Huajun; Chen, Mingjie; Zhang, Liang; Wang, Hong; Jiang, Jianping; Wu, Lin; Zhu, Yongqiang; Zhu, Gang; Zhou, Yan; Li, Chuanhua; Wang, Shengyue; Zhao, Yan; Zhao, Guoping; Tan, Qi

    2013-01-01

    Volvariella volvacea, the edible straw mushroom, is a highly nutritious food source that is widely cultivated on a commercial scale in many parts of Asia using agricultural wastes (rice straw, cotton wastes) as growth substrates. However, developments in V. volvacea cultivation have been limited due to a low biological efficiency (i.e. conversion of growth substrate to mushroom fruit bodies), sensitivity to low temperatures, and an unclear sexuality pattern that has restricted the breeding of improved strains. We have now sequenced the genome of V. volvacea and assembled it into 62 scaffolds with a total genome size of 35.7 megabases (Mb), containing 11,084 predicted gene models. Comparative analyses were performed with the model species in basidiomycete on mating type system, carbohydrate active enzymes, and fungal oxidative lignin enzymes. We also studied transcriptional regulation of the response to low temperature (4°C). We found that the genome of V. volvacea has many genes that code for enzymes, which are involved in the degradation of cellulose, hemicellulose, and pectin. The molecular genetics of the mating type system in V. volvacea was also found to be similar to the bipolar system in basidiomycetes, suggesting that it is secondary homothallism. Sensitivity to low temperatures could be due to the lack of the initiation of the biosynthesis of unsaturated fatty acids, trehalose and glycogen biosyntheses in this mushroom. Genome sequencing of V. volvacea has improved our understanding of the biological characteristics related to the degradation of the cultivating compost consisting of agricultural waste, the sexual reproduction mechanism, and the sensitivity to low temperatures at the molecular level which in turn will enable us to increase the industrial production of this mushroom.

  20. Sequencing and Comparative Analysis of the Straw Mushroom (Volvariella volvacea) Genome

    PubMed Central

    Bao, Dapeng; Gong, Ming; Zheng, Huajun; Chen, Mingjie; Zhang, Liang; Wang, Hong; Jiang, Jianping; Wu, Lin; Zhu, Yongqiang; Zhu, Gang; Zhou, Yan; Li, Chuanhua; Wang, Shengyue; Zhao, Yan; Zhao, Guoping; Tan, Qi

    2013-01-01

    Volvariella volvacea, the edible straw mushroom, is a highly nutritious food source that is widely cultivated on a commercial scale in many parts of Asia using agricultural wastes (rice straw, cotton wastes) as growth substrates. However, developments in V. volvacea cultivation have been limited due to a low biological efficiency (i.e. conversion of growth substrate to mushroom fruit bodies), sensitivity to low temperatures, and an unclear sexuality pattern that has restricted the breeding of improved strains. We have now sequenced the genome of V. volvacea and assembled it into 62 scaffolds with a total genome size of 35.7 megabases (Mb), containing 11,084 predicted gene models. Comparative analyses were performed with the model species in basidiomycete on mating type system, carbohydrate active enzymes, and fungal oxidative lignin enzymes. We also studied transcriptional regulation of the response to low temperature (4°C). We found that the genome of V. volvacea has many genes that code for enzymes, which are involved in the degradation of cellulose, hemicellulose, and pectin. The molecular genetics of the mating type system in V. volvacea was also found to be similar to the bipolar system in basidiomycetes, suggesting that it is secondary homothallism. Sensitivity to low temperatures could be due to the lack of the initiation of the biosynthesis of unsaturated fatty acids, trehalose and glycogen biosyntheses in this mushroom. Genome sequencing of V. volvacea has improved our understanding of the biological characteristics related to the degradation of the cultivating compost consisting of agricultural waste, the sexual reproduction mechanism, and the sensitivity to low temperatures at the molecular level which in turn will enable us to increase the industrial production of this mushroom. PMID:23526973

  1. Comparative genomics analysis of Streptococcus isolates from the human small intestine reveals their adaptation to a highly dynamic ecosystem.

    PubMed

    Van den Bogert, Bartholomeus; Boekhorst, Jos; Herrmann, Ruth; Smid, Eddy J; Zoetendal, Erwin G; Kleerebezem, Michiel

    2013-01-01

    The human small-intestinal microbiota is characterised by relatively large and dynamic Streptococcus populations. In this study, genome sequences of small-intestinal streptococci from S. mitis, S. bovis, and S. salivarius species-groups were determined and compared with those from 58 Streptococcus strains in public databases. The Streptococcus pangenome consists of 12,403 orthologous groups of which 574 are shared among all sequenced streptococci and are defined as the Streptococcus core genome. Genome mining of the small-intestinal streptococci focused on functions playing an important role in the interaction of these streptococci in the small-intestinal ecosystem, including natural competence and nutrient-transport and metabolism. Analysis of the small-intestinal Streptococcus genomes predicts a high capacity to synthesize amino acids and various vitamins as well as substantial divergence in their carbohydrate transport and metabolic capacities, which is in agreement with observed physiological differences between these Streptococcus strains. Gene-specific PCR-strategies enabled evaluation of conservation of Streptococcus populations in intestinal samples from different human individuals, revealing that the S. salivarius strains were frequently detected in the small-intestine microbiota, supporting the representative value of the genomes provided in this study. Finally, the Streptococcus genomes allow prediction of the effect of dietary substances on Streptococcus population dynamics in the human small-intestine.

  2. Comparative genome analysis of Pseudomonas knackmussii B13, the first bacterium known to degrade chloroaromatic compounds.

    PubMed

    Miyazaki, Ryo; Bertelli, Claire; Benaglio, Paola; Canton, Jonas; De Coi, Nicoló; Gharib, Walid H; Gjoksi, Bebeka; Goesmann, Alexander; Greub, Gilbert; Harshman, Keith; Linke, Burkhard; Mikulic, Josip; Mueller, Linda; Nicolas, Damien; Robinson-Rechavi, Marc; Rivolta, Carlo; Roggo, Clémence; Roy, Shantanu; Sentchilo, Vladimir; Siebenthal, Alexandra Von; Falquet, Laurent; van der Meer, Jan Roelof

    2015-01-01

    Pseudomonas knackmussii B13 was the first strain to be isolated in 1974 that could degrade chlorinated aromatic hydrocarbons. This discovery was the prologue for the subsequent characterization of numerous bacterial metabolic pathways and for genetic and biochemical studies, and it spurred ideas for pollutant bioremediation. In this study, we determined the complete genome sequence of B13 using next generation sequencing technologies and optical mapping. Genome annotation indicated that B13 has a variety of metabolic pathways for degrading monoaromatic hydrocarbons including chlorobenzoate, aminophenol, anthranilate and hydroxyquinol, but not polyaromatic compounds. Comparative genome analysis revealed that B13 is closest to Pseudomonas denitrificans and Pseudomonas aeruginosa. The B13 genome contains at least eight genomic islands [prophages and integrative conjugative elements (ICEs)], which were absent in closely related pseudomonads. We confirm that two ICEs are identical copies of the 103 kb self-transmissible element ICEclc that carries the genes for chlorocatechol metabolism. Comparison of ICEclc showed that it is composed of a variable and a 'core' region, which is very conserved among proteobacterial genomes, suggesting a widely distributed family of so far uncharacterized ICE. Resequencing of two spontaneous B13 mutants revealed a number of single nucleotide substitutions, as well as excision of a large 220 kb region and a prophage that drastically change the host metabolic capacity and survivability.

  3. Comparative genomic hybridization and transcriptome analysis with a pan-genome microarray reveal distinctions between JP2 and non-JP2 genotypes of Aggregatibacter actinomycetemcomitans.

    PubMed

    Huang, Y; Kittichotirat, W; Mayer, M P A; Hall, R; Bumgarner, R; Chen, C

    2013-02-01

    It was postulated that the highly virulent JP2 genotype of Aggregatibacter actinomycetemcomitans may possess a constellation of distinct virulence determinants not found in non-JP2 genotypes. This study compared the genome content and the transcriptome of the serotype b JP2 genotype and the closely related serotype b non-JP2 genotype of A. actinomycetemcomitans. A custom-designed pan-genomic microarray of A. actinomycetemcomitans was constructed and validated against a panel of 11 sequenced reference strains. The microarray was subsequently used for comparative genomic hybridization of serotype b strains of JP2 (six strains) and non-JP2 (six strains) genotypes, and for transcriptome analysis of strains of JP2 (three strains) and non-JP2 (two strains). Two JP2-specific and two non-JP2-specific genomic islands were identified. In one instance, distinct genomic islands were found to be inserted into the same locus among strains of different genotypes. Transcriptome analysis identified five operons, including the leukotoxin operon, to have at least two genes with an expression ratio of 2 or greater between genotypes. Two of the differentially expressed operons were members of the membrane-bound nitrate reductase system (nap operon) and the Tol-Pal system of gram-negative bacterial species. This study is the first to demonstrate the differences in the full genome content and gene expression between A. actinomycetemcomitans strains of JP2 and non-JP2 genotypes. The information is essential for designing hypothesis-driven experiments to examine the pathogenic mechanisms of A. actinomycetemcomitans.

  4. Comparative genomic analysis of primary tumors and metastases in breast cancer.

    PubMed

    Bertucci, François; Finetti, Pascal; Guille, Arnaud; Adélaïde, José; Garnier, Séverine; Carbuccia, Nadine; Monneur, Audrey; Charafe-Jauffret, Emmanuelle; Goncalves, Anthony; Viens, Patrice; Birnbaum, Daniel; Chaffanet, Max

    2016-05-10

    Personalized medicine uses genomic information for selecting therapy in patients with metastatic cancer. An issue is the optimal tissue source (primary tumor or metastasis) for testing. We compared the DNA copy number and mutational profiles of primary breast cancers and paired metastases from 23 patients using whole-genome array-comparative genomic hybridization and next-generation sequencing of 365 "cancer-associated" genes. Primary tumors and metastases harbored copy number alterations (CNAs) and mutations common in breast cancer and showed concordant profiles. The global concordance regarding CNAs was shown by clustering and correlation matrix, which showed that each metastasis correlated more strongly with its paired tumor than with other samples. Genes with recurrent amplifications in breast cancer showed 100% (ERBB2, FGFR1), 96% (CCND1), and 88% (MYC) concordance for the amplified/non-amplified status. Among all samples, 499 mutations were identified, including 39 recurrent (AKT1, ERBB2, PIK3CA, TP53) and 460 non-recurrent variants. The tumors/metastases concordance of variants was 75%, higher for recurrent (92%) than for non-recurrent (73%) variants. Further mutational discordance came from very different variant allele frequencies for some variants. We showed that the chosen targeted therapy in two clinical trials of personalized medicine would be concordant in all but one patient (96%) when based on the molecular profiling of tumor and paired metastasis. Our results suggest that the genotyping of primary tumor may be acceptable to guide systemic treatment if the metastatic sample is not obtainable. However, given the rare but potentially relevant divergences for some actionable driver genes, the profiling of metastatic sample is recommended.
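
    The variant concordance figures quoted above can be cross-checked with simple arithmetic. The sketch below is an illustrative consistency check, not part of the published analysis. It assumes that the overall rate is the count-weighted mean of the recurrent and non-recurrent rates and that the therapy figure is computed over the 23 patients; the function and variable names are hypothetical.

```python
# Illustrative arithmetic check using only the figures quoted in the abstract.
# Assumption: overall concordance is the count-weighted mean of the two classes.

def weighted_concordance(groups):
    """Return the count-weighted mean concordance for (count, rate) pairs."""
    total = sum(count for count, _ in groups)
    return sum(count * rate for count, rate in groups) / total

recurrent = (39, 0.92)       # 39 recurrent variants, 92% concordant
non_recurrent = (460, 0.73)  # 460 non-recurrent variants, 73% concordant

overall = weighted_concordance([recurrent, non_recurrent])
print(f"variants: {recurrent[0] + non_recurrent[0]}")
print(f"overall concordance: {overall:.1%}")

# Assumption: therapy choice is concordant in all but one of the 23 patients.
therapy = (23 - 1) / 23
print(f"therapy concordance: {therapy:.1%}")
```

    With the rounded percentages as inputs, the weighted mean falls within one percentage point of the reported 75%, and 22 of 23 patients rounds to the reported 96%.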

  5. Comparative genomic analysis of primary tumors and metastases in breast cancer

    PubMed Central

    Bertucci, François; Carbuccia, Nadine; Monneur, Audrey; Charafe-Jauffret, Emmanuelle; Goncalves, Anthony; Viens, Patrice; Birnbaum, Daniel; Chaffanet, Max

    2016-01-01

    Personalized medicine uses genomic information for selecting therapy in patients with metastatic cancer. An issue is the optimal tissue source (primary tumor or metastasis) for testing. We compared the DNA copy number and mutational profiles of primary breast cancers and paired metastases from 23 patients using whole-genome array-comparative genomic hybridization and next-generation sequencing of 365 “cancer-associated” genes. Primary tumors and metastases harbored copy number alterations (CNAs) and mutations common in breast cancer and showed concordant profiles. The global concordance regarding CNAs was shown by clustering and correlation matrix, which showed that each metastasis correlated more strongly with its paired tumor than with other samples. Genes with recurrent amplifications in breast cancer showed 100% (ERBB2, FGFR1), 96% (CCND1), and 88% (MYC) concordance for the amplified/non-amplified status. Among all samples, 499 mutations were identified, including 39 recurrent (AKT1, ERBB2, PIK3CA, TP53) and 460 non-recurrent variants. The tumors/metastases concordance of variants was 75%, higher for recurrent (92%) than for non-recurrent (73%) variants. Further mutational discordance came from very different variant allele frequencies for some variants. We showed that the chosen targeted therapy in two clinical trials of personalized medicine would be concordant in all but one patient (96%) when based on the molecular profiling of tumor and paired metastasis. Our results suggest that the genotyping of primary tumor may be acceptable to guide systemic treatment if the metastatic sample is not obtainable. However, given the rare but potentially relevant divergences for some actionable driver genes, the profiling of metastatic sample is recommended. PMID:27028851

  6. Comparative pan genome analysis of oral Prevotella species implicated in periodontitis.

    PubMed

    Ibrahim, Maziya; Subramanian, Ahalyaa; Anishetty, Sharmila

    2017-02-24

    Prevotella is part of the oral bacterial community implicated in periodontitis. Pan genome analyses of eight oral Prevotella species, P. dentalis, P. enoeca, P. fusca, P. melaninogenica, P. denticola, P. intermedia 17, P. intermedia 17-2 and P. sp. oral taxon 299 are presented in this study. Analysis of the Prevotella pan genome revealed features such as secretion systems, resistance to oxidative stress and clustered regularly interspaced short palindromic repeat (CRISPR)-Cas systems that enable the bacteria to adapt to the oral environment. We identified the presence of type VI secretion system (T6SS) in P. fusca and P. intermedia strains. For some VgrG and Hcp proteins which were not part of the core T6SS loci, we used gene neighborhood analysis and identified putative effector proteins and putative polyimmunity loci in P. fusca and polymorphic toxin systems in P. intermedia strains. Earlier studies have identified the presence of Por secretion system (PorSS) in P. gingivalis, P. melaninogenica and P. intermedia. We noted the presence of their homologs in six other oral Prevotella studied here. We suggest that in Prevotella, PorSS is used to secrete cysteine proteases such as interpain and C-terminal domain containing proteins with a "Por_secre_tail" domain. We identified subtype I-B CRISPR-Cas system in P. enoeca. Putative CRISPR-Cas system subtypes for 37 oral Prevotella and 30 non-oral Prevotella species were also predicted. Further, we performed a BLASTp search of the Prevotella proteins which are also conserved in the red-complex pathogens, against the human proteome to identify potential broad-spectrum drug targets. In summary, the use of a pan genome approach enabled identification of secretion systems and defense mechanisms in Prevotella that confer adaptation to the oral cavity.

  7. Comparative genomic analysis of the zebra finch degradome provides new insights into evolution of proteases in birds and mammals

    PubMed Central

    2010-01-01

    Background: The degradome, the complete repertoire of proteases in an organism, is involved in multiple key biological and pathological processes. Previous studies in several organisms have yielded sets of curated protease sequences which may be used to characterize the degradome in a novel genome by similarity. Differences between degradomes can then be related to physiological traits of the species under study. Therefore, the sequencing of the zebra finch genome allows the comparison between the degradomes of mammals and birds and may help to understand the biological peculiarities of the zebra finch. Results: A set of curated protease sequences from humans and chicken was used to predict the sequences of 460 protease and protease-like genes in the zebra finch genome. This analysis revealed important differences in the evolution of mammalian and bird degradomes, including genomic expansions and deletions of caspases, cytotoxic proteases, kallikreins, matrix metalloproteases, and trypsin-like proteases. Furthermore, we found several zebra finch-specific features, such as duplications in CASP3 and BACE, and a large genomic expansion of acrosin. Conclusions: We have compared the degradomes of zebra finch, chicken and several mammalian species, with the finding of multiple differences which illustrate the evolution of the protease complement of these organisms. Detailed analysis of these changes in zebra finch proteases has shown that they are mainly related to immunological, developmental, reproductive and neural functions. PMID:20359326

  8. Comparative Genomic Analysis of Pseudomonas chlororaphis PCL1606 Reveals New Insight into Antifungal Compounds Involved in Biocontrol.

    PubMed

    Calderón, Claudia E; Ramos, Cayo; de Vicente, Antonio; Cazorla, Francisco M

    2015-03-01

    Pseudomonas chlororaphis PCL1606 is a rhizobacterium that has biocontrol activity against many soilborne phytopathogenic fungi. The whole genome sequence of this strain was obtained using the Illumina HiSeq 2000 sequencing platform and was assembled using SOAPdenovo software. The resulting 6.66-Mb complete sequence of the PCL1606 genome was further analyzed. A comparative genomic analysis using 10 plant-associated strains within the fluorescent Pseudomonas group, including the complete genome of P. chlororaphis PCL1606, revealed a diverse spectrum of traits involved in multitrophic interactions with plants and microbes as well as biological control. Phylogenetic analysis of these strains using eight housekeeping genes clearly placed strain PCL1606 into the P. chlororaphis group. The genome sequence of P. chlororaphis PCL1606 revealed the presence of sequences that were homologous to biosynthetic genes for the antifungal compounds 2-hexyl, 5-propyl resorcinol (HPR), hydrogen cyanide, and pyrrolnitrin; this is the first report of pyrrolnitrin encoding genes in this P. chlororaphis strain. Single-, double-, and triple-insertional mutants in the biosynthetic genes of each antifungal compound were used to test their roles in the production of these antifungal compounds and in antagonism and biocontrol of two fungal pathogens. The results confirmed the function of HPR in the antagonistic phenotype and in the biocontrol activity of P. chlororaphis PCL1606.

  9. Comparative genomic analysis of a neurotoxigenic Clostridium species using partial genome sequence: Phylogenetic analysis of a few conserved proteins involved in cellular processes and metabolism.

    PubMed

    Alam, Syed Imteyaz; Dixit, Aparna; Tomar, Arvind; Singh, Lokendra

    2010-04-01

    Clostridial organisms produce neurotoxins, which are generally regarded as the most potent toxic substances of biological origin and potential biological warfare agents. Clostridium tetani produces tetanus neurotoxin and is responsible for the fatal tetanus disease. In spite of the extensive immunization regimen, the disease is an important cause of death, especially among neonates. Strains of C. tetani have not been genetically characterized except for the complete genome sequencing of strain E88. The present study reports the genetic makeup and phylogenetic affiliations of an environmental strain of this bacterium with respect to C. tetani E88 and other clostridia. A shotgun library was constructed from the genomic DNA of C. tetani drde, isolated from a decaying fish sample. Unique clones were sequenced and the sequences were compared with those of its closest relative, C. tetani E88. A total of 275 clones were obtained and 32,457 bases of non-redundant sequence were generated. A total of 150 base changes were observed over the entire length of sequence obtained, including additions, deletions and base substitutions. Of the total 120 ORFs detected, 48 exhibited closest similarity to E88 proteins, of which three are hypothetical proteins. Eight of the ORFs exhibited similarity with hypothetical proteins from other organisms and 10 aligned with other proteins from unrelated organisms. There is an overall conservation of protein sequences between the two strains of C. tetani. Selected ORFs involved in cellular processes and metabolism were subjected to phylogenetic analysis.
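
    The sequence-level divergence implied by these counts can be estimated roughly. The sketch below is illustrative only and not the authors' method. It assumes each of the 150 reported changes counts as a single event against the 32,457 non-redundant bases, whatever the length of any insertion or deletion; the function names are hypothetical.

```python
# Illustrative arithmetic from the figures quoted in the abstract.
# Assumption: each of the 150 changes is one event against the 32,457
# non-redundant bases, regardless of indel length.

def changes_per_kb(n_changes, n_bases):
    """Observed changes per 1,000 sequenced bases."""
    return 1000 * n_changes / n_bases

def approx_identity(n_changes, n_bases):
    """Rough fraction of unchanged positions under the one-event-per-base assumption."""
    return 1 - n_changes / n_bases

rate = changes_per_kb(150, 32_457)
identity = approx_identity(150, 32_457)
print(f"{rate:.2f} changes per kb")
print(f"approximate identity: {identity:.2%}")
```

    Under that assumption the two strains differ at fewer than five positions per kilobase of the sampled sequence.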

  10. Genome-wide Comparative Analysis of Atopic Dermatitis and Psoriasis Gives Insight into Opposing Genetic Mechanisms

    PubMed Central

    Baurecht, Hansjörg; Hotze, Melanie; Brand, Stephan; Büning, Carsten; Cormican, Paul; Corvin, Aiden; Ellinghaus, David; Ellinghaus, Eva; Esparza-Gordillo, Jorge; Fölster-Holst, Regina; Franke, Andre; Gieger, Christian; Hubner, Norbert; Illig, Thomas; Irvine, Alan D.; Kabesch, Michael; Lee, Young A.E.; Lieb, Wolfgang; Marenholz, Ingo; McLean, W.H. Irwin; Morris, Derek W.; Mrowietz, Ulrich; Nair, Rajan; Nöthen, Markus M.; Novak, Natalija; O’Regan, Grainne M.; Schreiber, Stefan; Smith, Catherine; Strauch, Konstantin; Stuart, Philip E.; Trembath, Richard; Tsoi, Lam C.; Weichenthal, Michael; Barker, Jonathan; Elder, James T.; Weidinger, Stephan; Cordell, Heather J.; Brown, Sara J.

    2015-01-01

    Atopic dermatitis and psoriasis are the two most common immune-mediated inflammatory disorders affecting the skin. Genome-wide studies demonstrate a high degree of genetic overlap, but these diseases have mutually exclusive clinical phenotypes and opposing immune mechanisms. Despite their prevalence, atopic dermatitis and psoriasis very rarely co-occur within one individual. By utilizing genome-wide association study and ImmunoChip data from >19,000 individuals and methodologies developed from meta-analysis, we have identified opposing risk alleles at shared loci as well as independent disease-specific loci within the epidermal differentiation complex (chromosome 1q21.3), the Th2 locus control region (chromosome 5q31.1), and the major histocompatibility complex (chromosome 6p21–22). We further identified previously unreported pleiotropic alleles with opposing effects on atopic dermatitis and psoriasis risk in PRKRA and ANXA6/TNIP1. In contrast, there was no evidence for shared loci with effects operating in the same direction on both diseases. Our results show that atopic dermatitis and psoriasis have distinct genetic mechanisms with opposing effects in shared pathways influencing epidermal differentiation and immune response. The statistical analysis methods developed in the conduct of this study have produced additional insight from previously published data sets. The approach is likely to be applicable to the investigation of the genetic basis of other complex traits with overlapping and distinct clinical features. PMID:25574825

  11. Genome-wide comparative analysis of atopic dermatitis and psoriasis gives insight into opposing genetic mechanisms.

    PubMed

    Baurecht, Hansjörg; Hotze, Melanie; Brand, Stephan; Büning, Carsten; Cormican, Paul; Corvin, Aiden; Ellinghaus, David; Ellinghaus, Eva; Esparza-Gordillo, Jorge; Fölster-Holst, Regina; Franke, Andre; Gieger, Christian; Hubner, Norbert; Illig, Thomas; Irvine, Alan D; Kabesch, Michael; Lee, Young A E; Lieb, Wolfgang; Marenholz, Ingo; McLean, W H Irwin; Morris, Derek W; Mrowietz, Ulrich; Nair, Rajan; Nöthen, Markus M; Novak, Natalija; O'Regan, Grainne M; Schreiber, Stefan; Smith, Catherine; Strauch, Konstantin; Stuart, Philip E; Trembath, Richard; Tsoi, Lam C; Weichenthal, Michael; Barker, Jonathan; Elder, James T; Weidinger, Stephan; Cordell, Heather J; Brown, Sara J

    2015-01-08

    Atopic dermatitis and psoriasis are the two most common immune-mediated inflammatory disorders affecting the skin. Genome-wide studies demonstrate a high degree of genetic overlap, but these diseases have mutually exclusive clinical phenotypes and opposing immune mechanisms. Despite their prevalence, atopic dermatitis and psoriasis very rarely co-occur within one individual. By utilizing genome-wide association study and ImmunoChip data from >19,000 individuals and methodologies developed from meta-analysis, we have identified opposing risk alleles at shared loci as well as independent disease-specific loci within the epidermal differentiation complex (chromosome 1q21.3), the Th2 locus control region (chromosome 5q31.1), and the major histocompatibility complex (chromosome 6p21-22). We further identified previously unreported pleiotropic alleles with opposing effects on atopic dermatitis and psoriasis risk in PRKRA and ANXA6/TNIP1. In contrast, there was no evidence for shared loci with effects operating in the same direction on both diseases. Our results show that atopic dermatitis and psoriasis have distinct genetic mechanisms with opposing effects in shared pathways influencing epidermal differentiation and immune response. The statistical analysis methods developed in the conduct of this study have produced additional insight from previously published data sets. The approach is likely to be applicable to the investigation of the genetic basis of other complex traits with overlapping and distinct clinical features.

  12. Comparative genomics for biodiversity conservation

    PubMed Central

    Grueber, Catherine E.

    2015-01-01

    Genomic approaches are gathering momentum in biology and emerging opportunities lie in the creative use of comparative molecular methods for revealing the processes that influence diversity of wildlife. However, few comparative genomic studies are performed with explicit and specific objectives to aid conservation of wild populations. Here I provide a brief overview of comparative genomic approaches that offer specific benefits to biodiversity conservation. Because conservation examples are few, I draw on research from other areas to demonstrate how comparing genomic data across taxa may be used to inform the characterisation of conservation units and studies of hybridisation, as well as studies that provide conservation outcomes from a better understanding of the drivers of divergence. A comparative approach can also provide valuable insight into the threatening processes that impact rare species, such as emerging diseases and their management in conservation. In addition to these opportunities, I note areas where additional research is warranted. Overall, comparing and contrasting the genomic composition of threatened and other species provide several useful tools for helping to preserve the molecular biodiversity of the global ecosystem. PMID:26106461

  13. A comparative analysis of the complete mitochondrial genome of the Eurasian otter Lutra lutra (Carnivora; Mustelidae).

    PubMed

    Ki, Jang-Seu; Hwang, Dae-Sik; Park, Tae-Jin; Han, Sang-Hoon; Lee, Jae-Seong

    2010-04-01

    Otter populations are declining throughout the world and most otter species are considered endangered. Molecular methods are suitable tools for population genetic research on endangered species. In the present study, we analyzed the complete mitochondrial genome (mitogenome) sequence of the Eurasian otter Lutra lutra. The mitochondrial DNA sequence of the Eurasian otter is 16,505 bp in length and consists of 13 protein-coding genes, 22 tRNAs, 2 rRNAs, and a control region (CR). The CR sequence of otters from Europe and Asia showed nearly identical numbers and nucleotide sequences of minisatellites. Phylogenetic analysis of Mustelidae mitogenomes, including individual genes, revealed that Lutrinae and Mustelinae form a clade, and that L. lutra and Enhydra lutris are sister taxa within the Lutrinae. Phylogenetic analyses revealed that of the 13 mitochondrial protein-coding genes, ND5 is the most reliable marker for analysis of phylogenetic relationships within the Mustelidae.

  14. A Multi-Platform Draft de novo Genome Assembly and Comparative Analysis for the Scarlet Macaw (Ara macao)

    PubMed Central

    Seabury, Christopher M.; Dowd, Scot E.; Seabury, Paul M.; Raudsepp, Terje; Brightsmith, Donald J.; Liboriussen, Poul; Halley, Yvette; Fisher, Colleen A.; Owens, Elaine; Viswanathan, Ganesh; Tizard, Ian R.

    2013-01-01

    Data deposition to NCBI Genomes This Whole Genome Shotgun project has been deposited at DDBJ/EMBL/GenBank under the accession AMXX00000000 (SMACv1.0, unscaffolded genome assembly). The version described in this paper is the first version (AMXX01000000). The scaffolded assembly (SMACv1.1) has been deposited at DDBJ/EMBL/GenBank under the accession AOUJ00000000, and is also the first version (AOUJ01000000). Strong biological interest in traits such as the acquisition and utilization of speech, cognitive abilities, and longevity catalyzed the utilization of two next-generation sequencing platforms to provide the first-draft de novo genome assembly for the large, new world parrot Ara macao (Scarlet Macaw). Despite the challenges associated with genome assembly for an outbred avian species, including 951,507 high-quality putative single nucleotide polymorphisms, the final genome assembly (>1.035 Gb) includes more than 997 Mb of unambiguous sequence data (excluding N’s). Cytogenetic analyses including ZooFISH revealed complex rearrangements associated with two scarlet macaw macrochromosomes (AMA6, AMA7), which supports the hypothesis that translocations, fusions, and intragenomic rearrangements are key factors associated with karyotype evolution among parrots. In silico annotation of the scarlet macaw genome provided robust evidence for 14,405 nuclear gene annotation models, their predicted transcripts and proteins, and a complete mitochondrial genome. Comparative analyses involving the scarlet macaw, chicken, and zebra finch genomes revealed high levels of nucleotide-based conservation as well as evidence for overall genome stability among the three highly divergent species. Application of a new whole-genome analysis of divergence involving all three species yielded prioritized candidate genes and noncoding regions for parrot traits of interest (i.e., speech, intelligence, longevity) which were independently supported by the results of previous human GWAS studies. We

  15. Comparative Analysis Highlights Variable Genome Content of Wheat Rusts and Divergence of the Mating Loci.

    PubMed

    Cuomo, Christina A; Bakkeren, Guus; Khalil, Hala Badr; Panwar, Vinay; Joly, David; Linning, Rob; Sakthikumar, Sharadha; Song, Xiao; Adiconis, Xian; Fan, Lin; Goldberg, Jonathan M; Levin, Joshua Z; Young, Sarah; Zeng, Qiandong; Anikster, Yehoshua; Bruce, Myron; Wang, Meinan; Yin, Chuntao; McCallum, Brent; Szabo, Les J; Hulbert, Scot; Chen, Xianming; Fellers, John P

    2017-02-09

    Three members of the Puccinia genus, Puccinia triticina (Pt), P. striiformis f.sp. tritici (Pst), and P. graminis f.sp. tritici (Pgt), cause the most common and often most significant foliar diseases of wheat. While similar in biology and life cycle, each species is uniquely adapted and specialized. The genomes of Pt and Pst were sequenced and compared to that of Pgt to identify common and distinguishing gene content, to determine gene variation among wheat rust pathogens, other rust fungi, and basidiomycetes, and to identify genes of significance for infection. Pt had the largest genome of the three, estimated at 135 Mb with expansion due to mobile elements and repeats encompassing 50.9% of contig bases; in comparison, repeats occupy 31.5% for Pst and 36.5% for Pgt. We find all three genomes are highly heterozygous, with Pst [5.97 single nucleotide polymorphisms (SNPs)/kb] nearly twice the level detected in Pt (2.57 SNPs/kb) and that previously reported for Pgt. Of 1358 predicted effectors in Pt, 784 were found expressed across diverse life cycle stages including the sexual stage. Comparison to related fungi highlighted the expansion of gene families involved in transcriptional regulation and nucleotide binding, protein modification, and carbohydrate degradation enzymes. Two allelic homeodomain pairs, HD1 and HD2, were identified in each dikaryotic Puccinia species along with three pheromone receptor (STE3) mating-type genes, two of which are likely representing allelic specificities. The HD proteins were active in a heterologous Ustilago maydis mating assay and host-induced gene silencing (HIGS) of the HD and STE3 alleles reduced wheat host infection.

  16. Comparative Analysis Highlights Variable Genome Content of Wheat Rusts and Divergence of the Mating Loci

    PubMed Central

    Cuomo, Christina A.; Bakkeren, Guus; Khalil, Hala Badr; Panwar, Vinay; Joly, David; Linning, Rob; Sakthikumar, Sharadha; Song, Xiao; Adiconis, Xian; Fan, Lin; Goldberg, Jonathan M.; Levin, Joshua Z.; Young, Sarah; Zeng, Qiandong; Anikster, Yehoshua; Bruce, Myron; Wang, Meinan; Yin, Chuntao; McCallum, Brent; Szabo, Les J.; Hulbert, Scot; Chen, Xianming; Fellers, John P.

    2016-01-01

    Three members of the Puccinia genus, Puccinia triticina (Pt), P. striiformis f.sp. tritici (Pst), and P. graminis f.sp. tritici (Pgt), cause the most common and often most significant foliar diseases of wheat. While similar in biology and life cycle, each species is uniquely adapted and specialized. The genomes of Pt and Pst were sequenced and compared to that of Pgt to identify common and distinguishing gene content, to determine gene variation among wheat rust pathogens, other rust fungi, and basidiomycetes, and to identify genes of significance for infection. Pt had the largest genome of the three, estimated at 135 Mb with expansion due to mobile elements and repeats encompassing 50.9% of contig bases; in comparison, repeats occupy 31.5% for Pst and 36.5% for Pgt. We find all three genomes are highly heterozygous, with Pst [5.97 single nucleotide polymorphisms (SNPs)/kb] nearly twice the level detected in Pt (2.57 SNPs/kb) and that previously reported for Pgt. Of 1358 predicted effectors in Pt, 784 were found expressed across diverse life cycle stages including the sexual stage. Comparison to related fungi highlighted the expansion of gene families involved in transcriptional regulation and nucleotide binding, protein modification, and carbohydrate degradation enzymes. Two allelic homeodomain pairs, HD1 and HD2, were identified in each dikaryotic Puccinia species along with three pheromone receptor (STE3) mating-type genes, two of which are likely representing allelic specificities. The HD proteins were active in a heterologous Ustilago maydis mating assay and host-induced gene silencing (HIGS) of the HD and STE3 alleles reduced wheat host infection. PMID:27913634

  17. What Makes a Bacterial Species Pathogenic?:Comparative Genomic Analysis of the Genus Leptospira

    PubMed Central

    Fouts, Derrick E.; Matthias, Michael A.; Adhikarla, Haritha; Adler, Ben; Amorim-Santos, Luciane; Berg, Douglas E.; Bulach, Dieter; Buschiazzo, Alejandro; Chang, Yung-Fu; Galloway, Renee L.; Haake, David A.; Haft, Daniel H.; Hartskeerl, Rudy; Ko, Albert I.; Levett, Paul N.; Matsunaga, James; Mechaly, Ariel E.; Monk, Jonathan M.; Nascimento, Ana L. T.; Nelson, Karen E.; Palsson, Bernhard; Peacock, Sharon J.; Picardeau, Mathieu; Ricaldi, Jessica N.; Thaipandungpanit, Janjira; Wunder, Elsio A.; Yang, X. Frank; Zhang, Jun-Jie; Vinetz, Joseph M.

    2016-01-01

    Leptospirosis, caused by spirochetes of the genus Leptospira, is a globally widespread, neglected and emerging zoonotic disease. While whole genome analysis of individual pathogenic, intermediately pathogenic and saprophytic Leptospira species has been reported, comprehensive cross-species genomic comparison of all known species of infectious and non-infectious Leptospira, with the goal of identifying genes related to pathogenesis and mammalian host adaptation, remains a key gap in the field. Infectious Leptospira, comprised of pathogenic and intermediately pathogenic Leptospira, evolutionarily diverged from non-infectious, saprophytic Leptospira, as demonstrated by the following computational biology analyses: 1) the definitive taxonomy and evolutionary relatedness among all known Leptospira species; 2) genomically-predicted metabolic reconstructions that indicate novel adaptation of infectious Leptospira to mammals, including sialic acid biosynthesis, pathogen-specific porphyrin metabolism and the first-time demonstration of cobalamin (B12) autotrophy as a bacterial virulence factor; 3) CRISPR/Cas systems demonstrated only to be present in pathogenic Leptospira, suggesting a potential mechanism for this clade’s refractoriness to gene targeting; 4) finding Leptospira pathogen-specific specialized protein secretion systems; 5) novel virulence-related genes/gene families such as the Virulence Modifying (VM) (PF07598 paralogs) proteins and pathogen-specific adhesins; 6) discovery of novel, pathogen-specific protein modification and secretion mechanisms including unique lipoprotein signal peptide motifs, Sec-independent twin arginine protein secretion motifs, and the absence of certain canonical signal recognition particle proteins from all Leptospira; and 7) demonstration of infectious Leptospira-specific signal-responsive gene expression, motility and chemotaxis systems. By identifying large scale changes in infectious (pathogenic and intermediately pathogenic

  18. Comparative analysis of teleost fish genomes reveals preservation of different ancient clock duplicates in different fishes.

    PubMed

    Wang, Han

    2008-06-01

    Clock (Circadian locomotor output cycle kaput) was the first vertebrate circadian clock gene identified in a mouse forward genetics mutagenesis screen. It encodes a bHLH-PAS protein that is highly conserved throughout evolution. Tetrapods also have a second Clock gene, Clock2 or Npas2 (Neuronal PAS domain protein 2). Conversely, the fruit fly, an invertebrate, has only one clock gene. Interrogation of the five teleost fish genome databases revealed that the zebrafish and the Japanese pufferfish (fugu) each have three clock genes, whereas the green spotted pufferfish (tetraodon), the Japanese medaka fish and the three-spine stickleback each have two clock genes. Phylogenetic and splice site analyses indicated that zebrafish and fugu each have two clock1 genes, clock1a and clock1b, and one clock2; tetraodon also has clock1a and clock1b but does not have clock2; and medaka and stickleback each have clock1b and one clock2. Genome neighborhood analysis further showed that clock1a/clock1b in zebrafish, fugu and tetraodon is an ancient duplicate. Although the dN/dS ratios of these three fish clock duplicates are all <1, indicating that purifying selection has acted upon them, the Tajima relative rate test showed that all three fish clock duplicates have asymmetric evolutionary rates, implying that one of these duplicates has been under positive selection or relaxed functional constraint. These results support the view that teleost fish clock genes were generated from an ancient genome-wide duplication, and differential gene loss after the duplication resulted in retention of different ancient duplicates in different teleost fishes, which could have contributed to the evolution of the distinct fish circadian clock mechanisms.
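
    The pattern of retained clock duplicates described above is easier to follow as a presence/absence table. The sketch below transcribes the per-species gene sets from the abstract and is only an illustration of the differential-loss pattern, not a reproduction of the phylogenetic analysis; the dictionary and helper names are hypothetical.

```python
# Presence/absence of clock paralogs per species, transcribed from the abstract.
CLOCK_GENES = {
    "zebrafish":   {"clock1a", "clock1b", "clock2"},
    "fugu":        {"clock1a", "clock1b", "clock2"},
    "tetraodon":   {"clock1a", "clock1b"},
    "medaka":      {"clock1b", "clock2"},
    "stickleback": {"clock1b", "clock2"},
}

def retained_everywhere(table):
    """Paralogs present in every listed species."""
    return set.intersection(*table.values())

def lost_in(table, gene):
    """Species lacking the given paralog, sorted by name."""
    return sorted(sp for sp, genes in table.items() if gene not in genes)

for species, genes in CLOCK_GENES.items():
    print(f"{species:12s} {len(genes)} genes: {', '.join(sorted(genes))}")

print("shared by all:", retained_everywhere(CLOCK_GENES))
print("lacking clock1a:", lost_in(CLOCK_GENES, "clock1a"))
print("lacking clock2:", lost_in(CLOCK_GENES, "clock2"))
```

    Laid out this way, clock1b is the only paralog listed for all five fishes, while clock1a and clock2 are each missing from a different subset of species.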

  19. Comparative genomic in situ hybridization (cGISH) analysis of the genomic relationships among Sinapis arvensis, Brassica rapa and Brassica nigra.

    PubMed

    Mao, Shufang; Han, Yonghua; Wu, Xiaoming; An, Tingting; Tang, Jiali; Shen, Junjun; Li, Zongyun

    2012-06-01

    To further understand the relationships between the SS genome of Sinapis arvensis and the AA and BB genomes in Brassica, genomic DNA of Sinapis arvensis was hybridized to the metaphase chromosomes of Brassica nigra (BB genome), and to the metaphase chromosomes and interphase nucleus of Brassica rapa (AA genome), by comparative genomic in situ hybridization (cGISH). As a result, every chromosome of B. nigra had signals along the whole chromosomal length. However, only half of the condensed heterochromatic areas in the interphase nucleus and the chromosomes showed rich signals in Brassica rapa. The interphase nucleus and the metaphase chromosomes of S. arvensis were simultaneously hybridized with digoxigenin-labeled genomic DNA of B. nigra and biotin-labeled genomic DNA of B. rapa. Signals of genomic DNA of B. nigra hybridized throughout the length of all chromosomes and all the condensed heterochromatic areas in the interphase nucleus, except chromosome 4, of which signals were weak in centromeric regions. Signals of the genomic DNA of B. rapa covered most areas of ten chromosomes and ten condensed heterochromatic areas; the others had fewer signals. The results showed that the SS genome had homology with the AA and BB genomes, but the homology between the SS genome and the AA genome was clearly lower than that between the SS genome and the BB genome.

  20. Genome-wide comparative analysis of 20 miniature inverted-repeat transposable element families in Brassica rapa and B. oleracea.

    PubMed

    Sampath, Perumal; Murukarthick, Jayakodi; Izzah, Nur Kholilatul; Lee, Jonghoon; Choi, Hong-Il; Shirasawa, Kenta; Choi, Beom-Soon; Liu, Shengyi; Nou, Ill-Sup; Yang, Tae-Jin

    2014-01-01

    Miniature inverted-repeat transposable elements (MITEs) are ubiquitous, non-autonomous class II transposable elements. Here, we conducted genome-wide comparative analysis of 20 MITE families in B. rapa, B. oleracea, and Arabidopsis thaliana. A total of 5894 and 6026 MITE members belonging to the 20 families were found in the whole genome pseudo-chromosome sequences of B. rapa and B. oleracea, respectively. Meanwhile, only four of the 20 families, comprising 573 members, were identified in the Arabidopsis genome, indicating that most of the families were activated in the Brassica genus after divergence from Arabidopsis. Copy numbers varied from 4 to 1459 for each MITE family, and there was up to 6-fold variation between B. rapa and B. oleracea. In particular, analysis of intact members showed that whereas eleven families were present in similar copy numbers in B. rapa and B. oleracea, nine families showed copy number variation ranging from 2- to 16-fold. Four of those families (BraSto-3, BraTo-3, 4, 5) were more abundant in B. rapa, and the other five (BraSto-1, BraSto-4, BraTo-1, 7 and BraHAT-1) were more abundant in B. oleracea. Overall, 54% and 51% of the MITEs resided in or within 2 kb of a gene in the B. rapa and B. oleracea genomes, respectively. Notably, 92 MITEs were found within the CDS of annotated genes, suggesting that MITEs might play roles in diversification of genes in the recently triplicated Brassica genome. MITE insertion polymorphism (MIP) analysis of 289 MITE members showed that 52% and 23% were polymorphic at the inter- and intra-species levels, respectively, indicating that there has been recent MITE activity in the Brassica genome. These recently activated MITE families with abundant MIP will provide useful resources for molecular breeding and identification of novel functional genes arising from MITE insertion.
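
    The proximity figure quoted in the record above (the share of MITEs that resided in or within 2 kb of a gene) is the kind of number an interval-overlap test produces. The following is a minimal sketch of such a test, not the authors' pipeline: the function names, the single-chromosome simplification and the toy coordinates are all assumptions made for illustration.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end), inclusive coordinates on one chromosome


def is_near_gene(element: Interval, genes: List[Interval], flank: int = 2000) -> bool:
    """True if the element overlaps a gene or lies within `flank` bp of one."""
    e_start, e_end = element
    return any(
        e_start <= g_end + flank and e_end >= g_start - flank
        for g_start, g_end in genes
    )


def fraction_near_genes(
    elements: List[Interval], genes: List[Interval], flank: int = 2000
) -> float:
    """Fraction of elements that overlap, or fall within `flank` bp of, any gene."""
    if not elements:
        raise ValueError("no elements supplied")
    hits = sum(is_near_gene(e, genes, flank) for e in elements)
    return hits / len(elements)


# Hypothetical toy data: one gene and four transposable-element insertions.
toy_genes = [(5000, 6000)]
toy_elements = [(3100, 3200), (100, 200), (5500, 5600), (8001, 8100)]
share = fraction_near_genes(toy_elements, toy_genes)
```

    For genome-scale annotations, a sorted-interval or interval-tree lookup per chromosome would likely replace the linear scan used here.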

  1. Functional analysis and comparative genomics of expressed sequence tags from the lycophyte Selaginella moellendorffii

    PubMed Central

    Weng, Jing-Ke; Tanurdzic, Milos; Chapple, Clint

    2005-01-01

    Background The lycophyte Selaginella moellendorffii is a member of one of the oldest lineages of vascular plants on Earth. Fossil records show that the lycophyte clade arose 400 million years ago, 150–200 million years earlier than angiosperms, a group of plants that includes the well-studied flowering plant Arabidopsis thaliana. S. moellendorffii has a genome size of approximately 100 Mbp, as small or smaller than that of A. thaliana. S. moellendorffii has the potential to provide significant comparative information to better understand the evolution of vascular plants. Results We sequenced 2181 Expressed Sequence Tags (ESTs) from a S. moellendorffii cDNA library. One thousand three hundred and one non-redundant sequences were assembled, containing 291 contigs and 1010 singletons. Approximately 75% of the ESTs matched proteins in the non-redundant protein database. Among 1301 clusters, 343 were categorized according to Gene Ontology (GO) hierarchy and were compared to the GO mapping of A. thaliana tentative consensus sequences. We compared S. moellendorffii ESTs to the A. thaliana and Physcomitrella patens EST databases, using the tBLASTX algorithm. Approximately 60% of the ESTs exhibited similarity with both A. thaliana and P. patens ESTs; whereas, 13% and 1% of the ESTs had exclusive similarity with A. thaliana and P. patens ESTs, respectively. A substantial proportion of the ESTs (26%) had no match with A. thaliana or P. patens ESTs. Conclusion We discovered 1301 putative unigenes in S. moellendorffii. These results give an initial insight into its transcriptome that will aid in the study of the S. moellendorffii genome in the near future. PMID:15938755

  2. Comparative Genome Sequence Analysis of Multidrug-Resistant Acinetobacter baumannii

    PubMed Central

    Adams, Mark D.; Goglin, Karrie; Molyneaux, Neil; Hujer, Kristine M.; Lavender, Heather; Jamison, Jennifer J.; MacDonald, Ian J.; Martin, Kristienna M.; Russo, Thomas; Campagnari, Anthony A.; Hujer, Andrea M.; Bonomo, Robert A.; Gill, Steven R.

    2008-01-01

    The recent emergence of multidrug resistance (MDR) in Acinetobacter baumannii has raised concern in health care settings worldwide. In order to understand the repertoire of resistance determinants and their organization and origins, we compared the genome sequences of three MDR and three drug-susceptible A. baumannii isolates. The entire MDR phenotype can be explained by the acquisition of discrete resistance determinants distributed throughout the genome. A comparison of closely related MDR and drug-susceptible isolates suggests that drug efflux may be a less significant contributor to resistance to certain classes of antibiotics than inactivation enzymes are. A resistance island with a variable composition of resistance determinants interspersed with transposons, integrons, and other mobile genetic elements is a significant but not universal contributor to the MDR phenotype. Four hundred seventy-five genes are shared among all six clinical isolates but absent from the related environmental species Acinetobacter baylyi ADP1. These genes are enriched for transcription factors and transporters and suggest physiological features of A. baumannii that are related to adaptation for growth in association with humans. PMID:18931120

  3. The genomic sequence and comparative analysis of the rat major histocompatibility complex.

    PubMed

    Hurt, Peter; Walter, Lutz; Sudbrak, Ralf; Klages, Sven; Müller, Ines; Shiina, Takashi; Inoko, Hidetoshi; Lehrach, Hans; Günther, Eberhard; Reinhardt, Richard; Himmelbauer, Heinz

    2004-04-01

    We have determined the sequence of a 4-Mb interval on rat chromosome 20p12 that encompasses the rat major histocompatibility complex (MHC). This is the first report of a finished sequence for a segment of the rat genome and constitutes one of the largest contiguous sequences thus far for rodent genomes in general. The rat MHC is, next to the human MHC, the second mammalian MHC sequenced to completion. Our analysis has resulted in the identification of at least 220 genes located within the sequenced interval. Although gene content and order are well conserved in the class II and class III gene intervals as well as the framework gene regions, profound rat-specific features were encountered within the class I gene regions, in comparison to human and mouse. Class I region-associated differences were found both at the structural level, the number, and organization of class I genes and gene families, and, in a more global context, in the way that evolution worked to shape the present-day rat MHC.

  4. Interspecies comparative genome hybridization and interspecies representational difference analysis reveal gross DNA differences between humans and great apes.

    PubMed

    Toder, R; Xia, Y; Bausch, E

    1998-09-01

    Comparative chromosome G-/R-banding, comparative gene mapping and chromosome painting techniques have demonstrated that only few chromosomal rearrangements occurred during great ape and human evolution. Interspecies comparative genome hybridization (CGH), used here in this study, between human, gorilla and pygmy chimpanzee revealed species-specific regions in all three species. In contrast to the human, a far more complex distribution of species-specific blocks was detected with CGH in gorilla and pygmy chimpanzee. Most of these blocks coincide with already described heterochromatic regions on gorilla and chimpanzee chromosomes. Representational difference analysis (RDA) was used to subtract the complex genome of gorilla against human in order to enrich gorilla-specific DNA sequences. Gorilla-specific clones isolated with this technique revealed a 32-bp repeat unit. These clones were mapped by fluorescence in situ hybridization (FISH) to the telomeric regions of gorilla chromosomes that had been shown by interspecies CGH to contain species-specific sequences.

  5. The mitochondrial genome of the red alga Kappaphycus striatus ("Green Sacol" variety): complete nucleotide sequence, genome structure and organization, and comparative analysis.

    PubMed

    Tablizo, Francis A; Lluisma, Arturo O

    2014-12-01

    The complete mitochondrial (mt) DNA sequence of the rhodophyte Kappaphycus striatus ("Green Sacol" variety) was determined. The mtDNA is circular, 25,242 bases long (A+T content: 69.94%), and contains 50 densely packed genes comprising 93.22% of the mitochondrial genome, with genes encoded on both strands. Through comparative analysis, the overall sequence, genome structure, and organization of K. striatus mtDNA were seen to be highly similar to those of other fully sequenced mitochondrial genomes of the class Florideophyceae. On the other hand, certain degrees of genome rearrangement and greater sequence dissimilarity were observed for the mtDNAs of other evolutionarily distant red algae, such as those from the classes Bangiophyceae and Cyanidiophyceae, compared to that of K. striatus. Furthermore, a trend was observed wherein the red algal mtDNAs tend to encode fewer protein-coding genes, albeit not necessarily shorter ones, as the organism becomes more morphologically complex. This trend is supported by the phylogenetic tree inferred from the concatenated amino acid sequences of the deduced protein products of cytochrome c oxidase subunit genes (cox1, 2, and 3).

  6. Structural RNAs of known and unknown function identified in malaria parasites by comparative genomics and RNA analysis

    PubMed Central

    Chakrabarti, Kausik; Pearson, Michael; Grate, Leslie; Sterne-Weiler, Timothy; Deans, Jonathan; Donohue, John Paul; Ares, Manuel

    2007-01-01

    As the genomes of more eukaryotic pathogens are sequenced, understanding how molecular differences between parasite and host might be exploited to provide new therapies has become a major focus. Central to cell function are RNA-containing complexes involved in gene expression, such as the ribosome, the spliceosome, snoRNAs, RNase P, and telomerase, among others. In this article we identify by comparative genomics and validate by RNA analysis numerous previously unknown structural RNAs encoded by the Plasmodium falciparum genome, including the telomerase RNA, U3, 31 snoRNAs, as well as previously predicted spliceosomal snRNAs, SRP RNA, MRP RNA, and RNAse P RNA. Furthermore, we identify six new RNA coding genes of unknown function. To investigate the relationships of the RNA coding genes to other genomic features in related parasites, we developed a genome browser for P. falciparum (http://areslab.ucsc.edu/cgi-bin/hgGateway). Additional experiments provide evidence supporting the prediction that snoRNAs guide methylation of a specific position on U4 snRNA, as well as predicting an snRNA promoter element particular to Plasmodium sp. These findings should allow detailed structural comparisons between the RNA components of the gene expression machinery of the parasite and its vertebrate hosts. PMID:17901154

  7. Genome-Wide Comparative Analysis Reveals Similar Types of NBS Genes in Hybrid Citrus sinensis Genome and Original Citrus clementine Genome and Provides New Insights into Non-TIR NBS Genes

    PubMed Central

    Wang, Yunsheng; Zhou, Lijuan; Li, Dazhi; Dai, Liangying; Lawton-Rauh, Amy; Srimani, Pradip K.; Duan, Yongping; Luo, Feng

    2015-01-01

    In this study, we identified and compared nucleotide-binding site (NBS) domain-containing genes from three Citrus genomes (C. clementina, C. sinensis from USA and C. sinensis from China). Phylogenetic analysis of all Citrus NBS genes across these three genomes revealed that there are three approximately evenly numbered groups: one group contains the Toll-Interleukin receptor (TIR) domain and two different Non-TIR groups in which most of the proteins contain the Coiled Coil (CC) domain. Motif analysis confirmed that the two groups of CC-containing NBS genes are from different evolutionary origins. We partitioned NBS genes into clades using NBS domain sequence distances and found most clades include NBS genes from all three Citrus genomes. This suggests that the three Citrus genomes have similar numbers and types of NBS genes. We also mapped the re-sequenced reads of three pomelo and three mandarin genomes onto the C. sinensis genome. We found that most NBS genes of the hybrid C. sinensis genome have corresponding homologous genes in both pomelo and mandarin genomes. The homologous NBS genes in pomelo and mandarin suggest that the parental species of C. sinensis may contain similar types of NBS genes. This explains why the hybrid C. sinensis and original C. clementina have similar types of NBS genes in this study. Furthermore, we found that sequence variation amongst Citrus NBS genes was shaped by multiple independent and shared accelerated mutation accumulation events among different groups of NBS genes and in different Citrus genomes. Our comparative analyses yield valuable insight into the structure, organization and evolution of NBS genes in Citrus genomes. Furthermore, our comprehensive analysis showed that the non-TIR NBS genes can be divided into two groups that come from different evolutionary origins. This provides new insights into non-TIR genes, which have not received much attention. PMID:25811466

  8. Genome-wide comparative analysis reveals similar types of NBS genes in hybrid Citrus sinensis genome and original Citrus clementine genome and provides new insights into non-TIR NBS genes.

    PubMed

    Wang, Yunsheng; Zhou, Lijuan; Li, Dazhi; Dai, Liangying; Lawton-Rauh, Amy; Srimani, Pradip K; Duan, Yongping; Luo, Feng

    2015-01-01

    In this study, we identified and compared nucleotide-binding site (NBS) domain-containing genes from three Citrus genomes (C. clementina, C. sinensis from USA and C. sinensis from China). Phylogenetic analysis of all Citrus NBS genes across these three genomes revealed that there are three approximately evenly numbered groups: one group contains the Toll-Interleukin receptor (TIR) domain and two different Non-TIR groups in which most of the proteins contain the Coiled Coil (CC) domain. Motif analysis confirmed that the two groups of CC-containing NBS genes are from different evolutionary origins. We partitioned NBS genes into clades using NBS domain sequence distances and found most clades include NBS genes from all three Citrus genomes. This suggests that the three Citrus genomes have similar numbers and types of NBS genes. We also mapped the re-sequenced reads of three pomelo and three mandarin genomes onto the C. sinensis genome. We found that most NBS genes of the hybrid C. sinensis genome have corresponding homologous genes in both pomelo and mandarin genomes. The homologous NBS genes in pomelo and mandarin suggest that the parental species of C. sinensis may contain similar types of NBS genes. This explains why the hybrid C. sinensis and original C. clementina have similar types of NBS genes in this study. Furthermore, we found that sequence variation amongst Citrus NBS genes was shaped by multiple independent and shared accelerated mutation accumulation events among different groups of NBS genes and in different Citrus genomes. Our comparative analyses yield valuable insight into the structure, organization and evolution of NBS genes in Citrus genomes. Furthermore, our comprehensive analysis showed that the non-TIR NBS genes can be divided into two groups that come from different evolutionary origins. This provides new insights into non-TIR genes, which have not received much attention.

  9. The complete chloroplast genome sequence of an endemic monotypic genus Hagenia (Rosaceae): structural comparative analysis, gene content and microsatellite detection.

    PubMed

    Gichira, Andrew W; Li, Zhizhong; Saina, Josphat K; Long, Zhicheng; Hu, Guangwan; Gituru, Robert W; Wang, Qingfeng; Chen, Jinming

    2017-01-01

    Hagenia is an endangered monotypic genus endemic to the tropical mountains of Africa. The only species, Hagenia abyssinica (Bruce) J.F. Gmel, is an important medicinal plant producing bioactive compounds that have been traditionally used by African communities as a remedy for gastrointestinal ailments in both humans and animals. Complete chloroplast genomes have been applied in resolving phylogenetic relationships within plant families. We employed high-throughput sequencing technologies to determine the complete chloroplast genome sequence of H. abyssinica. The genome is a circular molecule of 154,961 base pairs (bp), with a pair of Inverted Repeats (IR) of 25,971 bp each, separated by two single-copy regions: a large (LSC, 84,320 bp) and a small single copy (SSC, 18,696 bp). H. abyssinica's chloroplast genome has a 37.1% GC content and encodes 112 unique genes, 78 of which code for proteins, 30 are tRNA genes and four are rRNA genes. A comparative analysis with twenty other species, sequenced to date from the family Rosaceae, revealed similarities in structural organization, gene content and arrangement. The observed size differences are attributed to the contraction/expansion of the inverted repeats. The translational initiation factor gene (infA), which had been previously reported in other chloroplast genomes, was conspicuously missing in H. abyssinica. A total of 172 microsatellites and 49 large repeat sequences were detected in the chloroplast genome. A Maximum Likelihood analysis of 71 protein-coding genes placed Hagenia in Rosoideae. The availability of a complete chloroplast genome, the first in the Sanguisorbeae tribe, is beneficial for further molecular studies on taxonomic and phylogenomic resolution within the Rosaceae family.

  10. The complete chloroplast genome sequence of an endemic monotypic genus Hagenia (Rosaceae): structural comparative analysis, gene content and microsatellite detection

    PubMed Central

    Saina, Josphat K.; Long, Zhicheng; Hu, Guangwan; Gituru, Robert W.

    2017-01-01

    Hagenia is an endangered monotypic genus endemic to the tropical mountains of Africa. The only species, Hagenia abyssinica (Bruce) J.F. Gmel, is an important medicinal plant producing bioactive compounds that have been traditionally used by African communities as a remedy for gastrointestinal ailments in both humans and animals. Complete chloroplast genomes have been applied in resolving phylogenetic relationships within plant families. We employed high-throughput sequencing technologies to determine the complete chloroplast genome sequence of H. abyssinica. The genome is a circular molecule of 154,961 base pairs (bp), with a pair of Inverted Repeats (IR) of 25,971 bp each, separated by two single-copy regions: a large (LSC, 84,320 bp) and a small single copy (SSC, 18,696 bp). H. abyssinica’s chloroplast genome has a 37.1% GC content and encodes 112 unique genes, 78 of which code for proteins, 30 are tRNA genes and four are rRNA genes. A comparative analysis with twenty other species, sequenced to date from the family Rosaceae, revealed similarities in structural organization, gene content and arrangement. The observed size differences are attributed to the contraction/expansion of the inverted repeats. The translational initiation factor gene (infA), which had been previously reported in other chloroplast genomes, was conspicuously missing in H. abyssinica. A total of 172 microsatellites and 49 large repeat sequences were detected in the chloroplast genome. A Maximum Likelihood analysis of 71 protein-coding genes placed Hagenia in Rosoideae. The availability of a complete chloroplast genome, the first in the Sanguisorbeae tribe, is beneficial for further molecular studies on taxonomic and phylogenomic resolution within the Rosaceae family. PMID:28097059

  11. A 1.5-Mb-resolution radiation hybrid map of the cat genome and comparative analysis with the canine and human genomes.

    PubMed

    Murphy, William J; Davis, Brian; David, Victor A; Agarwala, Richa; Schäffer, Alejandro A; Pearks Wilkerson, Alison J; Neelam, Beena; O'Brien, Stephen J; Menotti-Raymond, Marilyn

    2007-02-01

    We report the construction of a 1.5-Mb-resolution radiation hybrid map of the domestic cat genome. This new map includes novel microsatellite loci and markers derived from the 2X genome sequence that target previous gaps in the feline-human comparative map. Ninety-six percent of the 1793 cat markers we mapped have identifiable orthologues in the canine and human genome sequences. The updated autosomal and X-chromosome comparative maps identify 152 cat-human and 134 cat-dog homologous synteny blocks. Comparative analysis shows the marked change in chromosomal evolution in the canid lineage relative to the felid lineage since divergence from their carnivoran ancestor. The canid lineage has a 30-fold difference in the number of interchromosomal rearrangements relative to felids, while the felid lineage has primarily undergone intrachromosomal rearrangements. We have also refined the pseudoautosomal region and boundary in the cat and show that it is markedly longer than those of human or mouse. This improved RH comparative map provides a useful tool to facilitate positional cloning studies in the feline model.

  12. Comparative genomic analysis of the swine pathogen Bordetella bronchiseptica strain KM22 to other B. bronchiseptica sequenced genomes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    B. bronchiseptica is pervasive in swine and plays multiple roles in respiratory disease as well as enhancing colonization by other bacterial pathogens and increasing the severity of disease associated with both viral and bacterial pathogens. The goal of this study was to use the genome sequence of K...

  13. Comparative genomic analyses in Asparagus.

    PubMed

    Kuhl, Joseph C; Havey, Michael J; Martin, William J; Cheung, Foo; Yuan, Qiaoping; Landherr, Lena; Hu, Yi; Leebens-Mack, James; Town, Christopher D; Sink, Kenneth C

    2005-12-01

    Garden asparagus (Asparagus officinalis L.) belongs to the monocot family Asparagaceae in the order Asparagales. Onion (Allium cepa L.) and Asparagus officinalis are 2 of the most economically important plants of the core Asparagales, a well supported monophyletic group within the Asparagales. Coding regions in onion have lower GC contents than the grasses. We compared the GC content of 3374 unique expressed sequence tags (ESTs) from A. officinalis with Lycoris longituba and onion (both members of the core Asparagales), Acorus americanus (sister to all other monocots), the grasses, and Arabidopsis. Although ESTs in A. officinalis and Acorus had a higher average GC content than Arabidopsis, Lycoris, and onion, all were clearly lower than the grasses. The Asparagaceae have the smallest nuclear genomes among all plants in the core Asparagales, which typically have huge genomes. Within the Asparagaceae, European Asparagus species have approximately twice the nuclear DNA of that of southern African Asparagus species. We cloned and sequenced 20 genomic amplicons from European A. officinalis and the southern African species Asparagus plumosus and observed no clear evidence for a recent genome doubling in A. officinalis relative to A. plumosus. These results indicate that members of the genus Asparagus with smaller genomes may be useful genomic models for plants in the core Asparagales.
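
    The GC-content comparison described in the record above rests on a simple per-sequence calculation that is then averaged over each EST set. The sketch below illustrates that calculation only; it is not the authors' code, and the function names, the choice to ignore ambiguous bases, and the toy sequences are assumptions made for illustration.

```python
from typing import Iterable


def gc_content(seq: str) -> float:
    """Fraction of G and C among the unambiguous bases (A, C, G, T) of a sequence."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return gc / (gc + at)


def mean_gc(seqs: Iterable[str]) -> float:
    """Unweighted mean GC content across a collection of sequences."""
    values = [gc_content(s) for s in seqs]
    if not values:
        raise ValueError("no sequences supplied")
    return sum(values) / len(values)


# Hypothetical toy EST sets; real inputs would be parsed from FASTA files.
toy_sets = {
    "species_a": ["ATGCGC", "GGCCAT"],
    "species_b": ["ATATAT", "ATGCAT"],
}
averages = {name: mean_gc(seqs) for name, seqs in toy_sets.items()}
```

    Averaging per sequence gives short and long ESTs equal weight; pooling all bases before dividing is an equally reasonable alternative, and the record does not say which was used.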

  14. Enhancer Identification through Comparative Genomics

    SciTech Connect

    Visel, Axel; Bristow, James; Pennacchio, Len A.

    2006-10-01

    With the availability of genomic sequence from numerous vertebrates, a paradigm shift has occurred in the identification of distant-acting gene regulatory elements. In contrast to traditional gene-centric studies in which investigators randomly scanned genomic fragments that flank genes of interest in functional assays, the modern approach begins electronically with publicly available comparative sequence datasets that provide investigators with prioritized lists of putative functional sequences based on their evolutionary conservation. However, although a large number of tools and resources are now available, application of comparative genomic approaches remains far from trivial. In particular, it requires users to dynamically consider the species and methods for comparison depending on the specific biological question under investigation. While there is currently no single general rule to this end, it is clear that when applied appropriately, comparative genomic approaches exponentially increase our power in generating biological hypotheses for subsequent experimental testing.

  15. Comparative Genomic Analysis of Delftia tsuruhatensis MTQ3 and the Identification of Functional NRPS Genes for Siderophore Production

    PubMed Central

    Guo, Haimeng; Yang, Yanan; Liu, Kai; Xu, Wenfeng; Gao, Jianyong; Duan, Hairong; Du, Binghai

    2016-01-01

    Plant growth-promoting rhizobacteria (PGPR) are a group of rhizosphere bacteria that promote plant growth. Delftia tsuruhatensis MTQ3 is a member of PGPR that produces siderophores. The draft genome sequence of MTQ3 has been reported. Here, we analyzed the genome sequence of MTQ3 and performed a comparative genome analysis of four sequenced Delftia strains, revealing genetic relationships among these strains. In addition, genes responsible for bacteriocin and nonribosomal peptide synthesis were detected in the genomes of each strain. To reveal the functions of NRPS genes in siderophore production in D. tsuruhatensis MTQ3, three NRPS genes were knocked out to obtain the three mutants MTQ3-Δ1941, MTQ3-Δ1945, and MTQ3-Δ1946, which were compared with the wild-type strain. In qualitative and quantitative analyses using CAS assay, the mutants failed to produce siderophores. Accordingly, the NRPS genes in MTQ3 were functionally related to siderophore production. These results clarify one mechanism by which plant growth is promoted in MTQ3 and have important applications in agricultural production. PMID:27847812

  16. Novel technologies applied to the nucleotide sequencing and comparative sequence analysis of the genomes of infectious agents in veterinary medicine.

    PubMed

    Granberg, F; Bálint, Á; Belák, S

    2016-04-01

    Next-generation sequencing (NGS), also referred to as deep, high-throughput or massively parallel sequencing, is a powerful new tool that can be used for the complex diagnosis and intensive monitoring of infectious disease in veterinary medicine. NGS technologies are also being increasingly used to study the aetiology, genomics, evolution and epidemiology of infectious disease, as well as host-pathogen interactions and other aspects of infection biology. This review briefly summarises recent progress and achievements in this field by first introducing a range of novel techniques and then presenting examples of NGS applications in veterinary infection biology. Various work steps and processes for sampling and sample preparation, sequence analysis and comparative genomics, and improving the accuracy of genomic prediction are discussed, as are bioinformatics requirements. Examples of sequencing-based applications and comparative genomics in veterinary medicine are then provided. This review is based on novel references selected from the literature and on experiences of the World Organisation for Animal Health (OIE) Collaborating Centre for the Biotechnology-based Diagnosis of Infectious Diseases in Veterinary Medicine, Uppsala, Sweden.

  17. Comparative Genomic Analysis of Delftia tsuruhatensis MTQ3 and the Identification of Functional NRPS Genes for Siderophore Production.

    PubMed

    Guo, Haimeng; Yang, Yanan; Liu, Kai; Xu, Wenfeng; Gao, Jianyong; Duan, Hairong; Du, Binghai; Ding, Yanqin; Wang, Chengqiang

    2016-01-01

    Plant growth-promoting rhizobacteria (PGPR) are a group of rhizosphere bacteria that promote plant growth. Delftia tsuruhatensis MTQ3 is a member of PGPR that produces siderophores. The draft genome sequence of MTQ3 has been reported. Here, we analyzed the genome sequence of MTQ3 and performed a comparative genome analysis of four sequenced Delftia strains, revealing genetic relationships among these strains. In addition, genes responsible for bacteriocin and nonribosomal peptide synthesis were detected in the genomes of each strain. To reveal the functions of NRPS genes in siderophore production in D. tsuruhatensis MTQ3, three NRPS genes were knocked out to obtain the three mutants MTQ3-Δ1941, MTQ3-Δ1945, and MTQ3-Δ1946, which were compared with the wild-type strain. In qualitative and quantitative analyses using CAS assay, the mutants failed to produce siderophores. Accordingly, the NRPS genes in MTQ3 were functionally related to siderophore production. These results clarify one mechanism by which plant growth is promoted in MTQ3 and have important applications in agricultural production.

  18. Chromosome arm-specific BAC end sequences permit comparative analysis of homoeologous chromosomes and genomes of polyploid wheat

    PubMed Central

    2012-01-01

    Background Bread wheat, one of the world’s staple food crops, has the largest, highly repetitive and polyploid genome among the cereal crops. The wheat genome holds the key to crop genetic improvement against challenges such as climate change, environmental degradation, and water scarcity. To unravel the complex wheat genome, the International Wheat Genome Sequencing Consortium (IWGSC) is pursuing a chromosome- and chromosome arm-based approach to physical mapping and sequencing. Here we report on the use of a BAC library made from flow-sorted telosomic chromosome 3A short arm (t3AS) for marker development and analysis of sequence composition and comparative evolution of homoeologous genomes of hexaploid wheat. Results The end-sequencing of 9,984 random BACs from a chromosome arm 3AS-specific library (TaaCsp3AShA) generated 11,014,359 bp of high quality sequence from 17,591 BAC-ends with an average length of 626 bp. The sequence represents 3.2% of t3AS with an average DNA sequence read every 19 kb. Overall, 79% of the sequence consisted of repetitive elements, 1.38% as coding regions (estimated 2,850 genes) and another 19% of unknown origin. Comparative sequence analysis suggested that 70-77% of the genes present in both 3A and 3B were syntenic with model species. Among the transposable elements, gypsy/sabrina (12.4%) was the most abundant repeat and was significantly more frequent in 3A compared to homoeologous chromosome 3B. Twenty novel repetitive sequences were also identified using de novo repeat identification. BESs were screened to identify simple sequence repeats (SSR) and transposable element junctions. A total of 1,057 SSRs were identified with a density of one per 10.4 kb, and 7,928 junctions between transposable elements (TE) and other sequences were identified with a density of one per 1.39 kb. With the objective of enhancing the marker density of chromosome 3AS, oligonucleotide primers were successfully designed from 758 SSRs and 695

  19. Comparative Genomics and Metabolic Analysis Reveals Peculiar Characteristics of Rhodococcus opacus Strain M213 Particularly for Naphthalene Degradation

    PubMed Central

    Blom, Jochen; Indest, Karl J.; Jung, Carina M.; Stothard, Paul; Bera, Gopal; Green, Stefan J.; Ogram, Andrew

    2016-01-01

    The genome of Rhodococcus opacus strain M213, isolated from a fuel-oil contaminated soil, was sequenced and annotated, revealing a genome size of 9,194,165 bp encoding 8680 putative genes and a G+C content of 66.72%. Among the protein coding genes, 71.77% were annotated as clusters of orthologous groups of proteins (COGs); 55% of the COGs were present as paralog clusters. Pulsed field gel electrophoresis (PFGE) analysis of M213 revealed the presence of three differently sized replicons: a circular chromosome and two megaplasmids (pNUO1 and pNUO2) estimated to be 750 kb and 350 kb in size, respectively. Conversely, using an alternative approach of optical mapping, the plasmid replicons appeared as a circular ~1.2 Mb megaplasmid and a linear, ~0.7 Mb megaplasmid. Genome-wide comparative analysis of M213 with a cohort of sequenced Rhodococcus species revealed low syntenic affiliation with other R. opacus species including strains B4 and PD630. Conversely, a closer affiliation of M213, at the functional (COG) level, was observed with the catabolically versatile R. jostii strain RHA1 and other Rhodococci such as R. wratislaviensis strain IFP 2016, R. imtechensis strain RKJ300, Rhodococcus sp. strain JVH1, and Rhodococcus sp. strain DK17. An in-depth, genome-wide comparison between these functional relatives revealed 971 unique genes in M213 representing 11% of its total genome, many associated with catabolic functions. Of major interest was the identification of as many as 154 genomic islands (GEIs), many with duplicated catabolic genes, in particular for PAHs; a trait that was confirmed by PCR-based identification of naphthalene dioxygenase (NDO) as a representative gene, across PFGE-resolved replicons of strain M213. Interestingly, several plasmid/GEI-encoded genes, that likely participate in degrading naphthalene (NAP) via a peculiar pathway, were also identified in strain M213 using a combination of bioinformatics, metabolic analysis and gene

  20. Comparative mapping of Raphanus sativus genome using Brassica markers and quantitative trait loci analysis for the Fusarium wilt resistance trait.

    PubMed

    Yu, Xiaona; Choi, Su Ryun; Ramchiary, Nirala; Miao, Xinyang; Lee, Su Hee; Sun, Hae Jeong; Kim, Sunggil; Ahn, Chun Hee; Lim, Yong Pyo

    2013-10-01

    Fusarium wilt (FW), caused by the soil-borne fungal pathogen Fusarium oxysporum is a serious disease in cruciferous plants, including the radish (Raphanus sativus). To identify quantitative trait loci (QTL) or gene(s) conferring resistance to FW, we constructed a genetic map of R. sativus using an F2 mapping population derived by crossing the inbred lines '835' (susceptible) and 'B2' (resistant). A total of 220 markers distributed in 9 linkage groups (LGs) were mapped in the Raphanus genome, covering a distance of 1,041.5 cM with an average distance between adjacent markers of 4.7 cM. Comparative analysis of the R. sativus genome with that of Arabidopsis thaliana and Brassica rapa revealed 21 and 22 conserved syntenic regions, respectively. QTL mapping detected a total of 8 loci conferring FW resistance that were distributed on 4 LGs, namely, 2, 3, 6, and 7 of the Raphanus genome. Of the detected QTL, 3 QTLs (2 on LG 3 and 1 on LG 7) were constitutively detected throughout the 2-year experiment. QTL analysis of LG 3, flanked by ACMP0609 and cnu_mBRPGM0085, showed a comparatively higher logarithm of the odds (LOD) value and percentage of phenotypic variation. Synteny analysis using the linked markers to this QTL showed homology to A. thaliana chromosome 3, which contains disease-resistance gene clusters, suggesting conservation of resistance genes between them.
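
    The map-density figures quoted in this abstract can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function name is hypothetical, and the abstract does not say which denominator the authors used. The reported 4.7 cM is consistent with dividing map length by the number of markers; dividing by the number of marker intervals (markers minus linkage groups) would give a slightly larger value.

```python
def average_spacing_cm(total_length_cm, n_markers, n_linkage_groups):
    """Return (length per marker, length per marker interval) in cM.

    Hypothetical helper for illustration; not taken from the study.
    """
    per_marker = total_length_cm / n_markers
    # Each linkage group with k markers has k - 1 intervals between them.
    n_intervals = n_markers - n_linkage_groups
    per_interval = total_length_cm / n_intervals
    return per_marker, per_interval


# Values quoted in the abstract: 220 markers, 9 linkage groups, 1,041.5 cM.
per_marker, per_interval = average_spacing_cm(1041.5, 220, 9)
print(f"{per_marker:.2f} cM per marker, {per_interval:.2f} cM per interval")
```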

  1. Comparative genomic analysis of dha regulon and related genes for anaerobic glycerol metabolism in bacteria.

    PubMed

    Sun, Jibin; van den Heuvel, Joop; Soucaille, Philippe; Qu, Yinbo; Zeng, An-Ping

    2003-01-01

    The dihydroxyacetone (dha) regulon of bacteria encodes genes for the anaerobic metabolism of glycerol. In this work, genomic data are used to analyze and compare the dha regulon and related genes in different organisms in silico with respect to gene organization, sequence similarity, and possible functions. Database searches showed that among the organisms, the genomes of which have been sequenced so far, only two, i.e., Klebsiella pneumoniae MGH 78578 and Clostridium perfringens contain a complete dha regulon bearing all known enzymes. The components and their organization in the dha regulon of these two organisms differ considerably from each other and also from the previously partially sequenced dha regulons in Citrobacter freundii, Clostridium pasteurianum, and Clostridium butyricum. Unlike all of the other organisms, genes for the oxidative and reductive pathways of anaerobic glycerol metabolism in C. perfringens are located in two separate organization units on the chromosome. Comparisons of deduced protein sequences of genes with similar functions showed that the dha regulon components in K. pneumoniae and C. freundii have high similarities (80-95%) but lower similarities to those of the Clostridium species (30-80%). Interestingly, the protein sequence similarities among the dha genes of the Clostridium species are in many cases even lower than those between the Clostridium species and K. pneumoniae or C. freundii, suggesting two different types of dha regulon in the Clostridium species studied. The in silico reconstruction and comparison of dha regulons revealed several new genes in the microorganisms studied. In particular, a novel dha kinase that is phosphoenolpyruvate-dependent is identified and experimentally confirmed for K. pneumoniae in addition to the known ATP-dependent dha kinase. This finding gives new insights into the regulation of glycerol metabolism in K. pneumoniae and explains some hitherto not well understood experimental observations.

  2. Evolution of a Cellular Immune Response in Drosophila: A Phenotypic and Genomic Comparative Analysis

    PubMed Central

    Salazar-Jaramillo, Laura; Paspati, Angeliki; van de Zande, Louis; Vermeulen, Cornelis Joseph; Schwander, Tanja; Wertheim, Bregje

    2014-01-01

    Understanding the genomic basis of evolutionary adaptation requires insight into the molecular basis underlying phenotypic variation. However, even changes in molecular pathways associated with extreme variation, gains and losses of specific phenotypes, remain largely uncharacterized. Here, we investigate the large interspecific differences in the ability to survive infection by parasitoids across 11 Drosophila species and identify genomic changes associated with gains and losses of parasitoid resistance. We show that a cellular immune defense, encapsulation, and the production of a specialized blood cell, lamellocytes, are restricted to a sublineage of Drosophila, but that encapsulation is absent in one species of this sublineage, Drosophila sechellia. Our comparative analyses of hemopoiesis pathway genes and of genes differentially expressed during the encapsulation response revealed that hemopoiesis-associated genes are highly conserved and present in all species independently of their resistance. In contrast, 11 genes that are differentially expressed during the response to parasitoids are novel genes, specific to the Drosophila sublineage capable of lamellocyte-mediated encapsulation. These novel genes, which are predominantly expressed in hemocytes, arose via duplications, whereby five of them also showed signatures of positive selection, as expected if they were recruited for new functions. Three of these novel genes further showed large-scale and presumably loss-of-function sequence changes in D. sechellia, consistent with the loss of resistance in this species. In combination, these convergent lines of evidence suggest that co-option of duplicated genes in existing pathways and subsequent neofunctionalization are likely to have contributed to the evolution of the lamellocyte-mediated encapsulation in Drosophila. PMID:24443439

  3. Comparative genomic analysis of regulation of anaerobic respiration in ten genomes from three families of gamma-proteobacteria (Enterobacteriaceae, Pasteurellaceae, Vibrionaceae)

    PubMed Central

    Ravcheev, Dmitry A; Gerasimova, Anna V; Mironov, Andrey A; Gelfand, Mikhail S

    2007-01-01

    Background Gamma-proteobacteria, such as Escherichia coli, can use a variety of respiratory substrates employing numerous aerobic and anaerobic respiratory systems controlled by multiple transcription regulators. Thus, in E. coli, global control of respiration is mediated by four transcription factors, Fnr, ArcA, NarL and NarP. However, in other Gamma-proteobacteria the composition of global respiration regulators may be different. Results In this study we applied a comparative genomic approach to the analysis of three global regulatory systems, Fnr, ArcA and NarP. These systems were studied in available genomes containing these three regulators, but lacking NarL. So, we considered several representatives of Pasteurellaceae, Vibrionaceae and Yersinia spp. As a result, we identified new regulon members, functioning in respiration, central metabolism (glycolysis, gluconeogenesis, pentose phosphate pathway, citrate cycle, metabolism of pyruvate and lactate), metabolism of carbohydrates and fatty acids, transcriptional regulation and transport, in particular: the ATP synthase operon atpIBEFHAGCD, Na+-exporting NADH dehydrogenase operon nqrABCDEF, the D-amino acids dehydrogenase operon dadAX. Using an extension of the comparative technique, we demonstrated taxon-specific changes in regulatory interactions and predicted taxon-specific regulatory cascades. Conclusion A comparative genomic technique was applied to the analysis of global regulation of respiration in ten gamma-proteobacterial genomes. Three structurally different but functionally related regulatory systems were described. A correlation between the regulon size and the position of a transcription factor in regulatory cascades was observed: regulators with larger regulons tend to occupy top positions in the cascades. On the other hand, there is no obvious link to differences in the species' lifestyles and metabolic capabilities. PMID:17313674

  4. Comparative assembly hubs: Web-accessible browsers for comparative genomics

    PubMed Central

    Nguyen, Ngan; Hickey, Glenn; Raney, Brian J.; Armstrong, Joel; Clawson, Hiram; Zweig, Ann; Karolchik, Donna; Kent, William James; Haussler, David; Paten, Benedict

    2014-01-01

    Motivation: Researchers now have access to large volumes of genome sequences for comparative analysis, some generated by the plethora of public sequencing projects and, increasingly, from individual efforts. It is not possible, or necessarily desirable, that the public genome browsers attempt to curate all these data. Instead, a wealth of powerful tools is emerging to empower users to create their own visualizations and browsers. Results: We introduce a pipeline to easily generate collections of Web-accessible UCSC Genome Browsers interrelated by an alignment. It is intended to democratize our comparative genomic browser resources, serving the broad and growing community of evolutionary genomicists and facilitating easy public sharing via the Internet. Using the alignment, all annotations and the alignment itself can be efficiently viewed with reference to any genome in the collection, symmetrically. A new, intelligently scaled alignment display makes it simple to view all changes between the genomes at all levels of resolution, from substitutions to complex structural rearrangements, including duplications. To demonstrate this work, we create a comparative assembly hub containing 57 Escherichia coli and 9 Shigella genomes and show examples that highlight their unique biology. Availability and implementation: The source code is available as open source at: https://github.com/glennhickey/progressiveCactus The E.coli and Shigella genome hub is now a public hub listed on the UCSC browser public hubs Web page. Contact: benedict@soe.ucsc.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25138168

  5. [R1 and R2 retrotransposons of German cockroach Blattella germanica: comparative analysis of 5' truncated copies integrated into genome].

    PubMed

    Kagramanova, A S; Kapelinskaia, T V; Korolev, A L; Mukha, D V

    2007-01-01

    This is the first report providing results on identification, cloning, and sequencing of extended fragments (5'-truncated copies) of R1 and R2 retrotransposons integrated into the Blattella germanica genome. Comparative structural analysis of the clones obtained revealed two distinct subfamilies of R1 elements. However, all B. germanica R1 clones have two common features: poly(T) tails and similar target site duplications. The nucleotide structure and organization of the five sequenced R2 fragments were similar. Analysis of R2 nucleotide sequences revealed typical deletions at the 3' end of target sites and a lack of homopolynucleotide tails.

  6. Comparative Genomic Analysis Reveals a Diverse Repertoire of Genes Involved in Prokaryote-Eukaryote Interactions within the Pseudovibrio Genus

    PubMed Central

    Romano, Stefano; Fernàndez-Guerra, Antonio; Reen, F. Jerry; Glöckner, Frank O.; Crowley, Susan P.; O'Sullivan, Orla; Cotter, Paul D.; Adams, Claire; Dobson, Alan D. W.; O'Gara, Fergal

    2016-01-01

    Strains of the Pseudovibrio genus have been detected worldwide, mainly as part of bacterial communities associated with marine invertebrates, particularly sponges. This recurrent association has been considered as an indication of a symbiotic relationship between these microbes and their host. Until recently, the availability of only two genomes, belonging to closely related strains, has limited the knowledge on the genomic and physiological features of the genus to a single phylogenetic lineage. Here we present 10 newly sequenced genomes of Pseudovibrio strains isolated from marine sponges from the west coast of Ireland, and including the other two publicly available genomes we performed an extensive comparative genomic analysis. Homogeneity was apparent in terms of both the orthologous genes and the metabolic features shared amongst the 12 strains. At the genomic level, a key physiological difference observed amongst the isolates was the presence only in strain P. axinellae AD2 of genes encoding proteins involved in assimilatory nitrate reduction, which was then proved experimentally. We then focused on studying those systems known to be involved in the interactions with eukaryotic and prokaryotic cells. This analysis revealed that the genus harbors a large diversity of toxin-like proteins, secretion systems and their potential effectors. Their distribution in the genus was not always consistent with the phylogenetic relationship of the strains. Finally, our analyses identified new genomic islands encoding potential toxin-immunity systems, previously unknown in the genus. Our analyses shed new light on the Pseudovibrio genus, indicating a large diversity of both metabolic features and systems for interacting with the host. The diversity in both distribution and abundance of these systems amongst the strains underlines how metabolically and phylogenetically similar bacteria may use different strategies to interact with the host and find a niche within its

  7. Compare and Contrast Meta Analysis (CCMA): A Method for Identification of Pleiotropic Loci in Genome-Wide Association Studies.

    PubMed

    Baurecht, Hansjörg; Hotze, Melanie; Rodríguez, Elke; Manz, Judith; Weidinger, Stephan; Cordell, Heather J; Augustin, Thomas; Strauch, Konstantin

    2016-01-01

    In recent years, genome-wide association studies (GWAS) have identified many loci that are shared among common disorders and this has raised interest in pleiotropy. For performing appropriate analysis, several methods have been proposed, e.g. conducting a look-up in external sources or exploiting GWAS results by meta-analysis based methods. We recently proposed the Compare & Contrast Meta-Analysis (CCMA) approach where significance thresholds were obtained by simulation. Here we present analytical formulae for the density and cumulative distribution function of the CCMA test statistic under the null hypothesis of no pleiotropy and no association, which, conveniently for practical reasons, turns out to be exponentially distributed. This allows researchers to apply the CCMA method without having to rely on simulations. Finally, we show that CCMA demonstrates power to detect disease-specific, agonistic and antagonistic loci comparable to the frequently used Subset-Based Meta-Analysis approach, while better controlling the type I error rate.
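
    Because the abstract states that the CCMA test statistic is exponentially distributed under the null hypothesis, p-values and significance thresholds follow from the closed-form exponential survival function rather than from simulation. The sketch below shows only that general relationship. The rate parameter is an assumption (the abstract does not give it, and 1.0 is a placeholder), and the function names are hypothetical.

```python
import math


def exponential_p_value(statistic, rate=1.0):
    """P(T >= statistic) for T ~ Exponential(rate).

    rate=1.0 is a placeholder assumption, not a value from the abstract.
    """
    return math.exp(-rate * statistic)


def exponential_threshold(alpha, rate=1.0):
    """Smallest statistic whose p-value is at most alpha."""
    return -math.log(alpha) / rate


# Example: threshold for a conventional genome-wide level of 5e-8.
threshold = exponential_threshold(5e-8)
print(threshold, exponential_p_value(threshold))
```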

  8. Enhancer Identification through Comparative Genomics

    PubMed Central

    Visel, Axel; Bristow, James; Pennacchio, Len A.

    2007-01-01

    With the availability of genomic sequence from numerous vertebrates, a paradigm shift has occurred in the identification of distant-acting gene regulatory elements. In contrast to traditional gene-centric studies in which investigators randomly scanned genomic fragments that flank genes of interest in functional assays, the modern approach begins electronically with publicly available comparative sequence datasets that provide investigators with prioritized lists of putative functional sequences based on their evolutionary conservation. However, although a large number of tools and resources are now available, application of comparative genomic approaches remains far from trivial. In particular, it requires users to dynamically consider the species and methods for comparison depending on the specific biological question under investigation. While there is currently no single general rule to this end, it is clear that when applied appropriately, comparative genomic approaches exponentially increase our power in generating biological hypotheses for subsequent experimental testing. It is anticipated that cardiac-related genes and the identification of their distant-acting transcriptional enhancers are particularly poised to benefit from these modern capabilities. PMID:17276707

  9. Comparative genomics analysis of the companion mechanisms of Bacillus thuringiensis Bc601 and Bacillus endophyticus Hbe603 in bacterial consortium

    PubMed Central

    Jia, Nan; Ding, Ming-Zhu; Gao, Feng; Yuan, Ying-Jin

    2016-01-01

    Bacillus thuringiensis and Bacillus endophyticus both act as companion bacteria that cooperate with Ketogulonigenium vulgare in the two-step fermentation of vitamin C. The two Bacillus species differ in morphology, swarming motility and 2-keto-L-gulonic acid productivity when co-cultured with K. vulgare. Here, we report the complete genome sequencing of B. thuringiensis Bc601 and eight plasmids of B. endophyticus Hbe603, and carry out a comparative genomic analysis. B. thuringiensis Bc601, which has a greater ability to respond to the external environment, was found to have more proteins related to two-component systems, sporulation coat and peptidoglycan biosynthesis than B. endophyticus Hbe603, whereas B. endophyticus Hbe603, which has a greater capacity for nutrient biosynthesis, was found to have more proteins related to alpha-galactosidase, propanoate, glutathione and inositol phosphate metabolism, and amino acid degradation than B. thuringiensis Bc601. The differences in swarming motility, response to the external environment and nutrient biosynthesis may reflect different companion mechanisms in the two Bacillus species. Comparative genomic analysis of B. endophyticus and B. thuringiensis enables us to further understand the cooperative mechanism with K. vulgare, and facilitates the optimization of the bacterial consortium. PMID:27353048

  10. Comparative Genome Sequence Analysis Reveals the Extent of Diversity and Conservation for Glycan-Associated Proteins in Burkholderia spp.

    PubMed Central

    Ong, Hui San; Mohamed, Rahmah; Firdaus-Raih, Mohd

    2012-01-01

    Members of the Burkholderia family occupy diverse ecological niches. In pathogenic family members, glycan-associated proteins are often linked to functions that include virulence, protein conformation maintenance, surface recognition, cell adhesion, and immune system evasion. Comparative analysis of available Burkholderia genomes has revealed a core set of 178 glycan-associated proteins shared by all Burkholderia, of which 68 are homologous to known essential genes. The genome sequence comparisons revealed insights into species-specific gene acquisitions through gene transfers, identified an S-layer protein, and proposed that significantly reactive surface proteins are associated with sugar moieties as a potential means to circumvent host defense mechanisms. The comparative analysis using a curated database of search queries enabled us to gain insights into the extent of conservation and diversity, as well as the possible virulence-associated roles of glycan-associated proteins in members of the Burkholderia spp. The curated list of glycan-associated proteins used can also be directed to screen other genomes for glycan-associated homologs. PMID:22991502

  11. Comparative Genomic Analysis of Bacillus amyloliquefaciens and Bacillus subtilis Reveals Evolutional Traits for Adaptation to Plant-Associated Habitats

    PubMed Central

    Zhang, Nan; Yang, Dongqing; Kendall, Joshua R. A.; Borriss, Rainer; Druzhinina, Irina S.; Kubicek, Christian P.; Shen, Qirong; Zhang, Ruifu

    2016-01-01

    Bacillus subtilis and its sister species B. amyloliquefaciens comprise an evolutionarily compact but physiologically versatile group of bacteria that includes strains isolated from diverse habitats. Many of these strains are used as plant growth-promoting rhizobacteria (PGPR) in agriculture, and a plant-specialized subspecies of B. amyloliquefaciens, B. amyloliquefaciens subsp. plantarum, has recently been recognized. Here we used 31 whole genomes [including two newly sequenced PGPR strains: B. amyloliquefaciens NJN-6 isolated from Musa sp. (banana) and B. subtilis HJ5 from Gossypium sp. (cotton)] to perform comparative analysis and investigate the genomic characteristics and evolutionary traits of both species in different niches. Phylogenomic analysis indicated that strains isolated from plant-associated (PA) habitats could be distinguished from those from non-plant-associated (nPA) niches in both species. The core genomes of PA strains are more abundant in genes relevant to intermediary metabolism and secondary metabolite biosynthesis as compared with those of nPA strains, and they also possess additional specific genes involved in utilization of plant-derived substrates and synthesis of antibiotics. A further gene gain/loss analysis indicated that only a few of these specific genes (18/192 for B. amyloliquefaciens and 53/688 for B. subtilis) were acquired by PA strains at the initial divergence event, but most were obtained successively by different subgroups of PA strains during the evolutionary process. This study demonstrated the genomic differences between PA and nPA B. amyloliquefaciens and B. subtilis from different niches and the evolutionary traits involved, and has implications for screening of PGPR strains in agricultural production. PMID:28066362

  12. Comparative Genomic Hybridization Analysis of Yersinia enterocolitica and Yersinia pseudotuberculosis Identifies Genetic Traits to Elucidate Their Different Ecologies

    PubMed Central

    Jaakkola, Kaisa; Somervuo, Panu; Korkeala, Hannu

    2015-01-01

    Enteropathogenic Yersinia enterocolitica and Yersinia pseudotuberculosis are both etiological agents for intestinal infection known as yersiniosis, but their epidemiology and ecology bear many differences. Swine are the only known reservoir for Y. enterocolitica 4/O:3 strains, which are the most common cause of human disease, while Y. pseudotuberculosis has been isolated from a variety of sources, including vegetables and wild animals. Infections caused by Y. enterocolitica mainly originate from swine, but fresh produce has been the source for widespread Y. pseudotuberculosis outbreaks within recent decades. A comparative genomic hybridization analysis with a DNA microarray based on three Yersinia enterocolitica and four Yersinia pseudotuberculosis genomes was conducted to shed light on the genomic differences between enteropathogenic Yersinia. The hybridization results identified Y. pseudotuberculosis strains to carry operons linked with the uptake and utilization of substances not found in living animal tissues but present in soil, plants, and rotting flesh. Y. pseudotuberculosis also harbors a selection of type VI secretion systems targeting other bacteria and eukaryotic cells. These genetic traits are not found in Y. enterocolitica, and it appears that while Y. pseudotuberculosis has many tools beneficial for survival in varied environments, the Y. enterocolitica genome is more streamlined and adapted to their preferred animal reservoir. PMID:26605338

  13. Comparative analysis of deep-sea bacterioplankton OMICS revealed the occurrence of habitat-specific genomic attributes.

    PubMed

    Smedile, Francesco; Messina, Enzo; La Cono, Violetta; Yakimov, Michail M

    2014-10-01

    The bathyal aphotic ocean represents the largest biotope on our planet, which sustains highly diverse but low-density microbial communities, with yet untapped genomic attributes, potentially useful for discovery of new biomolecules, industrial enzymes and pathways. In the last two decades, culture-independent approaches of high-throughput sequencing have provided new insights into the structure and function of marine bacterioplankton, leading to unprecedented opportunities to accurately characterize microbial communities and their interactions with the environment. In the present review we focused on the analysis of the relatively few deep-sea OMICS studies completed thus far, to find the specific genomic patterns determining the lifeway and adaptation mechanisms of prokaryotes thriving in the dark deep ocean below a depth of 1000 m. Phylogenomic and omic studies provided clear evidence that the bathyal microbial communities are distinct from their epipelagic counterparts and, along with generally larger genomes, possess their own habitat-specific genomic attributes. The high abundance in the deep ocean OMICS of the systems for environmental sensing, signal transduction and metabolic versatility as compared to the epipelagic counterparts is thought to enable the deep-sea bacterioplankton to rapidly adapt to changing environmental conditions associated with resource scarcity and high diversity of energy and carbon substrates in the bathyal biotopes. Together with a versatile heterotrophy, mixotrophy and anaplerosis are thought to enable the deep-sea bacterioplankton to cope with these environmental conditions.

  14. Comparative Genomic Hybridization Analysis of Yersinia enterocolitica and Yersinia pseudotuberculosis Identifies Genetic Traits to Elucidate Their Different Ecologies.

    PubMed

    Jaakkola, Kaisa; Somervuo, Panu; Korkeala, Hannu

    2015-01-01

    Enteropathogenic Yersinia enterocolitica and Yersinia pseudotuberculosis are both etiological agents for intestinal infection known as yersiniosis, but their epidemiology and ecology bear many differences. Swine are the only known reservoir for Y. enterocolitica 4/O:3 strains, which are the most common cause of human disease, while Y. pseudotuberculosis has been isolated from a variety of sources, including vegetables and wild animals. Infections caused by Y. enterocolitica mainly originate from swine, but fresh produce has been the source for widespread Y. pseudotuberculosis outbreaks within recent decades. A comparative genomic hybridization analysis with a DNA microarray based on three Yersinia enterocolitica and four Yersinia pseudotuberculosis genomes was conducted to shed light on the genomic differences between enteropathogenic Yersinia. The hybridization results identified Y. pseudotuberculosis strains to carry operons linked with the uptake and utilization of substances not found in living animal tissues but present in soil, plants, and rotting flesh. Y. pseudotuberculosis also harbors a selection of type VI secretion systems targeting other bacteria and eukaryotic cells. These genetic traits are not found in Y. enterocolitica, and it appears that while Y. pseudotuberculosis has many tools beneficial for survival in varied environments, the Y. enterocolitica genome is more streamlined and adapted to their preferred animal reservoir.

  15. Comparative analysis of dioxin response elements in human, mouse and rat genomic sequences.

    PubMed

    Sun, Y V; Boverhof, D R; Burgoon, L D; Fielden, M R; Zacharewski, T R

    2004-01-01

    Comparative approaches were used to identify human, mouse and rat dioxin response elements (DREs) in genomic sequences unambiguously assigned to a nucleotide RefSeq accession number. A total of 13 bona fide DREs, all including the substitution intolerant core sequence (GCGTG) and adjacent variable sequences, were used to establish a position weight matrix and a matrix similarity (MS) score threshold to rank identified DREs. DREs with MS scores above the threshold were disproportionately distributed in close proximity to the transcription start site in all three species. Gene expression assays in hepatic mouse tissue confirmed the responsiveness of 192 genes possessing a putative DRE. Previously identified functional DREs in well-characterized AhR-regulated genes including Cyp1a1 and Cyp1b1 were corroborated. Putative DREs were identified in 48 out of 2437 human-mouse-rat orthologous genes between -1500 and the transcriptional start site, of which 19 of these genes possessed positionally conserved DREs as determined by multiple sequence alignment. Seven of these nineteen genes exhibited 2,3,7,8-tetrachlorodibenzo-p-dioxin-mediated regulation, although there were significant discrepancies between in vivo and in vitro results. Interestingly, of the mouse-rat orthologous genes with a DRE between -1500 and +1500, only 37% had an equivalent human ortholog. These results suggest that AhR-mediated gene expression may not be well conserved across species, which could have significant implications in human risk assessment.

  16. Comparative analysis of dioxin response elements in human, mouse and rat genomic sequences

    PubMed Central

    Sun, Y. V.; Boverhof, D. R.; Burgoon, L. D.; Fielden, M. R.; Zacharewski, T. R.

    2004-01-01

    Comparative approaches were used to identify human, mouse and rat dioxin response elements (DREs) in genomic sequences unambiguously assigned to a nucleotide RefSeq accession number. A total of 13 bona fide DREs, all including the substitution intolerant core sequence (GCGTG) and adjacent variable sequences, were used to establish a position weight matrix and a matrix similarity (MS) score threshold to rank identified DREs. DREs with MS scores above the threshold were disproportionately distributed in close proximity to the transcription start site in all three species. Gene expression assays in hepatic mouse tissue confirmed the responsiveness of 192 genes possessing a putative DRE. Previously identified functional DREs in well-characterized AhR-regulated genes including Cyp1a1 and Cyp1b1 were corroborated. Putative DREs were identified in 48 out of 2437 human–mouse–rat orthologous genes between −1500 and the transcriptional start site, of which 19 of these genes possessed positionally conserved DREs as determined by multiple sequence alignment. Seven of these nineteen genes exhibited 2,3,7,8-tetrachlorodibenzo-p-dioxin-mediated regulation, although there were significant discrepancies between in vivo and in vitro results. Interestingly, of the mouse–rat orthologous genes with a DRE between −1500 and +1500, only 37% had an equivalent human ortholog. These results suggest that AhR-mediated gene expression may not be well conserved across species, which could have significant implications in human risk assessment. PMID:15328365
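
    The scan described in this abstract (an exact match to the GCGTG core, then a position weight matrix score compared against a threshold) can be illustrated with a minimal sketch. This is not the authors' implementation: the training sites are made up rather than the 13 bona fide DREs, the function names are hypothetical, and the score is a simplified stand-in (observed column frequencies divided by the best achievable sum) rather than the matrix similarity (MS) score used in the study.

```python
from collections import Counter

CORE = "GCGTG"  # substitution-intolerant DRE core named in the abstract

# Hypothetical aligned training sites (NOT the bona fide DREs from the study).
TRAINING_SITES = ["TTGCGTGAG", "CTGCGTGAC", "TAGCGTGCG", "GTGCGTGAA"]


def build_pwm(sites):
    """Return per-position base frequencies for equal-length aligned sites."""
    length = len(sites[0])
    assert all(len(site) == length for site in sites)
    pwm = []
    for i in range(length):
        counts = Counter(site[i] for site in sites)
        pwm.append({base: counts.get(base, 0) / len(sites) for base in "ACGT"})
    return pwm


def similarity(pwm, window):
    """Simplified score in [0, 1]: observed frequencies over the best possible."""
    observed = sum(col.get(base, 0.0) for col, base in zip(pwm, window))
    best = sum(max(col.values()) for col in pwm)
    return observed / best


def scan(sequence, pwm, threshold, core_offset=2):
    """Return (start, window, score) for windows with an exact core and score >= threshold."""
    width = len(pwm)
    seq = sequence.upper()
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        # Require the core at its fixed offset before scoring the flanks.
        if window[core_offset:core_offset + len(CORE)] != CORE:
            continue
        score = similarity(pwm, window)
        if score >= threshold:
            hits.append((start, window, score))
    return hits


pwm = build_pwm(TRAINING_SITES)
print(scan("AATTGCGTGAGCC", pwm, threshold=0.9))
```

    Filtering on the exact core first mirrors the abstract's description of the core as substitution intolerant; only the flanking positions contribute variation to the score.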

  17. Comparative Analysis of Wolbachia Genomes Reveals Streamlining and Divergence of Minimalist Two-Component Systems

    PubMed Central

    Christensen, Steen; Serbus, Laura Renee

    2015-01-01

    Two-component regulatory systems are commonly used by bacteria to coordinate intracellular responses with environmental cues. These systems are composed of functional protein pairs consisting of a sensor histidine kinase and cognate response regulator. In contrast to the well-studied Caulobacter crescentus system, which carries dozens of these pairs, the streamlined bacterial endosymbiont Wolbachia pipientis encodes only two pairs: CckA/CtrA and PleC/PleD. Here, we used bioinformatic tools to compare characterized two-component system relays from C. crescentus, the related Anaplasmataceae species Anaplasma phagocytophilum and Ehrlichia chaffeensis, and 12 sequenced Wolbachia strains. We found the core protein pairs and a subset of interacting partners to be highly conserved within Wolbachia and these other Anaplasmataceae. Genes involved in two-component signaling were positioned differently within the various Wolbachia genomes, whereas the local context of each gene was conserved. Unlike Anaplasma and Ehrlichia, Wolbachia two-component genes were more consistently found clustered with metabolic genes. The domain architecture and key functional residues standard for two-component system proteins were well-conserved in Wolbachia, although residues that specify cognate pairing diverged substantially from other Anaplasmataceae. These findings indicate that Wolbachia two-component signaling pairs share considerable functional overlap with other α-proteobacterial systems, whereas their divergence suggests the potential for regulatory differences and cross-talk. PMID:25809075

  18. Comparative transcriptome analysis reveals insights into the streamlined genomes of haplosclerid demosponges

    NASA Astrophysics Data System (ADS)

    Guzman, Christine; Conaco, Cecilia

    2016-01-01

    Sponges (Porifera) are one of the most ancestral metazoan groups. They are characterized by a simple body plan lacking the true tissues and organ systems found in other animals. Members of this phylum display a remarkable diversity of form and function and yet little is known about the composition and complexity of their genomes. In this study, we sequenced the transcriptomes of two marine haplosclerid sponges belonging to Demospongiae, the largest and most diverse class within phylum Porifera, and compared their gene content with members of other sponge classes. We recovered 44,693 and 50,067 transcripts expressed in adult tissues of Haliclona amboinensis and Haliclona tubifera, respectively. These transcripts translate into 20,280 peptides in H. amboinensis and 18,000 peptides in H. tubifera. Genes associated with important signaling and metabolic pathways, regulatory networks, as well as genes that may be important in the organismal stress response, were identified in the transcriptomes. Furthermore, lineage-specific innovations were identified that may be correlated with observed sponge characters and ecological adaptations. The core gene complement expressed within the tissues of adult haplosclerid demosponges may represent a streamlined and flexible genetic toolkit that underlies the ecological success and resilience of sponges to environmental stress.

  19. Comparative transcriptome analysis reveals insights into the streamlined genomes of haplosclerid demosponges

    PubMed Central

    Guzman, Christine; Conaco, Cecilia

    2016-01-01

    Sponges (Porifera) are one of the most ancestral metazoan groups. They are characterized by a simple body plan lacking the true tissues and organ systems found in other animals. Members of this phylum display a remarkable diversity of form and function and yet little is known about the composition and complexity of their genomes. In this study, we sequenced the transcriptomes of two marine haplosclerid sponges belonging to Demospongiae, the largest and most diverse class within phylum Porifera, and compared their gene content with members of other sponge classes. We recovered 44,693 and 50,067 transcripts expressed in adult tissues of Haliclona amboinensis and Haliclona tubifera, respectively. These transcripts translate into 20,280 peptides in H. amboinensis and 18,000 peptides in H. tubifera. Genes associated with important signaling and metabolic pathways, regulatory networks, as well as genes that may be important in the organismal stress response, were identified in the transcriptomes. Furthermore, lineage-specific innovations were identified that may be correlated with observed sponge characters and ecological adaptations. The core gene complement expressed within the tissues of adult haplosclerid demosponges may represent a streamlined and flexible genetic toolkit that underlies the ecological success and resilience of sponges to environmental stress. PMID:26738846

  20. Comparative Genomic Analysis Reveals a Critical Role of De Novo Nucleotide Biosynthesis for Saccharomyces cerevisiae Virulence

    PubMed Central

    Pérez-Torrado, Roberto; Llopis, Silvia; Perrone, Benedetta; Gómez-Pastor, Rocío; Hube, Bernhard; Querol, Amparo

    2015-01-01

    In recent years, the number of human infection cases produced by the food related species Saccharomyces cerevisiae has increased. Whereas many strains of this species are considered safe, other ‘opportunistic’ strains show a high degree of potential virulence attributes and can cause infections in immunocompromised patients. Here we studied the genetic characteristics of selected opportunistic strains isolated from dietary supplements and also from patients by array comparative genomic hybridization. Our results show increased copy numbers of IMD genes in opportunistic strains, which are implicated in the de novo biosynthesis of the purine nucleotides pathway. The importance of this pathway for virulence of S. cerevisiae was confirmed by infections in immunodeficient murine models using a GUA1 mutant, a key gene of this pathway. We show that exogenous guanine, an end product of this pathway in its triphosphorylated form, increases the survival of yeast strains in ex vivo blood infections. Finally, we show the importance of the DNA damage response that activates dNTP biosynthesis in yeast cells during ex vivo blood infections. We conclude that opportunistic yeasts may use an enhanced de novo biosynthesis of the purine nucleotides pathway to increase survival and favor infections in the host. PMID:25816288

  1. Comparative analysis of Wolbachia genomes reveals streamlining and divergence of minimalist two-component systems.

    PubMed

    Christensen, Steen; Serbus, Laura Renee

    2015-03-24

    Two-component regulatory systems are commonly used by bacteria to coordinate intracellular responses with environmental cues. These systems are composed of functional protein pairs consisting of a sensor histidine kinase and cognate response regulator. In contrast to the well-studied Caulobacter crescentus system, which carries dozens of these pairs, the streamlined bacterial endosymbiont Wolbachia pipientis encodes only two pairs: CckA/CtrA and PleC/PleD. Here, we used bioinformatic tools to compare characterized two-component system relays from C. crescentus, the related Anaplasmataceae species Anaplasma phagocytophilum and Ehrlichia chaffeensis, and 12 sequenced Wolbachia strains. We found the core protein pairs and a subset of interacting partners to be highly conserved within Wolbachia and these other Anaplasmataceae. Genes involved in two-component signaling were positioned differently within the various Wolbachia genomes, whereas the local context of each gene was conserved. Unlike Anaplasma and Ehrlichia, Wolbachia two-component genes were more consistently found clustered with metabolic genes. The domain architecture and key functional residues standard for two-component system proteins were well-conserved in Wolbachia, although residues that specify cognate pairing diverged substantially from other Anaplasmataceae. These findings indicate that Wolbachia two-component signaling pairs share considerable functional overlap with other α-proteobacterial systems, whereas their divergence suggests the potential for regulatory differences and cross-talk.

  2. The Basic Helix-Loop-Helix Protein Family: Comparative Genomics and Phylogenetic Analysis

    PubMed Central

    Ledent, Valérie; Vervoort, Michel

    2001-01-01

    The basic Helix-Loop-Helix (bHLH) proteins are transcription factors that play important roles during the development of various metazoans including fly, nematode, and vertebrates. They are also involved in human diseases, particularly in cancerogenesis. We made an extensive search for bHLH sequences in the completely sequenced genomes of Caenorhabditis elegans and of Drosophila melanogaster. We found 35 and 56 different genes, respectively, which may represent the complete set of bHLH of these organisms. A phylogenetic analysis of these genes, together with a large number (>350) of bHLH from other sources, led us to define 44 orthologous families among which 36 include bHLH from animals only, and two have representatives in both yeasts and animals. In addition, we identified two bHLH motifs present only in yeast, and four that are present only in plants; however, the latter number is certainly an underestimate. Most animal families (35/38) comprise fly, nematode, and vertebrate genes, suggesting that their common ancestor, which lived in pre-Cambrian times (600 million years ago) already owned as many as 35 different bHLH genes. PMID:11337472

  3. Comparative genomic analysis of nine Sphingobium strains: Insights into their evolution and hexachlorocyclohexane (HCH) degradation pathways

    DOE PAGES

    Verma, Helianthous; Kumar, Roshan; Oldach, Phoebe; ...

    2014-11-23

    Background: Sphingobium spp. are efficient degraders of a wide range of chlorinated and aromatic hydrocarbons. In particular, strains which harbour the lin pathway genes mediating the degradation of hexachlorocyclohexane (HCH) isomers are of interest due to the widespread persistence of this contaminant. Here, we examined the evolution and diversification of the lin pathway under the selective pressure of HCH, by comparing the draft genomes of six newly-sequenced Sphingobium spp. (strains LL03, DS20, IP26, HDIPO4, P25 and RL3) isolated from HCH dumpsites, with three existing genomes (S. indicum B90A, S. japonicum UT26S and Sphingobium sp. SYK6). Results: Efficient HCH degraders phylogenetically clustered in a closely related group comprising UT26S, B90A, HDIPO4 and IP26, where HDIPO4 and IP26 were classified as subspecies with ANI value >98%. Less than 10% of the total gene content was shared among all nine strains, but among the eight HCH-associated strains, that is all except SYK6, the shared gene content jumped to nearly 25%. Genes associated with nitrogen stress response and two-component systems were found to be enriched. The strains also housed many xenobiotic degradation pathways other than HCH, despite the absence of these xenobiotics from isolation sources. In addition, these strains, although non-motile, possess flagellar assembly genes. While strains HDIPO4 and IP26 contained the complete set of lin genes, DS20 was entirely devoid of lin genes (except linKLMN), whereas LL03, P25 and RL3 were identified as lin-deficient strains, as they housed incomplete lin pathways. Further, in HDIPO4, linA was found as a hybrid of two natural variants, i.e., linA1 and linA2, known for their different enantioselectivity. In conclusion, the bacteria isolated from HCH dumpsites provide a natural testing ground to study variations in the lin system and their effects on degradation efficacy. Further, the diversity in the lin gene sequences and copy number, their

  4. Comparative genome analysis of high-level penicillin resistance in Streptococcus pneumoniae.

    PubMed

    Tait-Kamradt, Amelia G; Cronan, Melissa; Dougherty, Thomas J

    2009-06-01

    Streptococcus pneumoniae strains with very high levels of penicillin resistance (minimum inhibitory concentration [MIC] ≥8 µg/ml) emerged in the 1990s. Previous studies have traced the changes in penicillin binding proteins (PBP) that result in decreased penicillin susceptibility, and the role of several PBP genes in high-level resistance. In the present study, we investigated the changes that occurred at the two highest levels of penicillin resistance using NimbleGen's Comparative Genome Sequencing (CGS) technology. DNA from a highly resistant (Pen MIC 16 µg/ml) pneumococcus was used to serially transform the R6 strain to high-level resistance. Four distinct levels of penicillin resistance above the susceptible R6 strain (MIC 0.016 µg/ml) were identified. Using CGS technology, the entire genome sequences of the two highest levels of resistant transformants were examined for changes associated with the resistance phenotypes. At the third level of resistance, changes in PBPs 1a, 2b, and 2x were found, very similar to previous reports. At the fourth resistance level, two additional changes were observed in the R6 transformants. More changes were observed in PBP2x, as well as in peptidoglycan GlcNAc deacetylase (pdgA), which had a missense mutation in the coding region. Genetic transformation with polymerase chain reaction (PCR) products generated from the high-level resistant parent containing either the additional PBP2x or mutant pdgA gene did not increase the MIC of the third-level transformant. Only when both PCR products were simultaneously transformed into the third-level transformant did colonies emerge that were at the highest level of resistance (16-32 µg/ml), equivalent to the highly resistant parent strain. This is the first instance of the involvement of a variant pdgA gene in penicillin resistance. It is also clear from these experiments and the literature that there are multiple paths to the pneumococcus achieving high

  5. Complete Plastid Genome Sequencing of Four Tilia Species (Malvaceae): A Comparative Analysis and Phylogenetic Implications

    PubMed Central

    Cai, Jie; Ma, Peng-Fei; Li, Hong-Tao; Li, De-Zhu

    2015-01-01

    Tilia is an ecologically and economically important genus in the family Malvaceae. However, there is no complete plastid genome of Tilia sequenced to date, and the taxonomy of Tilia is difficult owing to frequent hybridization and polyploidization. Well-supported interspecific relationships of this genus are not available due to limited informative sites from the commonly used molecular markers. We report here the complete plastid genome sequences of four Tilia species determined by the Illumina technology. The Tilia plastid genome is 162,653 bp to 162,796 bp in length, encoding 113 unique genes and a total number of 130 genes. The gene order and organization of the Tilia plastid genome exhibits the general structure of angiosperms and is very similar to other published plastid genomes of Malvaceae. As in other long-lived tree genera, the sequence divergence among the four Tilia plastid genomes is very low. We also analyzed the nucleotide substitution patterns and the evolution of insertions and deletions in the Tilia plastid genomes. Finally, we build a phylogeny of the four sampled Tilia species with high support using plastid phylogenomics, suggesting that it is an efficient way to resolve the phylogenetic relationships of this genus. PMID:26566230

  6. Complete Plastid Genome Sequencing of Four Tilia Species (Malvaceae): A Comparative Analysis and Phylogenetic Implications.

    PubMed

    Cai, Jie; Ma, Peng-Fei; Li, Hong-Tao; Li, De-Zhu

    2015-01-01

    Tilia is an ecologically and economically important genus in the family Malvaceae. However, there is no complete plastid genome of Tilia sequenced to date, and the taxonomy of Tilia is difficult owing to frequent hybridization and polyploidization. Well-supported interspecific relationships of this genus are not available due to limited informative sites from the commonly used molecular markers. We report here the complete plastid genome sequences of four Tilia species determined by the Illumina technology. The Tilia plastid genome is 162,653 bp to 162,796 bp in length, encoding 113 unique genes and a total number of 130 genes. The gene order and organization of the Tilia plastid genome exhibits the general structure of angiosperms and is very similar to other published plastid genomes of Malvaceae. As in other long-lived tree genera, the sequence divergence among the four Tilia plastid genomes is very low. We also analyzed the nucleotide substitution patterns and the evolution of insertions and deletions in the Tilia plastid genomes. Finally, we build a phylogeny of the four sampled Tilia species with high support using plastid phylogenomics, suggesting that it is an efficient way to resolve the phylogenetic relationships of this genus.

  7. Comparative analysis of protein evolution in the genome of pre-epidemic and epidemic Zika virus.

    PubMed

    Ramaiah, Arunachalam; Dai, Lei; Contreras, Deisy; Sinha, Sanjeev; Sun, Ren; Arumugaswami, Vaithilingaraja

    2017-03-14

    Zika virus (ZIKV) causes microcephaly in congenital infection, neurological disorders, and poor pregnancy outcome, and no vaccine is approved or available for use in humans. Although ZIKV was first discovered in 1947, the exact mechanism of virus replication and pathogenesis remains unknown. Recent outbreaks of Zika virus in the Americas clearly suggest a human-mosquito cycle or urban cycle of transmission. Understanding the conserved and adaptive features in the evolution of ZIKV genome will provide a hint on the mechanism of ZIKV adaptation to a new cycle of transmission. Here, we show comprehensive analysis of protein evolution of ZIKV strains including the current 2015-16 outbreak. To identify the constraints on ZIKV evolution, selection pressure at individual codons, immune epitopes and co-evolving sites were analyzed. Phylogenetic trees show that the ZIKV strains of the Asian genotype form a distinct cluster and share a common ancestor with the African genotype. The TMRCA (Time to the Most Recent Common Ancestor) for the Asian lineage and the subsequently evolved Asian human strains was calculated at 88 and 34 years ago, respectively. The proteome of current 2015/16 epidemic ZIKV strains of Asian genotype was found to be genetically conserved due to genome-wide negative selection, with limited positive selection. We identified a total of 16 amino acid substitutions in the epidemic and pre-epidemic strains from human, mosquito, and monkey hosts. Negatively selected amino acid sites of Envelope protein (E-protein) (positions 69, 166, and 174) and NS5 (292, 345, and 587) were located in central dimerization domains and C-terminal RNA-directed RNA polymerase regions, respectively. The predicted 137 (92 CD4 TCEs; 45 CD8 TCEs) immunogenic peptide chains comprising negatively selected amino acid sites can be considered suitable targets for sub-unit vaccine development, as these sites are less likely to generate immune-escape variants due to strong functional constraints

  8. Comparative genomic analysis of T-box regulatory systems in bacteria

    PubMed Central

    Vitreschak, Alexey G.; Mironov, Andrei A.; Lyubetsky, Vassily A.; Gelfand, Mikhail S.

    2008-01-01

    T-box antitermination is one of the main mechanisms of regulation of genes involved in amino acid metabolism in Gram-positive bacteria. T-box regulatory sites consist of conserved sequence and RNA secondary structure elements. Using a set of known T-box sites, we constructed the common pattern and used it to scan available bacterial genomes. New T-boxes were found in various Gram-positive bacteria, some Gram-negative bacteria (δ-proteobacteria), and some other bacterial groups (Deinococcales/Thermales, Chloroflexi, Dictyoglomi). The majority of T-box-regulated genes encode aminoacyl-tRNA synthetases. Two other groups of T-box-regulated genes are amino acid biosynthetic genes and transporters, as well as genes with unknown function. Analysis of candidate T-box sites resulted in new functional annotations. We assigned the amino acid specificity to a large number of candidate amino acid transporters and a possible function to amino acid biosynthesis genes. We then studied the evolution of the T-boxes. Analysis of the constructed phylogenetic trees demonstrated that in addition to the normal evolution consistent with the evolution of regulated genes, T-boxes may be duplicated, transferred to other genes, and change specificity. We observed several cases of recent T-box regulon expansion following the loss of a previously existing regulatory system, in particular, arginine regulon in Clostridium difficile and methionine regulon in Lactobacillaceae. Finally, we described a new structural class of T-boxes containing duplicated terminator–antiterminator elements and unusual reduced T-boxes regulating initiation of translation in the Actinobacteria. PMID:18359782

  9. Comparative Analysis of Begonia Plastid Genomes and Their Utility for Species-Level Phylogenetics.

    PubMed

    Harrison, Nicola; Harrison, Richard J; Kidner, Catherine A

    2016-01-01

    Recent, rapid radiations make species-level phylogenetics difficult to resolve. We used a multiplexed, high-throughput sequencing approach to identify informative genomic regions to resolve phylogenetic relationships at low taxonomic levels in Begonia from a survey of sixteen species. A long-range PCR method was used to generate draft plastid genomes to provide a strong phylogenetic backbone, identify fast evolving regions and provide informative molecular markers for species-level phylogenetic studies in Begonia.

  10. Comparative Analysis of Begonia Plastid Genomes and Their Utility for Species-Level Phylogenetics

    PubMed Central

    Harrison, Nicola; Harrison, Richard J.

    2016-01-01

    Recent, rapid radiations make species-level phylogenetics difficult to resolve. We used a multiplexed, high-throughput sequencing approach to identify informative genomic regions to resolve phylogenetic relationships at low taxonomic levels in Begonia from a survey of sixteen species. A long-range PCR method was used to generate draft plastid genomes to provide a strong phylogenetic backbone, identify fast evolving regions and provide informative molecular markers for species-level phylogenetic studies in Begonia. PMID:27058864

  11. Reptile genomes open the frontier for comparative analysis of amniote development and regeneration.

    PubMed

    Tollis, Marc; Hutchins, Elizabeth D; Kusumi, Kenro

    2014-01-01

    Developmental genetic studies of vertebrates have focused primarily on zebrafish, frog and mouse models, which have clear application to medicine and well-developed genomic resources. In contrast, reptiles represent the most diverse amniote group, but have only recently begun to gather the attention of genome sequencing efforts. Extant reptilian groups last shared a common ancestor ~280 million years ago and include lepidosaurs, turtles and crocodilians. This phylogenetic diversity is reflected in great morphological and behavioral diversity capturing the attention of biologists interested in mechanisms regulating developmental processes such as somitogenesis and spinal patterning, regeneration, the evolution of "snake-like" morphology, the formation of the unique turtle shell, and the convergent evolution of the four-chambered heart shared by mammals and archosaurs. The complete genome of the first non-avian reptile, the green anole lizard, was published in 2011 and has provided insights into the origin and evolution of amniotes. Since then, the genomes of multiple snakes, turtles, and crocodilians have also been completed. Here we will review the current diversity of available reptile genomes, with an emphasis on their evolutionary relationships, and will highlight how these genomes have and will continue to facilitate research in developmental and regenerative biology.

  12. Comparative genomic and functional analysis reveal conservation of plant growth promoting traits in Paenibacillus polymyxa and its closely related species

    PubMed Central

    Xie, Jianbo; Shi, Haowen; Du, Zhenglin; Wang, Tianshu; Liu, Xiaomeng; Chen, Sanfeng

    2016-01-01

    Paenibacillus polymyxa has widely been studied as a model of plant-growth promoting rhizobacteria (PGPR). Here, the genome sequences of 9 P. polymyxa strains, together with 26 other sequenced Paenibacillus spp., were comparatively studied. Phylogenetic analysis of the concatenated 244 single-copy core genes suggests that the 9 P. polymyxa strains and 5 other Paenibacillus spp., isolated from diverse geographic regions and ecological niches, formed a closely related clade (here it is called Poly-clade). Analysis of single nucleotide polymorphisms (SNPs) reveals local diversification of the 14 Poly-clade genomes. SNPs were not evenly distributed throughout the 14 genomes and the regions with high SNP density contain the genes related to secondary metabolism, including genes coding for polyketide. Recombination played an important role in the genetic diversity of this clade, although the rate of recombination was clearly lower than mutation. Some genes relevant to plant-growth promoting traits, i.e. phosphate solubilization and IAA production, are well conserved, while some genes relevant to nitrogen fixation and antibiotics synthesis are evolved with diversity in this Poly-clade. This study reveals that both P. polymyxa and its closely related species have plant growth promoting traits and they have great potential uses in agriculture and horticulture as PGPR. PMID:26856413

  13. Comparative Genome-Wide Analysis of the Malate Dehydrogenase Gene Families in Cotton

    PubMed Central

    Imran, Muhammad; Tang, Kai; Liu, Jin-Yuan

    2016-01-01

    Malate dehydrogenases (MDHs) play crucial roles in the physiological processes of plant growth and development. In this study, 13 and 25 MDH genes were identified from Gossypium raimondii and Gossypium hirsutum, respectively. Using these and 13 previously reported Gossypium arboreum MDH genes, a comparative molecular analysis between identified MDH genes from G. raimondii, G. hirsutum, and G. arboreum was performed. Based on multiple sequence alignments, cotton MDHs were divided into five subgroups: mitochondrial MDH, peroxisomal MDH, plastidial MDH, chloroplastic MDH and cytoplasmic MDH. Almost all of the MDHs within the same subgroup shared similar gene structure, amino acid sequence, and conserved motifs in their functional domains. An analysis of chromosomal localization suggested that segmental duplication played a major role in the expansion of cotton MDH gene families. Additionally, a selective pressure analysis indicated that purifying selection acted as a vital force in the evolution of MDH gene families in cotton. Meanwhile, an expression analysis showed the distinct expression profiles of GhMDHs in different vegetative tissues and at different fiber developmental stages, suggesting the functional diversification of these genes in cotton growth and fiber development. Finally, a promoter analysis indicated redundant but typical cis-regulatory elements for the potential functions and stress activity of many MDH genes. This study provides fundamental information for a better understanding of cotton MDH gene families and aids in functional analyses of the MDH genes in cotton fiber development. PMID:27829020

  14. The Complete Chloroplast Genome Sequences of Three Veroniceae Species (Plantaginaceae): Comparative Analysis and Highly Divergent Regions

    PubMed Central

    Choi, Kyoung Su; Chung, Myong Gi; Park, SeonJoo

    2016-01-01

    Previous studies of Veronica and related genera were weakly supported by molecular and paraphyletic taxa. Here, we report the complete chloroplast genome sequence of Veronica nakaiana and the related species Veronica persica and Veronicastrum sibiricum. The chloroplast genome length of V. nakaiana, V. persica, and V. sibiricum ranged from 150,198 bp to 152,930 bp. A total of 112 genes comprising 79 protein coding genes, 29 tRNA genes, and 4 rRNA genes were observed in the three chloroplast genomes. The total number of SSRs was 48, 51, and 53 in V. nakaiana, V. persica, and V. sibiricum, respectively. Two SSRs (10 bp of AT and 12 bp of AATA) were observed in the same regions (rpoC2 and ndhD) in the three chloroplast genomes. A comparison of coding genes and non-coding regions between V. nakaiana and V. persica revealed divergent sites, with the greatest variation occurring in the petD-rpoA region. The complete chloroplast genome sequence information regarding the three Veroniceae will be helpful for elucidating Veroniceae phylogenetic relationships. PMID:27047524
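
    As a rough illustration of how simple sequence repeats (SSRs) such as the "10 bp of AT" and "12 bp of AATA" mentioned above can be located, here is a minimal regex-based sketch in Python. The function name find_ssrs and the thresholds (perfect repeats only, total length of at least 10 bp, unit size up to 6 bp) are assumptions chosen for illustration; the abstract does not state which tool or criteria the authors used, and the test sequence is invented.

```python
import re


def find_ssrs(sequence, min_total_bp=10, max_unit=6):
    """Return sorted (start, unit, copies) tuples for perfect tandem
    repeats whose total length is at least min_total_bp."""
    hits = []
    for unit_len in range(1, max_unit + 1):
        # Smallest copy number that reaches min_total_bp (ceiling division),
        # but never fewer than two copies.
        min_copies = max(2, -(-min_total_bp // unit_len))
        pattern = re.compile(
            r"([ACGT]{%d})\1{%d,}" % (unit_len, min_copies - 1)
        )
        for match in pattern.finditer(sequence):
            unit = match.group(1)
            # Skip units that are just a shorter unit repeated, e.g. "ATAT",
            # so that each repeat is reported once under its shortest unit.
            if any(
                unit == unit[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            copies = len(match.group(0)) // unit_len
            hits.append((match.start(), unit, copies))
    return sorted(hits)


if __name__ == "__main__":
    # Invented sequence: a 10 bp AT repeat and a 12 bp AATA repeat.
    demo = "GGC" + "AT" * 5 + "CCG" + "AATA" * 3 + "GC"
    for start, unit, copies in find_ssrs(demo):
        print(start, unit, copies, len(unit) * copies, "bp")
```

    Dedicated SSR tools typically apply separate minimum copy numbers per unit size and can also report imperfect or compound repeats; this sketch handles only perfect repeats on the forward strand.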

  15. Genome sequence, comparative analysis and population genetics of the domestic horse (Equus caballus)

    PubMed Central

    Wade, CM; Giulotto, E; Sigurdsson, S; Zoli, M; Gnerre, S; Imsland, F; Lear, TL; Adelson, DL; Bailey, E; Bellone, RR; Blöcker, H; Distl, O; Edgar, RC; Garber, M; Leeb, T; Mauceli, E; MacLeod, JN; Penedo, MCT; Raison, JM; Sharpe, T; Vogel, J; Andersson, L; Antczak, DF; Biagi, T; Binns, MM; Chowdhary, BP; Coleman, SJ; Della Valle, G; Fryc, S; Guérin, G; Hasegawa, T; Hill, EW; Jurka, J; Kiialainen, A; Lindgren, G; Liu, J; Magnani, E; Mickelson, JR; Murray, J; Nergadze, SG; Onofrio, R; Pedroni, S; Piras, MF; Raudsepp, T; Rocchi, M; Røed, KH; Ryder, OA; Searle, S; Skow, L; Swinburne, JE; Syvänen, AC; Tozaki, T; Valberg, SJ; Vaudin, M; White, JR; Zody, MC; Lander, ES; Lindblad-Toh, K

    2013-01-01

    We report a high-quality draft sequence of the genome of the horse (Equus caballus). The genome is relatively repetitive, but has little segmental duplication. Chromosomes appear to have undergone few historical rearrangements – 48% of equine chromosomes show conserved synteny to a single human chromosome. Equine chromosome 11 is shown to have an evolutionary novel centromere devoid of centromeric satellite DNA, suggesting that centromeric function may arise prior to satellite repeat accumulation. Linkage disequilibrium, showing the influences of early domestication of large herds of female horses, is intermediate in length between dog and human, and there is long-range haplotype sharing among breeds. PMID:19892987

  16. Comparative genomics analysis of completely sequenced microbial genomes reveals the ubiquity of N-linked glycosylation in prokaryotes.

    PubMed

    Kumar, Manjeet; Balaji, Petety V

    2011-05-01

    Glycosylation of proteins in prokaryotes has been known for the last few decades. Glycan structures and/or the glycosylation pathways have been experimentally characterized in only a small number of prokaryotes. Even this has become possible only during the last decade or so, primarily due to technological and methodological developments. Glycosylated proteins are diverse in their function and localization. Glycosylation has been shown to be associated with a wide range of biological phenomena. Characterization of the various types of glycans and the glycosylation machinery is critical to understand such processes. Such studies can help in the identification of novel targets for designing drugs, diagnostics, and engineering of therapeutic proteins. In view of this, the experimentally characterized pgl system of Campylobacter jejuni, responsible for N-linked glycosylation, has been used in this study to identify glycosylation loci in 865 prokaryotes whose genomes have been completely sequenced. Results from the present study show that only a small number of organisms have homologs for all the pgl enzymes and a few others have homologs for none of the pgl enzymes. Most of the organisms have homologs for only a subset of the pgl enzymes. There is no specific pattern for the presence or absence of pgl homologs vis-à-vis the 16S rRNA sequence-based phylogenetic tree. This may be due to differences in the glycan structures, high sequence divergence, horizontal gene transfer or non-orthologous gene displacement. Overall, the presence of homologs for pgl enzymes in a large number of organisms irrespective of their habitat, pathogenicity, energy generation mechanism, etc., hints towards the ubiquity of N-linked glycosylation in prokaryotes.

  17. VISTA - computational tools for comparative genomics

    SciTech Connect

    Frazer, Kelly A.; Pachter, Lior; Poliakov, Alexander; Rubin, Edward M.; Dubchak, Inna

    2004-01-01

    Comparison of DNA sequences from different species is a fundamental method for identifying functional elements in genomes. Here we describe the VISTA family of tools created to assist biologists in carrying out this task. Our first VISTA server at http://www-gsd.lbl.gov/VISTA/ was launched in the summer of 2000 and was designed to align long genomic sequences and visualize these alignments with associated functional annotations. Currently the VISTA site includes multiple comparative genomics tools and provides users with rich capabilities to browse pre-computed whole-genome alignments of large vertebrate genomes and other groups of organisms with VISTA Browser, submit their own sequences of interest to several VISTA servers for various types of comparative analysis, and obtain detailed comparative analysis results for a set of cardiovascular genes. We illustrate capabilities of the VISTA site by the analysis of a 180 kilobase (kb) interval on human chromosome 5 that encodes the kinesin family member 3A (KIF3A) protein.

  18. Genome‐scale diversity and niche adaptation analysis of Lactococcus lactis by comparative genome hybridization using multi‐strain arrays

    PubMed Central

    Siezen, Roland J.; Bayjanov, Jumamurat R.; Felis, Giovanna E.; van der Sijde, Marijke R.; Starrenburg, Marjo; Molenaar, Douwe; Wels, Michiel; van Hijum, Sacha A. F. T.; van Hylckama Vlieg, Johan E. T.

    2011-01-01

    Summary Lactococcus lactis produces lactic acid and is widely used in the manufacturing of various fermented dairy products. However, the species is also frequently isolated from non‐dairy niches, such as fermented plant material. Recently, these non‐dairy strains have gained increasing interest, as they have been described to possess flavour‐forming activities that are rarely found in dairy isolates and have diverse metabolic properties. We performed an extensive whole‐genome diversity analysis on 39 L. lactis strains, isolated from dairy and plant sources. Comparative genome hybridization analysis with multi‐strain microarrays was used to assess presence or absence of genes and gene clusters in these strains, relative to all L. lactis sequences in public databases, whereby chromosomal and plasmid‐encoded genes were computationally analysed separately. Nearly 3900 chromosomal orthologous groups (chrOGs) were defined on basis of four sequenced chromosomes of L. lactis strains (IL1403, KF147, SK11, MG1363). Of these, 1268 chrOGs are present in at least 35 strains and represent the presently known core genome of L. lactis, and 72 chrOGs appear to be unique for L. lactis. Nearly 600 and 400 chrOGs were found to be specific for either the subspecies lactis or subspecies cremoris respectively. Strain variability was found in presence or absence of gene clusters related to growth on plant substrates, such as genes involved in the consumption of arabinose, xylan, α‐galactosides and galacturonate. Further niche‐specific differences were found in gene clusters for exopolysaccharides biosynthesis, stress response (iron transport, osmotolerance) and bacterial defence mechanisms (nisin biosynthesis). Strain variability of functions encoded on known plasmids included proteolysis, lactose fermentation, citrate uptake, metal ion resistance and exopolysaccharides biosynthesis. The present study supports the view of L. lactis as a species with a very flexible

  19. Comparative genomic analysis of the DUF71/COG2102 family predicts roles in diphthamide biosynthesis and B12 salvage

    PubMed Central

    2012-01-01

    Background The availability of over 3000 published genome sequences has enabled the use of comparative genomic approaches to drive the biological function discovery process. Classically, one used to link gene with function by genetic or biochemical approaches, a lengthy process that often took years. Phylogenetic distribution profiles, physical clustering, gene fusion, co-expression profiles, structural information and other genomic or post-genomic derived associations can now be used to make very strong functional hypotheses. Here, we illustrate this shift with the analysis of the DUF71/COG2102 family, a subgroup of the PP-loop ATPase family. Results The DUF71 family contains at least two subfamilies, one of which was predicted to be the missing diphthine-ammonia ligase (EC 6.3.1.14), Dph6. This enzyme catalyzes the last ATP-dependent step in the synthesis of diphthamide, a complex modification of Elongation Factor 2 that can be ADP-ribosylated by bacterial toxins. Dph6 orthologs are found in nearly all sequenced Archaea and Eucarya, as expected from the distribution of the diphthamide modification. The DUF71 family appears to have originated in the Archaea/Eucarya ancestor and to have been subsequently horizontally transferred to Bacteria. Bacterial DUF71 members likely acquired a different function because the diphthamide modification is absent in this Domain of Life. In-depth investigations suggest that some archaeal and bacterial DUF71 proteins participate in B12 salvage. Conclusions This detailed analysis of the DUF71 family members provides an example of the power of integrated data-mining for solving important “missing genes” or “missing function” cases and illustrates the danger of functional annotation of protein families by homology alone. Reviewers’ names This article was reviewed by Arcady Mushegian, Michael Galperin and L. Aravind. PMID:23013770

  20. Comparative analysis highlights variable genome content of wheat rusts and divergence of the mating loci

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Wheat is grown around the world and has been plagued by three rust fungi for centuries. Leaf rust, stripe rust, and stem rust each cause significant damage and can adapt quickly to overcome resistance that is present in wheat cultivars. Using advanced DNA sequencing technology, the genomes of leaf ...

  1. Dissecting the fungal biology of Bipolaris papendorfii: from phylogenetic to comparative genomic analysis.

    PubMed

    Kuan, Chee Sian; Yew, Su Mei; Toh, Yue Fen; Chan, Chai Ling; Ngeow, Yun Fong; Lee, Kok Wei; Na, Shiang Ling; Yee, Wai-Yan; Hoh, Chee-Choong; Ng, Kee Peng

    2015-06-01

    Bipolaris papendorfii has been reported as a fungal plant pathogen that rarely causes opportunistic infection in humans. Secondary metabolites isolated from this fungus possess medicinal and anticancer properties. However, its genetic fundamental and basic biology are largely unknown. In this study, we report the first draft genome sequence of B. papendorfii UM 226 isolated from the skin scraping of a patient. The assembled 33.4 Mb genome encodes 11,015 putative coding DNA sequences, of which, 2.49% are predicted transposable elements. Multilocus phylogenetic and phylogenomic analyses showed B. papendorfii UM 226 clustering with Curvularia species, apart from other plant pathogenic Bipolaris species. Its genomic features suggest that it is a heterothallic fungus with a putative unique gene encoding the LysM-containing protein which might be involved in fungal virulence on host plants, as well as a wide array of enzymes involved in carbohydrate metabolism, degradation of polysaccharides and lignin in the plant cell wall, secondary metabolite biosynthesis (including dimethylallyl tryptophan synthase, non-ribosomal peptide synthetase, polyketide synthase), the terpenoid pathway and the caffeine metabolism. This first genomic characterization of B. papendorfii provides the basis for further studies on its biology, pathogenicity and medicinal potential.

  2. Dissecting the fungal biology of Bipolaris papendorfii: from phylogenetic to comparative genomic analysis

    PubMed Central

    Kuan, Chee Sian; Yew, Su Mei; Toh, Yue Fen; Chan, Chai Ling; Ngeow, Yun Fong; Lee, Kok Wei; Na, Shiang Ling; Yee, Wai-Yan; Hoh, Chee-Choong; Ng, Kee Peng

    2015-01-01

    Bipolaris papendorfii has been reported as a fungal plant pathogen that rarely causes opportunistic infection in humans. Secondary metabolites isolated from this fungus possess medicinal and anticancer properties. However, its genetic fundamental and basic biology are largely unknown. In this study, we report the first draft genome sequence of B. papendorfii UM 226 isolated from the skin scraping of a patient. The assembled 33.4 Mb genome encodes 11,015 putative coding DNA sequences, of which, 2.49% are predicted transposable elements. Multilocus phylogenetic and phylogenomic analyses showed B. papendorfii UM 226 clustering with Curvularia species, apart from other plant pathogenic Bipolaris species. Its genomic features suggest that it is a heterothallic fungus with a putative unique gene encoding the LysM-containing protein which might be involved in fungal virulence on host plants, as well as a wide array of enzymes involved in carbohydrate metabolism, degradation of polysaccharides and lignin in the plant cell wall, secondary metabolite biosynthesis (including dimethylallyl tryptophan synthase, non-ribosomal peptide synthetase, polyketide synthase), the terpenoid pathway and the caffeine metabolism. This first genomic characterization of B. papendorfii provides the basis for further studies on its biology, pathogenicity and medicinal potential. PMID:25922537

  3. Comparative genomics of biotechnologically important yeasts

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Ascomycete yeasts are metabolically diverse, with great potential for biotechnology. Here, we report the comparative genome analysis of 29 taxonomically and biotechnologically important yeasts, including 16 newly sequenced. We identify a genetic code change, CUG-Ala, in Pachysolen tannophilus in the...

  4. Comparative Genomic Analysis of the Endosymbionts of Herbivorous Insects Reveals Eco-Environmental Adaptations: Biotechnology Applications

    PubMed Central

    Shi, Weibing; Xie, Shangxian; Chen, Xueyan; Sun, Su; Zhou, Xin; Liu, Lantao; Gao, Peng; Kyrpides, Nikos C.; No, En-Gyu; Yuan, Joshua S.

    2013-01-01

    Metagenome analysis of the gut symbionts of three different insects was conducted as a means of comparing taxonomic and metabolic diversity of gut microbiomes to diet and life history of the insect hosts. A second goal was the discovery of novel biocatalysts for biorefinery applications. Grasshopper and cutworm gut symbionts were sequenced and compared with the previously identified metagenome of termite gut microbiota. These insect hosts represent three different insect orders and specialize on different food types. The comparative analysis revealed dramatic differences among the three insect species in the abundance and taxonomic composition of the symbiont populations present in the gut. The composition and abundance of symbionts was correlated with their previously identified capacity to degrade and utilize the different types of food consumed by their hosts. The metabolic reconstruction revealed that the gut metabolome of cutworms and grasshoppers was more enriched for genes involved in carbohydrate metabolism and transport than wood-feeding termite, whereas the termite gut metabolome was enriched for glycosyl hydrolase (GH) enzymes relevant to lignocellulosic biomass degradation. Moreover, termite gut metabolome was more enriched with nitrogen fixation genes than those of grasshopper and cutworm gut, presumably due to the termite's adaptation to the high fiber and less nutritious food types. In order to evaluate and exploit the insect symbionts for biotechnology applications, we cloned and further characterized four biomass-degrading enzymes including one endoglucanase and one xylanase from both the grasshopper and cutworm gut symbionts. The results indicated that the grasshopper symbiont enzymes were generally more efficient in biomass degradation than the homologous enzymes from cutworm symbionts. Together, these results demonstrated a correlation between the composition and putative metabolic functionality of the gut microbiome and host diet, and suggested

  5. Comparative analysis of genome-wide Mlo gene family in Cajanus cajan and Phaseolus vulgaris.

    PubMed

    Deshmukh, Reena; Singh, V K; Singh, B D

    2016-04-01

    The Mlo gene was discovered in barley because the mutant 'mlo' allele conferred broad-spectrum, non-race-specific resistance to powdery mildew caused by Blumeria graminis f. sp. hordei. The Mlo genes also play important roles in growth and development of plants, and in responses to biotic and abiotic stresses. The Mlo gene family has been characterized in several crop species, but only a single legume species, soybean (Glycine max L.), has been investigated so far. The present report describes in silico identification of 18 CcMlo and 20 PvMlo genes in the important legume crops Cajanus cajan (L.) Millsp. and Phaseolus vulgaris L., respectively. In silico analysis of gene organization, protein properties and conserved domains revealed that the C. cajan and P. vulgaris Mlo gene paralogs are more divergent from each other than from their orthologous pairs. The comparative phylogenetic analysis classified CcMlo and PvMlo genes into three major clades. A comparative analysis of CcMlo and PvMlo proteins with the G. max Mlo proteins indicated close association of one CcMlo, one PvMlo with two GmMlo genes, indicating that there was no further expansion of the Mlo gene family after the separation of these species. Thus, most of the diploid species of eudicots might be expected to contain 15-20 Mlo genes. The genes CcMlo12 and 14, and PvMlo11 and 12 are predicted to participate in powdery mildew resistance. If this prediction were verified, these genes could be targeted by TILLING or CRISPR to isolate powdery mildew resistant mutants.

  6. Rumen Cellulosomics: Divergent Fiber-Degrading Strategies Revealed by Comparative Genome-Wide Analysis of Six Ruminococcal Strains

    PubMed Central

    Dassa, Bareket; Borovok, Ilya; Ruimy-Israeli, Vered; Lamed, Raphael; Flint, Harry J.; Duncan, Sylvia H.; Henrissat, Bernard; Coutinho, Pedro; Morrison, Mark; Mosoni, Pascale; Yeoman, Carl J.; White, Bryan A.; Bayer, Edward A.

    2014-01-01

    Background A complex community of microorganisms is responsible for efficient plant cell wall digestion by many herbivores, notably the ruminants. Understanding the different fibrolytic mechanisms utilized by these bacteria has been of great interest in agricultural and technological fields, reinforced more recently by current efforts to convert cellulosic biomass to biofuels. Methodology/Principal Findings Here, we have used a bioinformatics-based approach to explore the cellulosome-related components of six genomes from two of the primary fiber-degrading bacteria in the rumen: Ruminococcus flavefaciens (strains FD-1, 007c and 17) and Ruminococcus albus (strains 7, 8 and SY3). The genomes of two of these strains are reported for the first time herein. The data reveal that the three R. flavefaciens strains encode for an elaborate reservoir of cohesin- and dockerin-containing proteins, whereas the three R. albus strains are cohesin-deficient and encode mainly dockerins and a unique family of cell-anchoring carbohydrate-binding modules (family 37). Conclusions/Significance Our comparative genome-wide analysis pinpoints rare and novel strain-specific protein architectures and provides an exhaustive profile of their numerous lignocellulose-degrading enzymes. This work provides blueprints of the divergent cellulolytic systems in these two prominent fibrolytic rumen bacterial species, each of which reflects a distinct mechanistic model for efficient degradation of cellulosic biomass. PMID:24992679

  7. Complete sequence and comparative genome analysis of the dairy bacterium Streptococcus thermophilus.

    PubMed

    Bolotin, Alexander; Quinquis, Benoît; Renault, Pierre; Sorokin, Alexei; Ehrlich, S Dusko; Kulakauskas, Saulius; Lapidus, Alla; Goltsman, Eugene; Mazur, Michael; Pusch, Gordon D; Fonstein, Michael; Overbeek, Ross; Kyprides, Nikos; Purnelle, Bénédicte; Prozzi, Deborah; Ngui, Katrina; Masuy, David; Hancy, Frédéric; Burteau, Sophie; Boutry, Marc; Delcour, Jean; Goffeau, André; Hols, Pascal

    2004-12-01

    The lactic acid bacterium Streptococcus thermophilus is widely used for the manufacture of yogurt and cheese. This dairy species of major economic importance is phylogenetically close to pathogenic streptococci, raising the possibility that it has a potential for virulence. Here we report the genome sequences of two yogurt strains of S. thermophilus. We found a striking level of gene decay (10% pseudogenes) in both microorganisms. Many genes involved in carbon utilization are nonfunctional, in line with the paucity of carbon sources in milk. Notably, most streptococcal virulence-related genes that are not involved in basic cellular processes are either inactivated or absent in the dairy streptococcus. Adaptation to the constant milk environment appears to have resulted in the stabilization of the genome structure. We conclude that S. thermophilus has evolved mainly through loss-of-function events that remarkably mirror the environment of the dairy niche resulting in a severely diminished pathogenic potential.

  8. Origin of the 1918 Spanish influenza virus: a comparative genomic analysis.

    PubMed

    Vana, Geoff; Westover, Kristi M

    2008-06-01

    To test the avian-origin hypothesis of the 1918 Spanish influenza virus we surveyed influenza sequences from a broad taxonomic distribution and collected 65 full-length genomes representing avian, human and "classic" swine H1N1 lineages in addition to numerous other swine (H1N2, H3N1, and H3N2), human (H2N2, H3N2, and H5N1), and avian (H1N1, H4N6, H5N1, H6N1, H6N6, H6N8, H7N3, H8N4, H9N2, and H13N2) subtypes. Amino acids from all eight segments were concatenated, aligned, and used for phylogenetic analyses. In addition, the genes of the polymerase complex (PB1, PB2, and PA) were analyzed individually. All of our results showed the Brevig-Mission/1918 strain in a position basal to the rest of the clade containing human H1N1s and were consistent with a reassortment hypothesis for the origin of the 1918 virus. Our genome phylogeny further indicates a sister relationship with the "classic" swine H1N1 lineage. The individual PB1, PB2, and PA phylogenies were consistent with reassortment/recombination hypotheses for these genes. These results demonstrate the importance of using a complete-genome approach for addressing the avian-origin hypothesis and predicting the emergence of new pandemic influenza strains.

  9. Computational analysis of full-length mouse cDNAs compared with human genome sequences.

    PubMed

    Kondo, S; Shinagawa, A; Saito, T; Kiyosawa, H; Yamanaka, I; Aizawa, K; Fukuda, S; Hara, A; Itoh, M; Kawai, J; Shibata, K; Hayashizaki, Y

    2001-09-01

    Although the sequencing of the human genome is complete, identification of encoded genes and determination of their structures remain a major challenge. In this report, we introduce a method that effectively uses full-length mouse cDNAs to complement efforts in carrying out these difficult tasks. A total of 61,227 RIKEN mouse cDNAs (21,076 full-length and 40,151 EST sequences containing certain redundancies) were aligned with the draft human sequences. We found 35,141 non-redundant genomic regions that showed a significant alignment with the mouse cDNAs. We analyzed the structures and compositional properties of the regions detected by the full-length cDNAs, including cross-species comparisons, and noted a systematic bias of GENSCAN against exons of small size and/or low GC-content. Of the cDNAs locating the 35,141 genomic regions, 3,217 did not match any sequences of the known human genes or ESTs. Among those 3,217 cDNAs, 1,141 did not show any significant similarity to any protein sequence in the GenBank non-redundant protein database and thus are candidates for novel genes.

  10. Male with mosaicism for supernumerary ring X chromosome: analysis of phenotype and characterization of genotype using array comparative genome hybridization.

    PubMed

    Baker, Peter R; Tsai, Anne Chun-Hui; Springer, Michelle; Swisshelm, Karen; March, Jennifer; Brown, Kathleen; Bellus, Gary

    2010-09-01

    Supernumerary, derivative, and ring X chromosomes are relatively common in Turner syndrome females but have been reported rarely in males. To date, less than 10 cases have been published, of which only 2 have been partially characterized in defining the breakpoints and genetic content of the derivative X chromosome. We describe a male with mosaicism for a supernumerary X chromosome (46,XY/47,XY, r(X)) who has multiple congenital anomalies, including features of craniofrontonasal dysplasia (Mendelian Inheritance in Man 304110) and the presence of ectopic female reproductive organs. Using comparative genomic hybridization array mapping, we determined that the derivative X is composed of a 24-Mb fragment that contains the regions Xp11.3 through Xq13.1 and lacks the XIST gene. This is the first report to describe a detailed molecular characterization of a ring X chromosome in a male by comparative genomic hybridization array analysis. We compare the clinical and molecular findings in this patient to other 46,XY, r(X) patients reported in the literature and discuss the potential role of disomy for known genes contained on the ring X chromosome.

  11. Characterization and comparative analysis of the genome of Puccinia sorghi Schwein, the causal agent of maize common rust.

    PubMed

    Rochi, Lucia; Diéguez, María José; Burguener, Germán; Darino, Martín Alejandro; Pergolesi, María Fernanda; Ingala, Lorena Romina; Cuyeu, Alba Romina; Turjanski, Adrián; Kreff, Enrique Domingo; Sacco, Francisco

    2016-10-13

    Rust fungi are among the most devastating pathogens of crop plants. The biotrophic fungus Puccinia sorghi Schwein (Ps) is responsible for maize common rust, an endemic disease of maize (Zea mays L.) in Argentina that causes significant yield losses in corn production. In spite of this, the Ps genomic sequence was not available. We used Illumina sequencing to rapidly produce the 99.6 Mb draft genome sequence of Ps race RO10H11247, derived from a single-uredinial isolate from infected maize leaves collected in the Argentine Corn Belt Region during 2010. High quality reads were obtained from 200 bp paired-end and 5000 bp mate-paired libraries and assembled in 15,722 scaffolds. A pipeline which combined an ab initio program with homology-based models and homology to in planta enriched ESTs from four cereal pathogenic fungi (the three sequenced wheat rusts and Ustilago maydis) was used to identify 21,087 putative coding sequences, of which 1599 might be part of the Ps RO10H11247 secretome. Among the 458 highly conserved protein families from the euKaryotic Orthologous Groups (KOG) that occur in a wide range of eukaryotic organisms, 97.5% have at least one member with high homology in the Ps assembly (TBlastN, E-value⩽e-10) covering more than 50% of the length of the KOG protein. Comparative studies with the three sequenced wheat rust fungi, and microsynteny analysis involving Puccinia striiformis f. sp. tritici (Pst, wheat stripe rust fungus), support the quality achieved. The results presented here show the effectiveness of the Illumina strategy for sequencing dikaryotic genomes of non-model organisms and provide reliable DNA sequence information for genomic studies, including pathogenic mechanisms of this maize fungus and molecular marker design.
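
    The completeness figure quoted above (the share of conserved KOG families with at least one hit at or below an E-value cutoff that covers more than half of the protein) can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the input layout (family -> list of (E-value, aligned length, protein length) tuples) and the toy values are hypothetical, and reading the abstract's cutoff as 1e-10 is an interpretation; only the two criteria themselves come from the abstract.

```python
def kog_completeness(best_hits, n_families, max_evalue=1e-10, min_coverage=0.5):
    """Percent of families with >= 1 hit passing both the E-value and coverage filters.

    best_hits maps a family id to a list of (evalue, aligned_len, protein_len) tuples.
    Families absent from best_hits count as not found.
    """
    found = 0
    for hits in best_hits.values():
        if any(
            evalue <= max_evalue and aligned_len / protein_len > min_coverage
            for evalue, aligned_len, protein_len in hits
        ):
            found += 1
    return 100.0 * found / n_families


# Made-up toy input, not data from the study.
toy_hits = {
    "KOG_a": [(1e-50, 300, 400)],                    # passes both filters
    "KOG_b": [(1e-5, 390, 400)],                     # E-value too high
    "KOG_c": [(1e-30, 100, 400), (1e-20, 250, 400)], # second hit passes
    "KOG_d": [(1e-40, 210, 400)],                    # passes both filters
}
print(kog_completeness(toy_hits, n_families=4))
```

    With real data the tuples would be parsed from a TBlastN tabular report; that parsing step is omitted here.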

  12. Comparative analysis of the complete genome sequence of the California MSW strain of myxoma virus reveals potential host adaptations.

    PubMed

    Kerr, Peter J; Rogers, Matthew B; Fitch, Adam; Depasse, Jay V; Cattadori, Isabella M; Hudson, Peter J; Tscharke, David C; Holmes, Edward C; Ghedin, Elodie

    2013-11-01

    Myxomatosis is a rapidly lethal disease of European rabbits that is caused by myxoma virus (MYXV). The introduction of a South American strain of MYXV into the European rabbit population of Australia is the classic case of host-pathogen coevolution following cross-species transmission. The most virulent strains of MYXV for European rabbits are the Californian viruses, found in the Pacific states of the United States and the Baja Peninsula, Mexico. The natural host of Californian MYXV is the brush rabbit, Sylvilagus bachmani. We determined the complete sequence of the MSW strain of Californian MYXV and performed a comparative analysis with other MYXV genomes. The MSW genome is larger than that of the South American Lausanne (type) strain of MYXV due to an expansion of the terminal inverted repeats (TIRs) of the genome, with duplication of the M156R, M154L, M153R, M152R, and M151R genes and part of the M150R gene from the right-hand (RH) end of the genome at the left-hand (LH) TIR. Despite the extreme virulence of MSW, no novel genes were identified; five genes were disrupted by multiple indels or mutations to the ATG start codon, including two genes, M008.1L/R and M152R, with major virulence functions in European rabbits, and a sixth gene, M000.5L/R, was absent. The loss of these gene functions suggests that S. bachmani is a relatively recent host for MYXV and that duplication of virulence genes in the TIRs, gene loss, or sequence variation in other genes can compensate for the loss of M008.1L/R and M152R in infections of European rabbits.

  13. Comparative Analysis of the Complete Genome Sequence of the California MSW Strain of Myxoma Virus Reveals Potential Host Adaptations

    PubMed Central

    Kerr, Peter J.; Rogers, Matthew B.; Fitch, Adam; DePasse, Jay V.; Cattadori, Isabella M.; Hudson, Peter J.; Tscharke, David C.; Holmes, Edward C.

    2013-01-01

    Myxomatosis is a rapidly lethal disease of European rabbits that is caused by myxoma virus (MYXV). The introduction of a South American strain of MYXV into the European rabbit population of Australia is the classic case of host-pathogen coevolution following cross-species transmission. The most virulent strains of MYXV for European rabbits are the Californian viruses, found in the Pacific states of the United States and the Baja Peninsula, Mexico. The natural host of Californian MYXV is the brush rabbit, Sylvilagus bachmani. We determined the complete sequence of the MSW strain of Californian MYXV and performed a comparative analysis with other MYXV genomes. The MSW genome is larger than that of the South American Lausanne (type) strain of MYXV due to an expansion of the terminal inverted repeats (TIRs) of the genome, with duplication of the M156R, M154L, M153R, M152R, and M151R genes and part of the M150R gene from the right-hand (RH) end of the genome at the left-hand (LH) TIR. Despite the extreme virulence of MSW, no novel genes were identified; five genes were disrupted by multiple indels or mutations to the ATG start codon, including two genes, M008.1L/R and M152R, with major virulence functions in European rabbits, and a sixth gene, M000.5L/R, was absent. The loss of these gene functions suggests that S. bachmani is a relatively recent host for MYXV and that duplication of virulence genes in the TIRs, gene loss, or sequence variation in other genes can compensate for the loss of M008.1L/R and M152R in infections of European rabbits. PMID:23986601

  14. The Xenopus alcohol dehydrogenase gene family: characterization and comparative analysis incorporating amphibian and reptilian genomes

    PubMed Central

    2014-01-01

    Background The alcohol dehydrogenase (ADH) gene family uniquely illustrates the concept of enzymogenesis. In vertebrates, tandem duplications gave rise to a multiplicity of forms that have been classified in eight enzyme classes, according to primary structure and function. Some of these classes appear to be exclusive of particular organisms, such as the frog ADH8, a unique NADP+-dependent ADH enzyme. This work describes the ADH system of Xenopus, as a model organism, and explores the first amphibian and reptilian genomes released in order to contribute towards a better knowledge of the vertebrate ADH gene family. Results Xenopus cDNA and genomic sequences along with expressed sequence tags (ESTs) were used in phylogenetic analyses and structure-function correlations of amphibian ADHs. Novel ADH sequences identified in the genomes of Anolis carolinensis (anole lizard) and Pelodiscus sinensis (turtle) were also included in these studies. Tissue and stage-specific libraries provided expression data, which has been supported by mRNA detection in Xenopus laevis tissues and regulatory elements in promoter regions. Exon-intron boundaries, position and orientation of ADH genes were deduced from the amphibian and reptilian genome assemblies, thus revealing syntenic regions and gene rearrangements with respect to the human genome. Our results reveal the high complexity of the ADH system in amphibians, with eleven genes, coding for seven enzyme classes in Xenopus tropicalis. Frogs possess the amphibian-specific ADH8 and the novel ADH1-derived forms ADH9 and ADH10. In addition, they exhibit ADH1, ADH2, ADH3 and ADH7, also present in reptiles and birds. Class-specific signatures have been assigned to ADH7, and ancestral ADH2 is predicted to be a mixed-class as the ostrich enzyme, structurally close to mammalian ADH2 but with class-I kinetic properties. Remarkably, many ADH1 and ADH7 forms are observed in the lizard, probably due to lineage-specific duplications. ADH4 is not

  15. Comparative Genome Analysis of Trichophyton rubrum and Related Dermatophytes Reveals Candidate Genes Involved in Infection

    PubMed Central

    Martinez, Diego A.; Oliver, Brian G.; Gräser, Yvonne; Goldberg, Jonathan M.; Li, Wenjun; Martinez-Rossi, Nilce M.; Monod, Michel; Shelest, Ekaterina; Barton, Richard C.; Birch, Elizabeth; Brakhage, Axel A.; Chen, Zehua; Gurr, Sarah J.; Heiman, David; Heitman, Joseph; Kosti, Idit; Rossi, Antonio; Saif, Sakina; Samalova, Marketa; Saunders, Charles W.; Shea, Terrance; Summerbell, Richard C.; Xu, Jun; Young, Sarah; Zeng, Qiandong; Birren, Bruce W.; Cuomo, Christina A.; White, Theodore C.

    2012-01-01

    ABSTRACT The major cause of athlete’s foot is Trichophyton rubrum, a dermatophyte or fungal pathogen of human skin. To facilitate molecular analyses of the dermatophytes, we sequenced T. rubrum and four related species, Trichophyton tonsurans, Trichophyton equinum, Microsporum canis, and Microsporum gypseum. These species differ in host range, mating, and disease progression. The dermatophyte genomes are highly colinear yet contain gene family expansions not found in other human-associated fungi. Dermatophyte genomes are enriched for gene families containing the LysM domain, which binds chitin and potentially related carbohydrates. These LysM domains differ in sequence from those in other species in regions of the peptide that could affect substrate binding. The dermatophytes also encode novel sets of fungus-specific kinases with unknown specificity, including nonfunctional pseudokinases, which may inhibit phosphorylation by competing for kinase sites within substrates, acting as allosteric effectors, or acting as scaffolds for signaling. The dermatophytes are also enriched for a large number of enzymes that synthesize secondary metabolites, including dermatophyte-specific genes that could synthesize novel compounds. Finally, dermatophytes are enriched in several classes of proteases that are necessary for fungal growth and nutrient acquisition on keratinized tissues. Despite differences in mating ability, genes involved in mating and meiosis are conserved across species, suggesting the possibility of cryptic mating in species where it has not been previously detected. These genome analyses identify gene families that are important to our understanding of how dermatophytes cause chronic infections, how they interact with epithelial cells, and how they respond to the host immune response. PMID:22951933

  16. Comparative Genomic Analysis of Drechmeria coniospora Reveals Core and Specific Genetic Requirements for Fungal Endoparasitism of Nematodes.

    PubMed

    Lebrigand, Kevin; He, Le D; Thakur, Nishant; Arguel, Marie-Jeanne; Polanowska, Jolanta; Henrissat, Bernard; Record, Eric; Magdelenat, Ghislaine; Barbe, Valérie; Raffaele, Sylvain; Barbry, Pascal; Ewbank, Jonathan J

    2016-05-01

    Drechmeria coniospora is an obligate fungal pathogen that infects nematodes via the adhesion of specialized spores to the host cuticle. D. coniospora is frequently found associated with Caenorhabditis elegans in environmental samples. It is used in the study of the nematode's response to fungal infection. Full understanding of this bi-partite interaction requires knowledge of the pathogen's genome, analysis of its gene expression program and a capacity for genetic engineering. The acquisition of all three is reported here. A phylogenetic analysis placed D. coniospora close to the truffle parasite Tolypocladium ophioglossoides, and Hirsutella minnesotensis, another nematophagous fungus. Ascomycete nematopathogenicity is polyphyletic; D. coniospora represents a branch that has not been molecularly characterized. A detailed in silico functional analysis, comparing D. coniospora to 11 fungal species, revealed genes and gene families potentially involved in virulence and showed it to be a highly specialized pathogen. A targeted comparison with nematophagous fungi highlighted D. coniospora-specific genes and a core set of genes associated with nematode parasitism. A comparative gene expression analysis of samples from fungal spores and mycelia, and infected C. elegans, gave a molecular view of the different stages of the D. coniospora lifecycle. Transformation of D. coniospora allowed targeted gene knock-out and the production of fungus that expresses fluorescent reporter genes. It also permitted the initial characterisation of a potential fungal counter-defensive strategy, involving interference with a host antimicrobial mechanism. This high-quality annotated genome for D. coniospora gives insights into the evolution and virulence of nematode-destroying fungi. Coupled with genetic transformation, it opens the way for molecular dissection of D. coniospora physiology, and will allow both sides of the interaction between D. coniospora and C. elegans, as well as the

  17. Comparative Genomic Analysis of Drechmeria coniospora Reveals Core and Specific Genetic Requirements for Fungal Endoparasitism of Nematodes

    PubMed Central

    Thakur, Nishant; Arguel, Marie-Jeanne; Polanowska, Jolanta; Henrissat, Bernard; Record, Eric; Magdelenat, Ghislaine; Barbe, Valérie; Raffaele, Sylvain; Barbry, Pascal

    2016-01-01

    Drechmeria coniospora is an obligate fungal pathogen that infects nematodes via the adhesion of specialized spores to the host cuticle. D. coniospora is frequently found associated with Caenorhabditis elegans in environmental samples. It is used in the study of the nematode’s response to fungal infection. Full understanding of this bi-partite interaction requires knowledge of the pathogen’s genome, analysis of its gene expression program and a capacity for genetic engineering. The acquisition of all three is reported here. A phylogenetic analysis placed D. coniospora close to the truffle parasite Tolypocladium ophioglossoides, and Hirsutella minnesotensis, another nematophagous fungus. Ascomycete nematopathogenicity is polyphyletic; D. coniospora represents a branch that has not been molecularly characterized. A detailed in silico functional analysis, comparing D. coniospora to 11 fungal species, revealed genes and gene families potentially involved in virulence and showed it to be a highly specialized pathogen. A targeted comparison with nematophagous fungi highlighted D. coniospora-specific genes and a core set of genes associated with nematode parasitism. A comparative gene expression analysis of samples from fungal spores and mycelia, and infected C. elegans, gave a molecular view of the different stages of the D. coniospora lifecycle. Transformation of D. coniospora allowed targeted gene knock-out and the production of fungus that expresses fluorescent reporter genes. It also permitted the initial characterisation of a potential fungal counter-defensive strategy, involving interference with a host antimicrobial mechanism. This high-quality annotated genome for D. coniospora gives insights into the evolution and virulence of nematode-destroying fungi. Coupled with genetic transformation, it opens the way for molecular dissection of D. coniospora physiology, and will allow both sides of the interaction between D. coniospora and C. elegans, as well as the

  18. Comparative genomic analysis of coffee-infecting Xylella fastidiosa strains isolated from Brazil.

    PubMed

    Barbosa, Deibs; Alencar, Valquíria Campos; Santos, Daiene Souza; de Freitas Oliveira, Ana Cláudia; de Souza, Alessandra A; Coletta-Filho, Helvecio D; de Oliveira, Regina Souza; Nunes, Luiz R

    2015-05-01

    Strains of Xylella fastidiosa constitute a complex group of bacteria that develop within the xylem of many plant hosts, causing diseases of significant economic importance, such as Pierce's disease in North American grapevines and citrus variegated chlorosis in Brazil. X. fastidiosa has also been obtained from other host plants, in direct correlation with the development of diseases, as in the case of coffee leaf scorch (CLS), a disease with potential to cause severe economic losses to the Brazilian coffee industry. This paper describes a thorough genomic characterization of coffee-infecting X. fastidiosa strains, initially performed through a microarray-based approach, which demonstrated that CLS strains could be subdivided into two phylogenetically distinct subgroups. Whole-genome sequencing of two of these bacteria (one from each subgroup) allowed identification of ORFs and horizontally transferred elements (HTEs) that were specific to CLS-related X. fastidiosa strains. Such analyses confirmed the size and importance of HTEs as major mediators of chromosomal evolution amongst these bacteria, and allowed identification of differences in gene content, after comparisons were made with previously sequenced X. fastidiosa strains, isolated from alternative hosts. Although direct experimentation still needs to be performed to elucidate the biological consequences associated with such differences, it was interesting to verify that CLS-related bacteria display variations in genes that produce toxins, as well as surface-related factors (such as fimbrial adhesins and LPS) that have been shown to be involved with recognition of specific host factors in different pathogenic bacteria.

  19. Comparative Genome and Proteome Analysis of Anopheles gambiae and Drosophila melanogaster

    NASA Astrophysics Data System (ADS)

    Zdobnov, Evgeny M.; von Mering, Christian; Letunic, Ivica; Torrents, David; Suyama, Mikita; Copley, Richard R.; Christophides, George K.; Thomasova, Dana; Holt, Robert A.; Subramanian, G. Mani; Mueller, Hans-Michael; Dimopoulos, George; Law, John H.; Wells, Michael A.; Birney, Ewan; Charlab, Rosane; Halpern, Aaron L.; Kokoza, Elena; Kraft, Cheryl L.; Lai, Zhongwu; Lewis, Suzanna; Louis, Christos; Barillas-Mury, Carolina; Nusskern, Deborah; Rubin, Gerald M.; Salzberg, Steven L.; Sutton, Granger G.; Topalis, Pantelis; Wides, Ron; Wincker, Patrick; Yandell, Mark; Collins, Frank H.; Ribeiro, Jose; Gelbart, William M.; Kafatos, Fotis C.; Bork, Peer

    2002-10-01

    Comparison of the genomes and proteomes of the two diptera Anopheles gambiae and Drosophila melanogaster, which diverged about 250 million years ago, reveals considerable similarities. However, numerous differences are also observed; some of these must reflect the selection and subsequent adaptation associated with different ecologies and life strategies. Almost half of the genes in both genomes are interpreted as orthologs and show an average sequence identity of about 56%, which is slightly lower than that observed between the orthologs of the pufferfish and human (diverged about 450 million years ago). This indicates that these two insects diverged considerably faster than vertebrates. Aligned sequences reveal that orthologous genes have retained only half of their intron/exon structure, indicating that intron gains or losses have occurred at a rate of about one per gene per 125 million years. Chromosomal arms exhibit significant remnants of homology between the two species, although only 34% of the genes colocalize in small "microsyntenic" clusters, and major interarm transfers as well as intra-arm shuffling of gene order are detected.

  20. A whole-genome mouse BAC microarray with 1-Mb resolution for analysis of DNA copy number changes by array comparative genomic hybridization.

    PubMed

    Chung, Yeun-Jun; Jonkers, Jos; Kitson, Hannah; Fiegler, Heike; Humphray, Sean; Scott, Carol; Hunt, Sarah; Yu, Yuejin; Nishijima, Ichiko; Velds, Arno; Holstege, Henne; Carter, Nigel; Bradley, Allan

    2004-01-01

    Microarray-based comparative genomic hybridization (CGH) has become a powerful method for the genome-wide detection of chromosomal imbalances. Although BAC microarrays have been used for mouse CGH studies, the resolving power of these analyses was limited because high-density whole-genome mouse BAC microarrays were not available. We therefore developed a mouse BAC microarray containing 2803 unique BAC clones from mouse genomic libraries at 1-Mb intervals. For the general amplification of BAC clone DNA prior to spotting, we designed a set of three novel degenerate oligonucleotide-primed (DOP) PCR primers that preferentially amplify mouse genomic sequences while minimizing unwanted amplification of contaminating Escherichia coli DNA. The resulting 3K mouse BAC microarrays reproducibly identified DNA copy number alterations in cell lines and primary tumors, such as single-copy deletions, regional amplifications, and aneuploidy.

  1. Comparative analysis of the complete mitochondrial genomes of three geographical topmouth culter (Culter alburnus) groups and implications for their phylogenetics.

    PubMed

    Shi, Jianwu; Wang, Dexia; Wang, Junhua; Sheng, Junqing; Peng, Kou; Hu, Beijuan; Zeng, Liugen; Xiao, Minghe; Hong, Yijiang

    2017-03-01

    Topmouth culter (C. alburnus) is an important commercial fish in China. We compared the nucleotide variations in the mtDNA genomes among three geographical groups of Culter alburnus: Liangzi Lake, Hubei Province (referred to as LZH); Taihu Lake, Jiangsu Province (TH); and Poyang Lake, Jiangxi Province (PYH). The similarity of whole mtDNA genomes ranged from 0.992 to 0.999. The similarity among 13 protein-coding genes, 2 rRNA genes, and the D-loop sequences was found to range from 0.982 to 0.996. These data will be useful for designing specific molecular markers to distinguish individuals of C. alburnus from the three geographical groups. An extended termination-associated sequence (ETAS) and several conserved blocks (CSB-F, CSB-E, CSB-D, CSB1, CSB2, and CSB3) were identified in the mtDNA control regions. A phylogenetic analysis shows a monophyletic relationship of the LZF-female and the LZF-male. However, the analysis also showed paraphyletic relationships for the other two geographical groups. This result will be useful for future breeding work on C. alburnus.

  2. The First Chameleon Transcriptome: Comparative Genomic Analysis of the OXPHOS System Reveals Loss of COX8 in Iguanian Lizards

    PubMed Central

    Bar-Yaacov, Dan; Bouskila, Amos; Mishmar, Dan

    2013-01-01

    Recently, we found dramatic mitochondrial DNA divergence of Israeli Chamaeleo chamaeleon populations into two geographically distinct groups. We aimed to examine whether the same pattern of divergence could be found in nuclear genes. However, no genomic resource is available for any chameleon species. Here we present the first chameleon transcriptome, obtained using deep sequencing (SOLiD). Our analysis identified 164,000 sequence contigs of which 19,000 yielded unique BlastX hits. To test the efficacy of our sequencing effort, we examined whether the chameleon and other available reptilian transcriptomes harbored complete sets of genes comprising known biochemical pathways, focusing on the nDNA-encoded oxidative phosphorylation (OXPHOS) genes as a model. As a reference for the screen, we used the 86 known human structural nDNA-encoded OXPHOS subunits (including isoforms). Analysis of 34 publicly available vertebrate transcriptomes revealed orthologs for most human OXPHOS genes. However, OXPHOS subunit COX8 (Cytochrome C oxidase subunit 8), including all its known isoforms, was consistently absent in transcriptomes of iguanian lizards, implying loss of this subunit during the radiation of this suborder. The lack of COX8 in the suborder Iguania is intriguing, since it is important for cellular respiration and ATP production. Our sequencing effort added a new resource for comparative genomic studies, and shed new light on the evolutionary dynamics of the OXPHOS system. PMID:24009133

  3. The first Chameleon transcriptome: comparative genomic analysis of the OXPHOS system reveals loss of COX8 in Iguanian lizards.

    PubMed

    Bar-Yaacov, Dan; Bouskila, Amos; Mishmar, Dan

    2013-01-01

    Recently, we found dramatic mitochondrial DNA divergence of Israeli Chamaeleo chamaeleon populations into two geographically distinct groups. We aimed to examine whether the same pattern of divergence could be found in nuclear genes. However, no genomic resource is available for any chameleon species. Here we present the first chameleon transcriptome, obtained using deep sequencing (SOLiD). Our analysis identified 164,000 sequence contigs of which 19,000 yielded unique BlastX hits. To test the efficacy of our sequencing effort, we examined whether the chameleon and other available reptilian transcriptomes harbored complete sets of genes comprising known biochemical pathways, focusing on the nDNA-encoded oxidative phosphorylation (OXPHOS) genes as a model. As a reference for the screen, we used the 86 known human structural nDNA-encoded OXPHOS subunits (including isoforms). Analysis of 34 publicly available vertebrate transcriptomes revealed orthologs for most human OXPHOS genes. However, OXPHOS subunit COX8 (Cytochrome C oxidase subunit 8), including all its known isoforms, was consistently absent in transcriptomes of iguanian lizards, implying loss of this subunit during the radiation of this suborder. The lack of COX8 in the suborder Iguania is intriguing, since it is important for cellular respiration and ATP production. Our sequencing effort added a new resource for comparative genomic studies, and shed new light on the evolutionary dynamics of the OXPHOS system.

  4. Sequence-level comparative analysis of the Brassica napus genome around two stearoyl-ACP desaturase loci.

    PubMed

    Cho, Kwangsoo; O'Neill, Carmel M; Kwon, Soo-Jin; Yang, Tae-Jin; Smooker, Andrew M; Fraser, Fiona; Bancroft, Ian

    2010-02-01

    We conducted sequence-level comparative analyses, at the scale of complete bacterial artificial chromosome (BAC) clones, between the genome of the most economically important Brassica species, Brassica napus (oilseed rape), and those of Brassica rapa, the genome of which is currently being sequenced, and Arabidopsis thaliana. We constructed a new B. napus BAC library and identified and sequenced clones that contain homoeologous regions of the genome including stearoyl-ACP desaturase-encoding genes. We sequenced the orthologous region of the genome of B. rapa and conducted comparative analyses between the Brassica sequences and those of the orthologous region of the genome of A. thaliana. The proportion of genes conserved (approximately 56%) is lower than has been reported previously between A. thaliana and Brassica (approximately 66%). The gene models for sets of conserved genes were used to determine the extent of nucleotide conservation of coding regions. This was found to be 84.2 ± 3.9% and 85.8 ± 3.7% between the B. napus A and C genomes, respectively, and that of A. thaliana, which is consistent with previous results for other Brassica species, and 97.5 ± 3.1% between the B. napus A genome and B. rapa, and 93.1 ± 4.9% between the B. napus C genome and B. rapa. The divergence of the B. napus genes from the A genome and the B. rapa genes was greater than anticipated and indicates that the A genome ancestor of the B. napus cultivar studied was relatively distantly related to the cultivar of B. rapa selected for genome sequencing.

  5. Comparative primate genomics: emerging patterns of genome content and dynamics

    PubMed Central

    Rogers, Jeffrey; Gibbs, Richard A.

    2014-01-01

    Advances in genome sequencing technologies have created new opportunities for comparative primate genomics. Genome assemblies have been published for several primates, with analyses of several others underway. Whole genome assemblies for the great apes provide remarkable new information about the evolutionary origins of the human genome and the processes involved. Genomic data for macaques and other nonhuman primates provide valuable insight into genetic similarities and differences among species used as models for disease-related research. This review summarizes current knowledge regarding primate genome content and dynamics and offers a series of goals for the near future. PMID:24709753

  6. Comparative analysis of mitochondrial genomes in Diplura (hexapoda, arthropoda): taxon sampling is crucial for phylogenetic inferences.

    PubMed

    Chen, Wan-Jun; Koch, Markus; Mallatt, Jon M; Luan, Yun-Xia

    2014-01-01

    Two-pronged bristletails (Diplura) are traditionally classified into three major superfamilies: Campodeoidea, Projapygoidea, and Japygoidea. The interrelationships of these three superfamilies and the monophyly of Diplura have been much debated. Few previous studies included Projapygoidea in their phylogenetic considerations, and its position within Diplura still is a puzzle from both morphological and molecular points of view. Until now, no mitochondrial genome has been sequenced for any projapygoid species. To fill in this gap, we determined and annotated the complete mitochondrial genome of Octostigma sinensis (Octostigmatidae, Projapygoidea), and of three more dipluran species, one each from the Campodeidae, Parajapygidae, and Japygidae. All four newly sequenced dipluran mtDNAs encode the same set of genes in the same gene order as shared by most crustaceans and hexapods. Secondary structure truncations have occurred in trnR, trnC, trnS1, and trnS2, and the reduction of transfer RNA D-arms was found to be taxonomically correlated, with Campodeoidea having experienced the most reduction. Partitioned phylogenetic analyses, based on both amino acids and nucleotides of the protein-coding genes plus the ribosomal RNA genes, retrieve significant support for a monophyletic Diplura within Pancrustacea, with Projapygoidea more closely related to Campodeoidea than to Japygoidea. Another key finding is that monophyly of Diplura cannot be recovered unless Projapygoidea is included in the phylogenetic analyses; this explains the dipluran polyphyly found by past mitogenomic studies. Including Projapygoidea increased the sampling density within Diplura and probably helped by breaking up a long-branch-attraction artifact. This finding provides an example of how proper sampling is significant for phylogenetic inference.

  7. Ebolavirus comparative genomics

    SciTech Connect

    Jun, Se-Ran; Leuze, Michael R.; Nookaew, Intawat; Uberbacher, Edward C.; Land, Miriam; Zhang, Qian; Wanchai, Visanu; Chai, Juanjuan; Nielsen, Morten; Trolle, Thomas; Lund, Ole; Buzard, Gregory S.; Pedersen, Thomas D.; Ussery, David W.

    2015-07-14

    The 2014 Ebola outbreak in West Africa is the largest documented for this virus. We examine the dynamics of this genome, comparing more than one hundred currently available ebolavirus genomes to each other and to other viral genomes. Based on oligomer frequency analysis, the family Filoviridae forms a distinct group from all other sequenced viral genomes. All filovirus genomes sequenced to date encode proteins with similar functions and gene order, although there is considerable divergence in sequences between the three genera Ebolavirus, Cuevavirus, and Marburgvirus within the family Filoviridae. Whereas all ebolavirus genomes are quite similar (multiple sequences of the same strain are often identical), variation is most common in the intergenic regions and within specific areas of the genes encoding the glycoprotein (GP), nucleoprotein (NP), and polymerase (L). We predict regions that could contain epitope-binding sites, which might be good vaccine targets. In conclusion, this information, combined with glycosylation sites and experimentally determined epitopes, can identify the most promising regions for the development of therapeutic strategies.
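
    The oligomer frequency analysis mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the choice of k, the Euclidean distance, the function names, and the toy sequences are all assumptions made for illustration, and real genomes would be read from sequence files.

```python
from collections import Counter
from itertools import product
from math import sqrt


def kmer_frequencies(seq, k=4):
    """Return the relative frequency of every DNA k-mer in seq."""
    seq = seq.upper()
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= set("ACGT")  # skip windows with ambiguous bases
    )
    total = sum(counts.values())
    all_kmers = ("".join(p) for p in product("ACGT", repeat=k))
    return {kmer: (counts[kmer] / total if total else 0.0) for kmer in all_kmers}


def euclidean_distance(f1, f2):
    """Distance between two frequency profiles built with the same k."""
    return sqrt(sum((f1[kmer] - f2[kmer]) ** 2 for kmer in f1))


# Hypothetical toy sequences (not real viral genomes).
genome_a = "ATGCGATACGCTTGAGGCTA" * 5
genome_b = "ATGCGATACGCTTGAGGCTT" * 5
genome_c = "GGGGCCCCGGGGCCCCGGCC" * 5

fa, fb, fc = (kmer_frequencies(g, k=3) for g in (genome_a, genome_b, genome_c))
d_ab = euclidean_distance(fa, fb)  # near-identical sequences
d_ac = euclidean_distance(fa, fc)  # compositionally different sequences
```

    Profiles of this kind could then be clustered, so that genomes with similar oligomer usage group together.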

  8. Comparative genomic analysis of the Haloferax volcanii DS2 and Halobacterium salinarium GRB contig maps reveals extensive rearrangement.

    PubMed Central

    St Jean, A; Charlebois, R L

    1996-01-01

    Anonymous probes from the genome of Halobacterium salinarium GRB and 12 gene probes were hybridized to the cosmid clones representing the chromosome and plasmids of Halobacterium salinarium GRB and Haloferax volcanii DS2. The order of and pairwise distances between 35 loci uniquely cross-hybridizing to both chromosomes were analyzed in a search for conservation. No conservation between the genomes could be detected at the 15-kbp resolution used in this study. We found distinct sets of low-copy-number repeated sequences in the chromosome and plasmids of Halobacterium salinarium GRB, indicating some degree of partitioning between these replicons. We propose alternative courses for the evolution of the haloarchaeal genome: (i) that the majority of genomic differences that exist between genera came about at the inception of this group or (ii) that the differences have accumulated over the lifetime of the lineage. The strengths and limitations of investigating these models through comparative genomic studies are discussed. PMID:8682791

  9. A Comparative Map of the Zebrafish Genome

    PubMed Central

    Woods, Ian G.; Kelly, Peter D.; Chu, Felicia; Ngo-Hazelett, Phuong; Yan, Yi-Lin; Huang, Hui; Postlethwait, John H.; Talbot, William S.

    2000-01-01

    Zebrafish mutations define the functions of hundreds of essential genes in the vertebrate genome. To accelerate the molecular analysis of zebrafish mutations and to facilitate comparisons among the genomes of zebrafish and other vertebrates, we used a homozygous diploid meiotic mapping panel to localize polymorphisms in 691 previously unmapped genes and expressed sequence tags (ESTs). Together with earlier efforts, this work raises the total number of markers scored in the mapping panel to 2119, including 1503 genes and ESTs and 616 previously characterized simple-sequence length polymorphisms. Sequence analysis of zebrafish genes mapped in this study and in prior work identified putative human orthologs for 804 zebrafish genes and ESTs. Map comparisons revealed 139 new conserved syntenies, in which two or more genes are on the same chromosome in zebrafish and human. Although some conserved syntenies are quite large, there were changes in gene order within conserved groups, apparently reflecting the relatively frequent occurrence of inversions and other intrachromosomal rearrangements since the divergence of teleost and tetrapod ancestors. Comparative mapping also shows that there is not a one-to-one correspondence between zebrafish and human chromosomes. Mapping of duplicate gene pairs identified segments of 20 linkage groups that may have arisen during a genome duplication that occurred early in the evolution of teleosts after the divergence of teleost and mammalian ancestors. This comparative map will accelerate the molecular analysis of zebrafish mutations and enhance the understanding of the evolution of the vertebrate genome. PMID:11116086
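
    The definition of a conserved synteny used in this abstract (two or more genes on the same chromosome in both zebrafish and human) can be sketched in Python as follows. The gene names, linkage groups, and chromosome labels are hypothetical, and the function is an illustration rather than the authors' mapping method.

```python
from collections import defaultdict


def conserved_syntenies(ortholog_map):
    """Group orthologs by (zebrafish linkage group, human chromosome).

    ortholog_map: iterable of (gene, zebrafish_lg, human_chr) tuples.
    Returns only the groups with two or more genes, i.e. conserved syntenies.
    """
    groups = defaultdict(list)
    for gene, lg, chrom in ortholog_map:
        groups[(lg, chrom)].append(gene)
    return {pair: genes for pair, genes in groups.items() if len(genes) >= 2}


# Hypothetical toy data.
toy = [
    ("geneA", "LG1", "Hsa4"),
    ("geneB", "LG1", "Hsa4"),
    ("geneC", "LG1", "Hsa7"),
    ("geneD", "LG2", "Hsa7"),
    ("geneE", "LG2", "Hsa7"),
    ("geneF", "LG2", "Hsa7"),
]
syntenies = conserved_syntenies(toy)
```

    In this toy input, geneC is the only ortholog on its chromosome pair and therefore does not form a conserved synteny.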

  10. Comparative analysis of contextual bias around the translation initiation sites in plant genomes.

    PubMed

    Gupta, Paras; Rangan, Latha; Ramesh, T Venkata; Gupta, Mudit

    2016-09-07

    The nucleotide distribution around the translation initiation site (TIS) is thought to play an important role in determining translation efficiency. Kozak in vertebrates and later Joshi et al. in plants identified a context sequence that has a key role in translation efficiency, but great variation in this context sequence has been observed among different taxa. The present study aims to refine the context sequence around the initiation codon in plants and addresses the sampling error problem by using complete genomes of 7 monocots and 7 dicots separately. Besides positions -3 and +4, significant conservation at positions -2 and +5 was also found, and nucleotide bias at the latter two positions was shown to directly influence translation efficiency in the taxon studied. About 1.8% (monocots) and 2.4% (dicots) of the total sequences fit the context sequence from positions -3 to +5, which might be indicative of a lower number of housekeeping genes in the transcriptome. A three-base periodicity was observed in the 5' UTR and CDS of monocots and only in the CDS of dicots, as confirmed against random occurrence and annotation errors. Deterministic enrichment of GCNAUGGC in monocots, AANAUGGC in dicots and GCNAUGGC in plants around the TIS was also established (where AUG denotes the start codon), which can serve as an arbiter of putative TIS with efficient translation in plants.
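
    As an illustration of what it means for a sequence to fit a context such as GCNAUGGC from positions -3 to +5, here is a minimal Python sketch. It assumes the common convention that the A of AUG is position +1 and that there is no position 0; the function names and the toy mRNA are hypothetical and are not taken from the study.

```python
import re

# N matches any base; the other symbols match themselves.
IUPAC = {"A": "A", "C": "C", "G": "G", "U": "U", "N": "[ACGU]"}


def context_regex(motif):
    """Compile a motif such as 'GCNAUGGC' into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in motif))


def tis_context(mrna, start, upstream=3, downstream=5):
    """Return the bases from -upstream to +downstream around an AUG.

    start is the 0-based index of the A of AUG (position +1).
    Returns None if the window runs off either end of the sequence.
    """
    lo, hi = start - upstream, start + downstream
    if lo < 0 or hi > len(mrna):
        return None
    return mrna[lo:hi]


def fits_context(mrna, start, motif):
    """True if the -3..+5 window around the AUG matches the motif."""
    window = tis_context(mrna, start)
    return window is not None and context_regex(motif).fullmatch(window) is not None


# Hypothetical toy transcript.
toy_mrna = "CCGCAAUGGCUUAA"
aug = toy_mrna.find("AUG")
window = tis_context(toy_mrna, aug)
monocot_like = fits_context(toy_mrna, aug, "GCNAUGGC")
dicot_like = fits_context(toy_mrna, aug, "AANAUGGC")
```

    Applied to every annotated start codon in a genome, a check like this would give the fraction of sequences that fit a given context.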

  11. Genome resequencing and comparative variome analysis in a Brassica rapa and Brassica oleracea collection

    PubMed Central

    Cheng, Feng; Wu, Jian; Cai, Chengcheng; Fu, Lixia; Liang, Jianli; Borm, Theo; Zhuang, Mu; Zhang, Yangyong; Zhang, Fenglan; Bonnema, Guusje; Wang, Xiaowu

    2016-01-01

    The closely related species Brassica rapa and B. oleracea encompass a wide range of vegetable, fodder and oil crops. The release of their reference genomes has facilitated resequencing collections of B. rapa and B. oleracea aiming to build their variome datasets. These data can be used to investigate the evolutionary relationships between and within the different species and the domestication of the crops, hereafter named morphotypes. These data can also be used in genetic studies aiming at the identification of genes that influence agronomic traits. We selected and resequenced 199 B. rapa and 119 B. oleracea accessions representing 12 and nine morphotypes, respectively. Based on these resequencing data, we obtained 2,249,473 and 3,852,169 high quality SNPs (single-nucleotide polymorphisms), as well as 303,617 and 417,004 InDels for the B. rapa and B. oleracea populations, respectively. The variome datasets of B. rapa and B. oleracea represent valuable resources to researchers working on evolution, domestication or breeding of Brassica vegetable crops. PMID:27996963

  12. Comparative Analysis of Human B Cell Epitopes Based on BCG Genomes

    PubMed Central

    Liu, Haican; Zhao, Xiuqin; Wan, Kanglin

    2016-01-01

    Background. Tuberculosis is a huge global health problem. BCG has been the only vaccine used against TB for about 100 years, but the reasons for the variability of protection in populations remain unclear. To improve BCG efficacy and develop a strategy for new vaccines, the underlying genetic differences among BCG subtypes urgently need to be understood. Methods and Findings. Human B cell epitope data were collected from the Immune Epitope Database. Epitope sequences were mapped to those of 15 genomes, including 13 BCGs, M. bovis AF2122/97, and M. tuberculosis H37Rv, to identify the distribution of epitopes. Among 398 experimentally verified B cell epitopes, 321 (80.7%) were conserved, while the remaining 77 (19.3%) were lost to varying degrees in BCGs. The variable protective efficacy of BCGs may result from the degree of B cell epitope deficiency. Conclusions. Here we analyzed, for the first time, the genetic characteristics of BCGs based on B cell epitopes and found that B cell epitope distribution may contribute to vaccine efficacy. Restoration of important antigens or effective B cell epitopes in BCG could be a useful strategy for vaccine development. PMID:27382565
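
    The epitope mapping summarized in this abstract can be sketched in Python as a simple presence/absence check. The abstract does not state how mapping was performed, so exact substring matching is an assumption here, and the strain names and peptide sequences are hypothetical; only the counts 321, 77, and 398 come from the abstract.

```python
def epitope_presence(epitopes, proteomes):
    """Map each epitope to the set of strains whose proteome contains it exactly."""
    return {
        ep: {
            strain
            for strain, proteins in proteomes.items()
            if any(ep in protein for protein in proteins)
        }
        for ep in epitopes
    }


def split_conserved(presence, strains):
    """Separate epitopes found in every strain from those missing in at least one."""
    strains = set(strains)
    conserved = [ep for ep, found in presence.items() if found == strains]
    variable = [ep for ep in presence if ep not in conserved]
    return conserved, variable


# Hypothetical toy proteomes and epitopes.
toy_proteomes = {
    "strain_1": ["MKTAYIAKQRQISFVK", "GSHMLEDPVAA"],
    "strain_2": ["MKTAYIAKQRQISFVK", "GSHMLEAA"],
}
toy_epitopes = ["AYIAKQR", "LEDPV"]

presence = epitope_presence(toy_epitopes, toy_proteomes)
conserved, variable = split_conserved(presence, toy_proteomes)

# Percentages recomputed from the counts reported in the abstract.
pct_conserved = round(100 * 321 / 398, 1)
pct_variable = round(100 * 77 / 398, 1)
```

    Recomputing the percentages from the reported counts reproduces the 80.7% and 19.3% given in the abstract.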

  13. Genome resequencing and comparative variome analysis in a Brassica rapa and Brassica oleracea collection.

    PubMed

    Cheng, Feng; Wu, Jian; Cai, Chengcheng; Fu, Lixia; Liang, Jianli; Borm, Theo; Zhuang, Mu; Zhang, Yangyong; Zhang, Fenglan; Bonnema, Guusje; Wang, Xiaowu

    2016-12-20

    The closely related species Brassica rapa and B. oleracea encompass a wide range of vegetable, fodder and oil crops. The release of their reference genomes has facilitated resequencing collections of B. rapa and B. oleracea aiming to build their variome datasets. These data can be used to investigate the evolutionary relationships between and within the different species and the domestication of the crops, hereafter named morphotypes. These data can also be used in genetic studies aiming at the identification of genes that influence agronomic traits. We selected and resequenced 199 B. rapa and 119 B. oleracea accessions representing 12 and nine morphotypes, respectively. Based on these resequencing data, we obtained 2,249,473 and 3,852,169 high quality SNPs (single-nucleotide polymorphisms), as well as 303,617 and 417,004 InDels for the B. rapa and B. oleracea populations, respectively. The variome datasets of B. rapa and B. oleracea represent valuable resources to researchers working on evolution, domestication or breeding of Brassica vegetable crops.

  14. Comparative genomic analysis of Saccharomyces cerevisiae yeasts isolated from fermentations of traditional beverages unveils different adaptive strategies.

    PubMed

    Ibáñez, Clara; Pérez-Torrado, Roberto; Chiva, Rosana; Guillamón, José Manuel; Barrio, Eladio; Querol, Amparo

    2014-02-03

    Saccharomyces cerevisiae strains are chiefly responsible for most traditional alcohol fermentation processes performed around the world. The characteristics of the diverse traditional fermentations are very different according to their sugar composition, temperature, pH or nitrogen sources. During the adaptation of yeasts to these new environments provided by human activity, their different compositions likely imposed selective pressures that shaped the S. cerevisiae genome. In the present work we performed a comparative genomic hybridization analysis to explore the genome constitution of six S. cerevisiae strains isolated from different traditional fermentations (masato, mescal, cachaça, sake, wine, and sherry wine) and one natural strain. Our results indicate that gene copy numbers (GCN) are very variable among strains, and most of these differences were observed in subtelomeric and intrachromosomal gene families involved in metabolic functions related to cellular homeostasis, cell-to-cell interactions, and transport of solutes such as ions, sugars and metals. In many cases, these genes are not essential but they can play an important role in the adaptation to new environmental conditions. However, the most interesting result is the association observed between GCN changes in genes involved in nitrogen metabolism and the availability of nitrogen sources in the different traditional fermentation processes. This is clearly illustrated by the differences in copy numbers not only in gene PUT1, the main player in the assimilation of proline as a nitrogen source, but also in CAR2, involved in arginine catabolism. Strains isolated from fermentations where proline is more abundant contain a higher number of PUT1 copies and are more efficient in assimilating this amino acid as a nitrogen source. A strain isolated from sugarcane juice fermentations, in which arginine is a rare amino acid, contains fewer copies of CAR2 and showed low efficiency in arginine assimilation. These

  15. A comparative phenotypic and genomic analysis of C57BL/6J and C57BL/6N mouse strains

    PubMed Central

    2013-01-01

    Background The mouse inbred line C57BL/6J is widely used in mouse genetics and its genome has been incorporated into many genetic reference populations. More recently large initiatives such as the International Knockout Mouse Consortium (IKMC) are using the C57BL/6N mouse strain to generate null alleles for all mouse genes. Hence both strains are now widely used in mouse genetics studies. Here we perform a comprehensive genomic and phenotypic analysis of the two strains to identify differences that may influence their underlying genetic mechanisms. Results We undertake genome sequence comparisons of C57BL/6J and C57BL/6N to identify SNPs, indels and structural variants, with a focus on identifying all coding variants. We annotate 34 SNPs and 2 indels that distinguish C57BL/6J and C57BL/6N coding sequences, as well as 15 structural variants that overlap a gene. In parallel we assess the comparative phenotypes of the two inbred lines utilizing the EMPReSSslim phenotyping pipeline, a broad based assessment encompassing diverse biological systems. We perform additional secondary phenotyping assessments to explore other phenotype domains and to elaborate phenotype differences identified in the primary assessment. We uncover significant phenotypic differences between the two lines, replicated across multiple centers, in a number of physiological, biochemical and behavioral systems. Conclusions Comparison of C57BL/6J and C57BL/6N demonstrates a range of phenotypic differences that have the potential to impact upon penetrance and expressivity of mutational effects in these strains. Moreover, the sequence variants we identify provide a set of candidate genes for the phenotypic differences observed between the two strains. PMID:23902802

  16. A comparative genomic study in schizophrenic and in bipolar disorder patients, based on microarray expression profiling meta-analysis.

    PubMed

    Logotheti, Marianthi; Papadodima, Olga; Venizelos, Nikolaos; Chatziioannou, Aristotelis; Kolisis, Fragiskos

    2013-01-01

    Schizophrenia affecting almost 1% and bipolar disorder affecting almost 3%-5% of the global population constitute two severe mental disorders. The catecholaminergic and the serotonergic pathways have been proved to play an important role in the development of schizophrenia, bipolar disorder, and other related psychiatric disorders. The aim of the study was to perform and interpret the results of a comparative genomic profiling study in schizophrenic patients as well as in healthy controls and in patients with bipolar disorder and try to relate and integrate our results with an aberrant amino acid transport through cell membranes. In particular we have focused on genes and mechanisms involved in amino acid transport through cell membranes from whole genome expression profiling data. We performed bioinformatic analysis on raw data derived from four different published studies. In two studies postmortem samples from prefrontal cortices, derived from patients with bipolar disorder, schizophrenia, and control subjects, have been used. In another study we used samples from postmortem orbitofrontal cortex of bipolar subjects while the final study was performed based on raw data from a gene expression profiling dataset in the postmortem superior temporal cortex of schizophrenics. The data were downloaded from NCBI's GEO datasets.

  17. Comparative genomic analysis of Brucella abortus vaccine strain 104M reveals a set of candidate genes associated with its virulence attenuation.

    PubMed

    Yu, Dong; Hui, Yiming; Zai, Xiaodong; Xu, Junjie; Liang, Long; Wang, Bingxiang; Yue, Junjie; Li, Shanhu

    2015-01-01

    The Brucella abortus strain 104M, a spontaneously attenuated strain, has been used as a vaccine strain in humans against brucellosis for six decades in China. Despite many studies, the molecular mechanisms that cause the attenuation are still unclear. Here, we determined the whole-genome sequence of 104M and conducted a comprehensive comparative analysis against the whole-genome sequences of the virulent strain A13334 and other reference strains. This analysis revealed a highly similar genome structure between 104M and A13334. Further comparative genomic analysis between 104M and A13334 revealed a set of genes missing in 104M. Some of these genes were identified to be directly or indirectly associated with virulence. Similarly, a set of mutations in the virulence-related genes was also identified, which may be related to virulence alteration. This study provides a set of candidate genes associated with virulence attenuation in B. abortus vaccine strain 104M.

  18. Genome wide identification of Dof transcription factor gene family in sorghum and its comparative phylogenetic analysis with rice and Arabidopsis.

    PubMed

    Kushwaha, Hariom; Gupta, Shubhra; Singh, Vinay Kumar; Rastogi, Smita; Yadav, Dinesh

    2011-11-01

    The Dof (DNA binding with One Finger) family represents a class of zinc-finger transcription factors that play multifarious roles and are found exclusively in plants. There is great diversity in the number of Dof genes observed in different crops. In the current study, a total of 28 putative Dof genes have been predicted in silico from the recently available whole genome shotgun sequence of Sorghum bicolor (L.) Moench (with assigned accession numbers TPA:BK006983-BK007006 and TPA:BK007079-BK007082). The predicted SbDof genes are distributed on nine out of ten chromosomes of sorghum and most of these genes lack introns based on canonical intron/exon structure. Phylogenetic analysis of 28 SbDof proteins resulted in four subgroups constituting six clusters. The comparative phylogenetic analysis of these Dof proteins along with 30 rice and 36 Arabidopsis Dof proteins revealed six major groups similar to what has been observed earlier for rice and Arabidopsis. Motif analysis revealed the presence of a conserved Dof domain of 50-52 amino acids uniformly distributed across all the 28 Dof proteins of sorghum. The in silico cis-regulatory element analysis of these SbDof genes suggested diverse functions associated with light responsiveness, endosperm-specific gene expression, hormone responsiveness, meristem-specific expression and stress responsiveness.

  19. Comparative Analysis.

    DTIC Science & Technology

    1987-11-01

    COMPARATIVE ANALYSIS (U). Massachusetts Institute of Technology, Artificial Intelligence Laboratory, Cambridge ... AI Memo ... differential qualitative (DQ) analysis, which solves the task, providing explanations suitable for use by design systems, automated diagnosis, intelligent

  20. Comparative Genome Analysis of Extended-Spectrum-β-Lactamase-Producing Escherichia coli Sequence Type 131 Strains from Nepal and Japan

    PubMed Central

    Miyoshi-Akiyama, Tohru; Sherchan, Jatan Bahadur; Doi, Yohei; Nagamatsu, Maki; Sherchand, Jeevan B.; Tandukar, Sarmila; Ohmagari, Norio; Kirikae, Teruo; Ohara, Hiroshi

    2016-01-01

    ABSTRACT The global spread of extended-spectrum-β-lactamase (ESBL)-producing Escherichia coli (ESBL-E. coli) has largely been driven by the pandemic sequence type 131 (ST131). This study aimed to determine the molecular epidemiology of their spread in two Asian countries with contrasting prevalence. We conducted whole-genome sequencing (WGS) of ESBL-E. coli ST131 strains collected prospectively from Nepal and Japan, two countries in Asia with a high and low prevalence of ESBL-E. coli, respectively. We also systematically compared these genomes with those reported from other regions using publicly available WGS data for E. coli ST131 strains. Further, we conducted phylogenetic analysis of these isolates and all genome sequence data for ST131 strains to determine sequence diversity. One hundred five unique ESBL-E. coli isolates from Nepal (February 2013 to July 2013) and 76 isolates from Japan (October 2013 to September 2014) were included. Of these isolates, 54 (51%) isolates from Nepal and 11 (14%) isolates from Japan were identified as ST131 by WGS. Phylogenetic analysis based on WGS suggested that the majority of ESBL-E. coli ST131 isolates from Nepal clustered together, whereas those from Japan were more diverse. Half of the ESBL-E. coli ST131 isolates from Japan belonged to virotype C, whereas half of the isolates from Nepal belonged to a virotype other than virotype A, B, C, D, or E (A/B/C/D/E). The dominant sublineage of E. coli ST131 was H30Rx, which was most prominent in ESBL-E. coli ST131 isolates from Nepal. Our results revealed distinct phylogenetic characteristics of ESBL-E. coli ST131 spread in the two geographical areas of Asia, indicating the involvement of multiple factors in its local spread in each region. IMPORTANCE The global spread of ESBL-E. coli has been driven in large part by pandemic sequence type 131 (ST131). A recent study suggested that, within E. coli ST131, certain sublineages have disseminated worldwide with little association

  1. Functionally-focused algorithmic analysis of high resolution microarray-CGH genomic landscapes demonstrates comparable genomic copy number aberrations in MSI and MSS sporadic colorectal cancer

    PubMed Central

    Ali, Hamad; Bitar, Milad S.; Al Madhoun, Ashraf; Marafie, Makia; Al-Mulla, Fahd

    2017-01-01

    Array-based comparative genomic hybridization (aCGH) emerged as a powerful technology for studying copy number variations at higher resolution in many cancers including colorectal cancer. However, the lack of standardized systematic protocols, including bioinformatic algorithms to obtain and analyze genomic data, resulted in significant variation in the reported copy number aberration (CNA) data. Here, we present genomic aCGH data obtained using highly stringent and functionally relevant statistical algorithms from 116 well-defined microsatellite instable (MSI) and microsatellite stable (MSS) colorectal cancers. We utilized aCGH to characterize genomic CNAs in 116 well-defined sets of colorectal cancer (CRC) cases. We further applied the significance testing for aberrant copy number (STAC) and Genomic Identification of Significant Targets in Cancer (GISTIC) algorithms to identify functionally relevant (nonrandom) chromosomal aberrations in the analyzed colorectal cancer samples. Our results produced high resolution genomic landscapes of both MSI and MSS sporadic CRC. We found that CNAs in MSI and MSS CRCs are heterogeneous in nature but may be divided into 3 distinct genomic patterns. Moreover, we show that although CNAs in MSI and MSS CRCs differ with respect to their size, number and chromosomal distribution, the functional copy number aberrations obtained from MSI and MSS CRCs were in fact comparable but not identical. These unifying CNAs were verified by an MLPA tumor-loss gene panel, which spans 15 different chromosomal locations and contains 50 probes for at least 20 tumor suppressor genes. Consistently, deletions/amplifications in these frequently cancer-altered genes were identical in MSS and MSI CRCs. Our results suggest that MSI and MSS copy number aberrations driving CRC may be functionally comparable. PMID:28231327

  2. Comparative genomic analysis of light-regulated transcripts in the Solanaceae

    PubMed Central

    Rutitzky, Mariana; Ghiglione, Hernan O; Curá, José A; Casal, Jorge J; Yanovsky, Marcelo J

    2009-01-01

    Background Plants use different light signals to adjust their growth and development to the prevailing environmental conditions. Studies in the model species Arabidopsis thaliana and rice indicate that these adjustments are mediated by large changes in the transcriptome. Here we compared transcriptional responses to light in different species of the Solanaceae to investigate common as well as species-specific changes in gene expression. Results cDNA microarrays were used to identify genes regulated by a transition from long days (LD) to short days (SD) in the leaves of potato and tobacco plants, and by phytochrome B (phyB), the photoreceptor that represses tuberization under LD in potato. We also compared transcriptional responses to photoperiod in Nicotiana tabacum Maryland Mammoth (MM), which flowers only under SD, with those of Nicotiana sylvestris, which flowers only under LD conditions. Finally, we identified genes regulated by red compared to far-red light treatments that promote germination in tomato. Conclusion Most of the genes up-regulated in LD were associated with photosynthesis, the synthesis of protective pigments and the maintenance of redox homeostasis, probably contributing to the acclimatization to seasonal changes in irradiance. Some of the photoperiodically regulated genes were the same in potato and tobacco. Others were different but belonged to similar functional categories, suggesting that conserved as well as convergent evolutionary processes are responsible for physiological adjustments to seasonal changes in the Solanaceae. A β-ZIP transcription factor whose expression correlated with the floral transition in Nicotiana species with contrasting photoperiodic responses was also regulated by photoperiod and phyB in potato, and is a candidate gene to act as a general regulator of photoperiodic responses. Finally, GIGANTEA, a gene that controls flowering time in Arabidopsis thaliana and rice, was regulated by photoperiod in the leaves of

  3. Datasets for evolutionary comparative genomics

    PubMed Central

    Liberles, David A

    2005-01-01

    Many decisions about genome sequencing projects are directed by perceived gaps in the tree of life, or towards model organisms. With the goal of a better understanding of biology through the lens of evolution, however, there are additional genomes that are worth sequencing. One such rationale for whole-genome sequencing is discussed here, along with other important strategies for understanding the phenotypic divergence of species. PMID:16086856

  4. Comparative genomics of the liberibacteral plant pathogens

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Comparative analyses of multiple Liberibacter genomes provide significant insights into the evolutionary history, genetic diversity, and phylogenetic and metabolomic capacities among pathogenic bacteria that have caused tremendous economic losses to agricultural crops. In addition, genomic analyses ...

  5. Integration of the Rat Recombination and EST Maps in the Rat Genomic Sequence and Comparative Mapping Analysis With the Mouse Genome

    PubMed Central

    Wilder, Steven P.; Bihoreau, Marie-Thérèse; Argoud, Karène; Watanabe, Takeshi K.; Lathrop, Mark; Gauguier, Dominique

    2004-01-01

    Inbred strains of the laboratory rat are widely used for identifying genetic regions involved in the control of complex quantitative phenotypes of biomedical importance. The draft genomic sequence of the rat now provides essential information for annotating rat quantitative trait locus (QTL) maps. Following the survey of unique rat microsatellite (11,585 including 1648 new markers) and EST (10,067) markers currently available, we have incorporated a selection of 7952 rat EST sequences in an improved version of the integrated linkage-radiation hybrid map of the rat containing 2058 microsatellite markers which provided over 10,000 potential anchor points between rat QTL and the genomic sequence of the rat. A total of 996 genetic positions were resolved (avg. spacing 1.77 cM) in a single large intercross and anchored in the rat genomic sequence (avg. spacing 1.62 Mb). Comparative genome maps between rat and mouse were constructed by successful computational alignment of 6108 mapped rat ESTs in the mouse genome. The integration of rat linkage maps in the draft genomic sequence of the rat and that of other species represents an essential step for translating rat QTL intervals into human chromosomal targets. PMID:15060020

  6. Comparative genome analysis of a thermotolerant Escherichia coli obtained by Genome Replication Engineering Assisted Continuous Evolution (GREACE) and its parent strain provides new understanding of microbial heat tolerance.

    PubMed

    Luan, Guodong; Bao, Guanhui; Lin, Zhao; Li, Yang; Chen, Zugen; Li, Yin; Cai, Zhen

    2015-12-25

    Heat tolerance of microbes is of great importance for efficient biorefinery and bioconversion. However, engineering and understanding of microbial heat tolerance are difficult and insufficient because it is a complex physiological trait which probably correlates with all gene functions, genetic regulations, and cellular metabolisms and activities. In this work, a novel strain engineering approach named Genome Replication Engineering Assisted Continuous Evolution (GREACE) was employed to improve the heat tolerance of Escherichia coli. When the E. coli strain carrying a mutator was cultivated under gradually increasing temperature, genome-wide mutations were continuously generated during genome replication and the mutated strains with improved thermotolerance were autonomously selected. A thermotolerant strain, HR50, capable of growing at 50°C on LB agar plates was obtained within two months, demonstrating the efficiency of GREACE in improving such a complex physiological trait. To understand the improved heat tolerance, genomes of HR50 and its wildtype strain DH5α were sequenced. A total of 361 evenly distributed mutations covering all mutation types were found in HR50. Closed material transportations, loose genome conformation, and possibly altered cell wall structure and transcription pattern were the main differences of HR50 compared with DH5α, which were speculated to be responsible for the improved heat tolerance. This work not only expands our understanding of microbial heat tolerance, but also emphasizes that the in vivo continuous genome mutagenesis method, GREACE, is efficient in improving complex microbial physiological traits.

  7. A comparative genomic analysis of targets of Hox protein Ultrabithorax amongst distant insect species

    PubMed Central

    Prasad, Naveen; Tarikere, Shreeharsha; Khanale, Dhanashree; Habib, Farhat; Shashidhara, L. S.

    2016-01-01

    In the fruitfly Drosophila melanogaster, the differential development of wing and haltere is dependent on the function of the Hox protein Ultrabithorax (Ubx). Here we compare Ubx-mediated regulation of wing patterning genes between the honeybee, Apis mellifera, the silkmoth, Bombyx mori and Drosophila. Orthologues of Ubx are expressed in the third thoracic segment of Apis and Bombyx, although they make functional hindwings. When over-expressed in transgenic Drosophila, Ubx derived from Apis or Bombyx could suppress wing development, suggesting evolutionary changes at the level of co-factors and/or targets of Ubx. To gain further insights into such events, we identified direct targets of Ubx from Apis and Bombyx by ChIP-seq and compared them with those of Drosophila. While the majority of the putative targets of Ubx are species-specific, a considerable number of wing-patterning genes have been retained, over the past 300 million years, as targets in all three species. Interestingly, many of these are differentially expressed only between wing and haltere in Drosophila but not between forewing and hindwing in Apis or Bombyx. Detailed bioinformatics and experimental validation of enhancer sequences suggest that, perhaps along with other factors, changes in the cis-regulatory sequences of earlier targets contribute to diversity in Ubx function. PMID:27296678

  8. Comparative genome analysis of non-toxigenic non-O1 versus toxigenic O1 Vibrio cholerae

    PubMed Central

    Mukherjee, Munmun; Kakarla, Prathusha; Kumar, Sanath; Gonzalez, Esmeralda; Floyd, Jared T.; Inupakutika, Madhuri; Devireddy, Amith Reddy; Tirrell, Selena R.; Bruns, Merissa; He, Guixin; Lindquist, Ingrid E.; Sundararajan, Anitha; Schilkey, Faye D.; Mudge, Joann; Varela, Manuel F.

    2015-01-01

    Pathogenic strains of Vibrio cholerae are responsible for endemic and pandemic outbreaks of the disease cholera. The complete toxigenic mechanisms underlying virulence in Vibrio strains are poorly understood. The hypothesis of this work was that virulent versus non-virulent strains of V. cholerae harbor distinctive genomic elements that encode virulence. The purpose of this study was to elucidate genomic differences between the O1 serotypes and non-O1 V. cholerae PS15, a non-toxigenic strain, in order to identify novel genes potentially responsible for virulence. In this study, we compared the whole genome of the non-O1 PS15 strain to the whole genomes of toxigenic serotypes at the phylogenetic level, and found that the PS15 genome was distantly related to those of toxigenic V. cholerae. Thus we focused on a detailed gene comparison between PS15 and the distantly related O1 V. cholerae N16961. Based on sequence alignment we tentatively assigned chromosome numbers 1 and 2 to elements within the genome of non-O1 V. cholerae PS15. Further, we found that PS15 and O1 V. cholerae N16961 shared 98% identity and 766 genes, but of the genes present in N16961 that were missing in the non-O1 V. cholerae PS15 genome, 56 were predicted to encode not only virulence-related functions (colonization, antimicrobial resistance, and regulation of persister cells) but also functions involved in the metabolic biosynthesis of lipids, nucleosides and sulfur compounds. Additionally, we found 113 genes unique to PS15 that were predicted to encode other properties related to virulence, disease, defense, membrane transport, and DNA metabolism. Here, we identified distinctive and novel genomic elements between O1 and non-O1 V. cholerae genomes as potential virulence factors and, thus, targets for future therapeutics. Modulation of such novel targets may eventually enhance efforts to eradicate endemic and pandemic cholera in afflicted nations. PMID:25722857
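
    The shared and strain-specific gene counts reported in the record above come down to a presence/absence comparison between two gene lists. A minimal sketch of that idea follows, assuming each genome has already been reduced to a list of comparable gene identifiers; the function name and the identifiers are hypothetical and are not taken from the study, whose actual pipeline is not described here.

```python
def compare_gene_content(genes_a, genes_b):
    """Split two gene collections into shared and strain-specific sets."""
    a, b = set(genes_a), set(genes_b)
    return {
        "shared": a & b,   # present in both strains
        "only_a": a - b,   # present in strain A, missing in strain B
        "only_b": b - a,   # present in strain B, missing in strain A
    }


# Hypothetical identifiers for illustration only; not real locus tags.
strain_a_genes = ["geneA", "geneB", "geneC", "geneD"]
strain_b_genes = ["geneB", "geneC", "geneE"]

result = compare_gene_content(strain_a_genes, strain_b_genes)
for label in ("shared", "only_a", "only_b"):
    print(label, sorted(result[label]))
```

    In practice the hard part is deciding when two genes count as "the same" (for example by an orthology or sequence-identity criterion); the set arithmetic only applies once that mapping exists.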

  9. Comparative genomic analysis of single-molecule sequencing and hybrid approaches for finishing the Clostridium autoethanogenum JA1-1 strain DSM 10061 genome

    SciTech Connect

    Brown, Steven D; Nagaraju, Shilpa; Utturkar, Sagar M; De Tissera, Sashini; Segovia, Simón; Mitchell, Wayne; Land, Miriam L; Dassanayake, Asela; Köpke, Michael

    2014-01-01

    Background Clostridium autoethanogenum strain JA1-1 (DSM 10061) is an acetogen capable of fermenting CO, CO2 and H2 (e.g. from syngas or waste gases) into biofuel ethanol and commodity chemicals such as 2,3-butanediol. A draft genome sequence consisting of 100 contigs has been published. Results A closed, high-quality genome sequence for C. autoethanogenum DSM10061 was generated using only the latest single-molecule DNA sequencing technology and without the need for manual finishing. It is assigned to the most complex genome classification based upon genome features such as repeats, prophage, and nine copies of the rRNA gene operons. It has a low G + C content of 31.1%. Illumina, 454, and Illumina/454 hybrid assemblies were generated and then compared to the draft and PacBio assemblies using summary statistics, CGAL, QUAST and REAPR bioinformatics tools and comparative genomic approaches. Assemblies based upon shorter read DNA technologies were confounded by the large number of repeats and their size, which in the case of the rRNA gene operons were ~5 kb. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) systems among biotechnologically relevant Clostridia were classified and related to plasmid content and prophages. Potential associations between plasmid content and CRISPR systems may have implications for historical industrial scale Acetone-Butanol-Ethanol (ABE) fermentation failures and future large scale bacterial fermentations. While C. autoethanogenum contains an active CRISPR system, no such system is present in the closely related Clostridium ljungdahlii DSM 13528. A common prophage inserted into the Arg-tRNA shared between the strains suggests a common ancestor. However, C. ljungdahlii contains several additional putative prophages and it has more than double the amount of prophage DNA compared to C. autoethanogenum. Other differences include important metabolic genes for central metabolism (such as an additional hydrogenase and the absence of a

  10. 1p36 deletion syndrome confirmed by fluorescence in situ hybridization and array-comparative genomic hybridization analysis

    PubMed Central

    Kang, Dong Soo; Shin, Eunsim

    2016-01-01

    Pediatric epilepsy can be caused by various conditions, including specific syndromes. 1p36 deletion syndrome is reported in 1 in 5,000–10,000 newborns, and its characteristic clinical features include developmental delay, mental retardation, hypotonia, congenital heart defects, seizure, and facial dysmorphism. However, detection of the terminal deletion in chromosome 1p by conventional G-banded karyotyping is difficult. Here we present a case of epilepsy with profound developmental delay and characteristic phenotypes. A 7-year- and 6-month-old boy experienced afebrile generalized seizure at the age of 5 years and 3 months. He had recurrent febrile seizures since 12 months of age and showed severe global developmental delay, remarkable hypotonia, short stature, and dysmorphic features such as microcephaly; small, low-set ears; dark, straight eyebrows; deep-set eyes; flat nasal bridge; midface hypoplasia; and a small, pointed chin. Previous diagnostic work-up, including conventional chromosomal analysis, revealed no definite causes. However, array-comparative genomic hybridization analysis revealed 1p36 deletion syndrome with a 9.15-Mb copy loss of the 1p36.33-1p36.22 region, and fluorescence in situ hybridization analysis (FISH) confirmed this diagnosis. This case highlights the need to consider detailed chromosomal study for patients with delayed development and epilepsy. Furthermore, 1p36 deletion syndrome should be considered for patients presenting seizure and moderate-to-severe developmental delay, particularly if the patient exhibits dysmorphic features, short stature, and hypotonia. PMID:28018437

  11. 1p36 deletion syndrome confirmed by fluorescence in situ hybridization and array-comparative genomic hybridization analysis.

    PubMed

    Kang, Dong Soo; Shin, Eunsim; Yu, Jeesuk

    2016-11-01

    Pediatric epilepsy can be caused by various conditions, including specific syndromes. 1p36 deletion syndrome is reported in 1 in 5,000-10,000 newborns, and its characteristic clinical features include developmental delay, mental retardation, hypotonia, congenital heart defects, seizure, and facial dysmorphism. However, detection of the terminal deletion in chromosome 1p by conventional G-banded karyotyping is difficult. Here we present a case of epilepsy with profound developmental delay and characteristic phenotypes. A 7-year- and 6-month-old boy experienced afebrile generalized seizure at the age of 5 years and 3 months. He had recurrent febrile seizures since 12 months of age and showed severe global developmental delay, remarkable hypotonia, short stature, and dysmorphic features such as microcephaly; small, low-set ears; dark, straight eyebrows; deep-set eyes; flat nasal bridge; midface hypoplasia; and a small, pointed chin. Previous diagnostic work-up, including conventional chromosomal analysis, revealed no definite causes. However, array-comparative genomic hybridization analysis revealed 1p36 deletion syndrome with a 9.15-Mb copy loss of the 1p36.33-1p36.22 region, and fluorescence in situ hybridization analysis (FISH) confirmed this diagnosis. This case highlights the need to consider detailed chromosomal study for patients with delayed development and epilepsy. Furthermore, 1p36 deletion syndrome should be considered for patients presenting seizure and moderate-to-severe developmental delay, particularly if the patient exhibits dysmorphic features, short stature, and hypotonia.

  12. Comparative genomic analysis of duplicated homoeologous regions involved in the resistance of Brassica napus to stem canker.

    PubMed

    Fopa Fomeju, Berline; Falentin, Cyril; Lassalle, Gilles; Manzanares-Dauleux, Maria J; Delourme, Régine

    2015-01-01

    All crop species are current or ancient polyploids. Following whole genome duplication, structural and functional modifications result in differential gene content or regulation in the duplicated regions, which can play a fundamental role in the diversification of genes underlying complex traits. We have investigated this issue in Brassica napus, a species with a highly duplicated genome, with the aim of studying the structural and functional organization of duplicated regions involved in quantitative resistance to stem canker, a disease caused by the fungal pathogen Leptosphaeria maculans. Genome-wide association analysis on two oilseed rape panels confirmed that duplicated regions of ancestral blocks E, J, R, U, and W were involved in resistance to stem canker. The structural analysis of the duplicated genomic regions showed a higher gene density on the A genome than on the C genome and a better collinearity between homoeologous regions than paralogous regions, as overall in the whole B. napus genome. The three ancestral sub-genomes were involved in the resistance to stem canker and the fractionation profile of the duplicated regions corresponded to what was expected from results on the B. napus progenitors. About 60% of the genes identified in these duplicated regions were single-copy genes while less than 5% were retained in all the duplicated copies of a given ancestral block. Genes retained in several copies were mainly involved in response to stress, signaling, or transcription regulation. Genes with resistance-associated markers were mainly retained in more than two copies. These results suggested that some genes underlying quantitative resistance to stem canker might be duplicated genes. Genes with a hydrolase activity that were retained in one copy or R-like genes might also account for resistance in some regions. Further analyses need to be conducted to indicate to what extent duplicated genes contribute to the expression of the resistance phenotype.

  13. Cocoa/Cotton Comparative Genomics

    Technology Transfer Automated Retrieval System (TEKTRAN)

    With genome sequence from two members of the Malvaceae family recently made available, we are exploring syntenic relationships, gene content, and evolutionary trajectories between the cacao and cotton genomes. An assembly of cacao (Theobroma cacao) using Illumina and 454 sequence technology yielded ...

  14. Comparative Analysis of Two Helicobacter pylori Strains using Genomics and Mass Spectrometry-Based Proteomics

    PubMed Central

    Karlsson, Roger; Thorell, Kaisa; Hosseini, Shaghayegh; Kenny, Diarmuid; Sihlbom, Carina; Sjöling, Åsa; Karlsson, Anders; Nookaew, Intawat

    2016-01-01

    Helicobacter pylori, a gastroenteric pathogen believed to have co-evolved with humans over 100,000 years, shows significant genetic variability. This motivates the study of different H. pylori strains and the diseases they cause in order to identify determinants for disease evolution. In this study, we used proteomics tools to compare two H. pylori strains. Nic25_A was isolated in Nicaragua from a patient with intestinal metaplasia, and P12 was isolated in Europe from a patient with duodenal ulcers. Differences in the abundance of surface proteins between the two strains were determined with two mass spectrometry-based methods, label-free quantification (MaxQuant) or the use of tandem mass tags (TMT). Each approach used a lipid-based protein immobilization (LPI™) technique to enrich peptides of surface proteins. Using the MaxQuant software, we found 52 proteins that differed significantly in abundance between the two strains (up- or downregulated by a factor of 1.5); with TMT, we found 18 proteins that differed in abundance between the strains. Strain P12 had a higher abundance of proteins encoded by the cag pathogenicity island, while levels of the acid response regulator ArsR and its regulatory targets (KatA, AmiE, and proteins involved in urease production) were higher in strain Nic25_A. Our results show that differences in protein abundance between H. pylori strains can be detected with proteomic approaches; this could have important implications for the study of disease progression. PMID:27891114
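
    The "up- or downregulated by a factor of 1.5" criterion in the record above can be read as a symmetric fold-change cutoff on the ratio of abundances between the two strains. The sketch below illustrates only that cutoff, under the assumption that each protein has one positive abundance value per strain; the function name, protein labels and numbers are hypothetical, and the significance testing the abstract also mentions is not modeled.

```python
def differs_by_fold(abundance_a, abundance_b, threshold=1.5):
    """Return True when a/b is at least `threshold` or at most 1/threshold."""
    if abundance_a <= 0 or abundance_b <= 0:
        raise ValueError("abundances must be positive")
    ratio = abundance_a / abundance_b
    return ratio >= threshold or ratio <= 1.0 / threshold


# Hypothetical (strain A, strain B) abundances; not data from the study.
abundances = {
    "protein_1": (300.0, 100.0),  # ratio 3.0, higher in strain A
    "protein_2": (110.0, 100.0),  # ratio 1.1, below the cutoff
    "protein_3": (40.0, 100.0),   # ratio 0.4, higher in strain B
}

flagged = sorted(
    name for name, (a, b) in abundances.items() if differs_by_fold(a, b)
)
print(flagged)
```

    Testing the ratio in both directions keeps the cutoff symmetric, so a protein 1.5 times more abundant in either strain is treated the same way.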

  15. Comparative analysis of pepper and tomato reveals euchromatin expansion of pepper genome caused by differential accumulation of Ty3/Gypsy-like elements

    PubMed Central

    2011-01-01

    Background Among the Solanaceae plants, the pepper genome is three times larger than that of tomato. Although the gene repertoire and gene order of both species are well conserved, the cause of the genome-size difference is not known. To determine the causes for the expansion of pepper euchromatic regions, we compared the pepper genome to that of tomato. Results For sequence-level analysis, we generated 35.6 Mb of pepper genomic sequences from 1,245 euchromatin-enriched pepper BAC clones. The comparative analysis of orthologous gene-rich regions between the two species revealed insertion of transposons exclusively in the pepper sequences, with gene order and content maintained. The most common type of transposon found was the LTR retrotransposon. Phylogenetic comparison of the LTR retrotransposons revealed that two groups of Ty3/Gypsy-like elements (Tat and Athila) were overly accumulated in the pepper genome. The FISH analysis of the pepper Tat elements showed a random distribution in heterochromatic and euchromatic regions, whereas the tomato Tat elements showed heterochromatin-preferential accumulation. Conclusions Compared to tomato, pepper euchromatin doubled in size through differential accumulation of a specific group of Ty3/Gypsy-like elements. Our results could provide insight into the mechanism of genome evolution in the Solanaceae family. PMID:21276256

  16. Construction of Global Acyl Lipid Metabolic Map by Comparative Genomics and Subcellular Localization Analysis in the Red Alga Cyanidioschyzon merolae

    PubMed Central

    Mori, Natsumi; Moriyama, Takashi; Toyoshima, Masakazu; Sato, Naoki

    2016-01-01

    Pathways of lipid metabolism have been established in land plants, such as Arabidopsis thaliana, but the information on exact pathways is still under study in microalgae. In contrast with Chlamydomonas reinhardtii, which is currently studied extensively, the pathway information in red algae is still in the state in which enzymes and pathways are estimated by analogy with the knowledge in plants. Here we attempt to construct the entire acyl lipid metabolic pathways in a model red alga, Cyanidioschyzon merolae, as an initial basis for future genetic and biochemical studies, by exploiting comparative genomics and localization analysis. First, the data of whole genome clustering by Gclust were used to identify 121 acyl lipid-related enzymes. Then, the localization of 113 of these enzymes was analyzed by GFP-based techniques. We found that most of the predictions on the subcellular localization by existing tools gave erroneous results, probably because these tools had been tuned for plants or green algae. The experimental data in the present study as well as the data reported before in our laboratory will constitute a good training set for tuning these tools. The lipid metabolic map thus constructed shows that the lipid metabolic pathways in the red alga are essentially similar to those in A. thaliana, except that the number of enzymes catalyzing individual reactions is quite limited. The absence of fatty acid desaturation to produce oleic and linoleic acids within the plastid, however, highlights the central importance of desaturation and acyl editing in the endoplasmic reticulum, for the synthesis of plastid lipids as well as other cellular lipids. Additionally, some notable characteristics of lipid metabolism in C. merolae were found. For example, phosphatidylcholine is synthesized by the methylation of phosphatidylethanolamine as in yeasts. It is possible that a single 3-ketoacyl-acyl carrier protein synthase is involved in the condensation reactions of fatty acid

  17. Experimental infection and comparative genomic analysis of a highly pathogenic PRRSV-HBR strain at different passage levels.

    PubMed

    Wei, Yanwu; Li, Shengbin; Huang, Liping; Tang, Qinghai; Liu, Jianbo; Liu, Dan; Wang, Yiping; Wu, Hongli; Liu, Changming

    2013-10-25

    A highly pathogenic strain of porcine reproductive and respiratory syndrome virus (PRRSV-HBR) was passaged on Marc-145 cells for 125 passages. To elucidate the change in virulence of the PRRSV-HBR strain during in vitro passage, a swine infection experiment was performed with low-passage (F5 and F10) and high-passage (F125) viruses. In addition, to identify the mutations related to the change in virulence of the PRRSV-HBR strain, we compared and analyzed the genomic sequences of F5, F10 and F125. The virulence of F125 was significantly lower than that of F5 in the virus-infected pigs. Genome sequence analysis comparing F5 with F125 revealed 45 amino acid (aa) mutations and a deletion of 2 contiguous aa. Of these mutations, 33 (73.3%) occurred in the viral nonstructural proteins and the other 12 (26.7%) in the viral structural proteins. Only 15 of the mutations (33.3%) had appeared by F10, and 30 (66.7%) occurred during passage from F10 to F125. The data showed that the latter 30 aa mutations were probably associated with attenuation of the PRRSV-HBR strain, and that the change in virulence of the virus was determined by multiple alterations in both the structural and nonstructural genes. The virulence of the PRRSV-HBR strain was remarkably attenuated after serial passages, and it can be used as a vaccine candidate for the control of PRRS.

  18. Freshwater bacterial lifestyles inferred from comparative genomics.

    PubMed

    Livermore, Joshua A; Emrich, Scott J; Tan, John; Jones, Stuart E

    2014-03-01

    While micro-organisms actively mediate and participate in freshwater ecosystem services, we know little about freshwater microbial genetic diversity. Genome sequences are available for many bacteria from the human microbiome and the ocean (over 800 and 200, respectively), but only two freshwater genomes are currently available: the streamlined genomes of Polynucleobacter necessarius ssp. asymbioticus and the Actinobacterium AcI-B1. Here, we sequenced and analysed draft genomes of eight phylogenetically diverse freshwater bacteria exhibiting a range of lifestyle characteristics. Comparative genomics of these bacteria reveals putative freshwater bacterial lifestyles based on differences in predicted growth rate, capability to respond to environmental stimuli and diversity of useable carbon substrates. Our conceptual model based on these genomic characteristics provides a foundation on which further ecophysiological and genomic studies can be built. In addition, these genomes greatly expand the diversity of existing genomic context for future studies on the ecology and genetics of freshwater bacteria.

  19. Comprehensive comparative-genomic analysis of Type 2 toxin-antitoxin systems and related mobile stress response systems in prokaryotes

    PubMed Central

    Makarova, Kira S; Wolf, Yuri I; Koonin, Eugene V

    2009-01-01

    Background The prokaryotic toxin-antitoxin systems (TAS, also referred to as TA loci) are widespread, mobile two-gene modules that can be viewed as selfish genetic elements because they evolved mechanisms to become addictive for replicons and cells in which they reside, but also possess "normal" cellular functions in various forms of stress response and management of prokaryotic population. Several distinct TAS of type 1, where the toxin is a protein and the antitoxin is an antisense RNA, and numerous, unrelated TAS of type 2, in which both the toxin and the antitoxin are proteins, have been experimentally characterized, and it is suspected that many more remain to be identified. Results We report a comprehensive comparative-genomic analysis of Type 2 toxin-antitoxin systems in prokaryotes. Using sensitive methods for distant sequence similarity search, genome context analysis and a new approach for the identification of mobile two-component systems, we identified numerous, previously unnoticed protein families that are homologous to toxins and antitoxins of known type 2 TAS. In addition, we predict 12 new families of toxins and 13 families of antitoxins, and also, predict a TAS or TAS-like activity for several gene modules that were not previously suspected to function in that capacity. In particular, we present indications that the two-gene module that encodes a minimal nucleotidyl transferase and the accompanying HEPN protein, and is extremely abundant in many archaea and bacteria, especially, thermophiles might comprise a novel TAS. We present a survey of previously known and newly predicted TAS in 750 complete genomes of archaea and bacteria, quantitatively demonstrate the exceptional mobility of the TAS, and explore the network of toxin-antitoxin pairings that combines plasticity with selectivity. Conclusion The defining properties of the TAS, namely, the typically small size of the toxin and antitoxin genes, fast evolution, and extensive horizontal mobility

  20. Genetic architecture in a marine hybrid zone: comparing outlier detection and genomic clines analysis in the bivalve Macoma balthica.

    PubMed

    Luttikhuizen, P C; Drent, J; Peijnenburg, K T C A; van der Veer, H W; Johannesson, K

    2012-06-01

    The role of natural selection in speciation has received increasing attention and support in recent years. Different types of approaches have been developed that can detect genomic regions influenced by selection. Here, we address the question whether two highly different methods--F(ST) outlier analysis and admixture analysis--detect largely the same set of non-neutral genomic elements or, instead, complementary sets. We study genetic architecture in a natural secondary contact zone where extensive admixture occurs. The marine bivalves Macoma balthica rubra and M. b. balthica descend from two independent trans-Arctic invasions of the north Atlantic and hybridize extensively where they meet, for example in the Kattegat-Danish Straits-Baltic Sea region. The Kattegat-Danish Straits region forms a steep salinity cline and is the only entrance to the recently (ca. 8000 years ago) established brackish water basin, the Baltic Sea. Salinity along the contact zone drops from 30‰ (Skagerrak, M. b. rubra) to 3‰ (Baltic, M. b. balthica). Both outlier analysis and genomic clines analysis suggest that large parts of the genome are influenced by non-neutral effects. Contrasting samples from well outside the hybrid zone, outlier analysis detects 16 of 84 amplified fragment length polymorphism markers as significant F(ST) outliers. Genomic clines analysis detects 31 of 84 markers as non-neutral inside the hybrid zone. Remarkably, only three markers are detected by both methods. We conclude that the two methods together identify a suite of markers that are under the influence of non-neutral effects.
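
    To put the overlap of three markers in context, one can ask how many shared markers would be expected if the two methods flagged markers independently of each other. The sketch below uses only the counts quoted in the record above (84 markers, 16 outliers, 31 cline markers, 3 shared) and a hypergeometric model; the independence model and all names are assumptions made for illustration, not an analysis reported by the authors.

```python
from math import comb


def overlap_pmf(k, total, n_a, n_b):
    """Hypergeometric probability that exactly k of the n_a markers flagged
    by method A also fall among the n_b markers flagged by method B,
    when B's markers are a random subset of `total` markers."""
    return comb(n_a, k) * comb(total - n_a, n_b - k) / comb(total, n_b)


# Counts taken from the abstract.
TOTAL, N_OUTLIER, N_CLINE, OBSERVED = 84, 16, 31, 3

# Expected overlap under independence: n_a * n_b / total.
expected_overlap = N_OUTLIER * N_CLINE / TOTAL

# Probability of an overlap as small as, or smaller than, the observed one.
p_at_most_observed = sum(
    overlap_pmf(k, TOTAL, N_OUTLIER, N_CLINE) for k in range(OBSERVED + 1)
)

print(f"expected overlap under independence: {expected_overlap:.2f}")
print(f"P(overlap <= {OBSERVED}): {p_at_most_observed:.3f}")
```

    Under this simple model the expected overlap is 16 × 31 / 84, or about 5.9 markers, so the observed overlap of three is lower than chance alone would suggest.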

  1. Comparative genomics of biotechnologically important yeasts

    PubMed Central

    Riley, Robert; Haridas, Sajeet; Wolfe, Kenneth H.; Lopes, Mariana R.; Hittinger, Chris Todd; Göker, Markus; Salamov, Asaf A.; Wisecaver, Jennifer H.; Long, Tanya M.; Aerts, Andrea L.; Barry, Kerrie W.; Choi, Cindy; Clum, Alicia; Coughlan, Aisling Y.; Deshpande, Shweta; Douglass, Alexander P.; Hanson, Sara J.; Klenk, Hans-Peter; LaButti, Kurt M.; Lapidus, Alla; Lindquist, Erika A.; Lipzen, Anna M.; Meier-Kolthoff, Jan P.; Ohm, Robin A.; Otillar, Robert P.; Pangilinan, Jasmyn L.; Peng, Yi; Rosa, Carlos A.; Scheuner, Carmen; Sibirny, Andriy A.; Slot, Jason C.; Stielow, J. Benjamin; Sun, Hui; Kurtzman, Cletus P.; Blackwell, Meredith; Grigoriev, Igor V.

    2016-01-01

    Ascomycete yeasts are metabolically diverse, with great potential for biotechnology. Here, we report the comparative genome analysis of 29 taxonomically and biotechnologically important yeasts, including 16 newly sequenced. We identify a genetic code change, CUG-Ala, in Pachysolen tannophilus in the clade sister to the known CUG-Ser clade. Our well-resolved yeast phylogeny shows that some traits, such as methylotrophy, are restricted to single clades, whereas others, such as l-rhamnose utilization, have patchy phylogenetic distributions. Gene clusters, with variable organization and distribution, encode many pathways of interest. Genomics can predict some biochemical traits precisely, but the genomic basis of others, such as xylose utilization, remains unresolved. Our data also provide insight into early evolution of ascomycetes. We document the loss of H3K9me2/3 heterochromatin, the origin of ascomycete mating-type switching, and panascomycete synteny at the MAT locus. These data and analyses will facilitate the engineering of efficient biosynthetic and degradative pathways and gateways for genomic manipulation. PMID:27535936

  2. Genomic analysis of oceanic cyanobacterial myoviruses compared with T4-like myoviruses from diverse hosts and environments

    PubMed Central

    Sullivan, Matthew B; Huang, Katherine H; Ignacio-Espinoza, Julio C; Berlin, Aaron M; Kelly, Libusha; Weigele, Peter R; DeFrancesco, Alicia S; Kern, Suzanne E; Thompson, Luke R; Young, Sarah; Yandava, Chandri; Fu, Ross; Krastins, Bryan; Chase, Michael; Sarracino, David; Osburne, Marcia S; Henn, Matthew R; Chisholm, Sallie W

    2010-01-01

    T4-like myoviruses are ubiquitous, and their genes are among the most abundant documented in ocean systems. Here we compare 26 T4-like genomes, including 10 from non-cyanobacterial myoviruses, and 16 from marine cyanobacterial myoviruses (cyanophages) isolated on diverse Prochlorococcus or Synechococcus hosts. A core genome of 38 virion construction and DNA replication genes was observed in all 26 genomes, with 32 and 25 additional genes shared among the non-cyanophage and cyanophage subsets, respectively. These hierarchical cores are highly syntenic across the genomes, and sampled to saturation. The 25 cyanophage core genes include six previously described genes with putative functions (psbA, mazG, phoH, hsp20, hli03, cobS), a hypothetical protein with a potential phytanoyl-CoA dioxygenase domain, two virion structural genes, and 16 hypothetical genes. Beyond previously described cyanophage-encoded photosynthesis and phosphate stress genes, we observed core genes that may play a role in nitrogen metabolism during infection through modulation of 2-oxoglutarate. Patterns among non-core genes that may drive niche diversification revealed that phosphorus-related gene content reflects source waters rather than host strain used for isolation, and that carbon metabolism genes appear associated with putative mobile elements. As well, phages isolated on Synechococcus had higher genome-wide %G+C and often contained different gene subsets (e.g. petE, zwf, gnd, prnA, cpeT) than those isolated on Prochlorococcus. However, no clear diagnostic genes emerged to distinguish these phage groups, suggesting blurred boundaries possibly due to cross-infection. Finally, genome-wide comparisons of both diverse and closely related, co-isolated genomes provide a locus-to-locus variability metric that will prove valuable for interpreting metagenomic data sets. PMID:20662890
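
    The hierarchical cores described in the record above (genes present in all genomes, plus genes additionally shared within one subset) can be expressed as set intersections. This is a minimal sketch on toy data: the genome and gene labels are hypothetical, and real analyses first need an ortholog-clustering step that is not shown here.

```python
from functools import reduce


def core_genes(genomes):
    """Return the set of genes present in every genome.

    genomes: non-empty dict mapping genome label -> iterable of gene labels.
    """
    return reduce(set.intersection, (set(genes) for genes in genomes.values()))


# Hypothetical gene content (labels are placeholders, not real annotations).
cyanophages = {
    "cyano_1": {"geneA", "geneB", "geneC", "geneX"},
    "cyano_2": {"geneA", "geneB", "geneC", "geneY"},
}
non_cyanophages = {
    "other_1": {"geneA", "geneB", "geneD"},
    "other_2": {"geneA", "geneB", "geneD", "geneZ"},
}

all_genomes = {**cyanophages, **non_cyanophages}

# Core shared by every genome, then the extra genes shared only within a subset.
shared_core = core_genes(all_genomes)
cyanophage_extra_core = core_genes(cyanophages) - shared_core
non_cyanophage_extra_core = core_genes(non_cyanophages) - shared_core

print(sorted(shared_core))
print(sorted(cyanophage_extra_core))
print(sorted(non_cyanophage_extra_core))
```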

  3. The complete chloroplast genome sequence of Morus cathayana and Morus multicaulis, and comparative analysis within genus Morus L

    PubMed Central

    Yang, Jin Hong

    2017-01-01

    Trees of the genus Morus belong to the Moraceae family. To better understand the species status of genus Morus and to provide information for studies on evolutionary biology within the genus, the complete chloroplast (cp) genomes of M. cathayana and M. multicaulis were sequenced. The plastomes of the two species are 159,265 bp and 159,103 bp, respectively, with corresponding 83 and 82 simple sequence repeats (SSRs). Similar to the SSRs of M. mongolica and M. indica cp genomes, more than 70% are mononucleotides, ten are in coding regions, and one exhibits nucleotide content polymorphism. Results for codon usage and relative synonymous codon usage show a strong bias towards NNA and NNT codons in the two cp genomes. Analysis of a plot of the effective number of codons (ENc) for five Morus spp. cp genomes showed that most genes follow the standard curve, but several genes have ENc values below the expected curve. The results indicate that both natural selection and mutational bias have contributed to the codon bias. Ten highly variable regions were identified among the five Morus spp. cp genomes, and 154 single-nucleotide polymorphism mutation events were accurately located in the gene coding region. PMID:28286710
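
    Relative synonymous codon usage (RSCU), mentioned in the record above, is conventionally the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below applies that standard definition to hypothetical counts for the four alanine codons; the numbers are invented to show a bias towards A- and T-ending codons and are not taken from the Morus plastomes.

```python
def rscu(codon_counts):
    """Relative synonymous codon usage for one amino acid.

    codon_counts: dict mapping each synonymous codon -> observed count.
    RSCU = observed count / mean count across the synonymous codons,
    so a value above 1 means the codon is used more often than expected
    under equal usage.
    """
    total = sum(codon_counts.values())
    if total == 0:
        raise ValueError("no codons observed for this amino acid")
    mean_count = total / len(codon_counts)
    return {codon: count / mean_count for codon, count in codon_counts.items()}


# Hypothetical counts for the four alanine codons (illustrative only).
alanine_counts = {"GCA": 30, "GCT": 40, "GCC": 10, "GCG": 20}
alanine_rscu = rscu(alanine_counts)

for codon, value in alanine_rscu.items():
    print(f"{codon}: RSCU = {value:.2f}")
```

    With these toy counts the A- and T-ending codons have RSCU above 1 and the C- and G-ending codons below 1, which is the pattern the phrase "bias towards NNA and NNT codons" refers to.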

  4. Comparative Genomic Analysis of Two Vibrio toranzoniae Strains with Different Virulence Capacity Reveals Clues on Its Pathogenicity for Fish

    PubMed Central

    Lasa, Aide; Gibas, Cynthia J.; Romalde, Jesús L.

    2017-01-01

    Vibrio toranzoniae is a Gram-negative bacterium of the Splendidus clade within the Vibrio genus. V. toranzoniae was first isolated from healthy clams in Galicia (Spain) but was recently also identified in association with disease outbreaks of red conger eel in Chile. Experimental challenges showed that the Chilean isolates, but not the strains isolated from clams, were able to cause fish mortality. The aim of the present study was to determine the differences at the genomic level between the type strain of the species (CECT 7225T) and the strain R17, isolated from red conger eel in Chile, which could explain their different virulence capacity. The genome-based comparison showed high homology between both strains but differences were observed in certain gene clusters that include some virulence factors. Among these, we found that iron acquisition systems and capsule synthesis genes were the main differential features between both genomes that could explain the differences in the pathogenicity of the strains. In addition, the studied genomes presented genomic islands and toxins, and the R17 strain presented CRISPR sequences that are absent from the type strain. Taken together, this analysis provided important insights into virulence factors of V. toranzoniae that will lead to a better understanding of the pathogenic process. PMID:28194141

  5. Comparative Genomic Analysis of Two Vibrio toranzoniae Strains with Different Virulence Capacity Reveals Clues on Its Pathogenicity for Fish.

    PubMed

    Lasa, Aide; Gibas, Cynthia J; Romalde, Jesús L

    2017-01-01

    Vibrio toranzoniae is a Gram-negative bacterium of the Splendidus clade within the Vibrio genus. V. toranzoniae was first isolated from healthy clams in Galicia (Spain) but was recently also identified in association with disease outbreaks of red conger eel in Chile. Experimental challenges showed that the Chilean isolates, but not the strains isolated from clams, were able to cause fish mortality. The aim of the present study was to determine the differences at the genomic level between the type strain of the species (CECT 7225(T)) and the strain R17, isolated from red conger eel in Chile, which could explain their different virulence capacity. The genome-based comparison showed high homology between both strains but differences were observed in certain gene clusters that include some virulence factors. Among these, we found that iron acquisition systems and capsule synthesis genes were the main differential features between both genomes that could explain the differences in the pathogenicity of the strains. In addition, the studied genomes presented genomic islands and toxins, and the R17 strain presented CRISPR sequences that are absent from the type strain. Taken together, this analysis provided important insights into virulence factors of V. toranzoniae that will lead to a better understanding of the pathogenic process.

  6. Whole-genome comparative analysis of virulence genes unveils similarities and differences between endophytes and other symbiotic bacteria

    PubMed Central

    Lòpez-Fernàndez, Sebastiàn; Sonego, Paolo; Moretto, Marco; Pancher, Michael; Engelen, Kristof; Pertot, Ilaria; Campisano, Andrea

    2015-01-01

    Plant pathogens and endophytes co-exist and often interact with the host plant and within its microbial community. The outcome of these interactions may lead to healthy plants through beneficial interactions, or to disease through the inducible production of molecules known as virulence factors. Unravelling the role of virulence in endophytes may crucially improve our understanding of host-associated microbial communities and their correlation with host health. Virulence is the outcome of a complex network of interactions, and drawing the line between pathogens and endophytes has proven to be contentious, as strain-level differences in niche overlap, ecological interactions, state of the host's immune system and environmental factors are seldom taken into account. Defining genomic differences between endophytes and plant pathogens is decisive for understanding the boundaries between these two groups. Here we describe the major differences at the genomic level between seven grapevine endophytic test bacteria and 12 reference strains. We describe the virulence factors detected in the genomes of the test group, as compared to endophytic and non-endophytic references, to better understand the distribution of these traits in endophytic genomes. To do this, we adopted a comparative whole-genome approach, encompassing BLAST-based searches through the GUI-based tools Mauve and BRIG as well as calculating the core and accessory genomes of three genera of enterobacteria. We outline divergences in metabolic pathways of these endophytes and reference strains, with the aid of the online platform RAST. We present a summary of the major differences that help in drawing the boundaries between harmless and harmful bacteria, in the spirit of contributing to a microbiological definition of endophyte. PMID:26074885

  7. Unraveling adaptation of Pontibacter korlensis to radiation and infertility in desert through complete genome and comparative transcriptomic analysis

    PubMed Central

    Dai, Jun; Dai, Wenkui; Qiu, Chuangzhao; Yang, Zhenyu; Zhang, Yi; Zhou, Mengzhou; Zhang, Lei; Fang, Chengxiang; Gao, Qiang; Yang, Qiao; Li, Xin; Wang, Zhi; Wang, Zhiyong; Jia, Zhenhua; Chen, Xiong

    2015-01-01

    The desert is a harsh habitat for flora and microbial life due to its aridness and strong radiation. In this study, we constructed the first complete and deeply annotated genome of the genus Pontibacter (Pontibacter korlensis X14-1T = CCTCC AB 206081T, X14-1). Reconstruction of the sugar metabolism process indicated that strain X14-1 can utilize diverse sugars, including cellulose, starch and sucrose; this result is consistent with previous experiments. Strain X14-1 is also able to resist desiccation and radiation in the desert through well-armed systems related to DNA repair, radical oxygen species (ROS) detoxification and the OstAB and TreYZ pathways for trehalose synthesis. A comparative transcriptomic analysis under gamma radiation revealed that strain X14-1 presents high-efficacy operating responses to