Science.gov

Sample records for comparative genomics based

  1. Prediction of microbial phenotypes based on comparative genomics

    PubMed Central

    2015-01-01

    The accessibility of almost complete genome sequences of uncultivable microbial species from metagenomes necessitates computational methods predicting microbial phenotypes solely based on genomic data. Here we investigate how comparative genomics can be utilized for the prediction of microbial phenotypes. The PICA framework facilitates application and comparison of different machine learning techniques for phenotypic trait prediction. We have improved and extended PICA's support vector machine plug-in and suggest its applicability to large-scale genome databases and incomplete genome sequences. We have demonstrated the stability of the predictive power for phenotypic traits, not perturbed by the rapid growth of genome databases. A new software tool facilitates the in-depth analysis of phenotype models, which associate expected and unexpected protein functions with particular traits. Most of the traits can be reliably predicted in only 60-70% complete genomes. We have established a new phenotypic model that predicts intracellular microorganisms. Thereby we could demonstrate that also independently evolved phenotypic traits, characterized by genome reduction, can be reliably predicted based on comparative genomics. Our results suggest that the extended PICA framework can be used to automatically annotate phenotypes in near-complete microbial genome sequences, as generated in large numbers in current metagenomics studies. PMID:26451672

  2. GenColors-based comparative genome databases for small eukaryotic genomes

    PubMed Central

    Felder, Marius; Romualdi, Alessandro; Petzold, Andreas; Platzer, Matthias; Sühnel, Jürgen; Glöckner, Gernot

    2013-01-01

    Many sequence data repositories can give a quick and easily accessible overview on genomes and their annotations. Less widespread is the possibility to compare related genomes with each other in a common database environment. We have previously described the GenColors database system (http://gencolors.fli-leibniz.de) and its applications to a number of bacterial genomes such as Borrelia, Legionella, Leptospira and Treponema. This system has an emphasis on genome comparison. It combines data from related genomes and provides the user with an extensive set of visualization and analysis tools. Eukaryote genomes are normally larger than prokaryote genomes and thus pose additional challenges for such a system. We have, therefore, adapted GenColors to also handle larger datasets of small eukaryotic genomes and to display eukaryotic gene structures. Further recent developments include whole genome views, genome list options and, for bacterial genome browsers, the display of horizontal gene transfer predictions. Two new GenColors-based databases for two fungal species (http://fgb.fli-leibniz.de) and for four social amoebas (http://sacgb.fli-leibniz.de) were set up. Both new resources open up a single entry point for related genomes for the amoebozoa and fungal research communities and other interested users. Comparative genomics approaches are greatly facilitated by these resources. PMID:23193285

  3. Ten years of bacterial genome sequencing: comparative-genomics-based discoveries.

    PubMed

    Binnewies, Tim T; Motro, Yair; Hallin, Peter F; Lund, Ole; Dunn, David; La, Tom; Hampson, David J; Bellgard, Matthew; Wassenaar, Trudy M; Ussery, David W

    2006-07-01

    It has been more than 10 years since the first bacterial genome sequence was published. Hundreds of bacterial genome sequences are now available for comparative genomics, and searching a given protein against more than a thousand genomes will soon be possible. The subject of this review will address a relatively straightforward question: "What have we learned from this vast amount of new genomic data?" Perhaps one of the most important lessons has been that genetic diversity, at the level of large-scale variation amongst even genomes of the same species, is far greater than was thought. The classical textbook view of evolution relying on the relatively slow accumulation of mutational events at the level of individual bases scattered throughout the genome has changed. One of the most obvious conclusions from examining the sequences from several hundred bacterial genomes is the enormous amount of diversity--even in different genomes from the same bacterial species. This diversity is generated by a variety of mechanisms, including mobile genetic elements and bacteriophages. An examination of the 20 Escherichia coli genomes sequenced so far dramatically illustrates this, with the genome size ranging from 4.6 to 5.5 Mbp; much of the variation appears to be of phage origin. This review also addresses mobile genetic elements, including pathogenicity islands and the structure of transposable elements. There are at least 20 different methods available to compare bacterial genomes. Metagenomics offers the chance to study genomic sequences found in ecosystems, including genomes of species that are difficult to culture. It has become clear that a genome sequence represents more than just a collection of gene sequences for an organism and that information concerning the environment and growth conditions for the organism are important for interpretation of the genomic data. The newly proposed Minimal Information about a Genome Sequence standard has been developed to obtain this

  4. CFGP: a web-based, comparative fungal genomics platform.

    PubMed

    Park, Jongsun; Park, Bongsoo; Jung, Kyongyong; Jang, Suwang; Yu, Kwangyul; Choi, Jaeyoung; Kong, Sunghyung; Park, Jaejin; Kim, Seryun; Kim, Hyojeong; Kim, Soonok; Kim, Jihyun F; Blair, Jaime E; Lee, Kwangwon; Kang, Seogchan; Lee, Yong-Hwan

    2008-01-01

    Since the completion of the Saccharomyces cerevisiae genome sequencing project in 1996, the genomes of over 80 fungal species have been sequenced or are currently being sequenced. Resulting data provide opportunities for studying and comparing fungal biology and evolution at the genome level. To support such studies, the Comparative Fungal Genomics Platform (CFGP; http://cfgp.snu.ac.kr), a web-based multifunctional informatics workbench, was developed. The CFGP comprises three layers, including the basal layer, middleware and the user interface. The data warehouse in the basal layer contains standardized genome sequences of 65 fungal species. The middleware processes queries via six analysis tools, including BLAST, ClustalW, InterProScan, SignalP 3.0, PSORT II and a newly developed tool named BLASTMatrix. The BLASTMatrix permits the identification and visualization of genes homologous to a query across multiple species. The Data-driven User Interface (DUI) of the CFGP was built on a new concept of pre-collecting data and post-executing analysis instead of the 'fill-in-the-form-and-press-SUBMIT' user interfaces utilized by most bioinformatics sites. A tool termed Favorite, which supports the management of encapsulated sequence data and provides a personalized data repository to users, is another novel feature in the DUI. PMID:17947331

  5. WormBase: methods for data mining and comparative genomics.

    PubMed

    Harris, Todd W; Stein, Lincoln D

    2006-01-01

    WormBase is a comprehensive repository for information on Caenorhabditis elegans and related nematodes. Although the primary web-based interface of WormBase (http:// www.wormbase.org/) is familiar to most C. elegans researchers, WormBase also offers powerful data-mining features for addressing questions of comparative genomics, genome structure, and evolution. In this chapter, we focus on data mining at WormBase through the use of flexible web interfaces, custom queries, and scripts. The intended audience includes users wishing to query the database beyond the confines of the web interface or fetch data en masse. No knowledge of programming is necessary or assumed, although users with intermediate skills in the Perl scripting language will be able to utilize additional data-mining approaches. PMID:16988424

  6. FusoBase: an online Fusobacterium comparative genomic analysis platform

    PubMed Central

    Ang, Mia Yang; Heydari, Hamed; Jakubovics, Nick S.; Mahmud, Mahafizul Imran; Dutta, Avirup; Wee, Wei Yee; Wong, Guat Jah; Mutha, Naresh V.R.; Tan, Shi Yang; Choo, Siew Woh

    2014-01-01

    Fusobacterium are anaerobic gram-negative bacteria that have been associated with a wide spectrum of human infections and diseases. As the biology of Fusobacterium is still not well understood, comparative genomic analysis on members of this species will provide further insights on their taxonomy, phylogeny, pathogenicity and other information that may contribute to better management of infections and diseases. To facilitate the ongoing genomic research on Fusobacterium, a specialized database with easy-to-use analysis tools is necessary. Here we present FusoBase, an online database providing access to genome-wide annotated sequences of Fusobacterium strains as well as bioinformatics tools, to support the expanding scientific community. Using our custom-developed Pairwise Genome Comparison tool, we demonstrate how differences between two user-defined genomes and how insertion of putative prophages can be identified. In addition, Pathogenomics Profiling Tool is capable of clustering predicted genes across Fusobacterium strains and visualizing the results in the form of a heat map with dendrogram. Database URL: http://fusobacterium.um.edu.my. PMID:25149689

  7. A Web-Based Comparative Genomics Tutorial for Investigating Microbial Genomes

    PubMed Central

    STRONG, MICHAEL; CASCIO, DUILIO; EISENBERG, DAVID

    2004-01-01

    As the number of completely sequenced microbial genomes continues to rise at an impressive rate, it is important to prepare students with the skills necessary to investigate microorganisms at the genomic level. As a part of the core curriculum for first-year graduate students in the biological sciences, we have implemented a web-based tutorial to introduce students to the fields of comparative and functional genomics. The tutorial focuses on recent computational methods for identifying functionally linked genes and proteins on a genome-wide scale and was used to introduce students to the Rosetta Stone, Phylogenetic Profile, conserved Gene Neighbor, and Operon computational methods. Students learned to use a number of publicly available web servers and databases to identify functionally linked genes in the Escherichia coli genome, with emphasis on genome organization and operon structure. The overall effectiveness of the tutorial was assessed based on student evaluations and homework assignments. The tutorial is available to other educators at http://www.doe-mbi.ucla.edu/~strong/m253.php. PMID:23653555

  8. CrusView: a Java-based visualization platform for comparative genomics analyses in Brassicaceae species.

    PubMed

    Chen, Hao; Wang, Xiangfeng

    2013-09-01

    In plants and animals, chromosomal breakage and fusion events based on conserved syntenic genomic blocks lead to conserved patterns of karyotype evolution among species of the same family. However, karyotype information has not been well utilized in genomic comparison studies. We present CrusView, a Java-based bioinformatic application utilizing Standard Widget Toolkit/Swing graphics libraries and a SQLite database for performing visualized analyses of comparative genomics data in Brassicaceae (crucifer) plants. Compared with similar software and databases, one of the unique features of CrusView is its integration of karyotype information when comparing two genomes. This feature allows users to perform karyotype-based genome assembly and karyotype-assisted genome synteny analyses with preset karyotype patterns of the Brassicaceae genomes. Additionally, CrusView is a local program, which gives its users high flexibility when analyzing unpublished genomes and allows users to upload self-defined genomic information so that they can visually study the associations between genome structural variations and genetic elements, including chromosomal rearrangements, genomic macrosynteny, gene families, high-frequency recombination sites, and tandem and segmental duplications between related species. This tool will greatly facilitate karyotype, chromosome, and genome evolution studies using visualized comparative genomics approaches in Brassicaceae species. CrusView is freely available at http://www.cmbb.arizona.edu/CrusView/. PMID:23898041

  9. Ensembl comparative genomics resources.

    PubMed

    Herrero, Javier; Muffato, Matthieu; Beal, Kathryn; Fitzgerald, Stephen; Gordon, Leo; Pignatelli, Miguel; Vilella, Albert J; Searle, Stephen M J; Amode, Ridwan; Brent, Simon; Spooner, William; Kulesha, Eugene; Yates, Andrew; Flicek, Paul

    2016-01-01

    Evolution provides the unifying framework with which to understand biology. The coherent investigation of genic and genomic data often requires comparative genomics analyses based on whole-genome alignments, sets of homologous genes and other relevant datasets in order to evaluate and answer evolutionary-related questions. However, the complexity and computational requirements of producing such data are substantial: this has led to only a small number of reference resources that are used for most comparative analyses. The Ensembl comparative genomics resources are one such reference set that facilitates comprehensive and reproducible analysis of chordate genome data. Ensembl computes pairwise and multiple whole-genome alignments from which large-scale synteny, per-base conservation scores and constrained elements are obtained. Gene alignments are used to define Ensembl Protein Families, GeneTrees and homologies for both protein-coding and non-coding RNA genes. These resources are updated frequently and have a consistent informatics infrastructure and data presentation across all supported species. Specialized web-based visualizations are also available including synteny displays, collapsible gene tree plots, a gene family locator and different alignment views. The Ensembl comparative genomics infrastructure is extensively reused for the analysis of non-vertebrate species by other projects including Ensembl Genomes and Gramene and much of the information here is relevant to these projects. The consistency of the annotation across species and the focus on vertebrates makes Ensembl an ideal system to perform and support vertebrate comparative genomic analyses. We use robust software and pipelines to produce reference comparative data and make it freely available. Database URL: http://www.ensembl.org. PMID:26896847

  10. Ensembl comparative genomics resources

    PubMed Central

    Muffato, Matthieu; Beal, Kathryn; Fitzgerald, Stephen; Gordon, Leo; Pignatelli, Miguel; Vilella, Albert J.; Searle, Stephen M. J.; Amode, Ridwan; Brent, Simon; Spooner, William; Kulesha, Eugene; Yates, Andrew; Flicek, Paul

    2016-01-01

    Evolution provides the unifying framework with which to understand biology. The coherent investigation of genic and genomic data often requires comparative genomics analyses based on whole-genome alignments, sets of homologous genes and other relevant datasets in order to evaluate and answer evolutionary-related questions. However, the complexity and computational requirements of producing such data are substantial: this has led to only a small number of reference resources that are used for most comparative analyses. The Ensembl comparative genomics resources are one such reference set that facilitates comprehensive and reproducible analysis of chordate genome data. Ensembl computes pairwise and multiple whole-genome alignments from which large-scale synteny, per-base conservation scores and constrained elements are obtained. Gene alignments are used to define Ensembl Protein Families, GeneTrees and homologies for both protein-coding and non-coding RNA genes. These resources are updated frequently and have a consistent informatics infrastructure and data presentation across all supported species. Specialized web-based visualizations are also available including synteny displays, collapsible gene tree plots, a gene family locator and different alignment views. The Ensembl comparative genomics infrastructure is extensively reused for the analysis of non-vertebrate species by other projects including Ensembl Genomes and Gramene and much of the information here is relevant to these projects. The consistency of the annotation across species and the focus on vertebrates makes Ensembl an ideal system to perform and support vertebrate comparative genomic analyses. We use robust software and pipelines to produce reference comparative data and make it freely available. Database URL: http://www.ensembl.org. PMID:26896847

  11. Genome Based Phylogeny and Comparative Genomic Analysis of Intra-Mammary Pathogenic Escherichia coli

    PubMed Central

    Richards, Vincent P.; Lefébure, Tristan; Pavinski Bitar, Paulina D.; Dogan, Belgin; Simpson, Kenneth W.; Schukken, Ynte H.; Stanhope, Michael J.

    2015-01-01

    Escherichia coli is an important cause of bovine mastitis and can cause both severe inflammation with a short-term transient infection, as well as less severe, but more chronic inflammation and infection persistence. E. coli is a highly diverse organism that has been classified into a number of different pathotypes or pathovars, and mammary pathogenic E. coli (MPEC) has been proposed as a new such pathotype. The purpose of this study was to use genome sequence data derived from both transient and persistent MPEC isolates (two isolates of each phenotype) to construct a genome-based phylogeny that places MPEC in its phylogenetic context with other E. coli pathovars. A subsidiary goal was to conduct comparative genomic analyses of these MPEC isolates with other E. coli pathovars to provide a preliminary perspective on loci that might be correlated with the MPEC phenotype. Both concatenated and consensus tree phylogenies did not support MPEC monophyly or the monophyly of either transient or persistent phenotypes. Three of the MPEC isolates (ECA-727, ECC-Z, and ECA-O157) originated from within the predominately commensal clade of E. coli, referred to as phylogroup A. The fourth MPEC isolate, of the persistent phenotype (ECC-1470), was sister group to an isolate of ETEC, falling within the E. coli B1 clade. This suggests that the MPEC phenotype has arisen on numerous independent occasions and that this has often, although not invariably, occurred from commensal ancestry. Examination of the genes present in the MPEC strains relative to the commensal strains identified a consistent presence of the type VI secretion system (T6SS) in the MPEC strains, with only occasional representation in commensal strains, suggesting that T6SS may be associated with MPEC pathogenesis and/or as an inter-bacterial competitive attribute and therefore could represent a useful target to explore for the development of MPEC specific inhibitors. PMID:25807497

  12. Ebolavirus comparative genomics

    DOE PAGESBeta

    Jun, Se-Ran; Leuze, Michael R.; Nookaew, Intawat; Uberbacher, Edward C.; Land, Miriam; Zhang, Qian; Wanchai, Visanu; Chai, Juanjuan; Nielsen, Morten; Trolle, Thomas; et al

    2015-07-14

    The 2014 Ebola outbreak in West Africa is the largest documented for this virus. We examine the dynamics of this genome, comparing more than one hundred currently available ebolavirus genomes to each other and to other viral genomes. Based on oligomer frequency analysis, the family Filoviridae forms a distinct group from all other sequenced viral genomes. All filovirus genomes sequenced to date encode proteins with similar functions and gene order, although there is considerable divergence in sequences between the three genera Ebolavirus, Cuevavirus, and Marburgvirus within the family Filoviridae. Whereas all ebolavirus genomes are quite similar (multiple sequences of themore » same strain are often identical), variation is most common in the intergenic regions and within specific areas of the genes encoding the glycoprotein (GP), nucleoprotein (NP), and polymerase (L). We predict regions that could contain epitope-binding sites, which might be good vaccine targets. In conclusion, this information, combined with glycosylation sites and experimentally determined epitopes, can identify the most promising regions for the development of therapeutic strategies.« less

  13. Identification of Genomic Alterations in Pancreatic Cancer Using Array-Based Comparative Genomic Hybridization

    PubMed Central

    Liang, Jian-Wei; Shi, Zhi-Zhou; Shen, Tian-Yun; Che, Xu; Wang, Zheng; Shi, Su-Sheng; Xu, Xin; Cai, Yan; Zhao, Ping; Wang, Cheng-Feng; Zhou, Zhi-Xiang; Wang, Ming-Rong

    2014-01-01

    Background Genomic aberration is a common feature of human cancers and also is one of the basic mechanisms that lead to overexpression of oncogenes and underexpression of tumor suppressor genes. Our study aims to identify frequent genomic changes in pancreatic cancer. Materials and Methods We used array comparative genomic hybridization (array CGH) to identify recurrent genomic alterations and validated the protein expression of selected genes by immunohistochemistry. Results Sixteen gains and thirty-two losses occurred in more than 30% and 60% of the tumors, respectively. High-level amplifications at 7q21.3–q22.1 and 19q13.2 and homozygous deletions at 1p33–p32.3, 1p22.1, 1q22, 3q27.2, 6p22.3, 6p21.31, 12q13.2, 17p13.2, 17q21.31 and 22q13.1 were identified. Especially, amplification of AKT2 was detected in two carcinomas and homozygous deletion of CDKN2C in other two cases. In 15 independent validation samples, we found that AKT2 (19q13.2) and MCM7 (7q22.1) were amplified in 6 and 9 cases, and CAMTA2 (17p13.2) and PFN1 (17p13.2) were homozygously deleted in 3 and 1 cases. AKT2 and MCM7 were overexpressed, and CAMTA2 and PFN1 were underexpressed in pancreatic cancer tissues than in morphologically normal operative margin tissues. Both GISTIC and Genomic Workbench software identified 22q13.1 containing APOBEC3A and APOBEC3B as the only homozygous deletion region. And the expression levels of APOBEC3A and APOBEC3B were significantly lower in tumor tissues than in morphologically normal operative margin tissues. Further validation showed that overexpression of PSCA was significantly associated with lymph node metastasis, and overexpression of HMGA2 was significantly associated with invasive depth of pancreatic cancer. Conclusion These recurrent genomic changes may be useful for revealing the mechanism of pancreatic carcinogenesis and providing candidate biomarkers. PMID:25502777

  14. Ebolavirus comparative genomics.

    PubMed

    Jun, Se-Ran; Leuze, Michael R; Nookaew, Intawat; Uberbacher, Edward C; Land, Miriam; Zhang, Qian; Wanchai, Visanu; Chai, Juanjuan; Nielsen, Morten; Trolle, Thomas; Lund, Ole; Buzard, Gregory S; Pedersen, Thomas D; Wassenaar, Trudy M; Ussery, David W

    2015-09-01

    The 2014 Ebola outbreak in West Africa is the largest documented for this virus. To examine the dynamics of this genome, we compare more than 100 currently available ebolavirus genomes to each other and to other viral genomes. Based on oligomer frequency analysis, the family Filoviridae forms a distinct group from all other sequenced viral genomes. All filovirus genomes sequenced to date encode proteins with similar functions and gene order, although there is considerable divergence in sequences between the three genera Ebolavirus, Cuevavirus and Marburgvirus within the family Filoviridae. Whereas all ebolavirus genomes are quite similar (multiple sequences of the same strain are often identical), variation is most common in the intergenic regions and within specific areas of the genes encoding the glycoprotein (GP), nucleoprotein (NP) and polymerase (L). We predict regions that could contain epitope-binding sites, which might be good vaccine targets. This information, combined with glycosylation sites and experimentally determined epitopes, can identify the most promising regions for the development of therapeutic strategies.This manuscript has been authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan). PMID:26175035

  15. Ebolavirus comparative genomics

    PubMed Central

    Jun, Se-Ran; Leuze, Michael R.; Nookaew, Intawat; Uberbacher, Edward C.; Land, Miriam; Zhang, Qian; Wanchai, Visanu; Chai, Juanjuan; Nielsen, Morten; Trolle, Thomas; Lund, Ole; Buzard, Gregory S.; Pedersen, Thomas D.; Wassenaar, Trudy M.; Ussery, David W.

    2015-01-01

    The 2014 Ebola outbreak in West Africa is the largest documented for this virus. To examine the dynamics of this genome, we compare more than 100 currently available ebolavirus genomes to each other and to other viral genomes. Based on oligomer frequency analysis, the family Filoviridae forms a distinct group from all other sequenced viral genomes. All filovirus genomes sequenced to date encode proteins with similar functions and gene order, although there is considerable divergence in sequences between the three genera Ebolavirus, Cuevavirus and Marburgvirus within the family Filoviridae. Whereas all ebolavirus genomes are quite similar (multiple sequences of the same strain are often identical), variation is most common in the intergenic regions and within specific areas of the genes encoding the glycoprotein (GP), nucleoprotein (NP) and polymerase (L). We predict regions that could contain epitope-binding sites, which might be good vaccine targets. This information, combined with glycosylation sites and experimentally determined epitopes, can identify the most promising regions for the development of therapeutic strategies. This manuscript has been authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan). PMID:26175035

  16. Genome-Based Comparative Analyses of Antarctic and Temperate Species of Paenibacillus

    PubMed Central

    Dsouza, Melissa; Taylor, Michael W.; Turner, Susan J.; Aislabie, Jackie

    2014-01-01

    Antarctic soils represent a unique environment characterised by extremes of temperature, salinity, elevated UV radiation, low nutrient and low water content. Despite the harshness of this environment, members of 15 bacterial phyla have been identified in soils of the Ross Sea Region (RSR). However, the survival mechanisms and ecological roles of these phyla are largely unknown. The aim of this study was to investigate whether strains of Paenibacillus darwinianus owe their resilience to substantial genomic changes. For this, genome-based comparative analyses were performed on three P. darwinianus strains, isolated from gamma-irradiated RSR soils, together with nine temperate, soil-dwelling Paenibacillus spp. The genome of each strain was sequenced to over 1,000-fold coverage, then assembled into contigs totalling approximately 3 Mbp per genome. Based on the occurrence of essential, single-copy genes, genome completeness was estimated at approximately 88%. Genome analysis revealed between 3,043–3,091 protein-coding sequences (CDSs), primarily associated with two-component systems, sigma factors, transporters, sporulation and genes induced by cold-shock, oxidative and osmotic stresses. These comparative analyses provide an insight into the metabolic potential of P. darwinianus, revealing potential adaptive mechanisms for survival in Antarctic soils. However, a large proportion of these mechanisms were also identified in temperate Paenibacillus spp., suggesting that these mechanisms are beneficial for growth and survival in a range of soil environments. These analyses have also revealed that the P. darwinianus genomes contain significantly fewer CDSs and have a lower paralogous content. Notwithstanding the incompleteness of the assemblies, the large differences in genome sizes, determined by the number of genes in paralogous clusters and the CDS content, are indicative of genome content scaling. Finally, these sequences are a resource for further investigations into

  17. A genome-wide analysis of array-based comparative genomic hybridization (CGH) data to detect intra-species variations and evolutionary relationships.

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Array-based comparative genomics hybridization (CGH) has gained prevalence as a technique of choice for the detection of structural variations in the genome. In this study, we propose a novel genome-wide method of classification using CGH data, in order to reveal putative phylogenetic relationships ...

  18. An HMM-Based Comparative Genomic Framework for Detecting Introgression in Eukaryotes

    PubMed Central

    Liu, Kevin J.; Dai, Jingxuan; Truong, Kathy; Song, Ying; Kohn, Michael H.; Nakhleh, Luay

    2014-01-01

    One outcome of interspecific hybridization and subsequent effects of evolutionary forces is introgression, which is the integration of genetic material from one species into the genome of an individual in another species. The evolution of several groups of eukaryotic species has involved hybridization, and cases of adaptation through introgression have been already established. In this work, we report on PhyloNet-HMM—a new comparative genomic framework for detecting introgression in genomes. PhyloNet-HMM combines phylogenetic networks with hidden Markov models (HMMs) to simultaneously capture the (potentially reticulate) evolutionary history of the genomes and dependencies within genomes. A novel aspect of our work is that it also accounts for incomplete lineage sorting and dependence across loci. Application of our model to variation data from chromosome 7 in the mouse (Mus musculus domesticus) genome detected a recently reported adaptive introgression event involving the rodent poison resistance gene Vkorc1, in addition to other newly detected introgressed genomic regions. Based on our analysis, it is estimated that about 9% of all sites within chromosome 7 are of introgressive origin (these cover about 13 Mbp of chromosome 7, and over 300 genes). Further, our model detected no introgression in a negative control data set. We also found that our model accurately detected introgression and other evolutionary processes from synthetic data sets simulated under the coalescent model with recombination, isolation, and migration. Our work provides a powerful framework for systematic analysis of introgression while simultaneously accounting for dependence across sites, point mutations, recombination, and ancestral polymorphism. PMID:24922281

  19. Array-Based Comparative Genomic Hybridization for the Genomewide Detection of Submicroscopic Chromosomal Abnormalities

    PubMed Central

    Vissers, Lisenka E. L. M. ; de Vries, Bert B. A. ; Osoegawa, Kazutoyo ; Janssen, Irene M. ; Feuth, Ton ; Choy, Chik On ; Straatman, Huub ; van der Vliet, Walter ; Huys, Erik H. L. P. G. ; van Rijk, Anke ; Smeets, Dominique ; van Ravenswaaij-Arts, Conny M. A. ; Knoers, Nine V. ; van der Burgt, Ineke ; de Jong, Pieter J. ; Brunner, Han G. ; van Kessel, Ad Geurts ; Schoenmakers, Eric F. P. M. ; Veltman, Joris A. 

    2003-01-01

    Microdeletions and microduplications, not visible by routine chromosome analysis, are a major cause of human malformation and mental retardation. Novel high-resolution, whole-genome technologies can improve the diagnostic detection rate of these small chromosomal abnormalities. Array-based comparative genomic hybridization allows such a high-resolution screening by hybridizing differentially labeled test and reference DNAs to arrays consisting of thousands of genomic clones. In this study, we tested the diagnostic capacity of this technology using ∼3,500 flourescent in situ hybridization–verified clones selected to cover the genome with an average of 1 clone per megabase (Mb). The sensitivity and specificity of the technology were tested in normal-versus-normal control experiments and through the screening of patients with known microdeletion syndromes. Subsequently, a series of 20 cytogenetically normal patients with mental retardation and dysmorphisms suggestive of a chromosomal abnormality were analyzed. In this series, three microdeletions and two microduplications were identified and validated. Two of these genomic changes were identified also in one of the parents, indicating that these are large-scale genomic polymorphisms. Deletions and duplications as small as 1 Mb could be reliably detected by our approach. The percentage of false-positive results was reduced to a minimum by use of a dye-swap-replicate analysis, all but eliminating the need for laborious validation experiments and facilitating implementation in a routine diagnostic setting. This high-resolution assay will facilitate the identification of novel genes involved in human mental retardation and/or malformation syndromes and will provide insight into the flexibility and plasticity of the human genome. PMID:14628292

  20. coliBASE: an online database for Escherichia coli, Shigella and Salmonella comparative genomics

    PubMed Central

    Chaudhuri, Roy R.; Khan, Arshad M.; Pallen, Mark J.

    2004-01-01

    We have constructed coliBASE, a database for Escherichia coli, Shigella and Salmonella comparative genomics available online at http://colibase.bham.ac.uk. Unlike other E.coli databases, which focus on the laboratory model strain K12, coliBASE is intended to reflect the full diversity of E.coli and its relatives. The database contains comparative data including whole genome alignments and lists of putative orthologous genes, together with numerous analytical tools and links to existing online resources. The data are stored in a relational database, accessible by a number of user-friendly search methods and graphical browsers. The database schema is generic and can easily be applied to other bacterial genomes. Two such databases, CampyDB (for the analysis of Campylobacter spp.) and ClostriDB (for Clostridium spp.) are also available at http://campy.bham.ac.uk and http://clostri.bham.ac.uk, respectively. An example of the power of E.coli comparative analyses such as those available through coliBASE is presented. PMID:14681417

  1. Somatic alterations in the melanoma genome: a high-resolution array-based comparative genomic hybridization study.

    PubMed

    Gast, Andreas; Scherer, Dominique; Chen, Bowang; Bloethner, Sandra; Melchert, Stephanie; Sucker, Antje; Hemminki, Kari; Schadendorf, Dirk; Kumar, Rajiv

    2010-08-01

    We performed DNA microarray-based comparative genomic hybridization to identify somatic alterations specific to melanoma genome in 60 human cell lines from metastasized melanoma and from 44 corresponding peripheral blood mononuclear cells. Our data showed gross but nonrandom somatic changes specific to the tumor genome. Although the CDKN2A (78%) and PTEN (70%) loci were the major targets of mono-allelic and bi-allelic deletions, amplifications affected loci with BRAF (53%) and NRAS (12%) as well as EGFR (52%), MITF (40%), NOTCH2 (35%), CCND1 (18%), MDM2 (18%), CCNE1 (10%), and CDK4 (8%). The amplified loci carried additional genes, many of which could potentially play a role in melanoma. Distinct patterns of copy number changes showed that alterations in CDKN2A tended to be more clustered in cell lines with mutations in the BRAF and NRAS genes; the PTEN locus was targeted mainly in conjunction with BRAF mutations. Amplification of CCND1, CDK4, and other loci was significantly increased in cell lines without BRAF-NRAS mutations and so was the loss of chromosome arms 13q and 16q. Our data suggest involvement of distinct genetic pathways that are driven either through oncogenic BRAF and NRAS mutations complemented by aberrations in the CDKN2A and PTEN genes or involve amplification of oncogenic genomic loci and loss of 13q and 16q. It also emerges that each tumor besides being affected by major and most common somatic genetic alterations also acquires additional genetic alterations that could be crucial in determining response to small molecular inhibitors that are being currently pursued. PMID:20544847

  2. UniPrimer: A Web-Based Primer Design Tool for Comparative Analyses of Primate Genomes.

    PubMed

    Batnyam, Nomin; Lee, Jimin; Lee, Jungnam; Hong, Seung Bok; Oh, Sejong; Han, Kyudong

    2012-01-01

    Whole genome sequences of various primates have been released due to advanced DNA-sequencing technology. A combination of computational data mining and the polymerase chain reaction (PCR) assay to validate the data is an excellent method for conducting comparative genomics. Thus, designing primers for PCR is an essential procedure for a comparative analysis of primate genomes. Here, we developed and introduced UniPrimer for use in those studies. UniPrimer is a web-based tool that designs PCR- and DNA-sequencing primers. It compares the sequences from six different primates (human, chimpanzee, gorilla, orangutan, gibbon, and rhesus macaque) and designs primers on the conserved region across species. UniPrimer is linked to RepeatMasker, Primer3Plus, and OligoCalc softwares to produce primers with high accuracy and UCSC In-Silico PCR to confirm whether the designed primers work. To test the performance of UniPrimer, we designed primers on sample sequences using UniPrimer and manually designed primers for the same sequences. The comparison of the two processes showed that UniPrimer was more effective than manual work in terms of saving time and reducing errors. PMID:22693428

  3. Microarray-based comparative genomic hybridisation of breast cancer patients receiving neoadjuvant chemotherapy

    PubMed Central

    Pierga, J-Y; Reis-Filho, J S; Cleator, S J; Dexter, T; MacKay, A; Simpson, P; Fenwick, K; Iravani, M; Salter, J; Hills, M; Jones, C; Ashworth, A; Smith, I E; Powles, T; Dowsett, M

    2006-01-01

    We analysed the molecular genetic profiles of breast cancer samples before and after neoadjuvant chemotherapy with combination doxorubicin and cyclophosphamide (AC). DNA was obtained from microdissected frozen breast core biopsies from 44 patients before chemotherapy. Additional samples were obtained before the second course of chemotherapy (D21) and after the completion of the treatment (surgical specimens) in 17 and 21 patients, respectively. Microarray-based comparative genome hybridisation was performed using a platform containing ∼5800 bacterial artificial chromosome clones (genome-wide resolution: 0.9 Mb). Analysis of the 44 pretreatment biopsies revealed that losses of 4p, 4q, 5q, 12q13.11–12q13.12, 17p11.2 and 17q11.2; and gains of 1p, 2p, 7q, 9p, 11q, 19p and 19q were significantly associated with oestrogen receptor negativity. 16q21–q22.1 losses were associated with lobular and 8q24 gains with ductal types. Losses of 5q33.3–q4 and 18p11.31 and gains of 6p25.1–p25.2 and Xp11.4 were associated with HER2 amplification. No correlations between DNA copy number changes and clinical response to AC were found. Microarray-based comparative genome hybridisation analysis of matched pretreatment and D21 biopsies failed to identify statistically significant differences, whereas a comparison between matched pretreatment and surgical samples revealed a statistically significant acquired copy number gain on 11p15.2–11p15.5. The modest chemotherapy-driven genomic changes, despite profound loss of cell numbers, suggest that there is little therapeutic selection of resistant non-modal cell lineages. PMID:17133270

  4. Identification of Functional Differences in Metabolic Networks Using Comparative Genomics and Constraint-Based Models

    PubMed Central

    Hamilton, Joshua J.; Reed, Jennifer L.

    2012-01-01

    Genome-scale network reconstructions are useful tools for understanding cellular metabolism, and comparisons of such reconstructions can provide insight into metabolic differences between organisms. Recent efforts toward comparing genome-scale models have focused primarily on aligning metabolic networks at the reaction level and then looking at differences and similarities in reaction and gene content. However, these reaction comparison approaches are time-consuming and do not identify the effect network differences have on the functional states of the network. We have developed a bilevel mixed-integer programming approach, CONGA, to identify functional differences between metabolic networks by comparing network reconstructions aligned at the gene level. We first identify orthologous genes across two reconstructions and then use CONGA to identify conditions under which differences in gene content give rise to differences in metabolic capabilities. By seeking genes whose deletion in one or both models disproportionately changes flux through a selected reaction (e.g., growth or by-product secretion) in one model over another, we are able to identify structural metabolic network differences enabling unique metabolic capabilities. Using CONGA, we explore functional differences between two metabolic reconstructions of Escherichia coli and identify a set of reactions responsible for chemical production differences between the two models. We also use this approach to aid in the development of a genome-scale model of Synechococcus sp. PCC 7002. Finally, we propose potential antimicrobial targets in Mycobacterium tuberculosis and Staphylococcus aureus based on differences in their metabolic capabilities. Through these examples, we demonstrate that a gene-centric approach to comparing metabolic networks allows for a rapid comparison of metabolic models at a functional level. Using CONGA, we can identify differences in reaction and gene content which give rise to different

  5. Phytozome Comparative Plant Genomics Portal

    SciTech Connect

    Goodstein, David; Batra, Sajeev; Carlson, Joseph; Hayes, Richard; Phillips, Jeremy; Shu, Shengqiang; Schmutz, Jeremy; Rokhsar, Daniel

    2014-09-09

    The Dept. of Energy Joint Genome Institute is a genomics user facility supporting DOE mission science in the areas of Bioenergy, Carbon Cycling, and Biogeochemistry. The Plant Program at the JGI applies genomic, analytical, computational and informatics platforms and methods to: 1. Understand and accelerate the improvement (domestication) of bioenergy crops 2. Characterize and moderate plant response to climate change 3. Use comparative genomics to identify constrained elements and infer gene function 4. Build high quality genomic resource platforms of JGI Plant Flagship genomes for functional and experimental work 5. Expand functional genomic resources for Plant Flagship genomes

  6. Enhancer Identification through Comparative Genomics

    SciTech Connect

    Visel, Axel; Bristow, James; Pennacchio, Len A.

    2006-10-01

    With the availability of genomic sequence from numerousvertebrates, a paradigm shift has occurred in the identification ofdistant-acting gene regulatory elements. In contrast to traditionalgene-centric studies in which investigators randomly scanned genomicfragments that flank genes of interest in functional assays, the modernapproach begins electronically with publicly available comparativesequence datasets that provide investigators with prioritized lists ofputative functional sequences based on their evolutionary conservation.However, although a large number of tools and resources are nowavailable, application of comparative genomic approaches remains far fromtrivial. In particular, it requires users to dynamically consider thespecies and methods for comparison depending on the specific biologicalquestion under investigation. While there is currently no single generalrule to this end, it is clear that when applied appropriately,comparative genomic approaches exponentially increase our power ingenerating biological hypotheses for subsequent experimentaltesting.

  7. COMPARATIVE GENOMICS IN LEGUMES

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The legume plant family will soon include three sequenced genomes. The majority of the gene-containing portions of the model legumes Medicago truncatula and Lotus japonicus have been sequenced in clone-by-clone projects, and the sequencing of the soybean genome is underway in a whole-genome shotgun ...

  8. The role of duplications in the evolution of genomes highlights the need for evolutionary-based approaches in comparative genomics

    PubMed Central

    2011-01-01

    Understanding the evolutionary plasticity of the genome requires a global, comparative approach in which genetic events are considered both in a phylogenetic framework and with regard to population genetics and environmental variables. In the mechanisms that generate adaptive and non-adaptive changes in genomes, segmental duplications (duplication of individual genes or genomic regions) and polyploidization (whole genome duplications) are well-known driving forces. The probability of fixation and maintenance of duplicates depends on many variables, including population sizes and selection regimes experienced by the corresponding genes: a combination of stochastic and adaptive mechanisms has shaped all genomes. A survey of experimental work shows that the distinction made between fixation and maintenance of duplicates still needs to be conceptualized and mathematically modeled. Here we review the mechanisms that increase or decrease the probability of fixation or maintenance of duplicated genes, and examine the outcome of these events on the adaptation of the organisms. Reviewers This article was reviewed by Dr. Etienne Joly, Dr. Lutz Walter and Dr. W. Ford Doolittle. PMID:21333002

  9. Exome sequencing and array-based comparative genomic hybridisation analysis of preferential 6-methylmercaptopurine producers.

    PubMed

    Chua, E W; Cree, S; Barclay, M L; Doudney, K; Lehnert, K; Aitchison, A; Kennedy, M A

    2015-10-01

    Preferential conversion of azathioprine or 6-mercaptopurine into methylated metabolites is a major cause of thiopurine resistance. To seek potentially Mendelian causes of thiopurine hypermethylation, we recruited 12 individuals who exhibited extreme therapeutic resistance while taking azathioprine or 6-mercaptopurine and performed whole-exome sequencing (WES) and copy-number variant analysis by array-based comparative genomic hybridisation (aCGH). Exome-wide variant filtering highlighted four genes potentially associated with thiopurine metabolism (ENOSF1 and NFS1), transport (SLC17A4) or therapeutic action (RCC2). However, variants of each gene were found only in two or three patients, and it is unclear whether these genes could influence thiopurine hypermethylation. Analysis by aCGH did not identify any unusual or pathogenic copy-number variants. This suggests that if causative mutations for the hypermethylation phenotype exist they may be heterogeneous, occurring in several different genes, or they may lie within regulatory regions not captured by WES. Alternatively, hypermethylation may arise from the involvement of multiple genes with small effects. To test this hypothesis would require recruitment of large patient samples and application of genome-wide association studies. PMID:25752523

  10. Comparative genomics: methods and applications

    NASA Astrophysics Data System (ADS)

    Haubold, Bernhard; Wiehe, Thomas

    2004-09-01

    Interpreting the functional content of a given genomic sequence is one of the central challenges of biology today. Perhaps the most promising approach to this problem is based on the comparative method of classic biology in the modern guise of sequence comparison. For instance, protein-coding regions tend to be conserved between species. Hence, a simple method for distinguishing a functional exon from the chance absence of stop codons is to investigate its homologue from closely related species. Predicting regulatory elements is even more difficult than exon prediction, but again, comparisons pinpointing conserved sequence motifs upstream of translation start sites are helping to unravel gene regulatory networks. In addition to interspecific studies, intraspecific sequence comparison yields insights into the evolutionary forces that have acted on a species in the past. Of particular interest here is the identification of selection events such as selective sweeps. Both intra- and interspecific sequence comparisons are based on a variety of computational methods, including alignment, phylogenetic reconstruction, and coalescent theory. This article surveys the biology and the central computational ideas applied in recent comparative genomics projects. We argue that the most fruitful method of understanding the functional content of genomes is to study them in the context of related genomic sequences. In particular, such a study may reveal selection, a fundamental pointer to biological relevance.

  11. Enhancer Identification through Comparative Genomics

    PubMed Central

    Visel, Axel; Bristow, James; Pennacchio, Len A.

    2007-01-01

    With the availability of genomic sequence from numerous vertebrates, a paradigm shift has occurred in the identification of distant-acting gene regulatory elements. In contrast to traditional gene-centric studies in which investigators randomly scanned genomic fragments that flank genes of interest in functional assays, the modern approach begins electronically with publicly available comparative sequence datasets that provide investigators with prioritized lists of putative functional sequences based on their evolutionary conservation. However, although a large number of tools and resources are now available, application of comparative genomic approaches remains far from trivial. In particular, it requires users to dynamically consider the species and methods for comparison depending on the specific biological question under investigation. While there is currently no single general rule to this end, it is clear that when applied appropriately, comparative genomic approaches exponentially increase our power in generating biological hypotheses for subsequent experimental testing. It is anticipated that cardiac-related genes and the identification of their distant-acting transcriptional enhancers are particularly poised to benefit from these modern capabilities. PMID:17276707

  12. Next Generation Semiconductor Based Sequencing of the Donkey (Equus asinus) Genome Provided Comparative Sequence Data against the Horse Genome and a Few Millions of Single Nucleotide Polymorphisms.

    PubMed

    Bertolini, Francesca; Scimone, Concetta; Geraci, Claudia; Schiavo, Giuseppina; Utzeri, Valerio Joe; Chiofalo, Vincenzo; Fontanesi, Luca

    2015-01-01

    Few studies investigated the donkey (Equus asinus) at the whole genome level so far. Here, we sequenced the genome of two male donkeys using a next generation semiconductor based sequencing platform (the Ion Proton sequencer) and compared obtained sequence information with the available donkey draft genome (and its Illumina reads from which it was originated) and with the EquCab2.0 assembly of the horse genome. Moreover, the Ion Torrent Personal Genome Analyzer was used to sequence reduced representation libraries (RRL) obtained from a DNA pool including donkeys of different breeds (Grigio Siciliano, Ragusano and Martina Franca). The number of next generation sequencing reads aligned with the EquCab2.0 horse genome was larger than those aligned with the draft donkey genome. This was due to the larger N50 for contigs and scaffolds of the horse genome. Nucleotide divergence between E. caballus and E. asinus was estimated to be ~ 0.52-0.57%. Regions with low nucleotide divergence were identified in several autosomal chromosomes and in the whole chromosome X. These regions might be evolutionally important in equids. Comparing Y-chromosome regions we identified variants that could be useful to track donkey paternal lineages. Moreover, about 4.8 million of single nucleotide polymorphisms (SNPs) in the donkey genome were identified and annotated combining sequencing data from Ion Proton (whole genome sequencing) and Ion Torrent (RRL) runs with Illumina reads. A higher density of SNPs was present in regions homologous to horse chromosome 12, in which several studies reported a high frequency of copy number variants. The SNPs we identified constitute a first resource useful to describe variability at the population genomic level in E. asinus and to establish monitoring systems for the conservation of donkey genetic resources. PMID:26151450

  13. Next Generation Semiconductor Based Sequencing of the Donkey (Equus asinus) Genome Provided Comparative Sequence Data against the Horse Genome and a Few Millions of Single Nucleotide Polymorphisms

    PubMed Central

    Bertolini, Francesca; Scimone, Concetta; Geraci, Claudia; Schiavo, Giuseppina; Utzeri, Valerio Joe; Chiofalo, Vincenzo; Fontanesi, Luca

    2015-01-01

    Few studies investigated the donkey (Equus asinus) at the whole genome level so far. Here, we sequenced the genome of two male donkeys using a next generation semiconductor based sequencing platform (the Ion Proton sequencer) and compared obtained sequence information with the available donkey draft genome (and its Illumina reads from which it was originated) and with the EquCab2.0 assembly of the horse genome. Moreover, the Ion Torrent Personal Genome Analyzer was used to sequence reduced representation libraries (RRL) obtained from a DNA pool including donkeys of different breeds (Grigio Siciliano, Ragusano and Martina Franca). The number of next generation sequencing reads aligned with the EquCab2.0 horse genome was larger than those aligned with the draft donkey genome. This was due to the larger N50 for contigs and scaffolds of the horse genome. Nucleotide divergence between E. caballus and E. asinus was estimated to be ~ 0.52-0.57%. Regions with low nucleotide divergence were identified in several autosomal chromosomes and in the whole chromosome X. These regions might be evolutionally important in equids. Comparing Y-chromosome regions we identified variants that could be useful to track donkey paternal lineages. Moreover, about 4.8 million of single nucleotide polymorphisms (SNPs) in the donkey genome were identified and annotated combining sequencing data from Ion Proton (whole genome sequencing) and Ion Torrent (RRL) runs with Illumina reads. A higher density of SNPs was present in regions homologous to horse chromosome 12, in which several studies reported a high frequency of copy number variants. The SNPs we identified constitute a first resource useful to describe variability at the population genomic level in E. asinus and to establish monitoring systems for the conservation of donkey genetic resources. PMID:26151450

  14. Using comparative genomics to uncover new kinds of protein-based metabolic organelles in bacteria

    PubMed Central

    Jorda, Julien; Lopez, David; Wheatley, Nicole M; Yeates, Todd O

    2013-01-01

    Bacterial microcompartment (MCP) organelles are cytosolic, polyhedral structures consisting of a thin protein shell and a series of encapsulated, sequentially acting enzymes. To date, different microcompartments carrying out three distinct types of metabolic processes have been characterized experimentally in various bacteria. In the present work, we use comparative genomics to explore the existence of yet uncharacterized microcompartments encapsulating a broader set of metabolic pathways. A clustering approach was used to group together enzymes that show a strong tendency to be encoded in chromosomal proximity to each other while also being near genes for microcompartment shell proteins. The results uncover new types of putative microcompartments, including one that appears to encapsulate B12-independent, glycyl radical-based degradation of 1,2-propanediol, and another potentially involved in amino alcohol metabolism in mycobacteria. Preliminary experiments show that an unusual shell protein encoded within the glycyl radical-based microcompartment binds an iron-sulfur cluster, hinting at complex mechanisms in this uncharacterized system. In addition, an examination of the computed microcompartment clusters suggests the existence of specific functional variations within certain types of MCPs, including the alpha carboxysome and the glycyl radical-based microcompartment. The findings lead to a deeper understanding of bacterial microcompartments and the pathways they sequester. PMID:23188745

  15. Cytogenetic analysis of myxoid liposarcoma and myxofibrosarcoma by array‐based comparative genomic hybridisation

    PubMed Central

    Ohguri, T; Hisaoka, M; Kawauchi, S; Sasaki, K; Aoki, T; Kanemitsu, S; Matsuyama, A; Korogi, Y; Hashimoto, H

    2006-01-01

    Aim To investigate overall chromosomal alterations using array‐based comparative genomic hybridisation (CGH) of myxoid liposarcomas (MLSs) and myxofibrosarcomas (MFSs). Materials and methods Genomic DNA extracted from fresh‐frozen tumour tissues was labelled with fluorochromes and then hybridised on to an array consisting of 1440 bacterial artificial chromosome clones representing regions throughout the entire human genome important in cytogenetics and oncology. Results DNA copy number aberrations (CNAs) were found in all the 8 MFSs, but no alterations were found in 7 (70%) of 10 MLSs. In MFSs, the most frequent CNAs were gains at 7p21.1–p22.1 and 12q15–q21.1 and a loss at 13q14.3–q34. The second most frequent CNAs were gains at 7q33–q35, 9q22.31–q22.33, 12p13.32–pter, 17q22–q23, Xp11.2 and Xq12 and losses at 10p13–p14, 10q25, 11p11–p14, 11q23.3–q25, 20p11–p12 and 21q22.13–q22.2, which were detected in 38% of the MFSs examined. In MLSs, only a few CNAs were found in two sarcomas with gains at 8p21.2–p23.3, 8q11.22–q12.2 and 8q23.1–q24.3, and in one with gains at 5p13.2–p14.3 and 5q11.2–5q35.2 and a loss at 21q22.2–qter. Conclusions MFS has more frequent and diverse CNAs than MLS, which reinforces the hypothesis that MFS is genetically different from MLS. Out‐array CGH analysis may also provide several entry points for the identification of candidate genes associated with oncogenesis and progression in MFS. PMID:16751306

  16. Comparative genomics of Brassicaceae crops

    PubMed Central

    Sharma, Ashutosh; Li, Xiaonan; Lim, Yong Pyo

    2014-01-01

    The family Brassicaceae is one of the major groups of the plant kingdom and comprises diverse species of great economic, agronomic and scientific importance, including the model plant Arabidopsis. The sequencing of the Arabidopsis genome has revolutionized our knowledge in the field of plant biology and provides a foundation in genomics and comparative biology. Genomic resources have been utilized in Brassica for diversity analyses, construction of genetic maps and identification of agronomic traits. In Brassicaceae, comparative sequence analysis across the species has been utilized to understand genome structure, evolution and the detection of conserved genomic segments. In this review, we focus on the progress made in genetic resource development, genome sequencing and comparative mapping in Brassica and related species. The utilization of genomic resources and next-generation sequencing approaches in improvement of Brassica crops is also discussed. PMID:24987286

  17. Comparative genomics of Brassicaceae crops.

    PubMed

    Sharma, Ashutosh; Li, Xiaonan; Lim, Yong Pyo

    2014-05-01

    The family Brassicaceae is one of the major groups of the plant kingdom and comprises diverse species of great economic, agronomic and scientific importance, including the model plant Arabidopsis. The sequencing of the Arabidopsis genome has revolutionized our knowledge in the field of plant biology and provides a foundation in genomics and comparative biology. Genomic resources have been utilized in Brassica for diversity analyses, construction of genetic maps and identification of agronomic traits. In Brassicaceae, comparative sequence analysis across the species has been utilized to understand genome structure, evolution and the detection of conserved genomic segments. In this review, we focus on the progress made in genetic resource development, genome sequencing and comparative mapping in Brassica and related species. The utilization of genomic resources and next-generation sequencing approaches in improvement of Brassica crops is also discussed. PMID:24987286

  18. Pyrosequencing-Based Comparative Genome Analysis of Vibrio vulnificus Environmental Isolates

    PubMed Central

    Morrison, Shatavia S.; Williams, Tiffany; Cain, Aurora; Froelich, Brett; Taylor, Casey; Baker-Austin, Craig; Verner-Jeffreys, David; Hartnell, Rachel; Oliver, James D.; Gibas, Cynthia J.

    2012-01-01

    Between 1996 and 2006, the US Centers for Disease Control reported that the only category of food-borne infections increasing in frequency were those caused by members of the genus Vibrio. The Gram-negative bacterium Vibrio vulnificus is a ubiquitous inhabitant of estuarine waters, and is the number one cause of seafood-related deaths in the US. Many V. vulnificus isolates have been studied, and it has been shown that two genetically distinct subtypes, distinguished by 16S rDNA and other gene polymorphisms, are associated predominantly with either environmental or clinical isolation. While local genetic differences between the subtypes have been probed, only the genomes of clinical isolates have so far been completely sequenced. In order to better understand V. vulnificus as an agent of disease and to identify the molecular components of its virulence mechanisms, we have completed whole genome shotgun sequencing of three diverse environmental genotypes using a pyrosequencing approach. V. vulnificus strain JY1305 was sequenced to a depth of 33×, and strains E64MW and JY1701 were sequenced to lesser depth, covering approximately 99.9% of each genome. We have performed a comparative analysis of these sequences against the previously published sequences of three V. vulnificus clinical isolates. We find that the genome of V. vulnificus is dynamic, with 1.27% of genes in the C-genotype genomes not found in the E- genotype genomes. We identified key genes that differentiate between the genomes of the clinical and environmental genotypes. 167 genes were found to be specifically associated with environmental genotypes and 278 genes with clinical genotypes. Genes specific to the clinical strains include components of sialic acid catabolism, mannitol fermentation, and a component of a Type IV secretory pathway VirB4, as well as several other genes with potential significance for human virulence. Genes specific to environmental strains included several that may have

  19. Application of Array-Based Comparative Genomic Hybridization to Pediatric Neurologic Diseases

    PubMed Central

    Byeon, Jung Hye; Shin, Eunsim; Kim, Gun-Ha; Lee, Kyungok; Hong, Young Sook; Lee, Joo Won

    2014-01-01

    Purpose Array comparative genomic hybridization (array-CGH) is a technique used to analyze quantitative increase or decrease of chromosomes by competitive DNA hybridization of patients and controls. This study aimed to evaluate the benefits and yield of array-CGH in comparison with conventional karyotyping in pediatric neurology patients. Materials and Methods We included 87 patients from the pediatric neurology clinic with at least one of the following features: developmental delay, mental retardation, dysmorphic face, or epilepsy. DNA extracted from patients and controls was hybridized on the Roche NimbleGen 135K oligonucleotide array and compared with G-band karyotyping. The results were analyzed with findings reported in recent publications and internet databases. Results Chromosome imbalances, including 9 cases detected also by G-band karyotyping, were found in 28 patients (32.2%), and at least 19 of them seemed to be causally related to the abnormal phenotypes. Regarding each clinical symptom, 26.2% of 42 developmental delay patients, 44.4% of 18 mental retardation patients, 42.9% of 28 dysmorphic face patients, and 34.6% of 26 epilepsy patients showed abnormal array results. Conclusion Although there were relatively small number of tests in patients with pediatric neurologic disease, this study demonstrated that array-CGH is a very useful tool for clinical diagnosis of unknown genome abnormalities performed in pediatric neurology clinics. PMID:24339284

  20. Modeling bacterial evolution with comparative-genome-based marker systems: application to Mycobacterium tuberculosis evolution and pathogenesis.

    PubMed

    Alland, David; Whittam, Thomas S; Murray, Megan B; Cave, M Donald; Hazbon, Manzour H; Dix, Kim; Kokoris, Mark; Duesterhoeft, Andreas; Eisen, Jonathan A; Fraser, Claire M; Fleischmann, Robert D

    2003-06-01

    The comparative-genomic sequencing of two Mycobacterium tuberculosis strains enabled us to identify single nucleotide polymorphism (SNP) markers for studies of evolution, pathogenesis, and epidemiology in clinical M. tuberculosis. Phylogenetic analysis using these "comparative-genome markers" (CGMs) produced a highly unusual phylogeny with a complete absence of secondary branches. To investigate CGM-based phylogenies, we devised computer models to simulate sequence evolution and calculate new phylogenies based on an SNP format. We found that CGMs represent a distinct class of phylogenetic markers that depend critically on the genetic distances between compared "reference strains." Properly distanced reference strains generate CGMs that accurately depict evolutionary relationships, distorted only by branch collapse. Improperly distanced reference strains generate CGMs that distort and reroot outgroups. Applying this understanding to the CGM-based phylogeny of M. tuberculosis, we found evidence to suggest that this species is highly clonal without detectable lateral gene exchange. We noted indications of evolutionary bottlenecks, including one at the level of the PHRI "C" strain previously associated with particular virulence characteristics. Our evidence also suggests that loss of IS6110 to fewer than seven elements per genome is uncommon. Finally, we present population-based evidence that KasA, an important component of mycolic acid biosynthesis, develops G312S polymorphisms under selective pressure. PMID:12754238

  1. Comparative genomics for biodiversity conservation

    PubMed Central

    Grueber, Catherine E.

    2015-01-01

    Genomic approaches are gathering momentum in biology and emerging opportunities lie in the creative use of comparative molecular methods for revealing the processes that influence diversity of wildlife. However, few comparative genomic studies are performed with explicit and specific objectives to aid conservation of wild populations. Here I provide a brief overview of comparative genomic approaches that offer specific benefits to biodiversity conservation. Because conservation examples are few, I draw on research from other areas to demonstrate how comparing genomic data across taxa may be used to inform the characterisation of conservation units and studies of hybridisation, as well as studies that provide conservation outcomes from a better understanding of the drivers of divergence. A comparative approach can also provide valuable insight into the threatening processes that impact rare species, such as emerging diseases and their management in conservation. In addition to these opportunities, I note areas where additional research is warranted. Overall, comparing and contrasting the genomic composition of threatened and other species provide several useful tools for helping to preserve the molecular biodiversity of the global ecosystem. PMID:26106461

  2. Full Genome Sequence-Based Comparative Study of Wild-Type and Vaccine Strains of Infectious Laryngotracheitis Virus from Italy

    PubMed Central

    Niero, Giulia; Moreno, Ana; Massi, Paola; Franchin, Elisa; Toppo, Stefano; Salata, Cristiano; Palù, Giorgio

    2016-01-01

    Infectious laryngotracheitis (ILT) is an acute and highly contagious respiratory disease of chickens caused by an alphaherpesvirus, infectious laryngotracheitis virus (ILTV). Recently, full genome sequences of wild-type and vaccine strains have been determined worldwide, but none was from Europe. The aim of this study was to determine and analyse the complete genome sequences of five ILTV strains. Sequences were also compared to reveal the similarity of strains across time and to discriminate between wild-type and vaccine strains. Genomes of three ILTV field isolates from outbreaks occurred in Italy in 1980, 2007 and 2011, and two commercial chicken embryo origin (CEO) vaccines were sequenced using the 454 Life Sciences technology. The comparison with the Serva genome showed that 35 open reading frames (ORFs) differed across the five genomes. Overall, 54 single nucleotide polymorphisms (SNPs) and 27 amino acid differences in 19 ORFs and two insertions in the UL52 and ORFC genes were identified. Similarity among the field strains and between the field and the vaccine strains ranged from 99.96% to 99.99%. Phylogenetic analysis revealed a close relationship among them, as well. This study generated data on genomic variation among Italian ILTV strains revealing that, even though the genetic variability of the genome is well conserved across time and between wild-type and vaccine strains, some mutations may help in differentiating among them and may be involved in ILTV virulence/attenuation. The results of this study can contribute to the understanding of the molecular bases of ILTV pathogenicity and provide genetic markers to differentiate between wild-type and vaccine strains. PMID:26890525

  3. Full Genome Sequence-Based Comparative Study of Wild-Type and Vaccine Strains of Infectious Laryngotracheitis Virus from Italy.

    PubMed

    Piccirillo, Alessandra; Lavezzo, Enrico; Niero, Giulia; Moreno, Ana; Massi, Paola; Franchin, Elisa; Toppo, Stefano; Salata, Cristiano; Palù, Giorgio

    2016-01-01

    Infectious laryngotracheitis (ILT) is an acute and highly contagious respiratory disease of chickens caused by an alphaherpesvirus, infectious laryngotracheitis virus (ILTV). Recently, full genome sequences of wild-type and vaccine strains have been determined worldwide, but none was from Europe. The aim of this study was to determine and analyse the complete genome sequences of five ILTV strains. Sequences were also compared to reveal the similarity of strains across time and to discriminate between wild-type and vaccine strains. Genomes of three ILTV field isolates from outbreaks occurred in Italy in 1980, 2007 and 2011, and two commercial chicken embryo origin (CEO) vaccines were sequenced using the 454 Life Sciences technology. The comparison with the Serva genome showed that 35 open reading frames (ORFs) differed across the five genomes. Overall, 54 single nucleotide polymorphisms (SNPs) and 27 amino acid differences in 19 ORFs and two insertions in the UL52 and ORFC genes were identified. Similarity among the field strains and between the field and the vaccine strains ranged from 99.96% to 99.99%. Phylogenetic analysis revealed a close relationship among them, as well. This study generated data on genomic variation among Italian ILTV strains revealing that, even though the genetic variability of the genome is well conserved across time and between wild-type and vaccine strains, some mutations may help in differentiating among them and may be involved in ILTV virulence/attenuation. The results of this study can contribute to the understanding of the molecular bases of ILTV pathogenicity and provide genetic markers to differentiate between wild-type and vaccine strains. PMID:26890525

  4. Comparative genomic analyses in Asparagus.

    PubMed

    Kuhl, Joseph C; Havey, Michael J; Martin, William J; Cheung, Foo; Yuan, Qiaoping; Landherr, Lena; Hu, Yi; Leebens-Mack, James; Town, Christopher D; Sink, Kenneth C

    2005-12-01

    Garden asparagus (Asparagus officinalis L.) belongs to the monocot family Asparagaceae in the order Asparagales. Onion (Allium cepa L.) and Asparagus officinalis are 2 of the most economically important plants of the core Asparagales, a well supported monophyletic group within the Asparagales. Coding regions in onion have lower GC contents than the grasses. We compared the GC content of 3374 unique expressed sequence tags (ESTs) from A. officinalis with Lycoris longituba and onion (both members of the core Asparagales), Acorus americanus (sister to all other monocots), the grasses, and Arabidopsis. Although ESTs in A. officinalis and Acorus had a higher average GC content than Arabidopsis, Lycoris, and onion, all were clearly lower than the grasses. The Asparagaceae have the smallest nuclear genomes among all plants in the core Asparagales, which typically have huge genomes. Within the Asparagaceae, European Asparagus species have approximately twice the nuclear DNA of that of southern African Asparagus species. We cloned and sequenced 20 genomic amplicons from European A. officinalis and the southern African species Asparagus plumosus and observed no clear evidence for a recent genome doubling in A. officinalis relative to A. plumosus. These results indicate that members of the genus Asparagus with smaller genomes may be useful genomic models for plants in the core Asparagales. PMID:16391674

  5. Microarray-Based Comparative Genomic and Transcriptome Analysis of Borrelia burgdorferi.

    PubMed

    Iyer, Radha; Schwartz, Ira

    2016-01-01

    Borrelia burgdorferi, the spirochetal agent of Lyme disease, is maintained in nature in a cycle involving a tick vector and a mammalian host. Adaptation to the diverse conditions of temperature, pH, oxygen tension and nutrient availability in these two environments requires the precise orchestration of gene expression. Over 25 microarray analyses relating to B. burgdorferi genomics and transcriptomics have been published. The majority of these studies has explored the global transcriptome under a variety of conditions and has contributed substantially to the current understanding of B. burgdorferi transcriptional regulation. In this review, we present a summary of these studies with particular focus on those that helped define the roles of transcriptional regulators in modulating gene expression in the tick and mammalian milieus. By performing comparative analysis of results derived from the published microarray expression profiling studies, we identified composite gene lists comprising differentially expressed genes in these two environments. Further, we explored the overlap between the regulatory circuits that function during the tick and mammalian phases of the enzootic cycle. Taken together, the data indicate that there is interplay among the distinct signaling pathways that function in feeding ticks and during adaptation to growth in the mammal. PMID:27600075

  6. Microarray-Based Comparative Genomic and Transcriptome Analysis of Borrelia burgdorferi

    PubMed Central

    Iyer, Radha; Schwartz, Ira

    2016-01-01

    Borrelia burgdorferi, the spirochetal agent of Lyme disease, is maintained in nature in a cycle involving a tick vector and a mammalian host. Adaptation to the diverse conditions of temperature, pH, oxygen tension and nutrient availability in these two environments requires the precise orchestration of gene expression. Over 25 microarray analyses relating to B. burgdorferi genomics and transcriptomics have been published. The majority of these studies has explored the global transcriptome under a variety of conditions and has contributed substantially to the current understanding of B. burgdorferi transcriptional regulation. In this review, we present a summary of these studies with particular focus on those that helped define the roles of transcriptional regulators in modulating gene expression in the tick and mammalian milieus. By performing comparative analysis of results derived from the published microarray expression profiling studies, we identified composite gene lists comprising differentially expressed genes in these two environments. Further, we explored the overlap between the regulatory circuits that function during the tick and mammalian phases of the enzootic cycle. Taken together, the data indicate that there is interplay among the distinct signaling pathways that function in feeding ticks and during adaptation to growth in the mammal. PMID:27600075

  7. Validation of the Agilent 244K oligonucleotide array-based comparative genomic hybridization platform for clinical cytogenetic diagnosis.

    PubMed

    Yu, Shihui; Bittel, Douglas C; Kibiryeva, Nataliya; Zwick, David L; Cooley, Linda D

    2009-09-01

    High-resolution microarray comparative genomic hybridization (aCGH) is being adopted for diagnostic evaluation of genomic disorders, but validation for clinical diagnosis has not yet been reported. We present validation data for the Agilent Human Genome Microarray Kit 244K for clinical application. The platform contains approximately 240,000 distinct 60-mer oligonucleotide probes spanning the entire human genome. We studied 45 previously characterized samples (43 abnormal, 2 normal), 32 with knowledge of prior results and 13 in a blinded manner with 11 performed in a reference laboratory providing microarray testing. Array analysis confirmed known aberrations in 43 samples and a normal result in 2. The array analysis corrected 1 karyotype and clarified 2 additional cases. Array data from 6 patients with 22q11.2 deletion found an average of 2.56 megabases (Mb; range, 2.49-2.62 Mb) with a common 2.43-Mb deleted region. Approximately 7 copy number variants from 400 base pairs to 1.6 Mb were identified per sample. Results demonstrate the usefulness of the aCGH-244K platform as a powerful diagnostic tool. PMID:19687311

  8. Comparative Analysis of Human B Cell Epitopes Based on BCG Genomes

    PubMed Central

    Liu, Haican; Zhao, Xiuqin; Wan, Kanglin

    2016-01-01

    Background. Tuberculosis is a huge global health problem. BCG is the only vaccine used for about 100 years against TB, but the reasons for protection variability in populations remain unclear. To improve BCG efficacy and develop a strategy for new vaccines, the underlying genetic differences among BCG subtypes should be understood urgently. Methods and Findings. Human B cell epitope data were collected from the Immune Epitope Database. Epitope sequences were mapped with those of 15 genomes, including 13 BCGs, M. bovis AF2122/97, and M. tuberculosis H37Rv, to identify epitopes distribution. Among 398 experimentally verified B cell epitopes, 321 (80.7%) were conserved, while the remaining 77 (19.3%) were lost to varying degrees in BCGs. The variable protective efficacy of BCGs may result from the degree of B cell epitopes deficiency. Conclusions. Here we firstly analyzed the genetic characteristics of BCGs based on B cell epitopes and found that B cell epitopes distribution may contribute to vaccine efficacy. Restoration of important antigens or effective B cell epitopes in BCG could be a useful strategy for vaccine development. PMID:27382565

  9. Comparative Analysis of Human B Cell Epitopes Based on BCG Genomes.

    PubMed

    Li, Machao; Liu, Haican; Zhao, Xiuqin; Wan, Kanglin

    2016-01-01

    Background. Tuberculosis is a huge global health problem. BCG is the only vaccine used for about 100 years against TB, but the reasons for protection variability in populations remain unclear. To improve BCG efficacy and develop a strategy for new vaccines, the underlying genetic differences among BCG subtypes should be understood urgently. Methods and Findings. Human B cell epitope data were collected from the Immune Epitope Database. Epitope sequences were mapped with those of 15 genomes, including 13 BCGs, M. bovis AF2122/97, and M. tuberculosis H37Rv, to identify epitopes distribution. Among 398 experimentally verified B cell epitopes, 321 (80.7%) were conserved, while the remaining 77 (19.3%) were lost to varying degrees in BCGs. The variable protective efficacy of BCGs may result from the degree of B cell epitopes deficiency. Conclusions. Here we firstly analyzed the genetic characteristics of BCGs based on B cell epitopes and found that B cell epitopes distribution may contribute to vaccine efficacy. Restoration of important antigens or effective B cell epitopes in BCG could be a useful strategy for vaccine development. PMID:27382565

  10. Microarray-based Comparative Genomic Indexing of the Cronobacter genus (Enterobacter sakazakii)

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Cronobacter is a recently defined genus synonymous with Enterobacter sakazakii. This new genus currently comprises 6 genomospecies. To extend our understanding of the genetic relationship between Cronobacter sakazakii BAA-894 and the other species of this genus, microarray-based comparative genomi...

  11. Prenatal diagnosis of chromosomal abnormalities using array-based comparative genomic hybridization

    Technology Transfer Automated Retrieval System (TEKTRAN)

    This study was designed to evaluate the feasibility of using a targeted array-CGH strategy for prenatal diagnosis of genomic imbalances in a clinical setting of current pregnancies. Women undergoing prenatal diagnosis were counseled and offered array-CGH (BCM V4.0) in addition to routine chromosome ...

  12. Structure, evolution, and comparative genomics of tetraploid cotton based on a high-density genetic linkage map

    PubMed Central

    Li, Ximei; Jin, Xin; Wang, Hantao; Zhang, Xianlong; Lin, Zhongxu

    2016-01-01

    A high-density linkage map was constructed using 1,885 newly obtained loci and 3,747 previously published loci, which included 5,152 loci with 4696.03 cM in total length and 0.91 cM in mean distance. Homology analysis in the cotton genome further confirmed the 13 expected homologous chromosome pairs and revealed an obvious inversion on Chr10 or Chr20 and repeated inversions on Chr07 or Chr16. In addition, two reciprocal translocations between Chr02 and Chr03 and between Chr04 and Chr05 were confirmed. Comparative genomics between the tetraploid cotton and the diploid cottons showed that no major structural changes exist between DT and D chromosomes but rather between AT and A chromosomes. Blast analysis between the tetraploid cotton genome and the mixed genome of two diploid cottons showed that most AD chromosomes, regardless of whether it is from the AT or DT genome, preferentially matched with the corresponding homologous chromosome in the diploid A genome, and then the corresponding homologous chromosome in the diploid D genome, indicating that the diploid D genome underwent converted evolution by the diploid A genome to form the DT genome during polyploidization. In addition, the results reflected that a series of chromosomal translocations occurred among Chr01/Chr15, Chr02/Chr14, Chr03/Chr17, Chr04/Chr22, and Chr05/Chr19. PMID:27084896

  13. Comparative Genomics of Carp Herpesviruses

    PubMed Central

    Kurobe, Tomofumi; Gatherer, Derek; Cunningham, Charles; Korf, Ian; Fukuda, Hideo; Hedrick, Ronald P.; Waltzek, Thomas B.

    2013-01-01

    Three alloherpesviruses are known to cause disease in cyprinid fish: cyprinid herpesviruses 1 and 3 (CyHV1 and CyHV3) in common carp and koi and cyprinid herpesvirus 2 (CyHV2) in goldfish. We have determined the genome sequences of CyHV1 and CyHV2 and compared them with the published CyHV3 sequence. The CyHV1 and CyHV2 genomes are 291,144 and 290,304 bp, respectively, in size, and thus the CyHV3 genome, at 295,146 bp, remains the largest recorded among the herpesviruses. Each of the three genomes consists of a unique region flanked at each terminus by a sizeable direct repeat. The CyHV1, CyHV2, and CyHV3 genomes are predicted to contain 137, 150, and 155 unique, functional protein-coding genes, respectively, of which six, four, and eight, respectively, are duplicated in the terminal repeat. The three viruses share 120 orthologous genes in a largely colinear arrangement, of which up to 55 are also conserved in the other member of the genus Cyprinivirus, anguillid herpesvirus 1. Twelve genes are conserved convincingly in all sequenced alloherpesviruses, and two others are conserved marginally. The reference CyHV3 strain has been reported to contain five fragmented genes that are presumably nonfunctional. The CyHV2 strain has two fragmented genes, and the CyHV1 strain has none. CyHV1, CyHV2, and CyHV3 have five, six, and five families of paralogous genes, respectively. One family unique to CyHV1 is related to cellular JUNB, which encodes a transcription factor involved in oncogenesis. To our knowledge, this is the first time that JUNB-related sequences have been reported in a herpesvirus. PMID:23269803

  14. Comparative genomics of carp herpesviruses.

    PubMed

    Davison, Andrew J; Kurobe, Tomofumi; Gatherer, Derek; Cunningham, Charles; Korf, Ian; Fukuda, Hideo; Hedrick, Ronald P; Waltzek, Thomas B

    2013-03-01

    Three alloherpesviruses are known to cause disease in cyprinid fish: cyprinid herpesviruses 1 and 3 (CyHV1 and CyHV3) in common carp and koi and cyprinid herpesvirus 2 (CyHV2) in goldfish. We have determined the genome sequences of CyHV1 and CyHV2 and compared them with the published CyHV3 sequence. The CyHV1 and CyHV2 genomes are 291,144 and 290,304 bp, respectively, in size, and thus the CyHV3 genome, at 295,146 bp, remains the largest recorded among the herpesviruses. Each of the three genomes consists of a unique region flanked at each terminus by a sizeable direct repeat. The CyHV1, CyHV2, and CyHV3 genomes are predicted to contain 137, 150, and 155 unique, functional protein-coding genes, respectively, of which six, four, and eight, respectively, are duplicated in the terminal repeat. The three viruses share 120 orthologous genes in a largely colinear arrangement, of which up to 55 are also conserved in the other member of the genus Cyprinivirus, anguillid herpesvirus 1. Twelve genes are conserved convincingly in all sequenced alloherpesviruses, and two others are conserved marginally. The reference CyHV3 strain has been reported to contain five fragmented genes that are presumably nonfunctional. The CyHV2 strain has two fragmented genes, and the CyHV1 strain has none. CyHV1, CyHV2, and CyHV3 have five, six, and five families of paralogous genes, respectively. One family unique to CyHV1 is related to cellular JUNB, which encodes a transcription factor involved in oncogenesis. To our knowledge, this is the first time that JUNB-related sequences have been reported in a herpesvirus. PMID:23269803

  15. Comparative genomic hybridization: an overview.

    PubMed Central

    Houldsworth, J.; Chaganti, R. S.

    1994-01-01

    Comparative genomic hybridization (CGH) is a newly described molecular-cytogenetic assay that globally assays for chromosomal gains and losses in a genomic complement. In this assay, normal human metaphase chromosomes are competitively hybridized with two differentially labeled genomic DNAs (test and reference), which upon fluorescence microscopy, reveal the chromosomal locations of copy number changes in DNA sequences between the two complements. Application of CGH to DNAs extracted from fresh frozen specimens and cell lines of various tumor types has revealed a number of recurring chromosomal gains and losses that were undetected by traditional cytogenetic analysis. Few previously known sites were found to be in higher copy number, or lost by CGH, while many novel amplified regions were identified. These regions warrant further molecular genetic studies aimed at isolating the perturbed genes. Since CGH can also be performed on DNA extracted from formalin-fixed paraffin-embedded archived tumor specimens with few modifications, gains and losses of genetic material can be determined for specimens that would otherwise be unanalyzable. Prospective and retrospective application of CGH to tumor specimens would permit correlative studies to be performed, possibly identifying diagnostic and prognostic indicators of disease. CGH may also have a future role in detection and identification of chromosomal abnormalities in prenatal diagnosis and in dysmorphic anomalies. Images Figure 1 Figure 2 PMID:7992829

  16. Sequencing and comparing whole mitochondrial genomes ofanimals

    SciTech Connect

    Boore, Jeffrey L.; Macey, J. Robert; Medina, Monica

    2005-04-22

    Comparing complete animal mitochondrial genome sequences is becoming increasingly common for phylogenetic reconstruction and as a model for genome evolution. Not only are they much more informative than shorter sequences of individual genes for inferring evolutionary relatedness, but these data also provide sets of genome-level characters, such as the relative arrangements of genes, that can be especially powerful. We describe here the protocols commonly used for physically isolating mtDNA, for amplifying these by PCR or RCA, for cloning,sequencing, assembly, validation, and gene annotation, and for comparing both sequences and gene arrangements. On several topics, we offer general observations based on our experiences to date with determining and comparing complete mtDNA sequences.

  17. Comparative primate genomics: emerging patterns of genome content and dynamics

    PubMed Central

    Rogers, Jeffrey; Gibbs, Richard A.

    2014-01-01

    Preface Advances in genome sequencing technologies have created new opportunities for comparative primate genomics. Genome assemblies have been published for several primates, with analyses of several others underway. Whole genome assemblies for the great apes provide remarkable new information about the evolutionary origins of the human genome and the processes involved. Genomic data for macaques and other nonhuman primates provide valuable insight into genetic similarities and differences among species used as models for disease-related research. This review summarizes current knowledge regarding primate genome content and dynamics and offers a series of goals for the near future. PMID:24709753

  18. Comparative primate genomics: emerging patterns of genome content and dynamics.

    PubMed

    Rogers, Jeffrey; Gibbs, Richard A

    2014-05-01

    Advances in genome sequencing technologies have created new opportunities for comparative primate genomics. Genome assemblies have been published for various primate species, and analyses of several others are underway. Whole-genome assemblies for the great apes provide remarkable new information about the evolutionary origins of the human genome and the processes involved. Genomic data for macaques and other non-human primates offer valuable insights into genetic similarities and differences among species that are used as models for disease-related research. This Review summarizes current knowledge regarding primate genome content and dynamics, and proposes a series of goals for the near future. PMID:24709753

  19. Ebolavirus comparative genomics

    SciTech Connect

    Jun, Se-Ran; Leuze, Michael R.; Nookaew, Intawat; Uberbacher, Edward C.; Land, Miriam; Zhang, Qian; Wanchai, Visanu; Chai, Juanjuan; Nielsen, Morten; Trolle, Thomas; Lund, Ole; Buzard, Gregory S.; Pedersen, Thomas D.; Ussery, David W.

    2015-07-14

    The 2014 Ebola outbreak in West Africa is the largest documented for this virus. We examine the dynamics of this genome, comparing more than one hundred currently available ebolavirus genomes to each other and to other viral genomes. Based on oligomer frequency analysis, the family Filoviridae forms a distinct group from all other sequenced viral genomes. All filovirus genomes sequenced to date encode proteins with similar functions and gene order, although there is considerable divergence in sequences between the three genera Ebolavirus, Cuevavirus, and Marburgvirus within the family Filoviridae. Whereas all ebolavirus genomes are quite similar (multiple sequences of the same strain are often identical), variation is most common in the intergenic regions and within specific areas of the genes encoding the glycoprotein (GP), nucleoprotein (NP), and polymerase (L). We predict regions that could contain epitope-binding sites, which might be good vaccine targets. In conclusion, this information, combined with glycosylation sites and experimentally determined epitopes, can identify the most promising regions for the development of therapeutic strategies.

  20. Comparative genomics of Shiga toxin encoding bacteriophages

    PubMed Central

    2012-01-01

    Background Stx bacteriophages are responsible for driving the dissemination of Stx toxin genes (stx) across their bacterial host range. Lysogens carrying Stx phages can cause severe, life-threatening disease and Stx toxin is an integral virulence factor. The Stx-bacteriophage vB_EcoP-24B, commonly referred to as Ф24B, is capable of multiply infecting a single bacterial host cell at a high frequency, with secondary infection increasing the rate at which subsequent bacteriophage infections can occur. This is biologically unusual, therefore determining the genomic content and context of Ф24B compared to other lambdoid Stx phages is important to understanding the factors controlling this phenomenon and determining whether they occur in other Stx phages. Results The genome of the Stx2 encoding phage, Ф24B was sequenced and annotated. The genomic organisation and general features are similar to other sequenced Stx bacteriophages induced from Enterohaemorrhagic Escherichia coli (EHEC), however Ф24B possesses significant regions of heterogeneity, with implications for phage biology and behaviour. The Ф24B genome was compared to other sequenced Stx phages and the archetypal lambdoid phage, lambda, using the Circos genome comparison tool and a PCR-based multi-loci comparison system. Conclusions The data support the hypothesis that Stx phages are mosaic, and recombination events between the host, phages and their remnants within the same infected bacterial cell will continue to drive the evolution of Stx phage variants and the subsequent dissemination of shigatoxigenic potential. PMID:22799768

  1. Identification of Clinically Important Chromosomal Aberrations in Acute Myeloid Leukemia by Array-Based Comparative Genomic Hybridization

    PubMed Central

    Mehrotra, Meenakshi; Luthra, Rajyalakshmi; Ravandi, Farhad; Sargent, Rachel L.; Barkoh, Bedia A; Abraham, Ronald; Mishra, Bal Mukund; Medeiros, L. Jeffrey; Patel, Keyur P.

    2014-01-01

    Array-based comparative genomic hybridization (aCGH) chromosomal analysis facilitates rapid detection of cytogenetic abnormalities previously undetectable by conventional cytogenetics. In this study, we analyze 48 uniformly treated acute myeloid leukemia (AML) patients by 44K aCGH and correlated the findings with clinical outcome. aCGH identified previously undetected aberrations, as small as 5 kb, of currently unknown significance. The 36.7 Mb minimally deleted region on chromosome 5 lies between 5q14.3 to 5q33.3 contains 634 genes and 15 microRNAs whereas loss of chromosome 17 spans 3,194 kb involves 342 genes and 12 microRNAs. Loss of 155 kilobase (kb) region on 5q33.3 (p<0.05) is associated with achievement of complete remission. In contrast, loss of 17p11.2-q11.1 was associated with lower CR rate and poorer overall survival (Kaplan-Meier analysis, p<0.0096). aCGH detected loss of 17p in 12/48 patients as compared to 9/48 by conventional karyotyping. In conclusion, aCGH analysis adds to the prognostic stratification of AML patients. PMID:24446873

  2. Cocoa/Cotton Comparative Genomics

    Technology Transfer Automated Retrieval System (TEKTRAN)

    With genome sequence from two members of the Malvaceae family recently made available, we are exploring syntenic relationships, gene content, and evolutionary trajectories between the cacao and cotton genomes. An assembly of cacao (Theobroma cacao) using Illumina and 454 sequence technology yielded ...

  3. Array-based comparative genomic hybridization for the detection of DNA sequence copy number changes in Barrett's adenocarcinoma.

    PubMed

    Albrecht, Bettina; Hausmann, Michael; Zitzelsberger, Horst; Stein, Hubert; Siewert, Jörg Rüdiger; Hopt, Ulrich; Langer, Rupert; Höfler, Heinz; Werner, Martin; Walch, Axel

    2004-07-01

    Array-based comparative genomic hybridization (aCGH) allows the identification of DNA sequence copy number changes at high resolution by co-hybridizing differentially labelled test and control DNAs to a micro-array of genomic clones. The present study has analysed a series of 23 formalin-fixed, paraffin wax-embedded tissue samples of Barrett's adenocarcinoma (BCA, n = 18) and non-neoplastic squamous oesophageal (n = 2) and gastric cardia mucosa (n = 3) by aCGH. The micro-arrays used contained 287 genomic targets covering oncogenes, tumour suppressor genes, and DNA sequences localized within chromosomal regions previously reported to be altered in BCA. DNA sequence copy number changes for a panel of approximately 50 genes were identified, most of which have not been previously described in BCA. DNA sequence copy number gains (mean 41 +/- 25/BCA) were more frequent than DNA sequence copy number losses (mean 20 +/- 15/BCA). The highest frequencies for DNA sequence copy number gains were detected for SNRPN (61%); GNLY (44%); NME1 (44%); DDX15, ABCB1 (MDR), ATM, LAMA3, MYBL2, ZNF217, and TNFRSF6B (39% each); and MSH2, TERC, SERPINE1, AFM137XA11, IGF1R, and PTPN1 (33% each). DNA sequence copy number losses were identified for PDGFB (44%); D17S125 (39%); AKT3 (28%); and RASSFI, FHIT, CDKN2A (p16), and SAS (CDK4) (28% each). In all non-neoplastic tissue samples of squamous oesophageal and gastric cardia mucosa, the measured mean ratios were 1.00 (squamous oesophageal mucosa) or 1.01 (gastric mucosa), indicating that no DNA sequence copy number chances were present. For validation, the DNA sequence copy number changes of selected clones (SNRPN, CMYC, HER2, ZNF217) detected by aCGH were confirmed by fluorescence in situ hybridization (FISH). These data show the sensitivity of aCGH for the identification of DNA sequence copy number changes at high resolution in BCA. The newly identified genes may include so far unknown biomarkers in BCA and are therefore a starting point for

  4. Haemonchus contortus: Genome Structure, Organization and Comparative Genomics.

    PubMed

    Laing, R; Martinelli, A; Tracey, A; Holroyd, N; Gilleard, J S; Cotton, J A

    2016-01-01

    One of the first genome sequencing projects for a parasitic nematode was that for Haemonchus contortus. The open access data from the Wellcome Trust Sanger Institute provided a valuable early resource for the research community, particularly for the identification of specific genes and genetic markers. Later, a second sequencing project was initiated by the University of Melbourne, and the two draft genome sequences for H. contortus were published back-to-back in 2013. There is a pressing need for long-range genomic information for genetic mapping, population genetics and functional genomic studies, so we are continuing to improve the Wellcome Trust Sanger Institute assembly to provide a finished reference genome for H. contortus. This review describes this process, compares the H. contortus genome assemblies with draft genomes from other members of the strongylid group and discusses future directions for parasite genomics using the H. contortus model. PMID:27238013

  5. Comparative genome analysis of Basidiomycete fungi

    SciTech Connect

    Riley, Robert; Salamov, Asaf; Henrissat, Bernard; Nagy, Laszlo; Brown, Daren; Held, Benjamin; Baker, Scott; Blanchette, Robert; Boussau, Bastien; Doty, Sharon L.; Fagnan, Kirsten; Floudas, Dimitris; Levasseur, Anthony; Manning, Gerard; Martin, Francis; Morin, Emmanuelle; Otillar, Robert; Pisabarro, Antonio; Walton, Jonathan; Wolfe, Ken; Hibbett, David; Grigoriev, Igor

    2013-08-07

    Fungi of the phylum Basidiomycota (basidiomycetes), make up some 37percent of the described fungi, and are important in forestry, agriculture, medicine, and bioenergy. This diverse phylum includes symbionts, pathogens, and saprotrophs including the majority of wood decaying and ectomycorrhizal species. To better understand the genetic diversity of this phylum we compared the genomes of 35 basidiomycetes including 6 newly sequenced genomes. These genomes span extremes of genome size, gene number, and repeat content. Analysis of core genes reveals that some 48percent of basidiomycete proteins are unique to the phylum with nearly half of those (22percent) found in only one organism. Correlations between lifestyle and certain gene families are evident. Phylogenetic patterns of plant biomass-degrading genes in Agaricomycotina suggest a continuum rather than a dichotomy between the white rot and brown rot modes of wood decay. Based on phylogenetically-informed PCA analysis of wood decay genes, we predict that that Botryobasidium botryosum and Jaapia argillacea have properties similar to white rot species, although neither has typical ligninolytic class II fungal peroxidases (PODs). This prediction is supported by growth assays in which both fungi exhibit wood decay with white rot-like characteristics. Based on this, we suggest that the white/brown rot dichotomy may be inadequate to describe the full range of wood decaying fungi. Analysis of the rate of discovery of proteins with no or few homologs suggests the value of continued sequencing of basidiomycete fungi.

  6. Transcriptome-based exon capture enables highly cost-effective comparative genomic data collection at moderate evolutionary scales

    PubMed Central

    2012-01-01

    Background To date, exon capture has largely been restricted to species with fully sequenced genomes, which has precluded its application to lineages that lack high quality genomic resources. We developed a novel strategy for designing array-based exon capture in chipmunks (Tamias) based on de novo transcriptome assemblies. We evaluated the performance of our approach across specimens from four chipmunk species. Results We selectively targeted 11,975 exons (~4 Mb) on custom capture arrays, and enriched over 99% of the targets in all libraries. The percentage of aligned reads was highly consistent (24.4-29.1%) across all specimens, including in multiplexing up to 20 barcoded individuals on a single array. Base coverage among specimens and within targets in each species library was uniform, and the performance of targets among independent exon captures was highly reproducible. There was no decrease in coverage among chipmunk species, which showed up to 1.5% sequence divergence in coding regions. We did observe a decline in capture performance of a subset of targets designed from a much more divergent ground squirrel genome (30 My), however, over 90% of the targets were also recovered. Final assemblies yielded over ten thousand orthologous loci (~3.6 Mb) with thousands of fixed and polymorphic SNPs among species identified. Conclusions Our study demonstrates the potential of a transcriptome-enabled, multiplexed, exon capture method to create thousands of informative markers for population genomic and phylogenetic studies in non-model species across the tree of life. PMID:22900609

  7. Comparative genomics of protoploid Saccharomycetaceae.

    PubMed

    Souciet, Jean-Luc; Dujon, Bernard; Gaillardin, Claude; Johnston, Mark; Baret, Philippe V; Cliften, Paul; Sherman, David J; Weissenbach, Jean; Westhof, Eric; Wincker, Patrick; Jubin, Claire; Poulain, Julie; Barbe, Valérie; Ségurens, Béatrice; Artiguenave, François; Anthouard, Véronique; Vacherie, Benoit; Val, Marie-Eve; Fulton, Robert S; Minx, Patrick; Wilson, Richard; Durrens, Pascal; Jean, Géraldine; Marck, Christian; Martin, Tiphaine; Nikolski, Macha; Rolland, Thomas; Seret, Marie-Line; Casarégola, Serge; Despons, Laurence; Fairhead, Cécile; Fischer, Gilles; Lafontaine, Ingrid; Leh, Véronique; Lemaire, Marc; de Montigny, Jacky; Neuvéglise, Cécile; Thierry, Agnès; Blanc-Lenfle, Isabelle; Bleykasten, Claudine; Diffels, Julie; Fritsch, Emilie; Frangeul, Lionel; Goëffon, Adrien; Jauniaux, Nicolas; Kachouri-Lafond, Rym; Payen, Célia; Potier, Serge; Pribylova, Lenka; Ozanne, Christophe; Richard, Guy-Franck; Sacerdot, Christine; Straub, Marie-Laure; Talla, Emmanuel

    2009-10-01

    Our knowledge of yeast genomes remains largely dominated by the extensive studies on Saccharomyces cerevisiae and the consequences of its ancestral duplication, leaving the evolution of the entire class of hemiascomycetes only partly explored. We concentrate here on five species of Saccharomycetaceae, a large subdivision of hemiascomycetes, that we call "protoploid" because they diverged from the S. cerevisiae lineage prior to its genome duplication. We determined the complete genome sequences of three of these species: Kluyveromyces (Lachancea) thermotolerans and Saccharomyces (Lachancea) kluyveri (two members of the newly described Lachancea clade), and Zygosaccharomyces rouxii. We included in our comparisons the previously available sequences of Kluyveromyces lactis and Ashbya (Eremothecium) gossypii. Despite their broad evolutionary range and significant individual variations in each lineage, the five protoploid Saccharomycetaceae share a core repertoire of approximately 3300 protein families and a high degree of conserved synteny. Synteny blocks were used to define gene orthology and to infer ancestors. Far from representing minimal genomes without redundancy, the five protoploid yeasts contain numerous copies of paralogous genes, either dispersed or in tandem arrays, that, altogether, constitute a third of each genome. Ancient, conserved paralogs as well as novel, lineage-specific paralogs were identified. PMID:19525356

  8. Comparative Analysis of Genome Sequences with VISTA

    DOE Data Explorer

    Dubchak, Inna

    VISTA is a comprehensive suite of programs and databases developed by and hosted at the Genomics Division of Lawrence Berkeley National Laboratory. They provide information and tools designed to facilitate comparative analysis of genomic sequences. Users have two ways to interact with the suite of applications at the VISTA portal. They can submit their own sequences and alignments for analysis (VISTA servers) or examine pre-computed whole-genome alignments of different species. A key menu option is the Enhancer Browser and Database at http://enhancer.lbl.gov/. The VISTA Enhancer Browser is a central resource for experimentally validated human noncoding fragments with gene enhancer activity as assessed in transgenic mice. Most of these noncoding elements were selected for testing based on their extreme conservation with other vertebrates. The results of this enhancer screen are provided through this publicly available website. The browser also features relevant results by external contributors and a large collection of additional genome-wide conserved noncoding elements which are candidate enhancer sequences. The LBL developers invite external groups to submit computational predictions of developmental enhancers. As of 10/19/2009 the database contains information on 1109 in vivo tested elements - 508 elements with enhancer activity.

  9. [Research proceedings on primate comparative genomics].

    PubMed

    Liao, Cheng-Hong; Su, Bing

    2012-02-01

    With the accomplishment of genome sequencing of human, chimpanzee and other primates, there has been a great amount of primate genome information accumulated. Primate comparative genomics has become a new research field at current genome era. In this article, we reviewed recent progress in phylogeny, genome structure and gene expression of human and nonhuman primates, and we elaborated the major biological differences among human, chimpanzee and other non-human primate species, which is informative in revealing the mechanism of human evolution. PMID:22345018

  10. Comparative Reannotation of 21 Aspergillus Genomes

    SciTech Connect

    Salamov, Asaf; Riley, Robert; Kuo, Alan; Grigoriev, Igor

    2013-03-08

    We used comparative gene modeling to reannotate 21 Aspergillus genomes. Initial automatic annotation of individual genomes may contain some errors of different nature, e.g. missing genes, incorrect exon-intron structures, 'chimeras', which fuse 2 or more real genes or alternatively splitting some real genes into 2 or more models. The main premise behind the comparative modeling approach is that for closely related genomes most orthologous families have the same conserved gene structure. The algorithm maps all gene models predicted in each individual Aspergillus genome to the other genomes and, for each locus, selects from potentially many competing models, the one which most closely resembles the orthologous genes from other genomes. This procedure is iterated until no further change in gene models is observed. For Aspergillus genomes we predicted in total 4503 new gene models ( ~;;2percent per genome), supported by comparative analysis, additionally correcting ~;;18percent of old gene models. This resulted in a total of 4065 more genes with annotated PFAM domains (~;;3percent increase per genome). Analysis of a few genomes with EST/transcriptomics data shows that the new annotation sets also have a higher number of EST-supported splice sites at exon-intron boundaries.

  11. Comparative genomics and evolution of eukaryotic phospholipidbiosynthesis

    SciTech Connect

    Lykidis, Athanasios

    2006-12-01

    Phospholipid biosynthetic enzymes produce diverse molecular structures and are often present in multiple forms encoded by different genes. This work utilizes comparative genomics and phylogenetics for exploring the distribution, structure and evolution of phospholipid biosynthetic genes and pathways in 26 eukaryotic genomes. Although the basic structure of the pathways was formed early in eukaryotic evolution, the emerging picture indicates that individual enzyme families followed unique evolutionary courses. For example, choline and ethanolamine kinases and cytidylyltransferases emerged in ancestral eukaryotes, whereas, multiple forms of the corresponding phosphatidyltransferases evolved mainly in a lineage specific manner. Furthermore, several unicellular eukaryotes maintain bacterial-type enzymes and reactions for the synthesis of phosphatidylglycerol and cardiolipin. Also, base-exchange phosphatidylserine synthases are widespread and ancestral enzymes. The multiplicity of phospholipid biosynthetic enzymes has been largely generated by gene expansion in a lineage specific manner. Thus, these observations suggest that phospholipid biosynthesis has been an actively evolving system. Finally, comparative genomic analysis indicates the existence of novel phosphatidyltransferases and provides a candidate for the uncharacterized eukaryotic phosphatidylglycerol phosphate phosphatase.

  12. Gramene: a growing plant comparative genomics resource

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Gramene (www.gramene.org) is a curated genetic, genomic and comparative genome analysis resource for the major crop species, such as rice, maize, wheat and many other plant (mainly grass) species. Gramene is an open-source project, with all data and software freely downloadable through the ftp site ...

  13. Gramene 2013: Comparative plant genomics resources

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Gramene (http://www.gramene.org) is a curated online resource for comparative functional genomics in crops and model plant species, currently hosting 27 fully and 10 partially sequenced reference genomes in its build number 38. Its strength derives from the application of a phylogenetic framework fo...

  14. Orthology for comparative genomics in the mouse genome database.

    PubMed

    Dolan, Mary E; Baldarelli, Richard M; Bello, Susan M; Ni, Li; McAndrews, Monica S; Bult, Carol J; Kadin, James A; Richardson, Joel E; Ringwald, Martin; Eppig, Janan T; Blake, Judith A

    2015-08-01

    The mouse genome database (MGD) is the model organism database component of the mouse genome informatics system at The Jackson Laboratory. MGD is the international data resource for the laboratory mouse and facilitates the use of mice in the study of human health and disease. Since its beginnings, MGD has included comparative genomics data with a particular focus on human-mouse orthology, an essential component of the use of mouse as a model organism. Over the past 25 years, novel algorithms and addition of orthologs from other model organisms have enriched comparative genomics in MGD data, extending the use of orthology data to support the laboratory mouse as a model of human biology. Here, we describe current comparative data in MGD and review the history and refinement of orthology representation in this resource. PMID:26223881

  15. Comparative genomics of BCG vaccines.

    PubMed

    Behr, M A

    2001-01-01

    Bacille Calmette-Guérin (BCG) vaccines have been given to more people than any other vaccine. They have also probably resulted in as much controversy as any other vaccine. In clinical trials, the efficacy of BCG vaccination against pulmonary TB has been widely variable. At the same time, a number of investigators have observed phenotypic differences between BCG daughter strains, raising the possibility that differences between BCG products may in some way translate into different outcomes. With recent genomic analysis of BCG strains, it has become possible to piece together the molecular events that have resulted in current BCG vaccines. Between the derivation of BCG in 1921 and the lyophilization of BCG Pasteur 1173 in 1961, there have been at least seven genetic events, including deletions, duplications and a single nucleotide polymorphism. The phenotypic relevance of these changes in BCG vaccines remains to be explored. PMID:11463238

  16. Linking the genomes of nonmodel teleosts through comparative genomics.

    PubMed

    Sarropoulou, E; Nousdili, D; Magoulas, A; Kotoulas, G

    2008-01-01

    Recently the genomes of two more teleost species have been released: the medaka (Oryzias latipes), and the three-spined stickleback (Gasterosteus aculateus). The rapid developments in genomics of fish species paved the way to new and valuable research in comparative genetics and genomics. With the accumulation of information in model species, the genetic and genomic characterization of nonmodel, but economically important species, is now feasible. Furthermore, comparison of low coverage gene maps of aquacultured fish species against fully sequenced fish species will enhance the efficiency of candidate genes identification projected for quantitative trait loci (QTL) scans for traits of commercial interest. This study shows the syntenic relationship between the genomes of six different teleost species, including three fully sequenced model species: Tetraodon nigroviridis, Oryzias latipes, Gasterosteus aculateus, and three marine species of commercial and evolutionary interest: Sparus aurata, Dicentrarchus labrax, Oreochromis spp. All three commercial fish species belong to the order Perciformes, which is the richest in number of species (approximately 10,000) but poor in terms of available genomic information and tools. Syntenic relationships were established by using 800 EST and microsatellites sequences successfully mapped on the RH map of seabream. Comparison to the stickleback genome produced most positive BLAT hits (58%) followed by medaka (32%) and Tetraodon (30%). Thus, stickleback was used as the major stepping stone to compare seabass and tilapia to seabream. In addition to the significance for the aquaculture industry, this approach can encompass important ecological and evolutionary implications. PMID:18297360

  17. Homology-independent metrics for comparative genomics.

    PubMed

    Coutinho, Tarcisio José Domingos; Franco, Glória Regina; Lobo, Francisco Pereira

    2015-01-01

    A mainstream procedure to analyze the wealth of genomic data available nowadays is the detection of homologous regions shared across genomes, followed by the extraction of biological information from the patterns of conservation and variation observed in such regions. Although of pivotal importance, comparative genomic procedures that rely on homology inference are obviously not applicable if no homologous regions are detectable. This fact excludes a considerable portion of "genomic dark matter" with no significant similarity - and, consequently, no inferred homology to any other known sequence - from several downstream comparative genomic methods. In this review we compile several sequence metrics that do not rely on homology inference and can be used to compare nucleotide sequences and extract biologically meaningful information from them. These metrics comprise several compositional parameters calculated from sequence data alone, such as GC content, dinucleotide odds ratio, and several codon bias metrics. They also share other interesting properties, such as pervasiveness (patterns persist on smaller scales) and phylogenetic signal. We also cite examples where these homology-independent metrics have been successfully applied to support several bioinformatics challenges, such as taxonomic classification of biological sequences without homology inference. They where also used to detect higher-order patterns of interactions in biological systems, ranging from detecting coevolutionary trends between the genomes of viruses and their hosts to characterization of gene pools of entire microbial communities. We argue that, if correctly understood and applied, homology-independent metrics can add important layers of biological information in comparative genomic studies without prior homology inference. PMID:26029354

  18. Cyberinfrastructure for (Comparative) Plant Genome Research Through PlantGDB

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Accurate and comprehensive gene structure annotation in emerging and assembled genomes is fundamental to comparative, functional, and translational genomics. We plan to build the cyberinfrastructure necessary for defining and accessing the plant gene space. Our Plant Genetic Data Base (PlantGDB) r...

  19. Quantitative analysis of comparative genomic hybridization

    SciTech Connect

    Manoir, S. du; Bentz, M.; Joos, S. |

    1995-01-01

    Comparative genomic hybridization (CGH) is a new molecular cytogenetic method for the detection of chromosomal imbalances. Following cohybridization of DNA prepared from a sample to be studied and control DNA to normal metaphase spreads, probes are detected via different fluorochromes. The ratio of the test and control fluorescence intensities along a chromosome reflects the relative copy number of segments of a chromosome in the test genome. Quantitative evaluation of CGH experiments is required for the determination of low copy changes, e.g., monosomy or trisomy, and for the definition of the breakpoints involved in unbalanced rearrangements. In this study, a program for quantitation of CGH preparations is presented. This program is based on the extraction of the fluorescence ratio profile along each chromosome, followed by averaging of individual profiles from several metaphase spreads. Objective parameters critical for quantitative evaluations were tested, and the criteria for selection of suitable CGH preparations are described. The granularity of the chromosome painting and the regional inhomogeneity of fluorescence intensities in metaphase spreads proved to be crucial parameters. The coefficient of variation of the ratio value for chromosomes in balanced state (CVBS) provides a general quality criterion for CGH experiments. Different cutoff levels (thresholds) of average fluorescence ratio values were compared for their specificity and sensitivity with regard to the detection of chromosomal imbalances. 27 refs., 15 figs., 1 tab.

  20. Comparative Genomic Analyses of Attenuated Strains of Mycoplasma gallisepticum▿ †

    PubMed Central

    Szczepanek, S. M.; Tulman, E. R.; Gorton, T. S.; Liao, X.; Lu, Z.; Zinski, J.; Aziz, F.; Frasca, S.; Kutish, G. F.; Geary, S. J.

    2010-01-01

    Mycoplasma gallisepticum is a significant respiratory and reproductive pathogen of domestic poultry. While the complete genomic sequence of the virulent, low-passage M. gallisepticum strain R (Rlow) has been reported, genomic determinants responsible for differences in virulence and host range remain to be completely identified. Here, we utilize genome sequencing and microarray-based comparative genomic data to identify these genomic determinants of virulence and to elucidate genomic variability among strains of M. gallisepticum. Analysis of the high-passage, attenuated derivative of Rlow, Rhigh, indicated that relatively few total genomic changes (64 loci) occurred, yet they are potentially responsible for the observed attenuation of this strain. In addition to previously characterized mutations in cytadherence-related proteins, changes included those in coding sequences of genes involved in sugar metabolism. Analyses of the genome of the M. gallisepticum vaccine strain F revealed numerous differences relative to strain R, including a highly divergent complement of vlhA surface lipoprotein genes, and at least 16 genes absent or significantly fragmented relative to strain R. Notably, an Rlow isogenic mutant in one of these genes (MGA_1107) caused significantly fewer severe tracheal lesions in the natural host compared to virulent M. gallisepticum Rlow. Comparative genomic hybridizations indicated few genetic loci commonly affected in F and vaccine strains ts-11 and 6/85, which would correlate with proteins affecting strain R virulence. Together, these data provide novel insights into inter- and intrastrain M. gallisepticum genomic variability and the genetic basis of M. gallisepticum virulence. PMID:20123709

  1. Complete Genome Sequence and Comparative Genomics of a Novel Myxobacterium Myxococcus hansupus

    PubMed Central

    Sharma, Gaurav; Narwani, Tarun; Subramanian, Srikrishna

    2016-01-01

    Myxobacteria, a group of Gram-negative aerobes, belong to the class δ-proteobacteria and order Myxococcales. Unlike anaerobic δ-proteobacteria, they exhibit several unusual physiogenomic properties like gliding motility, desiccation-resistant myxospores and large genomes with high coding density. Here we report a 9.5 Mbp complete genome of Myxococcus hansupus that encodes 7,753 proteins. Phylogenomic and genome-genome distance based analysis suggest that Myxococcus hansupus is a novel member of the genus Myxococcus. Comparative genome analysis with other members of the genus Myxococcus was performed to explore their genome diversity. The variation in number of unique proteins observed across different species is suggestive of diversity at the genus level while the overrepresentation of several Pfam families indicates the extent and mode of genome expansion as compared to non-Myxococcales δ-proteobacteria. PMID:26900859

  2. Enhanced annotations and features for comparing thousands of Pseudomonas genomes in the Pseudomonas genome database.

    PubMed

    Winsor, Geoffrey L; Griffiths, Emma J; Lo, Raymond; Dhillon, Bhavjinder K; Shay, Julie A; Brinkman, Fiona S L

    2016-01-01

    The Pseudomonas Genome Database (http://www.pseudomonas.com) is well known for the application of community-based annotation approaches for producing a high-quality Pseudomonas aeruginosa PAO1 genome annotation, and facilitating whole-genome comparative analyses with other Pseudomonas strains. To aid analysis of potentially thousands of complete and draft genome assemblies, this database and analysis platform was upgraded to integrate curated genome annotations and isolate metadata with enhanced tools for larger scale comparative analysis and visualization. Manually curated gene annotations are supplemented with improved computational analyses that help identify putative drug targets and vaccine candidates or assist with evolutionary studies by identifying orthologs, pathogen-associated genes and genomic islands. The database schema has been updated to integrate isolate metadata that will facilitate more powerful analysis of genomes across datasets in the future. We continue to place an emphasis on providing high-quality updates to gene annotations through regular review of the scientific literature and using community-based approaches including a major new Pseudomonas community initiative for the assignment of high-quality gene ontology terms to genes. As we further expand from thousands of genomes, we plan to provide enhancements that will aid data visualization and analysis arising from whole-genome comparative studies including more pan-genome and population-based approaches. PMID:26578582

  3. Enhanced annotations and features for comparing thousands of Pseudomonas genomes in the Pseudomonas genome database

    PubMed Central

    Winsor, Geoffrey L.; Griffiths, Emma J.; Lo, Raymond; Dhillon, Bhavjinder K.; Shay, Julie A.; Brinkman, Fiona S. L.

    2016-01-01

    The Pseudomonas Genome Database (http://www.pseudomonas.com) is well known for the application of community-based annotation approaches for producing a high-quality Pseudomonas aeruginosa PAO1 genome annotation, and facilitating whole-genome comparative analyses with other Pseudomonas strains. To aid analysis of potentially thousands of complete and draft genome assemblies, this database and analysis platform was upgraded to integrate curated genome annotations and isolate metadata with enhanced tools for larger scale comparative analysis and visualization. Manually curated gene annotations are supplemented with improved computational analyses that help identify putative drug targets and vaccine candidates or assist with evolutionary studies by identifying orthologs, pathogen-associated genes and genomic islands. The database schema has been updated to integrate isolate metadata that will facilitate more powerful analysis of genomes across datasets in the future. We continue to place an emphasis on providing high-quality updates to gene annotations through regular review of the scientific literature and using community-based approaches including a major new Pseudomonas community initiative for the assignment of high-quality gene ontology terms to genes. As we further expand from thousands of genomes, we plan to provide enhancements that will aid data visualization and analysis arising from whole-genome comparative studies including more pan-genome and population-based approaches. PMID:26578582

  4. Using Comparative Genomics for Inquiry-Based Learning to Dissect Virulence of "Escherichia coli" O157:H7 and "Yersinia pestis"

    ERIC Educational Resources Information Center

    Baumler, David J.; Banta, Lois M.; Hung, Kai F.; Schwarz, Jodi A.; Cabot, Eric L.; Glasner, Jeremy D.; Perna, Nicole T.

    2012-01-01

    Genomics and bioinformatics are topics of increasing interest in undergraduate biological science curricula. Many existing exercises focus on gene annotation and analysis of a single genome. In this paper, we present two educational modules designed to enable students to learn and apply fundamental concepts in comparative genomics using examples…

  5. GenoSets: Visual Analytic Methods for Comparative Genomics

    PubMed Central

    Cain, Aurora A.; Kosara, Robert; Gibas, Cynthia J.

    2012-01-01

    Many important questions in biology are, fundamentally, comparative, and this extends to our analysis of a growing number of sequenced genomes. Existing genomic analysis tools are often organized around literal views of genomes as linear strings. Even when information is highly condensed, these views grow cumbersome as larger numbers of genomes are added. Data aggregation and summarization methods from the field of visual analytics can provide abstracted comparative views, suitable for sifting large multi-genome datasets to identify critical similarities and differences. We introduce a software system for visual analysis of comparative genomics data. The system automates the process of data integration, and provides the analysis platform to identify and explore features of interest within these large datasets. GenoSets borrows techniques from business intelligence and visual analytics to provide a rich interface of interactive visualizations supported by a multi-dimensional data warehouse. In GenoSets, visual analytic approaches are used to enable querying based on orthology, functional assignment, and taxonomic or user-defined groupings of genomes. GenoSets links this information together with coordinated, interactive visualizations for both detailed and high-level categorical analysis of summarized data. GenoSets has been designed to simplify the exploration of multiple genome datasets and to facilitate reasoning about genomic comparisons. Case examples are included showing the use of this system in the analysis of 12 Brucella genomes. GenoSets software and the case study dataset are freely available at http://genosets.uncc.edu. We demonstrate that the integration of genomic data using a coordinated multiple view approach can simplify the exploration of large comparative genomic data sets, and facilitate reasoning about comparisons and features of interest. PMID:23056299

  6. Phytozome System for Comparative Plant Genomics

    Energy Science and Technology Software Center (ESTSC)

    2011-09-27

    Phytozome is a joint project of the Department of Energy's Joint Genome Institute and the UC Berkeley Center for Integrative Genomics to facilitate comparative genomic studies amongst green plants. Families of orthologous and paralogous genes that represent the modern descendents of ancestral gene sets are constructed at key phylogenetic nodes. These families allow easy access to clade specific orthology/paralogy relationships as well as clade specific genes and gene expansions. As of release 7.0, Phytozome providesmore » access to twenty-five sequenced and annotated green plant genomes which have been clustered into gene families at eleven evolutionarily significant nodes., Where possible, each gene has been annotated with PFAM, KOG, KEGG, and PANTHER assignments, and publicly available annotations from RefSeq, UniProt, TAIR, JGI are lyper-linked and searchable.« less

  7. Phytozome System for Comparative Plant Genomics

    SciTech Connect

    2011-09-27

    Phytozome is a joint project of the Department of Energy's Joint Genome Institute and the UC Berkeley Center for Integrative Genomics to facilitate comparative genomic studies amongst green plants. Families of orthologous and paralogous genes that represent the modern descendents of ancestral gene sets are constructed at key phylogenetic nodes. These families allow easy access to clade specific orthology/paralogy relationships as well as clade specific genes and gene expansions. As of release 7.0, Phytozome provides access to twenty-five sequenced and annotated green plant genomes which have been clustered into gene families at eleven evolutionarily significant nodes., Where possible, each gene has been annotated with PFAM, KOG, KEGG, and PANTHER assignments, and publicly available annotations from RefSeq, UniProt, TAIR, JGI are lyper-linked and searchable.

  8. Homology-Independent Metrics for Comparative Genomics

    PubMed Central

    Coutinho, Tarcisio José Domingos; Franco, Glória Regina; Lobo, Francisco Pereira

    2015-01-01

    A mainstream procedure to analyze the wealth of genomic data available nowadays is the detection of homologous regions shared across genomes, followed by the extraction of biological information from the patterns of conservation and variation observed in such regions. Although of pivotal importance, comparative genomic procedures that rely on homology inference are obviously not applicable if no homologous regions are detectable. This fact excludes a considerable portion of “genomic dark matter” with no significant similarity — and, consequently, no inferred homology to any other known sequence — from several downstream comparative genomic methods. In this review we compile several sequence metrics that do not rely on homology inference and can be used to compare nucleotide sequences and extract biologically meaningful information from them. These metrics comprise several compositional parameters calculated from sequence data alone, such as GC content, dinucleotide odds ratio, and several codon bias metrics. They also share other interesting properties, such as pervasiveness (patterns persist on smaller scales) and phylogenetic signal. We also cite examples where these homology-independent metrics have been successfully applied to support several bioinformatics challenges, such as taxonomic classification of biological sequences without homology inference. They where also used to detect higher-order patterns of interactions in biological systems, ranging from detecting coevolutionary trends between the genomes of viruses and their hosts to characterization of gene pools of entire microbial communities. We argue that, if correctly understood and applied, homology-independent metrics can add important layers of biological information in comparative genomic studies without prior homology inference. PMID:26029354

  9. Sinbase: an integrated database to study genomics, genetics and comparative genomics in Sesamum indicum.

    PubMed

    Wang, Linhai; Yu, Jingyin; Li, Donghua; Zhang, Xiurong

    2015-01-01

    Sesame (Sesamum indicum L.) is an ancient and important oilseed crop grown widely in tropical and subtropical areas. It belongs to the gigantic order Lamiales, which includes many well-known or economically important species, such as olive (Olea europaea), leonurus (Leonurus japonicus) and lavender (Lavandula spica), many of which have important pharmacological properties. Despite their importance, genetic and genomic analyses on these species have been insufficient due to a lack of reference genome information. The now available S. indicum genome will provide an unprecedented opportunity for studying both S. indicum genetic traits and comparative genomics. To deliver S. indicum genomic information to the worldwide research community, we designed Sinbase, a web-based database with comprehensive sesame genomic, genetic and comparative genomic information. Sinbase includes sequences of assembled sesame pseudomolecular chromosomes, protein-coding genes (27,148), transposable elements (372,167) and non-coding RNAs (1,748). In particular, Sinbase provides unique and valuable information on colinear regions with various plant genomes, including Arabidopsis thaliana, Glycine max, Vitis vinifera and Solanum lycopersicum. Sinbase also provides a useful search function and data mining tools, including a keyword search and local BLAST service. Sinbase will be updated regularly with new features, improvements to genome annotation and new genomic sequences, and is freely accessible at http://ocri-genomics.org/Sinbase/. PMID:25480115

  10. A combinatorial approach of comprehensive QTL-based comparative genome mapping and transcript profiling identified a seed weight-regulating candidate gene in chickpea.

    PubMed

    Bajaj, Deepak; Upadhyaya, Hari D; Khan, Yusuf; Das, Shouvik; Badoni, Saurabh; Shree, Tanima; Kumar, Vinod; Tripathi, Shailesh; Gowda, C L L; Singh, Sube; Sharma, Shivali; Tyagi, Akhilesh K; Chattopdhyay, Debasis; Parida, Swarup K

    2015-01-01

    High experimental validation/genotyping success rate (94-96%) and intra-specific polymorphic potential (82-96%) of 1536 SNP and 472 SSR markers showing in silico polymorphism between desi ICC 4958 and kabuli ICC 12968 chickpea was obtained in a 190 mapping population (ICC 4958 × ICC 12968) and 92 diverse desi and kabuli genotypes. A high-density 2001 marker-based intra-specific genetic linkage map comprising of eight LGs constructed is comparatively much saturated (mean map-density: 0.94 cM) in contrast to existing intra-specific genetic maps in chickpea. Fifteen robust QTLs (PVE: 8.8-25.8% with LOD: 7.0-13.8) associated with pod and seed number/plant (PN and SN) and 100 seed weight (SW) were identified and mapped on 10 major genomic regions of eight LGs. One of 126.8 kb major genomic region harbouring a strong SW-associated robust QTL (Caq'SW1.1: 169.1-171.3 cM) has been delineated by integrating high-resolution QTL mapping with comprehensive marker-based comparative genome mapping and differential expression profiling. This identified one potential regulatory SNP (G/A) in the cis-acting element of candidate ERF (ethylene responsive factor) TF (transcription factor) gene governing seed weight in chickpea. The functionally relevant molecular tags identified have potential to be utilized for marker-assisted genetic improvement of chickpea. PMID:25786576

  11. Comparative genomic hybridization with single cells after whole genome amplification

    SciTech Connect

    Haddad, B.R.; Baldini, A.; Hughes, M.R.

    1994-09-01

    Conventional karyotype analysis is the ideal way to diagnose chromosomal imbalances. However it requires cell culture and chromosome preparation. There are instances where a very small number of cells are available for cytogenetic evaluation and chromosomes cannot be obtained. Comparative genomic hybridization (CGH) is a novel molecular cytogenetic technique that provides information about genetic imbalances affecting the genome. The power of this technique lies in its ability to detect genetic imbalances using total genomic DNA. We have previously demonstrated the feasibility of whole genome amplification from single cells for subsequent analysis of multiple genetic loci by PCR. In this present work, we combine whole genome amplification with CGH to detect chromosomal imbalances from small numbers of cells. Both cytogenetically normal and abnormal cells were individually picked by micromanipulation and subjected to whole genome amplification using random oligonucleotide primers. Amplified test and control DNA were differentially labeled by incorporation of digoxigenin or biotin, mixed together and hybridized to normal male metaphase spreads. Hybridization was detected with two fluorochromes, rhodamine-anti-digoxigenin and FITC -Avidin. Ratio of intensities of the two fluorochromes along the target chromosomes was analyzed using locally developed computer imaging software. Using the combination of whole genome amplification and CGH, we were able to detect different chromosomal aneuploidies from 30, 20, and 10 cells. It can also be applied to the analysis of fetal cells sorted from maternal circulation, or to tumor cells obtained from needle biopsies or from different body fluids and effusions. Finally, its successful application to single cells will have a great impact on preimplantation diagnosis.

  12. VISTA - computational tools for comparative genomics

    SciTech Connect

    Frazer, Kelly A.; Pachter, Lior; Poliakov, Alexander; Rubin,Edward M.; Dubchak, Inna

    2004-01-01

    Comparison of DNA sequences from different species is a fundamental method for identifying functional elements in genomes. Here we describe the VISTA family of tools created to assist biologists in carrying out this task. Our first VISTA server at http://www-gsd.lbl.gov/VISTA/ was launched in the summer of 2000 and was designed to align long genomic sequences and visualize these alignments with associated functional annotations. Currently the VISTA site includes multiple comparative genomics tools and provides users with rich capabilities to browse pre-computed whole-genome alignments of large vertebrate genomes and other groups of organisms with VISTA Browser, submit their own sequences of interest to several VISTA servers for various types of comparative analysis, and obtain detailed comparative analysis results for a set of cardiovascular genes. We illustrate capabilities of the VISTA site by the analysis of a 180 kilobase (kb) interval on human chromosome 5 that encodes for the kinesin family member3A (KIF3A) protein.

  13. The Aedes aegypti genome: a comparative perspective.

    PubMed

    Waterhouse, R M; Wyder, S; Zdobnov, E M

    2008-02-01

    The sequencing of the second mosquito genome, Aedes aegypti, in addition to Anopheles gambiae, is a major milestone that will drive molecular-level and genome-wide high-throughput studies of not only these but also other mosquito vectors of human pathogens. Here we overview the ancestry of the mosquito genes, list the major expansions of gene families that may relate to species adaptation processes, as exemplified by CYP9 cytochrome P450 genes, and discuss the conservation of chromosomal gene arrangements among the two mosquitoes and fruit fly. Many more invertebrate genomes are expected to be sequenced in the near future, including additional vectors of human pathogens (see http://www.vectorbase.org), and further comparative analyses will become increasingly refined and informative, hopefully improving our understanding of the genetic basis of phenotypical differences among these species, their vectorial capacity, and ultimately leading to the development of novel disease control strategies. PMID:18237279

  14. [Comparative genomic classification of human hepatocellular carcinoma].

    PubMed

    Kaposi-Novák, Pál

    2009-03-01

    , increased microvessel density, and shortened survival. A prediction model based on 111 cross-species conserved c-Met signature genes was able to diversify HCC patients into good and bad prognostic groups with 83-95% accuracy. Our results therefore demonstrate that careful experimental design and state-of-the-art laboratory methods could open the way for global expression profiling of archived and limited availability pathologic samples. Comparative functional genomics based analysis of the cancer transcriptome could lead to novel molecular classification systems which are essential for the introduction of individualized cancer therapeutics. PMID:19318328

  15. Comparative genomics of brain size evolution

    PubMed Central

    Enard, Wolfgang

    2014-01-01

    Which genetic changes took place during mammalian, primate and human evolution to build a larger brain? To answer this question, one has to correlate genetic changes with brain size changes across a phylogeny. Such a comparative genomics approach provides unique information to better understand brain evolution and brain development. However, its statistical power is limited for example due to the limited number of species, the presumably complex genetics of brain size evolution and the large search space of mammalian genomes. Hence, it is crucial to add functional information, for example by limiting the search space to genes and regulatory elements known to play a role in the relevant cell types during brain development. Similarly, it is crucial to experimentally follow up on hypotheses generated by such a comparative approach. Recent progress in understanding the molecular and cellular mechanisms of mammalian brain development, in genome sequencing and in genome editing, promises to make a close integration of evolutionary and experimental methods a fruitful approach to better understand the genetics of mammalian brain size evolution. PMID:24904382

  16. Mycobacterial species as case-study of comparative genome analysis.

    PubMed

    Zakham, F; Belayachi, L; Ussery, D; Akrim, M; Benjouad, A; El Aouad, R; Ennaji, M M

    2011-01-01

    The genus Mycobacterium represents more than 120 species including important pathogens of human and cause major public health problems and illnesses. Further, with more than 100 genome sequences from this genus, comparative genome analysis can provide new insights for better understanding the evolutionary events of these species and improving drugs, vaccines, and diagnostics tools for controlling Mycobacterial diseases. In this present study we aim to outline a comparative genome analysis of fourteen Mycobacterial genomes: M. avium subsp. paratuberculosis K—10, M. bovis AF2122/97, M. bovis BCG str. Pasteur 1173P2, M. leprae Br4923, M. marinum M, M. sp. KMS, M. sp. MCS, M. tuberculosis CDC1551, M. tuberculosis F11, M. tuberculosis H37Ra, M. tuberculosis H37Rv, M. tuberculosis KZN 1435 , M. ulcerans Agy99,and M. vanbaalenii PYR—1, For this purpose a comparison has been done based on their length of genomes, GC content, number of genes in different data bases (Genbank, Refseq, and Prodigal). The BLAST matrix of these genomes has been figured to give a lot of information about the similarity between species in a simple scheme. As a result of multiple genome analysis, the pan and core genome have been defined for twelve Mycobacterial species. We have also introduced the genome atlas of the reference strain M. tuberculosis H37Rv which can give a good overview of this genome. And for examining the phylogenetic relationships among these bacteria, a phylogenic tree has been constructed from 16S rRNA gene for tuberculosis and non tuberculosis Mycobacteria to understand the evolutionary events of these species. PMID:21396338

  17. Comparative genomics tools applied to bioterrorism defence.

    PubMed

    Slezak, Tom; Kuczmarski, Tom; Ott, Linda; Torres, Clinton; Medeiros, Dan; Smith, Jason; Truitt, Brian; Mulakken, Nisha; Lam, Marisa; Vitalis, Elizabeth; Zemla, Adam; Zhou, Carol Ecale; Gardner, Shea

    2003-06-01

    Rapid advances in the genomic sequencing of bacteria and viruses over the past few years have made it possible to consider sequencing the genomes of all pathogens that affect humans and the crops and livestock upon which our lives depend. Recent events make it imperative that full genome sequencing be accomplished as soon as possible for pathogens that could be used as weapons of mass destruction or disruption. This sequence information must be exploited to provide rapid and accurate diagnostics to identify pathogens and distinguish them from harmless near-neighbours and hoaxes. The Chem-Bio Non-Proliferation (CBNP) programme of the US Department of Energy (DOE) began a large-scale effort of pathogen detection in early 2000 when it was announced that the DOE would be providing bio-security at the 2002 Winter Olympic Games in Salt Lake City, Utah. Our team at the Lawrence Livermore National Lab (LLNL) was given the task of developing reliable and validated assays for a number of the most likely bioterrorist agents. The short timeline led us to devise a novel system that utilised whole-genome comparison methods to rapidly focus on parts of the pathogen genomes that had a high probability of being unique. Assays developed with this approach have been validated by the Centers for Disease Control (CDC). They were used at the 2002 Winter Olympics, have entered the public health system, and have been in continual use for non-publicised aspects of homeland defence since autumn 2001. Assays have been developed for all major threat list agents for which adequate genomic sequence is available, as well as for other pathogens requested by various government agencies. Collaborations with comparative genomics algorithm developers have enabled our LLNL team to make major advances in pathogen detection, since many of the existing tools simply did not scale well enough to be of practical use for this application. It is hoped that a discussion of a real-life practical application of

  18. SNUGB: a versatile genome browser supporting comparative and functional fungal genomics

    PubMed Central

    Jung, Kyongyong; Park, Jongsun; Choi, Jaeyoung; Park, Bongsoo; Kim, Seungill; Ahn, Kyohun; Choi, Jaehyuk; Choi, Doil; Kang, Seogchan; Lee, Yong-Hwan

    2008-01-01

    Background Since the full genome sequences of Saccharomyces cerevisiae were released in 1996, genome sequences of over 90 fungal species have become publicly available. The heterogeneous formats of genome sequences archived in different sequencing centers hampered the integration of the data for efficient and comprehensive comparative analyses. The Comparative Fungal Genomics Platform (CFGP) was developed to archive these data via a single standardized format that can support multifaceted and integrated analyses of the data. To facilitate efficient data visualization and utilization within and across species based on the architecture of CFGP and associated databases, a new genome browser was needed. Results The Seoul National University Genome Browser (SNUGB) integrates various types of genomic information derived from 98 fungal/oomycete (137 datasets) and 34 plant and animal (38 datasets) species, graphically presents germane features and properties of each genome, and supports comparison between genomes. The SNUGB provides three different forms of the data presentation interface, including diagram, table, and text, and six different display options to support visualization and utilization of the stored information. Information for individual species can be quickly accessed via a new tool named the taxonomy browser. In addition, SNUGB offers four useful data annotation/analysis functions, including 'BLAST annotation.' The modular design of SNUGB makes its adoption to support other comparative genomic platforms easy and facilitates continuous expansion. Conclusion The SNUGB serves as a powerful platform supporting comparative and functional genomics within the fungal kingdom and also across other kingdoms. All data and functions are available at the web site . PMID:19055845

  19. The MicrobesOnline Web site for comparative genomics

    SciTech Connect

    Alm, Eric J.; Huang, Katherine H.; Price, Morgan N.; Koche,Richard P.; Keller, Keith; Dubchak, Inna L.; Arkin, Adam P.

    2004-11-05

    At present, hundreds of microbial genomes have been sequenced, and hundreds more are currently in the pipeline. The Virtual Institute for Microbial Stress and Survival has developed a publicly available suite of Web-based comparative genomic tools (http://www.microbesonline.org) designed to facilitate multispecies comparison among prokaryotes. Highlights of the Microbes Online Web site include operon and regulon predictions, a multispecies genome browser, a multispecies Gene Ontology browser, a comparative KEGG metabolic pathway viewer, a Bioinformatics Workbench for in-depth sequence analysis, and Gene Carts that allow users to save genes of interest for further study while they browse. In addition, we provide an interface for genome annotation, which like all of the tools reported here, is freely available to the scientific community.

  20. COMPARISON OF COMPARATIVE GENOMIC HYBRIDIZATIONS TECHNOLOGIES ACROSS MICROARRAY PLATFORMS

    EPA Science Inventory

    Comparative Genomic Hybridization (CGH) measures DNA copy number differences between a reference genome and a test genome. The DNA samples are differentially labeled and hybridized to an immobilized substrate. In early CGH experiments, the DNA targets were hybridized to metaphase...

  1. Evolutionary and comparative analyses of the soybean genome

    PubMed Central

    Cannon, Steven B.; Shoemaker, Randy C.

    2012-01-01

    The soybean genome assembly has been available since the end of 2008. Significant features of the genome include large, gene-poor, repeat-dense pericentromeric regions, spanning roughly 57% of the genome sequence; a relatively large genome size of ~1.15 billion bases; remnants of a genome duplication that occurred ~13 million years ago (Mya); and fainter remnants of older polyploidies that occurred ~58 Mya and >130 Mya. The genome sequence has been used to identify the genetic basis for numerous traits, including disease resistance, nutritional characteristics, and developmental features. The genome sequence has provided a scaffold for placement of many genomic feature elements, both from within soybean and from related species. These may be accessed at several websites, including http://www.phytozome.net, http://soybase.org, http://comparative-legumes.org, and http://www.legumebase.brc.miyazaki-u.ac.jp. The taxonomic position of soybean in the Phaseoleae tribe of the legumes means that there are approximately two dozen other beans and relatives that have undergone independent domestication, and which may have traits that will be useful for transfer to soybean. Methods of translating information between species in the Phaseoleae range from design of markers for marker assisted selection, to transformation with Agrobacterium or with other experimental transformation methods. PMID:23136483

  2. Comparative genomics of biotechnologically important yeasts.

    PubMed

    Riley, Robert; Haridas, Sajeet; Wolfe, Kenneth H; Lopes, Mariana R; Hittinger, Chris Todd; Göker, Markus; Salamov, Asaf A; Wisecaver, Jennifer H; Long, Tanya M; Calvey, Christopher H; Aerts, Andrea L; Barry, Kerrie W; Choi, Cindy; Clum, Alicia; Coughlan, Aisling Y; Deshpande, Shweta; Douglass, Alexander P; Hanson, Sara J; Klenk, Hans-Peter; LaButti, Kurt M; Lapidus, Alla; Lindquist, Erika A; Lipzen, Anna M; Meier-Kolthoff, Jan P; Ohm, Robin A; Otillar, Robert P; Pangilinan, Jasmyn L; Peng, Yi; Rokas, Antonis; Rosa, Carlos A; Scheuner, Carmen; Sibirny, Andriy A; Slot, Jason C; Stielow, J Benjamin; Sun, Hui; Kurtzman, Cletus P; Blackwell, Meredith; Grigoriev, Igor V; Jeffries, Thomas W

    2016-08-30

    Ascomycete yeasts are metabolically diverse, with great potential for biotechnology. Here, we report the comparative genome analysis of 29 taxonomically and biotechnologically important yeasts, including 16 newly sequenced. We identify a genetic code change, CUG-Ala, in Pachysolen tannophilus in the clade sister to the known CUG-Ser clade. Our well-resolved yeast phylogeny shows that some traits, such as methylotrophy, are restricted to single clades, whereas others, such as l-rhamnose utilization, have patchy phylogenetic distributions. Gene clusters, with variable organization and distribution, encode many pathways of interest. Genomics can predict some biochemical traits precisely, but the genomic basis of others, such as xylose utilization, remains unresolved. Our data also provide insight into early evolution of ascomycetes. We document the loss of H3K9me2/3 heterochromatin, the origin of ascomycete mating-type switching, and panascomycete synteny at the MAT locus. These data and analyses will facilitate the engineering of efficient biosynthetic and degradative pathways and gateways for genomic manipulation. PMID:27535936

  3. Gramene 2016: comparative plant genomics and pathway resources

    PubMed Central

    Tello-Ruiz, Marcela K.; Stein, Joshua; Wei, Sharon; Preece, Justin; Olson, Andrew; Naithani, Sushma; Amarasinghe, Vindhya; Dharmawardhana, Palitha; Jiao, Yinping; Mulvaney, Joseph; Kumari, Sunita; Chougule, Kapeel; Elser, Justin; Wang, Bo; Thomason, James; Bolser, Daniel M.; Kerhornou, Arnaud; Walts, Brandon; Fonseca, Nuno A.; Huerta, Laura; Keays, Maria; Tang, Y. Amy; Parkinson, Helen; Fabregat, Antonio; McKay, Sheldon; Weiser, Joel; D'Eustachio, Peter; Stein, Lincoln; Petryszak, Robert; Kersey, Paul J.; Jaiswal, Pankaj; Ware, Doreen

    2016-01-01

    Gramene (http://www.gramene.org) is an online resource for comparative functional genomics in crops and model plant species. Its two main frameworks are genomes (collaboration with Ensembl Plants) and pathways (The Plant Reactome and archival BioCyc databases). Since our last NAR update, the database website adopted a new Drupal management platform. The genomes section features 39 fully assembled reference genomes that are integrated using ontology-based annotation and comparative analyses, and accessed through both visual and programmatic interfaces. Additional community data, such as genetic variation, expression and methylation, are also mapped for a subset of genomes. The Plant Reactome pathway portal (http://plantreactome.gramene.org) provides a reference resource for analyzing plant metabolic and regulatory pathways. In addition to ∼200 curated rice reference pathways, the portal hosts gene homology-based pathway projections for 33 plant species. Both the genome and pathway browsers interface with the EMBL-EBI's Expression Atlas to enable the projection of baseline and differential expression data from curated expression studies in plants. Gramene's archive website (http://archive.gramene.org) continues to provide previously reported resources on comparative maps, markers and QTL. To further aid our users, we have also introduced a live monthly educational webinar series and a Gramene YouTube channel carrying video tutorials. PMID:26553803

  4. Gramene 2016: comparative plant genomics and pathway resources.

    PubMed

    Tello-Ruiz, Marcela K; Stein, Joshua; Wei, Sharon; Preece, Justin; Olson, Andrew; Naithani, Sushma; Amarasinghe, Vindhya; Dharmawardhana, Palitha; Jiao, Yinping; Mulvaney, Joseph; Kumari, Sunita; Chougule, Kapeel; Elser, Justin; Wang, Bo; Thomason, James; Bolser, Daniel M; Kerhornou, Arnaud; Walts, Brandon; Fonseca, Nuno A; Huerta, Laura; Keays, Maria; Tang, Y Amy; Parkinson, Helen; Fabregat, Antonio; McKay, Sheldon; Weiser, Joel; D'Eustachio, Peter; Stein, Lincoln; Petryszak, Robert; Kersey, Paul J; Jaiswal, Pankaj; Ware, Doreen

    2016-01-01

    Gramene (http://www.gramene.org) is an online resource for comparative functional genomics in crops and model plant species. Its two main frameworks are genomes (collaboration with Ensembl Plants) and pathways (The Plant Reactome and archival BioCyc databases). Since our last NAR update, the database website adopted a new Drupal management platform. The genomes section features 39 fully assembled reference genomes that are integrated using ontology-based annotation and comparative analyses, and accessed through both visual and programmatic interfaces. Additional community data, such as genetic variation, expression and methylation, are also mapped for a subset of genomes. The Plant Reactome pathway portal (http://plantreactome.gramene.org) provides a reference resource for analyzing plant metabolic and regulatory pathways. In addition to ∼ 200 curated rice reference pathways, the portal hosts gene homology-based pathway projections for 33 plant species. Both the genome and pathway browsers interface with the EMBL-EBI's Expression Atlas to enable the projection of baseline and differential expression data from curated expression studies in plants. Gramene's archive website (http://archive.gramene.org) continues to provide previously reported resources on comparative maps, markers and QTL. To further aid our users, we have also introduced a live monthly educational webinar series and a Gramene YouTube channel carrying video tutorials. PMID:26553803

  5. Genome-wide comparative analysis of the Brassica rapa gene space reveals genome shrinkage and differential loss of duplicated genes after whole genome triplication

    PubMed Central

    2009-01-01

    Background Brassica rapa is one of the most economically important vegetable crops worldwide. Owing to its agronomic importance and phylogenetic position, B. rapa provides a crucial reference to understand polyploidy-related crop genome evolution. The high degree of sequence identity and remarkably conserved genome structure between Arabidopsis and Brassica genomes enables comparative tiling sequencing using Arabidopsis sequences as references to select the counterpart regions in B. rapa, which is a strong challenge of structural and comparative crop genomics. Results We assembled 65.8 megabase-pairs of non-redundant euchromatic sequence of B. rapa and compared this sequence to the Arabidopsis genome to investigate chromosomal relationships, macrosynteny blocks, and microsynteny within blocks. The triplicated B. rapa genome contains only approximately twice the number of genes as in Arabidopsis because of genome shrinkage. Genome comparisons suggest that B. rapa has a distinct organization of ancestral genome blocks as a result of recent whole genome triplication followed by a unique diploidization process. A lack of the most recent whole genome duplication (3R) event in the B. rapa genome, atypical of other Brassica genomes, may account for the emergence of B. rapa from the Brassica progenitor around 8 million years ago. Conclusions This work demonstrates the potential of using comparative tiling sequencing for genome analysis of crop species. Based on a comparative analysis of the B. rapa sequences and the Arabidopsis genome, it appears that polyploidy and chromosomal diploidization are ongoing processes that collectively stabilize the B. rapa genome and facilitate its evolution. PMID:19821981

  6. Comparative Genomic and Phylogenomic Analyses Reveal a Conserved Core Genome Shared by Estuarine and Oceanic Cyanopodoviruses

    PubMed Central

    Huang, Sijun; Zhang, Si; Jiao, Nianzhi; Chen, Feng

    2015-01-01

    Podoviruses are among the major viral groups that infect marine picocyanobacteria Prochlorococcus and Synechococcus. Here, we reported the genome sequences of five Synechococcus podoviruses isolated from the estuarine environment, and performed comparative genomic and phylogenomic analyses based on a total of 20 cyanopodovirus genomes. The genomes of all the known marine cyanopodoviruses are highly syntenic. A pan-genome of 349 clustered orthologous groups was determined, among which 15 were core genes. These core genes make up nearly half of each genome in length, reflecting the high level of genome conservation among this cyanophage type. The whole genome phylogenies based on concatenated core genes and gene content were highly consistent and confirmed the separation of two discrete marine cyanopodovirus clusters MPP-A and MPP-B. The genomes within cluster MPP-B grouped into subclusters mainly corresponding to Prochlorococcus or Synechococcus host types. Auxiliary metabolic genes tend to occur in a specific phylogenetic group of these cyanopodoviruses. All the MPP-B phages analyzed here encode the photosynthesis gene psbA, which are absent in all the MPP-A genomes thus far. Interestingly, all the MPP-B and two MPP-A Synechococcus podoviruses encode the thymidylate synthase gene thyX, while at the same genome locus all the MPP-B Prochlorococcus podoviruses encode the transaldolase gene talC. Both genes are hypothesized to have the potential to facilitate the biosynthesis of deoxynucleotide for phage replication. Inheritance of specific functional genes could be important to the evolution and ecological fitness of certain cyanophage genotypes. Our analyses demonstrate that cyanopodoviruses of estuarine and oceanic origins share a conserved core genome and suggest that accessory genes may be related to environmental adaptation. PMID:26569403

  7. Insights into mutualism mechanism and versatile metabolism of Ketogulonicigenium vulgare Hbe602 based on comparative genomics and metabolomics studies.

    PubMed

    Jia, Nan; Ding, Ming-Zhu; Du, Jin; Pan, Cai-Hui; Tian, Geng; Lang, Ji-Dong; Fang, Jian-Huo; Gao, Feng; Yuan, Ying-Jin

    2016-01-01

    Ketogulonicigenium vulgare has been widely used in vitamin C two steps fermentation and requires companion strain for optimal growth. However, the understanding of K. vulgare as well as its companion strain is still preliminary. Here, the complete genome of K. vulgare Hbe602 was deciphered to provide insight into the symbiosis mechanism and the versatile metabolism. K. vulgare contains the LuxR family proteins, chemokine proteins, flagellar structure proteins, peptides and transporters for symbiosis consortium. Besides, the growth state and metabolite variation of K. vulgare were observed when five carbohydrates (D-sorbitol, L-sorbose, D-glucose, D-fructose and D-mannitol) were used as carbon source. The growth increased by 40.72% and 62.97% respectively when K. vulgare was cultured on D-mannitol/D-sorbitol than on L-sorbose. The insufficient metabolism of carbohydrates, amino acids and vitamins is the main reason for the slow growth of K. vulgare. The combined analysis of genomics and metabolomics indicated that TCA cycle, amino acid and nucleotide metabolism were significantly up-regulated when K. vulgare was cultured on the D-mannitol/D-sorbitol, which facilitated the better growth. The present study would be helpful to further understand its metabolic structure and guide the engineering transformation. PMID:26979567

  8. Insights into mutualism mechanism and versatile metabolism of Ketogulonicigenium vulgare Hbe602 based on comparative genomics and metabolomics studies

    PubMed Central

    Jia, Nan; Ding, Ming-Zhu; Du, Jin; Pan, Cai-Hui; Tian, Geng; Lang, Ji-Dong; Fang, Jian-Huo; Gao, Feng; Yuan, Ying-Jin

    2016-01-01

    Ketogulonicigenium vulgare has been widely used in vitamin C two steps fermentation and requires companion strain for optimal growth. However, the understanding of K. vulgare as well as its companion strain is still preliminary. Here, the complete genome of K. vulgare Hbe602 was deciphered to provide insight into the symbiosis mechanism and the versatile metabolism. K. vulgare contains the LuxR family proteins, chemokine proteins, flagellar structure proteins, peptides and transporters for symbiosis consortium. Besides, the growth state and metabolite variation of K. vulgare were observed when five carbohydrates (D-sorbitol, L-sorbose, D-glucose, D-fructose and D-mannitol) were used as carbon source. The growth increased by 40.72% and 62.97% respectively when K. vulgare was cultured on D-mannitol/D-sorbitol than on L-sorbose. The insufficient metabolism of carbohydrates, amino acids and vitamins is the main reason for the slow growth of K. vulgare. The combined analysis of genomics and metabolomics indicated that TCA cycle, amino acid and nucleotide metabolism were significantly up-regulated when K. vulgare was cultured on the D-mannitol/D-sorbitol, which facilitated the better growth. The present study would be helpful to further understand its metabolic structure and guide the engineering transformation. PMID:26979567

  9. Image analysis in comparative genomic hybridization

    SciTech Connect

    Lundsteen, C.; Maahr, J.; Christensen, B.

    1995-01-01

    Comparative genomic hybridization (CGH) is a new technique by which genomic imbalances can be detected by combining in situ suppression hybridization of whole genomic DNA and image analysis. We have developed software for rapid, quantitative CGH image analysis by a modification and extension of the standard software used for routine karyotyping of G-banded metaphase spreads in the Magiscan chromosome analysis system. The DAPI-counterstained metaphase spread is karyotyped interactively. Corrections for image shifts between the DAPI, FITC, and TRITC images are done manually by moving the three images relative to each other. The fluorescence background is subtracted. A mean filter is applied to smooth the FITC and TRITC images before the fluorescence ratio between the individual FITC and TRITC-stained chromosomes is computed pixel by pixel inside the area of the chromosomes determined by the DAPI boundaries. Fluorescence intensity ratio profiles are generated, and peaks and valleys indicating possible gains and losses of test DNA are marked if they exceed ratios below 0.75 and above 1.25. By combining the analysis of several metaphase spreads, consistent findings of gains and losses in all or almost all spreads indicate chromosomal imbalance. Chromosomal imbalances are detected either by visual inspection of fluorescence ratio (FR) profiles or by a statistical approach that compares FR measurements of the individual case with measurements of normal chromosomes. The complete analysis of one metaphase can be carried out in approximately 10 minutes. 8 refs., 7 figs., 1 tab.

  10. Comparative genome map of human and cattle

    SciTech Connect

    Solinas-Toldo, S.; Fries, R.; Lengauer, C.

    1995-06-10

    Chromosomal homologies between individual human chromosomes and the bovine karyotype have been established by using a new approach termed Zoo-FISH. Labeled DNA libraries from flow-sorted human chromosomes were used as probes for fluorescence in situ hybridization on cattle chromosomes. All human DNA libraries, except the Y chromosome library, hybridized to one or more cattle chromosomes, identifying and delineating 50 segments of homology, most of them corresponding to the regions of homology as identified by the previous mapping of individual conserved loci. However, Zoo-FISH refines the comparative maps constructed by molecular gene mapping of individual loci by providing information on the boundaries of conserved regions in the absence of obvious cytogenetic homologies of human and bovine chromosomes. It allows study of karyotypic evolution and opens new avenues for genomic analysis by facilitating the extrapolation of results from the human genome initiative. 50 refs., 3 figs., 1 tab.

  11. Evolutionary and comparative analyses of the soybean genome

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The soybean genome assembly has been available since the end of 2008. Significant features of the genome include large, gene-poor, repeat-dense pericentromeric regions, spanning roughly 57% of the genome sequence; a relatively large genome size of ~1.15 billion bases; remnants of a genome duplicatio...

  12. Genome evolution in the eremothecium clade of the Saccharomyces complex revealed by comparative genomics.

    PubMed

    Wendland, Jürgen; Walther, Andrea

    2011-12-01

    We used comparative genomics to elucidate the genome evolution within the pre-whole-genome duplication genus Eremothecium. To this end, we sequenced and assembled the complete genome of Eremothecium cymbalariae, a filamentous ascomycete representing the Eremothecium type strain. Genome annotation indicated 4712 gene models and 143 tRNAs. We compared the E. cymbalariae genome with that of its relative, the riboflavin overproducer Ashbya (Eremothecium) gossypii, and the reconstructed yeast ancestor. Decisive changes in the Eremothecium lineage leading to the evolution of the A. gossypii genome include the reduction from eight to seven chromosomes, the downsizing of the genome by removal of 10% or 900 kb of DNA, mostly in intergenic regions, the loss of a TY3-Gypsy-type transposable element, the re-arrangement of mating-type loci, and a massive increase of its GC content. Key species-specific events are the loss of MNN1-family of mannosyltransferases required to add the terminal fourth and fifth α-1,3-linked mannose residue to O-linked glycans and genes of the Ehrlich pathway in E. cymbalariae and the loss of ZMM-family of meiosis-specific proteins and acquisition of riboflavin overproduction in A. gossypii. This reveals that within the Saccharomyces complex genome, evolution is not only based on genome duplication with subsequent gene deletions and chromosomal rearrangements but also on fungi associated with specific environments (e.g. involving fungal-insect interactions as in Eremothecium), which have encountered challenges that may be reflected both in genome streamlining and their biosynthetic potential. PMID:22384365

  13. Genome Evolution in the Eremothecium Clade of the Saccharomyces Complex Revealed by Comparative Genomics

    PubMed Central

    Wendland, Jürgen; Walther, Andrea

    2011-01-01

    We used comparative genomics to elucidate the genome evolution within the pre–whole-genome duplication genus Eremothecium. To this end, we sequenced and assembled the complete genome of Eremothecium cymbalariae, a filamentous ascomycete representing the Eremothecium type strain. Genome annotation indicated 4712 gene models and 143 tRNAs. We compared the E. cymbalariae genome with that of its relative, the riboflavin overproducer Ashbya (Eremothecium) gossypii, and the reconstructed yeast ancestor. Decisive changes in the Eremothecium lineage leading to the evolution of the A. gossypii genome include the reduction from eight to seven chromosomes, the downsizing of the genome by removal of 10% or 900 kb of DNA, mostly in intergenic regions, the loss of a TY3-Gypsy–type transposable element, the re-arrangement of mating-type loci, and a massive increase of its GC content. Key species-specific events are the loss of MNN1-family of mannosyltransferases required to add the terminal fourth and fifth α-1,3-linked mannose residue to O-linked glycans and genes of the Ehrlich pathway in E. cymbalariae and the loss of ZMM-family of meiosis-specific proteins and acquisition of riboflavin overproduction in A. gossypii. This reveals that within the Saccharomyces complex genome, evolution is not only based on genome duplication with subsequent gene deletions and chromosomal rearrangements but also on fungi associated with specific environments (e.g. involving fungal-insect interactions as in Eremothecium), which have encountered challenges that may be reflected both in genome streamlining and their biosynthetic potential. PMID:22384365

  14. Assigning protein functions by comparative genome analysis protein phylogenetic profiles

    DOEpatents

    Pellegrini, Matteo; Marcotte, Edward M.; Thompson, Michael J.; Eisenberg, David; Grothe, Robert; Yeates, Todd O.

    2003-05-13

    A computational method system, and computer program are provided for inferring functional links from genome sequences. One method is based on the observation that some pairs of proteins A' and B' have homologs in another organism fused into a single protein chain AB. A trans-genome comparison of sequences can reveal these AB sequences, which are Rosetta Stone sequences because they decipher an interaction between A' and B. Another method compares the genomic sequence of two or more organisms to create a phylogenetic profile for each protein indicating its presence or absence across all the genomes. The profile provides information regarding functional links between different families of proteins. In yet another method a combination of the above two methods is used to predict functional links.

  15. Comparative genomic hybridization (CGH) in genotoxicology.

    PubMed

    Baumgartner, Adolf

    2013-01-01

    In the past two decades comparative genomic hybridization (CGH) and array CGH have become crucial and indispensable tools in clinical diagnostics. Initially developed for the genome-wide screening of chromosomal imbalances in tumor cells, CGH as well as array CGH have also been employed in genotoxicology and most recently in toxicogenomics. The latter methodology allows a multi-endpoint analysis of how genes and proteins react to toxic agents revealing molecular mechanisms of toxicology. This chapter provides a background on the use of CGH and array CGH in the context of genotoxicology as well as a protocol for conventional CGH to understand the basic principles of CGH. Array CGH is still cost intensive and requires suitable analytical algorithms but might become the dominating assay in the future when more companies provide a large variety of different commercial DNA arrays/chips leading to lower costs for array CGH equipment as well as consumables such as DNA chips. As the amount of data generated with microarrays exponentially grows, the demand for powerful adaptive algorithms for analysis, competent databases, as well as a sound regulatory framework will also increase. Nevertheless, chromosomal and array CGH are being demonstrated to be effective tools for investigating copy number changes/variations in the whole genome, DNA expression patterns, as well as loss of heterozygosity after a genotoxic impact. This will lead to new insights into affected genes and the underlying structures of regulatory and signaling pathways in genotoxicology and could conclusively identify yet unknown harmful toxicants. PMID:23896881

  16. Array-based comparative genomic hybridization in early-stage mycosis fungoides: recurrent deletion of tumor suppressor genes BCL7A, SMAC/DIABLO, and RHOF.

    PubMed

    Carbone, Angelo; Bernardini, Laura; Valenzano, Francesco; Bottillo, Irene; De Simone, Clara; Capizzi, Rodolfo; Capalbo, Anna; Romano, Francesca; Novelli, Antonio; Dallapiccola, Bruno; Amerio, Pierluigi

    2008-12-01

    The etiology of mycosis fungoides (MF), the most frequent form of cutaneous T cell lymphoma (CTCL), is poorly understood. No specific genetic aberration has been detected, especially in early-stage disease, possibly due to the clinical and histological heterogeneity of patient series and to the different sources of malignant cells (skin, blood, or lymph node) included in most studies. Frozen skin biopsies from 16 patients with early-stage MF were studied using array-based comparative genomic hybridization. A DNA pool from healthy donors was used as the reference. Results demonstrated recurrent loss of 19, 7p22.1-p22.3, 7q11.1-q11.23, 9q34.12, 12q24.31, and 16q22.3-q23.1, and gain of 8q22.3-q23.1 and 21q22.12. The 12q24.31 region was recurrently deleted in 7/16 patients. Real-time PCR investigation for deletion of genes BCL7A, SMAC/DIABLO, and RHOF-three tumor suppressor genes with a putative role in hematological malignancies-demonstrated that they were deleted in 9, 10, and 13 cases, respectively. The identified genomic alterations and individual genes could yield important insights into the early steps of MF pathogenesis. PMID:18663754

  17. Sockeye: a 3D environment for comparative genomics.

    PubMed

    Montgomery, Stephen B; Astakhova, Tamara; Bilenky, Mikhail; Birney, Ewan; Fu, Tony; Hassel, Maik; Melsopp, Craig; Rak, Marcin; Robertson, A Gordon; Sleumer, Monica; Siddiqui, Asim S; Jones, Steven J M

    2004-05-01

    Comparative genomics techniques are used in bioinformatics analyses to identify the structural and functional properties of DNA sequences. As the amount of available sequence data steadily increases, the ability to perform large-scale comparative analyses has become increasingly relevant. In addition, the growing complexity of genomic feature annotation means that new approaches to genomic visualization need to be explored. We have developed a Java-based application called Sockeye that uses three-dimensional (3D) graphics technology to facilitate the visualization of annotation and conservation across multiple sequences. This software uses the Ensembl database project to import sequence and annotation information from several eukaryotic species. A user can additionally import their own custom sequence and annotation data. Individual annotation objects are displayed in Sockeye by using custom 3D models. Ensembl-derived and imported sequences can be analyzed by using a suite of multiple and pair-wise alignment algorithms. The results of these comparative analyses are also displayed in the 3D environment of Sockeye. By using the Java3D API to visualize genomic data in a 3D environment, we are able to compactly display cross-sequence comparisons. This provides the user with a novel platform for visualizing and comparing genomic feature organization. PMID:15123592

  18. Comparative genomics of ten solanaceous plastomes.

    PubMed

    Kaur, Harpreet; Singh, Bhupinder Pal; Singh, Harpreet; Nagpal, Avinash Kaur

    2014-01-01

    Availability of complete plastid genomes of ten solanaceous species, Atropa belladonna, Capsicum annuum, Datura stramonium, Nicotiana sylvestris, Nicotiana tabacum, Nicotiana tomentosiformis, Nicotiana undulata, Solanum bulbocastanum, Solanum lycopersicum, and Solanum tuberosum provided us with an opportunity to conduct their in silico comparative analysis in depth. The size of complete chloroplast genomes and LSC and SSC regions of three species of Solanum is comparatively smaller than that of any other species studied till date (exception: SSC region of A. belladonna). AT content of coding regions was found to be less than noncoding regions. A duplicate copy of trnH gene in C. annuum and two alternative tRNA genes for proline in D. stramonium were observed for the first time in this analysis. Further, homology search revealed the presence of rps19 pseudogene and infA genes in A. belladonna and D. stramonium, a region identical to rps19 pseudogene in C. annum and orthologues of sprA gene in another six species. Among the eighteen intron-containing genes, 3 genes have two introns and 15 genes have one intron. The longest insertion was found in accD gene in C. annuum. Phylogenetic analysis using concatenated protein coding sequences gave two clades, one for Nicotiana species and another for Solanum, Capsicum, Atropa, and Datura. PMID:25477958

  19. Comparative Genomics of Ten Solanaceous Plastomes

    PubMed Central

    Kaur, Harpreet; Singh, Bhupinder Pal; Singh, Harpreet; Nagpal, Avinash Kaur

    2014-01-01

    Availability of complete plastid genomes of ten solanaceous species, Atropa belladonna, Capsicum annuum, Datura stramonium, Nicotiana sylvestris, Nicotiana tabacum, Nicotiana tomentosiformis, Nicotiana undulata, Solanum bulbocastanum, Solanum lycopersicum, and Solanum tuberosum provided us with an opportunity to conduct their in silico comparative analysis in depth. The size of complete chloroplast genomes and LSC and SSC regions of three species of Solanum is comparatively smaller than that of any other species studied till date (exception: SSC region of A. belladonna). AT content of coding regions was found to be less than noncoding regions. A duplicate copy of trnH gene in C. annuum and two alternative tRNA genes for proline in D. stramonium were observed for the first time in this analysis. Further, homology search revealed the presence of rps19 pseudogene and infA genes in A. belladonna and D. stramonium, a region identical to rps19 pseudogene in C. annum and orthologues of sprA gene in another six species. Among the eighteen intron-containing genes, 3 genes have two introns and 15 genes have one intron. The longest insertion was found in accD gene in C. annuum. Phylogenetic analysis using concatenated protein coding sequences gave two clades, one for Nicotiana species and another for Solanum, Capsicum, Atropa, and Datura. PMID:25477958

  20. Comparative Genome Analysis in the Integrated Microbial Genomes(IMG) System

    SciTech Connect

    Kyrpides, Nikos C.; Markowitz, Victor M.

    2006-03-01

    Comparative genome analysis is critical for the effectiveexploration of a rapidly growing number of complete and draft sequencesfor microbial genomes. The Integrated Microbial Genomes (IMG) system(img.jgi.doe.gov) has been developed as a community resource thatprovides support for comparative analysis of microbial genomes in anintegrated context. IMG allows users to navigate the multidimensionalmicrobial genome data space and focus their analysis on a subset ofgenes, genomes, and functions of interest. IMG provides graphicalviewers, summaries and occurrence profile tools for comparing genes,pathways and functions (terms) across specific genomes. Genes can befurther examined using gene neighborhoods and compared with sequencealignment tools.

  1. MicroScope: a platform for microbial genome annotation and comparative genomics

    PubMed Central

    Vallenet, D.; Engelen, S.; Mornico, D.; Cruveiller, S.; Fleury, L.; Lajus, A.; Rouy, Z.; Roche, D.; Salvignol, G.; Scarpelli, C.; Médigue, C.

    2009-01-01

    The initial outcome of genome sequencing is the creation of long text strings written in a four letter alphabet. The role of in silico sequence analysis is to assist biologists in the act of associating biological knowledge with these sequences, allowing investigators to make inferences and predictions that can be tested experimentally. A wide variety of software is available to the scientific community, and can be used to identify genomic objects, before predicting their biological functions. However, only a limited number of biologically interesting features can be revealed from an isolated sequence. Comparative genomics tools, on the other hand, by bringing together the information contained in numerous genomes simultaneously, allow annotators to make inferences based on the idea that evolution and natural selection are central to the definition of all biological processes. We have developed the MicroScope platform in order to offer a web-based framework for the systematic and efficient revision of microbial genome annotation and comparative analysis (http://www.genoscope.cns.fr/agc/microscope). Starting with the description of the flow chart of the annotation processes implemented in the MicroScope pipeline, and the development of traditional and novel microbial annotation and comparative analysis tools, this article emphasizes the essential role of expert annotation as a complement of automatic annotation. Several examples illustrate the use of implemented tools for the review and curation of annotations of both new and publicly available microbial genomes within MicroScope’s rich integrated genome framework. The platform is used as a viewer in order to browse updated annotation information of available microbial genomes (more than 440 organisms to date), and in the context of new annotation projects (117 bacterial genomes). The human expertise gathered in the MicroScope database (about 280,000 independent annotations) contributes to improve the quality of

  2. Whole genome comparative studies between chicken and turkey and their implications for avian genome evolution

    PubMed Central

    Griffin, Darren K; Robertson, Lindsay B; Tempest, Helen G; Vignal, Alain; Fillon, Valérie; Crooijmans, Richard PMA; Groenen, Martien AM; Deryusheva, Svetlana; Gaginskaya, Elena; Carré, Wilfrid; Waddington, David; Talbot, Richard; Völker, Martin; Masabanda, Julio S; Burt, Dave W

    2008-01-01

    Background Comparative genomics is a powerful means of establishing inter-specific relationships between gene function/location and allows insight into genomic rearrangements, conservation and evolutionary phylogeny. The availability of the complete sequence of the chicken genome has initiated the development of detailed genomic information in other birds including turkey, an agriculturally important species where mapping has hitherto focused on linkage with limited physical information. No molecular study has yet examined conservation of avian microchromosomes, nor differences in copy number variants (CNVs) between birds. Results We present a detailed comparative cytogenetic map between chicken and turkey based on reciprocal chromosome painting and mapping of 338 chicken BACs to turkey metaphases. Two inter-chromosomal changes (both involving centromeres) and three pericentric inversions have been identified between chicken and turkey; and array CGH identified 16 inter-specific CNVs. Conclusion This is the first study to combine the modalities of zoo-FISH and array CGH between different avian species. The first insight into the conservation of microchromosomes, the first comparative cytogenetic map of any bird and the first appraisal of CNVs between birds is provided. Results suggest that avian genomes have remained relatively stable during evolution compared to mammalian equivalents. PMID:18410676

  3. Phylogeny and comparative genome analysis of a Basidiomycete fungi

    SciTech Connect

    Riley, Robert W.; Salamov, Asaf; Grigoriev, Igor; Hibbett, David

    2011-03-14

    Fungi of the phylum Basidiomycota, make up some 37percent of the described fungi, and are important from the perspectives of forestry, agriculture, medicine, and bioenergy. This diverse phylum includes the mushrooms, wood rots, plant pathogenic rusts and smuts, and some human pathogens. To better understand these important fungi, we have undertaken a comparative genomic analysis of the Basidiomycetes with available sequenced genomes. We report a phylogeny that sheds light on previously unclear evolutionary relationships among the Basidiomycetes. We also define a `core proteome? based on protein families conserved in all Basidiomycetes. We identify key expansions and contractions in protein families that may be responsible for the degradation of plant biomass such as cellulose, hemicellulose, and lignin. Finally, we speculate as to the genomic changes that drove such expansions and contractions.

  4. A web server for mining Comparative Genomic Hybridization (CGH) data

    NASA Astrophysics Data System (ADS)

    Liu, Jun; Ranka, Sanjay; Kahveci, Tamer

    2007-11-01

    Advances in cytogenetics and molecular biology has established that chromosomal alterations are critical in the pathogenesis of human cancer. Recurrent chromosomal alterations provide cytological and molecular markers for the diagnosis and prognosis of disease. They also facilitate the identification of genes that are important in carcinogenesis, which in the future may help in the development of targeted therapy. A large amount of publicly available cancer genetic data is now available and it is growing. There is a need for public domain tools that allow users to analyze their data and visualize the results. This chapter describes a web based software tool that will allow researchers to analyze and visualize Comparative Genomic Hybridization (CGH) datasets. It employs novel data mining methodologies for clustering and classification of CGH datasets as well as algorithms for identifying important markers (small set of genomic intervals with aberrations) that are potentially cancer signatures. The developed software will help in understanding the relationships between genomic aberrations and cancer types.

  5. Comparative genomics of autism and schizophrenia

    PubMed Central

    Crespi, Bernard; Stead, Philip; Elliot, Michael

    2010-01-01

    We used data from studies of copy-number variants (CNVs), single-gene associations, growth-signaling pathways, and intermediate phenotypes associated with brain growth to evaluate four alternative hypotheses for the genomic and developmental relationships between autism and schizophrenia: (i) autism subsumed in schizophrenia, (ii) independence, (iii) diametric, and (iv) partial overlap. Data from CNVs provides statistical support for the hypothesis that autism and schizophrenia are associated with reciprocal variants, such that at four loci, deletions predispose to one disorder, whereas duplications predispose to the other. Data from single-gene studies are inconsistent with a hypothesis based on independence, in that autism and schizophrenia share associated genes more often than expected by chance. However, differentiation between the partial overlap and diametric hypotheses using these data is precluded by limited overlap in the specific genetic markers analyzed in both autism and schizophrenia. Evidence from the effects of risk variants on growth-signaling pathways shows that autism-spectrum conditions tend to be associated with up-regulation of pathways due to loss of function mutations in negative regulators, whereas schizophrenia is associated with reduced pathway activation. Finally, data from studies of head and brain size phenotypes indicate that autism is commonly associated with developmentally-enhanced brain growth, whereas schizophrenia is characterized, on average, by reduced brain growth. These convergent lines of evidence appear most compatible with the hypothesis that autism and schizophrenia represent diametric conditions with regard to their genomic underpinnings, neurodevelopmental bases, and phenotypic manifestations as reflecting under-development versus dysregulated over-development of the human social brain. PMID:19955444

  6. Lactobacillus paracasei comparative genomics: towards species pan-genome definition and exploitation of diversity.

    PubMed

    Smokvina, Tamara; Wels, Michiel; Polka, Justyna; Chervaux, Christian; Brisse, Sylvain; Boekhorst, Jos; van Hylckama Vlieg, Johan E T; Siezen, Roland J

    2013-01-01

    Lactobacillus paracasei is a member of the normal human and animal gut microbiota and is used extensively in the food industry in starter cultures for dairy products or as probiotics. With the development of low-cost, high-throughput sequencing techniques it has become feasible to sequence many different strains of one species and to determine its "pan-genome". We have sequenced the genomes of 34 different L. paracasei strains, and performed a comparative genomics analysis. We analysed genome synteny and content, focussing on the pan-genome, core genome and variable genome. Each genome was shown to contain around 2800-3100 protein-coding genes, and comparative analysis identified over 4200 ortholog groups that comprise the pan-genome of this species, of which about 1800 ortholog groups make up the conserved core. Several factors previously associated with host-microbe interactions such as pili, cell-envelope proteinase, hydrolases p40 and p75 or the capacity to produce short branched-chain fatty acids (bkd operon) are part of the L. paracasei core genome present in all analysed strains. The variome consists mainly of hypothetical proteins, phages, plasmids, transposon/conjugative elements, and known functions such as sugar metabolism, cell-surface proteins, transporters, CRISPR-associated proteins, and EPS biosynthesis proteins. An enormous variety and variability of sugar utilization gene cassettes were identified, with each strain harbouring between 25-53 cassettes, reflecting the high adaptability of L. paracasei to different niches. A phylogenomic tree was constructed based on total genome contents, and together with an analysis of horizontal gene transfer events we conclude that evolution of these L. paracasei strains is complex and not always related to niche adaptation. The results of this genome content comparison was used, together with high-throughput growth experiments on various carbohydrates, to perform gene-trait matching analysis, in order to link

  7. Comparative genomics reveals insights into avian genome evolution and adaptation

    PubMed Central

    Zhang, Guojie; Li, Cai; Li, Qiye; Li, Bo; Larkin, Denis M.; Lee, Chul; Storz, Jay F.; Antunes, Agostinho; Greenwold, Matthew J.; Meredith, Robert W.; Ödeen, Anders; Cui, Jie; Zhou, Qi; Xu, Luohao; Pan, Hailin; Wang, Zongji; Jin, Lijun; Zhang, Pei; Hu, Haofu; Yang, Wei; Hu, Jiang; Xiao, Jin; Yang, Zhikai; Liu, Yang; Xie, Qiaolin; Yu, Hao; Lian, Jinmin; Wen, Ping; Zhang, Fang; Li, Hui; Zeng, Yongli; Xiong, Zijun; Liu, Shiping; Zhou, Long; Huang, Zhiyong; An, Na; Wang, Jie; Zheng, Qiumei; Xiong, Yingqi; Wang, Guangbiao; Wang, Bo; Wang, Jingjing; Fan, Yu; da Fonseca, Rute R.; Alfaro-Núñez, Alonzo; Schubert, Mikkel; Orlando, Ludovic; Mourier, Tobias; Howard, Jason T.; Ganapathy, Ganeshkumar; Pfenning, Andreas; Whitney, Osceola; Rivas, Miriam V.; Hara, Erina; Smith, Julia; Farré, Marta; Narayan, Jitendra; Slavov, Gancho; Romanov, Michael N; Borges, Rui; Machado, João Paulo; Khan, Imran; Springer, Mark S.; Gatesy, John; Hoffmann, Federico G.; Opazo, Juan C.; Håstad, Olle; Sawyer, Roger H.; Kim, Heebal; Kim, Kyu-Won; Kim, Hyeon Jeong; Cho, Seoae; Li, Ning; Huang, Yinhua; Bruford, Michael W.; Zhan, Xiangjiang; Dixon, Andrew; Bertelsen, Mads F.; Derryberry, Elizabeth; Warren, Wesley; Wilson, Richard K; Li, Shengbin; Ray, David A.; Green, Richard E.; O’Brien, Stephen J.; Griffin, Darren; Johnson, Warren E.; Haussler, David; Ryder, Oliver A.; Willerslev, Eske; Graves, Gary R.; Alström, Per; Fjeldså, Jon; Mindell, David P.; Edwards, Scott V.; Braun, Edward L.; Rahbek, Carsten; Burt, David W.; Houde, Peter; Zhang, Yong; Yang, Huanming; Wang, Jian; Jarvis, Erich D.; Gilbert, M. Thomas P.; Wang, Jun

    2015-01-01

    Birds are the most species-rich class of tetrapod vertebrates and have wide relevance across many research fields. We explored bird macroevolution using full genomes from 48 avian species representing all major extant clades. The avian genome is principally characterized by its constrained size, which predominantly arose because of lineage-specific erosion of repetitive elements, large segmental deletions, and gene loss. Avian genomes furthermore show a remarkably high degree of evolutionary stasis at the levels of nucleotide sequence, gene synteny, and chromosomal structure. Despite this pattern of conservation, we detected many non-neutral evolutionary changes in protein-coding genes and noncoding regions. These analyses reveal that pan-avian genomic diversity covaries with adaptations to different lifestyles and convergent evolution of traits. PMID:25504712

  8. Comparative genomics reveals insights into avian genome evolution and adaptation.

    PubMed

    Zhang, Guojie; Li, Cai; Li, Qiye; Li, Bo; Larkin, Denis M; Lee, Chul; Storz, Jay F; Antunes, Agostinho; Greenwold, Matthew J; Meredith, Robert W; Ödeen, Anders; Cui, Jie; Zhou, Qi; Xu, Luohao; Pan, Hailin; Wang, Zongji; Jin, Lijun; Zhang, Pei; Hu, Haofu; Yang, Wei; Hu, Jiang; Xiao, Jin; Yang, Zhikai; Liu, Yang; Xie, Qiaolin; Yu, Hao; Lian, Jinmin; Wen, Ping; Zhang, Fang; Li, Hui; Zeng, Yongli; Xiong, Zijun; Liu, Shiping; Zhou, Long; Huang, Zhiyong; An, Na; Wang, Jie; Zheng, Qiumei; Xiong, Yingqi; Wang, Guangbiao; Wang, Bo; Wang, Jingjing; Fan, Yu; da Fonseca, Rute R; Alfaro-Núñez, Alonzo; Schubert, Mikkel; Orlando, Ludovic; Mourier, Tobias; Howard, Jason T; Ganapathy, Ganeshkumar; Pfenning, Andreas; Whitney, Osceola; Rivas, Miriam V; Hara, Erina; Smith, Julia; Farré, Marta; Narayan, Jitendra; Slavov, Gancho; Romanov, Michael N; Borges, Rui; Machado, João Paulo; Khan, Imran; Springer, Mark S; Gatesy, John; Hoffmann, Federico G; Opazo, Juan C; Håstad, Olle; Sawyer, Roger H; Kim, Heebal; Kim, Kyu-Won; Kim, Hyeon Jeong; Cho, Seoae; Li, Ning; Huang, Yinhua; Bruford, Michael W; Zhan, Xiangjiang; Dixon, Andrew; Bertelsen, Mads F; Derryberry, Elizabeth; Warren, Wesley; Wilson, Richard K; Li, Shengbin; Ray, David A; Green, Richard E; O'Brien, Stephen J; Griffin, Darren; Johnson, Warren E; Haussler, David; Ryder, Oliver A; Willerslev, Eske; Graves, Gary R; Alström, Per; Fjeldså, Jon; Mindell, David P; Edwards, Scott V; Braun, Edward L; Rahbek, Carsten; Burt, David W; Houde, Peter; Zhang, Yong; Yang, Huanming; Wang, Jian; Jarvis, Erich D; Gilbert, M Thomas P; Wang, Jun

    2014-12-12

    Birds are the most species-rich class of tetrapod vertebrates and have wide relevance across many research fields. We explored bird macroevolution using full genomes from 48 avian species representing all major extant clades. The avian genome is principally characterized by its constrained size, which predominantly arose because of lineage-specific erosion of repetitive elements, large segmental deletions, and gene loss. Avian genomes furthermore show a remarkably high degree of evolutionary stasis at the levels of nucleotide sequence, gene synteny, and chromosomal structure. Despite this pattern of conservation, we detected many non-neutral evolutionary changes in protein-coding genes and noncoding regions. These analyses reveal that pan-avian genomic diversity covaries with adaptations to different lifestyles and convergent evolution of traits. PMID:25504712

  9. African Relapsing Fever Borreliae Genomospecies Revealed by Comparative Genomics

    PubMed Central

    Elbir, Haitham; Abi-Rached, Laurent; Pontarotti, Pierre; Yoosuf, Niyaz; Drancourt, Michel

    2014-01-01

    Background: Relapsing fever borreliae are vector-borne bacteria responsible for febrile infection in humans in North America, Africa, Asia, and in the Iberian Peninsula in Europe. Relapsing fever borreliae are phylogenetically closely related, yet they differ in pathogenicity and vectors. Their long-term taxonomy, based on geography and vector grouping, needs to be re-apprised in a genomic context. We therefore embarked into genomic analyses of relapsing fever borreliae, focusing on species found in Africa. Results: Genome-wide phylogenetic analyses group Old World Borrelia crocidurae, Borrelia hispanica, B. duttonii, and B. recurrentis in one clade, and New World Borrelia turicatae and Borrelia hermsii in a second clade. Accordingly, average nucleotide identity is 99% among B. duttonii, B. recurrentis, and B. crocidurae and 96% between latter borreliae and B. hispanica while the similarity is 86% between Old World and New World borreliae. Comparative genomics indicates that the Old World relapsing fever B. duttonii, B. recurrentis, B. crocidurae, and B. hispanica have a 2,514-gene pan genome and a 933-gene core genome that includes 788 chromosomal and 145 plasmidic genes. Analyzing the role that natural selection has played in the evolution of Old World borreliae species revealed that 55 loci were under positive diversifying selection, including loci coding for membrane, flagellar, and chemotaxis proteins, three categories associated with adaption to specific niches. Conclusion: Genomic analyses led to a reappraisal of the taxonomy of relapsing fever borreliae in Africa. These analyses suggest that B. crocidurae, B. duttonii, and B. recurrentis are ecotypes of a unique genomospecies, while B. hispanica is a distinct species. PMID:25229054

  10. Comparative genomics of the lactic acid bacteria

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Lactic acid-producing bacteria are associated with various plant and animal niches and play a key role in the production of fermented foods and beverages. We report nine genome sequences representing the phylogenetic and functional diversity of these bacteria. The small genomes of lactic acid bacter...

  11. The Whole Genome Assembly and Comparative Genomic Research of Thellungiella parvula (Extremophile Crucifer) Mitochondrion.

    PubMed

    Wang, Xuelin; Bi, Changwei; Xu, Yiqing; Wei, Suyun; Dai, Xiaogang; Yin, Tongming; Ye, Ning

    2016-01-01

    The complete nucleotide sequences of the mitochondrial (mt) genome of an extremophile species Thellungiella parvula (T. parvula) have been determined with the lengths of 255,773 bp. T. parvula mt genome is a circular sequence and contains 32 protein-coding genes, 19 tRNA genes, and three ribosomal RNA genes with a 11.5% coding sequence. The base composition of 27.5% A, 27.5% T, 22.7% C, and 22.3% G in descending order shows a slight bias of 55% AT. Fifty-three repeats were identified in the mitochondrial genome of T. parvula, including 24 direct repeats, 28 tandem repeats (TRs), and one palindromic repeat. Furthermore, a total of 199 perfect microsatellites have been mined with a high A/T content (83.1%) through simple sequence repeat (SSR) analysis and they were distributed unevenly within this mitochondrial genome. We also analyzed other plant mitochondrial genomes' evolution in general, providing clues for the understanding of the evolution of organelles genomes in plants. Comparing with other Brassicaceae species, T. parvula is related to Arabidopsis thaliana whose characters of low temperature resistance have been well documented. This study will provide important genetic tools for other Brassicaceae species research and improve yields of economically important plants. PMID:27148547

  12. The Whole Genome Assembly and Comparative Genomic Research of Thellungiella parvula (Extremophile Crucifer) Mitochondrion

    PubMed Central

    Wang, Xuelin; Bi, Changwei; Xu, Yiqing; Wei, Suyun; Dai, Xiaogang; Yin, Tongming; Ye, Ning

    2016-01-01

    The complete nucleotide sequences of the mitochondrial (mt) genome of an extremophile species Thellungiella parvula (T. parvula) have been determined with the lengths of 255,773 bp. T. parvula mt genome is a circular sequence and contains 32 protein-coding genes, 19 tRNA genes, and three ribosomal RNA genes with a 11.5% coding sequence. The base composition of 27.5% A, 27.5% T, 22.7% C, and 22.3% G in descending order shows a slight bias of 55% AT. Fifty-three repeats were identified in the mitochondrial genome of T. parvula, including 24 direct repeats, 28 tandem repeats (TRs), and one palindromic repeat. Furthermore, a total of 199 perfect microsatellites have been mined with a high A/T content (83.1%) through simple sequence repeat (SSR) analysis and they were distributed unevenly within this mitochondrial genome. We also analyzed other plant mitochondrial genomes' evolution in general, providing clues for the understanding of the evolution of organelles genomes in plants. Comparing with other Brassicaceae species, T. parvula is related to Arabidopsis thaliana whose characters of low temperature resistance have been well documented. This study will provide important genetic tools for other Brassicaceae species research and improve yields of economically important plants. PMID:27148547

  13. Lactobacillus paracasei Comparative Genomics: Towards Species Pan-Genome Definition and Exploitation of Diversity

    PubMed Central

    Smokvina, Tamara; Wels, Michiel; Polka, Justyna; Chervaux, Christian; Brisse, Sylvain; Boekhorst, Jos; Vlieg, Johan E. T. van Hylckama; Siezen, Roland J.

    2013-01-01

    Lactobacillus paracasei is a member of the normal human and animal gut microbiota and is used extensively in the food industry in starter cultures for dairy products or as probiotics. With the development of low-cost, high-throughput sequencing techniques it has become feasible to sequence many different strains of one species and to determine its “pan-genome”. We have sequenced the genomes of 34 different L. paracasei strains, and performed a comparative genomics analysis. We analysed genome synteny and content, focussing on the pan-genome, core genome and variable genome. Each genome was shown to contain around 2800–3100 protein-coding genes, and comparative analysis identified over 4200 ortholog groups that comprise the pan-genome of this species, of which about 1800 ortholog groups make up the conserved core. Several factors previously associated with host-microbe interactions such as pili, cell-envelope proteinase, hydrolases p40 and p75 or the capacity to produce short branched-chain fatty acids (bkd operon) are part of the L. paracasei core genome present in all analysed strains. The variome consists mainly of hypothetical proteins, phages, plasmids, transposon/conjugative elements, and known functions such as sugar metabolism, cell-surface proteins, transporters, CRISPR-associated proteins, and EPS biosynthesis proteins. An enormous variety and variability of sugar utilization gene cassettes were identified, with each strain harbouring between 25–53 cassettes, reflecting the high adaptability of L. paracasei to different niches. A phylogenomic tree was constructed based on total genome contents, and together with an analysis of horizontal gene transfer events we conclude that evolution of these L. paracasei strains is complex and not always related to niche adaptation. The results of this genome content comparison was used, together with high-throughput growth experiments on various carbohydrates, to perform gene-trait matching analysis, in order to

  14. Comparative genomic hybridization in clinical cytogenetics

    SciTech Connect

    Bryndorf, T.; Kirchhoff, M.; Rose, H.

    1995-11-01

    We report the results of applying comparative genomic hybridization (CGH) in a cytogenetic service laboratory for (1) determination of the origin of extra and missing chromosomal material in intricate cases of unbalanced aberrations and (2) detection of common prenatal numerical chromosome aberrations. A total of 11 fetal samples were analyzed. Seven cases of complex unbalanced aberrations that could not be identified reliably by conventional cytogenetics were successfully resolved by CGH analysis. CGH results were validated by using FISH with chromosome-specific probes. Four cases representing common prenatal numerical aberrations (trisomy 21, 18, and 13 and monosomy X) were also successfully diagnosed by CGH. We conclude that CGH is a powerful adjunct to traditional cytogenetic techniques that makes it possible to solve clinical cases of intricate unbalanced aberrations in a single hybridization. CGH may also be a useful adjunct to screen for euchromatic involvement in marker chromosomes. Further technical development may render CGH applicable for routine aberration screening. 16 refs., 4 figs., 2 tabs.

  15. Databases of homologous gene families for comparative genomics

    PubMed Central

    Penel, Simon; Arigon, Anne-Muriel; Dufayard, Jean-François; Sertier, Anne-Sophie; Daubin, Vincent; Duret, Laurent; Gouy, Manolo; Perrière, Guy

    2009-01-01

    Background Comparative genomics is a central step in many sequence analysis studies, from gene annotation and the identification of new functional regions in genomes, to the study of evolutionary processes at the molecular level (speciation, single gene or whole genome duplications, etc.) and phylogenetics. In that context, databases providing users high quality homologous families and sequence alignments as well as phylogenetic trees based on state of the art algorithms are becoming indispensable. Methods We developed an automated procedure allowing massive all-against-all similarity searches, gene clustering, multiple alignments computation, and phylogenetic trees construction and reconciliation. The application of this procedure to a very large set of sequences is possible through parallel computing on a large computer cluster. Results Three databases were developed using this procedure: HOVERGEN, HOGENOM and HOMOLENS. These databases share the same architecture but differ in their content. HOVERGEN contains sequences from vertebrates, HOGENOM is mainly devoted to completely sequenced microbial organisms, and HOMOLENS is devoted to metazoan genomes from Ensembl. Access to the databases is provided through Web query forms, a general retrieval system and a client-server graphical interface. The later can be used to perform tree-pattern based searches allowing, among other uses, to retrieve sets of orthologous genes. The three databases, as well as the software required to build and query them, can be used or downloaded from the PBIL (Pôle Bioinformatique Lyonnais) site at . PMID:19534752

  16. Comparative genome analysis of Pseudomonas genomes including Populus-associated isolates

    SciTech Connect

    Jun, Se Ran; Wassenaar, Trudy; Nookaew, Intawat; Hauser, Loren John; Wanchai, Visanu; Land, Miriam L.; Timm, Collin M.; Lu, Tse-Yuan S.; Schadt, Christopher Warren; Doktycz, Mitchel John; Pelletier, Dale A; Ussery, David W

    2016-01-01

    The Pseudomonas genus contains a metabolically versatile group of organisms that are known to occupy numerous ecological niches including the rhizosphere and endosphere of many plants influencing phylogenetic diversity and heterogeneity. In this study, comparative genome analysis was performed on over one thousand Pseudomonas genomes, including 21 Pseudomonas strains isolated from the roots of native Populus deltoides. Based on average amino acid identity, genomic clusters were identified within the Pseudomonas genus, which showed agreements with clades by NCBI and cliques by IMG. The P. fluorescens group was organized into 20 distinct genomic clusters, representing enormous diversity and heterogeneity. The species P. aeruginosa showed clear distinction in their genomic relatedness compared to other Pseudomonas species groups based on the pan and core genome analysis. The 19 isolates of our 21 Populus-associated isolates formed three distinct subgroups within the P. fluorescens major group, supported by pathway profiles analysis, while two isolates were more closely related to P. chlororaphis and P. putida. The specific genes to Populus-associated subgroups were identified where genes specific to subgroup 1 include several sensory systems such as proteins which act in two-component signal transduction, a TonB-dependent receptor, and a phosphorelay sensor; specific genes to subgroup 2 contain unique hypothetical genes; and genes specific to subgroup 3 organisms have a different hydrolase activity. IMPORTANCE The comparative genome analyses of the genus Pseudomonas that included Populus-associated isolates resulted in novel insights into high diversity of Pseudomonas. Consistent and robust genomic clusters with phylogenetic homogeneity were identified, which resolved species-clades that are not clearly defined by 16S rRNA gene sequence analysis alone. The genomic clusters may be reflective of distinct ecological niches to which the organisms have adapted, but this

  17. Comparative genome analysis of Pseudomonas genomes including Populus-associated isolates

    DOE PAGESBeta

    Jun, Se Ran; Wassenaar, Trudy; Nookaew, Intawat; Hauser, Loren John; Wanchai, Visanu; Land, Miriam L.; Timm, Collin M.; Lu, Tse-Yuan S.; Schadt, Christopher Warren; Doktycz, Mitchel John; et al

    2016-01-01

    The Pseudomonas genus contains a metabolically versatile group of organisms that are known to occupy numerous ecological niches including the rhizosphere and endosphere of many plants influencing phylogenetic diversity and heterogeneity. In this study, comparative genome analysis was performed on over one thousand Pseudomonas genomes, including 21 Pseudomonas strains isolated from the roots of native Populus deltoides. Based on average amino acid identity, genomic clusters were identified within the Pseudomonas genus, which showed agreements with clades by NCBI and cliques by IMG. The P. fluorescens group was organized into 20 distinct genomic clusters, representing enormous diversity and heterogeneity. The speciesmore » P. aeruginosa showed clear distinction in their genomic relatedness compared to other Pseudomonas species groups based on the pan and core genome analysis. The 19 isolates of our 21 Populus-associated isolates formed three distinct subgroups within the P. fluorescens major group, supported by pathway profiles analysis, while two isolates were more closely related to P. chlororaphis and P. putida. The specific genes to Populus-associated subgroups were identified where genes specific to subgroup 1 include several sensory systems such as proteins which act in two-component signal transduction, a TonB-dependent receptor, and a phosphorelay sensor; specific genes to subgroup 2 contain unique hypothetical genes; and genes specific to subgroup 3 organisms have a different hydrolase activity. IMPORTANCE The comparative genome analyses of the genus Pseudomonas that included Populus-associated isolates resulted in novel insights into high diversity of Pseudomonas. Consistent and robust genomic clusters with phylogenetic homogeneity were identified, which resolved species-clades that are not clearly defined by 16S rRNA gene sequence analysis alone. The genomic clusters may be reflective of distinct ecological niches to which the organisms have adapted, but

  18. Comparative genome-based identification of a cell wall-anchored protein from Lactobacillus plantarum increases adhesion of Lactococcus lactis to human epithelial cells

    PubMed Central

    Zhang, Bo; Zuo, Fanglei; Yu, Rui; Zeng, Zhu; Ma, Huiqin; Chen, Shangwu

    2015-01-01

    Adhesion to host cells is considered important for Lactobacillus plantarum as well as other lactic acid bacteria (LAB) to persist in human gut and thus exert probiotic effects. Here, we sequenced the genome of Lt. plantarum strain NL42 originating from a traditional Chinese dairy product, performed comparative genomic analysis and characterized a novel adhesion factor. The genome of NL42 was highly divergent from its closest neighbors, especially in six large genomic regions. NL42 harbors a total of 42 genes encoding adhesion-associated proteins; among them, cwaA encodes a protein containing multiple domains, including five cell wall surface anchor repeat domains and an LPxTG-like cell wall anchor motif. Expression of cwaA in Lactococcus lactis significantly increased its autoaggregation and hydrophobicity, and conferred the new ability to adhere to human colonic epithelial HT-29 cells by targeting cellular surface proteins, and not carbohydrate moieties, for CwaA adhesion. In addition, the recombinant Lc. lactis inhibited adhesion of Staphylococcus aureus and Escherichia coli to HT-29 cells, mainly by exclusion. We conclude that CwaA is a novel adhesion factor in Lt. plantarum and a potential candidate for improving the adhesion ability of probiotics or other bacteria of interest. PMID:26370773

  19. Comparative genome-based identification of a cell wall-anchored protein from Lactobacillus plantarum increases adhesion of Lactococcus lactis to human epithelial cells.

    PubMed

    Zhang, Bo; Zuo, Fanglei; Yu, Rui; Zeng, Zhu; Ma, Huiqin; Chen, Shangwu

    2015-01-01

    Adhesion to host cells is considered important for Lactobacillus plantarum as well as other lactic acid bacteria (LAB) to persist in human gut and thus exert probiotic effects. Here, we sequenced the genome of Lt. plantarum strain NL42 originating from a traditional Chinese dairy product, performed comparative genomic analysis and characterized a novel adhesion factor. The genome of NL42 was highly divergent from its closest neighbors, especially in six large genomic regions. NL42 harbors a total of 42 genes encoding adhesion-associated proteins; among them, cwaA encodes a protein containing multiple domains, including five cell wall surface anchor repeat domains and an LPxTG-like cell wall anchor motif. Expression of cwaA in Lactococcus lactis significantly increased its autoaggregation and hydrophobicity, and conferred the new ability to adhere to human colonic epithelial HT-29 cells by targeting cellular surface proteins, and not carbohydrate moieties, for CwaA adhesion. In addition, the recombinant Lc. lactis inhibited adhesion of Staphylococcus aureus and Escherichia coli to HT-29 cells, mainly by exclusion. We conclude that CwaA is a novel adhesion factor in Lt. plantarum and a potential candidate for improving the adhesion ability of probiotics or other bacteria of interest. PMID:26370773

  20. Genome Sequence and Comparative Genome Analysis of Lactobacillus casei: Insights into Their Niche-Associated Evolution

    PubMed Central

    Cai, Hui; Thompson, Rebecca; Budinich, Mateo F.; Broadbent, Jeff R.

    2009-01-01

    Lactobacillus casei is remarkably adaptable to diverse habitats and widely used in the food industry. To reveal the genomic features that contribute to its broad ecological adaptability and examine the evolution of the species, the genome sequence of L. casei ATCC 334 is analyzed and compared with other sequenced lactobacilli. This analysis reveals that ATCC 334 contains a high number of coding sequences involved in carbohydrate utilization and transcriptional regulation, reflecting its requirement for dealing with diverse environmental conditions. A comparison of the genome sequences of ATCC 334 to L. casei BL23 reveals 12 and 19 genomic islands, respectively. For a broader assessment of the genetic variability within L. casei, gene content of 21 L. casei strains isolated from various habitats (cheeses, n = 7; plant materials, n = 8; and human sources, n = 6) was examined by comparative genome hybridization with an ATCC 334-based microarray. This analysis resulted in identification of 25 hypervariable regions. One of these regions contains an overrepresentation of genes involved in carbohydrate utilization and transcriptional regulation and was thus proposed as a lifestyle adaptation island. Differences in L. casei genome inventory reveal both gene gain and gene decay. Gene gain, via acquisition of genomic islands, likely confers a fitness benefit in specific habitats. Gene decay, that is, loss of unnecessary ancestral traits, is observed in the cheese isolates and likely results in enhanced fitness in the dairy niche. This study gives the first picture of the stable versus variable regions in L. casei and provides valuable insights into evolution, lifestyle adaptation, and metabolic diversity of L. casei. PMID:20333194

  1. Genome sequence and comparative genome analysis of Lactobacillus casei: insights into their niche-associated evolution.

    PubMed

    Cai, Hui; Thompson, Rebecca; Budinich, Mateo F; Broadbent, Jeff R; Steele, James L

    2009-01-01

    Lactobacillus casei is remarkably adaptable to diverse habitats and widely used in the food industry. To reveal the genomic features that contribute to its broad ecological adaptability and examine the evolution of the species, the genome sequence of L. casei ATCC 334 is analyzed and compared with other sequenced lactobacilli. This analysis reveals that ATCC 334 contains a high number of coding sequences involved in carbohydrate utilization and transcriptional regulation, reflecting its requirement for dealing with diverse environmental conditions. A comparison of the genome sequences of ATCC 334 to L. casei BL23 reveals 12 and 19 genomic islands, respectively. For a broader assessment of the genetic variability within L. casei, gene content of 21 L. casei strains isolated from various habitats (cheeses, n = 7; plant materials, n = 8; and human sources, n = 6) was examined by comparative genome hybridization with an ATCC 334-based microarray. This analysis resulted in identification of 25 hypervariable regions. One of these regions contains an overrepresentation of genes involved in carbohydrate utilization and transcriptional regulation and was thus proposed as a lifestyle adaptation island. Differences in L. casei genome inventory reveal both gene gain and gene decay. Gene gain, via acquisition of genomic islands, likely confers a fitness benefit in specific habitats. Gene decay, that is, loss of unnecessary ancestral traits, is observed in the cheese isolates and likely results in enhanced fitness in the dairy niche. This study gives the first picture of the stable versus variable regions in L. casei and provides valuable insights into evolution, lifestyle adaptation, and metabolic diversity of L. casei. PMID:20333194

  2. Comparative genomics reveals diversity among xanthomonads infecting tomato and pepper

    PubMed Central

    2011-01-01

    Background Bacterial spot of tomato and pepper is caused by four Xanthomonas species and is a major plant disease in warm humid climates. The four species are distinct from each other based on physiological and molecular characteristics. The genome sequence of strain 85-10, a member of one of the species, Xanthomonas euvesicatoria (Xcv) has been previously reported. To determine the relationship of the four species at the genome level and to investigate the molecular basis of their virulence and differing host ranges, draft genomic sequences of members of the other three species were determined and compared to strain 85-10. Results We sequenced the genomes of X. vesicatoria (Xv) strain 1111 (ATCC 35937), X. perforans (Xp) strain 91-118 and X. gardneri (Xg) strain 101 (ATCC 19865). The genomes were compared with each other and with the previously sequenced Xcv strain 85-10. In addition, the molecular features were predicted that may be required for pathogenicity including the type III secretion apparatus, type III effectors, other secretion systems, quorum sensing systems, adhesins, extracellular polysaccharide, and lipopolysaccharide determinants. Several novel type III effectors from Xg strain 101 and Xv strain 1111 genomes were computationally identified and their translocation was validated using a reporter gene assay. A homolog to Ax21, the elicitor of XA21-mediated resistance in rice, and a functional Ax21 sulfation system were identified in Xcv. Genes encoding proteins with functions mediated by type II and type IV secretion systems have also been compared, including enzymes involved in cell wall deconstruction, as contributors to pathogenicity. Conclusions Comparative genomic analyses revealed considerable diversity among bacterial spot pathogens, providing new insights into differences and similarities that may explain the diverse nature of these strains. Genes specific to pepper pathogens, such as the O-antigen of the lipopolysaccharide cluster, and genes

  3. Comparative genomic hybridization: Detection of segmental aneusomies

    SciTech Connect

    Cronin, J.E.; Magrane, G.G.; Gray, J.W.

    1994-09-01

    Comparative genomic hybridization (CGH) has been used successfully to detect whole chromosome and segmental aneusomies. However, its sensitivity for detection of segmental aneusomies is still not well known. We present here an analysis of CGH sensitivity with emphasis on detection of abnormalities commonly found during pre-and neo-natal diagnosis. CGH is performed by hybridizing green and red fluorescing test and normal DNA samples, respectively, to normal metaphase spreads and measuring green:red fluorescence ratios along all chromosomes. The ratios are normalized such that 2 copies of a normal chromosome region in the test sample gives a ratio of 1.0. Alterations in test vs. control gene copy number range from 1.5 [trisomy] to 0.5 [monosomy]. Clinical samples analyzed included Wolf Hirschhorn (4p-), Cri du Chat (5p-) and DiGeorge (22q-). In addition, 7 cell lines with chromosome 21 segmental aneusomies were analyzed. These included 3 with terminal duplications, 1 with a terminal deletion, 1 with an interstitial deletion and 2 with interstitial amplifications. The DiGeorge deletion was the only deletion not deleted by CGH. This is not surprising as standard G banding does not routinely detect this 1-2 megabase deletion. The 4p- and 5p- monosomies were detected and breakpoints correctly assigned prospectively. Proximal alterations involving 21q22.11 are unambiguously defined. Specifically, two interstitial aneusomies involving this region are detected. Studies involving late prophase chromosome normal spreads gave identical breakpoints. Thus, analysis of extended chromosomes did not improve the sensitivity of the technique. Taken together, these data suggest that CGH can detect segmental aneusomies greater than 8 megabases in extent. Smaller aneusomies can, at times, be detected. Work is now underway to modify the analysis software to increase sensitivity and to decrease the amount of material needed for analysis.

  4. Comparative Omics-Driven Genome Annotation Refinement: Application across Yersiniae

    SciTech Connect

    Rutledge, Alexandra C.; Jones, Marcus B.; Chauhan, Sadhana; Purvine, Samuel O.; Sanford, James; Monroe, Matthew E.; Brewer, Heather M.; Payne, Samuel H.; Ansong, Charles; Frank, Bryan C.; Smith, Richard D.; Peterson, Scott; Motin, Vladimir L.; Adkins, Joshua N.

    2012-03-27

    Genome sequencing continues to be a rapidly evolving technology, yet most downstream aspects of genome annotation pipelines remain relatively stable or are even being abandoned. To date, the perceived value of manual curation for genome annotations is not offset by the real cost and time associated with the process. In order to balance the large number of sequences generated, the annotation process is now performed almost exclusively in an automated fashion for most genome sequencing projects. One possible way to reduce errors inherent to automated computational annotations is to apply data from 'omics' measurements (i.e. transcriptional and proteomic) to the un-annotated genome with a proteogenomic-based approach. This approach does require additional experimental and bioinformatics methods to include omics technologies; however, the approach is readily automatable and can benefit from rapid developments occurring in those research domains as well. The annotation process can be improved by experimental validation of transcription and translation and aid in the discovery of annotation errors. Here the concept of annotation refinement has been extended to include a comparative assessment of genomes across closely related species, as is becoming common in sequencing efforts. Transcriptomic and proteomic data derived from three highly similar pathogenic Yersiniae (Y. pestis CO92, Y. pestis pestoides F, and Y. pseudotuberculosis PB1/+) was used to demonstrate a comprehensive comparative omic-based annotation methodology. Peptide and oligo measurements experimentally validated the expression of nearly 40% of each strain's predicted proteome and revealed the identification of 28 novel and 68 previously incorrect protein-coding sequences (e.g., observed frameshifts, extended start sites, and translated pseudogenes) within the three current Yersinia genome annotations. Gene loss is presumed to play a major role in Y. pestis acquiring its niche as a virulent pathogen, thus

  5. Initial sequencing and comparative analysis of the mouse genome

    SciTech Connect

    Waterston, Robert H.; Lindblad-Toh, Kerstin; Birney, Ewan; Rogers, Jane; Abril, Josep F.; Agarwal, Pankaj; Agarwala, Richa; Ainscough, Rachel; Alexandersson, Marina; An, Peter; Antonarakis, Stylianos E.; Attwood, John; Baertsch, Robert; Bailey, Jonathon; Barlow, Karen; Beck, Stephan; Berry, Eric; Birren, Bruce; Bloom, Toby; Bork, Peer; Botcherby, Marc; Bray, Nicolas; Brent, Michael R.; Brown, Daniel G.; Brown, Stephen D.; Bult, Carol; Burton, John; Butler, Jonathan; Campbell, Robert D.; Carninci, Piero; Cawley, Simon; Chiaromonte, Francesca; Chinwalla, Asif T.; Church, Deanna M.; Clamp, Michele; Clee, Christopher; Collins, Francis S.; Cook, Lisa L.; Copley, Richard R.; Coulson, Alan; Couronne, Olivier; Cuff, James; Curwen, Val; Cutts, Tim; Daly, Mark; David, Robert; Davies, Joy; Delehaunty, Kimberly D.; Deri, Justin; Dermitzakis, Emmanouil T.; Dewey, Colin; Dickens, Nicholas J.; Diekhans, Mark; Dodge, Sheila; Dubchak, Inna; Dunn, Diane M.; Eddy, Sean R.; Elnitski, Laura; Emes, Richard D.; Eswara, Pallavi; Eyras, Eduardo; Felsenfeld, Adam; Fewell, Ginger A.; Flicek, Paul; Foley, Karen; Frankel, Wayne N.; Fulton, Lucinda A.; Fulton, Robert S.; Furey, Terrence S.; Gage, Diane; Gibbs, Richard A.; Glusman, Gustavo; Gnerre, Sante; Goldman, Nick; Goodstadt, Leo; Grafham, Darren; Graves, Tina A.; Green, Eric D.; Gregory, Simon; Guigo, Roderic; Guyer, Mark; Hardison, Ross C.; Haussler, David; Hayashizaki, Yoshihide; Hillier, LaDeana W.; Hinrichs, Angela; Hlavina, Wratko; Holzer, Timothy; Hsu, Fan; Hua, Axin; Hubbard, Tim; Hunt, Adrienne; Jackson, Ian; Jaffe, David B.; Johnson, L. Steven; Jones, Matthew; Jones, Thomas A.; Joy, Ann; Kamal, Michael; Karlsson, Elinor K.; Karolchik, Donna; Kasprzyk, Arkadiusz; Kawai, Jun; Keibler, Evan; Kells, Cristyn; Kent, W. James; Kirby, Andrew; Kolbe, Diana L.; Korf, Ian; Kucherlapati, Raju S.; Kulbokas III, Edward J.; Kulp, David; Landers, Tom; Leger, J.P.; Leonard, Steven; Letunic, Ivica; Levine, Rosie; et al.

    2002-12-15

    The sequence of the mouse genome is a key informational tool for understanding the contents of the human genome and a key experimental tool for biomedical research. Here, we report the results of an international collaboration to produce a high-quality draft sequence of the mouse genome. We also present an initial comparative analysis of the mouse and human genomes, describing some of the insights that can be gleaned from the two sequences. We discuss topics including the analysis of the evolutionary forces shaping the size, structure and sequence of the genomes; the conservation of large-scale synteny across most of the genomes; the much lower extent of sequence orthology covering less than half of the genomes; the proportions of the genomes under selection; the number of protein-coding genes; the expansion of gene families related to reproduction and immunity; the evolution of proteins; and the identification of intraspecies polymorphism.

  6. Sequencing and comparative analysis of the gorilla MHC genomic sequence.

    PubMed

    Wilming, Laurens G; Hart, Elizabeth A; Coggill, Penny C; Horton, Roger; Gilbert, James G R; Clee, Chris; Jones, Matt; Lloyd, Christine; Palmer, Sophie; Sims, Sarah; Whitehead, Siobhan; Wiley, David; Beck, Stephan; Harrow, Jennifer L

    2013-01-01

    Major histocompatibility complex (MHC) genes play a critical role in vertebrate immune response and because the MHC is linked to a significant number of auto-immune and other diseases it is of great medical interest. Here we describe the clone-based sequencing and subsequent annotation of the MHC region of the gorilla genome. Because the MHC is subject to extensive variation, both structural and sequence-wise, it is not readily amenable to study in whole genome shotgun sequence such as the recently published gorilla genome. The variation of the MHC also makes it of evolutionary interest and therefore we analyse the sequence in the context of human and chimpanzee. In our comparisons with human and re-annotated chimpanzee MHC sequence we find that gorilla has a trimodular RCCX cluster, versus the reference human bimodular cluster, and additional copies of Class I (pseudo)genes between Gogo-K and Gogo-A (the orthologues of HLA-K and -A). We also find that Gogo-H (and Patr-H) is coding versus the HLA-H pseudogene and, conversely, there is a Gogo-DQB2 pseudogene versus the HLA-DQB2 coding gene. Our analysis, which is freely available through the VEGA genome browser, provides the research community with a comprehensive dataset for comparative and evolutionary research of the MHC. PMID:23589541

  7. Comparative genomics of Cluster O mycobacteriophages.

    PubMed

    Cresawn, Steven G; Pope, Welkin H; Jacobs-Sera, Deborah; Bowman, Charles A; Russell, Daniel A; Dedrick, Rebekah M; Adair, Tamarah; Anders, Kirk R; Ball, Sarah; Bollivar, David; Breitenberger, Caroline; Burnett, Sandra H; Butela, Kristen; Byrnes, Deanna; Carzo, Sarah; Cornely, Kathleen A; Cross, Trevor; Daniels, Richard L; Dunbar, David; Findley, Ann M; Gissendanner, Chris R; Golebiewska, Urszula P; Hartzog, Grant A; Hatherill, J Robert; Hughes, Lee E; Jalloh, Chernoh S; De Los Santos, Carla; Ekanem, Kevin; Khambule, Sphindile L; King, Rodney A; King-Smith, Christina; Klyczek, Karen; Krukonis, Greg P; Laing, Christian; Lapin, Jonathan S; Lopez, A Javier; Mkhwanazi, Sipho M; Molloy, Sally D; Moran, Deborah; Munsamy, Vanisha; Pacey, Eddie; Plymale, Ruth; Poxleitner, Marianne; Reyna, Nathan; Schildbach, Joel F; Stukey, Joseph; Taylor, Sarah E; Ware, Vassie C; Wellmann, Amanda L; Westholm, Daniel; Wodarski, Donna; Zajko, Michelle; Zikalala, Thabiso S; Hendrix, Roger W; Hatfull, Graham F

    2015-01-01

    Mycobacteriophages--viruses of mycobacterial hosts--are genetically diverse but morphologically are all classified in the Caudovirales with double-stranded DNA and tails. We describe here a group of five closely related mycobacteriophages--Corndog, Catdawg, Dylan, Firecracker, and YungJamal--designated as Cluster O with long flexible tails but with unusual prolate capsids. Proteomic analysis of phage Corndog particles, Catdawg particles, and Corndog-infected cells confirms expression of half of the predicted gene products and indicates a non-canonical mechanism for translation of the Corndog tape measure protein. Bioinformatic analysis identifies 8-9 strongly predicted SigA promoters and all five Cluster O genomes contain more than 30 copies of a 17 bp repeat sequence with dyad symmetry located throughout the genomes. Comparison of the Cluster O phages provides insights into phage genome evolution including the processes of gene flux by horizontal genetic exchange. PMID:25742016

  8. Comparative Genomics of Cluster O Mycobacteriophages

    PubMed Central

    Cresawn, Steven G.; Pope, Welkin H.; Jacobs-Sera, Deborah; Bowman, Charles A.; Russell, Daniel A.; Dedrick, Rebekah M.; Adair, Tamarah; Anders, Kirk R.; Ball, Sarah; Bollivar, David; Breitenberger, Caroline; Burnett, Sandra H.; Butela, Kristen; Byrnes, Deanna; Carzo, Sarah; Cornely, Kathleen A.; Cross, Trevor; Daniels, Richard L.; Dunbar, David; Findley, Ann M.; Gissendanner, Chris R.; Golebiewska, Urszula P.; Hartzog, Grant A.; Hatherill, J. Robert; Hughes, Lee E.; Jalloh, Chernoh S.; De Los Santos, Carla; Ekanem, Kevin; Khambule, Sphindile L.; King, Rodney A.; King-Smith, Christina; Klyczek, Karen; Krukonis, Greg P.; Laing, Christian; Lapin, Jonathan S.; Lopez, A. Javier; Mkhwanazi, Sipho M.; Molloy, Sally D.; Moran, Deborah; Munsamy, Vanisha; Pacey, Eddie; Plymale, Ruth; Poxleitner, Marianne; Reyna, Nathan; Schildbach, Joel F.; Stukey, Joseph; Taylor, Sarah E.; Ware, Vassie C.; Wellmann, Amanda L.; Westholm, Daniel; Wodarski, Donna; Zajko, Michelle; Zikalala, Thabiso S.; Hendrix, Roger W.; Hatfull, Graham F.

    2015-01-01

    Mycobacteriophages – viruses of mycobacterial hosts – are genetically diverse but morphologically are all classified in the Caudovirales with double-stranded DNA and tails. We describe here a group of five closely related mycobacteriophages – Corndog, Catdawg, Dylan, Firecracker, and YungJamal – designated as Cluster O with long flexible tails but with unusual prolate capsids. Proteomic analysis of phage Corndog particles, Catdawg particles, and Corndog-infected cells confirms expression of half of the predicted gene products and indicates a non-canonical mechanism for translation of the Corndog tape measure protein. Bioinformatic analysis identifies 8–9 strongly predicted SigA promoters and all five Cluster O genomes contain more than 30 copies of a 17 bp repeat sequence with dyad symmetry located throughout the genomes. Comparison of the Cluster O phages provides insights into phage genome evolution including the processes of gene flux by horizontal genetic exchange. PMID:25742016

  9. Comparative genomics of parasitic silkworm microsporidia reveal an association between genome expansion and host adaptation

    PubMed Central

    2013-01-01

    Background Microsporidian Nosema bombycis has received much attention because the pébrine disease of domesticated silkworms results in great economic losses in the silkworm industry. So far, no effective treatment could be found for pébrine. Compared to other known Nosema parasites, N. bombycis can unusually parasitize a broad range of hosts. To gain some insights into the underlying genetic mechanism of pathological ability and host range expansion in this parasite, a comparative genomic approach is conducted. The genome of two Nosema parasites, N. bombycis and N. antheraeae (an obligatory parasite to undomesticated silkworms Antheraea pernyi), were sequenced and compared with their distantly related species, N. ceranae (an obligatory parasite to honey bees). Results Our comparative genomics analysis show that the N. bombycis genome has greatly expanded due to the following three molecular mechanisms: 1) the proliferation of host-derived transposable elements, 2) the acquisition of many horizontally transferred genes from bacteria, and 3) the production of abundnant gene duplications. To our knowledge, duplicated genes derived not only from small-scale events (e.g., tandem duplications) but also from large-scale events (e.g., segmental duplications) have never been seen so abundant in any reported microsporidia genomes. Our relative dating analysis further indicated that these duplication events have arisen recently over very short evolutionary time. Furthermore, several duplicated genes involving in the cytotoxic metabolic pathway were found to undergo positive selection, suggestive of the role of duplicated genes on the adaptive evolution of pathogenic ability. Conclusions Genome expansion is rarely considered as the evolutionary outcome acting on those highly reduced and compact parasitic microsporidian genomes. This study, for the first time, demonstrates that the parasitic genomes can expand, instead of shrink, through several common molecular mechanisms

  10. Floral gene resources from basal angiosperms for comparative genomics research

    PubMed Central

    Albert, Victor A; Soltis, Douglas E; Carlson, John E; Farmerie, William G; Wall, P Kerr; Ilut, Daniel C; Solow, Teri M; Mueller, Lukas A; Landherr, Lena L; Hu, Yi; Buzgo, Matyas; Kim, Sangtae; Yoo, Mi-Jeong; Frohlich, Michael W; Perl-Treves, Rafael; Schlarbaum, Scott E; Bliss, Barbara J; Zhang, Xiaohong; Tanksley, Steven D; Oppenheimer, David G; Soltis, Pamela S; Ma, Hong; dePamphilis, Claude W; Leebens-Mack, James H

    2005-01-01

    Background The Floral Genome Project was initiated to bridge the genomic gap between the most broadly studied plant model systems. Arabidopsis and rice, although now completely sequenced and under intensive comparative genomic investigation, are separated by at least 125 million years of evolutionary time, and cannot in isolation provide a comprehensive perspective on structural and functional aspects of flowering plant genome dynamics. Here we discuss new genomic resources available to the scientific community, comprising cDNA libraries and Expressed Sequence Tag (EST) sequences for a suite of phylogenetically basal angiosperms specifically selected to bridge the evolutionary gaps between model plants and provide insights into gene content and genome structure in the earliest flowering plants. Results Random sequencing of cDNAs from representatives of phylogenetically important eudicot, non-grass monocot, and gymnosperm lineages has so far (as of 12/1/04) generated 70,514 ESTs and 48,170 assembled unigenes. Efficient sorting of EST sequences into putative gene families based on whole Arabidopsis/rice proteome comparison has permitted ready identification of cDNA clones for finished sequencing. Preliminarily, (i) proportions of functional categories among sequenced floral genes seem representative of the entire Arabidopsis transcriptome, (ii) many known floral gene homologues have been captured, and (iii) phylogenetic analyses of ESTs are providing new insights into the process of gene family evolution in relation to the origin and diversification of the angiosperms. Conclusion Initial comparisons illustrate the utility of the EST data sets toward discovery of the basic floral transcriptome. These first findings also afford the opportunity to address a number of conspicuous evolutionary genomic questions, including reproductive organ transcriptome overlap between angiosperms and gymnosperms, genome-wide duplication history, lineage-specific gene duplication and

  11. Comparative Genomics in Identifying Aflatoxin Biosynthetic Genes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Aspergillus flavus produces the most toxic and the most carcinogenic mycotoxins, aflatoxin B1 and B2. In order to solve aflatoxin contamination of food commodities, A. flavus genomics tools for identification of genes involved in aflatoxin biosynthesis have been employed. A. flavus Expressed Seque...

  12. fPoxDB: fungal peroxidase database for comparative genomics

    PubMed Central

    2014-01-01

    Background Peroxidases are a group of oxidoreductases which mediate electron transfer from hydrogen peroxide (H2O2) and organic peroxide to various electron acceptors. They possess a broad spectrum of impact on industry and fungal biology. There are numerous industrial applications using peroxidases, such as to catalyse highly reactive pollutants and to breakdown lignin for recycling of carbon sources. Moreover, genes encoding peroxidases play important roles in fungal pathogenicity in both humans and plants. For better understanding of fungal peroxidases at the genome-level, a novel genomics platform is required. To this end, Fungal Peroxidase Database (fPoxDB; http://peroxidase.riceblast.snu.ac.kr/) has been developed to provide such a genomics platform for this important gene family. Description In order to identify and classify fungal peroxidases, 24 sequence profiles were built and applied on 331 genomes including 216 from fungi and Oomycetes. In addition, NoxR, which is known to regulate NADPH oxidases (NoxA and NoxB) in fungi, was also added to the pipeline. Collectively, 6,113 genes were predicted to encode 25 gene families, presenting well-separated distribution along the taxonomy. For instance, the genes encoding lignin peroxidase, manganese peroxidase, and versatile peroxidase were concentrated in the rot-causing basidiomycetes, reflecting their ligninolytic capability. As a genomics platform, fPoxDB provides diverse analysis resources, such as gene family predictions based on fungal sequence profiles, pre-computed results of eight bioinformatics programs, similarity search tools, a multiple sequence alignment tool, domain analysis functions, and taxonomic distribution summary, some of which are not available in the previously developed peroxidase resource. In addition, fPoxDB is interconnected with other family web systems, providing extended analysis opportunities. Conclusions fPoxDB is a fungi-oriented genomics platform for peroxidases. The sequence-based

  13. Comparative Genome Mapping of Sorghum and Maize

    PubMed Central

    Whitkus, R.; Doebley, J.; Lee, M.

    1992-01-01

    Linkage relationships were determined among 85 maize low copy number nuclear DNA probes and seven isozyme loci in an F(2) population derived from a cross of Sorghum bicolor ssp. bicolor X S. bicolor ssp. arundinaceum. Thirteen linkage groups were defined, three more than the 10 chromosomes of sorghum. Use of maize DNA probes to produce the sorghum linkage map allowed us to make several inferences concerning processes involved in the evolutionary divergence of the maize and sorghum genomes. The results show that many linkage groups are conserved between these two genomes and that the amount of recombination in these conserved linkage groups is roughly equivalent in maize and sorghum. Estimates of the proportions of duplicated loci suggest that a larger proportion of the loci are duplicated in the maize genome than in the sorghum genome. This result concurs with a prior estimate that the nuclear DNA content of maize is three to four times greater than that of sorghum. The pattern of conserved linkages between maize and sorghum is such that most sorghum linkage groups are composed of loci that map to two maize chromosomes. This pattern is consistent with the hypothesized ancient polyploid origin of maize and sorghum. There are nine cases in which locus order within shared linkage groups is inverted in sorghum relative to maize. These may have arisen from either inversions or intrachromosomal translocations. We found no evidence for large interchromosomal translocations. Overall, the data suggest that the primary processes involved in divergence of the maize and sorghum genomes were duplications (either by polyploidy or segmental duplication) and inversions or intrachromosomal translocations. PMID:1360933

  14. Analysis of the allohexaploid bread wheat genome (Triticum aestivum) using comparative whole genome shotgun sequencing

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The large 17 Gb allopolyploid genome of bread wheat is a major challenge for genome analysis because it is composed of three closely- related and independently maintained genomes, with genes dispersed as small “islands” separated by vast tracts of repetitive DNA. We used a novel comparative genomi...

  15. Comparative genome analysis of Burkholderia phytofirmans PsJN reveals a wide spectrum of endophytic lifestyles based on interaction strategies with host plants

    PubMed Central

    Mitter, Birgit; Petric, Alexandra; Shin, Maria W.; Chain, Patrick S. G.; Hauberg-Lotte, Lena; Reinhold-Hurek, Barbara; Nowak, Jerzy; Sessitsch, Angela

    2013-01-01

    Burkholderia phytofirmans PsJN is a naturally occurring plant-associated bacterial endophyte that effectively colonizes a wide range of plants and stimulates their growth and vitality. Here we analyze whole genomes, of PsJN and of eight other endophytic bacteria. This study illustrates that a wide spectrum of endophytic life styles exists. Although we postulate the existence of typical endophytic traits, no unique gene cluster could be exclusively linked to the endophytic lifestyle. Furthermore, our study revealed a high genetic diversity among bacterial endophytes as reflected in their genotypic and phenotypic features. B. phytofirmans PsJN is in many aspects outstanding among the selected endophytes. It has the biggest genome consisting of two chromosomes and one plasmid, well-equipped with genes for the degradation of complex organic compounds and detoxification, e.g., 24 glutathione-S-transferase (GST) genes. Furthermore, strain PsJN has a high number of cell surface signaling and secretion systems and harbors the 3-OH-PAME quorum-sensing system that coordinates the switch of free-living to the symbiotic lifestyle in the plant-pathogen R. solanacearum. The ability of B. phytofirmans PsJN to successfully colonize such a wide variety of plant species might be based on its large genome harboring a broad range of physiological functions. PMID:23641251

  16. Comparative genomics of the lactic acid bacteria

    SciTech Connect

    Makarova, K.; Slesarev, A.; Wolf, Y.; Sorokin, A.; Mirkin, B.; Koonin, E.; Pavlov, A.; Pavlova, N.; Karamychev, V.; Polouchine, N.; Shakhova, V.; Grigoriev, I.; Lou, Y.; Rokhsar, D.; Lucas, S.; Huang, K.; Goodstein, D. M.; Hawkins, T.; Plengvidhya, V.; Welker, D.; Hughes, J.; Goh, Y.; Benson, A.; Baldwin, K.; Lee, J. -H.; Diaz-Muniz, I.; Dosti, B.; Smeianov, V; Wechter, W.; Barabote, R.; Lorca, G.; Altermann, E.; Barrangou, R.; Ganesan, B.; Xie, Y.; Rawsthorne, H.; Tamir, D.; Parker, C.; Breidt, F.; Broadbent, J.; Hutkins, R.; O'Sullivan, D.; Steele, J.; Unlu, G.; Saier, M.; Klaenhammer, T.; Richardson, P.; Kozyavkin, S.; Weimer, B.; Mills, D.

    2006-06-01

    Lactic acid-producing bacteria are associated with various plant and animal niches and play a key role in the production of fermented foods and beverages. We report nine genome sequences representing the phylogenetic and functional diversity of these bacteria. The small genomes of lactic acid bacteria encode a broad repertoire of transporters for efficient carbon and nitrogen acquisition from the nutritionally rich environments they inhabit and reflect a limited range of biosynthetic capabilities that indicate both prototrophic and auxotrophic strains. Phylogenetic analyses, comparison of gene content across the group, and reconstruction of ancestral gene sets indicate a combination of extensive gene loss and key gene acquisitions via horizontal gene transfer during the coevolution of lactic acid bacteria with their habitats.

  17. Retroviral enhancer detection insertions in zebrafish combined with comparative genomics reveal genomic regulatory blocks - a fundamental feature of vertebrate genomes

    PubMed Central

    Kikuta, Hiroshi; Fredman, David; Rinkwitz, Silke; Lenhard, Boris; Becker, Thomas S

    2007-01-01

    A large-scale enhancer detection screen was performed in the zebrafish using a retroviral vector carrying a basal promoter and a fluorescent protein reporter cassette. Analysis of insertional hotspots uncovered areas around developmental regulatory genes in which an insertion results in the same global expression pattern, irrespective of exact position. These areas coincide with vertebrate chromosomal segments containing identical gene order; a phenomenon known as conserved synteny and thought to be a vestige of evolution. Genomic comparative studies have found large numbers of highly conserved noncoding elements (HCNEs) spanning these and other loci. HCNEs are thought to act as transcriptional enhancers based on the finding that many of those that have been tested direct tissue specific expression in transient or transgenic assays. Although gene order in hox and other gene clusters has long been known to be conserved because of shared regulatory sequences or overlapping transcriptional units, the chromosomal areas found through insertional hotspots contain only one or a few developmental regulatory genes as well as phylogenetically unrelated genes. We have termed these regions genomic regulatory blocks (GRBs), and show that they underlie the phenomenon of conserved synteny through all sequenced vertebrate genomes. After teleost whole genome duplication, a subset of GRBs were retained in two copies, underwent degenerative changes compared with tetrapod loci that exist as single copy, and that therefore can be viewed as representing the ancestral form. We discuss these findings in light of evolution of vertebrate chromosomal architecture and the identification of human disease mutations. PMID:18047696

  18. Comparative genetics and genomics of nematodes: genome structure, development, and lifestyle.

    PubMed

    Sommer, Ralf J; Streit, Adrian

    2011-01-01

    Nematodes are found in virtually all habitats on earth. Many of them are parasites of plants and animals, including humans. The free-living nematode, Caenorhabditis elegans, is one of the genetically best-studied model organisms and was the first metazoan whose genome was fully sequenced. In recent years, the draft genome sequences of another six nematodes representing four of the five major clades of nematodes were published. Compared to mammalian genomes, all these genomes are very small. Nevertheless, they contain almost the same number of genes as the human genome. Nematodes are therefore a very attractive system for comparative genetic and genomic studies, with C. elegans as an excellent baseline. Here, we review the efforts that were made to extend genetic analysis to nematodes other than C. elegans, and we compare the seven available nematode genomes. One of the most striking findings is the unexpectedly high incidence of gene acquisition through horizontal gene transfer (HGT). PMID:21721943

  19. A New System for Comparative Functional Genomics of Saccharomyces Yeasts

    PubMed Central

    Caudy, Amy A.; Guan, Yuanfang; Jia, Yue; Hansen, Christina; DeSevo, Chris; Hayes, Alicia P.; Agee, Joy; Alvarez-Dominguez, Juan R.; Arellano, Hugo; Barrett, Daniel; Bauerle, Cynthia; Bisaria, Namita; Bradley, Patrick H.; Breunig, J. Scott; Bush, Erin; Cappel, David; Capra, Emily; Chen, Walter; Clore, John; Combs, Peter A.; Doucette, Christopher; Demuren, Olukunle; Fellowes, Peter; Freeman, Sam; Frenkel, Evgeni; Gadala-Maria, Daniel; Gawande, Richa; Glass, David; Grossberg, Samuel; Gupta, Anita; Hammonds-Odie, Latanya; Hoisos, Aaron; Hsi, Jenny; Hsu, Yu-Han Huang; Inukai, Sachi; Karczewski, Konrad J.; Ke, Xiaobo; Kojima, Mina; Leachman, Samuel; Lieber, Danny; Liebowitz, Anna; Liu, Julia; Liu, Yufei; Martin, Trevor; Mena, Jose; Mendoza, Rosa; Myhrvold, Cameron; Millian, Christian; Pfau, Sarah; Raj, Sandeep; Rich, Matt; Rokicki, Joe; Rounds, William; Salazar, Michael; Salesi, Matthew; Sharma, Rajani; Silverman, Sanford; Singer, Cara; Sinha, Sandhya; Staller, Max; Stern, Philip; Tang, Hanlin; Weeks, Sharon; Weidmann, Maxwell; Wolf, Ashley; Young, Carmen; Yuan, Jie; Crutchfield, Christopher; McClean, Megan; Murphy, Coleen T.; Llinás, Manuel; Botstein, David; Troyanskaya, Olga G.; Dunham, Maitreya J.

    2013-01-01

    Whole-genome sequencing, particularly in fungi, has progressed at a tremendous rate. More difficult, however, is experimental testing of the inferences about gene function that can be drawn from comparative sequence analysis alone. We present a genome-wide functional characterization of a sequenced but experimentally understudied budding yeast, Saccharomyces bayanus var. uvarum (henceforth referred to as S. bayanus), allowing us to map changes over the 20 million years that separate this organism from S. cerevisiae. We first created a suite of genetic tools to facilitate work in S. bayanus. Next, we measured the gene-expression response of S. bayanus to a diverse set of perturbations optimized using a computational approach to cover a diverse array of functionally relevant biological responses. The resulting data set reveals that gene-expression patterns are largely conserved, but significant changes may exist in regulatory networks such as carbohydrate utilization and meiosis. In addition to regulatory changes, our approach identified gene functions that have diverged. The functions of genes in core pathways are highly conserved, but we observed many changes in which genes are involved in osmotic stress, peroxisome biogenesis, and autophagy. A surprising number of genes specific to S. bayanus respond to oxidative stress, suggesting the organism may have evolved under different selection pressures than S. cerevisiae. This work expands the scope of genome-scale evolutionary studies from sequence-based analysis to rapid experimental characterization and could be adopted for functional mapping in any lineage of interest. Furthermore, our detailed characterization of S. bayanus provides a valuable resource for comparative functional genomics studies in yeast. PMID:23852385

  20. Comparative genomics of Serratia spp.: two paths towards endosymbiotic life.

    PubMed

    Manzano-Marín, Alejandro; Lamelas, Araceli; Moya, Andrés; Latorre, Amparo

    2012-01-01

    Symbiosis is a widespread phenomenon in nature, in which insects show a great number of these associations. Buchnera aphidicola, the obligate endosymbiont of aphids, coexists in some species with another intracellular bacterium, Serratia symbiotica. Of particular interest is the case of the cedar aphid Cinara cedri, where B. aphidicola BCc and S. symbiotica SCc need each other to fulfil their symbiotic role with the insect. Moreover, various features seem to indicate that S. symbiotica SCc is closer to an obligate endosymbiont than to other facultative S. symbiotica, such as the one described for the aphid Acirthosyphon pisum (S. symbiotica SAp). This work is based on the comparative genomics of five strains of Serratia, three free-living and two endosymbiotic ones (one facultative and one obligate) which should allow us to dissect the genome reduction taking place in the adaptive process to an intracellular life-style. Using a pan-genome approach, we have identified shared and strain-specific genes from both endosymbiotic strains and gained insight into the different genetic reduction both S. symbiotica have undergone. We have identified both retained and reduced functional categories in S. symbiotica compared to the Free-Living Serratia (FLS) that seem to be related with its endosymbiotic role in their specific host-symbiont systems. By means of a phylogenomic reconstruction we have solved the position of both endosymbionts with confidence, established the probable insect-pathogen origin of the symbiotic clade as well as the high amino-acid substitution rate in S. symbiotica SCc. Finally, we were able to quantify the minimal number of rearrangements suffered in the endosymbiotic lineages and reconstruct a minimal rearrangement phylogeny. All these findings provide important evidence for the existence of at least two distinctive S. symbiotica lineages that are characterized by different rearrangements, gene content, genome size and branch lengths. PMID:23077583

  1. Computational Methods for the Analysis of Array Comparative Genomic Hybridization

    PubMed Central

    Chari, Raj; Lockwood, William W.; Lam, Wan L.

    2006-01-01

    Array comparative genomic hybridization (array CGH) is a technique for assaying the copy number status of cancer genomes. The widespread use of this technology has lead to a rapid accumulation of high throughput data, which in turn has prompted the development of computational strategies for the analysis of array CGH data. Here we explain the principles behind array image processing, data visualization and genomic profile analysis, review currently available software packages, and raise considerations for future software development. PMID:17992253

  2. Industrial Acetogenic Biocatalysts: A Comparative Metabolic and Genomic Analysis

    PubMed Central

    Bengelsdorf, Frank R.; Poehlein, Anja; Linder, Sonja; Erz, Catarina; Hummel, Tim; Hoffmeister, Sabrina; Daniel, Rolf; Dürre, Peter

    2016-01-01

    Synthesis gas (syngas) fermentation by anaerobic acetogenic bacteria employing the Wood–Ljungdahl pathway is a bioprocess for production of biofuels and biocommodities. The major fermentation products of the most relevant biocatalytic strains (Clostridium ljungdahlii, C. autoethanogenum, C. ragsdalei, and C. coskatii) are acetic acid and ethanol. A comparative metabolic and genomic analysis using the mentioned biocatalysts might offer targets for metabolic engineering and thus improve the production of compounds apart from ethanol. Autotrophic growth and product formation of the four wild type (WT) strains were compared in uncontrolled batch experiments. The genomes of C. ragsdalei and C. coskatii were sequenced and the genome sequences of all four biocatalytic strains analyzed in comparative manner. Growth and product spectra (acetate, ethanol, 2,3-butanediol) of C. autoethanogenum, C. ljungdahlii, and C. ragsdalei were rather similar. In contrast, C. coskatii produced significantly less ethanol and its genome sequence lacks two genes encoding aldehyde:ferredoxin oxidoreductases (AOR). Comparative genome sequence analysis of the four WT strains revealed high average nucleotide identity (ANI) of C. ljungdahlii and C. autoethanogenum (99.3%) and C. coskatii (98.3%). In contrast, C. ljungdahlii WT and C. ragsdalei WT showed an ANI-based similarity of only 95.8%. Additionally, recombinant C. ljungdahlii strains were constructed that harbor an artificial acetone synthesis operon (ASO) consisting of the following genes: adc, ctfA, ctfB, and thlA (encoding acetoacetate decarboxylase, acetoacetyl-CoA:acetate/butyrate:CoA-transferase subunits A and B, and thiolase) under the control of thlA promoter (PthlA) from C. acetobutylicum or native pta-ack promoter (Ppta-ack) from C. ljungdahlii. Respective recombinant strains produced 2-propanol rather than acetone, due to the presence of a NADPH-dependent primary-secondary alcohol dehydrogenase that converts acetone to 2

  3. Industrial Acetogenic Biocatalysts: A Comparative Metabolic and Genomic Analysis.

    PubMed

    Bengelsdorf, Frank R; Poehlein, Anja; Linder, Sonja; Erz, Catarina; Hummel, Tim; Hoffmeister, Sabrina; Daniel, Rolf; Dürre, Peter

    2016-01-01

    Synthesis gas (syngas) fermentation by anaerobic acetogenic bacteria employing the Wood-Ljungdahl pathway is a bioprocess for production of biofuels and biocommodities. The major fermentation products of the most relevant biocatalytic strains (Clostridium ljungdahlii, C. autoethanogenum, C. ragsdalei, and C. coskatii) are acetic acid and ethanol. A comparative metabolic and genomic analysis using the mentioned biocatalysts might offer targets for metabolic engineering and thus improve the production of compounds apart from ethanol. Autotrophic growth and product formation of the four wild type (WT) strains were compared in uncontrolled batch experiments. The genomes of C. ragsdalei and C. coskatii were sequenced and the genome sequences of all four biocatalytic strains analyzed in comparative manner. Growth and product spectra (acetate, ethanol, 2,3-butanediol) of C. autoethanogenum, C. ljungdahlii, and C. ragsdalei were rather similar. In contrast, C. coskatii produced significantly less ethanol and its genome sequence lacks two genes encoding aldehyde:ferredoxin oxidoreductases (AOR). Comparative genome sequence analysis of the four WT strains revealed high average nucleotide identity (ANI) of C. ljungdahlii and C. autoethanogenum (99.3%) and C. coskatii (98.3%). In contrast, C. ljungdahlii WT and C. ragsdalei WT showed an ANI-based similarity of only 95.8%. Additionally, recombinant C. ljungdahlii strains were constructed that harbor an artificial acetone synthesis operon (ASO) consisting of the following genes: adc, ctfA, ctfB, and thlA (encoding acetoacetate decarboxylase, acetoacetyl-CoA:acetate/butyrate:CoA-transferase subunits A and B, and thiolase) under the control of thlA promoter (P thlA ) from C. acetobutylicum or native pta-ack promoter (P pta-ack ) from C. ljungdahlii. Respective recombinant strains produced 2-propanol rather than acetone, due to the presence of a NADPH-dependent primary-secondary alcohol dehydrogenase that converts acetone to 2

  4. Comparative Genomics of an Emerging Amphibian Virus

    PubMed Central

    Epstein, Brendan; Storfer, Andrew

    2015-01-01

    Ranaviruses, a genus of the Iridoviridae, are large double-stranded DNA viruses that infect cold-blooded vertebrates worldwide. Ranaviruses have caused severe epizootics in commercial frog and fish populations, and are currently classified as notifiable pathogens in international trade. Previous work shows that a ranavirus that infects tiger salamanders throughout Western North America (Ambystoma tigrinum virus, or ATV) is in high prevalence among salamanders in the fishing bait trade. Bait ATV strains have elevated virulence and are transported long distances by humans, providing widespread opportunities for pathogen pollution. We sequenced the genomes of 15 strains of ATV collected from tiger salamanders across western North America and performed phylogenetic and population genomic analyses and tests for recombination. We find that ATV forms a monophyletic clade within the rest of the Ranaviruses and that it likely emerged within the last several thousand years, before human activities influenced its spread. We also identify several genes under strong positive selection, some of which appear to be involved in viral virulence and/or host immune evasion. In addition, we provide support for the pathogen pollution hypothesis with evidence of recombination among ATV strains, and potential bait-endemic strain recombination. PMID:26530419

  5. Comparative Genomics of an Emerging Amphibian Virus.

    PubMed

    Epstein, Brendan; Storfer, Andrew

    2016-01-01

    Ranaviruses, a genus of the Iridoviridae, are large double-stranded DNA viruses that infect cold-blooded vertebrates worldwide. Ranaviruses have caused severe epizootics in commercial frog and fish populations, and are currently classified as notifiable pathogens in international trade. Previous work shows that a ranavirus that infects tiger salamanders throughout Western North America (Ambystoma tigrinum virus, or ATV) is in high prevalence among salamanders in the fishing bait trade. Bait ATV strains have elevated virulence and are transported long distances by humans, providing widespread opportunities for pathogen pollution. We sequenced the genomes of 15 strains of ATV collected from tiger salamanders across western North America and performed phylogenetic and population genomic analyses and tests for recombination. We find that ATV forms a monophyletic clade within the rest of the Ranaviruses and that it likely emerged within the last several thousand years, before human activities influenced its spread. We also identify several genes under strong positive selection, some of which appear to be involved in viral virulence and/or host immune evasion. In addition, we provide support for the pathogen pollution hypothesis with evidence of recombination among ATV strains, and potential bait-endemic strain recombination. PMID:26530419

  6. Comparative genomic characterization of citrus-associated Xylella fastidiosa strains

    PubMed Central

    da Silva, Vivian S; Shida, Cláudio S; Rodrigues, Fabiana B; Ribeiro, Diógenes CD; de Souza, Alessandra A; Coletta-Filho, Helvécio D; Machado, Marcos A; Nunes, Luiz R; de Oliveira, Regina Costa

    2007-01-01

    Background The xylem-inhabiting bacterium Xylella fastidiosa (Xf) is the causal agent of Pierce's disease (PD) in vineyards and citrus variegated chlorosis (CVC) in orange trees. Both of these economically-devastating diseases are caused by distinct strains of this complex group of microorganisms, which has motivated researchers to conduct extensive genomic sequencing projects with Xf strains. This sequence information, along with other molecular tools, have been used to estimate the evolutionary history of the group and provide clues to understand the capacity of Xf to infect different hosts, causing a variety of symptoms. Nonetheless, although significant amounts of information have been generated from Xf strains, a large proportion of these efforts has concentrated on the study of North American strains, limiting our understanding about the genomic composition of South American strains – which is particularly important for CVC-associated strains. Results This paper describes the first genome-wide comparison among South American Xf strains, involving 6 distinct citrus-associated bacteria. Comparative analyses performed through a microarray-based approach allowed identification and characterization of large mobile genetic elements that seem to be exclusive to South American strains. Moreover, a large-scale sequencing effort, based on Suppressive Subtraction Hybridization (SSH), identified 290 new ORFs, distributed in 135 Groups of Orthologous Elements, throughout the genomes of these bacteria. Conclusion Results from microarray-based comparisons provide further evidence concerning activity of horizontally transferred elements, reinforcing their importance as major mediators in the evolution of Xf. Moreover, the microarray-based genomic profiles showed similarity between Xf strains 9a5c and Fb7, which is unexpected, given the geographical and chronological differences associated with the isolation of these microorganisms. The newly identified ORFs, obtained by

  7. High frequency of submicroscopic chromosomal imbalances in patients with syndromic craniosynostosis detected by a combined approach of microsatellite segregation analysis, multiplex ligation-dependent probe amplification and array-based comparative genome hybridisation.

    PubMed

    Jehee, F S; Krepischi-Santos, A C V; Rocha, K M; Cavalcanti, D P; Kim, C A; Bertola, D R; Alonso, L G; D'Angelo, C S; Mazzeu, J F; Froyen, G; Lugtenberg, D; Vianna-Morgante, A M; Rosenberg, C; Passos-Bueno, M R

    2008-07-01

    We present the first comprehensive study, to our knowledge, on genomic chromosomal analysis in syndromic craniosynostosis. In total, 45 patients with craniosynostotic disorders were screened with a variety of methods including conventional karyotype, microsatellite segregation analysis, subtelomeric multiplex ligation-dependent probe amplification) and whole-genome array-based comparative genome hybridisation. Causative abnormalities were present in 42.2% (19/45) of the samples, and 27.8% (10/36) of the patients with normal conventional karyotype carried submicroscopic imbalances. Our results include a wide variety of imbalances and point to novel chromosomal regions associated with craniosynostosis. The high incidence of pure duplications or trisomies suggests that these are important mechanisms in craniosynostosis, particularly in cases involving the metopic suture. PMID:18456720

  8. Gramene 2016: comparative plant genomics and pathway resources

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Gramene (http://www.gramene.org) is an online resource for comparative functional genomics in crops and model plant species. Its two main frameworks are genomes (collaboration with Ensembl Plants) and pathways (The Plant Reactome and archival BioCyc databases). Since our last NAR update, the data...

  9. Genomic-associated Markers and comparative Genome Maps of Xanthomonas oryzae pv. oryzae and X. oryzae pv. oryzicola.

    PubMed

    Feng, Wenjie; Wang, Yi; Huang, Lisha; Feng, Chuanshun; Chu, Zhaohui; Ding, Xinhua; Yang, Long

    2015-09-01

    Xanthomonas oryzae pv. oryzae (Xoo) and X. oryzae pv. oryzicola (Xoc) cause two major seed quarantine diseases in rice, bacterial blight and bacterial leaf streak, respectively. Xoo and Xoc share high similarity in genomic sequence, which results in hard differentiation of the two pathogens. Genomic-associated Markers and comparative Genome Maps database (GMGM) is an integrated database providing comprehensive information including compared genome maps and full genomic-coverage molecular makers of Xoo and Xoc. This database was established based on bioinformatic analysis of complete sequenced genomes of several X. oryzae pathovars of which the similarity of the genomes was up to 91.39 %. The program was designed with a series of specific PCR primers, including 286 pairs of Xoo dominant markers, 288 pairs of Xoc dominant markers, and 288 pairs of Xoo and Xoc co-dominant markers, which were predicted to distinguish two pathovars. Test on a total of 40 donor pathogen strains using randomly selected 120 pairs of primers demonstrated that over 52.5 % of the primers were efficacious. The GMGM web portal ( http://biodb.sdau.edu.cn/gmgm/ ) will be a powerful tool that can present highly specific diagnostic markers, and it also provides information about comparative genome maps of the two pathogens for future evolution study. PMID:26093644

  10. Microbial NAD metabolism: lessons from comparative genomics.

    PubMed

    Gazzaniga, Francesca; Stebbins, Rebecca; Chang, Sheila Z; McPeek, Mark A; Brenner, Charles

    2009-09-01

    NAD is a coenzyme for redox reactions and a substrate of NAD-consuming enzymes, including ADP-ribose transferases, Sir2-related protein lysine deacetylases, and bacterial DNA ligases. Microorganisms that synthesize NAD from as few as one to as many as five of the six identified biosynthetic precursors have been identified. De novo NAD synthesis from aspartate or tryptophan is neither universal nor strictly aerobic. Salvage NAD synthesis from nicotinamide, nicotinic acid, nicotinamide riboside, and nicotinic acid riboside occurs via modules of different genes. Nicotinamide salvage genes nadV and pncA, found in distinct bacteria, appear to have spread throughout the tree of life via horizontal gene transfer. Biochemical, genetic, and genomic analyses have advanced to the point at which the precursors and pathways utilized by a microorganism can be predicted. Challenges remain in dissecting regulation of pathways. PMID:19721089

  11. Array-Based Comparative Genomic Hybridization Analysis Reveals Chromosomal Copy Number Aberrations Associated with Clinical Outcome in Canine Diffuse Large B-Cell Lymphoma

    PubMed Central

    Bresolin, Silvia; Marconato, Laura; Comazzi, Stefano; Te Kronnie, Geertruy; Aresu, Luca

    2014-01-01

    Canine Diffuse Large B-cell Lymphoma (cDLBCL) is an aggressive cancer with variable clinical response. Despite recent attempts by gene expression profiling to identify the dog as a potential animal model for human DLBCL, this tumor remains biologically heterogeneous with no prognostic biomarkers to predict prognosis. The aim of this work was to identify copy number aberrations (CNAs) by high-resolution array comparative genomic hybridization (aCGH) in 12 dogs with newly diagnosed DLBCL. In a subset of these dogs, the genetic profiles at the end of therapy and at relapse were also assessed. In primary DLBCLs, 90 different genomic imbalances were counted, consisting of 46 gains and 44 losses. Two gains in chr13 were significantly correlated with clinical stage. In addition, specific regions of gains and losses were significantly associated to duration of remission. In primary DLBCLs, individual variability was found, however 14 recurrent CNAs (>30%) were identified. Losses involving IGK, IGL and IGH were always found, and gains along the length of chr13 and chr31 were often observed (>41%). In these segments, MYC, LDHB, HSF1, KIT and PDGFRα are annotated. At the end of therapy, dogs in remission showed four new CNAs, whereas three new CNAs were observed in dogs at relapse compared with the previous profiles. One ex novo CNA, involving TCR, was present in dogs in remission after therapy, possibly induced by the autologous vaccine. Overall, aCGH identified small CNAs associated with outcome, which, along with future expression studies, may reveal target genes relevant to cDLBCL. PMID:25372838

  12. Comparative Genomics and Extensive Recombinations in Phage Communities

    NASA Astrophysics Data System (ADS)

    Poisson, Guylaine; Belcaid, Mahdi; Bergeron, Anne

    Comparing the genomes of two closely related viruses often produces mosaics where nearly identical sequences alternate with sequences that are unique to each genome. When several closely related genomes are compared, the unique sequences are likely to be shared with third genomes, leading to virus mosaic communities. Here we present comparative analysis of sets of Staphylococcus aureus phages that share large identical sequences with up to three other genomes, and with different partners along their genomes. We introduce mosaic graphs to represent these complex recombination events, and use them to illustrate the breath and depth of sequence sharing: some genomes are almost completely made up of shared sequences, while genomes that share very large identical sequences can adopt alternate functional modules. Mosaic graphs also allow us to identify breakpoints that could eventually be used for the construction of recombination networks. These findings have several implications on phage metagenomics assembly, on the horizontal gene transfer paradigm, and more generally on the understanding of the composition and evolutionary dynamics of virus communities.

  13. Plastic architecture of bacterial genome revealed by comparative genomics of Photorhabdus variants

    PubMed Central

    Gaudriault, Sophie; Pages, Sylvie; Lanois, Anne; Laroui, Christine; Teyssier, Corinne; Jumas-Bilak, Estelle; Givaudan, Alain

    2008-01-01

    Background The phenotypic consequences of large genomic architecture modifications within a clonal bacterial population are rarely evaluated because of the difficulties associated with using molecular approaches in a mixed population. Bacterial variants frequently arise among Photorhabdus luminescens, a nematode-symbiotic and insect-pathogenic bacterium. We therefore studied genome plasticity within Photorhabdus variants. Results We used a combination of macrorestriction and DNA microarray experiments to perform a comparative genomic study of different P. luminescens TT01 variants. Prolonged culturing of TT01 strain and a genomic variant, collected from the laboratory-maintained symbiotic nematode, generated bacterial lineages composed of primary and secondary phenotypic variants and colonial variants. The primary phenotypic variants exhibit several characteristics that are absent from the secondary forms. We identify substantial plasticity of the genome architecture of some variants, mediated mainly by deletions in the 'flexible' gene pool of the TT01 reference genome and also by genomic amplification. We show that the primary or secondary phenotypic variant status is independent from global genomic architecture and that the bacterial lineages are genomic lineages. We focused on two unusual genomic changes: a deletion at a new recombination hotspot composed of long approximate repeats; and a 275 kilobase single block duplication belonging to a new class of genomic duplications. Conclusion Our findings demonstrate that major genomic variations occur in Photorhabdus clonal populations. The phenotypic consequences of these genomic changes are cryptic. This study provides insight into the field of bacterial genome architecture and further elucidates the role played by clonal genomic variation in bacterial genome evolution. PMID:18647395

  14. DNA sequence copy number analysis by Comparative Genomic Hybridization (CGH)

    SciTech Connect

    Pinkel, D.; Kallioniemi, A.; Kallioniemi, O.; Waldman, F.; Sudar, D.; Gray, I. ); Rutovitz, D.; Piper, I. )

    1993-01-01

    Comparative Genomic Hybridization (CGH) uses the kinetics of in situ hybridization to compare the copy numbers of different DNA sequences within the same genome and the copy numbers of the same sequences among different genomes. In a typical application genomic DNA from a tumor and from normal cells are differentially labeled and simultaneously hybridized to normal metaphase chromosomes, and detected with different fluorochromes. Properly registered images of each fluorochrome are obtained using a microscope equipped with multi-band filters and a CCD camera. Digital image analysis permits measurement of intensity ratio profiles along each of the target chromosomes. Studies of cells with known aberrations indicate that the intensity ratio at each position is proportional to the ratio of the copy numbers of the sequences that bind there in the tumor and normal genomes. Analytical challenges posed by the need to efficiently obtain copy number karyotypes are discussed.

  15. BLAT-based comparative analysis for transposable elements: BLATCAT.

    PubMed

    Lee, Sangbum; Oh, Sumin; Kang, Keunsoo; Han, Kyudong

    2014-01-01

    The availability of several whole genome sequences makes comparative analyses possible. In primate genomes, the priority of transposable elements (TEs) is significantly increased because they account for ~45% of the primate genomes, they can regulate the gene expression level, and they are associated with genomic fluidity in their host genomes. Here, we developed the BLAST-like alignment tool (BLAT) based comparative analysis for transposable elements (BLATCAT) program. The BLATCAT program can compare specific regions of six representative primate genome sequences (human, chimpanzee, gorilla, orangutan, gibbon, and rhesus macaque) on the basis of BLAT and simultaneously carry out RepeatMasker and/or Censor functions, which are widely used Windows-based web-server functions to detect TEs. All results can be stored as a HTML file for manual inspection of a specific locus. BLATCAT will be very convenient and efficient for comparative analyses of TEs in various primate genomes. PMID:24959585

  16. Comparative genomics of vesicomyid clam (Bivalvia: Mollusca) chemosynthetic symbionts

    PubMed Central

    Newton, Irene LG; Girguis, Peter R; Cavanaugh, Colleen M

    2008-01-01

    Background The Vesicomyidae (Bivalvia: Mollusca) are a family of clams that form symbioses with chemosynthetic gamma-proteobacteria. They exist in environments such as hydrothermal vents and cold seeps and have a reduced gut and feeding groove, indicating a large dependence on their endosymbionts for nutrition. Recently, two vesicomyid symbiont genomes were sequenced, illuminating the possible nutritional contributions of the symbiont to the host and making genome-wide evolutionary analyses possible. Results To examine the genomic evolution of the vesicomyid symbionts, a comparative genomics framework, including the existing genomic data combined with heterologous microarray hybridization results, was used to analyze conserved gene content in four vesicomyid symbiont genomes. These four symbionts were chosen to include a broad phylogenetic sampling of the vesicomyid symbionts and represent distinct chemosynthetic environments: cold seeps and hydrothermal vents. Conclusion The results of this comparative genomics analysis emphasize the importance of the symbionts' chemoautotrophic metabolism within their hosts. The fact that these symbionts appear to be metabolically capable autotrophs underscores the extent to which the host depends on them for nutrition and reveals the key to invertebrate colonization of these challenging environments. PMID:19055818

  17. Comparative Genomics of a Parthenogenesis-Inducing Wolbachia Symbiont.

    PubMed

    Lindsey, Amelia R I; Werren, John H; Richards, Stephen; Stouthamer, Richard

    2016-01-01

    Wolbachia is an intracellular symbiont of invertebrates responsible for inducing a wide variety of phenotypes in its host. These host-Wolbachia relationships span the continuum from reproductive parasitism to obligate mutualism, and provide a unique system to study genomic changes associated with the evolution of symbiosis. We present the genome sequence from a parthenogenesis-inducing Wolbachia strain (wTpre) infecting the minute parasitoid wasp Trichogramma pretiosum The wTpre genome is the most complete parthenogenesis-inducing Wolbachia genome available to date. We used comparative genomics across 16 Wolbachia strains, representing five supergroups, to identify a core Wolbachia genome of 496 sets of orthologous genes. Only 14 of these sets are unique to Wolbachia when compared to other bacteria from the Rickettsiales. We show that the B supergroup of Wolbachia, of which wTpre is a member, contains a significantly higher number of ankyrin repeat-containing genes than other supergroups. In the wTpre genome, there is evidence for truncation of the protein coding sequences in 20% of ORFs, mostly as a result of frameshift mutations. The wTpre strain represents a conversion from cytoplasmic incompatibility to a parthenogenesis-inducing lifestyle, and is required for reproduction in the Trichogramma host it infects. We hypothesize that the large number of coding frame truncations has accompanied the change in reproductive mode of the wTpre strain. PMID:27194801

  18. Comparative rates of evolution in endosymbiotic nuclear genomes

    PubMed Central

    Patron, Nicola J; Rogers, Matthew B; Keeling, Patrick J

    2006-01-01

    Background The nucleomorphs associated with secondary plastids of cryptomonads and chlorarachniophytes are the sole examples of organelles with eukaryotic nuclear genomes. Although not as widespread as their prokaryotic equivalents in mitochondria and plastids, nucleomorph genomes share similarities in terms of reduction and compaction. They also differ in several aspects, not least in that they encode proteins that target to the plastid, and so function in a different compartment from that in which they are encoded. Results Here, we test whether the phylogenetically distinct nucleomorph genomes of the cryptomonad, Guillardia theta, and the chlorarachniophyte, Bigelowiella natans, have experienced similar evolutionary pressures during their transformation to reduced organelles. We compared the evolutionary rates of genes from nuclear, nucleomorph, and plastid genomes, all of which encode proteins that function in the same cellular compartment, the plastid, and are thus subject to similar selection pressures. Furthermore, we investigated the divergence of nucleomorphs within cryptomonads by comparing G. theta and Rhodomonas salina. Conclusion Chlorarachniophyte nucleomorph genes have accumulated errors at a faster rate than other genomes within the same cell, regardless of the compartment where the gene product functions. In contrast, most nucleomorph genes in cryptomonads have evolved faster than genes in other genomes on average, but genes for plastid-targeted proteins are not overly divergent, and it appears that cryptomonad nucleomorphs are not presently evolving rapidly and have therefore stabilized. Overall, these analyses suggest that the forces at work in the two lineages are different, despite the similarities between the structures of their genomes. PMID:16772046

  19. Comparative Genomics of a Parthenogenesis-Inducing Wolbachia Symbiont

    PubMed Central

    Lindsey, Amelia R. I.; Werren, John H.; Richards, Stephen; Stouthamer, Richard

    2016-01-01

    Wolbachia is an intracellular symbiont of invertebrates responsible for inducing a wide variety of phenotypes in its host. These host-Wolbachia relationships span the continuum from reproductive parasitism to obligate mutualism, and provide a unique system to study genomic changes associated with the evolution of symbiosis. We present the genome sequence from a parthenogenesis-inducing Wolbachia strain (wTpre) infecting the minute parasitoid wasp Trichogramma pretiosum. The wTpre genome is the most complete parthenogenesis-inducing Wolbachia genome available to date. We used comparative genomics across 16 Wolbachia strains, representing five supergroups, to identify a core Wolbachia genome of 496 sets of orthologous genes. Only 14 of these sets are unique to Wolbachia when compared to other bacteria from the Rickettsiales. We show that the B supergroup of Wolbachia, of which wTpre is a member, contains a significantly higher number of ankyrin repeat-containing genes than other supergroups. In the wTpre genome, there is evidence for truncation of the protein coding sequences in 20% of ORFs, mostly as a result of frameshift mutations. The wTpre strain represents a conversion from cytoplasmic incompatibility to a parthenogenesis-inducing lifestyle, and is required for reproduction in the Trichogramma host it infects. We hypothesize that the large number of coding frame truncations has accompanied the change in reproductive mode of the wTpre strain. PMID:27194801

  20. Genome sequence and comparative genome analysis of Pseudomonas syringae pv. syringae type strain ATCC 19310.

    PubMed

    Park, Yong-Soon; Jeong, Haeyoung; Sim, Young Mi; Yi, Hwe-Su; Ryu, Choong-Min

    2014-04-01

    Pseudomonas syringae pv. syringae (Psy) is a major bacterial pathogen of many economically important plant species. Despite the severity of its impact, the genome sequence of the type strain has not been reported. Here, we present the draft genome sequence of Psy ATCC 19310. Comparative genomic analysis revealed that Psy ATCC 19310 is closely related to Psy B728a. However, only a few type III effectors, which are key virulence factors, are shared by the two strains, indicating the possibility of host-pathogen specificity and genome dynamics, even under the pathovar level. PMID:24444998

  1. Sputnik: a database platform for comparative plant genomics.

    PubMed

    Rudd, Stephen; Mewes, Hans-Werner; Mayer, Klaus F X

    2003-01-01

    Two million plant ESTs, from 20 different plant species, and totalling more than one 1000 Mbp of DNA sequence, represents a formidable transcriptomic resource. Sputnik uses the potential of this sequence resource to fill some of the information gap in the un-sequenced plant genomes and to serve as the foundation for in silicio comparative plant genomics. The complexity of the individual EST collections has been reduced using optimised EST clustering techniques. Annotation of cluster sequences is performed by exploiting and transferring information from the comprehensive knowledgebase already produced for the completed model plant genome (Arabidopsis thaliana) and by performing additional state of-the-art sequence analyses relevant to today's plant biologist. Functional predictions, comparative analyses and associative annotations for 500 000 plant EST derived peptides make Sputnik (http://mips.gsf.de/proj/sputnik/) a valid platform for contemporary plant genomics. PMID:12519965

  2. Sequencing and comparative genomics analysis in Senecio scandens Buch.-Ham. Ex D. Don, based on full-length cDNA library

    PubMed Central

    Qian, Gang; Ping, Junjiao; Zhang, Zhen; Xu, Delin

    2014-01-01

    Senecio scandens Buch.-Ham. ex D. Don, an important antibacterial source of Chinese traditional medicine, has a widespread distribution in a few ecological habitats of China. We generated a full-length complementary DNA (cDNA) library from a sample of elite individuals with superior antibacterial properties, with satisfactory parameters such as library storage (4.30 × 106 CFU), efficiency of titre (1.30 × 106 CFU/mL), transformation efficiency (96.35%), full-length ratio (64.00%) and redundancy ratio (3.28%). The BLASTN search revealed the facile formation of counterparts between the experimental sample and Arabidopsis thaliana in view of high-homology cDNA sequence (90.79%) with e-values <1e – 50. Sequence similarities to known proteins indicate that the entire sequences of the full-length cDNA clones consist of the major of functional genes identified by a large set of microarray data from the present experimental material. For other Compositae species, a large set of full-length cDNA clones reported in the present article will serve as a useful resource to facilitate further research on the transferability of expressed sequence tag-derived simple sequence repeats (EST-SSR) development, comparative genomics and novel transcript profiles. PMID:26740776

  3. Comparative analysis of prophage-like elements in Helicobacter sp. genomes

    PubMed Central

    Fan, Xiangyu; Li, Yumei; He, Rong

    2016-01-01

    Prophages are regarded as one of the factors underlying bacterial virulence, genomic diversification, and fitness, and are ubiquitous in bacterial genomes. Information on Helicobacter sp. prophages remains scarce. In this study, sixteen prophages were identified and analyzed in detail. Eight of them are described for the first time. Based on a comparative genomic analysis, these sixteen prophages can be classified into four different clusters. Phylogenetic relationships of Cluster A Helicobacter prophages were investigated. Furthermore, genomes of Helicobacter prophages from Clusters B, C, and D were analyzed. Interestingly, some putative antibiotic resistance proteins and virulence factors were associated with Helicobacter prophages. PMID:27169002

  4. BACFinder: genomic localisation of large insert genomic clones based on restriction fingerprinting

    PubMed Central

    Crowe, Mark L.; Rana, Debashis; Fraser, Fiona; Bancroft, Ian; Trick, Martin

    2002-01-01

    We have developed software that allows the prediction of the genomic location of a bacterial artificial chromosome (BAC) clone, or other large genomic clone, based on a simple restriction digest of the BAC. The mapping is performed by comparing the experimentally derived restriction digest of the BAC DNA with a virtual restriction digest of the whole genome sequence. Our trials indicate that this program identified the genomic regions represented by BAC clones with a degree of accuracy comparable to that of end-sequencing, but at considerably less cost. Although the program has been developed principally for use with Arabidopsis BACs, it should align large insert genomic clones to any fully sequenced genome. PMID:12409477

  5. Establishing a framework for comparative analysis of genome sequences

    SciTech Connect

    Bansal, A.K.

    1995-06-01

    This paper describes a framework and a high-level language toolkit for comparative analysis of genome sequence alignment The framework integrates the information derived from multiple sequence alignment and phylogenetic tree (hypothetical tree of evolution) to derive new properties about sequences. Multiple sequence alignments are treated as an abstract data type. Abstract operations have been described to manipulate a multiple sequence alignment and to derive mutation related information from a phylogenetic tree by superimposing parsimonious analysis. The framework has been applied on protein alignments to derive constrained columns (in a multiple sequence alignment) that exhibit evolutionary pressure to preserve a common property in a column despite mutation. A Prolog toolkit based on the framework has been implemented and demonstrated on alignments containing 3000 sequences and 3904 columns.

  6. Comparative Genome Analysis of Basidiomycete Fungi

    SciTech Connect

    Riley, Robert; Salamov, Asaf; Morin, Emmanuelle; Nagy, Laszlo; Manning, Gerard; Baker, Scott; Brown, Daren; Henrissat, Bernard; Levasseur, Anthony; Hibbett, David; Martin, Francis; Grigoriev, Igor

    2012-03-19

    Fungi of the phylum Basidiomycota (basidiomycetes), make up some 37percent of the described fungi, and are important in forestry, agriculture, medicine, and bioenergy. This diverse phylum includes the mushrooms, wood rots, symbionts, and plant and animal pathogens. To better understand the diversity of phenotypes in basidiomycetes, we performed a comparative analysis of 35 basidiomycete fungi spanning the diversity of the phylum. Phylogenetic patterns of lignocellulose degrading genes suggest a continuum rather than a sharp dichotomy between the white rot and brown rot modes of wood decay. Patterns of secondary metabolic enzymes give additional insight into the broad array of phenotypes found in the basidiomycetes. We suggest that the profile of an organism in lignocellulose-targeting genes can be used to predict its nutritional mode, and predict Dacryopinax sp. as a brown rot; Botryobasidium botryosum and Jaapia argillacea as white rots.

  7. Complete genome sequencing and comparative genomic analysis of functionally diverse Lysinibacillus sphaericus III(3)7.

    PubMed

    Rey, Andrés; Silva-Quintero, Laura; Dussán, Jenny

    2016-09-01

    Lysinibacillus sphaericus III(3)7 is a native Colombian strain, the first one isolated from soil samples. This strain has shown high levels of pathogenic activity against Culex quinquefaciatus larvae in laboratory assays compared to other members of the same species. Using Pacific Biosciences sequencing technology we sequenced, annotated (de novo) and described the genome of strain III(3)7, achieving a complete genome sequence status. We then performed a comparative analysis between the newly sequenced genome and the ones previously reported for Colombian isolates L. sphaericus OT4b.31, CBAM5 and OT4b.25, with the inclusion of L. sphaericus C3-41 that has been used as a reference genome for most of previous genome sequencing projects. We concluded that L. sphaericus III(3)7 is highly similar with strain OT4b.25 and shares high levels of synteny with isolates CBAM5 and C3-41. PMID:27419068

  8. Comparative genome analysis of Solanum lycopersicum and Solanum tuberosum

    PubMed Central

    Lall, Rohit; Thomas, George; Singh, Satendra; Singh, Archana; Wadhwa, Gulshan

    2013-01-01

    Solanum lycopersicum and Solanum tuberosum are agriculturally important crop species as they are rich sources of starch, protein, antioxidants, lycopene, beta-carotene, vitamin C, and fiber. The genomes of S. lycopersicum and S. tuberosum are currently available. However the linear strings of nucleotides that together comprise a genome sequence are of limited significance by themselves. Computational and bioinformatics approaches can be used to exploit the genomes for fundamental research for improving their varieties. The comparative genome analysis, Pfam analysis of predicted reviewed paralogous proteins was performed. It was found that S. lycopersicum proteins belong to more families, domains and clans in comparison with S. tuberosum. It was also found that mostly intergenic regions are conserved in two genomes followed by exons, intron and UTR. This can be exploited to predict regions between genomes that are similar to each other and to study the evolutionary relationship between two genomes, leading towards the development of disease resistance, stress tolerance and improved varieties of tomato. PMID:24307771

  9. Comparative analysis of RNAi screening technologies at genome-scale reveals an inherent processing inefficiency of the plasmid-based shRNA hairpin

    PubMed Central

    Bhinder, Bhavneet; Shum, David; Djaballah, Hakim

    2014-01-01

    RNAi screening in combination with the genome-sequencing projects would constitute the Holy Grail of modern genetics; enabling discovery and validation towards a better understanding of fundamental biology leading to novel targets to combat disease. Hit discordance at inter-screen level together with the lack of reproducibility is emerging as the technology's main pitfalls. To examine some of the underlining factors leading to such discrepancies, we reasoned that perhaps there is an inherent difference in knockdown efficiency of the various RNAi technologies. For this purpose, we utilized the two most popular ones, chemically synthesized siRNA duplex and plasmid-based shRNA hairpin, in order to perform a head to head comparison. Using a previously developed gain-of-function assay probing modulators of the miRNA biogenesis pathway, we first executed on a siRNA screen against the Silencer Select V4.0 library (AMB) nominating 1,273, followed by an shRNA screen against the TRC1 library (TRC1) nominating 497 gene candidates. We observed a poor overlap of only 29 hits given that there are 15,068 overlapping genes between the two libraries; with DROSHA as the only common hit out of the seven known core miRNA biogenesis genes. Distinct genes interacting with the same biogenesis regulators were observed in both screens, with a dismal cross-network overlap of only 3 genes (DROSHA, TGFBR1, and DIS3). Taken together, our study demonstrates differential knockdown activities between the two technologies, possibly due to the inefficient intracellular processing and potential cell-type specificity determinants in generating intended targeting sequences for the plasmid-based shRNA hairpins; and suggests this observed inefficiency as potential culprit in addressing the lack of reproducibility. PMID:24433414

  10. Comparative analysis of RNAi screening technologies at genome-scale reveals an inherent processing inefficiency of the plasmid-based shRNA hairpin.

    PubMed

    Bhinder, Bhavneet; Shum, David; Djaballah, Hakim

    2014-02-01

    RNAi screening in combination with the genome-sequencing projects would constitute the Holy Grail of modern genetics; enabling discovery and validation towards a better understanding of fundamental biology leading to novel targets to combat disease. Hit discordance at inter-screen level together with the lack of reproducibility is emerging as the technology's main pitfalls. To examine some of the underlining factors leading to such discrepancies, we reasoned that perhaps there is an inherent difference in knockdown efficiency of the various RNAi technologies. For this purpose, we utilized the two most popular ones, chemically synthesized siRNA duplex and plasmid-based shRNA hairpin, in order to perform a head to head comparison. Using a previously developed gain-of-function assay probing modulators of the miRNA biogenesis pathway, we first executed on a siRNA screen against the Silencer Select V4.0 library (AMB) nominating 1,273, followed by an shRNA screen against the TRC1 library (TRC1) nominating 497 gene candidates. We observed a poor overlap of only 29 hits given that there are 15,068 overlapping genes between the two libraries; with DROSHA as the only common hit out of the seven known core miRNA biogenesis genes. Distinct genes interacting with the same biogenesis regulators were observed in both screens, with a dismal cross-network overlap of only 3 genes (DROSHA, TGFBR1, and DIS3). Taken together, our study demonstrates differential knockdown activities between the two technologies, possibly due to the inefficient intracellular processing and potential cell-type specificity determinants in generating intended targeting sequences for the plasmid-based shRNA hairpins; and suggests this observed inefficiency as potential culprit in addressing the lack of reproducibility. PMID:24433414

  11. Comparative proteogenomics: combining mass spectrometry and comparative genomics to analyze multiple genomes

    SciTech Connect

    Gupta, Nitin; Benhamida, Jamal; Bhargava, Vipul; Goodman, Daniel; Kain , Elisabeth; Kerman, Ian; Nguyen , Ngan; Ollikainen, Noah; Rodriguez, Jesse; Wang, J.; Lipton, Mary S.; Romine, Margaret F.; Bafna, Vineet; Smith, Richard D.; Pevzner, Pavel A.

    2008-07-30

    While bacterial genome annotations have significantly improved in recent years, techniques for bacterial proteome annotation (including post-translational chemical modifications, signal peptides, proteolytic events, etc.) are still in their infancy. At the same time, the number of sequenced bacterial genomes is rising sharply, far outpacing our ability to validate the predicted genes, let alone annotate bacterial proteomes. In this study, we use tandem mass spectrometry (MS/MS) to annotate the proteome of Shewanella oneidensis MR-1, an important microbe for bioremediation. In particular, we provide the first comprehensive map of post-translational modifications in a bacterial genome, including a large number of chemical modifications, signal peptide cleavages and cleavage of N-terminal methionine residues. We also detect multiple genes that were missed or assigned incorrect start positions by gene prediction programs and suggest corrections to improve the gene annotation. This study demonstrates that complementing every genome sequencing project by an MS/MS project would significantly improve both genome and proteome annotations for a reasonable cost.

  12. The Rice Genome Knowledgebase (RGKbase): an annotation database for rice comparative genomics and evolutionary biology

    PubMed Central

    Wang, Dapeng; Xia, Yan; Li, Xinna; Hou, Lixia; Yu, Jun

    2013-01-01

    Over the past 10 years, genomes of cultivated rice cultivars and their wild counterparts have been sequenced although most efforts are focused on genome assembly and annotation of two major cultivated rice (Oryza sativa L.) subspecies, 93-11 (indica) and Nipponbare (japonica). To integrate information from genome assemblies and annotations for better analysis and application, we now introduce a comparative rice genome database, the Rice Genome Knowledgebase (RGKbase, http://rgkbase.big.ac.cn/RGKbase/). RGKbase is built to have three major components: (i) integrated data curation for rice genomics and molecular biology, which includes genome sequence assemblies, transcriptomic and epigenomic data, genetic variations, quantitative trait loci (QTLs) and the relevant literature; (ii) User-friendly viewers, such as Gbrowse, GeneBrowse and Circos, for genome annotations and evolutionary dynamics and (iii) Bioinformatic tools for compositional and synteny analyses, gene family classifications, gene ontology terms and pathways and gene co-expression networks. RGKbase current includes data from five rice cultivars and species: Nipponbare (japonica), 93-11 (indica), PA64s (indica), the African rice (Oryza glaberrima) and a wild rice species (Oryza brachyantha). We are also constantly introducing new datasets from variety of public efforts, such as two recent releases—sequence data from ∼1000 rice varieties, which are mapped into the reference genome, yielding ample high-quality single-nucleotide polymorphisms and insertions–deletions. PMID:23193278

  13. DCODE.ORG Anthology of Comparative Genomic Tools

    SciTech Connect

    Loots, G G; Ovcharenko, I

    2005-01-11

    Comparative genomics provides the means to demarcate functional regions in anonymous DNA sequences. The successful application of this method to identifying novel genes is currently shifting to deciphering the noncoding encryption of gene regulation across genomes. To facilitate the use of comparative genomics to practical applications in genetics and genomics we have developed several analytical and visualization tools for the analysis of arbitrary sequences and whole genomes. These tools include two alignment tools: zPicture and Mulan; a phylogenetic shadowing tool: eShadow for identifying lineage- and species-specific functional elements; two evolutionary conserved transcription factor analysis tools: rVista and multiTF; a tool for extracting cis-regulatory modules governing the expression of co-regulated genes, CREME; and a dynamic portal to multiple vertebrate and invertebrate genome alignments, the ECR Browser. Here we briefly describe each one of these tools and provide specific examples on their practical applications. All the tools are publicly available at the http://www.dcode.org/ web site.

  14. Phytozome: a Tool for Green Plant Comparative Genomics

    DOE Data Explorer

    Phytozome is a joint project of the Department of Energy's Joint Genome Institute and the Center for Integrative Genomics to facilitate comparative genomic studies amongst green plants. Clusters of orthologous and paralogous genes that represent the modern descendents of ancestral gene sets are constructed at key phylogenetic nodes. These clusters allow easy access to clade specific orthology/paralogy relationships as well as clade specific genes and gene expansions. As of release v4.0, Phytozome provides access to nine sequenced and annotated green plant genomes, eight of which have been clustered into gene families at six evolutionarily significant nodes. Where possible, each gene has been annotated with PFAM, KOG, KEGG, and PANTHER assignments, and publicly available annotations from RefSeq, UniProt, TAIR, JGI are hyper-linked and searchable. [Copied from the Overview at http://www.phytozome.net/Phytozome_info.php

  15. Comparative Bacterial Proteomics: Analysis of the Core Genome Concept

    SciTech Connect

    Callister, Stephen J.; McCue, Lee Ann; Turse, Josh E.; Monroe, Matthew E.; Auberry, Kenneth J.; Smith, Richard D.; Adkins, Joshua N.; Lipton, Mary S.

    2008-02-06

    Comparative bacterial genomic studies commonly predict a set of genes indicative of common ancestry. Experimental validation of the existence of this core genome requires extensive measurement and is not typically undertaken. Enabled by an extensive proteome database development over a six year period, we experimentally verified the expression of proteins predicted from genomic ortholog comparisons among 17 environmental and pathogenic bacteria. More exclusive relationships were observed among the expressed protein content of phenotypically related bacteria, which is indicative of the specific lifestyles associated with these organisms. While genomic studies establish relative orthologous relationships among a set of bacteria and propose a set of ancestral genes, our proteomics study establishes expressed lifestyle differences among conserved genes and proposes a set of expressed ancestral traits.

  16. EDGAR: A software framework for the comparative analysis of prokaryotic genomes

    PubMed Central

    Blom, Jochen; Albaum, Stefan P; Doppmeier, Daniel; Pühler, Alfred; Vorhölter, Frank-Jörg; Zakrzewski, Martha; Goesmann, Alexander

    2009-01-01

    Background The introduction of next generation sequencing approaches has caused a rapid increase in the number of completely sequenced genomes. As one result of this development, it is now feasible to analyze large groups of related genomes in a comparative approach. A main task in comparative genomics is the identification of orthologous genes in different genomes and the classification of genes as core genes or singletons. Results To support these studies EDGAR – "Efficient Database framework for comparative Genome Analyses using BLAST score Ratios" – was developed. EDGAR is designed to automatically perform genome comparisons in a high throughput approach. Comparative analyses for 582 genomes across 75 genus groups taken from the NCBI genomes database were conducted with the software and the results were integrated into an underlying database. To demonstrate a specific application case, we analyzed ten genomes of the bacterial genus Xanthomonas, for which phylogenetic studies were awkward due to divergent taxonomic systems. The resultant phylogeny EDGAR provided was consistent with outcomes from traditional approaches performed recently and moreover, it was possible to root each strain with unprecedented accuracy. Conclusion EDGAR provides novel analysis features and significantly simplifies the comparative analysis of related genomes. The software supports a quick survey of evolutionary relationships and simplifies the process of obtaining new biological insights into the differential gene content of kindred genomes. Visualization features, like synteny plots or Venn diagrams, are offered to the scientific community through a web-based and therefore platform independent user interface , where the precomputed data sets can be browsed. PMID:19457249

  17. Barcode server: a visualization-based genome analysis system.

    PubMed

    Mao, Fenglou; Olman, Victor; Wang, Yan; Xu, Ying

    2013-01-01

    We have previously developed a computational method for representing a genome as a barcode image, which makes various genomic features visually apparent. We have demonstrated that this visual capability has made some challenging genome analysis problems relatively easy to solve. We have applied this capability to a number of challenging problems, including (a) identification of horizontally transferred genes, (b) identification of genomic islands with special properties and (c) binning of metagenomic sequences, and achieved highly encouraging results. These application results inspired us to develop this barcode-based genome analysis server for public service, which supports the following capabilities: (a) calculation of the k-mer based barcode image for a provided DNA sequence; (b) detection of sequence fragments in a given genome with distinct barcodes from those of the majority of the genome, (c) clustering of provided DNA sequences into groups having similar barcodes; and (d) homology-based search using Blast against a genome database for any selected genomic regions deemed to have interesting barcodes. The barcode server provides a job management capability, allowing processing of a large number of analysis jobs for barcode-based comparative genome analyses. The barcode server is accessible at http://csbl1.bmb.uga.edu/Barcode. PMID:23457606

  18. Barcode Server: A Visualization-Based Genome Analysis System

    PubMed Central

    Mao, Fenglou; Olman, Victor; Wang, Yan; Xu, Ying

    2013-01-01

    We have previously developed a computational method for representing a genome as a barcode image, which makes various genomic features visually apparent. We have demonstrated that this visual capability has made some challenging genome analysis problems relatively easy to solve. We have applied this capability to a number of challenging problems, including (a) identification of horizontally transferred genes, (b) identification of genomic islands with special properties and (c) binning of metagenomic sequences, and achieved highly encouraging results. These application results inspired us to develop this barcode-based genome analysis server for public service, which supports the following capabilities: (a) calculation of the k-mer based barcode image for a provided DNA sequence; (b) detection of sequence fragments in a given genome with distinct barcodes from those of the majority of the genome, (c) clustering of provided DNA sequences into groups having similar barcodes; and (d) homology-based search using Blast against a genome database for any selected genomic regions deemed to have interesting barcodes. The barcode server provides a job management capability, allowing processing of a large number of analysis jobs for barcode-based comparative genome analyses. The barcode server is accessible at http://csbl1.bmb.uga.edu/Barcode. PMID:23457606

  19. The tiger genome and comparative analysis with lion and snow leopard genomes.

    PubMed

    Cho, Yun Sung; Hu, Li; Hou, Haolong; Lee, Hang; Xu, Jiaohui; Kwon, Soowhan; Oh, Sukhun; Kim, Hak-Min; Jho, Sungwoong; Kim, Sangsoo; Shin, Young-Ah; Kim, Byung Chul; Kim, Hyunmin; Kim, Chang-Uk; Luo, Shu-Jin; Johnson, Warren E; Koepfli, Klaus-Peter; Schmidt-Küntzel, Anne; Turner, Jason A; Marker, Laurie; Harper, Cindy; Miller, Susan M; Jacobs, Wilhelm; Bertola, Laura D; Kim, Tae Hyung; Lee, Sunghoon; Zhou, Qian; Jung, Hyun-Ju; Xu, Xiao; Gadhvi, Priyvrat; Xu, Pengwei; Xiong, Yingqi; Luo, Yadan; Pan, Shengkai; Gou, Caiyun; Chu, Xiuhui; Zhang, Jilin; Liu, Sanyang; He, Jing; Chen, Ying; Yang, Linfeng; Yang, Yulan; He, Jiaju; Liu, Sha; Wang, Junyi; Kim, Chul Hong; Kwak, Hwanjong; Kim, Jong-Soo; Hwang, Seungwoo; Ko, Junsu; Kim, Chang-Bae; Kim, Sangtae; Bayarlkhagva, Damdin; Paek, Woon Kee; Kim, Seong-Jin; O'Brien, Stephen J; Wang, Jun; Bhak, Jong

    2013-01-01

    Tigers and their close relatives (Panthera) are some of the world's most endangered species. Here we report the de novo assembly of an Amur tiger whole-genome sequence as well as the genomic sequences of a white Bengal tiger, African lion, white African lion and snow leopard. Through comparative genetic analyses of these genomes, we find genetic signatures that may reflect molecular adaptations consistent with the big cats' hypercarnivorous diet and muscle strength. We report a snow leopard-specific genetic determinant in EGLN1 (Met39>Lys39), which is likely to be associated with adaptation to high altitude. We also detect a TYR260G>A mutation likely responsible for the white lion coat colour. Tiger and cat genomes show similar repeat composition and an appreciably conserved synteny. Genomic data from the five big cats provide an invaluable resource for resolving easily identifiable phenotypes evident in very close, but distinct, species. PMID:24045858

  20. The tiger genome and comparative analysis with lion and snow leopard genomes

    PubMed Central

    Cho, Yun Sung; Hu, Li; Hou, Haolong; Lee, Hang; Xu, Jiaohui; Kwon, Soowhan; Oh, Sukhun; Kim, Hak-Min; Jho, Sungwoong; Kim, Sangsoo; Shin, Young-Ah; Kim, Byung Chul; Kim, Hyunmin; Kim, Chang-uk; Luo, Shu-Jin; Johnson, Warren E.; Koepfli, Klaus-Peter; Schmidt-Küntzel, Anne; Turner, Jason A.; Marker, Laurie; Harper, Cindy; Miller, Susan M.; Jacobs, Wilhelm; Bertola, Laura D.; Kim, Tae Hyung; Lee, Sunghoon; Zhou, Qian; Jung, Hyun-Ju; Xu, Xiao; Gadhvi, Priyvrat; Xu, Pengwei; Xiong, Yingqi; Luo, Yadan; Pan, Shengkai; Gou, Caiyun; Chu, Xiuhui; Zhang, Jilin; Liu, Sanyang; He, Jing; Chen, Ying; Yang, Linfeng; Yang, Yulan; He, Jiaju; Liu, Sha; Wang, Junyi; Kim, Chul Hong; Kwak, Hwanjong; Kim, Jong-Soo; Hwang, Seungwoo; Ko, Junsu; Kim, Chang-Bae; Kim, Sangtae; Bayarlkhagva, Damdin; Paek, Woon Kee; Kim, Seong-Jin; O’Brien, Stephen J.; Wang, Jun; Bhak, Jong

    2013-01-01

    Tigers and their close relatives (Panthera) are some of the world’s most endangered species. Here we report the de novo assembly of an Amur tiger whole-genome sequence as well as the genomic sequences of a white Bengal tiger, African lion, white African lion and snow leopard. Through comparative genetic analyses of these genomes, we find genetic signatures that may reflect molecular adaptations consistent with the big cats’ hypercarnivorous diet and muscle strength. We report a snow leopard-specific genetic determinant in EGLN1 (Met39>Lys39), which is likely to be associated with adaptation to high altitude. We also detect a TYR260G>A mutation likely responsible for the white lion coat colour. Tiger and cat genomes show similar repeat composition and an appreciably conserved synteny. Genomic data from the five big cats provide an invaluable resource for resolving easily identifiable phenotypes evident in very close, but distinct, species. PMID:24045858

  1. Initial sequence and comparative analysis of the cat genome

    PubMed Central

    Pontius, Joan U.; Mullikin, James C.; Smith, Douglas R.; Lindblad-Toh, Kerstin; Gnerre, Sante; Clamp, Michele; Chang, Jean; Stephens, Robert; Neelam, Beena; Volfovsky, Natalia; Schäffer, Alejandro A.; Agarwala, Richa; Narfström, Kristina; Murphy, William J.; Giger, Urs; Roca, Alfred L.; Antunes, Agostinho; Menotti-Raymond, Marilyn; Yuhki, Naoya; Pecon-Slattery, Jill; Johnson, Warren E.; Bourque, Guillaume; Tesler, Glenn; O’Brien, Stephen J.

    2007-01-01

    The genome sequence (1.9-fold coverage) of an inbred Abyssinian domestic cat was assembled, mapped, and annotated with a comparative approach that involved cross-reference to annotated genome assemblies of six mammals (human, chimpanzee, mouse, rat, dog, and cow). The results resolved chromosomal positions for 663,480 contigs, 20,285 putative feline gene orthologs, and 133,499 conserved sequence blocks (CSBs). Additional annotated features include repetitive elements, endogenous retroviral sequences, nuclear mitochondrial (numt) sequences, micro-RNAs, and evolutionary breakpoints that suggest historic balancing of translocation and inversion incidences in distinct mammalian lineages. Large numbers of single nucleotide polymorphisms (SNPs), deletion insertion polymorphisms (DIPs), and short tandem repeats (STRs), suitable for linkage or association studies were characterized in the context of long stretches of chromosome homozygosity. In spite of the light coverage capturing ∼65% of euchromatin sequence from the cat genome, these comparative insights shed new light on the tempo and mode of gene/genome evolution in mammals, promise several research applications for the cat, and also illustrate that a comparative approach using more deeply covered mammals provides an informative, preliminary annotation of a light (1.9-fold) coverage mammal genome sequence. PMID:17975172

  2. An Integrative Method for Accurate Comparative Genome Mapping

    PubMed Central

    Swidan, Firas; Rocha, Eduardo P. C; Shmoish, Michael; Pinter, Ron Y

    2006-01-01

    We present MAGIC, an integrative and accurate method for comparative genome mapping. Our method consists of two phases: preprocessing for identifying “maximal similar segments,” and mapping for clustering and classifying these segments. MAGIC's main novelty lies in its biologically intuitive clustering approach, which aims towards both calculating reorder-free segments and identifying orthologous segments. In the process, MAGIC efficiently handles ambiguities resulting from duplications that occurred before the speciation of the considered organisms from their most recent common ancestor. We demonstrate both MAGIC's robustness and scalability: the former is asserted with respect to its initial input and with respect to its parameters' values. The latter is asserted by applying MAGIC to distantly related organisms and to large genomes. We compare MAGIC to other comparative mapping methods and provide detailed analysis of the differences between them. Our improvements allow a comprehensive study of the diversity of genetic repertoires resulting from large-scale mutations, such as indels and duplications, including explicitly transposable and phagic elements. The strength of our method is demonstrated by detailed statistics computed for each type of these large-scale mutations. MAGIC enabled us to conduct a comprehensive analysis of the different forces shaping prokaryotic genomes from different clades, and to quantify the importance of novel gene content introduced by horizontal gene transfer relative to gene duplication in bacterial genome evolution. We use these results to investigate the breakpoint distribution in several prokaryotic genomes. PMID:16933978

  3. Kiwifruit Information Resource (KIR): a comparative platform for kiwifruit genomics.

    PubMed

    Yue, Junyang; Liu, Jian; Ban, Rongjun; Tang, Wei; Deng, Lin; Fei, Zhangjun; Liu, Yongsheng

    2015-01-01

    The Kiwifruit Information Resource (KIR) is dedicated to maintain and integrate comprehensive datasets on genomics, functional genomics and transcriptomics of kiwifruit (Actinidiaceae). KIR serves as a central access point for existing/new genomic and genetic data. KIR also provides researchers with a variety of visualization and analysis tools. Current developments include the updated genome structure of Actinidia chinensis cv. Hongyang and its newest genome annotation, putative transcripts, gene expression, physical markers of genetic traits as well as relevant publications based on the latest genome assembly. Nine thousand five hundred and forty-seven new transcripts are detected and 21 132 old transcripts are changed. At the present release, the next-generation transcriptome sequencing data has been incorporated into gene models and splice variants. Protein-protein interactions are also identified based on experimentally determined orthologous interactions. Furthermore, the experimental results reported in peer-reviewed literature are manually extracted and integrated within a well-developed query page. In total, 122 identifications are currently associated, including commonly used gene names and symbols. All KIR datasets are helpful to facilitate a broad range of kiwifruit research topics and freely available to the research community. Database URL: http://bdg.hfut.edu.cn/kir/index.html. PMID:26656885

  4. Kiwifruit Information Resource (KIR): a comparative platform for kiwifruit genomics

    PubMed Central

    Yue, Junyang; Liu, Jian; Ban, Rongjun; Tang, Wei; Deng, Lin; Fei, Zhangjun; Liu, Yongsheng

    2015-01-01

    The Kiwifruit Information Resource (KIR) is dedicated to maintain and integrate comprehensive datasets on genomics, functional genomics and transcriptomics of kiwifruit (Actinidiaceae). KIR serves as a central access point for existing/new genomic and genetic data. KIR also provides researchers with a variety of visualization and analysis tools. Current developments include the updated genome structure of Actinidia chinensis cv. Hongyang and its newest genome annotation, putative transcripts, gene expression, physical markers of genetic traits as well as relevant publications based on the latest genome assembly. Nine thousand five hundred and forty-seven new transcripts are detected and 21 132 old transcripts are changed. At the present release, the next-generation transcriptome sequencing data has been incorporated into gene models and splice variants. Protein–protein interactions are also identified based on experimentally determined orthologous interactions. Furthermore, the experimental results reported in peer-reviewed literature are manually extracted and integrated within a well-developed query page. In total, 122 identifications are currently associated, including commonly used gene names and symbols. All KIR datasets are helpful to facilitate a broad range of kiwifruit research topics and freely available to the research community. Database URL: http://bdg.hfut.edu.cn/kir/index.html. PMID:26656885

  5. Sequencing and comparative analyses of the genomes of zoysiagrasses.

    PubMed

    Tanaka, Hidenori; Hirakawa, Hideki; Kosugi, Shunichi; Nakayama, Shinobu; Ono, Akiko; Watanabe, Akiko; Hashiguchi, Masatsugu; Gondo, Takahiro; Ishigaki, Genki; Muguerza, Melody; Shimizu, Katsuya; Sawamura, Noriko; Inoue, Takayasu; Shigeki, Yuichi; Ohno, Naoki; Tabata, Satoshi; Akashi, Ryo; Sato, Shusei

    2016-04-01

    Zoysiais a warm-season turfgrass, which comprises 11 allotetraploid species (2n= 4x= 40), each possessing different morphological and physiological traits. To characterize the genetic systems of Zoysia plants and to analyse their structural and functional differences in individual species and accessions, we sequenced the genomes of Zoysia species using HiSeq and MiSeq platforms. As a reference sequence of Zoysia species, we generated a high-quality draft sequence of the genome of Z. japonica accession 'Nagirizaki' (334 Mb) in which 59,271 protein-coding genes were predicted. In parallel, draft genome sequences of Z. matrella 'Wakaba' and Z. pacifica 'Zanpa' were also generated for comparative analyses. To investigate the genetic diversity among the Zoysia species, genome sequence reads of three additional accessions, Z. japonica'Kyoto', Z. japonica'Miyagi' and Z. matrella'Chiba Fair Green', were accumulated, and aligned against the reference genome of 'Nagirizaki' along with those from 'Wakaba' and 'Zanpa'. As a result, we detected 7,424,163 single-nucleotide polymorphisms and 852,488 short indels among these species. The information obtained in this study will be valuable for basic studies on zoysiagrass evolution and genetics as well as for the breeding of zoysiagrasses, and is made available in the 'Zoysia Genome Database' at http://zoysia.kazusa.or.jp. PMID:26975196

  6. Sequencing and comparative analyses of the genomes of zoysiagrasses

    PubMed Central

    Tanaka, Hidenori; Hirakawa, Hideki; Kosugi, Shunichi; Nakayama, Shinobu; Ono, Akiko; Watanabe, Akiko; Hashiguchi, Masatsugu; Gondo, Takahiro; Ishigaki, Genki; Muguerza, Melody; Shimizu, Katsuya; Sawamura, Noriko; Inoue, Takayasu; Shigeki, Yuichi; Ohno, Naoki; Tabata, Satoshi; Akashi, Ryo; Sato, Shusei

    2016-01-01

    Zoysia is a warm-season turfgrass, which comprises 11 allotetraploid species (2n = 4x = 40), each possessing different morphological and physiological traits. To characterize the genetic systems of Zoysia plants and to analyse their structural and functional differences in individual species and accessions, we sequenced the genomes of Zoysia species using HiSeq and MiSeq platforms. As a reference sequence of Zoysia species, we generated a high-quality draft sequence of the genome of Z. japonica accession ‘Nagirizaki’ (334 Mb) in which 59,271 protein-coding genes were predicted. In parallel, draft genome sequences of Z. matrella ‘Wakaba’ and Z. pacifica ‘Zanpa’ were also generated for comparative analyses. To investigate the genetic diversity among the Zoysia species, genome sequence reads of three additional accessions, Z. japonica ‘Kyoto’, Z. japonica ‘Miyagi’ and Z. matrella ‘Chiba Fair Green’, were accumulated, and aligned against the reference genome of ‘Nagirizaki’ along with those from ‘Wakaba’ and ‘Zanpa’. As a result, we detected 7,424,163 single-nucleotide polymorphisms and 852,488 short indels among these species. The information obtained in this study will be valuable for basic studies on zoysiagrass evolution and genetics as well as for the breeding of zoysiagrasses, and is made available in the ‘Zoysia Genome Database’ at http://zoysia.kazusa.or.jp. PMID:26975196

  7. BrucellaBase: Genome information resource.

    PubMed

    Sankarasubramanian, Jagadesan; Vishnu, Udayakumar S; Khader, L K M Abdul; Sridhar, Jayavel; Gunasekaran, Paramasamy; Rajendhran, Jeyaprakash

    2016-09-01

    Brucella sp. causes a major zoonotic disease, brucellosis. Brucella belongs to the family Brucellaceae under the order Rhizobiales of Alphaproteobacteria. We present BrucellaBase, a web-based platform, providing features of a genome database together with unique analysis tools. We have developed a web version of the multilocus sequence typing (MLST) (Whatmore et al., 2007) and phylogenetic analysis of Brucella spp. BrucellaBase currently contains genome data of 510 Brucella strains along with the user interfaces for BLAST, VFDB, CARD, pairwise genome alignment and MLST typing. Availability of these tools will enable the researchers interested in Brucella to get meaningful information from Brucella genome sequences. BrucellaBase will regularly be updated with new genome sequences, new features along with improvements in genome annotations. BrucellaBase is available online at http://www.dbtbrucellosis.in/brucellabase.html or http://59.99.226.203/brucellabase/homepage.html. PMID:27164438

  8. Comparative genomics and functional annotation of bacterial transporters

    NASA Astrophysics Data System (ADS)

    Gelfand, Mikhail S.; Rodionov, Dmitry A.

    2008-03-01

    Transport proteins are difficult to study experimentally, and because of that their functional characterization trails that of enzymes. The comparative genomic analysis is a powerful approach to functional annotation of proteins, which makes it possible to utilize the genomic sequence data from thousands of organisms. The use of computational techniques allows one to identify candidate transporters, predict their structure and localization in the membrane, and perform detailed functional annotation, which includes substrate specificity and cellular role. We overview the main techniques of analysis of transporters' structure and function. We consider the most popular algorithms to identify transmembrane segments in protein sequences and to predict topology of multispanning proteins. We describe the main approaches of the comparative genomics, and how they may be applied to the analysis of transporters, and provide examples showing how combinations of these techniques is used for functional annotation of new transporter specificities in known families, characterization of new families, and prediction of novel transport mechanisms.

  9. Bacillus anthracis comparative genome analysis in support of the Amerithrax investigation

    PubMed Central

    Rasko, David A.; Worsham, Patricia L.; Abshire, Terry G.; Stanley, Scott T.; Bannan, Jason D.; Wilson, Mark R.; Langham, Richard J.; Decker, R. Scott; Jiang, Lingxia; Read, Timothy D.; Phillippy, Adam M.; Salzberg, Steven L.; Pop, Mihai; Van Ert, Matthew N.; Kenefic, Leo J.; Keim, Paul S.; Fraser-Liggett, Claire M.; Ravel, Jacques

    2011-01-01

    Before the anthrax letter attacks of 2001, the developing field of microbial forensics relied on microbial genotyping schemes based on a small portion of a genome sequence. Amerithrax, the investigation into the anthrax letter attacks, applied high-resolution whole-genome sequencing and comparative genomics to identify key genetic features of the letters’ Bacillus anthracis Ames strain. During systematic microbiological analysis of the spore material from the letters, we identified a number of morphological variants based on phenotypic characteristics and the ability to sporulate. The genomes of these morphological variants were sequenced and compared with that of the B. anthracis Ames ancestor, the progenitor of all B. anthracis Ames strains. Through comparative genomics, we identified four distinct loci with verifiable genetic mutations. Three of the four mutations could be directly linked to sporulation pathways in B. anthracis and more specifically to the regulation of the phosphorylation state of Spo0F, a key regulatory protein in the initiation of the sporulation cascade, thus linking phenotype to genotype. None of these variant genotypes were identified in single-colony environmental B. anthracis Ames isolates associated with the investigation. These genotypes were identified only in B. anthracis morphotypes isolated from the letters, indicating that the variants were not prevalent in the environment, not even the environments associated with the investigation. This study demonstrates the forensic value of systematic microbiological analysis combined with whole-genome sequencing and comparative genomics. PMID:21383169

  10. Comparative genomics reveals conserved positioning of essential genomic clusters in highly rearranged Thermococcales chromosomes

    PubMed Central

    Cossu, Matteo; Da Cunha, Violette; Toffano-Nioche, Claire; Forterre, Patrick; Oberto, Jacques

    2015-01-01

    The genomes of the 21 completely sequenced Thermococcales display a characteristic high level of rearrangements. As a result, the prediction of their origin and termination of replication on the sole basis of chromosomal DNA composition or skew is inoperative. Using a different approach based on biologically relevant sequences, we were able to determine oriC position in all 21 genomes. The position of dif, the site where chromosome dimers are resolved before DNA segregation could be predicted in 19 genomes. Computation of the core genome uncovered a number of essential gene clusters with a remarkably stable chromosomal position across species, in sharp contrast with the scrambled nature of their genomes. The active chromosomal reorganization of numerous genes acquired by horizontal transfer, mainly from mobile elements, could explain this phenomenon. PMID:26166067

  11. Comparative analysis of rosaceous genomes and the reconstruction of a putative ancestral genome for the family

    PubMed Central

    2011-01-01

    Background Comparative genome mapping studies in Rosaceae have been conducted until now by aligning genetic maps within the same genus, or closely related genera and using a limited number of common markers. The growing body of genomics resources and sequence data for both Prunus and Fragaria permits detailed comparisons between these genera and the recently released Malus × domestica genome sequence. Results We generated a comparative analysis using 806 molecular markers that are anchored genetically to the Prunus and/or Fragaria reference maps, and physically to the Malus genome sequence. Markers in common for Malus and Prunus, and Malus and Fragaria, respectively were 784 and 148. The correspondence between marker positions was high and conserved syntenic blocks were identified among the three genera in the Rosaceae. We reconstructed a proposed ancestral genome for the Rosaceae. Conclusions A genome containing nine chromosomes is the most likely candidate for the ancestral Rosaceae progenitor. The number of chromosomal translocations observed between the three genera investigated was low. However, the number of inversions identified among Malus and Prunus was much higher than any reported genome comparisons in plants, suggesting that small inversions have played an important role in the evolution of these two genera or of the Rosaceae. PMID:21226921

  12. Using Comparative Genomics to Drive New Discoveries in Microbiology

    PubMed Central

    Haft, Daniel H.

    2015-01-01

    Bioinformatics looks to many microbiologists like a service industry. In this view, annotation starts with what is known from experiments in the lab, makes reasonable inferences of which genes match other genes in function, builds databases to make all that we know accessible, but creates nothing truly new. Experiments lead, then biocuration and computational biology follow. But the astounding success of genome sequencing is changing the annotation paradigm. Every genome sequenced is an intercepted coded message from the microbial world, and as all cryptographers know, it is easier to decode a thousand messages than a single message. Some biology is best discovered not by phenomenology, but by decoding genome content, forming hypotheses, and doing the first few rounds of validation computationally. Through such reasoning, a role and function may be assigned to a protein with no sequence similarity to any protein yet studied. Experimentation can follow after the discovery to cement and to extend the findings. Unfortunately, this approach remains so unfamiliar to most bench scientists that lab work and comparative genomics typically segregate to different teams working on unconnected projects. This review will discuss several themes in comparative genomics as a discovery method, including highly derived data, use of patterns of design to reason by analogy, and in silico testing of computationally generated hypotheses. PMID:25617609

  13. Comparative Genomics Analysis of Streptomyces Species Reveals Their Adaptation to the Marine Environment and Their Diversity at the Genomic Level

    PubMed Central

    Tian, Xinpeng; Zhang, Zhewen; Yang, Tingting; Chen, Meili; Li, Jie; Chen, Fei; Yang, Jin; Li, Wenjie; Zhang, Bing; Zhang, Zhang; Wu, Jiayan; Zhang, Changsheng; Long, Lijuan; Xiao, Jingfa

    2016-01-01

    Over 200 genomes of streptomycete strains that were isolated from various environments are available from the NCBI. However, little is known about the characteristics that are linked to marine adaptation in marine-derived streptomycetes. The particularity and complexity of the marine environment suggest that marine streptomycetes are genetically diverse. Here, we sequenced nine strains from the Streptomyces genus that were isolated from different longitudes, latitudes, and depths of the South China Sea. Then we compared these strains to 22 NCBI downloaded streptomycete strains. Thirty-one streptomycete strains are clearly grouped into a marine-derived subgroup and multiple source subgroup-based phylogenetic tree. The phylogenetic analyses have revealed the dynamic process underlying streptomycete genome evolution, and lateral gene transfer is an important driving force during the process. Pan-genomics analyses have revealed that streptomycetes have an open pan-genome, which reflects the diversity of these streptomycetes and guarantees the species a quick and economical response to diverse environments. Functional and comparative genomics analyses indicate that the marine-derived streptomycetes subgroup possesses some common characteristics of marine adaptation. Our findings have expanded our knowledge of how ocean isolates of streptomycete strains adapt to marine environments. The availability of streptomycete genomes from the South China Sea will be beneficial for further analysis on marine streptomycetes and will enrich the South China Sea’s genetic data sources. PMID:27446038

  14. Comparative Genomics Analysis of Streptomyces Species Reveals Their Adaptation to the Marine Environment and Their Diversity at the Genomic Level.

    PubMed

    Tian, Xinpeng; Zhang, Zhewen; Yang, Tingting; Chen, Meili; Li, Jie; Chen, Fei; Yang, Jin; Li, Wenjie; Zhang, Bing; Zhang, Zhang; Wu, Jiayan; Zhang, Changsheng; Long, Lijuan; Xiao, Jingfa

    2016-01-01

    Over 200 genomes of streptomycete strains that were isolated from various environments are available from the NCBI. However, little is known about the characteristics that are linked to marine adaptation in marine-derived streptomycetes. The particularity and complexity of the marine environment suggest that marine streptomycetes are genetically diverse. Here, we sequenced nine strains from the Streptomyces genus that were isolated from different longitudes, latitudes, and depths of the South China Sea. Then we compared these strains to 22 NCBI downloaded streptomycete strains. Thirty-one streptomycete strains are clearly grouped into a marine-derived subgroup and multiple source subgroup-based phylogenetic tree. The phylogenetic analyses have revealed the dynamic process underlying streptomycete genome evolution, and lateral gene transfer is an important driving force during the process. Pan-genomics analyses have revealed that streptomycetes have an open pan-genome, which reflects the diversity of these streptomycetes and guarantees the species a quick and economical response to diverse environments. Functional and comparative genomics analyses indicate that the marine-derived streptomycetes subgroup possesses some common characteristics of marine adaptation. Our findings have expanded our knowledge of how ocean isolates of streptomycete strains adapt to marine environments. The availability of streptomycete genomes from the South China Sea will be beneficial for further analysis on marine streptomycetes and will enrich the South China Sea's genetic data sources. PMID:27446038

  15. Comparative Whole-Genome Hybridization Reveals Genomic Islands in Brucella Species†

    PubMed Central

    Rajashekara, Gireesh; Glasner, Jeremy D.; Glover, David A.; Splitter, Gary A.

    2004-01-01

    Brucella species are responsible for brucellosis, a worldwide zoonotic disease causing abortion in domestic animals and Malta fever in humans. Based on host preference, the genus is divided into six species. Brucella abortus, B. melitensis, and B. suis are pathogenic to humans, whereas B. ovis and B. neotomae are nonpathogenic to humans and B. canis human infections are rare. Limited genome diversity exists among Brucella species. Comparison of Brucella species whole genomes is, therefore, likely to identify factors responsible for differences in host preference and virulence restriction. To facilitate such studies, we used the complete genome sequence of B. melitensis 16M, the species highly pathogenic to humans, to construct a genomic microarray. Hybridization of labeled genomic DNA from Brucella species to this microarray revealed a total of 217 open reading frames (ORFs) altered in five Brucella species analyzed. These ORFs are often found in clusters (islands) in the 16M genome. Examination of the genomic context of these islands suggests that many are horizontally acquired. Deletions of genetic content identified in Brucella species are conserved in multiple strains of the same species, and genomic islands missing in a given species are often restricted to that particular species. These findings suggest that, whereas the loss or gain of genetic material may be related to the host range and virulence restriction of certain Brucella species for humans, independent mechanisms involving gene inactivation or altered expression of virulence determinants may also contribute to these differences. PMID:15262941

  16. CMG-Biotools, a Free Workbench for Basic Comparative Microbial Genomics

    PubMed Central

    Vesth, Tammi; Lagesen, Karin; Acar, Öncel; Ussery, David

    2013-01-01

    Background Today, there are more than a hundred times as many sequenced prokaryotic genomes than were present in the year 2000. The economical sequencing of genomic DNA has facilitated a whole new approach to microbial genomics. The real power of genomics is manifested through comparative genomics that can reveal strain specific characteristics, diversity within species and many other aspects. However, comparative genomics is a field not easily entered into by scientists with few computational skills. The CMG-biotools package is designed for microbiologists with limited knowledge of computational analysis and can be used to perform a number of analyses and comparisons of genomic data. Results The CMG-biotools system presents a stand-alone interface for comparative microbial genomics. The package is a customized operating system, based on Xubuntu 10.10, available through the open source Ubuntu project. The system can be installed on a virtual computer, allowing the user to run the system alongside any other operating system. Source codes for all programs are provided under GNU license, which makes it possible to transfer the programs to other systems if so desired. We here demonstrate the package by comparing and analyzing the diversity within the class Negativicutes, represented by 31 genomes including 10 genera. The analyses include 16S rRNA phylogeny, basic DNA and codon statistics, proteome comparisons using BLAST and graphical analyses of DNA structures. Conclusion This paper shows the strength and diverse use of the CMG-biotools system. The system can be installed on a vide range of host operating systems and utilizes as much of the host computer as desired. It allows the user to compare multiple genomes, from various sources using standardized data formats and intuitive visualizations of results. The examples presented here clearly shows that users with limited computational experience can perform complicated analysis without much training. PMID:23577086

  17. Genome informatics and vaccine targets in Corynebacterium urealyticum using two whole genomes, comparative genomics, and reverse vaccinology

    PubMed Central

    2015-01-01

    Background Corynebacterium urealyticum is an opportunistic pathogen that normally lives on skin and mucous membranes in humans. This high Gram-positive bacteria can cause acute or encrusted cystitis, encrusted pyelitis, and pyelonephritis in immunocompromised patients. The bacteria is multi-drug resistant, and knowledge about the genes that contribute to its virulence is very limited. Two complete genome sequences were used in this comparative genomic study: C. urealyticum DSM 7109 and C. urealyticum DSM 7111. Results We used comparative genomics strategies to compare the two strains, DSM 7109 and DSM 7111, and to analyze their metabolic pathways, genome plasticity, and to predict putative antigenic targets. The genomes of these two strains together encode 2,115 non-redundant coding sequences, 1,823 of which are common to both genomes. We identified 188 strain-specific genes in DSM 7109 and 104 strain-specific genes in DSM 7111. The high number of strain-specific genes may be a result of horizontal gene transfer triggered by the large number of transposons in the genomes of these two strains. Screening for virulence factors revealed the presence of the spaDEF operon that encodes pili forming proteins. Therefore, spaDEF may play a pivotal role in facilitating the adhesion of the pathogen to the host tissue. Application of the reverse vaccinology method revealed 19 putative antigenic proteins that may be used in future studies as candidate drug or vaccine targets. Conclusions The genome features and the presence of virulence factors in genomic islands in the two strains of C. urealyticum provide insights in the lifestyle of this opportunistic pathogen and may be useful in developing future therapeutic strategies. PMID:26041051

  18. Comparative genomics of wild type yeast strains unveils important genome diversity

    PubMed Central

    Carreto, Laura; Eiriz, Maria F; Gomes, Ana C; Pereira, Patrícia M; Schuller, Dorit; Santos, Manuel AS

    2008-01-01

    Background Genome variability generates phenotypic heterogeneity and is of relevance for adaptation to environmental change, but the extent of such variability in natural populations is still poorly understood. For example, selected Saccharomyces cerevisiae strains are variable at the ploidy level, have gene amplifications, changes in chromosome copy number, and gross chromosomal rearrangements. This suggests that genome plasticity provides important genetic diversity upon which natural selection mechanisms can operate. Results In this study, we have used wild-type S. cerevisiae (yeast) strains to investigate genome variation in natural and artificial environments. We have used comparative genome hybridization on array (aCGH) to characterize the genome variability of 16 yeast strains, of laboratory and commercial origin, isolated from vineyards and wine cellars, and from opportunistic human infections. Interestingly, sub-telomeric instability was associated with the clinical phenotype, while Ty element insertion regions determined genomic differences of natural wine fermentation strains. Copy number depletion of ASP3 and YRF1 genes was found in all wild-type strains. Other gene families involved in transmembrane transport, sugar and alcohol metabolism or drug resistance had copy number changes, which also distinguished wine from clinical isolates. Conclusion We have isolated and genotyped more than 1000 yeast strains from natural environments and carried out an aCGH analysis of 16 strains representative of distinct genotype clusters. Important genomic variability was identified between these strains, in particular in sub-telomeric regions and in Ty-element insertion sites, suggesting that this type of genome variability is the main source of genetic diversity in natural populations of yeast. The data highlights the usefulness of yeast as a model system to unravel intraspecific natural genome diversity and to elucidate how natural selection shapes the yeast genome

  19. Whole genomic DNA sequencing and comparative genomic analysis of Arthrospira platensis: high genome plasticity and genetic diversity

    PubMed Central

    Xu, Teng; Qin, Song; Hu, Yongwu; Song, Zhijian; Ying, Jianchao; Li, Peizhen; Dong, Wei; Zhao, Fangqing; Yang, Huanming; Bao, Qiyu

    2016-01-01

    Arthrospira platensis is a multi-cellular and filamentous non-N2-fixing cyanobacterium that is capable of performing oxygenic photosynthesis. In this study, we determined the nearly complete genome sequence of A. platensis YZ. A. platensis YZ genome is a single, circular chromosome of 6.62 Mb in size. Phylogenetic and comparative genomic analyses revealed that A. platensis YZ was more closely related to A. platensis NIES-39 than Arthrospira sp. PCC 8005 and A. platensis C1. Broad gene gains were identified between A. platensis YZ and three other Arthrospira speices, some of which have been previously demonstrated that can be laterally transferred among different species, such as restriction-modification systems-coding genes. Moreover, unprecedented extensive chromosomal rearrangements among different strains were observed. The chromosomal rearrangements, particularly the chromosomal inversions, were analysed and estimated to be closely related to palindromes that involved long inverted repeat sequences and the extensively distributed type IIR restriction enzyme in the Arthrospira genome. In addition, species from genus Arthrospira unanimously contained the highest rate of repetitive sequence compared with the other species of order Oscillatoriales, suggested that sequence duplication significantly contributed to Arthrospira genome phylogeny. These results provided in-depth views into the genomic phylogeny and structural variation of A. platensis, as well as provide a valuable resource for functional genomics studies. PMID:27330141

  20. Whole genomic DNA sequencing and comparative genomic analysis of Arthrospira platensis: high genome plasticity and genetic diversity.

    PubMed

    Xu, Teng; Qin, Song; Hu, Yongwu; Song, Zhijian; Ying, Jianchao; Li, Peizhen; Dong, Wei; Zhao, Fangqing; Yang, Huanming; Bao, Qiyu

    2016-08-01

    Arthrospira platensis is a multi-cellular and filamentous non-N2-fixing cyanobacterium that is capable of performing oxygenic photosynthesis. In this study, we determined the nearly complete genome sequence of A. platensis YZ. A. platensis YZ genome is a single, circular chromosome of 6.62 Mb in size. Phylogenetic and comparative genomic analyses revealed that A. platensis YZ was more closely related to A. platensis NIES-39 than Arthrospira sp. PCC 8005 and A. platensis C1. Broad gene gains were identified between A. platensis YZ and three other Arthrospira speices, some of which have been previously demonstrated that can be laterally transferred among different species, such as restriction-modification systems-coding genes. Moreover, unprecedented extensive chromosomal rearrangements among different strains were observed. The chromosomal rearrangements, particularly the chromosomal inversions, were analysed and estimated to be closely related to palindromes that involved long inverted repeat sequences and the extensively distributed type IIR restriction enzyme in the Arthrospira genome. In addition, species from genus Arthrospira unanimously contained the highest rate of repetitive sequence compared with the other species of order Oscillatoriales, suggested that sequence duplication significantly contributed to Arthrospira genome phylogeny. These results provided in-depth views into the genomic phylogeny and structural variation of A. platensis, as well as provide a valuable resource for functional genomics studies. PMID:27330141

  1. Variation block-based genomics method for crop plants

    PubMed Central

    2014-01-01

    Background In contrast with wild species, cultivated crop genomes consist of reshuffled recombination blocks, which occurred by crossing and selection processes. Accordingly, recombination block-based genomics analysis can be an effective approach for the screening of target loci for agricultural traits. Results We propose the variation block method, which is a three-step process for recombination block detection and comparison. The first step is to detect variations by comparing the short-read DNA sequences of the cultivar to the reference genome of the target crop. Next, sequence blocks with variation patterns are examined and defined. The boundaries between the variation-containing sequence blocks are regarded as recombination sites. All the assumed recombination sites in the cultivar set are used to split the genomes, and the resulting sequence regions are termed variation blocks. Finally, the genomes are compared using the variation blocks. The variation block method identified recurring recombination blocks accurately and successfully represented block-level diversities in the publicly available genomes of 31 soybean and 23 rice accessions. The practicality of this approach was demonstrated by the identification of a putative locus determining soybean hilum color. Conclusions We suggest that the variation block method is an efficient genomics method for the recombination block-level comparison of crop genomes. We expect that this method will facilitate the development of crop genomics by bringing genomics technologies to the field of crop breeding. PMID:24929792

  2. Genome evolution in diploid and tetraploid Coffea species as revealed by comparative analysis of orthologous genome segments.

    PubMed

    Cenci, Alberto; Combes, Marie-Christine; Lashermes, Philippe

    2012-01-01

    Sequence comparison of orthologous regions enables estimation of the divergence between genomes, analysis of their evolution and detection of particular features of the genomes, such as sequence rearrangements and transposable elements. Despite the economic importance of Coffea species, little genomic information is currently available. Coffea is a relatively young genus that includes more than one hundred diploid species and a single tetraploid species. Three Coffea orthologous regions of 470-900 kb were analyzed and compared: both subgenomes of allotetraploid Coffea arabica (contributed by the diploid species Coffea eugenioides and Coffea canephora) and the genome of diploid C. canephora. Sequence divergence was calculated on global alignments or on coding and non-coding sequences separately. A search for transposable elements detected 43 retrotransposons and 198 transposons in the sequences analyzed. Comparative insertion analysis made it possible to locate 165 TE insertions in the phylogenetic tree of the three genomes/subgenomes. In the tetraploid C. arabica, a homoeologous non-reciprocal transposition (HNRT) was detected and characterized: a 50 kb region of the C. eugenioides derived subgenome replaced the C. canephora derived counterpart. Comparative sequence analysis on three Coffea genomes/subgenomes revealed almost perfect gene synteny, low sequence divergence and a high number of shared transposable elements. Compared to the results of similar analysis in other genera (Aegilops/Triticum and Oryza), Coffea genomes/subgenomes appeared to be dramatically less diverged, which is consistent with the relatively recent radiation of the Coffea genus. Based on nucleotide substitution frequency, the HNRT was dated at 10,000-50,000 years BP, which is also the most recent estimation of the origin of C. arabica. PMID:22086332

  3. Utility of array comparative genomic hybridization in cytogenetic analysis.

    PubMed

    Singh, Rashmi R; Cheung, K-John J; Horsman, Douglas E

    2011-01-01

    Conventional comparative genomic hybridization (CGH), high-resolution oligonucleotide, and BAC array CGH have modernized the field of cytogenetics to enable access to unbalanced genomic aberrations such as whole or partial chromosomal gains and losses. The basic principle of array CGH involves hybridizing differentially labeled proband/test (e.g., tumor) and normal reference DNA on an array of oligonucleotide or BAC clones instead of normal metaphases as in conventional CGH. The sub-megabase resolution tiling BAC arrays are extremely useful for the analysis of acquired aberrations in cancer genomes. Array CGH can be extremely useful to identify the chromosomal makeup of marker and ring chromosomes, to define/delineate the precise location/bands involved in structural aberrations and the accurate localization of translocation breakpoints in both simple and complex karyotypes either alone or in combination with standard karyotype analysis. PMID:21431645

  4. Comparative genomics reveals mobile pathogenicity chromosomes in Fusarium

    PubMed Central

    Ma, Li-Jun; van der Does, H. Charlotte; Borkovich, Katherine A.; Coleman, Jeffrey J.; Daboussi, Marie-Josée; Di Pietro, Antonio; Dufresne, Marie; Freitag, Michael; Grabherr, Manfred; Henrissat, Bernard; Houterman, Petra M.; Kang, Seogchan; Shim, Won-Bo; Woloshuk, Charles; Xie, Xiaohui; Xu, Jin-Rong; Antoniw, John; Baker, Scott E.; Bluhm, Burton H.; Breakspear, Andrew; Brown, Daren W.; Butchko, Robert A. E.; Chapman, Sinead; Coulson, Richard; Coutinho, Pedro M.; Danchin, Etienne G. J.; Diener, Andrew; Gale, Liane R.; Gardiner, Donald M.; Goff, Stephen; Hammond-Kosack, Kim E.; Hilburn, Karen; Hua-Van, Aurélie; Jonkers, Wilfried; Kazan, Kemal; Kodira, Chinnappa D.; Koehrsen, Michael; Kumar, Lokesh; Lee, Yong-Hwan; Li, Liande; Manners, John M.; Miranda-Saavedra, Diego; Mukherjee, Mala; Park, Gyungsoon; Park, Jongsun; Park, Sook-Young; Proctor, Robert H.; Regev, Aviv; Ruiz-Roldan, M. Carmen; Sain, Divya; Sakthikumar, Sharadha; Sykes, Sean; Schwartz, David C.; Turgeon, B. Gillian; Wapinski, Ilan; Yoder, Olen; Young, Sarah; Zeng, Qiandong; Zhou, Shiguo; Galagan, James; Cuomo, Christina A.; Kistler, H. Corby; Rep, Martijn

    2011-01-01

    Fusarium species are among the most important phytopathogenic and toxigenic fungi. To understand the molecular underpinnings of pathogenicity in the genus Fusarium, we compared the genomes of three phenotypically diverse species: Fusarium graminearum, Fusarium verticillioides and Fusarium oxysporum f. sp. lycopersici. Our analysis revealed lineage-specific (LS) genomic regions in F. oxysporum that include four entire chromosomes and account for more than one-quarter of the genome. LS regions are rich in transposons and genes with distinct evolutionary profiles but related to pathogenicity, indicative of horizontal acquisition. Experimentally, we demonstrate the transfer of two LS chromosomes between strains of F. oxysporum, converting a non-pathogenic strain into a pathogen. Transfer of LS chromosomes between otherwise genetically isolated strains explains the polyphyletic origin of host specificity and the emergence of new pathogenic lineages in F. oxysporum. These findings put the evolution of fungal pathogenicity into a new perspective. PMID:20237561

  5. Comparative mitochondrial genomics within and among species of killifish

    PubMed Central

    Whitehead, Andrew

    2009-01-01

    Background This study was motivated by the observation of unusual mitochondrial haplotype distributions and associated physiological differences between populations of the killifish Fundulus heteroclitus distributed along the Atlantic coast of North America. A distinct "northern" haplotype is fixed in all populations north of New Jersey, and does not appear south of New Jersey except in extreme upper-estuary fresh water habitats, and northern individuals are known to be more tolerant of hyposmotic conditions than southern individuals. Complete mitochondrial genomes were sequenced from individuals from northern coastal, southern coastal, and fresh water populations (and from out-groups). Comparative genomics approaches were used to test multiple evolutionary hypotheses proposed to explain among-population genome variation including directional selection and hybridization. Results Structure and organization of the Fundulus mitochondrial genome is typical of animals, yet subtle differences in substitution patterns exist among populations. No signals of directional selection or hybridization were detected. Mitochondrial genes evolve at variable rates, but all genes exhibit very low dN/dS ratios across all lineages, and the southern population harbors more synonymous polymorphism than other populations. Conclusion Evolution of mitochondrial genomes within Fundulus is primarily governed by interaction between strong purifying selection and demographic influences, including larger historical population size in the south. Though directional selection and hybridization hypotheses were not supported, adaptive processes may indirectly contribute to partitioning of variation between populations. PMID:19144111

  6. Comparative analysis of methods for genome-wide nucleosome cartography.

    PubMed

    Quintales, Luis; Vázquez, Enrique; Antequera, Francisco

    2015-07-01

    Nucleosomes contribute to compacting the genome into the nucleus and regulate the physical access of regulatory proteins to DNA either directly or through the epigenetic modifications of the histone tails. Precise mapping of nucleosome positioning across the genome is, therefore, essential to understanding the genome regulation. In recent years, several experimental protocols have been developed for this purpose that include the enzymatic digestion, chemical cleavage or immunoprecipitation of chromatin followed by next-generation sequencing of the resulting DNA fragments. Here, we compare the performance and resolution of these methods from the initial biochemical steps through the alignment of the millions of short-sequence reads to a reference genome to the final computational analysis to generate genome-wide maps of nucleosome occupancy. Because of the lack of a unified protocol to process data sets obtained through the different approaches, we have developed a new computational tool (NUCwave), which facilitates their analysis, comparison and assessment and will enable researchers to choose the most suitable method for any particular purpose. NUCwave is freely available at http://nucleosome.usal.es/nucwave along with a step-by-step protocol for its use. PMID:25296770

  7. Sequencing and Comparative Genome Analysis of Two Pathogenic Streptococcus gallolyticus Subspecies: Genome Plasticity, Adaptation and Virulence

    PubMed Central

    Teng, Yu-Ting; Wu, Hui-Lun; Liu, Yen-Ming; Wu, Keh-Ming; Chang, Chuan-Hsiung; Hsu, Ming-Ta

    2011-01-01

    Streptococcus gallolyticus infections in humans are often associated with bacteremia, infective endocarditis and colon cancers. The disease manifestations are different depending on the subspecies of S. gallolyticus causing the infection. Here, we present the complete genomes of S. gallolyticus ATCC 43143 (biotype I) and S. pasteurianus ATCC 43144 (biotype II.2). The genomic differences between the two biotypes were characterized with comparative genomic analyses. The chromosome of ATCC 43143 and ATCC 43144 are 2,36 and 2,10 Mb in length and encode 2246 and 1869 CDS respectively. The organization and genomic contents of both genomes were most similar to the recently published S. gallolyticus UCN34, where 2073 (92%) and 1607 (86%) of the ATCC 43143 and ATCC 43144 CDS were conserved in UCN34 respectively. There are around 600 CDS conserved in all Streptococcus genomes, indicating the Streptococcus genus has a small core-genome (constitute around 30% of total CDS) and substantial evolutionary plasticity. We identified eight and five regions of genome plasticity in ATCC 43143 and ATCC 43144 respectively. Within these regions, several proteins were recognized to contribute to the fitness and virulence of each of the two subspecies. We have also predicted putative cell-surface associated proteins that could play a role in adherence to host tissues, leading to persistent infections causing sub-acute and chronic diseases in humans. This study showed evidence that the S. gallolyticus still possesses genes making it suitable in a rumen environment, whereas the ability for S. pasteurianus to live in rumen is reduced. The genome heterogeneity and genetic diversity among the two biotypes, especially membrane and lipoproteins, most likely contribute to the differences in the pathogenesis of the two S. gallolyticus biotypes and the type of disease an infected patient eventually develops. PMID:21633709

  8. Comparative genomics of Synechococcus and proposal of the new genus Parasynechococcus.

    PubMed

    Coutinho, Felipe; Tschoeke, Diogo Antonio; Thompson, Fabiano; Thompson, Cristiane

    2016-01-01

    Synechococcus is among the most important contributors to global primary productivity. The genomes of several strains of this taxon have been previously sequenced in an effort to understand the physiology and ecology of these highly diverse microorganisms. Here we present a comparative study of Synechococcus genomes. For that end, we developed GenTaxo, a program written in Perl to perform genomic taxonomy based on average nucleotide identity, average amino acid identity and dinucleotide signatures, which revealed that the analyzed strains are drastically distinct regarding their genomic content. Phylogenomic reconstruction indicated a division of Synechococcus in two clades (i.e. Synechococcus and the new genus Parasynechococcus), corroborating evidences that this is in fact a polyphyletic group. By clustering protein encoding genes into homologue groups we were able to trace the Pangenome and core genome of both marine and freshwater Synechococcus and determine the genotypic traits that differentiate these lineages. PMID:26839740

  9. Comparative genomics of Synechococcus and proposal of the new genus Parasynechococcus

    PubMed Central

    Coutinho, Felipe; Tschoeke, Diogo Antonio; Thompson, Cristiane

    2016-01-01

    Synechococcus is among the most important contributors to global primary productivity. The genomes of several strains of this taxon have been previously sequenced in an effort to understand the physiology and ecology of these highly diverse microorganisms. Here we present a comparative study of Synechococcus genomes. For that end, we developed GenTaxo, a program written in Perl to perform genomic taxonomy based on average nucleotide identity, average amino acid identity and dinucleotide signatures, which revealed that the analyzed strains are drastically distinct regarding their genomic content. Phylogenomic reconstruction indicated a division of Synechococcus in two clades (i.e. Synechococcus and the new genus Parasynechococcus), corroborating evidences that this is in fact a polyphyletic group. By clustering protein encoding genes into homologue groups we were able to trace the Pangenome and core genome of both marine and freshwater Synechococcus and determine the genotypic traits that differentiate these lineages. PMID:26839740

  10. Comparative genomics of transcriptional regulation of methionine metabolism in Proteobacteria.

    PubMed

    Leyn, Semen A; Suvorova, Inna A; Kholina, Tatiana D; Sherstneva, Sofia S; Novichkov, Pavel S; Gelfand, Mikhail S; Rodionov, Dmitry A

    2014-01-01

    Methionine metabolism and uptake genes in Proteobacteria are controlled by a variety of RNA and DNA regulatory systems. We have applied comparative genomics to reconstruct regulons for three known transcription factors, MetJ, MetR, and SahR, and three known riboswitch motifs, SAH, SAM-SAH, and SAM_alpha, in ∼ 200 genomes from 22 taxonomic groups of Proteobacteria. We also identified two novel regulons: a SahR-like transcription factor SamR controlling various methionine biosynthesis genes in the Xanthomonadales group, and a potential RNA regulatory element with terminator-antiterminator mechanism controlling the metX or metZ genes in beta-proteobacteria. For each analyzed regulator we identified the core, taxon-specific and genome-specific regulon members. By analyzing the distribution of these regulators in bacterial genomes and by comparing their regulon contents we elucidated possible evolutionary scenarios for the regulation of the methionine metabolism genes in Proteobacteria. PMID:25411846

  11. Comparative Genomics of Transcriptional Regulation of Methionine Metabolism in Proteobacteria

    PubMed Central

    Leyn, Semen A.; Suvorova, Inna A.; Kholina, Tatiana D.; Sherstneva, Sofia S.; Novichkov, Pavel S.; Gelfand, Mikhail S.; Rodionov, Dmitry A.

    2014-01-01

    Methionine metabolism and uptake genes in Proteobacteria are controlled by a variety of RNA and DNA regulatory systems. We have applied comparative genomics to reconstruct regulons for three known transcription factors, MetJ, MetR, and SahR, and three known riboswitch motifs, SAH, SAM-SAH, and SAM_alpha, in ∼200 genomes from 22 taxonomic groups of Proteobacteria. We also identified two novel regulons: a SahR-like transcription factor SamR controlling various methionine biosynthesis genes in the Xanthomonadales group, and a potential RNA regulatory element with terminator-antiterminator mechanism controlling the metX or metZ genes in beta-proteobacteria. For each analyzed regulator we identified the core, taxon-specific and genome-specific regulon members. By analyzing the distribution of these regulators in bacterial genomes and by comparing their regulon contents we elucidated possible evolutionary scenarios for the regulation of the methionine metabolism genes in Proteobacteria. PMID:25411846

  12. Comparative genomics of transcriptional regulation of methionine metabolism in proteobacteria

    DOE PAGESBeta

    Leyn, Semen A.; Suvorova, Inna A.; Kholina, Tatiana D.; Sherstneva, Sofia S.; Novichkov, Pavel S.; Gelfand, Mikhail S.; Rodionov, Dmitry A.; Kuipers, Oscar P.

    2014-11-20

    Methionine metabolism and uptake genes in Proteobacteria are controlled by a variety of RNA and DNA regulatory systems. We have applied comparative genomics to reconstruct regulons for three known transcription factors, MetJ, MetR, and SahR, and three known riboswitch motifs, SAH, SAM-SAH, and SAM_alpha, in ~200 genomes from 22 taxonomic groups of Proteobacteria. We also identified two novel regulons: a SahR-like transcription factor SamR controlling various methionine biosynthesis genes in the Xanthomonadales group, and a potential RNA regulatory element with terminator-antiterminator mechanism controlling the metX or metZ genes in beta-proteobacteria. For each analyzed regulator we identified the core, taxon-specific andmore » genome-specific regulon members. By analyzing the distribution of these regulators in bacterial genomes and by comparing their regulon contents we elucidated possible evolutionary scenarios for the regulation of the methionine metabolism genes in Proteobacteria.« less

  13. Comparative genomics of transcriptional regulation of methionine metabolism in proteobacteria

    SciTech Connect

    Leyn, Semen A.; Suvorova, Inna A.; Kholina, Tatiana D.; Sherstneva, Sofia S.; Novichkov, Pavel S.; Gelfand, Mikhail S.; Rodionov, Dmitry A.; Kuipers, Oscar P.

    2014-11-20

    Methionine metabolism and uptake genes in Proteobacteria are controlled by a variety of RNA and DNA regulatory systems. We have applied comparative genomics to reconstruct regulons for three known transcription factors, MetJ, MetR, and SahR, and three known riboswitch motifs, SAH, SAM-SAH, and SAM_alpha, in ~200 genomes from 22 taxonomic groups of Proteobacteria. We also identified two novel regulons: a SahR-like transcription factor SamR controlling various methionine biosynthesis genes in the Xanthomonadales group, and a potential RNA regulatory element with terminator-antiterminator mechanism controlling the metX or metZ genes in beta-proteobacteria. For each analyzed regulator we identified the core, taxon-specific and genome-specific regulon members. By analyzing the distribution of these regulators in bacterial genomes and by comparing their regulon contents we elucidated possible evolutionary scenarios for the regulation of the methionine metabolism genes in Proteobacteria.

  14. Comparative genomics and transcriptomics of trait-gene association

    PubMed Central

    2012-01-01

    Background The Order Rickettsiales includes important tick-borne pathogens, from Rickettsia rickettsii, which causes Rocky Mountain spotted fever, to Anaplasma marginale, the most prevalent vector-borne pathogen of cattle. Although most pathogens in this Order are transmitted by arthropod vectors, little is known about the microbial determinants of transmission. A. marginale provides unique tools for studying the determinants of transmission, with multiple strain sequences available that display distinct and reproducible transmission phenotypes. The closed core A. marginale genome suggests that any phenotypic differences are due to single nucleotide polymorphisms (SNPs). We combined DNA/RNA comparative genomic approaches using strains with different tick transmission phenotypes and identified genes that segregate with transmissibility. Results Comparison of seven strains with different transmission phenotypes generated a list of SNPs affecting 18 genes and nine promoters. Transcriptional analysis found two candidate genes downstream from promoter SNPs that were differentially transcribed. To corroborate the comparative genomics approach we used three RNA-seq platforms to analyze the transcriptomes from two A. marginale strains with different transmission phenotypes. RNA-seq analysis confirmed the comparative genomics data and found 10 additional genes whose transcription between strains with distinct transmission efficiencies was significantly different. Six regions of the genome that contained no annotation were found to be transcriptionally active, and two of these newly identified transcripts were differentially transcribed. Conclusions This approach identified 30 genes and two novel transcripts potentially involved in tick transmission. We describe the transcriptome of an obligate intracellular bacterium in depth, while employing massive parallel sequencing to dissect an important trait in bacterial pathogenesis. PMID:23181781

  15. Comparative genomics of Enterococcus faecalis from healthy Norwegian infants

    PubMed Central

    Solheim, Margrete; Aakra, Ågot; Snipen, Lars G; Brede, Dag A; Nes, Ingolf F

    2009-01-01

    Background Enterococcus faecalis, traditionally considered a harmless commensal of the intestinal tract, is now ranked among the leading causes of nosocomial infections. In an attempt to gain insight into the genetic make-up of commensal E. faecalis, we have studied genomic variation in a collection of community-derived E. faecalis isolated from the feces of Norwegian infants. Results The E. faecalis isolates were first sequence typed by multilocus sequence typing (MLST) and characterized with respect to antibiotic resistance and properties associated with virulence. A subset of the isolates was compared to the vancomycin resistant strain E. faecalis V583 (V583) by whole genome microarray comparison (comparative genomic hybridization (CGH)). Several of the putative enterococcal virulence factors were found to be highly prevalent among the commensal baby isolates. The genomic variation as observed by CGH was less between isolates displaying the same MLST sequence type than between isolates belonging to different evolutionary lineages. Conclusion The variations in gene content observed among the investigated commensal E. faecalis is comparable to the genetic variation previously reported among strains of various origins thought to be representative of the major E. faecalis lineages. Previous MLST analysis of E. faecalis have identified so-called high-risk enterococcal clonal complexes (HiRECC), defined as genetically distinct subpopulations, epidemiologically associated with enterococcal infections. The observed correlation between CGH and MLST presented here, may offer a method for the identification of lineage-specific genes, and may therefore add clues on how to distinguish pathogenic from commensal E. faecalis. In this work, information on the core genome of E. faecalis is also substantially extended. PMID:19393078

  16. mGenomeSubtractor: a web-based tool for parallel in silico subtractive hybridization analysis of multiple bacterial genomes.

    PubMed

    Shao, Yucheng; He, Xinyi; Harrison, Ewan M; Tai, Cui; Ou, Hong-Yu; Rajakumar, Kumar; Deng, Zixin

    2010-07-01

    mGenomeSubtractor performs an mpiBLAST-based comparison of reference bacterial genomes against multiple user-selected genomes for investigation of strain variable accessory regions. With parallel computing architecture, mGenomeSubtractor is able to run rapid BLAST searches of the segmented reference genome against multiple subject genomes at the DNA or amino acid level within a minute. In addition to comparison of protein coding sequences, the highly flexible sliding window-based genome fragmentation approach offered can be used to identify short unique sequences within or between genes. mGenomeSubtractor provides powerful schematic outputs for exploration of identified core and accessory regions, including searches against databases of mobile genetic elements, virulence factors or bacterial essential genes, examination of G+C content and binucleotide distribution bias, and integrated primer design tools. mGenomeSubtractor also allows for the ready definition of species-specific gene pools based on available genomes. Pan-genomic arrays can be easily developed using the efficient oligonucleotide design tool. This simple high-throughput in silico 'subtractive hybridization' analytical tool will support the rapidly escalating number of comparative bacterial genomics studies aimed at defining genomic biomarkers of evolutionary lineage, phenotype, pathotype, environmental adaptation and/or disease-association of diverse bacterial species. mGenomeSubtractor is freely available to all users without any login requirement at: http://bioinfo-mml.sjtu.edu.cn/mGS/. PMID:20435682

  17. Genetic profiling of yeast industrial strains using in situ comparative genomic hybridization (CGH).

    PubMed

    Wnuk, Maciej; Panek, Anita; Golec, Ewelina; Magda, Michal; Deregowska, Anna; Adamczyk, Jagoda; Lewinska, Anna

    2015-09-20

    The genetic differences and changes in genomic stability may affect fermentation processes involving baker's, brewer's and wine yeast strains. Thus, it seems worthwhile to monitor the changes in genomic DNA copy number of industrial strains. In the present study, we developed an in situ comparative genomic hybridization (CGH) to investigate the ploidy and genetic differences between selected industrial yeast strains. The CGH-based system was validated using the laboratory Saccharomyces cerevisiae yeast strains (haploid BY4741 and diploid BY4743). DNA isolated from BY4743 cells was considered a reference DNA. The ploidy and DNA gains and losses of baker's, brewer's and wine strains were revealed. Taken together, the in situ CGH was shown a helpful molecular tool to identify genomic differences between yeast industrial strains. Moreover, the in situ CGH-based system may be used at the single-cell level of analysis to supplement array-based techniques and high-throughput analyses at the population scale. PMID:26116136

  18. Mitome: dynamic and interactive database for comparative mitochondrial genomics in metazoan animals.

    PubMed

    Lee, Yong Seok; Oh, Jeongsu; Kim, Young Uk; Kim, Namchul; Yang, Sungjin; Hwang, Ui Wook

    2008-01-01

    Mitome is a specialized mitochondrial genome database designed for easy comparative analysis of various features of metazoan mitochondrial genomes such as base frequency, A+T skew, codon usage and gene arrangement pattern. A particular function of the database is the automatic reconstruction of phylogenetic relationships among metazoans selected by a user from a taxonomic tree menu based on nucleotide sequences, amino acid sequences or gene arrangement patterns. Mitome also enables us (i) to easily find the taxonomic positions of organisms of which complete mitochondrial genome sequences are publicly available; (ii) to acquire various metazoan mitochondrial genome characteristics through a graphical genome browser; (iii) to search for homology patterns in mitochondrial gene arrangements; (iv) to download nucleotide or amino acid sequences not only of an entire mitochondrial genome but also of each component; and (v) to find interesting references easily through links with PubMed. In order to provide users with a dynamic, responsive, interactive and faster web database, Mitome is constructed using two recently highlighted techniques, Ajax (Asynchronous JavaScript and XML) and Web Services. Mitome has the potential to become very useful in the fields of molecular phylogenetics and evolution and comparative organelle genomics. The database is available at: http://www.mitome.info. PMID:17940090

  19. Sequence and comparative genomic analysis of actin-related proteins.

    PubMed

    Muller, Jean; Oma, Yukako; Vallar, Laurent; Friederich, Evelyne; Poch, Olivier; Winsor, Barbara

    2005-12-01

    Actin-related proteins (ARPs) are key players in cytoskeleton activities and nuclear functions. Two complexes, ARP2/3 and ARP1/11, also known as dynactin, are implicated in actin dynamics and in microtubule-based trafficking, respectively. ARP4 to ARP9 are components of many chromatin-modulating complexes. Conventional actins and ARPs codefine a large family of homologous proteins, the actin superfamily, with a tertiary structure known as the actin fold. Because ARPs and actin share high sequence conservation, clear family definition requires distinct features to easily and systematically identify each subfamily. In this study we performed an in depth sequence and comparative genomic analysis of ARP subfamilies. A high-quality multiple alignment of approximately 700 complete protein sequences homologous to actin, including 148 ARP sequences, allowed us to extend the ARP classification to new organisms. Sequence alignments revealed conserved residues, motifs, and inserted sequence signatures to define each ARP subfamily. These discriminative characteristics allowed us to develop ARPAnno (http://bips.u-strasbg.fr/ARPAnno), a new web server dedicated to the annotation of ARP sequences. Analyses of sequence conservation among actins and ARPs highlight part of the actin fold and suggest interactions between ARPs and actin-binding proteins. Finally, analysis of ARP distribution across eukaryotic phyla emphasizes the central importance of nuclear ARPs, particularly the multifunctional ARP4. PMID:16195354

  20. Comparative genomics of mutualistic viruses of Glyptapanteles parasitic wasps

    PubMed Central

    Desjardins, Christopher A; Gundersen-Rindal, Dawn E; Hostetler, Jessica B; Tallon, Luke J; Fadrosh, Douglas W; Fuester, Roger W; Pedroni, Monica J; Haas, Brian J; Schatz, Michael C; Jones, Kristine M; Crabtree, Jonathan; Forberger, Heather; Nene, Vishvanath

    2008-01-01

    Background Polydnaviruses, double-stranded DNA viruses with segmented genomes, have evolved as obligate endosymbionts of parasitoid wasps. Virus particles are replication deficient and produced by female wasps from proviral sequences integrated into the wasp genome. These particles are co-injected with eggs into caterpillar hosts, where viral gene expression facilitates parasitoid survival and, thereby, survival of proviral DNA. Here we characterize and compare the encapsidated viral genome sequences of bracoviruses in the family Polydnaviridae associated with Glyptapanteles gypsy moth parasitoids, along with near complete proviral sequences from which both viral genomes are derived. Results The encapsidated Glyptapanteles indiensis and Glyptapanteles flavicoxis bracoviral genomes, each composed of 29 different size segments, total approximately 517 and 594 kbp, respectively. They are generated from a minimum of seven distinct loci in the wasp genome. Annotation of these sequences revealed numerous novel features for polydnaviruses, including insect-like sugar transporter genes and transposable elements. Evolutionary analyses suggest that positive selection is widespread among bracoviral genes. Conclusions The structure and organization of G. indiensis and G. flavicoxis bracovirus proviral segments as multiple loci containing one to many viral segments, flanked and separated by wasp gene-encoding DNA, is confirmed. Rapid evolution of bracovirus genes supports the hypothesis of bracovirus genes in an 'arms race' between bracovirus and caterpillar. Phylogenetic analyses of the bracoviral genes encoding sugar transporters provides the first robust evidence of a wasp origin for some polydnavirus genes. We hypothesize transposable elements, such as those described here, could facilitate transfer of genes between proviral segments and host DNA. PMID:19116010

  1. Comparative Genomics of Multidrug Resistance in Acinetobacter baumannii

    PubMed Central

    2006-01-01

    Acinetobacter baumannii is a species of nonfermentative gram-negative bacteria commonly found in water and soil. This organism was susceptible to most antibiotics in the 1970s. It has now become a major cause of hospital-acquired infections worldwide due to its remarkable propensity to rapidly acquire resistance determinants to a wide range of antibacterial agents. Here we use a comparative genomic approach to identify the complete repertoire of resistance genes exhibited by the multidrug-resistant A. baumannii strain AYE, which is epidemic in France, as well as to investigate the mechanisms of their acquisition by comparison with the fully susceptible A. baumannii strain SDF, which is associated with human body lice. The assembly of the whole shotgun genome sequences of the strains AYE and SDF gave an estimated size of 3.9 and 3.2 Mb, respectively. A. baumannii strain AYE exhibits an 86-kb genomic region termed a resistance island—the largest identified to date—in which 45 resistance genes are clustered. At the homologous location, the SDF strain exhibits a 20 kb-genomic island flanked by transposases but devoid of resistance markers. Such a switching genomic structure might be a hotspot that could explain the rapid acquisition of resistance markers under antimicrobial pressure. Sequence similarity and phylogenetic analyses confirm that most of the resistance genes found in the A. baumannii strain AYE have been recently acquired from bacteria of the genera Pseudomonas, Salmonella, or Escherichia. This study also resulted in the discovery of 19 new putative resistance genes. Whole-genome sequencing appears to be a fast and efficient approach to the exhaustive identification of resistance genes in epidemic infectious agents of clinical significance. PMID:16415984

  2. Genomic copy number alterations in 33 malignant peritoneal mesothelioma analyzed by comparative genomic hybridization array.

    PubMed

    Chirac, Pierre; Maillet, Denis; Leprêtre, Frédéric; Isaac, Sylvie; Glehen, Olivier; Figeac, Martin; Villeneuve, Laurent; Péron, Julien; Gibson, Fernando; Galateau-Sallé, Françoise; Gilly, François-Noël; Brevet, Marie

    2016-09-01

    Malignant peritoneal mesotheliomas (MPM) are rare, accounting for approximately 8% of cases of mesothelioma in France. We performed comparative genomic hybridization (CGH) on frozen MPM samples using the Agilent Human Genome CGH 180 K array. Samples were taken from a total of 33 French patients, comprising 20 men and 13 women with a mean (range) age of 58.4 (17-76) years. Asbestos exposure was reported in 8 patients (24.2%). Median (range) overall survival (OS) was 39 (0-119) months. CGH analysis demonstrated the presence of chromosomal instability in patients with MPM, with a genomic pattern that was similar to that described for pleural mesothelioma, including the loss of chromosomal regions 3p21, 9p21, and 22q12. In addition, novel genomic copy number alterations were identified, including the 15q26.2 region and the 8p11.22 region. Median OS was associated with a low peritoneal cancer index (P=.011), epithelioid subtype (P=.038), and a low number of genomic aberrations (P=.015), all of which constitute good prognostic factors for MPM. Our results provide new insights into the genetic and genomic background of MPM. Although pleural and peritoneal mesotheliomas have different risk factors, different therapeutics, and different prognosis; these data provide support to combine pleural and peritoneal mesothelioma in same clinical assays. PMID:27184482

  3. Evolution of genome size in Carex (Cyperaceae) in relation to chromosome number and genomic base composition

    PubMed Central

    Lipnerová, Ivana; Bureš, Petr; Horová, Lucie; Šmarda, Petr

    2013-01-01

    Background and Aims The genus Carex exhibits karyological peculiarities related to holocentrism, specifically extremely broad and almost continual variation in chromosome number. However, the effect of these peculiarities on the evolution of the genome (genome size, base composition) remains unknown. While in monocentrics, determining the arithmetic relationship between the chromosome numbers of related species is usually sufficient for the detection of particular modes of karyotype evolution (i.e. polyploidy and dysploidy), in holocentrics where chromosomal fission and fusion occur such detection requires knowledge of the DNA content. Methods The genome size and GC content were estimated in 157 taxa using flow cytometry. The exact chromosome numbers were known for 96 measured samples and were taken from the available literature for other taxa. All relationships were tested in a phylogenetic framework using the ITS tree of 105 species. Key Results The 1C genome size varied between 0·24 and 1·64 pg in Carex secalina and C. cuspidata, respectively. The genomic GC content varied from 34·8 % to 40·6 % from C. secalina to C. firma. Both genomic parameters were positively correlated. Seven polyploid and two potentially polyploid taxa were detected in the core Carex clade. A strong negative correlation between genome size and chromosome number was documented in non-polyploid taxa. Non-polyploid taxa of the core Carex clade exhibited a higher rate of genome-size evolution compared with the Vignea clade. Three dioecious taxa exhibited larger genomes, larger chromosomes, and a higher GC content than their hermaphrodite relatives. Conclusions Genomes of Carex are relatively small and very GC-poor compared with other angiosperms. We conclude that the evolution of genome and karyotype in Carex is promoted by frequent chromosomal fissions/fusions, rare polyploidy and common repetitive DNA proliferation/removal. PMID:23175591

  4. Decoding the molecular evolution of human cognition using comparative genomics

    PubMed Central

    Usui, Noriyoshi; Co, Marissa; Konopka, Genevieve

    2014-01-01

    Identification of genetic and molecular factors responsible for the specialized cognitive abilities of humans is expected to provide important insights into the mechanisms responsible for disorders of cognition such as autism, schizophrenia, and Alzheimer’s disease. Here, we discuss the use of comparative genomics for identifying salient genes and gene networks that may underlie cognition. We focus on the comparison of human and non-human primate brain gene expression and the utility of building gene co-expression networks for prioritizing hundreds of genes that differ in expression among the species queried. We also discuss the importance and methods for functional studies of individual genes identified. Together, this integration of comparative genomics with cellular and animal models should provide improved systems for developing effective therapeutics for disorders of cognition. PMID:25247723

  5. Draft Genomes, Phylogenetic Reconstruction, and Comparative Genomics of Two Novel Cohabiting Bacterial Symbionts Isolated from Frankliniella occidentalis

    PubMed Central

    Facey, Paul D.; Méric, Guillaume; Hitchings, Matthew D.; Pachebat, Justin A.; Hegarty, Matt J.; Chen, Xiaorui; Morgan, Laura V.A.; Hoeppner, James E.; Whitten, Miranda M.A.; Kirk, William D.J.; Dyson, Paul J.; Sheppard, Sam K.; Sol, Ricardo Del

    2015-01-01

    Obligate bacterial symbionts are widespread in many invertebrates, where they are often confined to specialized host cells and are transmitted directly from mother to progeny. Increasing numbers of these bacteria are being characterized but questions remain about their population structure and evolution. Here we take a comparative genomics approach to investigate two prominent bacterial symbionts (BFo1 and BFo2) isolated from geographically separated populations of western flower thrips, Frankliniella occidentalis. Our multifaceted approach to classifying these symbionts includes concatenated multilocus sequence analysis (MLSA) phylogenies, ribosomal multilocus sequence typing (rMLST), construction of whole-genome phylogenies, and in-depth genomic comparisons. We showed that the BFo1 genome clusters more closely to species in the genus Erwinia, and is a putative close relative to Erwinia aphidicola. BFo1 is also likely to have shared a common ancestor with Erwinia pyrifoliae/Erwinia amylovora and the nonpathogenic Erwinia tasmaniensis and genetic traits similar to Erwinia billingiae. The BFo1 genome contained virulence factors found in the genus Erwinia but represented a divergent lineage. In contrast, we showed that BFo2 belongs within the Enterobacteriales but does not group closely with any currently known bacterial species. Concatenated MLSA phylogenies indicate that it may have shared a common ancestor to the Erwinia and Pantoea genera, and based on the clustering of rMLST genes, it was most closely related to Pantoea ananatis but represented a divergent lineage. We reconstructed a core genome of a putative common ancestor of Erwinia and Pantoea and compared this with the genomes of BFo bacteria. BFo2 possessed none of the virulence determinants that were omnipresent in the Erwinia and Pantoea genera. Taken together, these data are consistent with BFo2 representing a highly novel species that maybe related to known Pantoea. PMID:26185096

  6. Whole-genome sequence of the Tibetan frog Nanorana parkeri and the comparative evolution of tetrapod genomes

    PubMed Central

    Sun, Yan-Bo; Xiong, Zi-Jun; Xiang, Xue-Yan; Liu, Shi-Ping; Zhou, Wei-Wei; Tu, Xiao-Long; Zhong, Li; Wang, Lu; Wu, Dong-Dong; Zhang, Bao-Lin; Zhu, Chun-Ling; Yang, Min-Min; Chen, Hong-Man; Li, Fang; Zhou, Long; Feng, Shao-Hong; Huang, Chao; Zhang, Guo-Jie; Irwin, David; Hillis, David M.; Murphy, Robert W.; Yang, Huan-Ming; Che, Jing; Wang, Jun; Zhang, Ya-Ping

    2015-01-01

    The development of efficient sequencing techniques has resulted in large numbers of genomes being available for evolutionary studies. However, only one genome is available for all amphibians, that of Xenopus tropicalis, which is distantly related from the majority of frogs. More than 96% of frogs belong to the Neobatrachia, and no genome exists for this group. This dearth of amphibian genomes greatly restricts genomic studies of amphibians and, more generally, our understanding of tetrapod genome evolution. To fill this gap, we provide the de novo genome of a Tibetan Plateau frog, Nanorana parkeri, and compare it to that of X. tropicalis and other vertebrates. This genome encodes more than 20,000 protein-coding genes, a number similar to that of Xenopus. Although the genome size of Nanorana is considerably larger than that of Xenopus (2.3 vs. 1.5 Gb), most of the difference is due to the respective number of transposable elements in the two genomes. The two frogs exhibit considerable conserved whole-genome synteny despite having diverged approximately 266 Ma, indicating a slow rate of DNA structural evolution in anurans. Multigenome synteny blocks further show that amphibians have fewer interchromosomal rearrangements than mammals but have a comparable rate of intrachromosomal rearrangements. Our analysis also identifies 11 Mb of anuran-specific highly conserved elements that will be useful for comparative genomic analyses of frogs. The Nanorana genome offers an improved understanding of evolution of tetrapod genomes and also provides a genomic reference for other evolutionary studies. PMID:25733869

  7. Whole-genome sequence of the Tibetan frog Nanorana parkeri and the comparative evolution of tetrapod genomes.

    PubMed

    Sun, Yan-Bo; Xiong, Zi-Jun; Xiang, Xue-Yan; Liu, Shi-Ping; Zhou, Wei-Wei; Tu, Xiao-Long; Zhong, Li; Wang, Lu; Wu, Dong-Dong; Zhang, Bao-Lin; Zhu, Chun-Ling; Yang, Min-Min; Chen, Hong-Man; Li, Fang; Zhou, Long; Feng, Shao-Hong; Huang, Chao; Zhang, Guo-Jie; Irwin, David; Hillis, David M; Murphy, Robert W; Yang, Huan-Ming; Che, Jing; Wang, Jun; Zhang, Ya-Ping

    2015-03-17

    The development of efficient sequencing techniques has resulted in large numbers of genomes being available for evolutionary studies. However, only one genome is available for all amphibians, that of Xenopus tropicalis, which is distantly related from the majority of frogs. More than 96% of frogs belong to the Neobatrachia, and no genome exists for this group. This dearth of amphibian genomes greatly restricts genomic studies of amphibians and, more generally, our understanding of tetrapod genome evolution. To fill this gap, we provide the de novo genome of a Tibetan Plateau frog, Nanorana parkeri, and compare it to that of X. tropicalis and other vertebrates. This genome encodes more than 20,000 protein-coding genes, a number similar to that of Xenopus. Although the genome size of Nanorana is considerably larger than that of Xenopus (2.3 vs. 1.5 Gb), most of the difference is due to the respective number of transposable elements in the two genomes. The two frogs exhibit considerable conserved whole-genome synteny despite having diverged approximately 266 Ma, indicating a slow rate of DNA structural evolution in anurans. Multigenome synteny blocks further show that amphibians have fewer interchromosomal rearrangements than mammals but have a comparable rate of intrachromosomal rearrangements. Our analysis also identifies 11 Mb of anuran-specific highly conserved elements that will be useful for comparative genomic analyses of frogs. The Nanorana genome offers an improved understanding of evolution of tetrapod genomes and also provides a genomic reference for other evolutionary studies. PMID:25733869

  8. Unlocking Holocentric Chromosomes: New Perspectives from Comparative and Functional Genomics?

    PubMed Central

    Mandrioli, Mauro; Manicardi, Gian Carlo

    2012-01-01

    The presence of chromosomes with diffuse centromeres (holocentric chromosomes) has been reported in several taxa since more than fifty years, but a full understanding of their origin is still lacking. Comparative and functional genomics are nowadays furnishing new data to better understand holocentric chromosome evolution thus opening new perspectives to analyse karyotype rearrangements in species with holocentric chromosomes in particular evidencing unusual common features, such as the uniform GC content and gene distribution along chromosomes. PMID:23372420

  9. Mosaic supernumerary ring chromosome 19 identified by comparative genomic hybridisation.

    PubMed Central

    Ghaffari, S R; Boyd, E; Connor, J M; Jones, A M; Tolmie, J L

    1998-01-01

    We report the use of comparative genomic hybridisation (CGH) to define the origin of a supernumerary ring chromosome which conventional cytogenetic banding and fluorescence in situ hybridisation (FISH) methods had failed to identify. Targeted FISH using whole chromosome 19 library arm and site specific probes then confirmed the CGH results. This study shows the feasibility of using CGH for the identification of supernumerary marker chromosomes, even in fewer than 50% of cells, where no clinical or cytogenetic clues are present. Images PMID:9783708

  10. Comparative genome analysis and genome-guided physiological analysis of Roseobacter litoralis

    PubMed Central

    2011-01-01

    Background Roseobacter litoralis OCh149, the type species of the genus, and Roseobacter denitrificans OCh114 were the first described organisms of the Roseobacter clade, an ecologically important group of marine bacteria. Both species were isolated from seaweed and are able to perform aerobic anoxygenic photosynthesis. Results The genome of R. litoralis OCh149 contains one circular chromosome of 4,505,211 bp and three plasmids of 93,578 bp (pRLO149_94), 83,129 bp (pRLO149_83) and 63,532 bp (pRLO149_63). Of the 4537 genes predicted for R. litoralis, 1122 (24.7%) are not present in the genome of R. denitrificans. Many of the unique genes of R. litoralis are located in genomic islands and on plasmids. On pRLO149_83 several potential heavy metal resistance genes are encoded which are not present in the genome of R. denitrificans. The comparison of the heavy metal tolerance of the two organisms showed an increased zinc tolerance of R. litoralis. In contrast to R. denitrificans, the photosynthesis genes of R. litoralis are plasmid encoded. The activity of the photosynthetic apparatus was confirmed by respiration rate measurements, indicating a growth-phase dependent response to light. Comparative genomics with other members of the Roseobacter clade revealed several genomic regions that were only conserved in the two Roseobacter species. One of those regions encodes a variety of genes that might play a role in host association of the organisms. The catabolism of different carbon and nitrogen sources was predicted from the genome and combined with experimental data. In several cases, e.g. the degradation of some algal osmolytes and sugars, the genome-derived predictions of the metabolic pathways in R. litoralis differed from the phenotype. Conclusions The genomic differences between the two Roseobacter species are mainly due to lateral gene transfer and genomic rearrangements. Plasmid pRLO149_83 contains predominantly recently acquired genetic material whereas pRLO149