Sample records for comparative evolutionary genomics

  1. Mammalian Comparative Genomics Reveals Genetic and Epigenetic Features Associated with Genome Reshuffling in Rodentia

    PubMed Central

    Capilla, Laia; Sánchez-Guillén, Rosa Ana; Farré, Marta; Paytuví-Gallart, Andreu; Malinverni, Roberto; Ventura, Jacint; Larkin, Denis M.

    2016-01-01

    Abstract Understanding how mammalian genomes have been reshuffled through structural changes is fundamental to the dynamics of its composition, evolutionary relationships between species and, in the long run, speciation. In this work, we reveal the evolutionary genomic landscape in Rodentia, the most diverse and speciose mammalian order, by whole-genome comparisons of six rodent species and six representative outgroup mammalian species. The reconstruction of the evolutionary breakpoint regions across rodent phylogeny shows an increased rate of genome reshuffling that is approximately two orders of magnitude greater than in other mammalian species here considered. We identified novel lineage and clade-specific breakpoint regions within Rodentia and analyzed their gene content, recombination rates and their relationship with constitutive lamina genomic associated domains, DNase I hypersensitivity sites and chromatin modifications. We detected an accumulation of protein-coding genes in evolutionary breakpoint regions, especially genes implicated in reproduction and pheromone detection and mating. Moreover, we found an association of the evolutionary breakpoint regions with active chromatin state landscapes, most probably related to gene enrichment. Our results have two important implications for understanding the mechanisms that govern and constrain mammalian genome evolution. The first is that the presence of genes related to species-specific phenotypes in evolutionary breakpoint regions reinforces the adaptive value of genome reshuffling. Second, that chromatin conformation, an aspect that has been often overlooked in comparative genomic studies, might play a role in modeling the genomic distribution of evolutionary breakpoints. PMID:28175287

  2. Mammalian Comparative Genomics Reveals Genetic and Epigenetic Features Associated with Genome Reshuffling in Rodentia.

    PubMed

    Capilla, Laia; Sánchez-Guillén, Rosa Ana; Farré, Marta; Paytuví-Gallart, Andreu; Malinverni, Roberto; Ventura, Jacint; Larkin, Denis M; Ruiz-Herrera, Aurora

    2016-12-01

    Understanding how mammalian genomes have been reshuffled through structural changes is fundamental to the dynamics of its composition, evolutionary relationships between species and, in the long run, speciation. In this work, we reveal the evolutionary genomic landscape in Rodentia, the most diverse and speciose mammalian order, by whole-genome comparisons of six rodent species and six representative outgroup mammalian species. The reconstruction of the evolutionary breakpoint regions across rodent phylogeny shows an increased rate of genome reshuffling that is approximately two orders of magnitude greater than in other mammalian species here considered. We identified novel lineage and clade-specific breakpoint regions within Rodentia and analyzed their gene content, recombination rates and their relationship with constitutive lamina genomic associated domains, DNase I hypersensitivity sites and chromatin modifications. We detected an accumulation of protein-coding genes in evolutionary breakpoint regions, especially genes implicated in reproduction and pheromone detection and mating. Moreover, we found an association of the evolutionary breakpoint regions with active chromatin state landscapes, most probably related to gene enrichment. Our results have two important implications for understanding the mechanisms that govern and constrain mammalian genome evolution. The first is that the presence of genes related to species-specific phenotypes in evolutionary breakpoint regions reinforces the adaptive value of genome reshuffling. Second, that chromatin conformation, an aspect that has been often overlooked in comparative genomic studies, might play a role in modeling the genomic distribution of evolutionary breakpoints.

  3. Genomicus 2018: karyotype evolutionary trees and on-the-fly synteny computing

    PubMed Central

    Nguyen, Nga Thi Thuy; Vincens, Pierre

    2018-01-01

    Abstract Since 2010, the Genomicus web server is available online at http://genomicus.biologie.ens.fr/genomicus. This graphical browser provides access to comparative genomic analyses in four different phyla (Vertebrate, Plants, Fungi, and non vertebrate Metazoans). Users can analyse genomic information from extant species, as well as ancestral gene content and gene order for vertebrates and flowering plants, in an integrated evolutionary context. New analyses and visualization tools have recently been implemented in Genomicus Vertebrate. Karyotype structures from several genomes can now be compared along an evolutionary pathway (Multi-KaryotypeView), and synteny blocks can be computed and visualized between any two genomes (PhylDiagView). PMID:29087490

  4. Evolutionary genetics of insect innate immunity.

    PubMed

    Viljakainen, Lumi

    2015-11-01

    Patterns of evolution in immune defense genes help to understand the evolutionary dynamics between hosts and pathogens. Multiple insect genomes have been sequenced, with many of them having annotated immune genes, which paves the way for a comparative genomic analysis of insect immunity. In this review, I summarize the current state of comparative and evolutionary genomics of insect innate immune defense. The focus is on the conserved and divergent components of immunity with an emphasis on gene family evolution and evolution at the sequence level; both population genetics and molecular evolution frameworks are considered. © The Author 2015. Published by Oxford University Press.

  5. Genomicus 2018: karyotype evolutionary trees and on-the-fly synteny computing.

    PubMed

    Nguyen, Nga Thi Thuy; Vincens, Pierre; Roest Crollius, Hugues; Louis, Alexandra

    2018-01-04

    Since 2010, the Genomicus web server is available online at http://genomicus.biologie.ens.fr/genomicus. This graphical browser provides access to comparative genomic analyses in four different phyla (Vertebrate, Plants, Fungi, and non vertebrate Metazoans). Users can analyse genomic information from extant species, as well as ancestral gene content and gene order for vertebrates and flowering plants, in an integrated evolutionary context. New analyses and visualization tools have recently been implemented in Genomicus Vertebrate. Karyotype structures from several genomes can now be compared along an evolutionary pathway (Multi-KaryotypeView), and synteny blocks can be computed and visualized between any two genomes (PhylDiagView). © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  6. Interpreting Mammalian Evolution using Fugu Genome Comparisons

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Stubbs, L; Ovcharenko, I; Loots, G G

    2004-04-02

    Comparative sequence analysis of the human and the pufferfish Fugu rubripes (fugu) genomes has revealed several novel functional coding and noncoding regions in the human genome. In particular, the fugu genome has been extremely valuable for identifying transcriptional regulatory elements in human loci harboring unusually high levels of evolutionary conservation to rodent genomes. In such regions, the large evolutionary distance between human and fishes provides an additional filter through which functional noncoding elements can be detected with high efficiency.

  7. Genome-wide comparative analysis of four Indian Drosophila species.

    PubMed

    Mohanty, Sujata; Khanna, Radhika

    2017-12-01

    Comparative analysis of multiple genomes of closely or distantly related Drosophila species undoubtedly creates excitement among evolutionary biologists in exploring the genomic changes with an ecology and evolutionary perspective. We present herewith the de novo assembled whole genome sequences of four Drosophila species, D. bipectinata, D. takahashii, D. biarmipes and D. nasuta of Indian origin using Next Generation Sequencing technology on an Illumina platform along with their detailed assembly statistics. The comparative genomics analysis, e.g. gene predictions and annotations, functional and orthogroup analysis of coding sequences and genome wide SNP distribution were performed. The whole genome of Zaprionus indianus of Indian origin published earlier by us and the genome sequences of previously sequenced 12 Drosophila species available in the NCBI database were included in the analysis. The present work is a part of our ongoing genomics project of Indian Drosophila species.

  8. Is mammalian chromosomal evolution driven by regions of genome fragility?

    PubMed Central

    Ruiz-Herrera, Aurora; Castresana, Jose; Robinson, Terence J

    2006-01-01

    Background A fundamental question in comparative genomics concerns the identification of mechanisms that underpin chromosomal change. In an attempt to shed light on the dynamics of mammalian genome evolution, we analyzed the distribution of syntenic blocks, evolutionary breakpoint regions, and evolutionary breakpoints taken from public databases available for seven eutherian species (mouse, rat, cattle, dog, pig, cat, and horse) and the chicken, and examined these for correspondence with human fragile sites and tandem repeats. Results Our results confirm previous investigations that showed the presence of chromosomal regions in the human genome that have been repeatedly used as illustrated by a high breakpoint accumulation in certain chromosomes and chromosomal bands. We show, however, that there is a striking correspondence between fragile site location, the positions of evolutionary breakpoints, and the distribution of tandem repeats throughout the human genome, which similarly reflect a non-uniform pattern of occurrence. Conclusion These observations provide further evidence that certain chromosomal regions in the human genome have been repeatedly used in the evolutionary process. As a consequence, the genome is a composite of fragile regions prone to reorganization that have been conserved in different lineages, and genomic tracts that do not exhibit the same levels of evolutionary plasticity. PMID:17156441

  9. A High-Resolution Comparative Chromosome Map of Cricetus cricetus and Peromyscus eremicus Reveals the Involvement of Constitutive Heterochromatin in Breakpoint Regions.

    PubMed

    Vieira-da-Silva, Ana; Louzada, Sandra; Adega, Filomena; Chaves, Raquel

    2015-01-01

    Compared to humans and other mammals, rodent genomes, specifically Muroidea species, underwent intense chromosome reshuffling in which many complex structural rearrangements occurred. This fact makes them preferential animal models for studying the process of karyotype evolution. Here, we present the first combined chromosome comparative maps between 2 Cricetidae species, Cricetus cricetus and Peromyscus eremicus, and the index species Mus musculus and Rattus norvegicus. Comparative chromosome painting was done using mouse and rat paint probes together with in silico analysis from the Ensembl genome browser database. Hereby, evolutionary events (inter- and intrachromosomal rearrangements) that occurred in C. cricetus and P. eremicus since the putative ancestral Muroidea genome could be inferred, and evolutionary breakpoint regions could be detected. A colocalization of constitutive heterochromatin and evolutionary breakpoint regions in each genome was observed. Our results suggest the involvement of constitutive heterochromatin in karyotype restructuring of these species, despite the different levels of conservation of the C. cricetus (derivative) and P. eremicus (conserved) genomes. © 2015 S. Karger AG, Basel.

  10. Comparative functional pan-genome analyses to build connections between genomic dynamics and phenotypic evolution in polycyclic aromatic hydrocarbon metabolism in the genus Mycobacterium.

    PubMed

    Kweon, Ohgew; Kim, Seong-Jae; Blom, Jochen; Kim, Sung-Kwan; Kim, Bong-Soo; Baek, Dong-Heon; Park, Su Inn; Sutherland, John B; Cerniglia, Carl E

    2015-02-14

    The bacterial genus Mycobacterium is of great interest in the medical and biotechnological fields. Despite a flood of genome sequencing and functional genomics data, significant gaps in knowledge between genome and phenome seriously hinder efforts toward the treatment of mycobacterial diseases and practical biotechnological applications. In this study, we propose the use of systematic, comparative functional pan-genomic analysis to build connections between genomic dynamics and phenotypic evolution in polycyclic aromatic hydrocarbon (PAH) metabolism in the genus Mycobacterium. Phylogenetic, phenotypic, and genomic information for 27 completely genome-sequenced mycobacteria was systematically integrated to reconstruct a mycobacterial phenotype network (MPN) with a pan-genomic concept at a network level. In the MPN, mycobacterial phenotypes show typical scale-free relationships. PAH degradation is an isolated phenotype with the lowest connection degree, consistent with phylogenetic and environmental isolation of PAH degraders. A series of functional pan-genomic analyses provide conserved and unique types of genomic evidence for strong epistatic and pleiotropic impacts on evolutionary trajectories of the PAH-degrading phenotype. Under strong natural selection, the detailed gene gain/loss patterns from horizontal gene transfer (HGT)/deletion events hypothesize a plausible evolutionary path, an epistasis-based birth and pleiotropy-dependent death, for PAH metabolism in the genus Mycobacterium. This study generated a practical mycobacterial compendium of phenotypic and genomic changes, focusing on the PAH-degrading phenotype, with a pan-genomic perspective of the evolutionary events and the environmental challenges. Our findings suggest that when selection acts on PAH metabolism, only a small fraction of possible trajectories is likely to be observed, owing mainly to a combination of the ambiguous phenotypic effects of PAHs and the corresponding pleiotropy- and epistasis-dependent evolutionary adaptation. Evolutionary constraints on the selection of trajectories, like those seen in PAH-degrading phenotypes, are likely to apply to the evolution of other phenotypes in the genus Mycobacterium.

  11. Sequence Search and Comparative Genomic Analysis of SUMO-Activating Enzymes Using CoGe.

    PubMed

    Carretero-Paulet, Lorenzo; Albert, Victor A

    2016-01-01

    The growing number of genome sequences completed during the last few years has made necessary the development of bioinformatics tools for the easy access and retrieval of sequence data, as well as for downstream comparative genomic analyses. Some of these are implemented as online platforms that integrate genomic data produced by different genome sequencing initiatives with data mining tools as well as various comparative genomic and evolutionary analysis possibilities.Here, we use the online comparative genomics platform CoGe ( http://www.genomevolution.org/coge/ ) (Lyons and Freeling. Plant J 53:661-673, 2008; Tang and Lyons. Front Plant Sci 3:172, 2012) (1) to retrieve the entire complement of orthologous and paralogous genes belonging to the SUMO-Activating Enzymes 1 (SAE1) gene family from a set of species representative of the Brassicaceae plant eudicot family with genomes fully sequenced, and (2) to investigate the history, timing, and molecular mechanisms of the gene duplications driving the evolutionary expansion and functional diversification of the SAE1 family in Brassicaceae.

  12. Carnivore-specific SINEs (Can-SINEs): distribution, evolution, and genomic impact.

    PubMed

    Walters-Conte, Kathryn B; Johnson, Diana L E; Allard, Marc W; Pecon-Slattery, Jill

    2011-01-01

    Short interspersed nuclear elements (SINEs) are a type of class 1 transposable element (retrotransposon) with features that allow investigators to resolve evolutionary relationships between populations and species while providing insight into genome composition and function. Characterization of a Carnivora-specific SINE family, Can-SINEs, has, has aided comparative genomic studies by providing rare genomic changes, and neutral sequence variants often needed to resolve difficult evolutionary questions. In addition, Can-SINEs constitute a significant source of functional diversity with Carnivora. Publication of the whole-genome sequence of domestic dog, domestic cat, and giant panda serves as a valuable resource in comparative genomic inferences gleaned from Can-SINEs. In anticipation of forthcoming studies bolstered by new genomic data, this review describes the discovery and characterization of Can-SINE motifs as well as describes composition, distribution, and effect on genome function. As the contribution of noncoding sequences to genomic diversity becomes more apparent, SINEs and other transposable elements will play an increasingly large role in mammalian comparative genomics.

  13. Carnivore-Specific SINEs (Can-SINEs): Distribution, Evolution, and Genomic Impact

    PubMed Central

    Johnson, Diana L.E.; Allard, Marc W.; Pecon-Slattery, Jill

    2011-01-01

    Short interspersed nuclear elements (SINEs) are a type of class 1 transposable element (retrotransposon) with features that allow investigators to resolve evolutionary relationships between populations and species while providing insight into genome composition and function. Characterization of a Carnivora-specific SINE family, Can-SINEs, has, has aided comparative genomic studies by providing rare genomic changes, and neutral sequence variants often needed to resolve difficult evolutionary questions. In addition, Can-SINEs constitute a significant source of functional diversity with Carnivora. Publication of the whole-genome sequence of domestic dog, domestic cat, and giant panda serves as a valuable resource in comparative genomic inferences gleaned from Can-SINEs. In anticipation of forthcoming studies bolstered by new genomic data, this review describes the discovery and characterization of Can-SINE motifs as well as describes composition, distribution, and effect on genome function. As the contribution of noncoding sequences to genomic diversity becomes more apparent, SINEs and other transposable elements will play an increasingly large role in mammalian comparative genomics. PMID:21846743

  14. The genomic landscape of rapid, repeated evolutionary rescue from toxic pollution in wild fish

    USDA-ARS?s Scientific Manuscript database

    Here we describe evolutionary rescue from intense pollution via multiple modes of selection in killifish populations from 4 urban estuaries of the US eastern seaboard. Comparative transcriptomics and analysis of 384 whole genome sequences show that the functioning of a receptor-based signaling pathw...

  15. EGenBio: A Data Management System for Evolutionary Genomics and Biodiversity

    PubMed Central

    Nahum, Laila A; Reynolds, Matthew T; Wang, Zhengyuan O; Faith, Jeremiah J; Jonna, Rahul; Jiang, Zhi J; Meyer, Thomas J; Pollock, David D

    2006-01-01

    Background Evolutionary genomics requires management and filtering of large numbers of diverse genomic sequences for accurate analysis and inference on evolutionary processes of genomic and functional change. We developed Evolutionary Genomics and Biodiversity (EGenBio; ) to begin to address this. Description EGenBio is a system for manipulation and filtering of large numbers of sequences, integrating curated sequence alignments and phylogenetic trees, managing evolutionary analyses, and visualizing their output. EGenBio is organized into three conceptual divisions, Evolution, Genomics, and Biodiversity. The Genomics division includes tools for selecting pre-aligned sequences from different genes and species, and for modifying and filtering these alignments for further analysis. Species searches are handled through queries that can be modified based on a tree-based navigation system and saved. The Biodiversity division contains tools for analyzing individual sequences or sequence alignments, whereas the Evolution division contains tools involving phylogenetic trees. Alignments are annotated with analytical results and modification history using our PRAED format. A miscellaneous Tools section and Help framework are also available. EGenBio was developed around our comparative genomic research and a prototype database of mtDNA genomes. It utilizes MySQL-relational databases and dynamic page generation, and calls numerous custom programs. Conclusion EGenBio was designed to serve as a platform for tools and resources to ease combined analysis in evolution, genomics, and biodiversity. PMID:17118150

  16. Gramene database: navigating plant comparative genomics resources

    USDA-ARS?s Scientific Manuscript database

    Gramene (http://www.gramene.org) is an online, open source, curated resource for plant comparative genomics and pathway analysis designed to support researchers working in plant genomics, breeding, evolutionary biology, system biology, and metabolic engineering. It exploits phylogenetic relationship...

  17. The mitochondrial genome of the ascalaphid owlfly Libelloides macaronius and comparative evolutionary mitochondriomics of neuropterid insects

    PubMed Central

    2011-01-01

    Background The insect order Neuroptera encompasses more than 5,700 described species. To date, only three neuropteran mitochondrial genomes have been fully and one partly sequenced. Current knowledge on neuropteran mitochondrial genomes is limited, and new data are strongly required. In the present work, the mitochondrial genome of the ascalaphid owlfly Libelloides macaronius is described and compared with the known neuropterid mitochondrial genomes: Megaloptera, Neuroptera and Raphidioptera. These analyses are further extended to other endopterygotan orders. Results The mitochondrial genome of L. macaronius is a circular molecule 15,890 bp long. It includes the entire set of 37 genes usually present in animal mitochondrial genomes. The gene order of this newly sequenced genome is unique among Neuroptera and differs from the ancestral type of insects in the translocation of trnC. The L. macaronius genome shows the lowest A+T content (74.50%) among known neuropterid genomes. Protein-coding genes possess the typical mitochondrial start codons, except for cox1, which has an unusual ACG. Comparisons among endopterygotan mitochondrial genomes showed that A+T content and AT/GC-skews exhibit a broad range of variation among 84 analyzed taxa. Comparative analyses showed that neuropterid mitochondrial protein-coding genes experienced complex evolutionary histories, involving features ranging from codon usage to rate of substitution, that make them potential markers for population genetics/phylogenetics studies at different taxonomic ranks. The 22 tRNAs show variable substitution patterns in Neuropterida, with higher sequence conservation in genes located on the α strand. Inferred secondary structures for neuropterid rrnS and rrnL genes largely agree with those known for other insects. For the first time, a model is provided for domain I of an insect rrnL. The control region in Neuropterida, as in other insects, is fast-evolving genomic region, characterized by AT-rich motifs. Conclusions The new genome shares many features with known neuropteran genomes but differs in its low A+T content. Comparative analysis of neuropterid mitochondrial genes showed that they experienced distinct evolutionary patterns. Both tRNA families and ribosomal RNAs show composite substitution pathways. The neuropterid mitochondrial genome is characterized by a complex evolutionary history. PMID:21569260

  18. Evolutionary biology through the lens of budding yeast comparative genomics.

    PubMed

    Marsit, Souhir; Leducq, Jean-Baptiste; Durand, Éléonore; Marchant, Axelle; Filteau, Marie; Landry, Christian R

    2017-10-01

    The budding yeast Saccharomyces cerevisiae is a highly advanced model system for studying genetics, cell biology and systems biology. Over the past decade, the application of high-throughput sequencing technologies to this species has contributed to this yeast also becoming an important model for evolutionary genomics. Indeed, comparative genomic analyses of laboratory, wild and domesticated yeast populations are providing unprecedented detail about many of the processes that govern evolution, including long-term processes, such as reproductive isolation and speciation, and short-term processes, such as adaptation to natural and domestication-related environments.

  19. Darwinian evolution in the light of genomics

    PubMed Central

    Koonin, Eugene V.

    2009-01-01

    Comparative genomics and systems biology offer unprecedented opportunities for testing central tenets of evolutionary biology formulated by Darwin in the Origin of Species in 1859 and expanded in the Modern Synthesis 100 years later. Evolutionary-genomic studies show that natural selection is only one of the forces that shape genome evolution and is not quantitatively dominant, whereas non-adaptive processes are much more prominent than previously suspected. Major contributions of horizontal gene transfer and diverse selfish genetic elements to genome evolution undermine the Tree of Life concept. An adequate depiction of evolution requires the more complex concept of a network or ‘forest’ of life. There is no consistent tendency of evolution towards increased genomic complexity, and when complexity increases, this appears to be a non-adaptive consequence of evolution under weak purifying selection rather than an adaptation. Several universals of genome evolution were discovered including the invariant distributions of evolutionary rates among orthologous genes from diverse genomes and of paralogous gene family sizes, and the negative correlation between gene expression level and sequence evolution rate. Simple, non-adaptive models of evolution explain some of these universals, suggesting that a new synthesis of evolutionary biology might become feasible in a not so remote future. PMID:19213802

  20. Evolutionary dynamics of retrotransposons assessed by high-throughput sequencing in wild relatives of wheat.

    PubMed

    Senerchia, Natacha; Wicker, Thomas; Felber, François; Parisod, Christian

    2013-01-01

    Transposable elements (TEs) represent a major fraction of plant genomes and drive their evolution. An improved understanding of genome evolution requires the dynamics of a large number of TE families to be considered. We put forward an approach bypassing the required step of a complete reference genome to assess the evolutionary trajectories of high copy number TE families from genome snapshot with high-throughput sequencing. Low coverage sequencing of the complex genomes of Aegilops cylindrica and Ae. geniculata using 454 identified more than 70% of the sequences as known TEs, mainly long terminal repeat (LTR) retrotransposons. Comparing the abundance of reads as well as patterns of sequence diversity and divergence within and among genomes assessed the dynamics of 44 major LTR retrotransposon families of the 165 identified. In particular, molecular population genetics on individual TE copies distinguished recently active from quiescent families and highlighted different evolutionary trajectories of retrotransposons among related species. This work presents a suite of tools suitable for current sequencing data, allowing to address the genome-wide evolutionary dynamics of TEs at the family level and advancing our understanding of the evolution of nonmodel genomes.

  1. Estimating true evolutionary distances under the DCJ model.

    PubMed

    Lin, Yu; Moret, Bernard M E

    2008-07-01

    Modern techniques can yield the ordering and strandedness of genes on each chromosome of a genome; such data already exists for hundreds of organisms. The evolutionary mechanisms through which the set of the genes of an organism is altered and reordered are of great interest to systematists, evolutionary biologists, comparative genomicists and biomedical researchers. Perhaps the most basic concept in this area is that of evolutionary distance between two genomes: under a given model of genomic evolution, how many events most likely took place to account for the difference between the two genomes? We present a method to estimate the true evolutionary distance between two genomes under the 'double-cut-and-join' (DCJ) model of genome rearrangement, a model under which a single multichromosomal operation accounts for all genomic rearrangement events: inversion, transposition, translocation, block interchange and chromosomal fusion and fission. Our method relies on a simple structural characterization of a genome pair and is both analytically and computationally tractable. We provide analytical results to describe the asymptotic behavior of genomes under the DCJ model, as well as experimental results on a wide variety of genome structures to exemplify the very high accuracy (and low variance) of our estimator. Our results provide a tool for accurate phylogenetic reconstruction from multichromosomal gene rearrangement data as well as a theoretical basis for refinements of the DCJ model to account for biological constraints. All of our software is available in source form under GPL at http://lcbb.epfl.ch.

  2. Transcriptome sequencing reveals genome-wide variation in molecular evolutionary rate among ferns.

    PubMed

    Grusz, Amanda L; Rothfels, Carl J; Schuettpelz, Eric

    2016-08-30

    Transcriptomics in non-model plant systems has recently reached a point where the examination of nuclear genome-wide patterns in understudied groups is an achievable reality. This progress is especially notable in evolutionary studies of ferns, for which molecular resources to date have been derived primarily from the plastid genome. Here, we utilize transcriptome data in the first genome-wide comparative study of molecular evolutionary rate in ferns. We focus on the ecologically diverse family Pteridaceae, which comprises about 10 % of fern diversity and includes the enigmatic vittarioid ferns-an epiphytic, tropical lineage known for dramatically reduced morphologies and radically elongated phylogenetic branch lengths. Using expressed sequence data for 2091 loci, we perform pairwise comparisons of molecular evolutionary rate among 12 species spanning the three largest clades in the family and ask whether previously documented heterogeneity in plastid substitution rates is reflected in their nuclear genomes. We then inquire whether variation in evolutionary rate is being shaped by genes belonging to specific functional categories and test for differential patterns of selection. We find significant, genome-wide differences in evolutionary rate for vittarioid ferns relative to all other lineages within the Pteridaceae, but we recover few significant correlations between faster/slower vittarioid loci and known functional gene categories. We demonstrate that the faster rates characteristic of the vittarioid ferns are likely not driven by positive selection, nor are they unique to any particular type of nucleotide substitution. Our results reinforce recently reviewed mechanisms hypothesized to shape molecular evolutionary rates in vittarioid ferns and provide novel insight into substitution rate variation both within and among fern nuclear genomes.

  3. GenomicusPlants: a web resource to study genome evolution in flowering plants.

    PubMed

    Louis, Alexandra; Murat, Florent; Salse, Jérôme; Crollius, Hugues Roest

    2015-01-01

    Comparative genomics combined with phylogenetic reconstructions are powerful approaches to study the evolution of genes and genomes. However, the current rapid expansion of the volume of genomic information makes it increasingly difficult to interrogate, integrate and synthesize comparative genome data while taking into account the maximum breadth of information available. GenomicusPlants (http://www.genomicus.biologie.ens.fr/genomicus-plants) is an extension of the Genomicus webserver that addresses this issue by allowing users to explore flowering plant genomes in an intuitive way, across the broadest evolutionary scales. Extant genomes of 26 flowering plants can be analyzed, as well as 23 ancestral reconstructed genomes. Ancestral gene order provides a long-term chronological view of gene order evolution, greatly facilitating comparative genomics and evolutionary studies. Four main interfaces ('views') are available where: (i) PhyloView combines phylogenetic trees with comparisons of genomic loci across any number of genomes; (ii) AlignView projects loci of interest against all other genomes to visualize its topological conservation; (iii) MatrixView compares two genomes in a classical dotplot representation; and (iv) Karyoview visualizes chromosome karyotypes 'painted' with colours of another genome of interest. All four views are interconnected and benefit from many customizable features. © The Author 2014. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists.

  4. Natural Selection and Recombination Rate Variation Shape Nucleotide Polymorphism Across the Genomes of Three Related Populus Species

    PubMed Central

    Wang, Jing; Street, Nathaniel R.; Scofield, Douglas G.; Ingvarsson, Pär K.

    2016-01-01

    A central aim of evolutionary genomics is to identify the relative roles that various evolutionary forces have played in generating and shaping genetic variation within and among species. Here we use whole-genome resequencing data to characterize and compare genome-wide patterns of nucleotide polymorphism, site frequency spectrum, and population-scaled recombination rates in three species of Populus: Populus tremula, P. tremuloides, and P. trichocarpa. We find that P. tremuloides has the highest level of genome-wide variation, skewed allele frequencies, and population-scaled recombination rates, whereas P. trichocarpa harbors the lowest. Our findings highlight multiple lines of evidence suggesting that natural selection, due to both purifying and positive selection, has widely shaped patterns of nucleotide polymorphism at linked neutral sites in all three species. Differences in effective population sizes and rates of recombination largely explain the disparate magnitudes and signatures of linked selection that we observe among species. The present work provides the first phylogenetic comparative study on a genome-wide scale in forest trees. This information will also improve our ability to understand how various evolutionary forces have interacted to influence genome evolution among related species. PMID:26721855

  5. Natural Selection and Recombination Rate Variation Shape Nucleotide Polymorphism Across the Genomes of Three Related Populus Species.

    PubMed

    Wang, Jing; Street, Nathaniel R; Scofield, Douglas G; Ingvarsson, Pär K

    2016-03-01

    A central aim of evolutionary genomics is to identify the relative roles that various evolutionary forces have played in generating and shaping genetic variation within and among species. Here we use whole-genome resequencing data to characterize and compare genome-wide patterns of nucleotide polymorphism, site frequency spectrum, and population-scaled recombination rates in three species of Populus: Populus tremula, P. tremuloides, and P. trichocarpa. We find that P. tremuloides has the highest level of genome-wide variation, skewed allele frequencies, and population-scaled recombination rates, whereas P. trichocarpa harbors the lowest. Our findings highlight multiple lines of evidence suggesting that natural selection, due to both purifying and positive selection, has widely shaped patterns of nucleotide polymorphism at linked neutral sites in all three species. Differences in effective population sizes and rates of recombination largely explain the disparate magnitudes and signatures of linked selection that we observe among species. The present work provides the first phylogenetic comparative study on a genome-wide scale in forest trees. This information will also improve our ability to understand how various evolutionary forces have interacted to influence genome evolution among related species. Copyright © 2016 by the Genetics Society of America.

  6. Independent evolution of genomic characters during major metazoan transitions.

    PubMed

    Simakov, Oleg; Kawashima, Takeshi

    2017-07-15

    Metazoan evolution encompasses a vast evolutionary time scale spanning over 600 million years. Our ability to infer ancestral metazoan characters, both morphological and functional, is limited by our understanding of the nature and evolutionary dynamics of the underlying regulatory networks. Increasing coverage of metazoan genomes enables us to identify the evolutionary changes of the relevant genomic characters such as the loss or gain of coding sequences, gene duplications, micro- and macro-synteny, and non-coding element evolution in different lineages. In this review we describe recent advances in our understanding of ancestral metazoan coding and non-coding features, as deduced from genomic comparisons. Some genomic changes such as innovations in gene and linkage content occur at different rates across metazoan clades, suggesting some level of independence among genomic characters. While their contribution to biological innovation remains largely unclear, we review recent literature about certain genomic changes that do correlate with changes to specific developmental pathways and metazoan innovations. In particular, we discuss the origins of the recently described pharyngeal cluster which is conserved across deuterostome genomes, and highlight different genomic features that have contributed to the evolution of this group. We also assess our current capacity to infer ancestral metazoan states from gene models and comparative genomics tools and elaborate on the future directions of metazoan comparative genomics relevant to evo-devo studies. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.

  7. OrthoMaM v8: a database of orthologous exons and coding sequences for comparative genomics in mammals.

    PubMed

    Douzery, Emmanuel J P; Scornavacca, Celine; Romiguier, Jonathan; Belkhir, Khalid; Galtier, Nicolas; Delsuc, Frédéric; Ranwez, Vincent

    2014-07-01

    Comparative genomic studies extensively rely on alignments of orthologous sequences. Yet, selecting, gathering, and aligning orthologous exons and protein-coding sequences (CDS) that are relevant for a given evolutionary analysis can be a difficult and time-consuming task. In this context, we developed OrthoMaM, a database of ORTHOlogous MAmmalian Markers describing the evolutionary dynamics of orthologous genes in mammalian genomes using a phylogenetic framework. Since its first release in 2007, OrthoMaM has regularly evolved, not only to include newly available genomes but also to incorporate up-to-date software in its analytic pipeline. This eighth release integrates the 40 complete mammalian genomes available in Ensembl v73 and provides alignments, phylogenies, evolutionary descriptor information, and functional annotations for 13,404 single-copy orthologous CDS and 6,953 long exons. The graphical interface allows to easily explore OrthoMaM to identify markers with specific characteristics (e.g., taxa availability, alignment size, %G+C, evolutionary rate, chromosome location). It hence provides an efficient solution to sample preprocessed markers adapted to user-specific needs. OrthoMaM has proven to be a valuable resource for researchers interested in mammalian phylogenomics, evolutionary genomics, and has served as a source of benchmark empirical data sets in several methodological studies. OrthoMaM is available for browsing, query and complete or filtered downloads at http://www.orthomam.univ-montp2.fr/. © The Author 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  8. Genome-scale rates of evolutionary change in bacteria

    PubMed Central

    Duchêne, Sebastian; Holt, Kathryn E.; Weill, François-Xavier; Le Hello, Simon; Hawkey, Jane; Edwards, David J.; Fourment, Mathieu

    2016-01-01

    Estimating the rates at which bacterial genomes evolve is critical to understanding major evolutionary and ecological processes such as disease emergence, long-term host–pathogen associations and short-term transmission patterns. The surge in bacterial genomic data sets provides a new opportunity to estimate these rates and reveal the factors that shape bacterial evolutionary dynamics. For many organisms estimates of evolutionary rate display an inverse association with the time-scale over which the data are sampled. However, this relationship remains unexplored in bacteria due to the difficulty in estimating genome-wide evolutionary rates, which are impacted by the extent of temporal structure in the data and the prevalence of recombination. We collected 36 whole genome sequence data sets from 16 species of bacterial pathogens to systematically estimate and compare their evolutionary rates and assess the extent of temporal structure in the absence of recombination. The majority (28/36) of data sets possessed sufficient clock-like structure to robustly estimate evolutionary rates. However, in some species reliable estimates were not possible even with ‘ancient DNA’ data sampled over many centuries, suggesting that they evolve very slowly or that they display extensive rate variation among lineages. The robustly estimated evolutionary rates spanned several orders of magnitude, from approximately 10−5 to 10−8 nucleotide substitutions per site year−1. This variation was negatively associated with sampling time, with this relationship best described by an exponential decay curve. To avoid potential estimation biases, such time-dependency should be considered when inferring evolutionary time-scales in bacteria. PMID:28348834

  9. Genomic Analysis of Hepatitis B Virus Reveals Antigen State and Genotype as Sources of Evolutionary Rate Variation

    PubMed Central

    Harrison, Abby; Lemey, Philippe; Hurles, Matthew; Moyes, Chris; Horn, Susanne; Pryor, Jan; Malani, Joji; Supuri, Mathias; Masta, Andrew; Teriboriki, Burentau; Toatu, Tebuka; Penny, David; Rambaut, Andrew; Shapiro, Beth

    2011-01-01

    Hepatitis B virus (HBV) genomes are small, semi-double-stranded DNA circular genomes that contain alternating overlapping reading frames and replicate through an RNA intermediary phase. This complex biology has presented a challenge to estimating an evolutionary rate for HBV, leading to difficulties resolving the evolutionary and epidemiological history of the virus. Here, we re-examine rates of HBV evolution using a novel data set of 112 within-host, transmission history (pedigree) and among-host genomes isolated over 20 years from the indigenous peoples of the South Pacific, combined with 313 previously published HBV genomes. We employ Bayesian phylogenetic approaches to examine several potential causes and consequences of evolutionary rate variation in HBV. Our results reveal rate variation both between genotypes and across the genome, as well as strikingly slower rates when genomes are sampled in the Hepatitis B e antigen positive state, compared to the e antigen negative state. This Hepatitis B e antigen rate variation was found to be largely attributable to changes during the course of infection in the preCore and Core genes and their regulatory elements. PMID:21765983

  10. Comparative Transcriptomes and EVO-DEVO Studies Depending on Next Generation Sequencing.

    PubMed

    Liu, Tiancheng; Yu, Lin; Liu, Lei; Li, Hong; Li, Yixue

    2015-01-01

    High throughput technology has prompted the progressive omics studies, including genomics and transcriptomics. We have reviewed the improvement of comparative omic studies, which are attributed to the high throughput measurement of next generation sequencing technology. Comparative genomics have been successfully applied to evolution analysis while comparative transcriptomics are adopted in comparison of expression profile from two subjects by differential expression or differential coexpression, which enables their application in evolutionary developmental biology (EVO-DEVO) studies. EVO-DEVO studies focus on the evolutionary pressure affecting the morphogenesis of development and previous works have been conducted to illustrate the most conserved stages during embryonic development. Old measurements of these studies are based on the morphological similarity from macro view and new technology enables the micro detection of similarity in molecular mechanism. Evolutionary model of embryo development, which includes the "funnel-like" model and the "hourglass" model, has been evaluated by combination of these new comparative transcriptomic methods with prior comparative genomic information. Although the technology has promoted the EVO-DEVO studies into a new era, technological and material limitation still exist and further investigations require more subtle study design and procedure.

  11. Comparative Analysis of Proteomes and Functionomes Provides Insights into Origins of Cellular Diversification

    PubMed Central

    Caetano-Anollés, Gustavo

    2013-01-01

    Reconstructing the evolutionary history of modern species is a difficult problem complicated by the conceptual and technical limitations of phylogenetic tree building methods. Here, we propose a comparative proteomic and functionomic inferential framework for genome evolution that allows resolving the tripartite division of cells and sketching their history. Evolutionary inferences were derived from the spread of conserved molecular features, such as molecular structures and functions, in the proteomes and functionomes of contemporary organisms. Patterns of use and reuse of these traits yielded significant insights into the origins of cellular diversification. Results uncovered an unprecedented strong evolutionary association between Bacteria and Eukarya while revealing marked evolutionary reductive tendencies in the archaeal genomic repertoires. The effects of nonvertical evolutionary processes (e.g., HGT, convergent evolution) were found to be limited while reductive evolution and molecular innovation appeared to be prevalent during the evolution of cells. Our study revealed a strong vertical trace in the history of proteins and associated molecular functions, which was reliably recovered using the comparative genomics approach. The trace supported the existence of a stem line of descent and the very early appearance of Archaea as a diversified superkingdom, but failed to uncover a hidden canonical pattern in which Bacteria was the first superkingdom to deploy superkingdom-specific structures and functions. PMID:24492748

  12. Genes with stable DNA methylation levels show higher evolutionary conservation than genes with fluctuant DNA methylation levels.

    PubMed

    Zhang, Ruijie; Lv, Wenhua; Luan, Meiwei; Zheng, Jiajia; Shi, Miao; Zhu, Hongjie; Li, Jin; Lv, Hongchao; Zhang, Mingming; Shang, Zhenwei; Duan, Lian; Jiang, Yongshuai

    2015-11-24

    Different human genes often exhibit different degrees of stability in their DNA methylation levels between tissues, samples or cell types. This may be related to the evolution of human genome. Thus, we compared the evolutionary conservation between two types of genes: genes with stable DNA methylation levels (SM genes) and genes with fluctuant DNA methylation levels (FM genes). For long-term evolutionary characteristics between species, we compared the percentage of the orthologous genes, evolutionary rate dn/ds and protein sequence identity. We found that the SM genes had greater percentages of the orthologous genes, lower dn/ds, and higher protein sequence identities in all the 21 species. These results indicated that the SM genes were more evolutionarily conserved than the FM genes. For short-term evolutionary characteristics among human populations, we compared the single nucleotide polymorphism (SNP) density, and the linkage disequilibrium (LD) degree in HapMap populations and 1000 genomes project populations. We observed that the SM genes had lower SNP densities, and higher degrees of LD in all the 11 HapMap populations and 13 1000 genomes project populations. These results mean that the SM genes had more stable chromosome genetic structures, and were more conserved than the FM genes.

  13. Comparative genomic de-convolution of the cotton genome revealed a decaploid ancestor and widespread chromosomal fractionation.

    PubMed

    Wang, Xiyin; Guo, Hui; Wang, Jinpeng; Lei, Tianyu; Liu, Tao; Wang, Zhenyi; Li, Yuxian; Lee, Tae-Ho; Li, Jingping; Tang, Haibao; Jin, Dianchuan; Paterson, Andrew H

    2016-02-01

    The 'apparently' simple genomes of many angiosperms mask complex evolutionary histories. The reference genome sequence for cotton (Gossypium spp.) revealed a ploidy change of a complexity unprecedented to date, indeed that could not be distinguished as to its exact dosage. Herein, by developing several comparative, computational and statistical approaches, we revealed a 5× multiplication in the cotton lineage of an ancestral genome common to cotton and cacao, and proposed evolutionary models to show how such a decaploid ancestor formed. The c. 70% gene loss necessary to bring the ancestral decaploid to its current gene count appears to fit an approximate geometrical model; that is, although many genes may be lost by single-gene deletion events, some may be lost in groups of consecutive genes. Gene loss following cotton decaploidy has largely just reduced gene copy numbers of some homologous groups. We designed a novel approach to deconvolute layers of chromosome homology, providing definitive information on gene orthology and paralogy across broad evolutionary distances, both of fundamental value and serving as an important platform to support further studies in and beyond cotton and genomics communities. No claim to original US government works. New Phytologist © 2015 New Phytologist Trust.

  14. Evolution of the core and pan-genome of Streptococcus: positive selection, recombination, and genome composition

    PubMed Central

    Lefébure, Tristan; Stanhope, Michael J

    2007-01-01

    Background The genus Streptococcus is one of the most diverse and important human and agricultural pathogens. This study employs comparative evolutionary analyses of 26 Streptococcus genomes to yield an improved understanding of the relative roles of recombination and positive selection in pathogen adaptation to their hosts. Results Streptococcus genomes exhibit extreme levels of evolutionary plasticity, with high levels of gene gain and loss during species and strain evolution. S. agalactiae has a large pan-genome, with little recombination in its core-genome, while S. pyogenes has a smaller pan-genome and much more recombination of its core-genome, perhaps reflecting the greater habitat, and gene pool, diversity for S. agalactiae compared to S. pyogenes. Core-genome recombination was evident in all lineages (18% to 37% of the core-genome judged to be recombinant), while positive selection was mainly observed during species differentiation (from 11% to 34% of the core-genome). Positive selection pressure was unevenly distributed across lineages and biochemical main role categories. S. suis was the lineage with the greatest level of positive selection pressure, the largest number of unique loci selected, and the largest amount of gene gain and loss. Conclusion Recombination is an important evolutionary force in shaping Streptococcus genomes, not only in the acquisition of significant portions of the genome as lineage specific loci, but also in facilitating rapid evolution of the core-genome. Positive selection, although undoubtedly a slower process, has nonetheless played an important role in adaptation of the core-genome of different Streptococcus species to different hosts. PMID:17475002

  15. Targeted sequencing for high-resolution evolutionary analyses following genome duplication in salmonid fish: Proof of concept for key components of the insulin-like growth factor axis.

    PubMed

    Lappin, Fiona M; Shaw, Rebecca L; Macqueen, Daniel J

    2016-12-01

    High-throughput sequencing has revolutionised comparative and evolutionary genome biology. It has now become relatively commonplace to generate multiple genomes and/or transcriptomes to characterize the evolution of large taxonomic groups of interest. Nevertheless, such efforts may be unsuited to some research questions or remain beyond the scope of some research groups. Here we show that targeted high-throughput sequencing offers a viable alternative to study genome evolution across a vertebrate family of great scientific interest. Specifically, we exploited sequence capture and Illumina sequencing to characterize the evolution of key components from the insulin-like growth (IGF) signalling axis of salmonid fish at unprecedented phylogenetic resolution. The IGF axis represents a central governor of vertebrate growth and its core components were expanded by whole genome duplication in the salmonid ancestor ~95Ma. Using RNA baits synthesised to genes encoding the complete family of IGF binding proteins (IGFBP) and an IGF hormone (IGF2), we captured, sequenced and assembled orthologous and paralogous exons from species representing all ten salmonid genera. This approach generated 299 novel sequences, most as complete or near-complete protein-coding sequences. Phylogenetic analyses confirmed congruent evolutionary histories for all nineteen recognized salmonid IGFBP family members and identified novel salmonid-specific IGF2 paralogues. Moreover, we reconstructed the evolution of duplicated IGF axis paralogues across a replete salmonid phylogeny, revealing complex historic selection regimes - both ancestral to salmonids and lineage-restricted - that frequently involved asymmetric paralogue divergence under positive and/or relaxed purifying selection. Our findings add to an emerging literature highlighting diverse applications for targeted sequencing in comparative-evolutionary genomics. We also set out a viable approach to obtain large sets of nuclear genes for any member of the salmonid family, which should enable insights into the evolutionary role of whole genome duplication before additional nuclear genome sequences become available. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.

  16. Whole-genome sequence of the Tibetan frog Nanorana parkeri and the comparative evolution of tetrapod genomes.

    PubMed

    Sun, Yan-Bo; Xiong, Zi-Jun; Xiang, Xue-Yan; Liu, Shi-Ping; Zhou, Wei-Wei; Tu, Xiao-Long; Zhong, Li; Wang, Lu; Wu, Dong-Dong; Zhang, Bao-Lin; Zhu, Chun-Ling; Yang, Min-Min; Chen, Hong-Man; Li, Fang; Zhou, Long; Feng, Shao-Hong; Huang, Chao; Zhang, Guo-Jie; Irwin, David; Hillis, David M; Murphy, Robert W; Yang, Huan-Ming; Che, Jing; Wang, Jun; Zhang, Ya-Ping

    2015-03-17

    The development of efficient sequencing techniques has resulted in large numbers of genomes being available for evolutionary studies. However, only one genome is available for all amphibians, that of Xenopus tropicalis, which is distantly related from the majority of frogs. More than 96% of frogs belong to the Neobatrachia, and no genome exists for this group. This dearth of amphibian genomes greatly restricts genomic studies of amphibians and, more generally, our understanding of tetrapod genome evolution. To fill this gap, we provide the de novo genome of a Tibetan Plateau frog, Nanorana parkeri, and compare it to that of X. tropicalis and other vertebrates. This genome encodes more than 20,000 protein-coding genes, a number similar to that of Xenopus. Although the genome size of Nanorana is considerably larger than that of Xenopus (2.3 vs. 1.5 Gb), most of the difference is due to the respective number of transposable elements in the two genomes. The two frogs exhibit considerable conserved whole-genome synteny despite having diverged approximately 266 Ma, indicating a slow rate of DNA structural evolution in anurans. Multigenome synteny blocks further show that amphibians have fewer interchromosomal rearrangements than mammals but have a comparable rate of intrachromosomal rearrangements. Our analysis also identifies 11 Mb of anuran-specific highly conserved elements that will be useful for comparative genomic analyses of frogs. The Nanorana genome offers an improved understanding of evolution of tetrapod genomes and also provides a genomic reference for other evolutionary studies.

  17. Evolutionary Genomics of Genes Involved in Olfactory Behavior in the Drosophila melanogaster Species Group

    PubMed Central

    Lavagnino, Nicolás; Serra, François; Arbiza, Leonardo; Dopazo, Hernán; Hasson, Esteban

    2012-01-01

    Previous comparative genomic studies of genes involved in olfactory behavior in Drosophila focused only on particular gene families such as odorant receptor and/or odorant binding proteins. However, olfactory behavior has a complex genetic architecture that is orchestrated by many interacting genes. In this paper, we present a comparative genomic study of olfactory behavior in Drosophila including an extended set of genes known to affect olfactory behavior. We took advantage of the recent burst of whole genome sequences and the development of powerful statistical tools to analyze genomic data and test evolutionary and functional hypotheses of olfactory genes in the six species of the Drosophila melanogaster species group for which whole genome sequences are available. Our study reveals widespread purifying selection and limited incidence of positive selection on olfactory genes. We show that the pace of evolution of olfactory genes is mostly independent of the life cycle stage, and of the number of life cycle stages, in which they participate in olfaction. However, we detected a relationship between evolutionary rates and the position that the gene products occupy in the olfactory system, genes occupying central positions tend to be more constrained than peripheral genes. Finally, we demonstrate that specialization to one host does not seem to be associated with bursts of adaptive evolution in olfactory genes in D. sechellia and D. erecta, the two specialists species analyzed, but rather different lineages have idiosyncratic evolutionary histories in which both historical and ecological factors have been involved. PMID:22346339

  18. Cocoa/Cotton Comparative Genomics

    USDA-ARS?s Scientific Manuscript database

    With genome sequence from two members of the Malvaceae family recently made available, we are exploring syntenic relationships, gene content, and evolutionary trajectories between the cacao and cotton genomes. An assembly of cacao (Theobroma cacao) using Illumina and 454 sequence technology yielded ...

  19. Comparative Transcriptomics of Strawberries (Fragaria spp.) Provides Insights into Evolutionary Patterns.

    PubMed

    Qiao, Qin; Xue, Li; Wang, Qia; Sun, Hang; Zhong, Yang; Huang, Jinling; Lei, Jiajun; Zhang, Ticao

    2016-01-01

    Multiple closely related species with genomic sequences provide an ideal system for studies on comparative and evolutionary genomics, as well as the mechanism of speciation. The whole genome sequences of six strawberry species ( Fragaria spp.) have been released, which provide one of the richest genomic resources of any plant genus. In this study, we first generated seven transcriptome sequences of Fragaria species de novo , with a total of 48,557-82,537 unigenes per species. Combined with 13 other species genomes in Rosales, we reconstructed a phylogenetic tree at the genomic level. The phylogenic tree shows that Fragaria closed grouped with Rubus and the Fragaria clade is divided into three subclades. East Asian species appeared in every subclade, suggesting that the genus originated in this area at ∼7.99 Mya. Four species found in mountains of Southwest China originated at ∼3.98 Mya, suggesting that rapid speciation occurred to adapt to changing environments following the uplift of the Qinghai-Tibet Plateau. Moreover, we identified 510 very significantly positively selected genes in the cultivated species F . × ananassa genome. This set of genes was enriched in functions related to specific agronomic traits, such as carbon metabolism and plant hormone signal transduction processes, which are directly related to fruit quality and flavor. These findings illustrate comprehensive evolutionary patterns in Fragaria and the genetic basis of fruit domestication of cultivated strawberry at the genomic/transcriptomic level.

  20. Comparative genomics of the bacterial genus Streptococcus illuminates evolutionary implications of species groups.

    PubMed

    Gao, Xiao-Yang; Zhi, Xiao-Yang; Li, Hong-Wei; Klenk, Hans-Peter; Li, Wen-Jun

    2014-01-01

    Members of the genus Streptococcus within the phylum Firmicutes are among the most diverse and significant zoonotic pathogens. This genus has gone through considerable taxonomic revision due to increasing improvements of chemotaxonomic approaches, DNA hybridization and 16S rRNA gene sequencing. It is proposed to place the majority of streptococci into "species groups". However, the evolutionary implications of species groups are not clear presently. We use comparative genomic approaches to yield a better understanding of the evolution of Streptococcus through genome dynamics, population structure, phylogenies and virulence factor distribution of species groups. Genome dynamics analyses indicate that the pan-genome size increases with the addition of newly sequenced strains, while the core genome size decreases with sequential addition at the genus level and species group level. Population structure analysis reveals two distinct lineages, one including Pyogenic, Bovis, Mutans and Salivarius groups, and the other including Mitis, Anginosus and Unknown groups. Phylogenetic dendrograms show that species within the same species group cluster together, and infer two main clades in accordance with population structure analysis. Distribution of streptococcal virulence factors has no obvious patterns among the species groups; however, the evolution of some common virulence factors is congruous with the evolution of species groups, according to phylogenetic inference. We suggest that the proposed streptococcal species groups are reasonable from the viewpoints of comparative genomics; evolution of the genus is congruent with the individual evolutionary trajectories of different species groups.

  1. Comparative Genomics of the Bacterial Genus Streptococcus Illuminates Evolutionary Implications of Species Groups

    PubMed Central

    Gao, Xiao-Yang; Zhi, Xiao-Yang; Li, Hong-Wei; Klenk, Hans-Peter; Li, Wen-Jun

    2014-01-01

    Members of the genus Streptococcus within the phylum Firmicutes are among the most diverse and significant zoonotic pathogens. This genus has gone through considerable taxonomic revision due to increasing improvements of chemotaxonomic approaches, DNA hybridization and 16S rRNA gene sequencing. It is proposed to place the majority of streptococci into “species groups”. However, the evolutionary implications of species groups are not clear presently. We use comparative genomic approaches to yield a better understanding of the evolution of Streptococcus through genome dynamics, population structure, phylogenies and virulence factor distribution of species groups. Genome dynamics analyses indicate that the pan-genome size increases with the addition of newly sequenced strains, while the core genome size decreases with sequential addition at the genus level and species group level. Population structure analysis reveals two distinct lineages, one including Pyogenic, Bovis, Mutans and Salivarius groups, and the other including Mitis, Anginosus and Unknown groups. Phylogenetic dendrograms show that species within the same species group cluster together, and infer two main clades in accordance with population structure analysis. Distribution of streptococcal virulence factors has no obvious patterns among the species groups; however, the evolution of some common virulence factors is congruous with the evolution of species groups, according to phylogenetic inference. We suggest that the proposed streptococcal species groups are reasonable from the viewpoints of comparative genomics; evolution of the genus is congruent with the individual evolutionary trajectories of different species groups. PMID:24977706

  2. Comparative genomics reveals insights into avian genome evolution and adaptation

    PubMed Central

    Zhang, Guojie; Li, Cai; Li, Qiye; Li, Bo; Larkin, Denis M.; Lee, Chul; Storz, Jay F.; Antunes, Agostinho; Greenwold, Matthew J.; Meredith, Robert W.; Ödeen, Anders; Cui, Jie; Zhou, Qi; Xu, Luohao; Pan, Hailin; Wang, Zongji; Jin, Lijun; Zhang, Pei; Hu, Haofu; Yang, Wei; Hu, Jiang; Xiao, Jin; Yang, Zhikai; Liu, Yang; Xie, Qiaolin; Yu, Hao; Lian, Jinmin; Wen, Ping; Zhang, Fang; Li, Hui; Zeng, Yongli; Xiong, Zijun; Liu, Shiping; Zhou, Long; Huang, Zhiyong; An, Na; Wang, Jie; Zheng, Qiumei; Xiong, Yingqi; Wang, Guangbiao; Wang, Bo; Wang, Jingjing; Fan, Yu; da Fonseca, Rute R.; Alfaro-Núñez, Alonzo; Schubert, Mikkel; Orlando, Ludovic; Mourier, Tobias; Howard, Jason T.; Ganapathy, Ganeshkumar; Pfenning, Andreas; Whitney, Osceola; Rivas, Miriam V.; Hara, Erina; Smith, Julia; Farré, Marta; Narayan, Jitendra; Slavov, Gancho; Romanov, Michael N; Borges, Rui; Machado, João Paulo; Khan, Imran; Springer, Mark S.; Gatesy, John; Hoffmann, Federico G.; Opazo, Juan C.; Håstad, Olle; Sawyer, Roger H.; Kim, Heebal; Kim, Kyu-Won; Kim, Hyeon Jeong; Cho, Seoae; Li, Ning; Huang, Yinhua; Bruford, Michael W.; Zhan, Xiangjiang; Dixon, Andrew; Bertelsen, Mads F.; Derryberry, Elizabeth; Warren, Wesley; Wilson, Richard K; Li, Shengbin; Ray, David A.; Green, Richard E.; O’Brien, Stephen J.; Griffin, Darren; Johnson, Warren E.; Haussler, David; Ryder, Oliver A.; Willerslev, Eske; Graves, Gary R.; Alström, Per; Fjeldså, Jon; Mindell, David P.; Edwards, Scott V.; Braun, Edward L.; Rahbek, Carsten; Burt, David W.; Houde, Peter; Zhang, Yong; Yang, Huanming; Wang, Jian; Jarvis, Erich D.; Gilbert, M. Thomas P.; Wang, Jun

    2015-01-01

    Birds are the most species-rich class of tetrapod vertebrates and have wide relevance across many research fields. We explored bird macroevolution using full genomes from 48 avian species representing all major extant clades. The avian genome is principally characterized by its constrained size, which predominantly arose because of lineage-specific erosion of repetitive elements, large segmental deletions, and gene loss. Avian genomes furthermore show a remarkably high degree of evolutionary stasis at the levels of nucleotide sequence, gene synteny, and chromosomal structure. Despite this pattern of conservation, we detected many non-neutral evolutionary changes in protein-coding genes and noncoding regions. These analyses reveal that pan-avian genomic diversity covaries with adaptations to different lifestyles and convergent evolution of traits. PMID:25504712

  3. Comparative mitochondrial genomics of snakes: extraordinary substitution rate dynamics and functionality of the duplicate control region

    PubMed Central

    Jiang, Zhi J; Castoe, Todd A; Austin, Christopher C; Burbrink, Frank T; Herron, Matthew D; McGuire, Jimmy A; Parkinson, Christopher L; Pollock, David D

    2007-01-01

    Background The mitochondrial genomes of snakes are characterized by an overall evolutionary rate that appears to be one of the most accelerated among vertebrates. They also possess other unusual features, including short tRNAs and other genes, and a duplicated control region that has been stably maintained since it originated more than 70 million years ago. Here, we provide a detailed analysis of evolutionary dynamics in snake mitochondrial genomes to better understand the basis of these extreme characteristics, and to explore the relationship between mitochondrial genome molecular evolution, genome architecture, and molecular function. We sequenced complete mitochondrial genomes from Slowinski's corn snake (Pantherophis slowinskii) and two cottonmouths (Agkistrodon piscivorus) to complement previously existing mitochondrial genomes, and to provide an improved comparative view of how genome architecture affects molecular evolution at contrasting levels of divergence. Results We present a Bayesian genetic approach that suggests that the duplicated control region can function as an additional origin of heavy strand replication. The two control regions also appear to have different intra-specific versus inter-specific evolutionary dynamics that may be associated with complex modes of concerted evolution. We find that different genomic regions have experienced substantial accelerated evolution along early branches in snakes, with different genes having experienced dramatic accelerations along specific branches. Some of these accelerations appear to coincide with, or subsequent to, the shortening of various mitochondrial genes and the duplication of the control region and flanking tRNAs. Conclusion Fluctuations in the strength and pattern of selection during snake evolution have had widely varying gene-specific effects on substitution rates, and these rate accelerations may have been functionally related to unusual changes in genomic architecture. The among-lineage and among-gene variation in rate dynamics observed in snakes is the most extreme thus far observed in animal genomes, and provides an important study system for further evaluating the biochemical and physiological basis of evolutionary pressures in vertebrate mitochondria. PMID:17655768

  4. Global DNA cytosine methylation as an evolving trait: phylogenetic signal and correlated evolution with genome size in angiosperms

    PubMed Central

    Alonso, Conchita; Pérez, Ricardo; Bazaga, Pilar; Herrera, Carlos M.

    2015-01-01

    DNA cytosine methylation is a widespread epigenetic mechanism in eukaryotes, and plant genomes commonly are densely methylated. Genomic methylation can be associated with functional consequences such as mutational events, genomic instability or altered gene expression, but little is known on interspecific variation in global cytosine methylation in plants. In this paper, we compare global cytosine methylation estimates obtained by HPLC and use a phylogenetically-informed analytical approach to test for significance of evolutionary signatures of this trait across 54 angiosperm species in 25 families. We evaluate whether interspecific variation in global cytosine methylation is statistically related to phylogenetic distance and also whether it is evolutionarily correlated with genome size (C-value). Global cytosine methylation varied widely between species, ranging between 5.3% (Arabidopsis) and 39.2% (Narcissus). Differences between species were related to their evolutionary trajectories, as denoted by the strong phylogenetic signal underlying interspecific variation. Global cytosine methylation and genome size were evolutionarily correlated, as revealed by the significant relationship between the corresponding phylogenetically independent contrasts. On average, a ten-fold increase in genome size entailed an increase of about 10% in global cytosine methylation. Results show that global cytosine methylation is an evolving trait in angiosperms whose evolutionary trajectory is significantly linked to changes in genome size, and suggest that the evolutionary implications of epigenetic mechanisms are likely to vary between plant lineages. PMID:25688257

  5. Complete nucleotide sequence of the Cryptomeria japonica D. Don. chloroplast genome and comparative chloroplast genomics: diversified genomic structure of coniferous species.

    PubMed

    Hirao, Tomonori; Watanabe, Atsushi; Kurita, Manabu; Kondo, Teiji; Takata, Katsuhiko

    2008-06-23

    The recent determination of complete chloroplast (cp) genomic sequences of various plant species has enabled numerous comparative analyses as well as advances in plant and genome evolutionary studies. In angiosperms, the complete cp genome sequences of about 70 species have been determined, whereas those of only three gymnosperm species, Cycas taitungensis, Pinus thunbergii, and Pinus koraiensis have been established. The lack of information regarding the gene content and genomic structure of gymnosperm cp genomes may severely hamper further progress of plant and cp genome evolutionary studies. To address this need, we report here the complete nucleotide sequence of the cp genome of Cryptomeria japonica, the first in the Cupressaceae sensu lato of gymnosperms, and provide a comparative analysis of their gene content and genomic structure that illustrates the unique genomic features of gymnosperms. The C. japonica cp genome is 131,810 bp in length, with 112 single copy genes and two duplicated (trnI-CAU, trnQ-UUG) genes that give a total of 116 genes. Compared to other land plant cp genomes, the C. japonica cp has lost one of the relevant large inverted repeats (IRs) found in angiosperms, fern, liverwort, and gymnosperms, such as Cycas and Gingko, and additionally has completely lost its trnR-CCG, partially lost its trnT-GGU, and shows diversification of accD. The genomic structure of the C. japonica cp genome also differs significantly from those of other plant species. For example, we estimate that a minimum of 15 inversions would be required to transform the gene organization of the Pinus thunbergii cp genome into that of C. japonica. In the C. japonica cp genome, direct repeat and inverted repeat sequences are observed at the inversion and translocation endpoints, and these sequences may be associated with the genomic rearrangements. The observed differences in genomic structure between C. japonica and other land plants, including pines, strongly support the theory that the large IRs stabilize the cp genome. Furthermore, the deleted large IR and the numerous genomic rearrangements that have occurred in the C. japonica cp genome provide new insights into both the evolutionary lineage of coniferous species in gymnosperm and the evolution of the cp genome.

  6. Shared Subgenome Dominance Following Polyploidization Explains Grass Genome Evolutionary Plasticity from a Seven Protochromosome Ancestor with 16K Protogenes

    PubMed Central

    Murat, Florent; Zhang, Rongzhi; Guizard, Sébastien; Flores, Raphael; Armero, Alix; Pont, Caroline; Steinbach, Delphine; Quesneville, Hadi; Cooke, Richard; Salse, Jerome

    2013-01-01

    Modern plant genomes are diploidized paleopolyploids. We revisited grass genome paleohistory in response to the diploidization process through a detailed investigation of the evolutionary fate of duplicated blocks. Ancestrally duplicated genes can be conserved, deleted, and shuffled, defining dominant (bias toward duplicate retention) and sensitive (bias toward duplicate erosion) chromosomal fragments. We propose a new grass genome paleohistory deriving from an ancestral karyotype structured in seven protochromosomes containing 16,464 protogenes and following evolutionary rules where 1) ancestral shared polyploidizations shaped conserved dominant (D) and sensitive (S) subgenomes, 2) subgenome dominance is revealed by both gene deletion and shuffling from the S blocks, 3) duplicate deletion/movement may have been mediated by single-/double-stranded illegitimate recombination mechanisms, 4) modern genomes arose through centromeric fusion of protochromosomes, leading to functional monocentric neochromosomes, 5) the fusion of two dominant blocks leads to supradominant neochromosomes (D + D = D) with higher ancestral gene retention compared with D + S = D (i.e., fusion of blocks with opposite sensitivity) or even S + S = S (i.e., fusion of two sensitive ancestral blocks). A new user-friendly online tool named “PlantSyntenyViewer,” available at http://urgi.versailles.inra.fr/synteny-cereal, presents the refined comparative genomics data. PMID:24317974

  7. Evolutionary and Comparative Genomics to Drive Rational Drug Design, with Particular Focus on Neuropeptide Seven-Transmembrane Receptors.

    PubMed

    Furlong, Michael; Seong, Jae Young

    2017-01-01

    Seven transmembrane receptors (7TMRs), also known as G protein-coupled receptors, are popular targets of drug development, particularly 7TMR systems that are activated by peptide ligands. Although many pharmaceutical drugs have been discovered via conventional bulk analysis techniques the increasing availability of structural and evolutionary data are facilitating change to rational, targeted drug design. This article discusses the appeal of neuropeptide-7TMR systems as drug targets and provides an overview of concepts in the evolution of vertebrate genomes and gene families. Subsequently, methods that use evolutionary concepts and comparative analysis techniques to aid in gene discovery, gene function identification, and novel drug design are provided along with case study examples.

  8. Evolutionary and Comparative Genomics to Drive Rational Drug Design, with Particular Focus on Neuropeptide Seven-Transmembrane Receptors

    PubMed Central

    Furlong, Michael; Seong, Jae Young

    2017-01-01

    Seven transmembrane receptors (7TMRs), also known as G protein-coupled receptors, are popular targets of drug development, particularly 7TMR systems that are activated by peptide ligands. Although many pharmaceutical drugs have been discovered via conventional bulk analysis techniques the increasing availability of structural and evolutionary data are facilitating change to rational, targeted drug design. This article discusses the appeal of neuropeptide-7TMR systems as drug targets and provides an overview of concepts in the evolution of vertebrate genomes and gene families. Subsequently, methods that use evolutionary concepts and comparative analysis techniques to aid in gene discovery, gene function identification, and novel drug design are provided along with case study examples. PMID:28035082

  9. Hierarchically Aligning 10 Legume Genomes Establishes a Family-Level Genomics Platform1[OPEN

    PubMed Central

    Sun, Pengchuan; Li, Yuxian; Liu, Yinzhe; Yu, Jigao; Ma, Xuelian; Sun, Sangrong; Yang, Nanshan; Xia, Ruiyan; Lei, Tianyu; Liu, Xiaojian; Jiao, Beibei; Xing, Yue; Ge, Weina; Wang, Li; Song, Xiaoming; Yuan, Min; Guo, Di; Zhang, Lan; Zhang, Jiaqi; Chen, Wei; Pan, Yuxin; Liu, Tao; Jin, Ling; Sun, Jinshuai; Yu, Jiaxiang; Duan, Xueqian; Shen, Shaoqi; Qin, Jun; Zhang, Meng-chen; Paterson, Andrew H.

    2017-01-01

    Mainly due to their economic importance, genomes of 10 legumes, including soybean (Glycine max), wild peanut (Arachis duranensis and Arachis ipaensis), and barrel medic (Medicago truncatula), have been sequenced. However, a family-level comparative genomics analysis has been unavailable. With grape (Vitis vinifera) and selected legume genomes as outgroups, we managed to perform a hierarchical and event-related alignment of these genomes and deconvoluted layers of homologous regions produced by ancestral polyploidizations or speciations. Consequently, we illustrated genomic fractionation characterized by widespread gene losses after the polyploidizations. Notably, high similarity in gene retention between recently duplicated chromosomes in soybean supported the likely autopolyploidy nature of its tetraploid ancestor. Moreover, although most gene losses were nearly random, largely but not fully described by geometric distribution, we showed that polyploidization contributed divergently to the copy number variation of important gene families. Besides, we showed significantly divergent evolutionary levels among legumes and, by performing synonymous nucleotide substitutions at synonymous sites correction, redated major evolutionary events during their expansion. This effort laid a solid foundation for further genomics exploration in the legume research community and beyond. We describe only a tiny fraction of legume comparative genomics analysis that we performed; more information was stored in the newly constructed Legume Comparative Genomics Research Platform (www.legumegrp.org). PMID:28325848

  10. Clusters of orthologous genes for 41 archaeal genomes and implications for evolutionary genomics of archaea.

    PubMed

    Makarova, Kira S; Sorokin, Alexander V; Novichkov, Pavel S; Wolf, Yuri I; Koonin, Eugene V

    2007-11-27

    An evolutionary classification of genes from sequenced genomes that distinguishes between orthologs and paralogs is indispensable for genome annotation and evolutionary reconstruction. Shortly after multiple genome sequences of bacteria, archaea, and unicellular eukaryotes became available, an attempt on such a classification was implemented in Clusters of Orthologous Groups of proteins (COGs). Rapid accumulation of genome sequences creates opportunities for refining COGs but also represents a challenge because of error amplification. One of the practical strategies involves construction of refined COGs for phylogenetically compact subsets of genomes. New Archaeal Clusters of Orthologous Genes (arCOGs) were constructed for 41 archaeal genomes (13 Crenarchaeota, 27 Euryarchaeota and one Nanoarchaeon) using an improved procedure that employs a similarity tree between smaller, group-specific clusters, semi-automatically partitions orthology domains in multidomain proteins, and uses profile searches for identification of remote orthologs. The annotation of arCOGs is a consensus between three assignments based on the COGs, the CDD database, and the annotations of homologs in the NR database. The 7538 arCOGs, on average, cover approximately 88% of the genes in a genome compared to a approximately 76% coverage in COGs. The finer granularity of ortholog identification in the arCOGs is apparent from the fact that 4538 arCOGs correspond to 2362 COGs; approximately 40% of the arCOGs are new. The archaeal gene core (protein-coding genes found in all 41 genome) consists of 166 arCOGs. The arCOGs were used to reconstruct gene loss and gene gain events during archaeal evolution and gene sets of ancestral forms. The Last Archaeal Common Ancestor (LACA) is conservatively estimated to possess 996 genes compared to 1245 and 1335 genes for the last common ancestors of Crenarchaeota and Euryarchaeota, respectively. It is inferred that LACA was a chemoautotrophic hyperthermophile that, in addition to the core archaeal functions, encoded more idiosyncratic systems, e.g., the CASS systems of antivirus defense and some toxin-antitoxin systems. The arCOGs provide a convenient, flexible framework for functional annotation of archaeal genomes, comparative genomics and evolutionary reconstructions. Genomic reconstructions suggest that the last common ancestor of archaea might have been (nearly) as advanced as the modern archaeal hyperthermophiles. ArCOGs and related information are available at: ftp://ftp.ncbi.nih.gov/pub/koonin/arCOGs/.

  11. Mitochondrial genome sequencing helps show the evolutionary mechanism of mitochondrial genome formation in Brassica

    PubMed Central

    2011-01-01

    Background Angiosperm mitochondrial genomes are more complex than those of other organisms. Analyses of the mitochondrial genome sequences of at least 11 angiosperm species have showed several common properties; these cannot easily explain, however, how the diverse mitotypes evolved within each genus or species. We analyzed the evolutionary relationships of Brassica mitotypes by sequencing. Results We sequenced the mitotypes of cam (Brassica rapa), ole (B. oleracea), jun (B. juncea), and car (B. carinata) and analyzed them together with two previously sequenced mitotypes of B. napus (pol and nap). The sizes of whole single circular genomes of cam, jun, ole, and car are 219,747 bp, 219,766 bp, 360,271 bp, and 232,241 bp, respectively. The mitochondrial genome of ole is largest as a resulting of the duplication of a 141.8 kb segment. The jun mitotype is the result of an inherited cam mitotype, and pol is also derived from the cam mitotype with evolutionary modifications. Genes with known functions are conserved in all mitotypes, but clear variation in open reading frames (ORFs) with unknown functions among the six mitotypes was observed. Sequence relationship analysis showed that there has been genome compaction and inheritance in the course of Brassica mitotype evolution. Conclusions We have sequenced four Brassica mitotypes, compared six Brassica mitotypes and suggested a mechanism for mitochondrial genome formation in Brassica, including evolutionary events such as inheritance, duplication, rearrangement, genome compaction, and mutation. PMID:21988783

  12. Evolutionary Dynamics of Microsatellite Distribution in Plants: Insight from the Comparison of Sequenced Brassica, Arabidopsis and Other Angiosperm Species

    PubMed Central

    Shi, Jiaqin; Huang, Shunmou; Fu, Donghui; Yu, Jinyin; Wang, Xinfa; Hua, Wei; Liu, Shengyi; Liu, Guihua; Wang, Hanzhong

    2013-01-01

    Despite their ubiquity and functional importance, microsatellites have been largely ignored in comparative genomics, mostly due to the lack of genomic information. In the current study, microsatellite distribution was characterized and compared in the whole genomes and both the coding and non-coding DNA sequences of the sequenced Brassica, Arabidopsis and other angiosperm species to investigate their evolutionary dynamics in plants. The variation in the microsatellite frequencies of these angiosperm species was much smaller than those for their microsatellite numbers and genome sizes, suggesting that microsatellite frequency may be relatively stable in plants. The microsatellite frequencies of these angiosperm species were significantly negatively correlated with both their genome sizes and transposable elements contents. The pattern of microsatellite distribution may differ according to the different genomic regions (such as coding and non-coding sequences). The observed differences in many important microsatellite characteristics (especially the distribution with respect to motif length, type and repeat number) of these angiosperm species were generally accordant with their phylogenetic distance, which suggested that the evolutionary dynamics of microsatellite distribution may be generally consistent with plant divergence/evolution. Importantly, by comparing these microsatellite characteristics (especially the distribution with respect to motif type) the angiosperm species (aside from a few species) all clustered into two obviously different groups that were largely represented by monocots and dicots, suggesting a complex and generally dichotomous evolutionary pattern of microsatellite distribution in angiosperms. Polyploidy may lead to a slight increase in microsatellite frequency in the coding sequences and a significant decrease in microsatellite frequency in the whole genome/non-coding sequences, but have little effect on the microsatellite distribution with respect to motif length, type and repeat number. Interestingly, several microsatellite characteristics seemed to be constant in plant evolution, which can be well explained by the general biological rules. PMID:23555856

  13. Comparative genomics reveals insights into avian genome evolution and adaptation.

    PubMed

    Zhang, Guojie; Li, Cai; Li, Qiye; Li, Bo; Larkin, Denis M; Lee, Chul; Storz, Jay F; Antunes, Agostinho; Greenwold, Matthew J; Meredith, Robert W; Ödeen, Anders; Cui, Jie; Zhou, Qi; Xu, Luohao; Pan, Hailin; Wang, Zongji; Jin, Lijun; Zhang, Pei; Hu, Haofu; Yang, Wei; Hu, Jiang; Xiao, Jin; Yang, Zhikai; Liu, Yang; Xie, Qiaolin; Yu, Hao; Lian, Jinmin; Wen, Ping; Zhang, Fang; Li, Hui; Zeng, Yongli; Xiong, Zijun; Liu, Shiping; Zhou, Long; Huang, Zhiyong; An, Na; Wang, Jie; Zheng, Qiumei; Xiong, Yingqi; Wang, Guangbiao; Wang, Bo; Wang, Jingjing; Fan, Yu; da Fonseca, Rute R; Alfaro-Núñez, Alonzo; Schubert, Mikkel; Orlando, Ludovic; Mourier, Tobias; Howard, Jason T; Ganapathy, Ganeshkumar; Pfenning, Andreas; Whitney, Osceola; Rivas, Miriam V; Hara, Erina; Smith, Julia; Farré, Marta; Narayan, Jitendra; Slavov, Gancho; Romanov, Michael N; Borges, Rui; Machado, João Paulo; Khan, Imran; Springer, Mark S; Gatesy, John; Hoffmann, Federico G; Opazo, Juan C; Håstad, Olle; Sawyer, Roger H; Kim, Heebal; Kim, Kyu-Won; Kim, Hyeon Jeong; Cho, Seoae; Li, Ning; Huang, Yinhua; Bruford, Michael W; Zhan, Xiangjiang; Dixon, Andrew; Bertelsen, Mads F; Derryberry, Elizabeth; Warren, Wesley; Wilson, Richard K; Li, Shengbin; Ray, David A; Green, Richard E; O'Brien, Stephen J; Griffin, Darren; Johnson, Warren E; Haussler, David; Ryder, Oliver A; Willerslev, Eske; Graves, Gary R; Alström, Per; Fjeldså, Jon; Mindell, David P; Edwards, Scott V; Braun, Edward L; Rahbek, Carsten; Burt, David W; Houde, Peter; Zhang, Yong; Yang, Huanming; Wang, Jian; Jarvis, Erich D; Gilbert, M Thomas P; Wang, Jun

    2014-12-12

    Birds are the most species-rich class of tetrapod vertebrates and have wide relevance across many research fields. We explored bird macroevolution using full genomes from 48 avian species representing all major extant clades. The avian genome is principally characterized by its constrained size, which predominantly arose because of lineage-specific erosion of repetitive elements, large segmental deletions, and gene loss. Avian genomes furthermore show a remarkably high degree of evolutionary stasis at the levels of nucleotide sequence, gene synteny, and chromosomal structure. Despite this pattern of conservation, we detected many non-neutral evolutionary changes in protein-coding genes and noncoding regions. These analyses reveal that pan-avian genomic diversity covaries with adaptations to different lifestyles and convergent evolution of traits. Copyright © 2014, American Association for the Advancement of Science.

  14. Comparative primate genomics: emerging patterns of genome content and dynamics

    PubMed Central

    Rogers, Jeffrey; Gibbs, Richard A.

    2014-01-01

    Preface Advances in genome sequencing technologies have created new opportunities for comparative primate genomics. Genome assemblies have been published for several primates, with analyses of several others underway. Whole genome assemblies for the great apes provide remarkable new information about the evolutionary origins of the human genome and the processes involved. Genomic data for macaques and other nonhuman primates provide valuable insight into genetic similarities and differences among species used as models for disease-related research. This review summarizes current knowledge regarding primate genome content and dynamics and offers a series of goals for the near future. PMID:24709753

  15. Comparative primate genomics: emerging patterns of genome content and dynamics.

    PubMed

    Rogers, Jeffrey; Gibbs, Richard A

    2014-05-01

    Advances in genome sequencing technologies have created new opportunities for comparative primate genomics. Genome assemblies have been published for various primate species, and analyses of several others are underway. Whole-genome assemblies for the great apes provide remarkable new information about the evolutionary origins of the human genome and the processes involved. Genomic data for macaques and other non-human primates offer valuable insights into genetic similarities and differences among species that are used as models for disease-related research. This Review summarizes current knowledge regarding primate genome content and dynamics, and proposes a series of goals for the near future.

  16. IMGD: an integrated platform supporting comparative genomics and phylogenetics of insect mitochondrial genomes

    PubMed Central

    Lee, Wonhoon; Park, Jongsun; Choi, Jaeyoung; Jung, Kyongyong; Park, Bongsoo; Kim, Donghan; Lee, Jaeyoung; Ahn, Kyohun; Song, Wonho; Kang, Seogchan; Lee, Yong-Hwan; Lee, Seunghwan

    2009-01-01

    Background Sequences and organization of the mitochondrial genome have been used as markers to investigate evolutionary history and relationships in many taxonomic groups. The rapidly increasing mitochondrial genome sequences from diverse insects provide ample opportunities to explore various global evolutionary questions in the superclass Hexapoda. To adequately support such questions, it is imperative to establish an informatics platform that facilitates the retrieval and utilization of available mitochondrial genome sequence data. Results The Insect Mitochondrial Genome Database (IMGD) is a new integrated platform that archives the mitochondrial genome sequences from 25,747 hexapod species, including 112 completely sequenced and 20 nearly completed genomes and 113,985 partially sequenced mitochondrial genomes. The Species-driven User Interface (SUI) of IMGD supports data retrieval and diverse analyses at multi-taxon levels. The Phyloviewer implemented in IMGD provides three methods for drawing phylogenetic trees and displays the resulting trees on the web. The SNP database incorporated to IMGD presents the distribution of SNPs and INDELs in the mitochondrial genomes of multiple isolates within eight species. A newly developed comparative SNU Genome Browser supports the graphical presentation and interactive interface for the identified SNPs/INDELs. Conclusion The IMGD provides a solid foundation for the comparative mitochondrial genomics and phylogenetics of insects. All data and functions described here are available at the web site . PMID:19351385

  17. Comparative Transcriptomics of Strawberries (Fragaria spp.) Provides Insights into Evolutionary Patterns

    PubMed Central

    Qiao, Qin; Xue, Li; Wang, Qia; Sun, Hang; Zhong, Yang; Huang, Jinling; Lei, Jiajun; Zhang, Ticao

    2016-01-01

    Multiple closely related species with genomic sequences provide an ideal system for studies on comparative and evolutionary genomics, as well as the mechanism of speciation. The whole genome sequences of six strawberry species (Fragaria spp.) have been released, which provide one of the richest genomic resources of any plant genus. In this study, we first generated seven transcriptome sequences of Fragaria species de novo, with a total of 48,557–82,537 unigenes per species. Combined with 13 other species genomes in Rosales, we reconstructed a phylogenetic tree at the genomic level. The phylogenic tree shows that Fragaria closed grouped with Rubus and the Fragaria clade is divided into three subclades. East Asian species appeared in every subclade, suggesting that the genus originated in this area at ∼7.99 Mya. Four species found in mountains of Southwest China originated at ∼3.98 Mya, suggesting that rapid speciation occurred to adapt to changing environments following the uplift of the Qinghai–Tibet Plateau. Moreover, we identified 510 very significantly positively selected genes in the cultivated species F. × ananassa genome. This set of genes was enriched in functions related to specific agronomic traits, such as carbon metabolism and plant hormone signal transduction processes, which are directly related to fruit quality and flavor. These findings illustrate comprehensive evolutionary patterns in Fragaria and the genetic basis of fruit domestication of cultivated strawberry at the genomic/transcriptomic level. PMID:28018379

  18. [Advance on genome research of Yersinia pestis bacteriophage].

    PubMed

    Tan, H L; Wang, P; Li, W

    2017-04-10

    Completion of the genome sequences on Yersinia pestis bacteriophage offered unprecedented opportunity for researchers to carry out related genomic studies. This review was based on the genomic sequences and provided a genomic perspective in describing the essential features of genome on Yersinia pestis bacteriophage. Based on the comparative genomics, genetic evolutionary relationship was discussed. Description of functions from the gene prediction and protein annotation provided evidence for further related studies.

  19. Slow but not low: genomic comparisons reveal slower evolutionary rate and higher dN/dS in conifers compared to angiosperms.

    PubMed

    Buschiazzo, Emmanuel; Ritland, Carol; Bohlmann, Jörg; Ritland, Kermit

    2012-01-20

    Comparative genomics can inform us about the processes of mutation and selection across diverse taxa. Among seed plants, gymnosperms have been lacking in genomic comparisons. Recent EST and full-length cDNA collections for two conifers, Sitka spruce (Picea sitchensis) and loblolly pine (Pinus taeda), together with full genome sequences for two angiosperms, Arabidopsis thaliana and poplar (Populus trichocarpa), offer an opportunity to infer the evolutionary processes underlying thousands of orthologous protein-coding genes in gymnosperms compared with an angiosperm orthologue set. Based upon pairwise comparisons of 3,723 spruce and pine orthologues, we found an average synonymous genetic distance (dS) of 0.191, and an average dN/dS ratio of 0.314. Using a fossil-established divergence time of 140 million years between spruce and pine, we extrapolated a nucleotide substitution rate of 0.68 × 10(-9) synonymous substitutions per site per year. When compared to angiosperms, this indicates a dramatically slower rate of nucleotide substitution rates in conifers: on average 15-fold. Coincidentally, we found a three-fold higher dN/dS for the spruce-pine lineage compared to the poplar-Arabidopsis lineage. This joint occurrence of a slower evolutionary rate in conifers with higher dN/dS, and possibly positive selection, showcases the uniqueness of conifer genome evolution. Our results are in line with documented reduced nucleotide diversity, conservative genome evolution and low rates of diversification in conifers on the one hand and numerous examples of local adaptation in conifers on the other hand. We propose that reduced levels of nucleotide mutation in large and long-lived conifer trees, coupled with large effective population size, were the main factors leading to slow substitution rates but retention of beneficial mutations.

  20. Genomic Aspects of Research Involving Polyploid Plants

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yang, Xiaohan; Ye, Chuyu; Tschaplinski, Timothy J

    2011-01-01

    Almost all extant plant species have spontaneously doubled their genomes at least once in their evolutionary histories, resulting in polyploidy which provided a rich genomic resource for evolutionary processes. Moreover, superior polyploid clones have been created during the process of crop domestication. Polyploid plants generated by evolutionary processes and/or crop domestication have been the intentional or serendipitous focus of research dealing with the dynamics and consequences of genome evolution. One of the new trends in genomics research is to create synthetic polyploid plants which provide materials for studying the initial genomic changes/responses immediately after polyploid formation. Polyploid plants are alsomore » used in functional genomics research to study gene expression in a complex genomic background. In this review, we summarize the recent progress in genomics research involving ancient, young, and synthetic polyploid plants, with a focus on genome size evolution, genomics diversity, genomic rearrangement, genetic and epigenetic changes in duplicated genes, gene discovery, and comparative genomics. Implications on plant sciences including evolution, functional genomics, and plant breeding are presented. It is anticipated that polyploids will be a regular subject of genomics research in the foreseeable future as the rapid advances in DNA sequencing technology create unprecedented opportunities for discovering and monitoring genomic and transcriptomic changes in polyploid plants. The fast accumulation of knowledge on polyploid formation, maintenance, and divergence at whole-genome and subgenome levels will not only help plant biologists understand how plants have evolved and diversified, but also assist plant breeders in designing new strategies for crop improvement.« less

  1. Mutational burdens and evolutionary ages of thyroid follicular adenoma are comparable to those of follicular carcinoma

    PubMed Central

    Jung, Seung-Hyun; Kim, Min Sung; Jung, Chan Kwon; Park, Hyun-Chun; Kim, So Youn; Liu, Jieying; Bae, Ja-Seong; Lee, Sung Hak; Kim, Tae-Min; Lee, Sug Hyung; Chung, Yeun-Jun

    2016-01-01

    Follicular thyroid adenoma (FTA) precedes follicular thyroid carcinoma (FTC) by definition with a favorable prognosis compared to FTC. However, the genetic mechanism of FTA to FTC progression remains unknown. For this, it is required to disclose FTA and FTC genomes in mutational and evolutionary perspectives. We performed whole-exome sequencing and copy number profiling of 14 FTAs and 13 FTCs, which exhibited previously-known gene mutations (NRAS, HRAS, BRAF, TSHR and EIF1AX) and copy number alterations (CNAs) (22q loss and 1q gain) in follicular tumors. In addition, we found eleven potential cancer-related genes with mutations (EZH1, SPOP, NF1, TCF12, IGF2BP3, KMT2C, CNOT1, BRIP1, KDM5C, STAG2 and MAP4K3) that have not been reported in thyroid follicular tumors. Of note, FTA genomes showed comparable levels of mutations to FTC in terms of the number, sequence composition and functional consequences (potential driver mutations) of mutations. Analyses of evolutionary ages using somatic mutations as molecular clocks further identified that FTA genomes were as old as FTC genomes. Whole-transcriptome sequencing did not find any gene fusions with potential significance. Our data indicate that FTA genomes may be as old as FTC genomes, thus suggesting that follicular thyroid tumor genomes during the transition from FTA to FTC may stand stable at genomic levels in contrast to the discernable changes at pathologic and clinical levels. Also, the data suggest a possibility that the mutational profiles obtained from early biopsies may be useful for the molecular diagnosis and therapeutics of follicular tumor patients. PMID:27626165

  2. Mutational burdens and evolutionary ages of thyroid follicular adenoma are comparable to those of follicular carcinoma.

    PubMed

    Jung, Seung-Hyun; Kim, Min Sung; Jung, Chan Kwon; Park, Hyun-Chun; Kim, So Youn; Liu, Jieying; Bae, Ja-Seong; Lee, Sung Hak; Kim, Tae-Min; Lee, Sug Hyung; Chung, Yeun-Jun

    2016-10-25

    Follicular thyroid adenoma (FTA) precedes follicular thyroid carcinoma (FTC) by definition with a favorable prognosis compared to FTC. However, the genetic mechanism of FTA to FTC progression remains unknown. For this, it is required to disclose FTA and FTC genomes in mutational and evolutionary perspectives. We performed whole-exome sequencing and copy number profiling of 14 FTAs and 13 FTCs, which exhibited previously-known gene mutations (NRAS, HRAS, BRAF, TSHR and EIF1AX) and copy number alterations (CNAs) (22q loss and 1q gain) in follicular tumors. In addition, we found eleven potential cancer-related genes with mutations (EZH1, SPOP, NF1, TCF12, IGF2BP3, KMT2C, CNOT1, BRIP1, KDM5C, STAG2 and MAP4K3) that have not been reported in thyroid follicular tumors. Of note, FTA genomes showed comparable levels of mutations to FTC in terms of the number, sequence composition and functional consequences (potential driver mutations) of mutations. Analyses of evolutionary ages using somatic mutations as molecular clocks further identified that FTA genomes were as old as FTC genomes. Whole-transcriptome sequencing did not find any gene fusions with potential significance. Our data indicate that FTA genomes may be as old as FTC genomes, thus suggesting that follicular thyroid tumor genomes during the transition from FTA to FTC may stand stable at genomic levels in contrast to the discernable changes at pathologic and clinical levels. Also, the data suggest a possibility that the mutational profiles obtained from early biopsies may be useful for the molecular diagnosis and therapeutics of follicular tumor patients.

  3. Comparative methods for the analysis of gene-expression evolution: an example using yeast functional genomic data.

    PubMed

    Oakley, Todd H; Gu, Zhenglong; Abouheif, Ehab; Patel, Nipam H; Li, Wen-Hsiung

    2005-01-01

    Understanding the evolution of gene function is a primary challenge of modern evolutionary biology. Despite an expanding database from genomic and developmental studies, we are lacking quantitative methods for analyzing the evolution of some important measures of gene function, such as gene-expression patterns. Here, we introduce phylogenetic comparative methods to compare different models of gene-expression evolution in a maximum-likelihood framework. We find that expression of duplicated genes has evolved according to a nonphylogenetic model, where closely related genes are no more likely than more distantly related genes to share common expression patterns. These results are consistent with previous studies that found rapid evolution of gene expression during the history of yeast. The comparative methods presented here are general enough to test a wide range of evolutionary hypotheses using genomic-scale data from any organism.

  4. Phylogenetic and Protein Sequence Analysis of Bacterial Chemoreceptors.

    PubMed

    Ortega, Davi R; Zhulin, Igor B

    2018-01-01

    Identifying chemoreceptors in sequenced bacterial genomes, revealing their domain architecture, inferring their evolutionary relationships, and comparing them to chemoreceptors of known function become important steps in genome annotation and chemotaxis research. Here, we describe bioinformatics procedures that enable such analyses, using two closely related bacterial genomes as examples.

  5. phyloXML: XML for evolutionary biology and comparative genomics

    PubMed Central

    Han, Mira V; Zmasek, Christian M

    2009-01-01

    Background Evolutionary trees are central to a wide range of biological studies. In many of these studies, tree nodes and branches need to be associated (or annotated) with various attributes. For example, in studies concerned with organismal relationships, tree nodes are associated with taxonomic names, whereas tree branches have lengths and oftentimes support values. Gene trees used in comparative genomics or phylogenomics are usually annotated with taxonomic information, genome-related data, such as gene names and functional annotations, as well as events such as gene duplications, speciations, or exon shufflings, combined with information related to the evolutionary tree itself. The data standards currently used for evolutionary trees have limited capacities to incorporate such annotations of different data types. Results We developed a XML language, named phyloXML, for describing evolutionary trees, as well as various associated data items. PhyloXML provides elements for commonly used items, such as branch lengths, support values, taxonomic names, and gene names and identifiers. By using "property" elements, phyloXML can be adapted to novel and unforeseen use cases. We also developed various software tools for reading, writing, conversion, and visualization of phyloXML formatted data. Conclusion PhyloXML is an XML language defined by a complete schema in XSD that allows storing and exchanging the structures of evolutionary trees as well as associated data. More information about phyloXML itself, the XSD schema, as well as tools implementing and supporting phyloXML, is available at . PMID:19860910

  6. Comparative genomics of parasitic silkworm microsporidia reveal an association between genome expansion and host adaptation

    PubMed Central

    2013-01-01

    Background Microsporidian Nosema bombycis has received much attention because the pébrine disease of domesticated silkworms results in great economic losses in the silkworm industry. So far, no effective treatment could be found for pébrine. Compared to other known Nosema parasites, N. bombycis can unusually parasitize a broad range of hosts. To gain some insights into the underlying genetic mechanism of pathological ability and host range expansion in this parasite, a comparative genomic approach is conducted. The genome of two Nosema parasites, N. bombycis and N. antheraeae (an obligatory parasite to undomesticated silkworms Antheraea pernyi), were sequenced and compared with their distantly related species, N. ceranae (an obligatory parasite to honey bees). Results Our comparative genomics analysis show that the N. bombycis genome has greatly expanded due to the following three molecular mechanisms: 1) the proliferation of host-derived transposable elements, 2) the acquisition of many horizontally transferred genes from bacteria, and 3) the production of abundnant gene duplications. To our knowledge, duplicated genes derived not only from small-scale events (e.g., tandem duplications) but also from large-scale events (e.g., segmental duplications) have never been seen so abundant in any reported microsporidia genomes. Our relative dating analysis further indicated that these duplication events have arisen recently over very short evolutionary time. Furthermore, several duplicated genes involving in the cytotoxic metabolic pathway were found to undergo positive selection, suggestive of the role of duplicated genes on the adaptive evolution of pathogenic ability. Conclusions Genome expansion is rarely considered as the evolutionary outcome acting on those highly reduced and compact parasitic microsporidian genomes. This study, for the first time, demonstrates that the parasitic genomes can expand, instead of shrink, through several common molecular mechanisms such as gene duplication, horizontal gene transfer, and transposable element expansion. We also showed that the duplicated genes can serve as raw materials for evolutionary innovations possibly contributing to the increase of pathologenic ability. Based on our research, we propose that duplicated genes of N. bombycis should be treated as primary targets for treatment designs against pébrine. PMID:23496955

  7. MIPS plant genome information resources.

    PubMed

    Spannagl, Manuel; Haberer, Georg; Ernst, Rebecca; Schoof, Heiko; Mayer, Klaus F X

    2007-01-01

    The Munich Institute for Protein Sequences (MIPS) has been involved in maintaining plant genome databases since the Arabidopsis thaliana genome project. Genome databases and analysis resources have focused on individual genomes and aim to provide flexible and maintainable data sets for model plant genomes as a backbone against which experimental data, for example from high-throughput functional genomics, can be organized and evaluated. In addition, model genomes also form a scaffold for comparative genomics, and much can be learned from genome-wide evolutionary studies.

  8. Discovery of functional elements in 12 Drosophila genomes using evolutionary signatures

    PubMed Central

    Stark, Alexander; Lin, Michael F.; Kheradpour, Pouya; Pedersen, Jakob S.; Parts, Leopold; Carlson, Joseph W.; Crosby, Madeline A.; Rasmussen, Matthew D.; Roy, Sushmita; Deoras, Ameya N.; Ruby, J. Graham; Brennecke, Julius; Hodges, Emily; Hinrichs, Angie S.; Caspi, Anat; Paten, Benedict; Park, Seung-Won; Han, Mira V.; Maeder, Morgan L.; Polansky, Benjamin J.; Robson, Bryanne E.; Aerts, Stein; van Helden, Jacques; Hassan, Bassem; Gilbert, Donald G.; Eastman, Deborah A.; Rice, Michael; Weir, Michael; Hahn, Matthew W.; Park, Yongkyu; Dewey, Colin N.; Pachter, Lior; Kent, W. James; Haussler, David; Lai, Eric C.; Bartel, David P.; Hannon, Gregory J.; Kaufman, Thomas C.; Eisen, Michael B.; Clark, Andrew G.; Smith, Douglas; Celniker, Susan E.; Gelbart, William M.; Kellis, Manolis

    2008-01-01

    Sequencing of multiple related species followed by comparative genomics analysis constitutes a powerful approach for the systematic understanding of any genome. Here, we use the genomes of 12 Drosophila species for the de novo discovery of functional elements in the fly. Each type of functional element shows characteristic patterns of change, or ‘evolutionary signatures’, dictated by its precise selective constraints. Such signatures enable recognition of new protein-coding genes and exons, spurious and incorrect gene annotations, and numerous unusual gene structures, including abundant stop-codon readthrough. Similarly, we predict non-protein-coding RNA genes and structures, and new microRNA (miRNA) genes. We provide evidence of miRNA processing and functionality from both hairpin arms and both DNA strands. We identify several classes of pre- and post-transcriptional regulatory motifs, and predict individual motif instances with high confidence. We also study how discovery power scales with the divergence and number of species compared, and we provide general guidelines for comparative studies. PMID:17994088

  9. Hierarchically Aligning 10 Legume Genomes Establishes a Family-Level Genomics Platform.

    PubMed

    Wang, Jinpeng; Sun, Pengchuan; Li, Yuxian; Liu, Yinzhe; Yu, Jigao; Ma, Xuelian; Sun, Sangrong; Yang, Nanshan; Xia, Ruiyan; Lei, Tianyu; Liu, Xiaojian; Jiao, Beibei; Xing, Yue; Ge, Weina; Wang, Li; Wang, Zhenyi; Song, Xiaoming; Yuan, Min; Guo, Di; Zhang, Lan; Zhang, Jiaqi; Jin, Dianchuan; Chen, Wei; Pan, Yuxin; Liu, Tao; Jin, Ling; Sun, Jinshuai; Yu, Jiaxiang; Cheng, Rui; Duan, Xueqian; Shen, Shaoqi; Qin, Jun; Zhang, Meng-Chen; Paterson, Andrew H; Wang, Xiyin

    2017-05-01

    Mainly due to their economic importance, genomes of 10 legumes, including soybean ( Glycine max ), wild peanut ( Arachis duranensis and Arachis ipaensis ), and barrel medic ( Medicago truncatula ), have been sequenced. However, a family-level comparative genomics analysis has been unavailable. With grape ( Vitis vinifera ) and selected legume genomes as outgroups, we managed to perform a hierarchical and event-related alignment of these genomes and deconvoluted layers of homologous regions produced by ancestral polyploidizations or speciations. Consequently, we illustrated genomic fractionation characterized by widespread gene losses after the polyploidizations. Notably, high similarity in gene retention between recently duplicated chromosomes in soybean supported the likely autopolyploidy nature of its tetraploid ancestor. Moreover, although most gene losses were nearly random, largely but not fully described by geometric distribution, we showed that polyploidization contributed divergently to the copy number variation of important gene families. Besides, we showed significantly divergent evolutionary levels among legumes and, by performing synonymous nucleotide substitutions at synonymous sites correction, redated major evolutionary events during their expansion. This effort laid a solid foundation for further genomics exploration in the legume research community and beyond. We describe only a tiny fraction of legume comparative genomics analysis that we performed; more information was stored in the newly constructed Legume Comparative Genomics Research Platform (www.legumegrp.org). © 2017 American Society of Plant Biologists. All Rights Reserved.

  10. A Molecular Phylogeny of Living Primates

    PubMed Central

    Perelman, Polina; Johnson, Warren E.; Roos, Christian; Seuánez, Hector N.; Horvath, Julie E.; Moreira, Miguel A. M.; Kessing, Bailey; Pontius, Joan; Roelke, Melody; Rumpler, Yves; Schneider, Maria Paula C.; Silva, Artur; O'Brien, Stephen J.; Pecon-Slattery, Jill

    2011-01-01

    Comparative genomic analyses of primates offer considerable potential to define and understand the processes that mold, shape, and transform the human genome. However, primate taxonomy is both complex and controversial, with marginal unifying consensus of the evolutionary hierarchy of extant primate species. Here we provide new genomic sequence (∼8 Mb) from 186 primates representing 61 (∼90%) of the described genera, and we include outgroup species from Dermoptera, Scandentia, and Lagomorpha. The resultant phylogeny is exceptionally robust and illuminates events in primate evolution from ancient to recent, clarifying numerous taxonomic controversies and providing new data on human evolution. Ongoing speciation, reticulate evolution, ancient relic lineages, unequal rates of evolution, and disparate distributions of insertions/deletions among the reconstructed primate lineages are uncovered. Our resolution of the primate phylogeny provides an essential evolutionary framework with far-reaching applications including: human selection and adaptation, global emergence of zoonotic diseases, mammalian comparative genomics, primate taxonomy, and conservation of endangered species. PMID:21436896

  11. Sheep genome functional annotation reveals proximal regulatory elements contributed to the evolution of modern breeds.

    PubMed

    Naval-Sanchez, Marina; Nguyen, Quan; McWilliam, Sean; Porto-Neto, Laercio R; Tellam, Ross; Vuocolo, Tony; Reverter, Antonio; Perez-Enciso, Miguel; Brauning, Rudiger; Clarke, Shannon; McCulloch, Alan; Zamani, Wahid; Naderi, Saeid; Rezaei, Hamid Reza; Pompanon, Francois; Taberlet, Pierre; Worley, Kim C; Gibbs, Richard A; Muzny, Donna M; Jhangiani, Shalini N; Cockett, Noelle; Daetwyler, Hans; Kijas, James

    2018-02-28

    Domestication fundamentally reshaped animal morphology, physiology and behaviour, offering the opportunity to investigate the molecular processes driving evolutionary change. Here we assess sheep domestication and artificial selection by comparing genome sequence from 43 modern breeds (Ovis aries) and their Asian mouflon ancestor (O. orientalis) to identify selection sweeps. Next, we provide a comparative functional annotation of the sheep genome, validated using experimental ChIP-Seq of sheep tissue. Using these annotations, we evaluate the impact of selection and domestication on regulatory sequences and find that sweeps are significantly enriched for protein coding genes, proximal regulatory elements of genes and genome features associated with active transcription. Finally, we find individual sites displaying strong allele frequency divergence are enriched for the same regulatory features. Our data demonstrate that remodelling of gene expression is likely to have been one of the evolutionary forces that drove phenotypic diversification of this common livestock species.

  12. The COG database: an updated version includes eukaryotes

    PubMed Central

    Tatusov, Roman L; Fedorova, Natalie D; Jackson, John D; Jacobs, Aviva R; Kiryutin, Boris; Koonin, Eugene V; Krylov, Dmitri M; Mazumder, Raja; Mekhedov, Sergei L; Nikolskaya, Anastasia N; Rao, B Sridhar; Smirnov, Sergei; Sverdlov, Alexander V; Vasudevan, Sona; Wolf, Yuri I; Yin, Jodie J; Natale, Darren A

    2003-01-01

    Background The availability of multiple, essentially complete genome sequences of prokaryotes and eukaryotes spurred both the demand and the opportunity for the construction of an evolutionary classification of genes from these genomes. Such a classification system based on orthologous relationships between genes appears to be a natural framework for comparative genomics and should facilitate both functional annotation of genomes and large-scale evolutionary studies. Results We describe here a major update of the previously developed system for delineation of Clusters of Orthologous Groups of proteins (COGs) from the sequenced genomes of prokaryotes and unicellular eukaryotes and the construction of clusters of predicted orthologs for 7 eukaryotic genomes, which we named KOGs after eukaryotic orthologous groups. The COG collection currently consists of 138,458 proteins, which form 4873 COGs and comprise 75% of the 185,505 (predicted) proteins encoded in 66 genomes of unicellular organisms. The eukaryotic orthologous groups (KOGs) include proteins from 7 eukaryotic genomes: three animals (the nematode Caenorhabditis elegans, the fruit fly Drosophila melanogaster and Homo sapiens), one plant, Arabidopsis thaliana, two fungi (Saccharomyces cerevisiae and Schizosaccharomyces pombe), and the intracellular microsporidian parasite Encephalitozoon cuniculi. The current KOG set consists of 4852 clusters of orthologs, which include 59,838 proteins, or ~54% of the analyzed eukaryotic 110,655 gene products. Compared to the coverage of the prokaryotic genomes with COGs, a considerably smaller fraction of eukaryotic genes could be included into the KOGs; addition of new eukaryotic genomes is expected to result in substantial increase in the coverage of eukaryotic genomes with KOGs. Examination of the phyletic patterns of KOGs reveals a conserved core represented in all analyzed species and consisting of ~20% of the KOG set. This conserved portion of the KOG set is much greater than the ubiquitous portion of the COG set (~1% of the COGs). In part, this difference is probably due to the small number of included eukaryotic genomes, but it could also reflect the relative compactness of eukaryotes as a clade and the greater evolutionary stability of eukaryotic genomes. Conclusion The updated collection of orthologous protein sets for prokaryotes and eukaryotes is expected to be a useful platform for functional annotation of newly sequenced genomes, including those of complex eukaryotes, and genome-wide evolutionary studies. PMID:12969510

  13. Evolutionary genomics of yeast pathogens in the Saccharomycotina

    PubMed Central

    Naranjo-Ortíz, Miguel A.; Marcet-Houben, Marina

    2016-01-01

    Saccharomycotina comprises a diverse group of yeasts that includes numerous species of industrial or clinical relevance. Opportunistic pathogens within this clade are often assigned to the genus Candida but belong to phylogenetically distant lineages that also comprise non-pathogenic species. This indicates that the ability to infect humans has evolved independently several times among Saccharomycotina. Although the mechanisms of infection of the main groups of Candida pathogens are starting to be unveiled, we still lack sufficient understanding of the evolutionary paths that led to a virulent phenotype in each of the pathogenic lineages. Deciphering what genomic changes underlie the evolutionary emergence of a virulence trait will not only aid the discovery of novel virulence mechanisms but it will also provide valuable information to understand how new pathogens emerge, and what clades may pose a future danger. Here we review recent comparative genomics efforts that have revealed possible evolutionary paths to pathogenesis in different lineages, focusing on the main three agents of candidiasis worldwide: Candida albicans, C. parapsilosis and C. glabrata. We will discuss what genomic traits may facilitate the emergence of virulence, and focus on two different genome evolution mechanisms able to generate drastic phenotypic changes and which have been associated to the emergence of virulence: gene family expansion and interspecies hybridization. PMID:27493146

  14. Perspectives provided by leopard and other cat genomes: how diet determined the evolutionary history of carnivores, omnivores, and herbivores

    PubMed Central

    Kim, Soonok; Cho, Yun Sung; Bhak, Jong; O’Brian, Stephen J.; Yeo, Joo-Hong

    2017-01-01

    Recent advances in genome sequencing technologies have enabled humans to generate and investigate the genomes of wild species. This includes the big cat family, such as tigers, lions, and leopards. Adding the first high quality leopard genome, we have performed an in-depth comparative analysis to identify the genomic signatures in the evolution of felid to become the top predators on land. Our study focused on how the carnivore genomes, as compared to the omnivore or herbivore genomes, shared evolutionary adaptations in genes associated with nutrient metabolism, muscle strength, agility, and other traits responsible for hunting and meat digestion. We found genetic evidence that genomes represent what animals eat through modifying genes. Highly conserved genetically relevant regions were discovered in genomes at the family level. Also, the Felidae family genomes exhibited low levels of genetic diversity associated with decreased population sizes, presumably because of their strict diet, suggesting their vulnerability and critical conservation status. Our findings can be used for human health enhancement, since we share the same genes as cats with some variation. This is an example how wildlife genomes can be a critical resource for human evolution, providing key genetic marker information for disease treatment. PMID:28042784

  15. Phylogenomic Insights into Mouse Evolution Using a Pseudoreference Approach

    PubMed Central

    Sarver, Brice A.J.; Keeble, Sara; Cosart, Ted; Tucker, Priscilla K.; Dean, Matthew D.

    2017-01-01

    Comparative genomic studies are now possible across a broad range of evolutionary timescales, but the generation and analysis of genomic data across many different species still present a number of challenges. The most sophisticated genotyping and down-stream analytical frameworks are still predominantly based on comparisons to high-quality reference genomes. However, established genomic resources are often limited within a given group of species, necessitating comparisons to divergent reference genomes that could restrict or bias comparisons across a phylogenetic sample. Here, we develop a scalable pseudoreference approach to iteratively incorporate sample-specific variation into a genome reference and reduce the effects of systematic mapping bias in downstream analyses. To characterize this framework, we used targeted capture to sequence whole exomes (∼54 Mbp) in 12 lineages (ten species) of mice spanning the Mus radiation. We generated whole exome pseudoreferences for all species and show that this iterative reference-based approach improved basic genomic analyses that depend on mapping accuracy while preserving the associated annotations of the mouse reference genome. We then use these pseudoreferences to resolve evolutionary relationships among these lineages while accounting for phylogenetic discordance across the genome, contributing an important resource for comparative studies in the mouse system. We also describe patterns of genomic introgression among lineages and compare our results to previous studies. Our general approach can be applied to whole or partitioned genomic data and is easily portable to any system with sufficient genomic resources, providing a useful framework for phylogenomic studies in mice and other taxa. PMID:28338821

  16. Are there ergodic limits to evolution? Ergodic exploration of genome space and convergence

    PubMed Central

    McLeish, Tom C. B.

    2015-01-01

    We examine the analogy between evolutionary dynamics and statistical mechanics to include the fundamental question of ergodicity—the representative exploration of the space of possible states (in the case of evolution this is genome space). Several properties of evolutionary dynamics are identified that allow a generalization of the ergodic dynamics, familiar in dynamical systems theory, to evolution. Two classes of evolved biological structure then arise, differentiated by the qualitative duration of their evolutionary time scales. The first class has an ergodicity time scale (the time required for representative genome exploration) longer than available evolutionary time, and has incompletely explored the genotypic and phenotypic space of its possibilities. This case generates no expectation of convergence to an optimal phenotype or possibility of its prediction. The second, more interesting, class exhibits an evolutionary form of ergodicity—essentially all of the structural space within the constraints of slower evolutionary variables have been sampled; the ergodicity time scale for the system evolution is less than the evolutionary time. In this case, some convergence towards similar optima may be expected for equivalent systems in different species where both possess ergodic evolutionary dynamics. When the fitness maximum is set by physical, rather than co-evolved, constraints, it is additionally possible to make predictions of some properties of the evolved structures and systems. We propose four structures that emerge from evolution within genotypes whose fitness is induced from their phenotypes. Together, these result in an exponential speeding up of evolution, when compared with complete exploration of genomic space. We illustrate a possible case of application and a prediction of convergence together with attaining a physical fitness optimum in the case of invertebrate compound eye resolution. PMID:26640648

  17. Are there ergodic limits to evolution? Ergodic exploration of genome space and convergence.

    PubMed

    McLeish, Tom C B

    2015-12-06

    We examine the analogy between evolutionary dynamics and statistical mechanics to include the fundamental question of ergodicity-the representative exploration of the space of possible states (in the case of evolution this is genome space). Several properties of evolutionary dynamics are identified that allow a generalization of the ergodic dynamics, familiar in dynamical systems theory, to evolution. Two classes of evolved biological structure then arise, differentiated by the qualitative duration of their evolutionary time scales. The first class has an ergodicity time scale (the time required for representative genome exploration) longer than available evolutionary time, and has incompletely explored the genotypic and phenotypic space of its possibilities. This case generates no expectation of convergence to an optimal phenotype or possibility of its prediction. The second, more interesting, class exhibits an evolutionary form of ergodicity-essentially all of the structural space within the constraints of slower evolutionary variables have been sampled; the ergodicity time scale for the system evolution is less than the evolutionary time. In this case, some convergence towards similar optima may be expected for equivalent systems in different species where both possess ergodic evolutionary dynamics. When the fitness maximum is set by physical, rather than co-evolved, constraints, it is additionally possible to make predictions of some properties of the evolved structures and systems. We propose four structures that emerge from evolution within genotypes whose fitness is induced from their phenotypes. Together, these result in an exponential speeding up of evolution, when compared with complete exploration of genomic space. We illustrate a possible case of application and a prediction of convergence together with attaining a physical fitness optimum in the case of invertebrate compound eye resolution.

  18. Comparative systems biology across an evolutionary gradient within the Shewanella genus.

    PubMed

    Konstantinidis, Konstantinos T; Serres, Margrethe H; Romine, Margaret F; Rodrigues, Jorge L M; Auchtung, Jennifer; McCue, Lee-Ann; Lipton, Mary S; Obraztsova, Anna; Giometti, Carol S; Nealson, Kenneth H; Fredrickson, James K; Tiedje, James M

    2009-09-15

    To what extent genotypic differences translate to phenotypic variation remains a poorly understood issue of paramount importance for several cornerstone concepts of microbiology including the species definition. Here, we take advantage of the completed genomic sequences, expressed proteomic profiles, and physiological studies of 10 closely related Shewanella strains and species to provide quantitative insights into this issue. Our analyses revealed that, despite extensive horizontal gene transfer within these genomes, the genotypic and phenotypic similarities among the organisms were generally predictable from their evolutionary relatedness. The power of the predictions depended on the degree of ecological specialization of the organisms evaluated. Using the gradient of evolutionary relatedness formed by these genomes, we were able to partly isolate the effect of ecology from that of evolutionary divergence and to rank the different cellular functions in terms of their rates of evolution. Our ranking also revealed that whole-cell protein expression differences among these organisms, when the organisms were grown under identical conditions, were relatively larger than differences at the genome level, suggesting that similarity in gene regulation and expression should constitute another important parameter for (new) species description. Collectively, our results provide important new information toward beginning a systems-level understanding of bacterial species and genera.

  19. Genome of the Asian longhorned beetle, Anoplophora glabripennis), a globally significant invasive species, reveals key functional and evolutionary innovations at the beetle-plant interface

    USDA-ARS?s Scientific Manuscript database

    The Asian longhorned beetle (Anoplophora glabripennis; AGLAB) is a globally significant invasive species capable of inflicting severe feeding damage on many important orchard, ornamental and forest trees. Genome sequencing, annotation, gene expression assays, and functional and comparative genomic s...

  20. Improved Genome Assembly and Annotation for the Rock Pigeon (Columba livia)

    PubMed Central

    Holt, Carson; Campbell, Michael; Keays, David A.; Edelman, Nathaniel; Kapusta, Aurélie; Maclary, Emily; T. Domyan, Eric; Suh, Alexander; Warren, Wesley C.; Yandell, Mark; Gilbert, M. Thomas P.; Shapiro, Michael D.

    2018-01-01

    The domestic rock pigeon (Columba livia) is among the most widely distributed and phenotypically diverse avian species. C. livia is broadly studied in ecology, genetics, physiology, behavior, and evolutionary biology, and has recently emerged as a model for understanding the molecular basis of anatomical diversity, the magnetic sense, and other key aspects of avian biology. Here we report an update to the C. livia genome reference assembly and gene annotation dataset. Greatly increased scaffold lengths in the updated reference assembly, along with an updated annotation set, provide improved tools for evolutionary and functional genetic studies of the pigeon, and for comparative avian genomics in general. PMID:29519939

  1. The scope and strength of sex-specific selection in genome evolution.

    PubMed

    Wright, A E; Mank, J E

    2013-09-01

    Males and females share the vast majority of their genomes and yet are often subject to different, even conflicting, selection. Genomic and transcriptomic developments have made it possible to assess sex-specific selection at the molecular level, and it is clear that sex-specific selection shapes the evolutionary properties of several genomic characteristics, including transcription, post-transcriptional regulation, imprinting, genome structure and gene sequence. Sex-specific selection is strongly influenced by mating system, which also causes neutral evolutionary changes that affect different regions of the genome in different ways. Here, we synthesize theoretical and molecular work in order to provide a cohesive view of the role of sex-specific selection and mating system in genome evolution. We also highlight the need for a combined approach, incorporating both genomic data and experimental phenotypic studies, in order to understand precisely how sex-specific selection drives evolutionary change across the genome. © 2013 The Authors. Journal of Evolutionary Biology © 2013 European Society For Evolutionary Biology.

  2. Advances in computer simulation of genome evolution: toward more realistic evolutionary genomics analysis by approximate bayesian computation.

    PubMed

    Arenas, Miguel

    2015-04-01

    NGS technologies present a fast and cheap generation of genomic data. Nevertheless, ancestral genome inference is not so straightforward due to complex evolutionary processes acting on this material such as inversions, translocations, and other genome rearrangements that, in addition to their implicit complexity, can co-occur and confound ancestral inferences. Recently, models of genome evolution that accommodate such complex genomic events are emerging. This letter explores these novel evolutionary models and proposes their incorporation into robust statistical approaches based on computer simulations, such as approximate Bayesian computation, that may produce a more realistic evolutionary analysis of genomic data. Advantages and pitfalls in using these analytical methods are discussed. Potential applications of these ancestral genomic inferences are also pointed out.

  3. Mitochondrial genome evolution in the Saccharomyces sensu stricto complex.

    PubMed

    Ruan, Jiangxing; Cheng, Jian; Zhang, Tongcun; Jiang, Huifeng

    2017-01-01

    Exploring the evolutionary patterns of mitochondrial genomes is important for our understanding of the Saccharomyces sensu stricto (SSS) group, which is a model system for genomic evolution and ecological analysis. In this study, we first obtained the complete mitochondrial sequences of two important species, Saccharomyces mikatae and Saccharomyces kudriavzevii. We then compared the mitochondrial genomes in the SSS group with those of close relatives, and found that the non-coding regions evolved rapidly, including dramatic expansion of intergenic regions, fast evolution of introns and almost 20-fold higher rearrangement rates than those of the nuclear genomes. However, the coding regions, and especially the protein-coding genes, are more conserved than those in the nuclear genomes of the SSS group. The different evolutionary patterns of coding and non-coding regions in the mitochondrial and nuclear genomes may be related to the origin of the aerobic fermentation lifestyle in this group. Our analysis thus provides novel insights into the evolution of mitochondrial genomes.

  4. Can fat explain the human brain's big bang evolution?-Horrobin's leads for comparative and functional genomics.

    PubMed

    Erren, T C; Erren, M

    2004-04-01

    When David Horrobin suggested that phospholipid and fatty acid metabolism played a major role in human evolution, his 'fat utilization hypothesis' unified intriguing work from paleoanthropology, evolutionary biology, genetic and nervous system research in a novel and coherent lipid-related context. Interestingly, unlike most other evolutionary concepts, the hypothesis allows specific predictions which can be empirically tested in the near future. This paper summarizes some of Horrobin's intriguing propositions and suggests as to how approaches of comparative genomics published in Cell, Nature, Science and elsewhere since 1997 may be used to examine his evolutionary hypothesis. Indeed, systematic investigations of the genomic clock in the species' mitochondrial DNA, the Y and autosomal chromosomes as evidence of evolutionary relationships and distinctions can help to scrutinize associated predictions for their validity, namely that key mutations which differentiate us from Neanderthals and from great apes are in the genes coding for proteins which regulate fat metabolism, and particularly the phospholipid metabolism of the synapses of the brain. It is concluded that beyond clues to humans' relationships with living primates and to the Neanderthals' cognitive performance and their disappearance, the suggested molecular clock analyses may provide crucial insights into the biochemical evolution-and means of possible manipulation-of our brain.

  5. Datasets for evolutionary comparative genomics

    PubMed Central

    Liberles, David A

    2005-01-01

    Many decisions about genome sequencing projects are directed by perceived gaps in the tree of life, or towards model organisms. With the goal of a better understanding of biology through the lens of evolution, however, there are additional genomes that are worth sequencing. One such rationale for whole-genome sequencing is discussed here, along with other important strategies for understanding the phenotypic divergence of species. PMID:16086856

  6. The evolutionary fate of the chloroplast and nuclear rps16 genes as revealed through the sequencing and comparative analyses of four novel legume chloroplast genomes from Lupinus

    PubMed Central

    Keller, J.; Rousseau-Gueutin, M.; Martin, G.E.; Morice, J.; Boutte, J.; Coissac, E.; Ourari, M.; Aïnouche, M.; Salmon, A.; Cabello-Hurtado, F.

    2017-01-01

    Abstract The Fabaceae family is considered as a model system for understanding chloroplast genome evolution due to the presence of extensive structural rearrangements, gene losses and localized hypermutable regions. Here, we provide sequences of four chloroplast genomes from the Lupinus genus, belonging to the underinvestigated Genistoid clade. Notably, we found in Lupinus species the functional loss of the essential rps16 gene, which was most likely replaced by the nuclear rps16 gene that encodes chloroplast and mitochondrion targeted RPS16 proteins. To study the evolutionary fate of the rps16 gene, we explored all available plant chloroplast, mitochondrial and nuclear genomes. Whereas no plant mitochondrial genomes carry an rps16 gene, many plants still have a functional nuclear and chloroplast rps16 gene. Ka/Ks ratios revealed that both chloroplast and nuclear rps16 copies were under purifying selection. However, due to the dual targeting of the nuclear rps16 gene product and the absence of a mitochondrial copy, the chloroplast gene may be lost. We also performed comparative analyses of lupine plastomes (SNPs, indels and repeat elements), identified the most variable regions and examined their phylogenetic utility. The markers identified here will help to reveal the evolutionary history of lupines, Genistoids and closely related clades. PMID:28338826

  7. Genomic investigations of evolutionary dynamics and epistasis in microbial evolution experiments.

    PubMed

    Jerison, Elizabeth R; Desai, Michael M

    2015-12-01

    Microbial evolution experiments enable us to watch adaptation in real time, and to quantify the repeatability and predictability of evolution by comparing identical replicate populations. Further, we can resurrect ancestral types to examine changes over evolutionary time. Until recently, experimental evolution has been limited to measuring phenotypic changes, or to tracking a few genetic markers over time. However, recent advances in sequencing technology now make it possible to extensively sequence clones or whole-population samples from microbial evolution experiments. Here, we review recent work exploiting these techniques to understand the genomic basis of evolutionary change in experimental systems. We first focus on studies that analyze the dynamics of genome evolution in microbial systems. We then survey work that uses observations of sequence evolution to infer aspects of the underlying fitness landscape, concentrating on the epistatic interactions between mutations and the constraints these interactions impose on adaptation. Copyright © 2015 Elsevier Ltd. All rights reserved.

  8. Evolutionary origin and phylogeny of the modern holocephalans (Chondrichthyes: Chimaeriformes): a mitogenomic perspective.

    PubMed

    Inoue, Jun G; Miya, Masaki; Lam, Kevin; Tay, Boon-Hui; Danks, Janine A; Bell, Justin; Walker, Terrence I; Venkatesh, Byrappa

    2010-11-01

    With our increasing ability for generating whole-genome sequences, comparative analysis of whole genomes has become a powerful tool for understanding the structure, function, and evolutionary history of human and other vertebrate genomes. By virtue of their position basal to bony vertebrates, cartilaginous fishes (class Chondrichthyes) are a valuable outgroup in comparative studies of vertebrates. Recently, a holocephalan cartilaginous fish, the elephant shark, Callorhinchus milii (Subclass Holocephali: Order Chimaeriformes), has been proposed as a model genome, and low-coverage sequence of its genome has been generated. Despite such an increasing interest, the evolutionary history of the modern holocephalans-a previously successful and diverse group but represented by only 39 extant species-and their relationship with elasmobranchs and other jawed vertebrates has been poorly documented largely owing to a lack of well-preserved fossil materials after the end-Permian about 250 Ma. In this study, we assembled the whole mitogenome sequences for eight representatives from all the three families of the modern holocephalans and investigated their phylogenetic relationships and evolutionary history. Unambiguously aligned sequences from these holocephalans together with 17 other vertebrates (9,409 nt positions excluding entire third codon positions) were subjected to partitioned maximum likelihood analysis. The resulting tree strongly supported a single origin of the modern holocephalans and their sister-group relationship with elasmobranchs. The mitogenomic tree recovered the most basal callorhinchids within the chimaeriforms, which is sister to a clade comprising the remaining two families (rhinochimaerids and chimaerids). The timetree derived from a relaxed molecular clock Bayesian method suggests that the holocephalans originated in the Silurian about 420 Ma, having survived from the end-Permian (250 Ma) mass extinction and undergoing familial diversifications during the late Jurassic to early Cretaceous (170-120 Ma). This postulated evolutionary scenario agrees well with that based on the paleontological observations.

  9. Evolutionary Genomics of Fast Evolving Tunicates

    PubMed Central

    Berná, Luisa; Alvarez-Valin, Fernando

    2014-01-01

    Tunicates have been extensively studied because of their crucial phylogenetic location (the closest living relatives of vertebrates) and particular developmental plan. Recent genome efforts have disclosed that tunicates are also remarkable in their genome organization and molecular evolutionary patterns. Here, we review these latter aspects, comparing the similarities and specificities of two model species of the group: Oikopleura dioica and Ciona intestinalis. These species exhibit great genome plasticity and Oikopleura in particular has undergone a process of extreme genome reduction and compaction that can be explained in part by gene loss, but is mostly due to other mechanisms such as shortening of intergenic distances and introns, and scarcity of mobile elements. In Ciona, genome reorganization was less severe being more similar to the other chordates in several aspects. Rates and patterns of molecular evolution are also peculiar in tunicates, being Ciona about 50% faster than vertebrates and Oikopleura three times faster. In fact, the latter species is considered as the fastest evolving metazoan recorded so far. Two processes of increase in evolutionary rates have taken place in tunicates. One of them is more extreme, and basically restricted to genes encoding regulatory proteins (transcription regulators, chromatin remodeling proteins, and metabolic regulators), and the other one is less pronounced but affects the whole genome. Very likely adaptive evolution has played a very significant role in the first, whereas the functional and/or evolutionary causes of the second are less clear and the evidence is not conclusive. The evidences supporting the incidence of increased mutation and less efficient negative selection are presented and discussed. PMID:25008364

  10. Slow but not low: genomic comparisons reveal slower evolutionary rate and higher dN/dS in conifers compared to angiosperms

    PubMed Central

    2012-01-01

    Background Comparative genomics can inform us about the processes of mutation and selection across diverse taxa. Among seed plants, gymnosperms have been lacking in genomic comparisons. Recent EST and full-length cDNA collections for two conifers, Sitka spruce (Picea sitchensis) and loblolly pine (Pinus taeda), together with full genome sequences for two angiosperms, Arabidopsis thaliana and poplar (Populus trichocarpa), offer an opportunity to infer the evolutionary processes underlying thousands of orthologous protein-coding genes in gymnosperms compared with an angiosperm orthologue set. Results Based upon pairwise comparisons of 3,723 spruce and pine orthologues, we found an average synonymous genetic distance (dS) of 0.191, and an average dN/dS ratio of 0.314. Using a fossil-established divergence time of 140 million years between spruce and pine, we extrapolated a nucleotide substitution rate of 0.68 × 10-9 synonymous substitutions per site per year. When compared to angiosperms, this indicates a dramatically slower rate of nucleotide substitution rates in conifers: on average 15-fold. Coincidentally, we found a three-fold higher dN/dS for the spruce-pine lineage compared to the poplar-Arabidopsis lineage. This joint occurrence of a slower evolutionary rate in conifers with higher dN/dS, and possibly positive selection, showcases the uniqueness of conifer genome evolution. Conclusions Our results are in line with documented reduced nucleotide diversity, conservative genome evolution and low rates of diversification in conifers on the one hand and numerous examples of local adaptation in conifers on the other hand. We propose that reduced levels of nucleotide mutation in large and long-lived conifer trees, coupled with large effective population size, were the main factors leading to slow substitution rates but retention of beneficial mutations. PMID:22264329

  11. Biogeography of the Sulfolobus islandicus pan-genome

    PubMed Central

    Reno, Michael L.; Held, Nicole L.; Fields, Christopher J.; Burke, Patricia V.; Whitaker, Rachel J.

    2009-01-01

    Variation in gene content has been hypothesized to be the primary mode of adaptive evolution in microorganisms; however, very little is known about the spatial and temporal distribution of variable genes. Through population-scale comparative genomics of 7 Sulfolobus islandicus genomes from 3 locations, we demonstrate the biogeographical structure of the pan-genome of this species, with no evidence of gene flow between geographically isolated populations. The evolutionary independence of each population allowed us to assess genome dynamics over very recent evolutionary time, beginning ≈910,000 years ago. On this time scale, genome variation largely consists of recent strain-specific integration of mobile elements. Localized sectors of parallel gene loss are identified; however, the balance between the gain and loss of genetic material suggests that S. islandicus genomes acquire material slowly over time, primarily from closely related Sulfolobus species. Examination of the genome dynamics through population genomics in S. islandicus exposes the process of allopatric speciation in thermophilic Archaea and brings us closer to a generalized framework for understanding microbial genome evolution in a spatial context. PMID:19435847

  12. Phylogenetic and Evolutionary Patterns in Microbial Carotenoid Biosynthesis Are Revealed by Comparative Genomics

    PubMed Central

    Klassen, Jonathan L.

    2010-01-01

    Background Carotenoids are multifunctional, taxonomically widespread and biotechnologically important pigments. Their biosynthesis serves as a model system for understanding the evolution of secondary metabolism. Microbial carotenoid diversity and evolution has hitherto been analyzed primarily from structural and biosynthetic perspectives, with the few phylogenetic analyses of microbial carotenoid biosynthetic proteins using either used limited datasets or lacking methodological rigor. Given the recent accumulation of microbial genome sequences, a reappraisal of microbial carotenoid biosynthetic diversity and evolution from the perspective of comparative genomics is warranted to validate and complement models of microbial carotenoid diversity and evolution based upon structural and biosynthetic data. Methodology/Principal Findings Comparative genomics were used to identify and analyze in silico microbial carotenoid biosynthetic pathways. Four major phylogenetic lineages of carotenoid biosynthesis are suggested composed of: (i) Proteobacteria; (ii) Firmicutes; (iii) Chlorobi, Cyanobacteria and photosynthetic eukaryotes; and (iv) Archaea, Bacteroidetes and two separate sub-lineages of Actinobacteria. Using this phylogenetic framework, specific evolutionary mechanisms are proposed for carotenoid desaturase CrtI-family enzymes and carotenoid cyclases. Several phylogenetic lineage-specific evolutionary mechanisms are also suggested, including: (i) horizontal gene transfer; (ii) gene acquisition followed by differential gene loss; (iii) co-evolution with other biochemical structures such as proteorhodopsins; and (iv) positive selection. Conclusions/Significance Comparative genomics analyses of microbial carotenoid biosynthetic proteins indicate a much greater taxonomic diversity then that identified based on structural and biosynthetic data, and divides microbial carotenoid biosynthesis into several, well-supported phylogenetic lineages not evident previously. This phylogenetic framework is applicable to understanding the evolution of specific carotenoid biosynthetic proteins or the unique characteristics of carotenoid biosynthetic evolution in a specific phylogenetic lineage. Together, these analyses suggest a “bramble” model for microbial carotenoid biosynthesis whereby later biosynthetic steps exhibit greater evolutionary plasticity and reticulation compared to those closer to the biosynthetic “root”. Structural diversification may be constrained (“trimmed”) where selection is strong, but less so where selection is weaker. These analyses also highlight likely productive avenues for future research and bioprospecting by identifying both gaps in current knowledge and taxa which may particularly facilitate carotenoid diversification. PMID:20582313

  13. Improved Genome Assembly and Annotation for the Rock Pigeon (Columba livia).

    PubMed

    Holt, Carson; Campbell, Michael; Keays, David A; Edelman, Nathaniel; Kapusta, Aurélie; Maclary, Emily; T Domyan, Eric; Suh, Alexander; Warren, Wesley C; Yandell, Mark; Gilbert, M Thomas P; Shapiro, Michael D

    2018-05-04

    The domestic rock pigeon ( Columba livia ) is among the most widely distributed and phenotypically diverse avian species. C. livia is broadly studied in ecology, genetics, physiology, behavior, and evolutionary biology, and has recently emerged as a model for understanding the molecular basis of anatomical diversity, the magnetic sense, and other key aspects of avian biology. Here we report an update to the C. livia genome reference assembly and gene annotation dataset. Greatly increased scaffold lengths in the updated reference assembly, along with an updated annotation set, provide improved tools for evolutionary and functional genetic studies of the pigeon, and for comparative avian genomics in general. Copyright © 2018 Holt et al.

  14. Mycobacterial species as case-study of comparative genome analysis.

    PubMed

    Zakham, F; Belayachi, L; Ussery, D; Akrim, M; Benjouad, A; El Aouad, R; Ennaji, M M

    2011-02-08

    The genus Mycobacterium represents more than 120 species including important pathogens of human and cause major public health problems and illnesses. Further, with more than 100 genome sequences from this genus, comparative genome analysis can provide new insights for better understanding the evolutionary events of these species and improving drugs, vaccines, and diagnostics tools for controlling Mycobacterial diseases. In this present study we aim to outline a comparative genome analysis of fourteen Mycobacterial genomes: M. avium subsp. paratuberculosis K—10, M. bovis AF2122/97, M. bovis BCG str. Pasteur 1173P2, M. leprae Br4923, M. marinum M, M. sp. KMS, M. sp. MCS, M. tuberculosis CDC1551, M. tuberculosis F11, M. tuberculosis H37Ra, M. tuberculosis H37Rv, M. tuberculosis KZN 1435 , M. ulcerans Agy99,and M. vanbaalenii PYR—1, For this purpose a comparison has been done based on their length of genomes, GC content, number of genes in different data bases (Genbank, Refseq, and Prodigal). The BLAST matrix of these genomes has been figured to give a lot of information about the similarity between species in a simple scheme. As a result of multiple genome analysis, the pan and core genome have been defined for twelve Mycobacterial species. We have also introduced the genome atlas of the reference strain M. tuberculosis H37Rv which can give a good overview of this genome. And for examining the phylogenetic relationships among these bacteria, a phylogenic tree has been constructed from 16S rRNA gene for tuberculosis and non tuberculosis Mycobacteria to understand the evolutionary events of these species.

  15. Genome Alignment Spanning Major Poaceae Lineages Reveals Heterogeneous Evolutionary Rates and Alters Inferred Dates for Key Evolutionary Events.

    PubMed

    Wang, Xiyin; Wang, Jingpeng; Jin, Dianchuan; Guo, Hui; Lee, Tae-Ho; Liu, Tao; Paterson, Andrew H

    2015-06-01

    Multiple comparisons among genomes can clarify their evolution, speciation, and functional innovations. To date, the genome sequences of eight grasses representing the most economically important Poaceae (grass) clades have been published, and their genomic-level comparison is an essential foundation for evolutionary, functional, and translational research. Using a formal and conservative approach, we aligned these genomes. Direct comparison of paralogous gene pairs all duplicated simultaneously reveal striking variation in evolutionary rates among whole genomes, with nucleotide substitution slowest in rice and up to 48% faster in other grasses, adding a new dimension to the value of rice as a grass model. We reconstructed ancestral genome contents for major evolutionary nodes, potentially contributing to understanding the divergence and speciation of grasses. Recent fossil evidence suggests revisions of the estimated dates of key evolutionary events, implying that the pan-grass polyploidization occurred ∼96 million years ago and could not be related to the Cretaceous-Tertiary mass extinction as previously inferred. Adjusted dating to reflect both updated fossil evidence and lineage-specific evolutionary rates suggested that maize subgenome divergence and maize-sorghum divergence were virtually simultaneous, a coincidence that would be explained if polyploidization directly contributed to speciation. This work lays a solid foundation for Poaceae translational genomics. Copyright © 2015 The Author. Published by Elsevier Inc. All rights reserved.

  16. Evolutionary rates of mitochondrial genomes correspond to diversification rates and to contemporary species richness in birds and reptiles

    PubMed Central

    Eo, Soo Hyung; DeWoody, J. Andrew

    2010-01-01

    Rates of biological diversification should ultimately correspond to rates of genome evolution. Recent studies have compared diversification rates with phylogenetic branch lengths, but incomplete phylogenies hamper such analyses for many taxa. Herein, we use pairwise comparisons of confamilial sauropsid (bird and reptile) mitochondrial DNA (mtDNA) genome sequences to estimate substitution rates. These molecular evolutionary rates are considered in light of the age and species richness of each taxonomic family, using a random-walk speciation–extinction process to estimate rates of diversification. We find the molecular clock ticks at disparate rates in different families and at different genes. For example, evolutionary rates are relatively fast in snakes and lizards, intermediate in crocodilians and slow in turtles and birds. There was also rate variation across genes, where non-synonymous substitution rates were fastest at ATP8 and slowest at CO3. Family-by-gene interactions were significant, indicating that local clocks vary substantially among sauropsids. Most importantly, we find evidence that mitochondrial genome evolutionary rates are positively correlated with speciation rates and with contemporary species richness. Nuclear sequences are poorly represented among reptiles, but the correlation between rates of molecular evolution and species diversification also extends to 18 avian nuclear genes we tested. Thus, the nuclear data buttress our mtDNA findings. PMID:20610427

  17. The Modern Synthesis in the Light of Microbial Genomics.

    PubMed

    Booth, Austin; Mariscal, Carlos; Doolittle, W Ford

    2016-09-08

    We review the theoretical implications of findings in genomics for evolutionary biology since the Modern Synthesis. We examine the ways in which microbial genomics has influenced our understanding of the last universal common ancestor, the tree of life, species, lineages, and evolutionary transitions. We conclude by advocating a piecemeal toolkit approach to evolutionary biology, in lieu of any grand unified theory updated to include microbial genomics.

  18. Assessing the evolutionary rate of positional orthologous genes in prokaryotes using synteny data

    PubMed Central

    Lemoine, Frédéric; Lespinet, Olivier; Labedan, Bernard

    2007-01-01

    Background Comparison of completely sequenced microbial genomes has revealed how fluid these genomes are. Detecting synteny blocks requires reliable methods to determining the orthologs among the whole set of homologs detected by exhaustive comparisons between each pair of completely sequenced genomes. This is a complex and difficult problem in the field of comparative genomics but will help to better understand the way prokaryotic genomes are evolving. Results We have developed a suite of programs that automate three essential steps to study conservation of gene order, and validated them with a set of 107 bacteria and archaea that cover the majority of the prokaryotic taxonomic space. We identified the whole set of shared homologs between two or more species and computed the evolutionary distance separating each pair of homologs. We applied two strategies to extract from the set of homologs a collection of valid orthologs shared by at least two genomes. The first computes the Reciprocal Smallest Distance (RSD) using the PAM distances separating pairs of homologs. The second method groups homologs in families and reconstructs each family's evolutionary tree, distinguishing bona fide orthologs as well as paralogs created after the last speciation event. Although the phylogenetic tree method often succeeds where RSD fails, the reverse could occasionally be true. Accordingly, we used the data obtained with either methods or their intersection to number the orthologs that are adjacent in for each pair of genomes, the Positional Orthologous Genes (POGs), and to further study their properties. Once all these synteny blocks have been detected, we showed that POGs are subject to more evolutionary constraints than orthologs outside synteny groups, whichever the taxonomic distance separating the compared organisms. Conclusion The suite of programs described in this paper allows a reliable detection of orthologs and is useful for evaluating gene order conservation in prokaryotes whichever their taxonomic distance. Thus, our approach will make easy the rapid identification of POGS in the next few years as we are expecting to be inundated with thousands of completely sequenced microbial genomes. PMID:18047665

  19. Underlying Principles of Natural Selection in Network Evolution: Systems Biology Approach

    PubMed Central

    Chen, Bor-Sen; Wu, Wei-Sheng

    2007-01-01

    Systems biology is a rapidly expanding field that integrates diverse areas of science such as physics, engineering, computer science, mathematics, and biology toward the goal of elucidating the underlying principles of hierarchical metabolic and regulatory systems in the cell, and ultimately leading to predictive understanding of cellular response to perturbations. Because post-genomics research is taking place throughout the tree of life, comparative approaches offer a way for combining data from many organisms to shed light on the evolution and function of biological networks from the gene to the organismal level. Therefore, systems biology can build on decades of theoretical work in evolutionary biology, and at the same time evolutionary biology can use the systems biology approach to go in new uncharted directions. In this study, we present a review of how the post-genomics era is adopting comparative approaches and dynamic system methods to understand the underlying design principles of network evolution and to shape the nascent field of evolutionary systems biology. Finally, the application of evolutionary systems biology to robust biological network designs is also discussed from the synthetic biology perspective. PMID:19468310

  20. The genomic landscape of rapid repeated evolutionary adaptation to toxic pollution in wild fish

    EPA Science Inventory

    Atlantic killifish populations have rapidly adapted to normally lethal levels of pollution in four urban estuaries. Through analysis of 384 whole killifish genome sequences and comparative transcriptomics in four pairs of sensitive and tolerant populations, we identify the aryl h...

  1. Integrating genomics into evolutionary medicine.

    PubMed

    Rodríguez, Juan Antonio; Marigorta, Urko M; Navarro, Arcadi

    2014-12-01

    The application of the principles of evolutionary biology into medicine was suggested long ago and is already providing insight into the ultimate causes of disease. However, a full systematic integration of medical genomics and evolutionary medicine is still missing. Here, we briefly review some cases where the combination of the two fields has proven profitable and highlight two of the main issues hindering the development of evolutionary genomic medicine as a mature field, namely the dissociation between fitness and health and the still considerable difficulties in predicting phenotypes from genotypes. We use publicly available data to illustrate both problems and conclude that new approaches are needed for evolutionary genomic medicine to overcome these obstacles. Copyright © 2014 Elsevier Ltd. All rights reserved.

  2. Linking genotype to phenotype in a changing ocean: inferring the genomic architecture of a blue mussel stress response with genome-wide association.

    PubMed

    Kingston, S E; Martino, P; Melendy, M; Reed, F A; Carlon, D B

    2018-03-01

    A key component to understanding the evolutionary response to a changing climate is linking underlying genetic variation to phenotypic variation in stress response. Here, we use a genome-wide association approach (GWAS) to understand the genetic architecture of calcification rates under simulated climate stress. We take advantage of the genomic gradient across the blue mussel hybrid zone (Mytilus edulis and Mytilus trossulus) in the Gulf of Maine (GOM) to link genetic variation with variance in calcification rates in response to simulated climate change. Falling calcium carbonate saturation states are predicted to negatively impact many marine organisms that build calcium carbonate shells - like blue mussels. We sampled wild mussels and measured net calcification phenotypes after exposing mussels to a 'climate change' common garden, where we raised temperature by 3°C, decreased pH by 0.2 units and limited food supply by filtering out planktonic particles >5 μm, compared to ambient GOM conditions in the summer. This climate change exposure greatly increased phenotypic variation in net calcification rates compared to ambient conditions. We then used regression models to link the phenotypic variation with over 170 000 single nucleotide polymorphism loci (SNPs) generated by genotype by sequencing to identify genomic locations associated with calcification phenotype, and estimate heritability and architecture of the trait. We identified at least one of potentially 2-10 genomic regions responsible for 30% of the phenotypic variation in calcification rates that are potential targets of natural selection by climate change. Our simulations suggest a power of 13.7% with our study's average effective sample size of 118 individuals and rare alleles, but a power of >90% when effective sample size is 900. © 2017 European Society For Evolutionary Biology. Journal of Evolutionary Biology © 2017 European Society For Evolutionary Biology.

  3. Evolutionary Analysis and Expression Profiling of Zebra Finch Immune Genes

    PubMed Central

    Ekblom, Robert; French, Lisa; Slate, Jon; Burke, Terry

    2010-01-01

    Genes of the immune system are generally considered to evolve rapidly due to host–parasite coevolution. They are therefore of great interest in evolutionary biology and molecular ecology. In this study, we manually annotated 144 avian immune genes from the zebra finch (Taeniopygia guttata) genome and conducted evolutionary analyses of these by comparing them with their orthologs in the chicken (Gallus gallus). Genes classified as immune receptors showed elevated dN/dS ratios compared with other classes of immune genes. Immune genes in general also appear to be evolving more rapidly than other genes, as inferred from a higher dN/dS ratio compared with the rest of the genome. Furthermore, ten genes (of 27) for which sequence data were available from at least three bird species showed evidence of positive selection acting on specific codons. From transcriptome data of eight different tissues, we found evidence for expression of 106 of the studied immune genes, with primary expression of most of these in bursa, blood, and spleen. These immune-related genes showed a more tissue-specific expression pattern than other genes in the zebra finch genome. Several of the avian immune genes investigated here provide strong candidates for in-depth studies of molecular adaptation in birds. PMID:20884724

  4. Examining the process of de novo gene birth: an educational primer on "integration of new genes into cellular networks, and their structural maturation".

    PubMed

    Frietze, Seth; Leatherman, Judith

    2014-03-01

    New genes that arise from modification of the noncoding portion of a genome rather than being duplicated from parent genes are called de novo genes. These genes, identified by their brief evolution and lack of parent genes, provide an opportunity to study the timeframe in which emerging genes integrate into cellular networks, and how the characteristics of these genes change as they mature into bona fide genes. An article by G. Abrusán provides an opportunity to introduce students to fundamental concepts in evolutionary and comparative genetics and to provide a technical background by which to discuss systems biology approaches when studying the evolutionary process of gene birth. Basic background needed to understand the Abrusán study and details on comparative genomic concepts tailored for a classroom discussion are provided, including discussion questions and a supplemental exercise on navigating a genome database.

  5. Complete nucleotide sequence of pig (Sus scrofa) mitochondrial genome and dating evolutionary divergence within Artiodactyla.

    PubMed

    Lin, C S; Sun, Y L; Liu, C Y; Yang, P C; Chang, L C; Cheng, I C; Mao, S J; Huang, M C

    1999-08-05

    The complete nucleotide sequence of the pig (Sus scrofa) mitochondrial genome, containing 16613bp, is presented in this report. The genome is not a specific length because of the presence of the variable numbers of tandem repeats, 5'-CGTGCGTACA in the displacement loop (D-loop). Genes responsible for 12S and 16S rRNAs, 22 tRNAs, and 13 protein-coding regions are found. The genome carries very few intergenic nucleotides with several instances of overlap between protein-coding or tRNA genes, except in the D-loop region. For evaluating the possible evolutionary relationships between Artiodactyla and Cetacea, the nucleotide substitutions and amino acid sequences of 13 protein-coding genes were aligned by pairwise comparisons of the pig, cow, and fin whale. By comparing these sequences, we suggest that there is a closer relationship between the pig and cow than that between either of these species and fin whale. In addition, the accumulation of transversions and gaps in pig 12S and 16S rRNA genes was compared with that in other eutherian species, including cow, fin whale, human, horse, and harbor seal. The results also reveal a close phylogenetic relationship between pig and cow, as compared to fin whale and others. Thus, according to the sequence differences of mitochondrial rRNA genes in eutherian species, the evolutionary separation of pig and cow occurred about 53-60 million years ago.

  6. Updated clusters of orthologous genes for Archaea: a complex ancestor of the Archaea and the byways of horizontal gene transfer.

    PubMed

    Wolf, Yuri I; Makarova, Kira S; Yutin, Natalya; Koonin, Eugene V

    2012-12-14

    Collections of Clusters of Orthologous Genes (COGs) provide indispensable tools for comparative genomic analysis, evolutionary reconstruction and functional annotation of new genomes. Initially, COGs were made for all complete genomes of cellular life forms that were available at the time. However, with the accumulation of thousands of complete genomes, construction of a comprehensive COG set has become extremely computationally demanding and prone to error propagation, necessitating the switch to taxon-specific COG collections. Previously, we reported the collection of COGs for 41 genomes of Archaea (arCOGs). Here we present a major update of the arCOGs and describe evolutionary reconstructions to reveal general trends in the evolution of Archaea. The updated version of the arCOG database incorporates 91% of the pangenome of 120 archaea (251,032 protein-coding genes altogether) into 10,335 arCOGs. Using this new set of arCOGs, we performed maximum likelihood reconstruction of the genome content of archaeal ancestral forms and gene gain and loss events in archaeal evolution. This reconstruction shows that the last Common Ancestor of the extant Archaea was an organism of greater complexity than most of the extant archaea, probably with over 2,500 protein-coding genes. The subsequent evolution of almost all archaeal lineages was apparently dominated by gene loss resulting in genome streamlining. Overall, in the evolution of Archaea as well as a representative set of bacteria that was similarly analyzed for comparison, gene losses are estimated to outnumber gene gains at least 4 to 1. Analysis of specific patterns of gene gain in Archaea shows that, although some groups, in particular Halobacteria, acquire substantially more genes than others, on the whole, gene exchange between major groups of Archaea appears to be largely random, with no major 'highways' of horizontal gene transfer. The updated collection of arCOGs is expected to become a key resource for comparative genomics, evolutionary reconstruction and functional annotation of new archaeal genomes. Given that, in spite of the major increase in the number of genomes, the conserved core of archaeal genes appears to be stabilizing, the major evolutionary trends revealed here have a chance to stand the test of time. This article was reviewed by (for complete reviews see the Reviewers' Reports section): Dr. PLG, Prof. PF, Dr. PL (nominated by Prof. JPG).

  7. Shifts in the evolutionary rate and intensity of purifying selection between two Brassica genomes revealed by analyses of orthologous transposons and relics of a whole genome triplication.

    PubMed

    Zhao, Meixia; Du, Jianchang; Lin, Feng; Tong, Chaobo; Yu, Jingyin; Huang, Shunmou; Wang, Xiaowu; Liu, Shengyi; Ma, Jianxin

    2013-10-01

    Recent sequencing of the Brassica rapa and Brassica oleracea genomes revealed extremely contrasting genomic features such as the abundance and distribution of transposable elements between the two genomes. However, whether and how these structural differentiations may have influenced the evolutionary rates of the two genomes since their split from a common ancestor are unknown. Here, we investigated and compared the rates of nucleotide substitution between two long terminal repeats (LTRs) of individual orthologous LTR-retrotransposons, the rates of synonymous and non-synonymous substitution among triplicated genes retained in both genomes from a shared whole genome triplication event, and the rates of genetic recombination estimated/deduced by the comparison of physical and genetic distances along chromosomes and ratios of solo LTRs to intact elements. Overall, LTR sequences and genic sequences showed more rapid nucleotide substitution in B. rapa than in B. oleracea. Synonymous substitution of triplicated genes retained from a shared whole genome triplication was detected at higher rates in B. rapa than in B. oleracea. Interestingly, non-synonymous substitution was observed at lower rates in the former than in the latter, indicating shifted densities of purifying selection between the two genomes. In addition to evolutionary asymmetry, orthologous genes differentially regulated and/or disrupted by transposable elements between the two genomes were also characterized. Our analyses suggest that local genomic and epigenomic features, such as recombination rates and chromatin dynamics reshaped by independent proliferation of transposable elements and elimination between the two genomes, are perhaps partially the causes and partially the outcomes of the observed inter-specific asymmetric evolution. © 2013 Purdue University The Plant Journal © 2013 John Wiley & Sons Ltd.

  8. The evolutionary fate of the chloroplast and nuclear rps16 genes as revealed through the sequencing and comparative analyses of four novel legume chloroplast genomes from Lupinus.

    PubMed

    Keller, J; Rousseau-Gueutin, M; Martin, G E; Morice, J; Boutte, J; Coissac, E; Ourari, M; Aïnouche, M; Salmon, A; Cabello-Hurtado, F; Aïnouche, A

    2017-08-01

    The Fabaceae family is considered as a model system for understanding chloroplast genome evolution due to the presence of extensive structural rearrangements, gene losses and localized hypermutable regions. Here, we provide sequences of four chloroplast genomes from the Lupinus genus, belonging to the underinvestigated Genistoid clade. Notably, we found in Lupinus species the functional loss of the essential rps16 gene, which was most likely replaced by the nuclear rps16 gene that encodes chloroplast and mitochondrion targeted RPS16 proteins. To study the evolutionary fate of the rps16 gene, we explored all available plant chloroplast, mitochondrial and nuclear genomes. Whereas no plant mitochondrial genomes carry an rps16 gene, many plants still have a functional nuclear and chloroplast rps16 gene. Ka/Ks ratios revealed that both chloroplast and nuclear rps16 copies were under purifying selection. However, due to the dual targeting of the nuclear rps16 gene product and the absence of a mitochondrial copy, the chloroplast gene may be lost. We also performed comparative analyses of lupine plastomes (SNPs, indels and repeat elements), identified the most variable regions and examined their phylogenetic utility. The markers identified here will help to reveal the evolutionary history of lupines, Genistoids and closely related clades. © The Author 2017. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.

  9. Inferring transposons activity chronology by TRANScendence - TEs database and de-novo mining tool.

    PubMed

    Startek, Michał Piotr; Nogły, Jakub; Gromadka, Agnieszka; Grzebelus, Dariusz; Gambin, Anna

    2017-10-16

    The constant progress in sequencing technology leads to ever increasing amounts of genomic data. In the light of current evidence transposable elements (TEs for short) are becoming useful tools for learning about the evolution of host genome. Therefore the software for genome-wide detection and analysis of TEs is of great interest. Here we describe the computational tool for mining, classifying and storing TEs from newly sequenced genomes. This is an online, web-based, user-friendly service, enabling users to upload their own genomic data, and perform de-novo searches for TEs. The detected TEs are automatically analyzed, compared to reference databases, annotated, clustered into families, and stored in TEs repository. Also, the genome-wide nesting structure of found elements are detected and analyzed by new method for inferring evolutionary history of TEs. We illustrate the functionality of our tool by performing a full-scale analyses of TE landscape in Medicago truncatula genome. TRANScendence is an effective tool for the de-novo annotation and classification of transposable elements in newly-acquired genomes. Its streamlined interface makes it well-suited for evolutionary studies.

  10. Sequence and comparative analysis of the chicken genome provide unique perspectives on vertebrate evolution.

    PubMed

    2004-12-09

    We present here a draft genome sequence of the red jungle fowl, Gallus gallus. Because the chicken is a modern descendant of the dinosaurs and the first non-mammalian amniote to have its genome sequenced, the draft sequence of its genome--composed of approximately one billion base pairs of sequence and an estimated 20,000-23,000 genes--provides a new perspective on vertebrate genome evolution, while also improving the annotation of mammalian genomes. For example, the evolutionary distance between chicken and human provides high specificity in detecting functional elements, both non-coding and coding. Notably, many conserved non-coding sequences are far from genes and cannot be assigned to defined functional classes. In coding regions the evolutionary dynamics of protein domains and orthologous groups illustrate processes that distinguish the lineages leading to birds and mammals. The distinctive properties of avian microchromosomes, together with the inferred patterns of conserved synteny, provide additional insights into vertebrate chromosome architecture.

  11. Minimal-assumption inference from population-genomic data

    NASA Astrophysics Data System (ADS)

    Weissman, Daniel; Hallatschek, Oskar

    Samples of multiple complete genome sequences contain vast amounts of information about the evolutionary history of populations, much of it in the associations among polymorphisms at different loci. Current methods that take advantage of this linkage information rely on models of recombination and coalescence, limiting the sample sizes and populations that they can analyze. We introduce a method, Minimal-Assumption Genomic Inference of Coalescence (MAGIC), that reconstructs key features of the evolutionary history, including the distribution of coalescence times, by integrating information across genomic length scales without using an explicit model of recombination, demography or selection. Using simulated data, we show that MAGIC's performance is comparable to PSMC' on single diploid samples generated with standard coalescent and recombination models. More importantly, MAGIC can also analyze arbitrarily large samples and is robust to changes in the coalescent and recombination processes. Using MAGIC, we show that the inferred coalescence time histories of samples of multiple human genomes exhibit inconsistencies with a description in terms of an effective population size based on single-genome data.

  12. Genomics and the making of yeast biodiversity.

    PubMed

    Hittinger, Chris Todd; Rokas, Antonis; Bai, Feng-Yan; Boekhout, Teun; Gonçalves, Paula; Jeffries, Thomas W; Kominek, Jacek; Lachance, Marc-André; Libkind, Diego; Rosa, Carlos A; Sampaio, José Paulo; Kurtzman, Cletus P

    2015-12-01

    Yeasts are unicellular fungi that do not form fruiting bodies. Although the yeast lifestyle has evolved multiple times, most known species belong to the subphylum Saccharomycotina (syn. Hemiascomycota, hereafter yeasts). This diverse group includes the premier eukaryotic model system, Saccharomyces cerevisiae; the common human commensal and opportunistic pathogen, Candida albicans; and over 1000 other known species (with more continuing to be discovered). Yeasts are found in every biome and continent and are more genetically diverse than angiosperms or chordates. Ease of culture, simple life cycles, and small genomes (∼10-20Mbp) have made yeasts exceptional models for molecular genetics, biotechnology, and evolutionary genomics. Here we discuss recent developments in understanding the genomic underpinnings of the making of yeast biodiversity, comparing and contrasting natural and human-associated evolutionary processes. Only a tiny fraction of yeast biodiversity and metabolic capabilities has been tapped by industry and science. Expanding the taxonomic breadth of deep genomic investigations will further illuminate how genome function evolves to encode their diverse metabolisms and ecologies. Copyright © 2015 Elsevier Ltd. All rights reserved.

  13. Sequencing and comparing whole mitochondrial genomes ofanimals

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Boore, Jeffrey L.; Macey, J. Robert; Medina, Monica

    2005-04-22

    Comparing complete animal mitochondrial genome sequences is becoming increasingly common for phylogenetic reconstruction and as a model for genome evolution. Not only are they much more informative than shorter sequences of individual genes for inferring evolutionary relatedness, but these data also provide sets of genome-level characters, such as the relative arrangements of genes, that can be especially powerful. We describe here the protocols commonly used for physically isolating mtDNA, for amplifying these by PCR or RCA, for cloning,sequencing, assembly, validation, and gene annotation, and for comparing both sequences and gene arrangements. On several topics, we offer general observations based onmore » our experiences to date with determining and comparing complete mtDNA sequences.« less

  14. Genome-wide analysis reveals class and gene specific codon usage adaptation in avian paramyxoviruses 1

    USDA-ARS?s Scientific Manuscript database

    In order to characterize the evolutionary adaptations of avian paramyxovirus 1 (APMV-1) genomes, we have compared codon usage and codon adaptation indexes among groups of Newcastle disease viruses that differ in biological, ecological, and genetic characteristics. We have used available GenBank com...

  15. Evolutionary genomics of LysM genes in land plants.

    PubMed

    Zhang, Xue-Cheng; Cannon, Steven B; Stacey, Gary

    2009-08-03

    The ubiquitous LysM motif recognizes peptidoglycan, chitooligosaccharides (chitin) and, presumably, other structurally-related oligosaccharides. LysM-containing proteins were first shown to be involved in bacterial cell wall degradation and, more recently, were implicated in perceiving chitin (one of the established pathogen-associated molecular patterns) and lipo-chitin (nodulation factors) in flowering plants. However, the majority of LysM genes in plants remain functionally uncharacterized and the evolutionary history of complex LysM genes remains elusive. We show that LysM-containing proteins display a wide range of complex domain architectures. However, only a simple core architecture is conserved across kingdoms. Each individual kingdom appears to have evolved a distinct array of domain architectures. We show that early plant lineages acquired four characteristic architectures and progressively lost several primitive architectures. We report plant LysM phylogenies and associated gene, protein and genomic features, and infer the relative timing of duplications of LYK genes. We report a domain architecture catalogue of LysM proteins across all kingdoms. The unique pattern of LysM protein domain architectures indicates the presence of distinctive evolutionary paths in individual kingdoms. We describe a comparative and evolutionary genomics study of LysM genes in plant kingdom. One of the two groups of tandemly arrayed plant LYK genes likely resulted from an ancient genome duplication followed by local genomic rearrangement, while the origin of the other groups of tandemly arrayed LYK genes remains obscure. Given the fact that no animal LysM motif-containing genes have been functionally characterized, this study provides clues to functional characterization of plant LysM genes and is also informative with regard to evolutionary and functional studies of animal LysM genes.

  16. TARGETED CAPTURE IN EVOLUTIONARY AND ECOLOGICAL GENOMICS

    PubMed Central

    Jones, Matthew R.; Good, Jeffrey M.

    2016-01-01

    The rapid expansion of next-generation sequencing has yielded a powerful array of tools to address fundamental biological questions at a scale that was inconceivable just a few years ago. Various genome partitioning strategies to sequence select subsets of the genome have emerged as powerful alternatives to whole genome sequencing in ecological and evolutionary genomic studies. High throughput targeted capture is one such strategy that involves the parallel enrichment of pre-selected genomic regions of interest. The growing use of targeted capture demonstrates its potential power to address a range of research questions, yet these approaches have yet to expand broadly across labs focused on evolutionary and ecological genomics. In part, the use of targeted capture has been hindered by the logistics of capture design and implementation in species without established reference genomes. Here we aim to 1) increase the accessibility of targeted capture to researchers working in non-model taxa by discussing capture methods that circumvent the need of a reference genome, 2) highlight the evolutionary and ecological applications where this approach is emerging as a powerful sequencing strategy, and 3) discuss the future of targeted capture and other genome partitioning approaches in light of the increasing accessibility of whole genome sequencing. Given the practical advantages and increasing feasibility of high-throughput targeted capture, we anticipate an ongoing expansion of capture-based approaches in evolutionary and ecological research, synergistic with an expansion of whole genome sequencing. PMID:26137993

  17. Genome-wide comparative analysis of NBS-encoding genes between Brassica species and Arabidopsis thaliana.

    PubMed

    Yu, Jingyin; Tehrim, Sadia; Zhang, Fengqi; Tong, Chaobo; Huang, Junyan; Cheng, Xiaohui; Dong, Caihua; Zhou, Yanqiu; Qin, Rui; Hua, Wei; Liu, Shengyi

    2014-01-03

    Plant disease resistance (R) genes with the nucleotide binding site (NBS) play an important role in offering resistance to pathogens. The availability of complete genome sequences of Brassica oleracea and Brassica rapa provides an important opportunity for researchers to identify and characterize NBS-encoding R genes in Brassica species and to compare with analogues in Arabidopsis thaliana based on a comparative genomics approach. However, little is known about the evolutionary fate of NBS-encoding genes in the Brassica lineage after split from A. thaliana. Here we present genome-wide analysis of NBS-encoding genes in B. oleracea, B. rapa and A. thaliana. Through the employment of HMM search and manual curation, we identified 157, 206 and 167 NBS-encoding genes in B. oleracea, B. rapa and A. thaliana genomes, respectively. Phylogenetic analysis among 3 species classified NBS-encoding genes into 6 subgroups. Tandem duplication and whole genome triplication (WGT) analyses revealed that after WGT of the Brassica ancestor, NBS-encoding homologous gene pairs on triplicated regions in Brassica ancestor were deleted or lost quickly, but NBS-encoding genes in Brassica species experienced species-specific gene amplification by tandem duplication after divergence of B. rapa and B. oleracea. Expression profiling of NBS-encoding orthologous gene pairs indicated the differential expression pattern of retained orthologous gene copies in B. oleracea and B. rapa. Furthermore, evolutionary analysis of CNL type NBS-encoding orthologous gene pairs among 3 species suggested that orthologous genes in B. rapa species have undergone stronger negative selection than those in B .oleracea species. But for TNL type, there are no significant differences in the orthologous gene pairs between the two species. This study is first identification and characterization of NBS-encoding genes in B. rapa and B. oleracea based on whole genome sequences. Through tandem duplication and whole genome triplication analysis in B. oleracea, B. rapa and A. thaliana genomes, our study provides insight into the evolutionary history of NBS-encoding genes after divergence of A. thaliana and the Brassica lineage. These results together with expression pattern analysis of NBS-encoding orthologous genes provide useful resource for functional characterization of these genes and genetic improvement of relevant crops.

  18. Teaching the Toolkit: A Laboratory Series to Demonstrate the Evolutionary Conservation of Metazoan Cell Signaling Pathways

    ERIC Educational Resources Information Center

    LeClair, Elizabeth E.

    2008-01-01

    A major finding of comparative genomics and developmental genetics is that metazoans share certain conserved, embryonically deployed signaling pathways that instruct cells as to their ultimate fate. Because the DNA encoding these pathways predates the evolutionary split of most animal groups, it should in principle be possible to clone…

  19. Floral gene resources from basal angiosperms for comparative genomics research

    PubMed Central

    Albert, Victor A; Soltis, Douglas E; Carlson, John E; Farmerie, William G; Wall, P Kerr; Ilut, Daniel C; Solow, Teri M; Mueller, Lukas A; Landherr, Lena L; Hu, Yi; Buzgo, Matyas; Kim, Sangtae; Yoo, Mi-Jeong; Frohlich, Michael W; Perl-Treves, Rafael; Schlarbaum, Scott E; Bliss, Barbara J; Zhang, Xiaohong; Tanksley, Steven D; Oppenheimer, David G; Soltis, Pamela S; Ma, Hong; dePamphilis, Claude W; Leebens-Mack, James H

    2005-01-01

    Background The Floral Genome Project was initiated to bridge the genomic gap between the most broadly studied plant model systems. Arabidopsis and rice, although now completely sequenced and under intensive comparative genomic investigation, are separated by at least 125 million years of evolutionary time, and cannot in isolation provide a comprehensive perspective on structural and functional aspects of flowering plant genome dynamics. Here we discuss new genomic resources available to the scientific community, comprising cDNA libraries and Expressed Sequence Tag (EST) sequences for a suite of phylogenetically basal angiosperms specifically selected to bridge the evolutionary gaps between model plants and provide insights into gene content and genome structure in the earliest flowering plants. Results Random sequencing of cDNAs from representatives of phylogenetically important eudicot, non-grass monocot, and gymnosperm lineages has so far (as of 12/1/04) generated 70,514 ESTs and 48,170 assembled unigenes. Efficient sorting of EST sequences into putative gene families based on whole Arabidopsis/rice proteome comparison has permitted ready identification of cDNA clones for finished sequencing. Preliminarily, (i) proportions of functional categories among sequenced floral genes seem representative of the entire Arabidopsis transcriptome, (ii) many known floral gene homologues have been captured, and (iii) phylogenetic analyses of ESTs are providing new insights into the process of gene family evolution in relation to the origin and diversification of the angiosperms. Conclusion Initial comparisons illustrate the utility of the EST data sets toward discovery of the basic floral transcriptome. These first findings also afford the opportunity to address a number of conspicuous evolutionary genomic questions, including reproductive organ transcriptome overlap between angiosperms and gymnosperms, genome-wide duplication history, lineage-specific gene duplication and functional divergence, and analyses of adaptive molecular evolution. Since not all genes in the floral transcriptome will be associated with flowering, these EST resources will also be of interest to plant scientists working on other functions, such as photosynthesis, signal transduction, and metabolic pathways. PMID:15799777

  20. Comparative Genomics Study of Multi-Drug-Resistance Mechanisms in the Antibiotic-Resistant Streptococcus suis R61 Strain

    PubMed Central

    Zhang, Anding; Wu, Jiayan; Chen, Bo; Hua, Yafeng; Yu, Jun; Chen, Huanchun; Xiao, Jingfa; Jin, Meilin

    2011-01-01

    Background Streptococcus suis infections are a serious problem for both humans and pigs worldwide. The emergence and increasing prevalence of antibiotic-resistant S. suis strains pose significant clinical and societal challenges. Results In our study, we sequenced one multi-drug-resistant S. suis strain, R61, and one S. suis strain, A7, which is fully sensitive to all tested antibiotics. Comparative genomic analysis revealed that the R61 strain is phylogenetically distinct from other S. suis strains, and the genome of R61 exhibits extreme levels of evolutionary plasticity with high levels of gene gain and loss. Our results indicate that the multi-drug-resistant strain R61 has evolved three main categories of resistance. Conclusions Comparative genomic analysis of S. suis strains with diverse drug-resistant phenotypes provided evidence that horizontal gene transfer is an important evolutionary force in shaping the genome of multi-drug-resistant strain R61. In this study, we discovered novel and previously unexamined mutations that are strong candidates for conferring drug resistance. We believe that these mutations will provide crucial clues for designing new drugs against this pathogen. In addition, our work provides a clear demonstration that the use of drugs has driven the emergence of the multi-drug-resistant strain R61. PMID:21966396

  1. Comparative genomics study of multi-drug-resistance mechanisms in the antibiotic-resistant Streptococcus suis R61 strain.

    PubMed

    Hu, Pan; Yang, Ming; Zhang, Anding; Wu, Jiayan; Chen, Bo; Hua, Yafeng; Yu, Jun; Chen, Huanchun; Xiao, Jingfa; Jin, Meilin

    2011-01-01

    Streptococcus suis infections are a serious problem for both humans and pigs worldwide. The emergence and increasing prevalence of antibiotic-resistant S. suis strains pose significant clinical and societal challenges. In our study, we sequenced one multi-drug-resistant S. suis strain, R61, and one S. suis strain, A7, which is fully sensitive to all tested antibiotics. Comparative genomic analysis revealed that the R61 strain is phylogenetically distinct from other S. suis strains, and the genome of R61 exhibits extreme levels of evolutionary plasticity with high levels of gene gain and loss. Our results indicate that the multi-drug-resistant strain R61 has evolved three main categories of resistance. Comparative genomic analysis of S. suis strains with diverse drug-resistant phenotypes provided evidence that horizontal gene transfer is an important evolutionary force in shaping the genome of multi-drug-resistant strain R61. In this study, we discovered novel and previously unexamined mutations that are strong candidates for conferring drug resistance. We believe that these mutations will provide crucial clues for designing new drugs against this pathogen. In addition, our work provides a clear demonstration that the use of drugs has driven the emergence of the multi-drug-resistant strain R61.

  2. Arabidopsis transcription factors: genome-wide comparative analysis among eukaryotes.

    PubMed

    Riechmann, J L; Heard, J; Martin, G; Reuber, L; Jiang, C; Keddie, J; Adam, L; Pineda, O; Ratcliffe, O J; Samaha, R R; Creelman, R; Pilgrim, M; Broun, P; Zhang, J Z; Ghandehari, D; Sherman, B K; Yu, G

    2000-12-15

    The completion of the Arabidopsis thaliana genome sequence allows a comparative analysis of transcriptional regulators across the three eukaryotic kingdoms. Arabidopsis dedicates over 5% of its genome to code for more than 1500 transcription factors, about 45% of which are from families specific to plants. Arabidopsis transcription factors that belong to families common to all eukaryotes do not share significant similarity with those of the other kingdoms beyond the conserved DNA binding domains, many of which have been arranged in combinations specific to each lineage. The genome-wide comparison reveals the evolutionary generation of diversity in the regulation of transcription.

  3. Partial Shotgun Sequencing of the Boechera stricta Genome Reveals Extensive Microsynteny and Promoter Conservation with Arabidopsis1[W

    PubMed Central

    Windsor, Aaron J.; Schranz, M. Eric; Formanová, Nataša; Gebauer-Jung, Steffi; Bishop, John G.; Schnabelrauch, Domenica; Kroymann, Juergen; Mitchell-Olds, Thomas

    2006-01-01

    Comparative genomics provides insight into the evolutionary dynamics that shape discrete sequences as well as whole genomes. To advance comparative genomics within the Brassicaceae, we have end sequenced 23,136 medium-sized insert clones from Boechera stricta, a wild relative of Arabidopsis (Arabidopsis thaliana). A significant proportion of these sequences, 18,797, are nonredundant and display highly significant similarity (BLASTn e-value ≤ 10−30) to low copy number Arabidopsis genomic regions, including more than 9,000 annotated coding sequences. We have used this dataset to identify orthologous gene pairs in the two species and to perform a global comparison of DNA regions 5′ to annotated coding regions. On average, the 500 nucleotides upstream to coding sequences display 71.4% identity between the two species. In a similar analysis, 61.4% identity was observed between 5′ noncoding sequences of Brassica oleracea and Arabidopsis, indicating that regulatory regions are not as diverged among these lineages as previously anticipated. By mapping the B. stricta end sequences onto the Arabidopsis genome, we have identified nearly 2,000 conserved blocks of microsynteny (bracketing 26% of the Arabidopsis genome). A comparison of fully sequenced B. stricta inserts to their homologous Arabidopsis genomic regions indicates that indel polymorphisms >5 kb contribute substantially to the genome size difference observed between the two species. Further, we demonstrate that microsynteny inferred from end-sequence data can be applied to the rapid identification and cloning of genomic regions of interest from nonmodel species. These results suggest that among diploid relatives of Arabidopsis, small- to medium-scale shotgun sequencing approaches can provide rapid and cost-effective benefits to evolutionary and/or functional comparative genomic frameworks. PMID:16607030

  4. Using genomics to characterize evolutionary potential for conservation of wild populations

    PubMed Central

    Harrisson, Katherine A; Pavlova, Alexandra; Telonis-Scott, Marina; Sunnucks, Paul

    2014-01-01

    Genomics promises exciting advances towards the important conservation goal of maximizing evolutionary potential, notwithstanding associated challenges. Here, we explore some of the complexity of adaptation genetics and discuss the strengths and limitations of genomics as a tool for characterizing evolutionary potential in the context of conservation management. Many traits are polygenic and can be strongly influenced by minor differences in regulatory networks and by epigenetic variation not visible in DNA sequence. Much of this critical complexity is difficult to detect using methods commonly used to identify adaptive variation, and this needs appropriate consideration when planning genomic screens, and when basing management decisions on genomic data. When the genomic basis of adaptation and future threats are well understood, it may be appropriate to focus management on particular adaptive traits. For more typical conservations scenarios, we argue that screening genome-wide variation should be a sensible approach that may provide a generalized measure of evolutionary potential that accounts for the contributions of small-effect loci and cryptic variation and is robust to uncertainty about future change and required adaptive response(s). The best conservation outcomes should be achieved when genomic estimates of evolutionary potential are used within an adaptive management framework. PMID:25553064

  5. Comparative Methylome Analyses Identify Epigenetic Regulatory Loci of Human Brain Evolution

    PubMed Central

    Mendizabal, Isabel; Shi, Lei; Keller, Thomas E.; Konopka, Genevieve; Preuss, Todd M.; Hsieh, Tzung-Fu; Hu, Enzhi; Zhang, Zhe; Su, Bing; Yi, Soojin V.

    2016-01-01

    How do epigenetic modifications change across species and how do these modifications affect evolution? These are fundamental questions at the forefront of our evolutionary epigenomic understanding. Our previous work investigated human and chimpanzee brain methylomes, but it was limited by the lack of outgroup data which is critical for comparative (epi)genomic studies. Here, we compared whole genome DNA methylation maps from brains of humans, chimpanzees and also rhesus macaques (outgroup) to elucidate DNA methylation changes during human brain evolution. Moreover, we validated that our approach is highly robust by further examining 38 human-specific DMRs using targeted deep genomic and bisulfite sequencing in an independent panel of 37 individuals from five primate species. Our unbiased genome-scan identified human brain differentially methylated regions (DMRs), irrespective of their associations with annotated genes. Remarkably, over half of the newly identified DMRs locate in intergenic regions or gene bodies. Nevertheless, their regulatory potential is on par with those of promoter DMRs. An intriguing observation is that DMRs are enriched in active chromatin loops, suggesting human-specific evolutionary remodeling at a higher-order chromatin structure. These findings indicate that there is substantial reprogramming of epigenomic landscapes during human brain evolution involving noncoding regions. PMID:27563052

  6. Genome-Wide Search Identifies 1.9 Mb from the Polar Bear Y Chromosome for Evolutionary Analyses

    PubMed Central

    Bidon, Tobias; Schreck, Nancy; Hailer, Frank; Nilsson, Maria A.; Janke, Axel

    2015-01-01

    The male-inherited Y chromosome is the major haploid fraction of the mammalian genome, rendering Y-linked sequences an indispensable resource for evolutionary research. However, despite recent large-scale genome sequencing approaches, only a handful of Y chromosome sequences have been characterized to date, mainly in model organisms. Using polar bear (Ursus maritimus) genomes, we compare two different in silico approaches to identify Y-linked sequences: 1) Similarity to known Y-linked genes and 2) difference in the average read depth of autosomal versus sex chromosomal scaffolds. Specifically, we mapped available genomic sequencing short reads from a male and a female polar bear against the reference genome and identify 112 Y-chromosomal scaffolds with a combined length of 1.9 Mb. We verified the in silico findings for the longer polar bear scaffolds by male-specific in vitro amplification, demonstrating the reliability of the average read depth approach. The obtained Y chromosome sequences contain protein-coding sequences, single nucleotide polymorphisms, microsatellites, and transposable elements that are useful for evolutionary studies. A high-resolution phylogeny of the polar bear patriline shows two highly divergent Y chromosome lineages, obtained from analysis of the identified Y scaffolds in 12 previously published male polar bear genomes. Moreover, we find evidence of gene conversion among ZFX and ZFY sequences in the giant panda lineage and in the ancestor of ursine and tremarctine bears. Thus, the identification of Y-linked scaffold sequences from unordered genome sequences yields valuable data to infer phylogenomic and population-genomic patterns in bears. PMID:26019166

  7. Large Diversity of Nonstandard Genes and Dynamic Evolution of Chloroplast Genomes in Siphonous Green Algae (Bryopsidales, Chlorophyta)

    PubMed Central

    Leliaert, Frederik; Marcelino, Vanessa R

    2018-01-01

    Abstract Chloroplast genomes have undergone tremendous alterations through the evolutionary history of the green algae (Chloroplastida). This study focuses on the evolution of chloroplast genomes in the siphonous green algae (order Bryopsidales). We present five new chloroplast genomes, which along with existing sequences, yield a data set representing all but one families of the order. Using comparative phylogenetic methods, we investigated the evolutionary dynamics of genomic features in the order. Our results show extensive variation in chloroplast genome architecture and intron content. Variation in genome size is accounted for by the amount of intergenic space and freestanding open reading frames that do not show significant homology to standard plastid genes. We show the diversity of these nonstandard genes based on their conserved protein domains, which are often associated with mobile functions (reverse transcriptase/intron maturase, integrases, phage- or plasmid-DNA primases, transposases, integrases, ligases). Investigation of the introns showed proliferation of group II introns in the early evolution of the order and their subsequent loss in the core Halimedineae, possibly through RT-mediated intron loss. PMID:29635329

  8. Evolutionary paths of streptococcal and staphylococcal superantigens

    PubMed Central

    2012-01-01

    Background Streptococcus pyogenes (GAS) harbors several superantigens (SAgs) in the prophage region of its genome, although speG and smez are not located in this region. The diversity of SAgs is thought to arise during horizontal transfer, but their evolutionary pathways have not yet been determined. We recently completed sequencing the entire genome of S. dysgalactiae subsp. equisimilis (SDSE), the closest relative of GAS. Although speG is the only SAg gene of SDSE, speG was present in only 50% of clinical SDSE strains and smez in none. In this study, we analyzed the evolutionary paths of streptococcal and staphylococcal SAgs. Results We compared the sequences of the 12–60 kb speG regions of nine SDSE strains, five speG+ and four speG–. We found that the synteny of this region was highly conserved, whether or not the speG gene was present. Synteny analyses based on genome-wide comparisons of GAS and SDSE indicated that speG is the direct descendant of a common ancestor of streptococcal SAgs, whereas smez was deleted from SDSE after SDSE and GAS split from a common ancestor. Cumulative nucleotide skew analysis of SDSE genomes suggested that speG was located outside segments of steeper slopes than the stable region in the genome, whereas the region flanking smez was unstable, as expected from the results of GAS. We also detected a previously undescribed staphylococcal SAg gene, selW, and a staphylococcal SAg -like gene, ssl, in the core genomes of all Staphylococcus aureus strains sequenced. Amino acid substitution analyses, based on dN/dS window analysis of the products encoded by speG, selW and ssl suggested that all three genes have been subjected to strong positive selection. Evolutionary analysis based on the Bayesian Markov chain Monte Carlo method showed that each clade included at least one direct descendant. Conclusions Our findings reveal a plausible model for the comprehensive evolutionary pathway of streptococcal and staphylococcal SAgs. PMID:22900646

  9. Massive gene acquisitions in Mycobacterium indicus pranii provide a perspective on mycobacterial evolution

    PubMed Central

    Saini, Vikram; Raghuvanshi, Saurabh; Khurana, Jitendra P.; Ahmed, Niyaz; Hasnain, Seyed E.; Tyagi, Akhilesh K.; Tyagi, Anil K.

    2012-01-01

    Understanding the evolutionary and genomic mechanisms responsible for turning the soil-derived saprophytic mycobacteria into lethal intracellular pathogens is a critical step towards the development of strategies for the control of mycobacterial diseases. In this context, Mycobacterium indicus pranii (MIP) is of specific interest because of its unique immunological and evolutionary significance. Evolutionarily, it is the progenitor of opportunistic pathogens belonging to M. avium complex and is endowed with features that place it between saprophytic and pathogenic species. Herein, we have sequenced the complete MIP genome to understand its unique life style, basis of immunomodulation and habitat diversification in mycobacteria. As a case of massive gene acquisitions, 50.5% of MIP open reading frames (ORFs) are laterally acquired. We show, for the first time for Mycobacterium, that MIP genome has mosaic architecture. These gene acquisitions have led to the enrichment of selected gene families critical to MIP physiology. Comparative genomic analysis indicates a higher antigenic potential of MIP imparting it a unique ability for immunomodulation. Besides, it also suggests an important role of genomic fluidity in habitat diversification within mycobacteria and provides a unique view of evolutionary divergence and putative bottlenecks that might have eventually led to intracellular survival and pathogenic attributes in mycobacteria. PMID:22965120

  10. Mosaic Graphs and Comparative Genomics in Phage Communities

    PubMed Central

    Belcaid, Mahdi; Bergeron, Anne

    2010-01-01

    Abstract Comparing the genomes of two closely related viruses often produces mosaics where nearly identical sequences alternate with sequences that are unique to each genome. When several closely related genomes are compared, the unique sequences are likely to be shared with third genomes, leading to virus mosaic communities. Here we present comparative analysis of sets of Staphylococcus aureus phages that share large identical sequences with up to three other genomes, and with different partners along their genomes. We introduce mosaic graphs to represent these complex recombination events, and use them to illustrate the breath and depth of sequence sharing: some genomes are almost completely made up of shared sequences, while genomes that share very large identical sequences can adopt alternate functional modules. Mosaic graphs also allow us to identify breakpoints that could eventually be used for the construction of recombination networks. These findings have several implications on phage metagenomics assembly, on the horizontal gene transfer paradigm, and more generally on the understanding of the composition and evolutionary dynamics of virus communities. PMID:20874413

  11. CnidBase: The Cnidarian Evolutionary Genomics Database

    PubMed Central

    Ryan, Joseph F.; Finnerty, John R.

    2003-01-01

    CnidBase, the Cnidarian Evolutionary Genomics Database, is a tool for investigating the evolutionary, developmental and ecological factors that affect gene expression and gene function in cnidarians. In turn, CnidBase will help to illuminate the role of specific genes in shaping cnidarian biodiversity in the present day and in the distant past. CnidBase highlights evolutionary changes between species within the phylum Cnidaria and structures genomic and expression data to facilitate comparisons to non-cnidarian metazoans. CnidBase aims to further the progress that has already been made in the realm of cnidarian evolutionary genomics by creating a central community resource which will help drive future research and facilitate more accurate classification and comparison of new experimental data with existing data. CnidBase is available at http://cnidbase.bu.edu/. PMID:12519972

  12. The complete mitochondrial genome of Arctic Calanus hyperboreus (Copepoda, Calanoida) reveals characteristic patterns in calanoid mitochondrial genome.

    PubMed

    Kim, Sanghee; Lim, Byung-Jin; Min, Gi-Sik; Choi, Han-Gu

    2013-05-10

    Copepoda is the most diverse and abundant group of crustaceans, but its phylogenetic relationships are ambiguous. Mitochondrial (mt) genomes are useful for studying evolutionary history, but only six complete Copepoda mt genomes have been made available and these have extremely rearranged genome structures. This study determined the mt genome of Calanus hyperboreus, making it the first reported Arctic copepod mt genome and the first complete mt genome of a calanoid copepod. The mt genome of C. hyperboreus is 17,910 bp in length and it contains the entire set of 37 mt genes, including 13 protein-coding genes, 2 rRNAs, and 22 tRNAs. It has a very unusual gene structure, including the longest control region reported for a crustacean, a large tRNA gene cluster, and reversed GC skews in 11 out of 13 protein-coding genes (84.6%). Despite the unusual features, comparing this genome to published copepod genomes revealed retained pan-crustacean features, as well as a conserved calanoid-specific pattern. Our data provide a foundation for exploring the calanoid pattern and the mechanisms of mt gene rearrangement in the evolutionary history of the copepod mt genome. Copyright © 2012 Elsevier B.V. All rights reserved.

  13. Genome Sequence of Fusarium oxysporum f. sp. melonis, a fungus causing wilt disease on melon

    USDA-ARS?s Scientific Manuscript database

    This manuscript reports the genome sequence of F. oxysporum f. sp. melonis, a fungal pathogen that causes Fusarium wilt disease on melon (Cucumis melo). The project is part of a large comparative study designed to explore the genetic composition and evolutionary origin of this group of horizontally ...

  14. Genome sequence of Fusarium oxysporum f. sp. melonis, a fungus causing wilt disease on melon

    USDA-ARS?s Scientific Manuscript database

    This manuscript reports the genome sequence of F. oxysporum f. sp. melonis, a fungal pathogen that causes Fusarium wilt disease on melon (Cucumis melo). The project is part of a large comparative study designed to explore the genetic composition and evolutionary origin of this group of horizontally ...

  15. Evolutionary trajectories of snake genes and genomes revealed by comparative analyses of five-pacer viper

    PubMed Central

    Yin, Wei; Wang, Zong-ji; Li, Qi-ye; Lian, Jin-ming; Zhou, Yang; Lu, Bing-zheng; Jin, Li-jun; Qiu, Peng-xin; Zhang, Pei; Zhu, Wen-bo; Wen, Bo; Huang, Yi-jun; Lin, Zhi-long; Qiu, Bi-tao; Su, Xing-wen; Yang, Huan-ming; Zhang, Guo-jie; Yan, Guang-mei; Zhou, Qi

    2016-01-01

    Snakes have numerous features distinctive from other tetrapods and a rich history of genome evolution that is still obscure. Here, we report the high-quality genome of the five-pacer viper, Deinagkistrodon acutus, and comparative analyses with other representative snake and lizard genomes. We map the evolutionary trajectories of transposable elements (TEs), developmental genes and sex chromosomes onto the snake phylogeny. TEs exhibit dynamic lineage-specific expansion, and many viper TEs show brain-specific gene expression along with their nearby genes. We detect signatures of adaptive evolution in olfactory, venom and thermal-sensing genes and also functional degeneration of genes associated with vision and hearing. Lineage-specific relaxation of functional constraints on respective Hox and Tbx limb-patterning genes supports fossil evidence for a successive loss of forelimbs then hindlimbs during snake evolution. Finally, we infer that the ZW sex chromosome pair had undergone at least three recombination suppression events in the ancestor of advanced snakes. These results altogether forge a framework for our deep understanding into snakes' history of molecular evolution. PMID:27708285

  16. Hemimetabolous genomes reveal molecular basis of termite eusociality.

    PubMed

    Harrison, Mark C; Jongepier, Evelien; Robertson, Hugh M; Arning, Nicolas; Bitard-Feildel, Tristan; Chao, Hsu; Childers, Christopher P; Dinh, Huyen; Doddapaneni, Harshavardhan; Dugan, Shannon; Gowin, Johannes; Greiner, Carolin; Han, Yi; Hu, Haofu; Hughes, Daniel S T; Huylmans, Ann-Kathrin; Kemena, Carsten; Kremer, Lukas P M; Lee, Sandra L; Lopez-Ezquerra, Alberto; Mallet, Ludovic; Monroy-Kuhn, Jose M; Moser, Annabell; Murali, Shwetha C; Muzny, Donna M; Otani, Saria; Piulachs, Maria-Dolors; Poelchau, Monica; Qu, Jiaxin; Schaub, Florentine; Wada-Katsumata, Ayako; Worley, Kim C; Xie, Qiaolin; Ylla, Guillem; Poulsen, Michael; Gibbs, Richard A; Schal, Coby; Richards, Stephen; Belles, Xavier; Korb, Judith; Bornberg-Bauer, Erich

    2018-03-01

    Around 150 million years ago, eusocial termites evolved from within the cockroaches, 50 million years before eusocial Hymenoptera, such as bees and ants, appeared. Here, we report the 2-Gb genome of the German cockroach, Blattella germanica, and the 1.3-Gb genome of the drywood termite Cryptotermes secundus. We show evolutionary signatures of termite eusociality by comparing the genomes and transcriptomes of three termites and the cockroach against the background of 16 other eusocial and non-eusocial insects. Dramatic adaptive changes in genes underlying the production and perception of pheromones confirm the importance of chemical communication in the termites. These are accompanied by major changes in gene regulation and the molecular evolution of caste determination. Many of these results parallel molecular mechanisms of eusocial evolution in Hymenoptera. However, the specific solutions are remarkably different, thus revealing a striking case of convergence in one of the major evolutionary transitions in biological complexity.

  17. Comparative Genome Sequence Analysis of the Bpa/Str Region in Mouse and Man

    PubMed Central

    Mallon, A.-M.; Platzer, M.; Bate, R.; Gloeckner, G.; Botcherby, M.R.M.; Nordsiek, G.; Strivens, M.A.; Kioschis, P.; Dangel, A.; Cunningham, D.; Straw, R.N.A.; Weston, P.; Gilbert, M.; Fernando, S.; Goodall, K.; Hunter, G.; Greystrong, J.S.; Clarke, D.; Kimberley, C.; Goerdes, M.; Blechschmidt, K.; Rump, A.; Hinzmann, B.; Mundy, C.R.; Miller, W.; Poustka, A.; Herman, G.E.; Rhodes, M.; Denny, P.; Rosenthal, A.; Brown, S.D.M.

    2000-01-01

    The progress of human and mouse genome sequencing programs presages the possibility of systematic cross-species comparison of the two genomes as a powerful tool for gene and regulatory element identification. As the opportunities to perform comparative sequence analysis emerge, it is important to develop parameters for such analyses and to examine the outcomes of cross-species comparison. Our analysis used gene prediction and a database search of 430 kb of genomic sequence covering the Bpa/Str region of the mouse X chromosome, and 745 kb of genomic sequence from the homologous human X chromosome region. We identified 11 genes in mouse and 13 genes and two pseudogenes in human. In addition, we compared the mouse and human sequences using pairwise alignment and searches for evolutionary conserved regions (ECRs) exceeding a defined threshold of sequence identity. This approach aided the identification of at least four further putative conserved genes in the region. Comparative sequencing revealed that this region is a mosaic in evolutionary terms, with considerably more rearrangement between the two species than realized previously from comparative mapping studies. Surprisingly, this region showed an extremely high LINE and low SINE content, low G+C content, and yet a relatively high gene density, in contrast to the low gene density usually associated with such regions. [The sequence data described in this paper have been submitted to EMBL under the following accession nos.: Mouse Genomic Sequence: Mouse contig A (AL021127), Mouse contig B (AL049866), BAC41M10 (AL136328), PAC303O11(AL136329). Human Genomic Sequence: Human contig 1 (U82671, U82670), Human contig 2 (U82695).] PMID:10854409

  18. Reconstruction of a composite comparative map composed of ten legume genomes.

    PubMed

    Lee, Chaeyoung; Yu, Dongwoon; Choi, Hong-Kyu; Kim, Ryan W

    2017-01-01

    The Fabaceae (legume family) is the third largest and the second of agricultural importance among flowering plant groups. In this study, we report the reconstruction of a composite comparative map composed of ten legume genomes, including seven species from the galegoid clade ( Medicago truncatula , Medicago sativa , Lens culinaris, Pisum sativum , Lotus japonicus , Cicer arietinum , Vicia faba ) and three species from the phaseoloid clade ( Vigna radiata , Phaseolus vulgaris , Glycine max ). To accomplish this comparison, a total of 209 cross-species gene-derived markers were employed. The comparative analysis resulted in a single extensive genetic/genomic network composed of 93 chromosomes or linkage groups, from which 110 synteny blocks and other evolutionary events (e.g., 13 inversions) were identified. This comparative map also allowed us to deduce several large scale evolutionary events, such as chromosome fusion/fission, with which might explain differences in chromosome numbers among compared species or between the two clades. As a result, useful properties of cross-species genic markers were re-verified as an efficient tool for cross-species translation of genomic information, and similar approaches, combined with a high throughput bioinformatic marker design program, should be effective for applying the knowledge of trait-associated genes to other important crop species for breeding purposes. Here, we provide a basic comparative framework for the ten legume species, and expect to be usefully applied towards the crop improvement in legume breeding.

  19. Comparative genomics of grass EST libraries reveals previously uncharacterized splicing events in crop plants.

    PubMed

    Chuang, Trees-Juen; Yang, Min-Yu; Lin, Chuang-Chieh; Hsieh, Ping-Hung; Hung, Li-Yuan

    2015-02-05

    Crop plants such as rice, maize and sorghum play economically-important roles as main sources of food, fuel, and animal feed. However, current genome annotations of crop plants still suffer false-positive predictions; a more comprehensive registry of alternative splicing (AS) events is also in demand. Comparative genomics of crop plants is largely unexplored. We performed a large-scale comparative analysis (ExonFinder) of the expressed sequence tag (EST) library from nine grass plants against three crop genomes (rice, maize, and sorghum) and identified 2,879 previously-unannotated exons (i.e., novel exons) in the three crops. We validated 81% of the tested exons by RT-PCR-sequencing, supporting the effectiveness of our in silico strategy. Evolutionary analysis reveals that the novel exons, comparing with their flanking annotated ones, are generally under weaker selection pressure at the protein level, but under stronger pressure at the RNA level, suggesting that most of the novel exons also represent novel alternatively spliced variants (ASVs). However, we also observed the consistency of evolutionary rates between certain novel exons and their flanking exons, which provided further evidence of their co-occurrence in the transcripts, suggesting that previously-annotated isoforms might be subject to erroneous predictions. Our validation showed that 54% of the tested genes expressed the newly-identified isoforms that contained the novel exons, rather than the previously-annotated isoforms that excluded them. The consistent results were steadily observed across cultivated (Oryza sativa and O. glaberrima) and wild (O. rufipogon and O. nivara) rice species, asserting the necessity of our curation of the crop genome annotations. Our comparative analyses also inferred the common ancestral transcriptome of grass plants and gain- and loss-of-ASV events. We have reannotated the rice, maize, and sorghum genomes, and showed that evolutionary rates might serve as an indicator for determining whether the identified exons were alternatively spliced. This study not only presents an effective in silico strategy for the improvement of plant annotations, but also provides further insights into the role of AS events in the evolution and domestication of crop plants. ExonFinder and the novel exons/ASVs identified are publicly accessible at http://exonfinder.sourceforge.net/ .

  20. Tempo and mode of genomic mutations unveil human evolutionary history.

    PubMed

    Hara, Yuichiro

    2015-01-01

    Mutations that have occurred in human genomes provide insight into various aspects of evolutionary history such as speciation events and degrees of natural selection. Comparing genome sequences between human and great apes or among humans is a feasible approach for inferring human evolutionary history. Recent advances in high-throughput or so-called 'next-generation' DNA sequencing technologies have enabled the sequencing of thousands of individual human genomes, as well as a variety of reference genomes of hominids, many of which are publicly available. These sequence data can help to unveil the detailed demographic history of the lineage leading to humans as well as the explosion of modern human population size in the last several thousand years. In addition, high-throughput sequencing illustrates the tempo and mode of de novo mutations, which are producing human genetic variation at this moment. Pedigree-based human genome sequencing has shown that mutation rates vary significantly across the human genome. These studies have also provided an improved timescale of human evolution, because the mutation rate estimated from pedigree analysis is half that estimated from traditional analyses based on molecular phylogeny. Because of the dramatic reduction in sequencing cost, sequencing on-demand samples designed for specific studies is now also becoming popular. To produce data of sufficient quality to meet the requirements of the study, it is necessary to set an explicit sequencing plan that includes the choice of sample collection methods, sequencing platforms, and number of sequence reads.

  1. Whole genome comparisons of Fragaria, Prunus and Malus reveal different modes of evolution between Rosaceous subfamilies

    PubMed Central

    2012-01-01

    Background Rosaceae include numerous economically important and morphologically diverse species. Comparative mapping between the member species in Rosaceae have indicated some level of synteny. Recently the whole genome of three crop species, peach, apple and strawberry, which belong to different genera of the Rosaceae family, have been sequenced, allowing in-depth comparison of these genomes. Results Our analysis using the whole genome sequences of peach, apple and strawberry identified 1399 orthologous regions between the three genomes, with a mean length of around 100 kb. Each peach chromosome showed major orthology mostly to one strawberry chromosome, but to more than two apple chromosomes, suggesting that the apple genome went through more chromosomal fissions in addition to the whole genome duplication after the divergence of the three genera. However, the distribution of contiguous ancestral regions, identified using the multiple genome rearrangements and ancestors (MGRA) algorithm, suggested that the Fragaria genome went through a greater number of small scale rearrangements compared to the other genomes since they diverged from a common ancestor. Using the contiguous ancestral regions, we reconstructed a hypothetical ancestral genome for the Rosaceae 7 composed of nine chromosomes and propose the evolutionary steps from the ancestral genome to the extant Fragaria, Prunus and Malus genomes. Conclusion Our analysis shows that different modes of evolution may have played major roles in different subfamilies of Rosaceae. The hypothetical ancestral genome of Rosaceae and the evolutionary steps that lead to three different lineages of Rosaceae will facilitate our understanding of plant genome evolution as well as have a practical impact on knowledge transfer among member species of Rosaceae. PMID:22475018

  2. Random genetic drift, natural selection, and noise in human cranial evolution.

    PubMed

    Roseman, Charles C

    2016-08-01

    This study assesses the extent to which relationships among groups complicate comparative studies of adaptation in recent human cranial variation and the extent to which departures from neutral additive models of evolution hinder the reconstruction of population relationships among groups using cranial morphology. Using a maximum likelihood evolutionary model fitting approach and a mixed population genomic and cranial data set, I evaluate the relative fits of several widely used models of human cranial evolution. Moreover, I compare the goodness of fit of models of cranial evolution constrained by genomic variation to test hypotheses about population specific departures from neutrality. Models from population genomics are much better fits to cranial variation than are traditional models from comparative human biology. There is not enough evolutionary information in the cranium to reconstruct much of recent human evolution but the influence of population history on cranial variation is strong enough to cause comparative studies of adaptation serious difficulties. Deviations from a model of random genetic drift along a tree-like population history show the importance of environmental effects, gene flow, and/or natural selection on human cranial variation. Moreover, there is a strong signal of the effect of natural selection or an environmental factor on a group of humans from Siberia. The evolution of the human cranium is complex and no one evolutionary process has prevailed at the expense of all others. A holistic unification of phenome, genome, and environmental context, gives us a strong point of purchase on these problems, which is unavailable to any one traditional approach alone. Am J Phys Anthropol 160:582-592, 2016. © 2016 Wiley Periodicals, Inc. © 2016 Wiley Periodicals, Inc.

  3. Maintaining replication origins in the face of genomic change.

    PubMed

    Di Rienzi, Sara C; Lindstrom, Kimberly C; Mann, Tobias; Noble, William S; Raghuraman, M K; Brewer, Bonita J

    2012-10-01

    Origins of replication present a paradox to evolutionary biologists. As a collection, they are absolutely essential genomic features, but individually are highly redundant and nonessential. It is therefore difficult to predict to what extent and in what regard origins are conserved over evolutionary time. Here, through a comparative genomic analysis of replication origins and chromosomal replication patterns in the budding yeasts Saccharomyces cerevisiae and Lachancea waltii, we assess to what extent replication origins survived genomic change produced from 150 million years of evolution. We find that L. waltii origins exhibit a core consensus sequence and nucleosome occupancy pattern highly similar to those of S. cerevisiae origins. We further observe that the overall progression of chromosomal replication is similar between L. waltii and S. cerevisiae. Nevertheless, few origins show evidence of being conserved in location between the two species. Among the conserved origins are those surrounding centromeres and adjacent to histone genes, suggesting that proximity to an origin may be important for their regulation. We conclude that, over evolutionary time, origins maintain sequence, structure, and regulation, but are continually being created and destroyed, with the result that their locations are generally not conserved.

  4. Maintaining replication origins in the face of genomic change

    PubMed Central

    Di Rienzi, Sara C.; Lindstrom, Kimberly C.; Mann, Tobias; Noble, William S.; Raghuraman, M.K.; Brewer, Bonita J.

    2012-01-01

    Origins of replication present a paradox to evolutionary biologists. As a collection, they are absolutely essential genomic features, but individually are highly redundant and nonessential. It is therefore difficult to predict to what extent and in what regard origins are conserved over evolutionary time. Here, through a comparative genomic analysis of replication origins and chromosomal replication patterns in the budding yeasts Saccharomyces cerevisiae and Lachancea waltii, we assess to what extent replication origins survived genomic change produced from 150 million years of evolution. We find that L. waltii origins exhibit a core consensus sequence and nucleosome occupancy pattern highly similar to those of S. cerevisiae origins. We further observe that the overall progression of chromosomal replication is similar between L. waltii and S. cerevisiae. Nevertheless, few origins show evidence of being conserved in location between the two species. Among the conserved origins are those surrounding centromeres and adjacent to histone genes, suggesting that proximity to an origin may be important for their regulation. We conclude that, over evolutionary time, origins maintain sequence, structure, and regulation, but are continually being created and destroyed, with the result that their locations are generally not conserved. PMID:22665441

  5. Polyploidy can drive rapid adaptation in yeast

    NASA Astrophysics Data System (ADS)

    Selmecki, Anna M.; Maruvka, Yosef E.; Richmond, Phillip A.; Guillet, Marie; Shoresh, Noam; Sorenson, Amber L.; de, Subhajyoti; Kishony, Roy; Michor, Franziska; Dowell, Robin; Pellman, David

    2015-03-01

    Polyploidy is observed across the tree of life, yet its influence on evolution remains incompletely understood. Polyploidy, usually whole-genome duplication, is proposed to alter the rate of evolutionary adaptation. This could occur through complex effects on the frequency or fitness of beneficial mutations. For example, in diverse cell types and organisms, immediately after a whole-genome duplication, newly formed polyploids missegregate chromosomes and undergo genetic instability. The instability following whole-genome duplications is thought to provide adaptive mutations in microorganisms and can promote tumorigenesis in mammalian cells. Polyploidy may also affect adaptation independently of beneficial mutations through ploidy-specific changes in cell physiology. Here we perform in vitro evolution experiments to test directly whether polyploidy can accelerate evolutionary adaptation. Compared with haploids and diploids, tetraploids undergo significantly faster adaptation. Mathematical modelling suggests that rapid adaptation of tetraploids is driven by higher rates of beneficial mutations with stronger fitness effects, which is supported by whole-genome sequencing and phenotypic analyses of evolved clones. Chromosome aneuploidy, concerted chromosome loss, and point mutations all provide large fitness gains. We identify several mutations whose beneficial effects are manifest specifically in the tetraploid strains. Together, these results provide direct quantitative evidence that in some environments polyploidy can accelerate evolutionary adaptation.

  6. Genome-wide investigation reveals high evolutionary rates in annual model plants.

    PubMed

    Yue, Jia-Xing; Li, Jinpeng; Wang, Dan; Araki, Hitoshi; Tian, Dacheng; Yang, Sihai

    2010-11-09

    Rates of molecular evolution vary widely among species. While significant deviations from molecular clock have been found in many taxa, effects of life histories on molecular evolution are not fully understood. In plants, annual/perennial life history traits have long been suspected to influence the evolutionary rates at the molecular level. To date, however, the number of genes investigated on this subject is limited and the conclusions are mixed. To evaluate the possible heterogeneity in evolutionary rates between annual and perennial plants at the genomic level, we investigated 85 nuclear housekeeping genes, 10 non-housekeeping families, and 34 chloroplast genes using the genomic data from model plants including Arabidopsis thaliana and Medicago truncatula for annuals and grape (Vitis vinifera) and popular (Populus trichocarpa) for perennials. According to the cross-comparisons among the four species, 74-82% of the nuclear genes and 71-97% of the chloroplast genes suggested higher rates of molecular evolution in the two annuals than those in the two perennials. The significant heterogeneity in evolutionary rate between annuals and perennials was consistently found both in nonsynonymous sites and synonymous sites. While a linear correlation of evolutionary rates in orthologous genes between species was observed in nonsynonymous sites, the correlation was weak or invisible in synonymous sites. This tendency was clearer in nuclear genes than in chloroplast genes, in which the overall evolutionary rate was small. The slope of the regression line was consistently lower than unity, further confirming the higher evolutionary rate in annuals at the genomic level. The higher evolutionary rate in annuals than in perennials appears to be a universal phenomenon both in nuclear and chloroplast genomes in the four dicot model plants we investigated. Therefore, such heterogeneity in evolutionary rate should result from factors that have genome-wide influence, most likely those associated with annual/perennial life history. Although we acknowledge current limitations of this kind of study, mainly due to a small sample size available and a distant taxonomic relationship of the model organisms, our results indicate that the genome-wide survey is a promising approach toward further understanding of the mechanism determining the molecular evolutionary rate at the genomic level.

  7. GENOME-WIDE COMPARATIVE ANALYSIS OF PHYLOGENETIC TREES: THE PROKARYOTIC FOREST OF LIFE

    PubMed Central

    Puigbò, Pere; Wolf, Yuri I.; Koonin, Eugene V.

    2013-01-01

    Genome-wide comparison of phylogenetic trees is becoming an increasingly common approach in evolutionary genomics, and a variety of approaches for such comparison have been developed. In this article we present several methods for comparative analysis of large numbers of phylogenetic trees. To compare phylogenetic trees taking into account the bootstrap support for each internal branch, the Boot-Split Distance (BSD) method is introduced as an extension of the previously developed Split Distance (SD) method for tree comparison. The BSD method implements the straightforward idea that comparison of phylogenetic trees can be made more robust by treating tree splits differentially depending on the bootstrap support. Approaches are also introduced for detecting tree-like and net-like evolutionary trends in the phylogenetic Forest of Life (FOL), i.e., the entirety of the phylogenetic trees for conserved genes of prokaryotes. The principal method employed for this purpose includes mapping quartets of species onto trees to calculate the support of each quartet topology and so to quantify the tree and net contributions to the distances between species. We describe the applications methods used to analyze the FOL and the results obtained with these methods. These results support the concept of the Tree of Life (TOL) as a central evolutionary trend in the FOL as opposed to the traditional view of the TOL as a ‘species tree’. PMID:22399455

  8. Genome-wide comparative analysis of phylogenetic trees: the prokaryotic forest of life.

    PubMed

    Puigbò, Pere; Wolf, Yuri I; Koonin, Eugene V

    2012-01-01

    Genome-wide comparison of phylogenetic trees is becoming an increasingly common approach in evolutionary genomics, and a variety of approaches for such comparison have been developed. In this article, we present several methods for comparative analysis of large numbers of phylogenetic trees. To compare phylogenetic trees taking into account the bootstrap support for each internal branch, the Boot-Split Distance (BSD) method is introduced as an extension of the previously developed Split Distance method for tree comparison. The BSD method implements the straightforward idea that comparison of phylogenetic trees can be made more robust by treating tree splits differentially depending on the bootstrap support. Approaches are also introduced for detecting tree-like and net-like evolutionary trends in the phylogenetic Forest of Life (FOL), i.e., the entirety of the phylogenetic trees for conserved genes of prokaryotes. The principal method employed for this purpose includes mapping quartets of species onto trees to calculate the support of each quartet topology and so to quantify the tree and net contributions to the distances between species. We describe the application of these methods to analyze the FOL and the results obtained with these methods. These results support the concept of the Tree of Life (TOL) as a central evolutionary trend in the FOL as opposed to the traditional view of the TOL as a "species tree."

  9. Integration of Structural Dynamics and Molecular Evolution via Protein Interaction Networks: A New Era in Genomic Medicine

    PubMed Central

    Kumar, Avishek; Butler, Brandon M.; Kumar, Sudhir; Ozkan, S. Banu

    2016-01-01

    Summary Sequencing technologies are revealing many new non-synonymous single nucleotide variants (nsSNVs) in each personal exome. To assess their functional impacts, comparative genomics is frequently employed to predict if they are benign or not. However, evolutionary analysis alone is insufficient, because it misdiagnoses many disease-associated nsSNVs, such as those at positions involved in protein interfaces, and because evolutionary predictions do not provide mechanistic insights into functional change or loss. Structural analyses can aid in overcoming both of these problems by incorporating conformational dynamics and allostery in nSNV diagnosis. Finally, protein-protein interaction networks using systems-level methodologies shed light onto disease etiology and pathogenesis. Bridging these network approaches with structurally resolved protein interactions and dynamics will advance genomic medicine. PMID:26684487

  10. Evolutionary Genomics and Adaptive Evolution of the Hedgehog Gene Family (Shh, Ihh and Dhh) in Vertebrates

    PubMed Central

    Pereira, Joana; Johnson, Warren E.; O’Brien, Stephen J.; Jarvis, Erich D.; Zhang, Guojie; Gilbert, M. Thomas P.; Vasconcelos, Vitor; Antunes, Agostinho

    2014-01-01

    The Hedgehog (Hh) gene family codes for a class of secreted proteins composed of two active domains that act as signalling molecules during embryo development, namely for the development of the nervous and skeletal systems and the formation of the testis cord. While only one Hh gene is found typically in invertebrate genomes, most vertebrates species have three (Sonic hedgehog – Shh; Indian hedgehog – Ihh; and Desert hedgehog – Dhh), each with different expression patterns and functions, which likely helped promote the increasing complexity of vertebrates and their successful diversification. In this study, we used comparative genomic and adaptive evolutionary analyses to characterize the evolution of the Hh genes in vertebrates following the two major whole genome duplication (WGD) events. To overcome the lack of Hh-coding sequences on avian publicly available databases, we used an extensive dataset of 45 avian and three non-avian reptilian genomes to show that birds have all three Hh paralogs. We find suggestions that following the WGD events, vertebrate Hh paralogous genes evolved independently within similar linkage groups and under different evolutionary rates, especially within the catalytic domain. The structural regions around the ion-binding site were identified to be under positive selection in the signaling domain. These findings contrast with those observed in invertebrates, where different lineages that experienced gene duplication retained similar selective constraints in the Hh orthologs. Our results provide new insights on the evolutionary history of the Hh gene family, the functional roles of these paralogs in vertebrate species, and on the location of mutational hotspots. PMID:25549322

  11. Are there laws of genome evolution?

    PubMed

    Koonin, Eugene V

    2011-08-01

    Research in quantitative evolutionary genomics and systems biology led to the discovery of several universal regularities connecting genomic and molecular phenomic variables. These universals include the log-normal distribution of the evolutionary rates of orthologous genes; the power law-like distributions of paralogous family size and node degree in various biological networks; the negative correlation between a gene's sequence evolution rate and expression level; and differential scaling of functional classes of genes with genome size. The universals of genome evolution can be accounted for by simple mathematical models similar to those used in statistical physics, such as the birth-death-innovation model. These models do not explicitly incorporate selection; therefore, the observed universal regularities do not appear to be shaped by selection but rather are emergent properties of gene ensembles. Although a complete physical theory of evolutionary biology is inconceivable, the universals of genome evolution might qualify as "laws of evolutionary genomics" in the same sense "law" is understood in modern physics.

  12. Defining functional DNA elements in the human genome

    PubMed Central

    Kellis, Manolis; Wold, Barbara; Snyder, Michael P.; Bernstein, Bradley E.; Kundaje, Anshul; Marinov, Georgi K.; Ward, Lucas D.; Birney, Ewan; Crawford, Gregory E.; Dekker, Job; Dunham, Ian; Elnitski, Laura L.; Farnham, Peggy J.; Feingold, Elise A.; Gerstein, Mark; Giddings, Morgan C.; Gilbert, David M.; Gingeras, Thomas R.; Green, Eric D.; Guigo, Roderic; Hubbard, Tim; Kent, Jim; Lieb, Jason D.; Myers, Richard M.; Pazin, Michael J.; Ren, Bing; Stamatoyannopoulos, John A.; Weng, Zhiping; White, Kevin P.; Hardison, Ross C.

    2014-01-01

    With the completion of the human genome sequence, attention turned to identifying and annotating its functional DNA elements. As a complement to genetic and comparative genomics approaches, the Encyclopedia of DNA Elements Project was launched to contribute maps of RNA transcripts, transcriptional regulator binding sites, and chromatin states in many cell types. The resulting genome-wide data reveal sites of biochemical activity with high positional resolution and cell type specificity that facilitate studies of gene regulation and interpretation of noncoding variants associated with human disease. However, the biochemically active regions cover a much larger fraction of the genome than do evolutionarily conserved regions, raising the question of whether nonconserved but biochemically active regions are truly functional. Here, we review the strengths and limitations of biochemical, evolutionary, and genetic approaches for defining functional DNA segments, potential sources for the observed differences in estimated genomic coverage, and the biological implications of these discrepancies. We also analyze the relationship between signal intensity, genomic coverage, and evolutionary conservation. Our results reinforce the principle that each approach provides complementary information and that we need to use combinations of all three to elucidate genome function in human biology and disease. PMID:24753594

  13. Phylogenomic, Pan-genomic, Pathogenomic and Evolutionary Genomic Insights into the Agronomically Relevant Enterobacteria Pantoea ananatis and Pantoea stewartii

    PubMed Central

    De Maayer, Pieter; Aliyu, Habibu; Vikram, Surendra; Blom, Jochen; Duffy, Brion; Cowan, Don A.; Smits, Theo H. M.; Venter, Stephanus N.; Coutinho, Teresa A.

    2017-01-01

    Pantoea ananatis is ubiquitously found in the environment and causes disease on a wide range of plant hosts. By contrast, its sister species, Pantoea stewartii subsp. stewartii is the host-specific causative agent of the devastating maize disease Stewart’s wilt. This pathogen has a restricted lifecycle, overwintering in an insect vector before being introduced into susceptible maize cultivars, causing disease and returning to overwinter in its vector. The other subspecies of P. stewartii subsp. indologenes, has been isolated from different plant hosts and is predicted to proliferate in different environmental niches. Here we have, by the use of comparative genomics and a comprehensive suite of bioinformatic tools, analyzed the genomes of ten P. stewartii and nineteen P. ananatis strains. Our phylogenomic analyses have revealed that there are two distinct clades within P. ananatis while far less phylogenetic diversity was observed among the P. stewartii subspecies. Pan-genome analyses revealed a large core genome comprising of 3,571 protein coding sequences is shared among the twenty-nine compared strains. Furthermore, we showed that an extensive accessory genome made up largely by a mobilome of plasmids, integrated prophages, integrative and conjugative elements and insertion elements has resulted in extensive diversification of P. stewartii and P. ananatis. While these organisms share many pathogenicity determinants, our comparative genomic analyses show that they differ in terms of the secretion systems they encode. The genomic differences identified in this study have allowed us to postulate on the divergent evolutionary histories of the analyzed P. ananatis and P. stewartii strains and on the molecular basis underlying their ecological success and host range. PMID:28959245

  14. Structured populations of Sulfolobus acidocaldarius with susceptibility to mobile genetic elements

    USGS Publications Warehouse

    Anderson, Rika E.; Kouris, Angela; Seward, Christopher H.; Campbell, Kate M.; Whitaker, Rachel J.

    2017-01-01

    The impact of a structured environment on genome evolution can be determined through comparative population genomics of species that live in the same habitat. Recent work comparing three genome sequences of Sulfolobus acidocaldarius suggested that highly structured, extreme, hot spring environments do not limit dispersal of this thermoacidophile, in contrast to other co-occurring Sulfolobus species. Instead, a high level of conservation among these three S. acidocaldarius genomes was hypothesized to result from rapid, global-scale dispersal promoted by low susceptibility to viruses that sets S. acidocaldarius apart from its sister Sulfolobus species. To test this hypothesis, we conducted a comparative analysis of 47 genomes of S. acidocaldarius from spatial and temporal sampling of two hot springs in Yellowstone National Park. While we confirm the low diversity in the core genome, we observe differentiation among S. acidocaldarius populations, likely resulting from low migration among hot spring “islands” in Yellowstone National Park. Patterns of genomic variation indicate that differing geological contexts result in the elimination or preservation of diversity among differentiated populations. We observe multiple deletions associated with a large genomic island rich in glycosyltransferases, differential integrations of the Sulfolobus turreted icosahedral virus, as well as two different plasmid elements. These data demonstrate that neither rapid dispersal nor lack of mobile genetic elements result in low diversity in the S. acidocaldariusgenomes. We suggest instead that significant differences in the recent evolutionary history, or the intrinsic evolutionary rates, of sister Sulfolobusspecies result in the relatively low diversity of the S. acidocaldarius genome.

  15. Phylogenomic, Pan-genomic, Pathogenomic and Evolutionary Genomic Insights into the Agronomically Relevant Enterobacteria Pantoea ananatis and Pantoea stewartii.

    PubMed

    De Maayer, Pieter; Aliyu, Habibu; Vikram, Surendra; Blom, Jochen; Duffy, Brion; Cowan, Don A; Smits, Theo H M; Venter, Stephanus N; Coutinho, Teresa A

    2017-01-01

    Pantoea ananatis is ubiquitously found in the environment and causes disease on a wide range of plant hosts. By contrast, its sister species, Pantoea stewartii subsp. stewartii is the host-specific causative agent of the devastating maize disease Stewart's wilt. This pathogen has a restricted lifecycle, overwintering in an insect vector before being introduced into susceptible maize cultivars, causing disease and returning to overwinter in its vector. The other subspecies of P. stewartii subsp. indologenes , has been isolated from different plant hosts and is predicted to proliferate in different environmental niches. Here we have, by the use of comparative genomics and a comprehensive suite of bioinformatic tools, analyzed the genomes of ten P. stewartii and nineteen P. ananatis strains. Our phylogenomic analyses have revealed that there are two distinct clades within P. ananatis while far less phylogenetic diversity was observed among the P. stewartii subspecies. Pan-genome analyses revealed a large core genome comprising of 3,571 protein coding sequences is shared among the twenty-nine compared strains. Furthermore, we showed that an extensive accessory genome made up largely by a mobilome of plasmids, integrated prophages, integrative and conjugative elements and insertion elements has resulted in extensive diversification of P. stewartii and P. ananatis . While these organisms share many pathogenicity determinants, our comparative genomic analyses show that they differ in terms of the secretion systems they encode. The genomic differences identified in this study have allowed us to postulate on the divergent evolutionary histories of the analyzed P. ananatis and P. stewartii strains and on the molecular basis underlying their ecological success and host range.

  16. Genomic diversity of Bombyx mori nucleopolyhedrovirus strains.

    PubMed

    Xu, Yi-Peng; Cheng, Ruo-Lin; Xi, Yu; Zhang, Chuan-Xi

    2013-07-01

    Bombyx mori nucleopolyhedrovirus (BmNPV) is a baculovirus that selectively infects the domestic silkworm. In this study, six BmNPV strains were compared at the whole genome level. We found that the number of bro genes and the composition of the homologous regions (hrs) are the two primary areas of divergence within these genomes. When we compared the ORFs of these BmNPV variants, we noticed a high degree of sequence divergence in the ORFs that are not baculovirus core genes. This result is consistent with the results derived from phylogenetic trees and evolutionary pressure analyses of these ORFs, indicating that ORFs that are not core genes likely play important roles in the evolution of BmNPV strains. The evolutionary relationships of these BmNPV strains might be explained by their geographic origins or those of their hosts. In addition, the total number of hr palindromes seems to affect viral DNA replication in Bm5 cells. Copyright © 2013 Elsevier Inc. All rights reserved.

  17. Complete mitochondrial genome sequences of three bats species and whole genome mitochondrial analyses reveal patterns of codon bias and lend support to a basal split in Chiroptera.

    PubMed

    Meganathan, P R; Pagan, Heidi J T; McCulloch, Eve S; Stevens, Richard D; Ray, David A

    2012-01-15

    Order Chiroptera is a unique group of mammals whose members have attained self-powered flight as their main mode of locomotion. Much speculation persists regarding bat evolution; however, lack of sufficient molecular data hampers evolutionary and conservation studies. Of ~1200 species, complete mitochondrial genome sequences are available for only eleven. Additional sequences should be generated if we are to resolve many questions concerning these fascinating mammals. Herein, we describe the complete mitochondrial genomes of three bats: Corynorhinus rafinesquii, Lasiurus borealis and Artibeus lituratus. We also compare the currently available mitochondrial genomes and analyze codon usage in Chiroptera. C. rafinesquii, L. borealis and A. lituratus mitochondrial genomes are 16438 bp, 17048 bp and 16709 bp, respectively. Genome organization and gene arrangements are similar to other bats. Phylogenetic analyses using complete mitochondrial genome sequences support previously established phylogenetic relationships and suggest utility in future studies focusing on the evolutionary aspects of these species. Comprehensive analyses of available bat mitochondrial genomes reveal distinct nucleotide patterns and synonymous codon preferences corresponding to different chiropteran families. These patterns suggest that mutational and selection forces are acting to different extents within Chiroptera and shape their mitochondrial genomes. Copyright © 2011 Elsevier B.V. All rights reserved.

  18. Genome-Wide Comparative Analysis Reveals Similar Types of NBS Genes in Hybrid Citrus sinensis Genome and Original Citrus clementine Genome and Provides New Insights into Non-TIR NBS Genes

    PubMed Central

    Wang, Yunsheng; Zhou, Lijuan; Li, Dazhi; Dai, Liangying; Lawton-Rauh, Amy; Srimani, Pradip K.; Duan, Yongping; Luo, Feng

    2015-01-01

    In this study, we identified and compared nucleotide-binding site (NBS) domain-containing genes from three Citrus genomes (C. clementina, C. sinensis from USA and C. sinensis from China). Phylogenetic analysis of all Citrus NBS genes across these three genomes revealed that there are three approximately evenly numbered groups: one group contains the Toll-Interleukin receptor (TIR) domain and two different Non-TIR groups in which most of proteins contain the Coiled Coil (CC) domain. Motif analysis confirmed that the two groups of CC-containing NBS genes are from different evolutionary origins. We partitioned NBS genes into clades using NBS domain sequence distances and found most clades include NBS genes from all three Citrus genomes. This suggests that three Citrus genomes have similar numbers and types of NBS genes. We also mapped the re-sequenced reads of three pomelo and three mandarin genomes onto the C. sinensis genome. We found that most NBS genes of the hybrid C. sinensis genome have corresponding homologous genes in both pomelo and mandarin genomes. The homologous NBS genes in pomelo and mandarin suggest that the parental species of C. sinensis may contain similar types of NBS genes. This explains why the hybrid C. sinensis and original C. clementina have similar types of NBS genes in this study. Furthermore, we found that sequence variation amongst Citrus NBS genes were shaped by multiple independent and shared accelerated mutation accumulation events among different groups of NBS genes and in different Citrus genomes. Our comparative analyses yield valuable insight into the structure, organization and evolution of NBS genes in Citrus genomes. Furthermore, our comprehensive analysis showed that the non-TIR NBS genes can be divided into two groups that come from different evolutionary origins. This provides new insights into non-TIR genes, which have not received much attention. PMID:25811466

  19. Genome-wide comparative analysis reveals similar types of NBS genes in hybrid Citrus sinensis genome and original Citrus clementine genome and provides new insights into non-TIR NBS genes.

    PubMed

    Wang, Yunsheng; Zhou, Lijuan; Li, Dazhi; Dai, Liangying; Lawton-Rauh, Amy; Srimani, Pradip K; Duan, Yongping; Luo, Feng

    2015-01-01

    In this study, we identified and compared nucleotide-binding site (NBS) domain-containing genes from three Citrus genomes (C. clementina, C. sinensis from USA and C. sinensis from China). Phylogenetic analysis of all Citrus NBS genes across these three genomes revealed that there are three approximately evenly numbered groups: one group contains the Toll-Interleukin receptor (TIR) domain and two different Non-TIR groups in which most of proteins contain the Coiled Coil (CC) domain. Motif analysis confirmed that the two groups of CC-containing NBS genes are from different evolutionary origins. We partitioned NBS genes into clades using NBS domain sequence distances and found most clades include NBS genes from all three Citrus genomes. This suggests that three Citrus genomes have similar numbers and types of NBS genes. We also mapped the re-sequenced reads of three pomelo and three mandarin genomes onto the C. sinensis genome. We found that most NBS genes of the hybrid C. sinensis genome have corresponding homologous genes in both pomelo and mandarin genomes. The homologous NBS genes in pomelo and mandarin suggest that the parental species of C. sinensis may contain similar types of NBS genes. This explains why the hybrid C. sinensis and original C. clementina have similar types of NBS genes in this study. Furthermore, we found that sequence variation amongst Citrus NBS genes were shaped by multiple independent and shared accelerated mutation accumulation events among different groups of NBS genes and in different Citrus genomes. Our comparative analyses yield valuable insight into the structure, organization and evolution of NBS genes in Citrus genomes. Furthermore, our comprehensive analysis showed that the non-TIR NBS genes can be divided into two groups that come from different evolutionary origins. This provides new insights into non-TIR genes, which have not received much attention.

  20. Insights from Human/Mouse genome comparisons

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pennacchio, Len A.

    2003-03-30

    Large-scale public genomic sequencing efforts have provided a wealth of vertebrate sequence data poised to provide insights into mammalian biology. These include deep genomic sequence coverage of human, mouse, rat, zebrafish, and two pufferfish (Fugu rubripes and Tetraodon nigroviridis) (Aparicio et al. 2002; Lander et al. 2001; Venter et al. 2001; Waterston et al. 2002). In addition, a high-priority has been placed on determining the genomic sequence of chimpanzee, dog, cow, frog, and chicken (Boguski 2002). While only recently available, whole genome sequence data have provided the unique opportunity to globally compare complete genome contents. Furthermore, the shared evolutionary ancestrymore » of vertebrate species has allowed the development of comparative genomic approaches to identify ancient conserved sequences with functionality. Accordingly, this review focuses on the initial comparison of available mammalian genomes and describes various insights derived from such analysis.« less

  1. Genome-Wide Search Identifies 1.9 Mb from the Polar Bear Y Chromosome for Evolutionary Analyses.

    PubMed

    Bidon, Tobias; Schreck, Nancy; Hailer, Frank; Nilsson, Maria A; Janke, Axel

    2015-05-27

    The male-inherited Y chromosome is the major haploid fraction of the mammalian genome, rendering Y-linked sequences an indispensable resource for evolutionary research. However, despite recent large-scale genome sequencing approaches, only a handful of Y chromosome sequences have been characterized to date, mainly in model organisms. Using polar bear (Ursus maritimus) genomes, we compare two different in silico approaches to identify Y-linked sequences: 1) Similarity to known Y-linked genes and 2) difference in the average read depth of autosomal versus sex chromosomal scaffolds. Specifically, we mapped available genomic sequencing short reads from a male and a female polar bear against the reference genome and identify 112 Y-chromosomal scaffolds with a combined length of 1.9 Mb. We verified the in silico findings for the longer polar bear scaffolds by male-specific in vitro amplification, demonstrating the reliability of the average read depth approach. The obtained Y chromosome sequences contain protein-coding sequences, single nucleotide polymorphisms, microsatellites, and transposable elements that are useful for evolutionary studies. A high-resolution phylogeny of the polar bear patriline shows two highly divergent Y chromosome lineages, obtained from analysis of the identified Y scaffolds in 12 previously published male polar bear genomes. Moreover, we find evidence of gene conversion among ZFX and ZFY sequences in the giant panda lineage and in the ancestor of ursine and tremarctine bears. Thus, the identification of Y-linked scaffold sequences from unordered genome sequences yields valuable data to infer phylogenomic and population-genomic patterns in bears. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  2. Comparative Mitogenomics of Plant Bugs (Hemiptera: Miridae): Identifying the AGG Codon Reassignments between Serine and Lysine

    PubMed Central

    Wang, Pei; Song, Fan; Cai, Wanzhi

    2014-01-01

    Insect mitochondrial genomes are very important to understand the molecular evolution as well as for phylogenetic and phylogeographic studies of the insects. The Miridae are the largest family of Heteroptera encompassing more than 11,000 described species and of great economic importance. For better understanding the diversity and the evolution of plant bugs, we sequence five new mitochondrial genomes and present the first comparative analysis of nine mitochondrial genomes of mirids available to date. Our result showed that gene content, gene arrangement, base composition and sequences of mitochondrial transcription termination factor were conserved in plant bugs. Intra-genus species shared more conserved genomic characteristics, such as nucleotide and amino acid composition of protein-coding genes, secondary structure and anticodon mutations of tRNAs, and non-coding sequences. Control region possessed several distinct characteristics, including: variable size, abundant tandem repetitions, and intra-genus conservation; and was useful in evolutionary and population genetic studies. The AGG codon reassignments were investigated between serine and lysine in the genera Adelphocoris and other cimicomorphans. Our analysis revealed correlated evolution between reassignments of the AGG codon and specific point mutations at the antidocons of tRNALys and tRNASer(AGN). Phylogenetic analysis indicated that mitochondrial genome sequences were useful in resolving family level relationship of Cimicomorpha. Comparative evolutionary analysis of plant bug mitochondrial genomes allowed the identification of previously neglected coding genes or non-coding regions as potential molecular markers. The finding of the AGG codon reassignments between serine and lysine indicated the parallel evolution of the genetic code in Hemiptera mitochondrial genomes. PMID:24988409

  3. Selfish genetic elements, genetic conflict, and evolutionary innovation.

    PubMed

    Werren, John H

    2011-06-28

    Genomes are vulnerable to selfish genetic elements (SGEs), which enhance their own transmission relative to the rest of an individual's genome but are neutral or harmful to the individual as a whole. As a result, genetic conflict occurs between SGEs and other genetic elements in the genome. There is growing evidence that SGEs, and the resulting genetic conflict, are an important motor for evolutionary change and innovation. In this review, the kinds of SGEs and their evolutionary consequences are described, including how these elements shape basic biological features, such as genome structure and gene regulation, evolution of new genes, origin of new species, and mechanisms of sex determination and development. The dynamics of SGEs are also considered, including possible "evolutionary functions" of SGEs.

  4. Selfish genetic elements, genetic conflict, and evolutionary innovation

    PubMed Central

    Werren, John H.

    2011-01-01

    Genomes are vulnerable to selfish genetic elements (SGEs), which enhance their own transmission relative to the rest of an individual's genome but are neutral or harmful to the individual as a whole. As a result, genetic conflict occurs between SGEs and other genetic elements in the genome. There is growing evidence that SGEs, and the resulting genetic conflict, are an important motor for evolutionary change and innovation. In this review, the kinds of SGEs and their evolutionary consequences are described, including how these elements shape basic biological features, such as genome structure and gene regulation, evolution of new genes, origin of new species, and mechanisms of sex determination and development. The dynamics of SGEs are also considered, including possible “evolutionary functions” of SGEs. PMID:21690392

  5. The scope and strength of sex-specific selection in genome evolution

    PubMed Central

    Wright, A E; Mank, J E

    2013-01-01

    Males and females share the vast majority of their genomes and yet are often subject to different, even conflicting, selection. Genomic and transcriptomic developments have made it possible to assess sex-specific selection at the molecular level, and it is clear that sex-specific selection shapes the evolutionary properties of several genomic characteristics, including transcription, post-transcriptional regulation, imprinting, genome structure and gene sequence. Sex-specific selection is strongly influenced by mating system, which also causes neutral evolutionary changes that affect different regions of the genome in different ways. Here, we synthesize theoretical and molecular work in order to provide a cohesive view of the role of sex-specific selection and mating system in genome evolution. We also highlight the need for a combined approach, incorporating both genomic data and experimental phenotypic studies, in order to understand precisely how sex-specific selection drives evolutionary change across the genome. PMID:23848139

  6. Using hidden Markov models and observed evolution to annotate viral genomes.

    PubMed

    McCauley, Stephen; Hein, Jotun

    2006-06-01

    ssRNA (single stranded) viral genomes are generally constrained in length and utilize overlapping reading frames to maximally exploit the coding potential within the genome length restrictions. This overlapping coding phenomenon leads to complex evolutionary constraints operating on the genome. In regions which code for more than one protein, silent mutations in one reading frame generally have a protein coding effect in another. To maximize coding flexibility in all reading frames, overlapping regions are often compositionally biased towards amino acids which are 6-fold degenerate with respect to the 64 codon alphabet. Previous methodologies have used this fact in an ad hoc manner to look for overlapping genes by motif matching. In this paper differentiated nucleotide compositional patterns in overlapping regions are incorporated into a probabilistic hidden Markov model (HMM) framework which is used to annotate ssRNA viral genomes. This work focuses on single sequence annotation and applies an HMM framework to ssRNA viral annotation. A description of how the HMM is parameterized, whilst annotating within a missing data framework is given. A Phylogenetic HMM (Phylo-HMM) extension, as applied to 14 aligned HIV2 sequences is also presented. This evolutionary extension serves as an illustration of the potential of the Phylo-HMM framework for ssRNA viral genomic annotation. The single sequence annotation procedure (SSA) is applied to 14 different strains of the HIV2 virus. Further results on alternative ssRNA viral genomes are presented to illustrate more generally the performance of the method. The results of the SSA method are encouraging however there is still room for improvement, and since there is overwhelming evidence to indicate that comparative methods can improve coding sequence (CDS) annotation, the SSA method is extended to a Phylo-HMM to incorporate evolutionary information. The Phylo-HMM extension is applied to the same set of 14 HIV2 sequences which are pre-aligned. The performance improvement that results from including the evolutionary information in the analysis is illustrated.

  7. On the need for widespread horizontal gene transfers under genome size constraint.

    PubMed

    Isambert, Hervé; Stein, Richard R

    2009-08-25

    While eukaryotes primarily evolve by duplication-divergence expansion (and reduction) of their own gene repertoire with only rare horizontal gene transfers, prokaryotes appear to evolve under both gene duplications and widespread horizontal gene transfers over long evolutionary time scales. But, the evolutionary origin of this striking difference in the importance of horizontal gene transfers remains by and large a mystery. We propose that the abundance of horizontal gene transfers in free-living prokaryotes is a simple but necessary consequence of two opposite effects: i) their apparent genome size constraint compared to typical eukaryote genomes and ii) their underlying genome expansion dynamics through gene duplication-divergence evolution, as demonstrated by the presence of many tandem and block repeated genes. In principle, this combination of genome size constraint and underlying duplication expansion should lead to a coalescent-like process with extensive turnover of functional genes. This would, however, imply the unlikely, systematic reinvention of functions from discarded genes within independent phylogenetic lineages. Instead, we propose that the long-term evolutionary adaptation of free-living prokaryotes must have resulted in the emergence of efficient non-phylogenetic pathways to circumvent gene loss. This need for widespread horizontal gene transfers due to genome size constraint implies, in particular, that prokaryotes must remain under strong selection pressure in order to maintain the long-term evolutionary adaptation of their "mutualized" gene pool, beyond the inevitable turnover of individual prokaryote species. By contrast, the absence of genome size constraint for typical eukaryotes has presumably relaxed their need for widespread horizontal gene transfers and strong selection pressure. Yet, the resulting loss of genetic functions, due to weak selection pressure and inefficient gene recovery mechanisms, must have ultimately favored the emergence of more complex life styles and ecological integration of many eukaryotes. This article was reviewed by Pierre Pontarotti, Eugene V Koonin and Sergei Maslov.

  8. Initial sequencing and comparative analysis of the mouse genome

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Waterston, Robert H.; Lindblad-Toh, Kerstin; Birney, Ewan

    2002-12-15

    The sequence of the mouse genome is a key informational tool for understanding the contents of the human genome and a key experimental tool for biomedical research. Here, we report the results of an international collaboration to produce a high-quality draft sequence of the mouse genome. We also present an initial comparative analysis of the mouse and human genomes, describing some of the insights that can be gleaned from the two sequences. We discuss topics including the analysis of the evolutionary forces shaping the size, structure and sequence of the genomes; the conservation of large-scale synteny across most of themore » genomes; the much lower extent of sequence orthology covering less than half of the genomes; the proportions of the genomes under selection; the number of protein-coding genes; the expansion of gene families related to reproduction and immunity; the evolution of proteins; and the identification of intraspecies polymorphism.« less

  9. Comparative genomics meets topology: a novel view on genome median and halving problems.

    PubMed

    Alexeev, Nikita; Avdeyev, Pavel; Alekseyev, Max A

    2016-11-11

    Genome median and genome halving are combinatorial optimization problems that aim at reconstruction of ancestral genomes by minimizing the number of evolutionary events between them and genomes of the extant species. While these problems have been widely studied in past decades, their solutions are often either not efficient or not biologically adequate. These shortcomings have been recently addressed by restricting the problems solution space. We show that the restricted variants of genome median and halving problems are, in fact, closely related. We demonstrate that these problems have a neat topological interpretation in terms of embedded graphs and polygon gluings. We illustrate how such interpretation can lead to solutions to these problems in particular cases. This study provides an unexpected link between comparative genomics and topology, and demonstrates advantages of solving genome median and halving problems within the topological framework.

  10. Association between genomic instability and evolutionary chromosomal rearrangements in Neotropical Primates.

    PubMed

    Puntieri, Fiona; Andrioli, Nancy B; Nieves, Mariela

    2018-06-14

    During the last decades the mammalian genome has been proposed to have regions prone to breakage and reorganization concentrated in certain chromosomal bands that seem to correspond to evolutionary breakpoints. These bands are likely to be involved in chromosome fragility or instability. In Primates, some biomarkers of genetic damage may be associated with various degrees of genomic instability. Here, we investigated the usefulness of Sister Chromatid Exchange (SCE) as a biomarker of potential sites of frequent chromosome breakage and rearrangement in Alouatta caraya, Ateles chamek, Ateles paniscus and Cebus cay. These Neotropical species have particular genomic and chromosomal features allowing the analysis of genomic instability for comparative purposes. We determined the frequency of spontaneous induction of SCEs and assessed the relationship between these and structural rearrangements implicated in the evolution of the primates of interest. Overall, A. caraya and C. cay presented a low proportion of statistically significant unstable bands, suggesting fairly stable genomes and the existence of some kind of protection against endogenous damage. In contrast, Ateles showed a highly significant proportion of unstable bands; these were mainly found in the rearranged regions, which is consistent with the numerous genomic reorganizations that might have occurred during the evolution of this genus.

  11. Comparative Analysis of the Mitochondrial Genomes of Callitettixini Spittlebugs (Hemiptera: Cercopidae) Confirms the Overall High Evolutionary Speed of the AT-Rich Region but Reveals the Presence of Short Conservative Elements at the Tribal Level

    PubMed Central

    Liu, Jie; Bu, Cuiping; Wipfler, Benjamin; Liang, Aiping

    2014-01-01

    The present study compares the mitochondrial genomes of five species of the spittlebug tribe Callitettixini (Hemiptera: Cercopoidea: Cercopidae) from eastern Asia. All genomes of the five species sequenced are circular double-stranded DNA molecules and range from 15,222 to 15,637 bp in length. They contain 22 tRNA genes, 13 protein coding genes (PCGs) and 2 rRNA genes and share the putative ancestral gene arrangement of insects. The PCGs show an extreme bias of nucleotide and amino acid composition. Significant differences of the substitution rates among the different genes as well as the different codon position of each PCG are revealed by the comparative evolutionary analyses. The substitution speeds of the first and second codon position of different PCGs are negatively correlated with their GC content. Among the five species, the AT-rich region features great differences in length and pattern and generally shows a 2–5 times higher substitution rate than the fastest PCG in the mitochondrial genome, atp8. Despite the significant variability in length, short conservative segments were identified in the AT-rich region within Callitettixini, although absent from the other groups of the spittlebug superfamily Cercopoidea. PMID:25285442

  12. Comparative Methylome Analyses Identify Epigenetic Regulatory Loci of Human Brain Evolution.

    PubMed

    Mendizabal, Isabel; Shi, Lei; Keller, Thomas E; Konopka, Genevieve; Preuss, Todd M; Hsieh, Tzung-Fu; Hu, Enzhi; Zhang, Zhe; Su, Bing; Yi, Soojin V

    2016-11-01

    How do epigenetic modifications change across species and how do these modifications affect evolution? These are fundamental questions at the forefront of our evolutionary epigenomic understanding. Our previous work investigated human and chimpanzee brain methylomes, but it was limited by the lack of outgroup data which is critical for comparative (epi)genomic studies. Here, we compared whole genome DNA methylation maps from brains of humans, chimpanzees and also rhesus macaques (outgroup) to elucidate DNA methylation changes during human brain evolution. Moreover, we validated that our approach is highly robust by further examining 38 human-specific DMRs using targeted deep genomic and bisulfite sequencing in an independent panel of 37 individuals from five primate species. Our unbiased genome-scan identified human brain differentially methylated regions (DMRs), irrespective of their associations with annotated genes. Remarkably, over half of the newly identified DMRs locate in intergenic regions or gene bodies. Nevertheless, their regulatory potential is on par with those of promoter DMRs. An intriguing observation is that DMRs are enriched in active chromatin loops, suggesting human-specific evolutionary remodeling at a higher-order chromatin structure. These findings indicate that there is substantial reprogramming of epigenomic landscapes during human brain evolution involving noncoding regions. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  13. Comparative Genomics of Listeria Sensu Lato: Genus-Wide Differences in Evolutionary Dynamics and the Progressive Gain of Complex, Potentially Pathogenicity-Related Traits through Lateral Gene Transfer

    PubMed Central

    Chiara, Matteo; Caruso, Marta; D’Erchia, Anna Maria; Manzari, Caterina; Fraccalvieri, Rosa; Goffredo, Elisa; Latorre, Laura; Miccolupo, Angela; Padalino, Iolanda; Santagada, Gianfranco; Chiocco, Doriano; Pesole, Graziano; Horner, David S.; Parisi, Antonio

    2015-01-01

    Historically, genome-wide and molecular characterization of the genus Listeria has concentrated on the important human pathogen Listeria monocytogenes and a small number of closely related species, together termed Listeria sensu strictu. More recently, a number of genome sequences for more basal, and nonpathogenic, members of the Listeria genus have become available, facilitating a wider perspective on the evolution of pathogenicity and genome level evolutionary dynamics within the entire genus (termed Listeria sensu lato). Here, we have sequenced the genomes of additional Listeria fleischmannii and Listeria newyorkensis isolates and explored the dynamics of genome evolution in Listeria sensu lato. Our analyses suggest that acquisition of genetic material through gene duplication and divergence as well as through lateral gene transfer (mostly from outside Listeria) is widespread throughout the genus. Novel genetic material is apparently subject to rapid turnover. Multiple lines of evidence point to significant differences in evolutionary dynamics between the most basal Listeria subclade and all other congeners, including both sensu strictu and other sensu lato isolates. Strikingly, these differences are likely attributable to stochastic, population-level processes and contribute to observed variation in genome size across the genus. Notably, our analyses indicate that the common ancestor of Listeria sensu lato lacked flagella, which were acquired by lateral gene transfer by a common ancestor of Listeria grayi and Listeria sensu strictu, whereas a recently functionally characterized pathogenicity island, responsible for the capacity to produce cobalamin and utilize ethanolamine/propane-2-diol, was acquired in an ancestor of Listeria sensu strictu. PMID:26185097

  14. Integration of structural dynamics and molecular evolution via protein interaction networks: a new era in genomic medicine.

    PubMed

    Kumar, Avishek; Butler, Brandon M; Kumar, Sudhir; Ozkan, S Banu

    2015-12-01

    Sequencing technologies are revealing many new non-synonymous single nucleotide variants (nsSNVs) in each personal exome. To assess their functional impacts, comparative genomics is frequently employed to predict if they are benign or not. However, evolutionary analysis alone is insufficient, because it misdiagnoses many disease-associated nsSNVs, such as those at positions involved in protein interfaces, and because evolutionary predictions do not provide mechanistic insights into functional change or loss. Structural analyses can aid in overcoming both of these problems by incorporating conformational dynamics and allostery in nSNV diagnosis. Finally, protein-protein interaction networks using systems-level methodologies shed light onto disease etiology and pathogenesis. Bridging these network approaches with structurally resolved protein interactions and dynamics will advance genomic medicine. Copyright © 2015 Elsevier Ltd. All rights reserved.

  15. Methods for Genome-Wide Analysis of Gene Expression Changes in Polyploids

    PubMed Central

    Wang, Jianlin; Lee, Jinsuk J.; Tian, Lu; Lee, Hyeon-Se; Chen, Meng; Rao, Sheetal; Wei, Edward N.; Doerge, R. W.; Comai, Luca; Jeffrey Chen, Z.

    2007-01-01

    Polyploidy is an evolutionary innovation, providing extra sets of genetic material for phenotypic variation and adaptation. It is predicted that changes of gene expression by genetic and epigenetic mechanisms are responsible for novel variation in nascent and established polyploids (Liu and Wendel, 2002; Osborn et al., 2003; Pikaard, 2001). Studying gene expression changes in allopolyploids is more complicated than in autopolyploids, because allopolyploids contain more than two sets of genomes originating from divergent, but related, species. Here we describe two methods that are applicable to the genome-wide analysis of gene expression differences resulting from genome duplication in autopolyploids or interactions between homoeologous genomes in allopolyploids. First, we describe an amplified fragment length polymorphism (AFLP)–complementary DNA (cDNA) display method that allows the discrimination of homoeologous loci based on restriction polymorphisms between the progenitors. Second, we describe microarray analyses that can be used to compare gene expression differences between the allopolyploids and respective progenitors using appropriate experimental design and statistical analysis. We demonstrate the utility of these two complementary methods and discuss the pros and cons of using the methods to analyze gene expression changes in autopolyploids and allopolyploids. Furthermore, we describe these methods in general terms to be of wider applicability for comparative gene expression in a variety of evolutionary, genetic, biological, and physiological contexts. PMID:15865985

  16. Comparative Genomics and Host Resistance against Infectious Diseases

    PubMed Central

    Qureshi, Salman T.; Skamene, Emil

    1999-01-01

    The large size and complexity of the human genome have limited the identification and functional characterization of components of the innate immune system that play a critical role in front-line defense against invading microorganisms. However, advances in genome analysis (including the development of comprehensive sets of informative genetic markers, improved physical mapping methods, and novel techniques for transcript identification) have reduced the obstacles to discovery of novel host resistance genes. Study of the genomic organization and content of widely divergent vertebrate species has shown a remarkable degree of evolutionary conservation and enables meaningful cross-species comparison and analysis of newly discovered genes. Application of comparative genomics to host resistance will rapidly expand our understanding of human immune defense by facilitating the translation of knowledge acquired through the study of model organisms. We review the rationale and resources for comparative genomic analysis and describe three examples of host resistance genes successfully identified by this approach. PMID:10081670

  17. Analysis of 41 plant genomes supports a wave of successful genome duplications in association with the Cretaceous–Paleogene boundary

    PubMed Central

    Vanneste, Kevin; Baele, Guy; Maere, Steven; Van de Peer, Yves

    2014-01-01

    Ancient whole-genome duplications (WGDs), also referred to as paleopolyploidizations, have been reported in most evolutionary lineages. Their attributed role remains a major topic of discussion, ranging from an evolutionary dead end to a road toward evolutionary success, with evidence supporting both fates. Previously, based on dating WGDs in a limited number of plant species, we found a clustering of angiosperm paleopolyploidizations around the Cretaceous–Paleogene (K–Pg) extinction event about 66 million years ago. Here we revisit this finding, which has proven controversial, by combining genome sequence information for many more plant lineages and using more sophisticated analyses. We include 38 full genome sequences and three transcriptome assemblies in a Bayesian evolutionary analysis framework that incorporates uncorrelated relaxed clock methods and fossil uncertainty. In accordance with earlier findings, we demonstrate a strongly nonrandom pattern of genome duplications over time with many WGDs clustering around the K–Pg boundary. We interpret these results in the context of recent studies on invasive polyploid plant species, and suggest that polyploid establishment is promoted during times of environmental stress. We argue that considering the evolutionary potential of polyploids in light of the environmental and ecological conditions present around the time of polyploidization could mitigate the stark contrast in the proposed evolutionary fates of polyploids. PMID:24835588

  18. Comparative genomics explains the evolutionary success of reef-forming corals.

    PubMed

    Bhattacharya, Debashish; Agrawal, Shobhit; Aranda, Manuel; Baumgarten, Sebastian; Belcaid, Mahdi; Drake, Jeana L; Erwin, Douglas; Foret, Sylvian; Gates, Ruth D; Gruber, David F; Kamel, Bishoy; Lesser, Michael P; Levy, Oren; Liew, Yi Jin; MacManes, Matthew; Mass, Tali; Medina, Monica; Mehr, Shaadi; Meyer, Eli; Price, Dana C; Putnam, Hollie M; Qiu, Huan; Shinzato, Chuya; Shoguchi, Eiichi; Stokes, Alexander J; Tambutté, Sylvie; Tchernov, Dan; Voolstra, Christian R; Wagner, Nicole; Walker, Charles W; Weber, Andreas Pm; Weis, Virginia; Zelzion, Ehud; Zoccola, Didier; Falkowski, Paul G

    2016-05-24

    Transcriptome and genome data from twenty stony coral species and a selection of reference bilaterians were studied to elucidate coral evolutionary history. We identified genes that encode the proteins responsible for the precipitation and aggregation of the aragonite skeleton on which the organisms live, and revealed a network of environmental sensors that coordinate responses of the host animals to temperature, light, and pH. Furthermore, we describe a variety of stress-related pathways, including apoptotic pathways that allow the host animals to detoxify reactive oxygen and nitrogen species that are generated by their intracellular photosynthetic symbionts, and determine the fate of corals under environmental stress. Some of these genes arose through horizontal gene transfer and comprise at least 0.2% of the animal gene inventory. Our analysis elucidates the evolutionary strategies that have allowed symbiotic corals to adapt and thrive for hundreds of millions of years.

  19. Genome sequencing and comparative genomics of enterohemorrhagic Escherichia coli O145:H25 and O145:H28 reveal distinct evolutionary paths and marked variations in traits associated with virulence & colonization.

    PubMed

    Lorenz, Sandra C; Gonzalez-Escalona, Narjol; Kotewicz, Michael L; Fischer, Markus; Kase, Julie A

    2017-08-22

    Enterohemorrhagic Escherichia coli (EHEC) O145 are among the top non-O157 serogroups associated with severe human disease worldwide. Two serotypes, O145:H25 and O145:H28 have been isolated from human patients but little information is available regarding the virulence repertoire, origin and evolutionary relatedness of O145:H25. Hence, we sequenced the complete genome of two O145:H25 strains associated with hemolytic uremic syndrome (HUS) and compared the genomes with those of previously sequenced O145:H28 and other EHEC strains. The genomes of the two O145:H25 strains were 5.3 Mbp in size; slightly smaller than those of O145:H28 and other EHEC strains. Both strains contained three nearly identical plasmids and several prophages and integrative elements, many of which differed significantly in size, gene content and organization as compared to those present in O145:H28 and other EHECs. Furthermore, notable variations were observed in several fimbrial gene cluster and intimin types possessed by O145:H25 and O145:H28 indicating potential adaptation to distinct areas of host colonization. Comparative genomics further revealed that O145:H25 are genetically more similar to other non-O157 EHEC strains than to O145:H28. Phylogenetic analysis accompanied by comparative genomics revealed that O145:H25 and O145:H28 evolved from two separate clonal lineages and that horizontal gene transfer and gene loss played a major role in the divergence of these EHEC serotypes. The data provide further evidence that ruminants might be a possible reservoir for O145:H25 but that they might be impaired in their ability to establish a persistent colonization as compared to other EHEC strains.

  20. Insights into the evolution of Darwin’s finches from comparative analysis of the Geospiza magnirostris genome sequence

    PubMed Central

    2013-01-01

    Background A classical example of repeated speciation coupled with ecological diversification is the evolution of 14 closely related species of Darwin’s (Galápagos) finches (Thraupidae, Passeriformes). Their adaptive radiation in the Galápagos archipelago took place in the last 2–3 million years and some of the molecular mechanisms that led to their diversification are now being elucidated. Here we report evolutionary analyses of genome of the large ground finch, Geospiza magnirostris. Results 13,291 protein-coding genes were predicted from a 991.0 Mb G. magnirostris genome assembly. We then defined gene orthology relationships and constructed whole genome alignments between the G. magnirostris and other vertebrate genomes. We estimate that 15% of genomic sequence is functionally constrained between G. magnirostris and zebra finch. Genic evolutionary rate comparisons indicate that similar selective pressures acted along the G. magnirostris and zebra finch lineages suggesting that historical effective population size values have been similar in both lineages. 21 otherwise highly conserved genes were identified that each show evidence for positive selection on amino acid changes in the Darwin's finch lineage. Two of these genes (Igf2r and Pou1f1) have been implicated in beak morphology changes in Darwin’s finches. Five of 47 genes showing evidence of positive selection in early passerine evolution have cilia related functions, and may be examples of adaptively evolving reproductive proteins. Conclusions These results provide insights into past evolutionary processes that have shaped G. magnirostris genes and its genome, and provide the necessary foundation upon which to build population genomics resources that will shed light on more contemporaneous adaptive and non-adaptive processes that have contributed to the evolution of the Darwin’s finches. PMID:23402223

  1. Rapid evolutionary change of common bean (Phaseolus vulgaris L) plastome, and the genomic diversification of legume chloroplasts

    PubMed Central

    Guo, Xianwu; Castillo-Ramírez, Santiago; González, Víctor; Bustos, Patricia; Luís Fernández-Vázquez, José; Santamaría, Rosa Isela; Arellano, Jesús; Cevallos, Miguel A; Dávila, Guillermo

    2007-01-01

    Background Fabaceae (legumes) is one of the largest families of flowering plants, and some members are important crops. In contrast to what we know about their great diversity or economic importance, our knowledge at the genomic level of chloroplast genomes (cpDNAs or plastomes) for these crops is limited. Results We sequenced the complete genome of the common bean (Phaseolus vulgaris cv. Negro Jamapa) chloroplast. The plastome of P. vulgaris is a 150,285 bp circular molecule. It has gene content similar to that of other legume plastomes, but contains two pseudogenes, rpl33 and rps16. A distinct inversion occurred at the junction points of trnH-GUG/rpl14 and rps19/rps8, as in adzuki bean [1]. These two pseudogenes and the inversion were confirmed in 10 varieties representing the two domestication centers of the bean. Genomic comparative analysis indicated that inversions generally occur in legume plastomes and the magnitude and localization of insertions/deletions (indels) also vary. The analysis of repeat sequences demonstrated that patterns and sequences of tandem repeats had an important impact on sequence diversification between legume plastomes and tandem repeats did not belong to dispersed repeats. Interestingly, P. vulgaris plastome had higher evolutionary rates of change on both genomic and gene levels than G. max, which could be the consequence of pressure from both mutation and natural selection. Conclusion Legume chloroplast genomes are widely diversified in gene content, gene order, indel structure, abundance and localization of repetitive sequences, intracellular sequence exchange and evolutionary rates. The P. vulgaris plastome is a rapidly evolving genome. PMID:17623083

  2. Complete chloroplast genome sequences of Praxelis (Eupatorium catarium Veldkamp), an important invasive species.

    PubMed

    Zhang, Ying; Li, Lei; Yan, Ting Liang; Liu, Qiang

    2014-10-01

    Praxelis (Eupatorium catarium Veldkamp) is a new hazardous invasive plant species that has caused serious economic losses and environmental damage in the Northern hemisphere tropical and subtropical regions. Although previous studies focused on detecting the biological characteristics of this plant to prevent its expansion, little effort has been made to understand the impact of Praxelis on the ecosystem in an evolutionary process. The genetic information of Praxelis is required for further phylogenetic identification and evolutionary studies. Here, we report the complete Praxelis chloroplast (cp) genome sequence. The Praxelis chloroplast genome is 151,410 bp in length including a small single-copy region (18,547 bp) and a large single-copy region (85,311 bp) separated by a pair of inverted repeats (IRs; 23,776 bp). The genome contains 85 unique and 18 duplicated genes in the IR region. The gene content and organization are similar to other Asteraceae tribe cp genomes. We also analyzed the whole cp genome sequence, repeat structure, codon usage, contraction of the IR and gene structure/organization features between native and invasive Asteraceae plants, in order to understand the evolution of organelle genomes between native and invasive Asteraceae. Comparative analysis identified the 14 markers containing greater than 2% parsimony-informative characters, indicating that they are potential informative markers for barcoding and phylogenetic analysis. Moreover, a sister relationship between Praxelis and seven other species in Asteraceae was found based on phylogenetic analysis of 28 protein-coding sequences. Complete cp genome information is useful for plant phylogenetic and evolutionary studies within this invasive species and also within the Asteraceae family. Copyright © 2014 Elsevier B.V. All rights reserved.

  3. The Tarenaya hassleriana Genome Provides Insight into Reproductive Trait and Genome Evolution of Crucifers[W][OPEN

    PubMed Central

    Cheng, Shifeng; van den Bergh, Erik; Zeng, Peng; Zhong, Xiao; Xu, Jiajia; Liu, Xin; Hofberger, Johannes; de Bruijn, Suzanne; Bhide, Amey S.; Kuelahoglu, Canan; Bian, Chao; Chen, Jing; Fan, Guangyi; Kaufmann, Kerstin; Hall, Jocelyn C.; Becker, Annette; Bräutigam, Andrea; Weber, Andreas P.M.; Shi, Chengcheng; Zheng, Zhijun; Li, Wujiao; Lv, Mingju; Tao, Yimin; Wang, Junyi; Zou, Hongfeng; Quan, Zhiwu; Hibberd, Julian M.; Zhang, Gengyun; Zhu, Xin-Guang; Xu, Xun; Schranz, M. Eric

    2013-01-01

    The Brassicaceae, including Arabidopsis thaliana and Brassica crops, is unmatched among plants in its wealth of genomic and functional molecular data and has long served as a model for understanding gene, genome, and trait evolution. However, genome information from a phylogenetic outgroup that is essential for inferring directionality of evolutionary change has been lacking. We therefore sequenced the genome of the spider flower (Tarenaya hassleriana) from the Brassicaceae sister family, the Cleomaceae. By comparative analysis of the two lineages, we show that genome evolution following ancient polyploidy and gene duplication events affect reproductively important traits. We found an ancient genome triplication in Tarenaya (Th-α) that is independent of the Brassicaceae-specific duplication (At-α) and nested Brassica (Br-α) triplication. To showcase the potential of sister lineage genome analysis, we investigated the state of floral developmental genes and show Brassica retains twice as many floral MADS (for MINICHROMOSOME MAINTENANCE1, AGAMOUS, DEFICIENS and SERUM RESPONSE FACTOR) genes as Tarenaya that likely contribute to morphological diversity in Brassica. We also performed synteny analysis of gene families that confer self-incompatibility in Brassicaceae and found that the critical SERINE RECEPTOR KINASE receptor gene is derived from a lineage-specific tandem duplication. The T. hassleriana genome will facilitate future research toward elucidating the evolutionary history of Brassicaceae genomes. PMID:23983221

  4. Landscape community genomics: understanding eco-evolutionary processes in complex environments

    USGS Publications Warehouse

    Hand, Brian K.; Lowe, Winsor H.; Kovach, Ryan P.; Muhlfeld, Clint C.; Luikart, Gordon

    2015-01-01

    Extrinsic factors influencing evolutionary processes are often categorically lumped into interactions that are environmentally (e.g., climate, landscape) or community-driven, with little consideration of the overlap or influence of one on the other. However, genomic variation is strongly influenced by complex and dynamic interactions between environmental and community effects. Failure to consider both effects on evolutionary dynamics simultaneously can lead to incomplete, spurious, or erroneous conclusions about the mechanisms driving genomic variation. We highlight the need for a landscape community genomics (LCG) framework to help to motivate and challenge scientists in diverse fields to consider a more holistic, interdisciplinary perspective on the genomic evolution of multi-species communities in complex environments.

  5. Bacterial Genome Instability

    PubMed Central

    Darmon, Elise

    2014-01-01

    SUMMARY Bacterial genomes are remarkably stable from one generation to the next but are plastic on an evolutionary time scale, substantially shaped by horizontal gene transfer, genome rearrangement, and the activities of mobile DNA elements. This implies the existence of a delicate balance between the maintenance of genome stability and the tolerance of genome instability. In this review, we describe the specialized genetic elements and the endogenous processes that contribute to genome instability. We then discuss the consequences of genome instability at the physiological level, where cells have harnessed instability to mediate phase and antigenic variation, and at the evolutionary level, where horizontal gene transfer has played an important role. Indeed, this ability to share DNA sequences has played a major part in the evolution of life on Earth. The evolutionary plasticity of bacterial genomes, coupled with the vast numbers of bacteria on the planet, substantially limits our ability to control disease. PMID:24600039

  6. The Ditylenchus destructor genome provides new insights into the evolution of plant parasitic nematodes

    PubMed Central

    Zheng, Jinshui; Peng, Donghai; Chen, Ling; Liu, Hualin; Chen, Feng; Xu, Mengci; Ju, Shouyong; Ruan, Lifang

    2016-01-01

    Plant-parasitic nematodes were found in 4 of the 12 clades of phylum Nematoda. These nematodes in different clades may have originated independently from their free-living fungivorous ancestors. However, the exact evolutionary process of these parasites is unclear. Here, we sequenced the genome sequence of a migratory plant nematode, Ditylenchus destructor. We performed comparative genomics among the free-living nematode, Caenorhabditis elegans and all the plant nematodes with genome sequences available. We found that, compared with C. elegans, the core developmental control processes underwent heavy reduction, though most signal transduction pathways were conserved. We also found D. destructor contained more homologies of the key genes in the above processes than the other plant nematodes. We suggest that Ditylenchus spp. may be an intermediate evolutionary history stage from free-living nematodes that feed on fungi to obligate plant-parasitic nematodes. Based on the facts that D. destructor can feed on fungi and has a relatively short life cycle, and that it has similar features to both C. elegans and sedentary plant-parasitic nematodes from clade 12, we propose it as a new model to study the biology, biocontrol of plant nematodes and the interaction between nematodes and plants. PMID:27466450

  7. Rapid evolution in insect pests: the importance of space and time in population genomics studies.

    PubMed

    Pélissié, Benjamin; Crossley, Michael S; Cohen, Zachary Paul; Schoville, Sean D

    2018-04-01

    Pest species in agroecosystems often exhibit patterns of rapid evolution to environmental and human-imposed selection pressures. Although the role of adaptive processes is well accepted, few insect pests have been studied in detail and most research has focused on selection at insecticide resistance candidate genes. Emerging genomic datasets provide opportunities to detect and quantify selection in insect pest populations, and address long-standing questions about mechanisms underlying rapid evolutionary change. We examine the strengths of recent studies that stratify population samples both in space (along environmental gradients and comparing ancestral vs. derived populations) and in time (using chronological sampling, museum specimens and comparative phylogenomics), resulting in critical insights on evolutionary processes, and providing new directions for studying pests in agroecosystems. Copyright © 2018 Elsevier Inc. All rights reserved.

  8. Lengths of Orthologous Prokaryotic Proteins Are Affected by Evolutionary Factors

    PubMed Central

    Tatarinova, Tatiana; Dien Bard, Jennifer; Cohen, Irit

    2015-01-01

    Proteins of the same functional family (for example, kinases) may have significantly different lengths. It is an open question whether such variation in length is random or it appears as a response to some unknown evolutionary driving factors. The main purpose of this paper is to demonstrate existence of factors affecting prokaryotic gene lengths. We believe that the ranking of genomes according to lengths of their genes, followed by the calculation of coefficients of association between genome rank and genome property, is a reasonable approach in revealing such evolutionary driving factors. As we demonstrated earlier, our chosen approach, Bubble-sort, combines stability, accuracy, and computational efficiency as compared to other ranking methods. Application of Bubble Sort to the set of 1390 prokaryotic genomes confirmed that genes of Archaeal species are generally shorter than Bacterial ones. We observed that gene lengths are affected by various factors: within each domain, different phyla have preferences for short or long genes; thermophiles tend to have shorter genes than the soil-dwellers; halophiles tend to have longer genes. We also found that species with overrepresentation of cytosines and guanines in the third position of the codon (GC3 content) tend to have longer genes than species with low GC3 content. PMID:26114113

  9. Lengths of Orthologous Prokaryotic Proteins Are Affected by Evolutionary Factors.

    PubMed

    Tatarinova, Tatiana; Salih, Bilal; Dien Bard, Jennifer; Cohen, Irit; Bolshoy, Alexander

    2015-01-01

    Proteins of the same functional family (for example, kinases) may have significantly different lengths. It is an open question whether such variation in length is random or it appears as a response to some unknown evolutionary driving factors. The main purpose of this paper is to demonstrate existence of factors affecting prokaryotic gene lengths. We believe that the ranking of genomes according to lengths of their genes, followed by the calculation of coefficients of association between genome rank and genome property, is a reasonable approach in revealing such evolutionary driving factors. As we demonstrated earlier, our chosen approach, Bubble-sort, combines stability, accuracy, and computational efficiency as compared to other ranking methods. Application of Bubble Sort to the set of 1390 prokaryotic genomes confirmed that genes of Archaeal species are generally shorter than Bacterial ones. We observed that gene lengths are affected by various factors: within each domain, different phyla have preferences for short or long genes; thermophiles tend to have shorter genes than the soil-dwellers; halophiles tend to have longer genes. We also found that species with overrepresentation of cytosines and guanines in the third position of the codon (GC3 content) tend to have longer genes than species with low GC3 content.

  10. Clustering of Pan- and Core-genome of Lactobacillus provides Novel Evolutionary Insights for Differentiation.

    PubMed

    Inglin, Raffael C; Meile, Leo; Stevens, Marc J A

    2018-04-24

    Bacterial taxonomy aims to classify bacteria based on true evolutionary events and relies on a polyphasic approach that includes phenotypic, genotypic and chemotaxonomic analyses. Until now, complete genomes are largely ignored in taxonomy. The genus Lactobacillus consists of 173 species and many genomes are available to study taxonomy and evolutionary events. We analyzed and clustered 98 completely sequenced genomes of the genus Lactobacillus and 234 draft genomes of 5 different Lactobacillus species, i.e. L. reuteri, L. delbrueckii, L. plantarum, L. rhamnosus and L. helveticus. The core-genome of the genus Lactobacillus contains 266 genes and the pan-genome 20'800 genes. Clustering of the Lactobacillus pan- and core-genome resulted in two highly similar trees. This shows that evolutionary history is traceable in the core-genome and that clustering of the core-genome is sufficient to explore relationships. Clustering of core- and pan-genomes at species' level resulted in similar trees as well. Detailed analyses of the core-genomes showed that the functional class "genetic information processing" is conserved in the core-genome but that "signaling and cellular processes" is not. The latter class encodes functions that are involved in environmental interactions. Evolution of lactobacilli seems therefore directed by the environment. The type species L. delbrueckii was analyzed in detail and its pan-genome based tree contained two major clades whose members contained different genes yet identical functions. In addition, evidence for horizontal gene transfer between strains of L. delbrueckii, L. plantarum, and L. rhamnosus, and between species of the genus Lactobacillus is presented. Our data provide evidence for evolution of some lactobacilli according to a parapatric-like model for species differentiation. Core-genome trees are useful to detect evolutionary relationships in lactobacilli and might be useful in taxonomic analyses. Lactobacillus' evolution is directed by the environment and HGT.

  11. Comparative genomics of ParaHox clusters of teleost fishes: gene cluster breakup and the retention of gene sets following whole genome duplications

    PubMed Central

    Siegel, Nicol; Hoegg, Simone; Salzburger, Walter; Braasch, Ingo; Meyer, Axel

    2007-01-01

    Background The evolutionary lineage leading to the teleost fish underwent a whole genome duplication termed FSGD or 3R in addition to two prior genome duplications that took place earlier during vertebrate evolution (termed 1R and 2R). Resulting from the FSGD, additional copies of genes are present in fish, compared to tetrapods whose lineage did not experience the 3R genome duplication. Interestingly, we find that ParaHox genes do not differ in number in extant teleost fishes despite their additional genome duplication from the genomic situation in mammals, but they are distributed over twice as many paralogous regions in fish genomes. Results We determined the DNA sequence of the entire ParaHox C1 paralogon in the East African cichlid fish Astatotilapia burtoni, and compared it to orthologous regions in other vertebrate genomes as well as to the paralogous vertebrate ParaHox D paralogons. Evolutionary relationships among genes from these four chromosomal regions were studied with several phylogenetic algorithms. We provide evidence that the genes of the ParaHox C paralogous cluster are duplicated in teleosts, just as it had been shown previously for the D paralogon genes. Overall, however, synteny and cluster integrity seems to be less conserved in ParaHox gene clusters than in Hox gene clusters. Comparative analyses of non-coding sequences uncovered conserved, possibly co-regulatory elements, which are likely to contain promoter motives of the genes belonging to the ParaHox paralogons. Conclusion There seems to be strong stabilizing selection for gene order as well as gene orientation in the ParaHox C paralogon, since with a few exceptions, only the lengths of the introns and intergenic regions differ between the distantly related species examined. The high degree of evolutionary conservation of this gene cluster's architecture in particular – but possibly clusters of genes more generally – might be linked to the presence of promoter, enhancer or inhibitor motifs that serve to regulate more than just one gene. Therefore, deletions, inversions or relocations of individual genes could destroy the regulation of the clustered genes in this region. The existence of such a regulation network might explain the evolutionary conservation of gene order and orientation over the course of hundreds of millions of years of vertebrate evolution. Another possible explanation for the highly conserved gene order might be the existence of a regulator not located immediately next to its corresponding gene but further away since a relocation or inversion would possibly interrupt this interaction. Different ParaHox clusters were found to have experienced differential gene loss in teleosts. Yet the complete set of these homeobox genes was maintained, albeit distributed over almost twice the number of chromosomes. Selection due to dosage effects and/or stoichiometric disturbance might act more strongly to maintain a modal number of homeobox genes (and possibly transcription factors more generally) per genome, yet permit the accumulation of other (non regulatory) genes associated with these homeobox gene clusters. PMID:17822543

  12. Evolutionary insights from Erwinia amylovora genomics.

    PubMed

    Smits, Theo H M; Rezzonico, Fabio; Duffy, Brion

    2011-08-20

    Evolutionary genomics is coming into focus with the recent availability of complete sequences for many bacterial species. A hypothesis on the evolution of virulence factors in the plant pathogen Erwinia amylovora, the causative agent of fire blight, was generated using comparative genomics with the genomes E. amylovora, Erwinia pyrifoliae and Erwinia tasmaniensis. Putative virulence factors were mapped to the proposed genealogy of the genus Erwinia that is based on phylogenetic and genomic data. Ancestral origin of several virulence factors was identified, including levan biosynthesis, sorbitol metabolism, three T3SS and two T6SS. Other factors appeared to have been acquired after divergence of pathogenic species, including a second flagellar gene and two glycosyltransferases involved in amylovoran biosynthesis. E. amylovora singletons include 3 unique T3SS effectors that may explain differential virulence/host ranges. E. amylovora also has a unique T1SS export system, and a unique third T6SS gene cluster. Genetic analysis revealed signatures of foreign DNA suggesting that horizontal gene transfer is responsible for some of these differential features between the three species. Copyright © 2010 Elsevier B.V. All rights reserved.

  13. Effect of reference genome selection on the performance of computational methods for genome-wide protein-protein interaction prediction.

    PubMed

    Muley, Vijaykumar Yogesh; Ranjan, Akash

    2012-01-01

    Recent progress in computational methods for predicting physical and functional protein-protein interactions has provided new insights into the complexity of biological processes. Most of these methods assume that functionally interacting proteins are likely to have a shared evolutionary history. This history can be traced out for the protein pairs of a query genome by correlating different evolutionary aspects of their homologs in multiple genomes known as the reference genomes. These methods include phylogenetic profiling, gene neighborhood and co-occurrence of the orthologous protein coding genes in the same cluster or operon. These are collectively known as genomic context methods. On the other hand a method called mirrortree is based on the similarity of phylogenetic trees between two interacting proteins. Comprehensive performance analyses of these methods have been frequently reported in literature. However, very few studies provide insight into the effect of reference genome selection on detection of meaningful protein interactions. We analyzed the performance of four methods and their variants to understand the effect of reference genome selection on prediction efficacy. We used six sets of reference genomes, sampled in accordance with phylogenetic diversity and relationship between organisms from 565 bacteria. We used Escherichia coli as a model organism and the gold standard datasets of interacting proteins reported in DIP, EcoCyc and KEGG databases to compare the performance of the prediction methods. Higher performance for predicting protein-protein interactions was achievable even with 100-150 bacterial genomes out of 565 genomes. Inclusion of archaeal genomes in the reference genome set improves performance. We find that in order to obtain a good performance, it is better to sample few genomes of related genera of prokaryotes from the large number of available genomes. Moreover, such a sampling allows for selecting 50-100 genomes for comparable accuracy of predictions when computational resources are limited.

  14. Genomicus update 2015: KaryoView and MatrixView provide a genome-wide perspective to multispecies comparative genomics

    PubMed Central

    Louis, Alexandra; Nguyen, Nga Thi Thuy; Muffato, Matthieu; Roest Crollius, Hugues

    2015-01-01

    The Genomicus web server (http://www.genomicus.biologie.ens.fr/genomicus) is a visualization tool allowing comparative genomics in four different phyla (Vertebrate, Fungi, Metazoan and Plants). It provides access to genomic information from extant species, as well as ancestral gene content and gene order for vertebrates and flowering plants. Here we present the new features available for vertebrate genome with a focus on new graphical tools. The interface to enter the database has been improved, two pairwise genome comparison tools are now available (KaryoView and MatrixView) and the multiple genome comparison tools (PhyloView and AlignView) propose three new kinds of representation and a more intuitive menu. These new developments have been implemented for Genomicus portal dedicated to vertebrates. This allows the analysis of 68 extant animal genomes, as well as 58 ancestral reconstructed genomes. The Genomicus server also provides access to ancestral gene orders, to facilitate evolutionary and comparative genomics studies, as well as computationally predicted regulatory interactions, thanks to the representation of conserved non-coding elements with their putative gene targets. PMID:25378326

  15. Signatures of host specialization and a recent transposable element burst in the dynamic one-speed genome of the fungal barley powdery mildew pathogen.

    PubMed

    Frantzeskakis, Lamprinos; Kracher, Barbara; Kusch, Stefan; Yoshikawa-Maekawa, Makoto; Bauer, Saskia; Pedersen, Carsten; Spanu, Pietro D; Maekawa, Takaki; Schulze-Lefert, Paul; Panstruga, Ralph

    2018-05-22

    Powdery mildews are biotrophic pathogenic fungi infecting a number of economically important plants. The grass powdery mildew, Blumeria graminis, has become a model organism to study host specialization of obligate biotrophic fungal pathogens. We resolved the large-scale genomic architecture of B. graminis forma specialis hordei (Bgh) to explore the potential influence of its genome organization on the co-evolutionary process with its host plant, barley (Hordeum vulgare). The near-chromosome level assemblies of the Bgh reference isolate DH14 and one of the most diversified isolates, RACE1, enabled a comparative analysis of these haploid genomes, which are highly enriched with transposable elements (TEs). We found largely retained genome synteny and gene repertoires, yet detected copy number variation (CNV) of secretion signal peptide-containing protein-coding genes (SPs) and locally disrupted synteny blocks. Genes coding for sequence-related SPs are often locally clustered, but neither the SPs nor the TEs reside preferentially in genomic regions with unique features. Extended comparative analysis with different host-specific B. graminis formae speciales revealed the existence of a core suite of SPs, but also isolate-specific SP sets as well as congruence of SP CNV and phylogenetic relationship. We further detected evidence for a recent, lineage-specific expansion of TEs in the Bgh genome. The characteristics of the Bgh genome (largely retained synteny, CNV of SP genes, recently proliferated TEs and a lack of significant compartmentalization) are consistent with a "one-speed" genome that differs in its architecture and (co-)evolutionary pattern from the "two-speed" genomes reported for several other filamentous phytopathogens.

  16. Comparative genomics of metabolic capacities of regulons controlled by cis-regulatory RNA motifs in bacteria.

    PubMed

    Sun, Eric I; Leyn, Semen A; Kazanov, Marat D; Saier, Milton H; Novichkov, Pavel S; Rodionov, Dmitry A

    2013-09-02

    In silico comparative genomics approaches have been efficiently used for functional prediction and reconstruction of metabolic and regulatory networks. Riboswitches are metabolite-sensing structures often found in bacterial mRNA leaders controlling gene expression on transcriptional or translational levels.An increasing number of riboswitches and other cis-regulatory RNAs have been recently classified into numerous RNA families in the Rfam database. High conservation of these RNA motifs provides a unique advantage for their genomic identification and comparative analysis. A comparative genomics approach implemented in the RegPredict tool was used for reconstruction and functional annotation of regulons controlled by RNAs from 43 Rfam families in diverse taxonomic groups of Bacteria. The inferred regulons include ~5200 cis-regulatory RNAs and more than 12000 target genes in 255 microbial genomes. All predicted RNA-regulated genes were classified into specific and overall functional categories. Analysis of taxonomic distribution of these categories allowed us to establish major functional preferences for each analyzed cis-regulatory RNA motif family. Overall, most RNA motif regulons showed predictable functional content in accordance with their experimentally established effector ligands. Our results suggest that some RNA motifs (including thiamin pyrophosphate and cobalamin riboswitches that control the cofactor metabolism) are widespread and likely originated from the last common ancestor of all bacteria. However, many more analyzed RNA motifs are restricted to a narrow taxonomic group of bacteria and likely represent more recent evolutionary innovations. The reconstructed regulatory networks for major known RNA motifs substantially expand the existing knowledge of transcriptional regulation in bacteria. The inferred regulons can be used for genetic experiments, functional annotations of genes, metabolic reconstruction and evolutionary analysis. The obtained genome-wide collection of reference RNA motif regulons is available in the RegPrecise database (http://regprecise.lbl.gov/).

  17. Contrasting patterns of evolutionary constraint and novelty revealed by comparative sperm proteomic analysis in Lepidoptera.

    PubMed

    Whittington, Emma; Forsythe, Desiree; Borziak, Kirill; Karr, Timothy L; Walters, James R; Dorus, Steve

    2017-12-02

    Rapid evolution is a hallmark of reproductive genetic systems and arises through the combined processes of sequence divergence, gene gain and loss, and changes in gene and protein expression. While studies aiming to disentangle the molecular ramifications of these processes are progressing, we still know little about the genetic basis of evolutionary transitions in reproductive systems. Here we conduct the first comparative analysis of sperm proteomes in Lepidoptera, a group that exhibits dichotomous spermatogenesis, in which males produce a functional fertilization-competent sperm (eupyrene) and an incompetent sperm morph lacking nuclear DNA (apyrene). Through the integrated application of evolutionary proteomics and genomics, we characterize the genomic patterns potentially associated with the origination and evolution of this unique spermatogenic process and assess the importance of genetic novelty in Lepidopteran sperm biology. Comparison of the newly characterized Monarch butterfly (Danaus plexippus) sperm proteome to those of the Carolina sphinx moth (Manduca sexta) and the fruit fly (Drosophila melanogaster) demonstrated conservation at the level of protein abundance and post-translational modification within Lepidoptera. In contrast, comparative genomic analyses across insects reveals significant divergence at two levels that differentiate the genetic architecture of sperm in Lepidoptera from other insects. First, a significant reduction in orthology among Monarch sperm genes relative to the remainder of the genome in non-Lepidopteran insect species was observed. Second, a substantial number of sperm proteins were found to be specific to Lepidoptera, in that they lack detectable homology to the genomes of more distantly related insects. Lastly, the functional importance of Lepidoptera specific sperm proteins is broadly supported by their increased abundance relative to proteins conserved across insects. Our results identify a burst of genetic novelty amongst sperm proteins that may be associated with the origin of heteromorphic spermatogenesis in ancestral Lepidoptera and/or the subsequent evolution of this system. This pattern of genomic diversification is distinct from the remainder of the genome and thus suggests that this transition has had a marked impact on lepidopteran genome evolution. The identification of abundant sperm proteins unique to Lepidoptera, including proteins distinct between specific lineages, will accelerate future functional studies aiming to understand the developmental origin of dichotomous spermatogenesis and the functional diversification of the fertilization incompetent apyrene sperm morph.

  18. Tracing the evolutionary origin of vertebrate skeletal tissues: insights from cephalochordate amphioxus.

    PubMed

    Yong, Luok Wen; Yu, Jr-Kai

    2016-08-01

    Vertebrate mineralized skeletal tissues are widely considered as an evolutionary novelty. Despite the importance of these tissues to the adaptation and radiation of vertebrate animals, the evolutionary origin of vertebrate skeletal tissues remains largely unclear. Cephalochordates (Amphioxus) occupy a key phylogenetic position and can serve as a valuable model for studying the evolution of vertebrate skeletal tissues. Here we summarize recent advances in amphioxus developmental biology and comparative genomics that can help to elucidate the evolutionary origins of the vertebrate skeletal tissues and their underlying developmental gene regulatory networks (GRN). By making comparisons to the developmental studies in vertebrate models and recent discoveries in paleontology and genomics, it becomes evident that the collagen matrix-based connective tissues secreted by the somite-derived cells in amphioxus likely represent the rudimentary skeletal tissues in chordates. We propose that upon the foundation of this collagenous precursor, novel tissue mineralization genes that arose from gene duplications were incorporated into an ancestral mesodermal GRN that makes connective and supporting tissues, leading to the emergence of highly-mineralized skeletal tissues in early vertebrates. Copyright © 2016 Elsevier Ltd. All rights reserved.

  19. Probabilistic modeling of the evolution of gene synteny within reconciled phylogenies

    PubMed Central

    2015-01-01

    Background Most models of genome evolution concern either genetic sequences, gene content or gene order. They sometimes integrate two of the three levels, but rarely the three of them. Probabilistic models of gene order evolution usually have to assume constant gene content or adopt a presence/absence coding of gene neighborhoods which is blind to complex events modifying gene content. Results We propose a probabilistic evolutionary model for gene neighborhoods, allowing genes to be inserted, duplicated or lost. It uses reconciled phylogenies, which integrate sequence and gene content evolution. We are then able to optimize parameters such as phylogeny branch lengths, or probabilistic laws depicting the diversity of susceptibility of syntenic regions to rearrangements. We reconstruct a structure for ancestral genomes by optimizing a likelihood, keeping track of all evolutionary events at the level of gene content and gene synteny. Ancestral syntenies are associated with a probability of presence. We implemented the model with the restriction that at most one gene duplication separates two gene speciations in reconciled gene trees. We reconstruct ancestral syntenies on a set of 12 drosophila genomes, and compare the evolutionary rates along the branches and along the sites. We compare with a parsimony method and find a significant number of results not supported by the posterior probability. The model is implemented in the Bio++ library. It thus benefits from and enriches the classical models and methods for molecular evolution. PMID:26452018

  20. Evolutionary relationships of Fusobacterium nucleatum based on phylogenetic analysis and comparative genomics

    PubMed Central

    Mira, Alex; Pushker, Ravindra; Legault, Boris A; Moreira, David; Rodríguez-Valera, Francisco

    2004-01-01

    Background The phylogenetic position and evolutionary relationships of Fusobacteria remain uncertain. Especially intriguing is their relatedness to low G+C Gram positive bacteria (Firmicutes) by ribosomal molecular phylogenies, but their possession of a typical gram negative outer membrane. Taking advantage of the recent completion of the Fusobacterium nucleatum genome sequence we have examined the evolutionary relationships of Fusobacterium genes by phylogenetic analysis and comparative genomics tools. Results The data indicate that Fusobacterium has a core genome of a very different nature to other bacterial lineages, and branches out at the base of Firmicutes. However, depending on the method used, 35–56% of Fusobacterium genes appear to have a xenologous origin from bacteroidetes, proteobacteria, spirochaetes and the Firmicutes themselves. A high number of hypothetical ORFs with unusual codon usage and short lengths were found and hypothesized to be remnants of transferred genes that were discarded. Some proteins and operons are also hypothesized to be of mixed ancestry. A large portion of the Gram-negative cell wall-related genes seems to have been transferred from proteobacteria. Conclusions Many instances of similarity to other inhabitants of the dental plaque that have been sequenced were found. This suggests that the close physical contact found in this environment might facilitate horizontal gene transfer, supporting the idea of niche-specific gene pools. We hypothesize that at a point in time, probably associated to the rise of mammals, a strong selective pressure might have existed for a cell with a Clostridia-like metabolic apparatus but with the adhesive and immune camouflage features of Proteobacteria. PMID:15566569

  1. DCODE.ORG Anthology of Comparative Genomic Tools

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Loots, G G; Ovcharenko, I

    2005-01-11

    Comparative genomics provides the means to demarcate functional regions in anonymous DNA sequences. The successful application of this method to identifying novel genes is currently shifting to deciphering the noncoding encryption of gene regulation across genomes. To facilitate the use of comparative genomics to practical applications in genetics and genomics we have developed several analytical and visualization tools for the analysis of arbitrary sequences and whole genomes. These tools include two alignment tools: zPicture and Mulan; a phylogenetic shadowing tool: eShadow for identifying lineage- and species-specific functional elements; two evolutionary conserved transcription factor analysis tools: rVista and multiTF; a toolmore » for extracting cis-regulatory modules governing the expression of co-regulated genes, CREME; and a dynamic portal to multiple vertebrate and invertebrate genome alignments, the ECR Browser. Here we briefly describe each one of these tools and provide specific examples on their practical applications. All the tools are publicly available at the http://www.dcode.org/ web site.« less

  2. RNAseq versus genome-predicted transcriptomes: a large population of novel transcripts identified in an Illumina-454 Hydra transcriptome.

    PubMed

    Wenger, Yvan; Galliot, Brigitte

    2013-03-25

    Evolutionary studies benefit from deep sequencing technologies that generate genomic and transcriptomic sequences from a variety of organisms. Genome sequencing and RNAseq have complementary strengths. In this study, we present the assembly of the most complete Hydra transcriptome to date along with a comparative analysis of the specific features of RNAseq and genome-predicted transcriptomes currently available in the freshwater hydrozoan Hydra vulgaris. To produce an accurate and extensive Hydra transcriptome, we combined Illumina and 454 Titanium reads, giving the primacy to Illumina over 454 reads to correct homopolymer errors. This strategy yielded an RNAseq transcriptome that contains 48'909 unique sequences including splice variants, representing approximately 24'450 distinct genes. Comparative analysis to the available genome-predicted transcriptomes identified 10'597 novel Hydra transcripts that encode 529 evolutionarily-conserved proteins. The annotation of 170 human orthologs points to critical functions in protein biosynthesis, FGF and TOR signaling, vesicle transport, immunity, cell cycle regulation, cell death, mitochondrial metabolism, transcription and chromatin regulation. However, a majority of these novel transcripts encodes short ORFs, at least 767 of them corresponding to pseudogenes. This RNAseq transcriptome also lacks 11'270 predicted transcripts that correspond either to silent genes or to genes expressed below the detection level of this study. We established a simple and powerful strategy to combine Illumina and 454 reads and we produced, with genome assistance, an extensive and accurate Hydra transcriptome. The comparative analysis of the RNAseq transcriptome with genome-predicted transcriptomes lead to the identification of large populations of novel as well as missing transcripts that might reflect Hydra-specific evolutionary events.

  3. RNAseq versus genome-predicted transcriptomes: a large population of novel transcripts identified in an Illumina-454 Hydra transcriptome

    PubMed Central

    2013-01-01

    Background Evolutionary studies benefit from deep sequencing technologies that generate genomic and transcriptomic sequences from a variety of organisms. Genome sequencing and RNAseq have complementary strengths. In this study, we present the assembly of the most complete Hydra transcriptome to date along with a comparative analysis of the specific features of RNAseq and genome-predicted transcriptomes currently available in the freshwater hydrozoan Hydra vulgaris. Results To produce an accurate and extensive Hydra transcriptome, we combined Illumina and 454 Titanium reads, giving the primacy to Illumina over 454 reads to correct homopolymer errors. This strategy yielded an RNAseq transcriptome that contains 48’909 unique sequences including splice variants, representing approximately 24’450 distinct genes. Comparative analysis to the available genome-predicted transcriptomes identified 10’597 novel Hydra transcripts that encode 529 evolutionarily-conserved proteins. The annotation of 170 human orthologs points to critical functions in protein biosynthesis, FGF and TOR signaling, vesicle transport, immunity, cell cycle regulation, cell death, mitochondrial metabolism, transcription and chromatin regulation. However, a majority of these novel transcripts encodes short ORFs, at least 767 of them corresponding to pseudogenes. This RNAseq transcriptome also lacks 11’270 predicted transcripts that correspond either to silent genes or to genes expressed below the detection level of this study. Conclusions We established a simple and powerful strategy to combine Illumina and 454 reads and we produced, with genome assistance, an extensive and accurate Hydra transcriptome. The comparative analysis of the RNAseq transcriptome with genome-predicted transcriptomes lead to the identification of large populations of novel as well as missing transcripts that might reflect Hydra-specific evolutionary events. PMID:23530871

  4. Comparative and Evolutionary Analysis of Grass Pollen Allergens Using Brachypodium distachyon as a Model System.

    PubMed

    Sharma, Akanksha; Sharma, Niharika; Bhalla, Prem; Singh, Mohan

    2017-01-01

    Comparative genomics have facilitated the mining of biological information from a genome sequence, through the detection of similarities and differences with genomes of closely or more distantly related species. By using such comparative approaches, knowledge can be transferred from the model to non-model organisms and insights can be gained in the structural and evolutionary patterns of specific genes. In the absence of sequenced genomes for allergenic grasses, this study was aimed at understanding the structure, organisation and expression profiles of grass pollen allergens using the genomic data from Brachypodium distachyon as it is phylogenetically related to the allergenic grasses. Combining genomic data with the anther RNA-Seq dataset revealed 24 pollen allergen genes belonging to eight allergen groups mapping on the five chromosomes in B. distachyon. High levels of anther-specific expression profiles were observed for the 24 identified putative allergen-encoding genes in Brachypodium. The genomic evidence suggests that gene encoding the group 5 allergen, the most potent trigger of hay fever and allergic asthma originated as a pollen specific orphan gene in a common grass ancestor of Brachypodium and Triticiae clades. Gene structure analysis showed that the putative allergen-encoding genes in Brachypodium either lack or contain reduced number of introns. Promoter analysis of the identified Brachypodium genes revealed the presence of specific cis-regulatory sequences likely responsible for high anther/pollen-specific expression. With the identification of putative allergen-encoding genes in Brachypodium, this study has also described some important plant gene families (e.g. expansin superfamily, EF-Hand family, profilins etc) for the first time in the model plant Brachypodium. Altogether, the present study provides new insights into structural characterization and evolution of pollen allergens and will further serve as a base for their functional characterization in related grass species.

  5. Genome Dynamics in Legionella: The Basis of Versatility and Adaptation to Intracellular Replication

    PubMed Central

    Gomez-Valero, Laura; Buchrieser, Carmen

    2013-01-01

    Legionella pneumophila is a bacterial pathogen present in aquatic environments that can cause a severe pneumonia called Legionnaires’ disease. Soon after its recognition, it was shown that Legionella replicates inside amoeba, suggesting that bacteria replicating in environmental protozoa are able to exploit conserved signaling pathways in human phagocytic cells. Comparative, evolutionary, and functional genomics suggests that the Legionella–amoeba interaction has shaped this pathogen more than previously thought. A complex evolutionary scenario involving mobile genetic elements, type IV secretion systems, and horizontal gene transfer among Legionella, amoeba, and other organisms seems to take place. This long-lasting coevolution led to the development of very sophisticated virulence strategies and a high level of temporal and spatial fine-tuning of bacteria host–cell interactions. We will discuss current knowledge of the evolution of virulence of Legionella from a genomics perspective and propose our vision of the emergence of this human pathogen from the environment. PMID:23732852

  6. Genomic signatures of evolutionary transitions from solitary to group living

    PubMed Central

    Kapheim, Karen M.; Pan, Hailin; Li, Cai; Salzberg, Steven L.; Puiu, Daniela; Magoc, Tanja; Robertson, Hugh M.; Hudson, Matthew E.; Venkat, Aarti; Fischman, Brielle J.; Hernandez, Alvaro; Yandell, Mark; Ence, Daniel; Holt, Carson; Yocum, George D.; Kemp, William P.; Bosch, Jordi; Waterhouse, Robert M.; Zdobnov, Evgeny M.; Stolle, Eckart; Kraus, F. Bernhard; Helbing, Sophie; Moritz, Robin F. A.; Glastad, Karl M.; Hunt, Brendan G.; Goodisman, Michael A. D.; Hauser, Frank; Grimmelikhuijzen, Cornelis J. P.; Pinheiro, Daniel Guariz; Nunes, Francis Morais Franco; Soares, Michelle Prioli Miranda; Tanaka, Érica Donato; Simões, Zilá Luz Paulino; Hartfelder, Klaus; Evans, Jay D.; Barribeau, Seth M.; Johnson, Reed M.; Massey, Jonathan H.; Southey, Bruce R.; Hasselmann, Martin; Hamacher, Daniel; Biewer, Matthias; Kent, Clement F.; Zayed, Amro; Blatti, Charles; Sinha, Saurabh; Johnston, J. Spencer; Hanrahan, Shawn J.; Kocher, Sarah D.; Wang, Jun; Robinson, Gene E.; Zhang, Guojie

    2017-01-01

    The evolution of eusociality is one of the major transitions in evolution, but the underlying genomic changes are unknown. We compared the genomes of 10 bee species that vary in social complexity, representing multiple independent transitions in social evolution, and report three major findings. First, many important genes show evidence of neutral evolution as a consequence of relaxed selection with increasing social complexity. Second, there is no single road map to eusociality; independent evolutionary transitions in sociality have independent genetic underpinnings. Third, though clearly independent in detail, these transitions do have similar general features, including an increase in constrained protein evolution accompanied by increases in the potential for gene regulation and decreases in diversity and abundance of transposable elements. Eusociality may arise through different mechanisms each time, but would likely always involve an increase in the complexity of gene networks. PMID:25977371

  7. Social evolution. Genomic signatures of evolutionary transitions from solitary to group living.

    PubMed

    Kapheim, Karen M; Pan, Hailin; Li, Cai; Salzberg, Steven L; Puiu, Daniela; Magoc, Tanja; Robertson, Hugh M; Hudson, Matthew E; Venkat, Aarti; Fischman, Brielle J; Hernandez, Alvaro; Yandell, Mark; Ence, Daniel; Holt, Carson; Yocum, George D; Kemp, William P; Bosch, Jordi; Waterhouse, Robert M; Zdobnov, Evgeny M; Stolle, Eckart; Kraus, F Bernhard; Helbing, Sophie; Moritz, Robin F A; Glastad, Karl M; Hunt, Brendan G; Goodisman, Michael A D; Hauser, Frank; Grimmelikhuijzen, Cornelis J P; Pinheiro, Daniel Guariz; Nunes, Francis Morais Franco; Soares, Michelle Prioli Miranda; Tanaka, Érica Donato; Simões, Zilá Luz Paulino; Hartfelder, Klaus; Evans, Jay D; Barribeau, Seth M; Johnson, Reed M; Massey, Jonathan H; Southey, Bruce R; Hasselmann, Martin; Hamacher, Daniel; Biewer, Matthias; Kent, Clement F; Zayed, Amro; Blatti, Charles; Sinha, Saurabh; Johnston, J Spencer; Hanrahan, Shawn J; Kocher, Sarah D; Wang, Jun; Robinson, Gene E; Zhang, Guojie

    2015-06-05

    The evolution of eusociality is one of the major transitions in evolution, but the underlying genomic changes are unknown. We compared the genomes of 10 bee species that vary in social complexity, representing multiple independent transitions in social evolution, and report three major findings. First, many important genes show evidence of neutral evolution as a consequence of relaxed selection with increasing social complexity. Second, there is no single road map to eusociality; independent evolutionary transitions in sociality have independent genetic underpinnings. Third, though clearly independent in detail, these transitions do have similar general features, including an increase in constrained protein evolution accompanied by increases in the potential for gene regulation and decreases in diversity and abundance of transposable elements. Eusociality may arise through different mechanisms each time, but would likely always involve an increase in the complexity of gene networks. Copyright © 2015, American Association for the Advancement of Science.

  8. Beyond Linear Sequence Comparisons: The use of genome-levelcharacters for phylogenetic reconstruction

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Boore, Jeffrey L.

    2004-11-27

    Although the phylogenetic relationships of many organisms have been convincingly resolved by the comparisons of nucleotide or amino acid sequences, others have remained equivocal despite great effort. Now that large-scale genome sequencing projects are sampling many lineages, it is becoming feasible to compare large data sets of genome-level features and to develop this as a tool for phylogenetic reconstruction that has advantages over conventional sequence comparisons. Although it is unlikely that these will address a large number of evolutionary branch points across the broad tree of life due to the infeasibility of such sampling, they have great potential for convincinglymore » resolving many critical, contested relationships for which no other data seems promising. However, it is important that we recognize potential pitfalls, establish reasonable standards for acceptance, and employ rigorous methodology to guard against a return to earlier days of scenario-driven evolutionary reconstructions.« less

  9. The draft genome of the parasitic nematode Trichinella spiralis

    PubMed Central

    Mitreva, Makedonka; Jasmer, Douglas P.; Zarlenga, Dante S.; Wang, Zhengyuan; Abubucker, Sahar; Martin, John; Taylor, Christina M.; Yin, Yong; Fulton, Lucinda; Minx, Pat; Yang, Shiaw-Pyng; Warren, Wesley C.; Fulton, Robert S.; Bhonagiri, Veena; Zhang, Xu; Hallsworth-Pepin, Kym; Clifton, Sandra W.; McCarter, James P.; Appleton, Judith; Mardis, Elaine R.; Wilson, Richard K.

    2011-01-01

    Genome-based studies of metazoan evolution are most informative when phylogenetically diverse species are incorporated in the analysis. As such, evolutionary trends within and outside the phylum Nematoda have been less revealing by focusing only on comparisons involving Caenorhabditis elegans. Herein, we present a draft of the 64 megabase nuclear genome of Trichinella spiralis, containing 15,808 protein coding genes. This parasitic nematode is an extant member of a clade that diverged early in the evolution of the phylum enabling identification of archetypical genes and molecular signatures exclusive to nematodes. Comparative analyses support intrachromosomal rearrangements across the phylum, disproportionate numbers of protein family deaths over births in parasitic vs. a non-parasitic nematode, and a preponderance of gene loss and gain events in nematodes relative to Drosophila melanogaster. This sequence and the panphylum characteristics identified herein will advance evolutionary studies and strategies to combat global parasites of humans, food animals and crops. PMID:21336279

  10. Genome dynamics in Legionella: the basis of versatility and adaptation to intracellular replication.

    PubMed

    Gomez-Valero, Laura; Buchrieser, Carmen

    2013-06-01

    Legionella pneumophila is a bacterial pathogen present in aquatic environments that can cause a severe pneumonia called Legionnaires' disease. Soon after its recognition, it was shown that Legionella replicates inside amoeba, suggesting that bacteria replicating in environmental protozoa are able to exploit conserved signaling pathways in human phagocytic cells. Comparative, evolutionary, and functional genomics suggests that the Legionella-amoeba interaction has shaped this pathogen more than previously thought. A complex evolutionary scenario involving mobile genetic elements, type IV secretion systems, and horizontal gene transfer among Legionella, amoeba, and other organisms seems to take place. This long-lasting coevolution led to the development of very sophisticated virulence strategies and a high level of temporal and spatial fine-tuning of bacteria host-cell interactions. We will discuss current knowledge of the evolution of virulence of Legionella from a genomics perspective and propose our vision of the emergence of this human pathogen from the environment.

  11. Genomic insights into the evolutionary origin of Myxozoa within Cnidaria

    PubMed Central

    Chang, E. Sally; Neuhof, Moran; Rubinstein, Nimrod D.; Diamant, Arik; Philippe, Hervé; Huchon, Dorothée; Cartwright, Paulyn

    2015-01-01

    The Myxozoa comprise over 2,000 species of microscopic obligate parasites that use both invertebrate and vertebrate hosts as part of their life cycle. Although the evolutionary origin of myxozoans has been elusive, a close relationship with cnidarians, a group that includes corals, sea anemones, jellyfish, and hydroids, is supported by some phylogenetic studies and the observation that the distinctive myxozoan structure, the polar capsule, is remarkably similar to the stinging structures (nematocysts) in cnidarians. To gain insight into the extreme evolutionary transition from a free-living cnidarian to a microscopic endoparasite, we analyzed genomic and transcriptomic assemblies from two distantly related myxozoan species, Kudoa iwatai and Myxobolus cerebralis, and compared these to the transcriptome and genome of the less reduced cnidarian parasite, Polypodium hydriforme. A phylogenomic analysis, using for the first time to our knowledge, a taxonomic sampling that represents the breadth of myxozoan diversity, including four newly generated myxozoan assemblies, confirms that myxozoans are cnidarians and are a sister taxon to P. hydriforme. Estimations of genome size reveal that myxozoans have one of the smallest reported animal genomes. Gene enrichment analyses show depletion of expressed genes in categories related to development, cell differentiation, and cell–cell communication. In addition, a search for candidate genes indicates that myxozoans lack key elements of signaling pathways and transcriptional factors important for multicellular development. Our results suggest that the degeneration of the myxozoan body plan from a free-living cnidarian to a microscopic parasitic cnidarian was accompanied by extreme reduction in genome size and gene content. PMID:26627241

  12. Genomic insights into the evolutionary origin of Myxozoa within Cnidaria.

    PubMed

    Chang, E Sally; Neuhof, Moran; Rubinstein, Nimrod D; Diamant, Arik; Philippe, Hervé; Huchon, Dorothée; Cartwright, Paulyn

    2015-12-01

    The Myxozoa comprise over 2,000 species of microscopic obligate parasites that use both invertebrate and vertebrate hosts as part of their life cycle. Although the evolutionary origin of myxozoans has been elusive, a close relationship with cnidarians, a group that includes corals, sea anemones, jellyfish, and hydroids, is supported by some phylogenetic studies and the observation that the distinctive myxozoan structure, the polar capsule, is remarkably similar to the stinging structures (nematocysts) in cnidarians. To gain insight into the extreme evolutionary transition from a free-living cnidarian to a microscopic endoparasite, we analyzed genomic and transcriptomic assemblies from two distantly related myxozoan species, Kudoa iwatai and Myxobolus cerebralis, and compared these to the transcriptome and genome of the less reduced cnidarian parasite, Polypodium hydriforme. A phylogenomic analysis, using for the first time to our knowledge, a taxonomic sampling that represents the breadth of myxozoan diversity, including four newly generated myxozoan assemblies, confirms that myxozoans are cnidarians and are a sister taxon to P. hydriforme. Estimations of genome size reveal that myxozoans have one of the smallest reported animal genomes. Gene enrichment analyses show depletion of expressed genes in categories related to development, cell differentiation, and cell-cell communication. In addition, a search for candidate genes indicates that myxozoans lack key elements of signaling pathways and transcriptional factors important for multicellular development. Our results suggest that the degeneration of the myxozoan body plan from a free-living cnidarian to a microscopic parasitic cnidarian was accompanied by extreme reduction in genome size and gene content.

  13. Comparative Genomics of Listeria Sensu Lato: Genus-Wide Differences in Evolutionary Dynamics and the Progressive Gain of Complex, Potentially Pathogenicity-Related Traits through Lateral Gene Transfer.

    PubMed

    Chiara, Matteo; Caruso, Marta; D'Erchia, Anna Maria; Manzari, Caterina; Fraccalvieri, Rosa; Goffredo, Elisa; Latorre, Laura; Miccolupo, Angela; Padalino, Iolanda; Santagada, Gianfranco; Chiocco, Doriano; Pesole, Graziano; Horner, David S; Parisi, Antonio

    2015-07-15

    Historically, genome-wide and molecular characterization of the genus Listeria has concentrated on the important human pathogen Listeria monocytogenes and a small number of closely related species, together termed Listeria sensu strictu. More recently, a number of genome sequences for more basal, and nonpathogenic, members of the Listeria genus have become available, facilitating a wider perspective on the evolution of pathogenicity and genome level evolutionary dynamics within the entire genus (termed Listeria sensu lato). Here, we have sequenced the genomes of additional Listeria fleischmannii and Listeria newyorkensis isolates and explored the dynamics of genome evolution in Listeria sensu lato. Our analyses suggest that acquisition of genetic material through gene duplication and divergence as well as through lateral gene transfer (mostly from outside Listeria) is widespread throughout the genus. Novel genetic material is apparently subject to rapid turnover. Multiple lines of evidence point to significant differences in evolutionary dynamics between the most basal Listeria subclade and all other congeners, including both sensu strictu and other sensu lato isolates. Strikingly, these differences are likely attributable to stochastic, population-level processes and contribute to observed variation in genome size across the genus. Notably, our analyses indicate that the common ancestor of Listeria sensu lato lacked flagella, which were acquired by lateral gene transfer by a common ancestor of Listeria grayi and Listeria sensu strictu, whereas a recently functionally characterized pathogenicity island, responsible for the capacity to produce cobalamin and utilize ethanolamine/propane-2-diol, was acquired in an ancestor of Listeria sensu strictu. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  14. Comparative genomics of transcriptional regulation of methionine metabolism in proteobacteria

    DOE PAGES

    Leyn, Semen A.; Suvorova, Inna A.; Kholina, Tatiana D.; ...

    2014-11-20

    Methionine metabolism and uptake genes in Proteobacteria are controlled by a variety of RNA and DNA regulatory systems. We have applied comparative genomics to reconstruct regulons for three known transcription factors, MetJ, MetR, and SahR, and three known riboswitch motifs, SAH, SAM-SAH, and SAM_alpha, in ~200 genomes from 22 taxonomic groups of Proteobacteria. We also identified two novel regulons: a SahR-like transcription factor SamR controlling various methionine biosynthesis genes in the Xanthomonadales group, and a potential RNA regulatory element with terminator-antiterminator mechanism controlling the metX or metZ genes in beta-proteobacteria. For each analyzed regulator we identified the core, taxon-specific andmore » genome-specific regulon members. By analyzing the distribution of these regulators in bacterial genomes and by comparing their regulon contents we elucidated possible evolutionary scenarios for the regulation of the methionine metabolism genes in Proteobacteria.« less

  15. Human genomic disease variants: a neutral evolutionary explanation.

    PubMed

    Dudley, Joel T; Kim, Yuseob; Liu, Li; Markov, Glenn J; Gerold, Kristyn; Chen, Rong; Butte, Atul J; Kumar, Sudhir

    2012-08-01

    Many perspectives on the role of evolution in human health include nonempirical assumptions concerning the adaptive evolutionary origins of human diseases. Evolutionary analyses of the increasing wealth of clinical and population genomic data have begun to challenge these presumptions. In order to systematically evaluate such claims, the time has come to build a common framework for an empirical and intellectual unification of evolution and modern medicine. We review the emerging evidence and provide a supporting conceptual framework that establishes the classical neutral theory of molecular evolution (NTME) as the basis for evaluating disease- associated genomic variations in health and medicine. For over a decade, the NTME has already explained the origins and distribution of variants implicated in diseases and has illuminated the power of evolutionary thinking in genomic medicine. We suggest that a majority of disease variants in modern populations will have neutral evolutionary origins (previously neutral), with a relatively smaller fraction exhibiting adaptive evolutionary origins (previously adaptive). This pattern is expected to hold true for common as well as rare disease variants. Ultimately, a neutral evolutionary perspective will provide medicine with an informative and actionable framework that enables objective clinical assessment beyond convenient tendencies to invoke past adaptive events in human history as a root cause of human disease.

  16. Human genomic disease variants: A neutral evolutionary explanation

    PubMed Central

    Dudley, Joel T.; Kim, Yuseob; Liu, Li; Markov, Glenn J.; Gerold, Kristyn; Chen, Rong; Butte, Atul J.; Kumar, Sudhir

    2012-01-01

    Many perspectives on the role of evolution in human health include nonempirical assumptions concerning the adaptive evolutionary origins of human diseases. Evolutionary analyses of the increasing wealth of clinical and population genomic data have begun to challenge these presumptions. In order to systematically evaluate such claims, the time has come to build a common framework for an empirical and intellectual unification of evolution and modern medicine. We review the emerging evidence and provide a supporting conceptual framework that establishes the classical neutral theory of molecular evolution (NTME) as the basis for evaluating disease- associated genomic variations in health and medicine. For over a decade, the NTME has already explained the origins and distribution of variants implicated in diseases and has illuminated the power of evolutionary thinking in genomic medicine. We suggest that a majority of disease variants in modern populations will have neutral evolutionary origins (previously neutral), with a relatively smaller fraction exhibiting adaptive evolutionary origins (previously adaptive). This pattern is expected to hold true for common as well as rare disease variants. Ultimately, a neutral evolutionary perspective will provide medicine with an informative and actionable framework that enables objective clinical assessment beyond convenient tendencies to invoke past adaptive events in human history as a root cause of human disease. PMID:22665443

  17. Rapid diversification of five Oryza AA genomes associated with rice adaptation.

    PubMed

    Zhang, Qun-Jie; Zhu, Ting; Xia, En-Hua; Shi, Chao; Liu, Yun-Long; Zhang, Yun; Liu, Yuan; Jiang, Wen-Kai; Zhao, You-Jie; Mao, Shu-Yan; Zhang, Li-Ping; Huang, Hui; Jiao, Jun-Ying; Xu, Ping-Zhen; Yao, Qiu-Yang; Zeng, Fan-Chun; Yang, Li-Li; Gao, Ju; Tao, Da-Yun; Wang, Yue-Ju; Bennetzen, Jeffrey L; Gao, Li-Zhi

    2014-11-18

    Comparative genomic analyses among closely related species can greatly enhance our understanding of plant gene and genome evolution. We report de novo-assembled AA-genome sequences for Oryza nivara, Oryza glaberrima, Oryza barthii, Oryza glumaepatula, and Oryza meridionalis. Our analyses reveal massive levels of genomic structural variation, including segmental duplication and rapid gene family turnover, with particularly high instability in defense-related genes. We show, on a genomic scale, how lineage-specific expansion or contraction of gene families has led to their morphological and reproductive diversification, thus enlightening the evolutionary process of speciation and adaptation. Despite strong purifying selective pressures on most Oryza genes, we documented a large number of positively selected genes, especially those genes involved in flower development, reproduction, and resistance-related processes. These diversifying genes are expected to have played key roles in adaptations to their ecological niches in Asia, South America, Africa and Australia. Extensive variation in noncoding RNA gene numbers, function enrichment, and rates of sequence divergence might also help account for the different genetic adaptations of these rice species. Collectively, these resources provide new opportunities for evolutionary genomics, numerous insights into recent speciation, a valuable database of functional variation for crop improvement, and tools for efficient conservation of wild rice germplasm.

  18. Rapid diversification of five Oryza AA genomes associated with rice adaptation

    PubMed Central

    Zhang, Qun-Jie; Zhu, Ting; Xia, En-Hua; Shi, Chao; Liu, Yun-Long; Zhang, Yun; Liu, Yuan; Jiang, Wen-Kai; Zhao, You-Jie; Mao, Shu-Yan; Zhang, Li-Ping; Huang, Hui; Jiao, Jun-Ying; Xu, Ping-Zhen; Yao, Qiu-Yang; Zeng, Fan-Chun; Yang, Li-Li; Gao, Ju; Tao, Da-Yun; Wang, Yue-Ju; Bennetzen, Jeffrey L.; Gao, Li-Zhi

    2014-01-01

    Comparative genomic analyses among closely related species can greatly enhance our understanding of plant gene and genome evolution. We report de novo-assembled AA-genome sequences for Oryza nivara, Oryza glaberrima, Oryza barthii, Oryza glumaepatula, and Oryza meridionalis. Our analyses reveal massive levels of genomic structural variation, including segmental duplication and rapid gene family turnover, with particularly high instability in defense-related genes. We show, on a genomic scale, how lineage-specific expansion or contraction of gene families has led to their morphological and reproductive diversification, thus enlightening the evolutionary process of speciation and adaptation. Despite strong purifying selective pressures on most Oryza genes, we documented a large number of positively selected genes, especially those genes involved in flower development, reproduction, and resistance-related processes. These diversifying genes are expected to have played key roles in adaptations to their ecological niches in Asia, South America, Africa and Australia. Extensive variation in noncoding RNA gene numbers, function enrichment, and rates of sequence divergence might also help account for the different genetic adaptations of these rice species. Collectively, these resources provide new opportunities for evolutionary genomics, numerous insights into recent speciation, a valuable database of functional variation for crop improvement, and tools for efficient conservation of wild rice germplasm. PMID:25368197

  19. Comparative mitochondrial genome analysis reveals the evolutionary rearrangement mechanism in Brassica.

    PubMed

    Yang, J; Liu, G; Zhao, N; Chen, S; Liu, D; Ma, W; Hu, Z; Zhang, M

    2016-05-01

    The genus Brassica has many species that are important for oil, vegetable and other food products. Three mitochondrial genome types (mitotype) originated from its common ancestor. In this paper, a B. nigra mitochondrial main circle genome with 232,407 bp was generated through de novo assembly. Synteny analysis showed that the mitochondrial genomes of B. rapa and B. oleracea had a better syntenic relationship than B. nigra. Principal components analysis and development of a phylogenetic tree indicated maternal ancestors of three allotetraploid species in Us triangle of Brassica. Diversified mitotypes were found in allotetraploid B. napus, in which napus-type B. napus was derived from B. oleracea, while polima-type B. napus was inherited from B. rapa. In addition, the mitochondrial genome of napus-type B. napus was closer to botrytis-type than capitata-type B. oleracea. The sub-stoichiometric shifting of several mitochondrial genes suggested that mitochondrial genome rearrangement underwent evolutionary selection during domestication and/or plant breeding. Our findings clarify the role of diploid species in the maternal origin of allotetraploid species in Brassica and suggest the possibility of breeding selection of the mitochondrial genome. © 2015 German Botanical Society and The Royal Botanical Society of the Netherlands.

  20. Toward understanding the evolution of vertebrate gene regulatory networks: comparative genomics and epigenomic approaches.

    PubMed

    Martinez-Morales, Juan R

    2016-07-01

    Vertebrates, as most animal phyla, originated >500 million years ago during the Cambrian explosion, and progressively radiated into the extant classes. Inferring the evolutionary history of the group requires understanding the architecture of the developmental programs that constrain the vertebrate anatomy. Here, I review recent comparative genomic and epigenomic studies, based on ChIP-seq and chromatin accessibility, which focus on the identification of functionally equivalent cis-regulatory modules among species. This pioneer work, primarily centered in the mammalian lineage, has set the groundwork for further studies in representative vertebrate and chordate species. Mapping of active regulatory regions across lineages will shed new light on the evolutionary forces stabilizing ancestral developmental programs, as well as allowing their variation to sustain morphological adaptations on the inherited vertebrate body plan. © The Author 2015. Published by Oxford University Press. All rights reserved. For permissions, please email: journals.permissions@oup.com.

  1. Evolutionary Patterns of RNA-Based Duplication in Non-Mammalian Chordates

    PubMed Central

    Li, Xin; Vibranovski, Maria D.; Gan, Xiaoni; Wang, Dengqiang; Wang, Wen; Long, Manyuan; He, Shunping

    2011-01-01

    The role of RNA-based duplication, or retroposition, in the evolution of new gene functions in mammals, plants, and Drosophila has been widely reported. However, little is known about RNA-based duplication in non-mammalian chordates. In this study, we screened ten non-mammalian chordate genomes for retrocopies and investigated their evolutionary patterns. We identified numerous retrocopies in these species. Examination of the age distribution of these retrocopies revealed no burst of young retrocopies in ancient chordate species. Upon comparing these non-mammalian chordate species to the mammalian species, we observed that a larger fraction of the non-mammalian retrocopies was under strong evolutionary constraints than mammalian retrocopies are, as evidenced by signals of purifying selection and expression profiles. For the Western clawed frog, Medaka, and Sea squirt, many retrogenes have evolved gonad and brain expression patterns, similar to what was observed in human. Testing of retrogene movement in the Medaka genome, where the nascent sex chrosomes have been well assembled, did not reveal any significant gene movement. Taken together, our analyses demonstrate that RNA-based duplication generates many functional genes and can make a significant contribution to the evolution of non-mammalian genomes. PMID:21779328

  2. The emergence of human-evolutionary medical genomics

    PubMed Central

    Crespi, Bernard J

    2011-01-01

    In this review, I describe how evolutionary genomics is uniquely suited to spearhead advances in understanding human disease risk, owing to the privileged position of genes as fundamental causes of phenotypic variation, and the ability of population genetic and phylogenetic methods to robustly infer processes of natural selection, drift, and mutation from genetic variation at the levels of family, population, species, and clade. I first provide an overview of models for the origins and maintenance of genetically based disease risk in humans. I then discuss how analyses of genetic disease risk can be dovetailed with studies of positive and balancing selection, to evaluate the degree to which the ‘genes that make us human’ also represent the genes that mediate risk of polygenic disease. Finally, I present four basic principles for the nascent field of human evolutionary medical genomics, each of which represents a process that is nonintuitive from a proximate perspective. Joint consideration of these principles compels novel forms of interdisciplinary analyses, most notably studies that (i) analyze tradeoffs at the level of molecular genetics, and (ii) identify genetic variants that are derived in the human lineage or in specific populations, and then compare individuals with derived versus ancestral alleles. PMID:25567974

  3. The genomic landscape of rapid repeated evolutionary ...

    EPA Pesticide Factsheets

    Atlantic killifish populations have rapidly adapted to normally lethal levels of pollution in four urban estuaries. Through analysis of 384 whole killifish genome sequences and comparative transcriptomics in four pairs of sensitive and tolerant populations, we identify the aryl hydrocarbon receptor–based signaling pathway as a shared target of selection. This suggests evolutionary constraint on adaptive solutions to complex toxicant mixtures at each site. However, distinct molecular variants apparently contribute to adaptive pathway modification among tolerant populations. Selection also targets other toxicity-mediatinggenes and genes of connected signaling pathways; this indicates complex tolerance phenotypes and potentially compensatory adaptations. Molecular changes are consistent with selection on standing genetic variation. In killifish, high nucleotide diversityhas likely been a crucial substrate for selective sweeps to propel rapid adaptation. This manuscript describes genomic evaluations that contribute to our understanding of the ecological and evolutionary risks associated with chronic contaminant exposures to wildlife populations. Here, we assessed genetic patterns associated with long-term response to an important class of highly toxic environmental pollutants. Specifically, chemical-specific tolerance has rapidly and repeatedly evolved in an estuarine fish species resident to estuaries of the Atlantic U.S. coast. We used laboratory studies to ch

  4. Grand challenges in evolutionary and population genetics: The importance of integrating epigenetics, genomics, modeling, and experimentation

    Treesearch

    Samuel A. Cushman

    2014-01-01

    This is a time of explosive growth in the fields of evolutionary and population genetics, with whole genome sequencing and bioinformatics driving a transformative paradigm shift (Morozova and Marra, 2008). At the same time, advances in epigenetics are thoroughly transforming our understanding of evolutionary processes and their implications for populations, species and...

  5. Similar Ratios of Introns to Intergenic Sequence across Animal Genomes

    PubMed Central

    Wörheide, Gert

    2017-01-01

    Abstract One central goal of genome biology is to understand how the usage of the genome differs between organisms. Our knowledge of genome composition, needed for downstream inferences, is critically dependent on gene annotations, yet problems associated with gene annotation and assembly errors are usually ignored in comparative genomics. Here, we analyze the genomes of 68 species across 12 animal phyla and some single-cell eukaryotes for general trends in genome composition and transcription, taking into account problems of gene annotation. We show that, regardless of genome size, the ratio of introns to intergenic sequence is comparable across essentially all animals, with nearly all deviations dominated by increased intergenic sequence. Genomes of model organisms have ratios much closer to 1:1, suggesting that the majority of published genomes of nonmodel organisms are underannotated and consequently omit substantial numbers of genes, with likely negative impact on evolutionary interpretations. Finally, our results also indicate that most animals transcribe half or more of their genomes arguing against differences in genome usage between animal groups, and also suggesting that the transcribed portion is more dependent on genome size than previously thought. PMID:28633296

  6. Contrasting evolutionary patterns of multiple loci uncover new aspects in the genome origin and evolutionary history of Leymus (Triticeae; Poaceae).

    PubMed

    Sha, Li-Na; Fan, Xing; Li, Jun; Liao, Jin-Qiu; Zeng, Jian; Wang, Yi; Kang, Hou-Yang; Zhang, Hai-Qin; Zheng, You-Liang; Zhou, Yong-Hong

    2017-09-01

    Leymus Hochst. (Triticeae: Poaceae), a group of allopolyploid species with the NsXm genomes, is a perennial genus with diversity in morphology, cytology, ecology, and distribution in the Triticeae. To investigate the genome origin and evolutionary history of Leymus, three unlinked low-copy nuclear genes (Acc1, Pgk1, and GBSSI) and three chloroplast regions (trnL-F, matK, and rbcL) of 32 Leymus species were analyzed with those of 36 diploid species representing 18 basic genomes in the Triticeae. The phylogenetic relationships were reconstructed using Bayesian inference, Maximum parsimony, and NeighborNet methods. A time-calibrated phylogeny was generated to estimate the evolutionary history of Leymus. The results suggest that reticulate evolution has occurred in Leymus species, with several distinct progenitors contributing to the Leymus. The molecular data in resolution of the Xm-genome lineage resulted in two apparently contradictory results, with one placing the Xm-genome lineage as closely related to the P/F genome and the other splitting the Xm-genome lineage as sister to the Ns-genome donor. Our results suggested that (1) the Ns genome of Leymus was donated by Psathyrostachys, and additional Ns-containing alleles may be introgressed into some Leymus polyploids by recurrent hybridization; (2) The phylogenetic incongruence regarding the resolution of the Xm-genome lineage suggested that the Xm genome of Leymus was closely related to the P genome of Agropyron; (3) Both Ns- and Xm-genome lineages served as the maternal donor during the speciation of Leymus species; (4) The Pseudoroegneria, Lophopyrum and Australopyrum genomes contributed to some Leymus species. Copyright © 2017 Elsevier Inc. All rights reserved.

  7. Odonata (dragonflies and damselflies) as a bridge between ecology and evolutionary genomics.

    PubMed

    Bybee, Seth; Córdoba-Aguilar, Alex; Duryea, M Catherine; Futahashi, Ryo; Hansson, Bengt; Lorenzo-Carballa, M Olalla; Schilder, Ruud; Stoks, Robby; Suvorov, Anton; Svensson, Erik I; Swaegers, Janne; Takahashi, Yuma; Watts, Phillip C; Wellenreuther, Maren

    2016-01-01

    Odonata (dragonflies and damselflies) present an unparalleled insect model to integrate evolutionary genomics with ecology for the study of insect evolution. Key features of Odonata include their ancient phylogenetic position, extensive phenotypic and ecological diversity, several unique evolutionary innovations, ease of study in the wild and usefulness as bioindicators for freshwater ecosystems worldwide. In this review, we synthesize studies on the evolution, ecology and physiology of odonates, highlighting those areas where the integration of ecology with genomics would yield significant insights into the evolutionary processes that would not be gained easily by working on other animal groups. We argue that the unique features of this group combined with their complex life cycle, flight behaviour, diversity in ecological niches and their sensitivity to anthropogenic change make odonates a promising and fruitful taxon for genomics focused research. Future areas of research that deserve increased attention are also briefly outlined.

  8. Hunter-gatherer genomics: Evolutionary insights and ethical considerations

    PubMed Central

    Bankoff, Richard J.; Perry, George H.

    2016-01-01

    Hunting and gathering societies currently comprise only a small proportion of all human populations. However, the geographic and environmental diversity of modern hunter-gatherer groups, their inherent dependence on ecological resources, and their connection to patterns of behavior and subsistence that represent the vast majority of human history provide opportunities for scientific research to deliver major insights into the evolutionary history of our species. We review recent evolutionary genomic studies of hunter-gatherers, focusing especially on those that identify and functionally characterize phenotypic adaptations to local environments. We also call attention to specific ethical issues that scientists conducting hunter-gatherer genomics research ought to consider, including potential social and economic tensions between traditionally mobile hunter-gatherers and the land ownership-based nation-states by which they are governed, and the implications of genomic-based evidence of long-term evolutionary associations with particular habitats. PMID:27400119

  9. Reconstruction of the evolution of microbial defense systems.

    PubMed

    Puigbò, Pere; Makarova, Kira S; Kristensen, David M; Wolf, Yuri I; Koonin, Eugene V

    2017-04-04

    Evolution of bacterial and archaeal genomes is a highly dynamic process that involves intensive loss of genes as well as gene gain via horizontal transfer, with a lesser contribution from gene duplication. The rates of these processes can be estimated by comparing genomes that are linked by an evolutionary tree. These estimated rates of genome dynamics events substantially differ for different functional classes of genes. The genes involved in defense against viruses and other invading DNA are among those that are gained and lost at the highest rates. We employed a stochastic birth-and-death model to obtain maximum likelihood estimates of the rates of gain and loss of defense genes in 35 groups of closely related bacterial genomes and one group of archaeal genomes. We find that on average, the defense genes experience 1.4 fold higher flux than the rest of microbial genes. This excessive flux of defense genes over the genomic mean is consistent across diverse microbial groups. The few exceptions include intracellular parasites with small, degraded genomes that possess few defense systems which are more stable than in other microbes. Generally, defense genes follow the previously established pattern of genome dynamics, with gene family loss being about 3 times more common than gain and an order of magnitude more common than expansion or contraction of gene families. Case by case analysis of the evolutionary dynamics of defense genes indicates frequent multiple events in the same locus and widespread involvement of mobile elements in the gain and loss of defense genes. Evolution of microbial defense systems is highly dynamic but, notwithstanding the host-parasite arms race, generally follows the same trends that have been established for the rest of the genes. Apart from the paucity and the low flux of defense genes in parasitic bacteria with deteriorating genomes, there is no clear connection between the evolutionary regime of defense systems and microbial life style.

  10. Recombinant transfer in the basic genome of E. coli

    DOE PAGES

    Dixit, Purushottam; Studier, F. William; Pang, Tin Yau; ...

    2015-07-07

    An approximation to the ~4-Mbp basic genome shared by 32 strains of E. coli representing six evolutionary groups has been derived and analyzed computationally. A multiple-alignment of the 32 complete genome sequences was filtered to remove mobile elements and identify the most reliable ~90% of the aligned length of each of the resulting 496 basic-genome pairs. Patterns of single bp mutations (SNPs) in aligned pairs distinguish clonally inherited regions from regions where either genome has acquired DNA fragments from diverged genomes by homologous recombination since their last common ancestor. Such recombinant transfer is pervasive across the basic genome, mostly betweenmore » genomes in the same evolutionary group, and generates many unique mosaic patterns. The six least-diverged genome-pairs have one or two recombinant transfers of length ~40–115 kbp (and few if any other transfers), each containing one or more gene clusters known to confer strong selective advantage in some environments. Moderately diverged genome pairs (0.4–1% SNPs) show mosaic patterns of interspersed clonal and recombinant regions of varying lengths throughout the basic genome, whereas more highly diverged pairs within an evolutionary group or pairs between evolutionary groups having >1.3% SNPs have few clonal matches longer than a few kbp. Many recombinant transfers appear to incorporate fragments of the entering DNA produced by restriction systems of the recipient cell. A simple computational model can closely fit the data. As a result, most recombinant transfers seem likely to be due to generalized transduction by co-evolving populations of phages, which could efficiently distribute variability throughout bacterial genomes.« less

  11. Recombinant transfer in the basic genome of E. coli

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dixit, Purushottam; Studier, F. William; Pang, Tin Yau

    An approximation to the ~4-Mbp basic genome shared by 32 strains of E. coli representing six evolutionary groups has been derived and analyzed computationally. A multiple-alignment of the 32 complete genome sequences was filtered to remove mobile elements and identify the most reliable ~90% of the aligned length of each of the resulting 496 basic-genome pairs. Patterns of single bp mutations (SNPs) in aligned pairs distinguish clonally inherited regions from regions where either genome has acquired DNA fragments from diverged genomes by homologous recombination since their last common ancestor. Such recombinant transfer is pervasive across the basic genome, mostly betweenmore » genomes in the same evolutionary group, and generates many unique mosaic patterns. The six least-diverged genome-pairs have one or two recombinant transfers of length ~40–115 kbp (and few if any other transfers), each containing one or more gene clusters known to confer strong selective advantage in some environments. Moderately diverged genome pairs (0.4–1% SNPs) show mosaic patterns of interspersed clonal and recombinant regions of varying lengths throughout the basic genome, whereas more highly diverged pairs within an evolutionary group or pairs between evolutionary groups having >1.3% SNPs have few clonal matches longer than a few kbp. Many recombinant transfers appear to incorporate fragments of the entering DNA produced by restriction systems of the recipient cell. A simple computational model can closely fit the data. As a result, most recombinant transfers seem likely to be due to generalized transduction by co-evolving populations of phages, which could efficiently distribute variability throughout bacterial genomes.« less

  12. Comparative Genomics of Erwinia amylovora and Related Erwinia Species—What do We Learn?

    PubMed Central

    Zhao, Youfu; Qi, Mingsheng

    2011-01-01

    Erwinia amylovora, the causal agent of fire blight disease of apples and pears, is one of the most important plant bacterial pathogens with worldwide economic significance. Recent reports on the complete or draft genome sequences of four species in the genus Erwinia, including E. amylovora, E. pyrifoliae, E. tasmaniensis, and E. billingiae, have provided us near complete genetic information about this pathogen and its closely-related species. This review describes in silico subtractive hybridization-based comparative genomic analyses of eight genomes currently available, and highlights what we have learned from these comparative analyses, as well as genetic and functional genomic studies. Sequence analyses reinforce the assumption that E. amylovora is a relatively homogeneous species and support the current classification scheme of E. amylovora and its related species. The potential evolutionary origin of these Erwinia species is also proposed. The current understanding of the pathogen, its virulence mechanism and host specificity from genome sequencing data is summarized. Future research directions are also suggested. PMID:24710213

  13. Dcode.org anthology of comparative genomic tools.

    PubMed

    Loots, Gabriela G; Ovcharenko, Ivan

    2005-07-01

    Comparative genomics provides the means to demarcate functional regions in anonymous DNA sequences. The successful application of this method to identifying novel genes is currently shifting to deciphering the non-coding encryption of gene regulation across genomes. To facilitate the practical application of comparative sequence analysis to genetics and genomics, we have developed several analytical and visualization tools for the analysis of arbitrary sequences and whole genomes. These tools include two alignment tools, zPicture and Mulan; a phylogenetic shadowing tool, eShadow for identifying lineage- and species-specific functional elements; two evolutionary conserved transcription factor analysis tools, rVista and multiTF; a tool for extracting cis-regulatory modules governing the expression of co-regulated genes, Creme 2.0; and a dynamic portal to multiple vertebrate and invertebrate genome alignments, the ECR Browser. Here, we briefly describe each one of these tools and provide specific examples on their practical applications. All the tools are publicly available at the http://www.dcode.org/ website.

  14. Genome-Wide Motif Statistics are Shaped by DNA Binding Proteins over Evolutionary Time Scales

    NASA Astrophysics Data System (ADS)

    Qian, Long; Kussell, Edo

    2016-10-01

    The composition of a genome with respect to all possible short DNA motifs impacts the ability of DNA binding proteins to locate and bind their target sites. Since nonfunctional DNA binding can be detrimental to cellular functions and ultimately to organismal fitness, organisms could benefit from reducing the number of nonfunctional DNA binding sites genome wide. Using in vitro measurements of binding affinities for a large collection of DNA binding proteins, in multiple species, we detect a significant global avoidance of weak binding sites in genomes. We demonstrate that the underlying evolutionary process leaves a distinct genomic hallmark in that similar words have correlated frequencies, a signal that we detect in all species across domains of life. We consider the possibility that natural selection against weak binding sites contributes to this process, and using an evolutionary model we show that the strength of selection needed to maintain global word compositions is on the order of point mutation rates. Likewise, we show that evolutionary mechanisms based on interference of protein-DNA binding with replication and mutational repair processes could yield similar results and operate with similar rates. On the basis of these modeling and bioinformatic results, we conclude that genome-wide word compositions have been molded by DNA binding proteins acting through tiny evolutionary steps over time scales spanning millions of generations.

  15. Ancient bacterial endosymbionts of insects: Genomes as sources of insight and springboards for inquiry.

    PubMed

    Wernegreen, Jennifer J

    2017-09-15

    Ancient associations between insects and bacteria provide models to study intimate host-microbe interactions. Currently, a wealth of genome sequence data for long-term, obligately intracellular (primary) endosymbionts of insects reveals profound genomic consequences of this specialized bacterial lifestyle. Those consequences include severe genome reduction and extreme base compositions. This minireview highlights the utility of genome sequence data to understand how, and why, endosymbionts have been pushed to such extremes, and to illuminate the functional consequences of such extensive genome change. While the static snapshots provided by individual endosymbiont genomes are valuable, comparative analyses of multiple genomes have shed light on evolutionary mechanisms. Namely, genome comparisons have told us that selection is important in fine-tuning gene content, but at the same time, mutational pressure and genetic drift contribute to genome degradation. Examples from Blochmannia, the primary endosymbiont of the ant tribe Camponotini, illustrate the value and constraints of genome sequence data, and exemplify how genomes can serve as a springboard for further comparative and experimental inquiry. Copyright © 2017. Published by Elsevier Inc.

  16. How much does the amphioxus genome represent the ancestor of chordates?

    PubMed

    Louis, Alexandra; Roest Crollius, Hugues; Robinson-Rechavi, Marc

    2012-03-01

    One of the main motivations to study amphioxus is its potential for understanding the last common ancestor of chordates, which notably gave rise to the vertebrates. An important feature in this respect is the slow evolutionary rate that seems to have characterized the cephalochordate lineage, making amphioxus an interesting proxy for the chordate ancestor, as well as a key lineage to include in comparative studies. Whereas slow evolution was first noticed at the phenotypic level, it has also been described at the genomic level. Here, we examine whether the amphioxus genome is indeed a good proxy for the genome of the chordate ancestor, with a focus on protein-coding genes. We investigate genome features, such as synteny, gene duplication and gene loss, and contrast the amphioxus genome with those of other deuterostomes that are used in comparative studies, such as Ciona, Oikopleura and urchin.

  17. The seahorse genome and the evolution of its specialized morphology.

    PubMed

    Lin, Qiang; Fan, Shaohua; Zhang, Yanhong; Xu, Meng; Zhang, Huixian; Yang, Yulan; Lee, Alison P; Woltering, Joost M; Ravi, Vydianathan; Gunter, Helen M; Luo, Wei; Gao, Zexia; Lim, Zhi Wei; Qin, Geng; Schneider, Ralf F; Wang, Xin; Xiong, Peiwen; Li, Gang; Wang, Kai; Min, Jiumeng; Zhang, Chi; Qiu, Ying; Bai, Jie; He, Weiming; Bian, Chao; Zhang, Xinhui; Shan, Dai; Qu, Hongyue; Sun, Ying; Gao, Qiang; Huang, Liangmin; Shi, Qiong; Meyer, Axel; Venkatesh, Byrappa

    2016-12-14

    Seahorses have a specialized morphology that includes a toothless tubular mouth, a body covered with bony plates, a male brood pouch, and the absence of caudal and pelvic fins. Here we report the sequencing and de novo assembly of the genome of the tiger tail seahorse, Hippocampus comes. Comparative genomic analysis identifies higher protein and nucleotide evolutionary rates in H. comes compared with other teleost fish genomes. We identified an astacin metalloprotease gene family that has undergone expansion and is highly expressed in the male brood pouch. We also find that the H. comes genome lacks enamel matrix protein-coding proline/glutamine-rich secretory calcium-binding phosphoprotein genes, which might have led to the loss of mineralized teeth. tbx4, a regulator of hindlimb development, is also not found in H. comes genome. Knockout of tbx4 in zebrafish showed a 'pelvic fin-loss' phenotype similar to that of seahorses.

  18. A Time-Calibrated Road Map of Brassicaceae Species Radiation and Evolutionary History[OPEN

    PubMed Central

    Hohmann, Nora; Wolf, Eva M.

    2015-01-01

    The Brassicaceae include several major crop plants and numerous important model species in comparative evolutionary research such as Arabidopsis, Brassica, Boechera, Thellungiella, and Arabis species. As any evolutionary hypothesis needs to be placed in a temporal context, reliably dated major splits within the evolution of Brassicaceae are essential. We present a comprehensive time-calibrated framework with important divergence time estimates based on whole-chloroplast sequence data for 29 Brassicaceae species. Diversification of the Brassicaceae crown group started at the Eocene-to-Oligocene transition. Subsequent major evolutionary splits are dated to ∼20 million years ago, coinciding with the Oligocene-to-Miocene transition, with increasing drought and aridity and transient glaciation events. The age of the Arabidopsis thaliana crown group is 6 million years ago, at the Miocene and Pliocene border. The overall species richness of the family is well explained by high levels of neopolyploidy (43% in total), but this trend is neither directly associated with an increase in genome size nor is there a general lineage-specific constraint. Our results highlight polyploidization as an important source for generating new evolutionary lineages adapted to changing environments. We conclude that species radiation, paralleled by high levels of neopolyploidization, follows genome size decrease, stabilization, and genetic diploidization. PMID:26410304

  19. The Ditylenchus destructor genome provides new insights into the evolution of plant parasitic nematodes.

    PubMed

    Zheng, Jinshui; Peng, Donghai; Chen, Ling; Liu, Hualin; Chen, Feng; Xu, Mengci; Ju, Shouyong; Ruan, Lifang; Sun, Ming

    2016-07-27

    Plant-parasitic nematodes were found in 4 of the 12 clades of phylum Nematoda. These nematodes in different clades may have originated independently from their free-living fungivorous ancestors. However, the exact evolutionary process of these parasites is unclear. Here, we sequenced the genome sequence of a migratory plant nematode, Ditylenchus destructor We performed comparative genomics among the free-living nematode, Caenorhabditis elegans and all the plant nematodes with genome sequences available. We found that, compared with C. elegans, the core developmental control processes underwent heavy reduction, though most signal transduction pathways were conserved. We also found D. destructor contained more homologies of the key genes in the above processes than the other plant nematodes. We suggest that Ditylenchus spp. may be an intermediate evolutionary history stage from free-living nematodes that feed on fungi to obligate plant-parasitic nematodes. Based on the facts that D. destructor can feed on fungi and has a relatively short life cycle, and that it has similar features to both C. elegans and sedentary plant-parasitic nematodes from clade 12, we propose it as a new model to study the biology, biocontrol of plant nematodes and the interaction between nematodes and plants. © 2016 The Author(s).

  20. Hidden genetic variation in the germline genome of Tetrahymena thermophila.

    PubMed

    Dimond, K L; Zufall, R A

    2016-06-01

    Genome architecture varies greatly among eukaryotes. This diversity may profoundly affect the origin and maintenance of genetic variation within a population. Ciliates are microbial eukaryotes with unusual genome features, such as the separation of germline and somatic genomes within a single cell and amitotic division. These features have previously been proposed to increase the rate of molecular evolution in these species. Here, we assessed the fitness effects of genetic variation in the two genomes of natural isolates of the ciliate Tetrahymena thermophila. We find more extensive genetic variation in fitness in the transcriptionally silent germline genome than in the expressed somatic genome. Surprisingly, this variation is not primarily deleterious, but has both beneficial and deleterious effects. We conclude that Tetrahymena genome architecture allows for the maintenance of genetic variation that would otherwise be eliminated by selection. We consider the effect of selection on the two genomes and the impacts of reproductive strategies and the mechanism of sex determination on the structure of this variation. © 2016 European Society For Evolutionary Biology. Journal of Evolutionary Biology © 2016 European Society For Evolutionary Biology.

  1. Genome of the Actinomycete Plant Pathogen Clavibacter michiganensis subsp. sepedonicus Suggests Recent Niche Adaptation▿ †

    PubMed Central

    Bentley, Stephen D.; Corton, Craig; Brown, Susan E.; Barron, Andrew; Clark, Louise; Doggett, Jon; Harris, Barbara; Ormond, Doug; Quail, Michael A.; May, Georgiana; Francis, David; Knudson, Dennis; Parkhill, Julian; Ishimaru, Carol A.

    2008-01-01

    Clavibacter michiganensis subsp. sepedonicus is a plant-pathogenic bacterium and the causative agent of bacterial ring rot, a devastating agricultural disease under strict quarantine control and zero tolerance in the seed potato industry. This organism appears to be largely restricted to an endophytic lifestyle, proliferating within plant tissues and unable to persist in the absence of plant material. Analysis of the genome sequence of C. michiganensis subsp. sepedonicus and comparison with the genome sequences of related plant pathogens revealed a dramatic recent evolutionary history. The genome contains 106 insertion sequence elements, which appear to have been active in extensive rearrangement of the chromosome compared to that of Clavibacter michiganensis subsp. michiganensis. There are 110 pseudogenes with overrepresentation in functions associated with carbohydrate metabolism, transcriptional regulation, and pathogenicity. Genome comparisons also indicated that there is substantial gene content diversity within the species, probably due to differential gene acquisition and loss. These genomic features and evolutionary dating suggest that there was recent adaptation for life in a restricted niche where nutrient diversity and perhaps competition are low, correlated with a reduced ability to exploit previously occupied complex niches outside the plant. Toleration of factors such as multiplication and integration of insertion sequence elements, genome rearrangements, and functional disruption of many genes and operons seems to indicate that there has been general relaxation of selective pressure on a large proportion of the genome. PMID:18192393

  2. Whole genome analysis reveals the diversity and evolutionary relationships between necrotic enteritis-causing strains of Clostridium perfringens.

    PubMed

    Lacey, Jake A; Allnutt, Theodore R; Vezina, Ben; Van, Thi Thu Hao; Stent, Thomas; Han, Xiaoyan; Rood, Julian I; Wade, Ben; Keyburn, Anthony L; Seemann, Torsten; Chen, Honglei; Haring, Volker; Johanesen, Priscilla A; Lyras, Dena; Moore, Robert J

    2018-05-22

    Clostridium perfringens causes a range of diseases in animals and humans including necrotic enteritis in chickens and food poisoning and gas gangrene in humans. Necrotic enteritis is of concern in commercial chicken production due to the cost of the implementation of infection control measures and to productivity losses. This study has focused on the genomic analysis of a range of chicken-derived C. perfringens isolates, from around the world and from different years. The genomes were sequenced and compared with 20 genomes available from public databases, which were from a diverse collection of isolates from chickens, other animals, and humans. We used a distance based phylogeny that was constructed based on gene content rather than sequence identity. Similarity between strains was defined as the number of genes that they have in common divided by their total number of genes. In this type of phylogenetic analysis, evolutionary distance can be interpreted in terms of evolutionary events such as acquisition and loss of genes, whereas the underlying properties (the gene content) can be interpreted in terms of function. We also compared these methods to the sequence-based phylogeny of the core genome. Distinct pathogenic clades of necrotic enteritis-causing C. perfringens were identified. They were characterised by variable regions encoded on the chromosome, with predicted roles in capsule production, adhesion, inhibition of related strains, phage integration, and metabolism. Some strains have almost identical genomes, even though they were isolated from different geographic regions at various times, while other highly distant genomes appear to result in similar outcomes with regard to virulence and pathogenesis. The high level of diversity in chicken isolates suggests there is no reliable factor that defines a chicken strain of C. perfringens, however, disease-causing strains can be defined by the presence of netB-encoding plasmids. This study reveals that horizontal gene transfer appears to play a significant role in genetic variation of the C. perfringens chromosome as well as the plasmid content within strains.

  3. The Schistosoma mansoni phylome: using evolutionary genomics to gain insight into a parasite's biology.

    PubMed

    Silva, Larissa Lopes; Marcet-Houben, Marina; Nahum, Laila Alves; Zerlotini, Adhemar; Gabaldón, Toni; Oliveira, Guilherme

    2012-11-13

    Schistosoma mansoni is one of the causative agents of schistosomiasis, a neglected tropical disease that affects about 237 million people worldwide. Despite recent efforts, we still lack a general understanding of the relevant host-parasite interactions, and the possible treatments are limited by the emergence of resistant strains and the absence of a vaccine. The S. mansoni genome was completely sequenced and still under continuous annotation. Nevertheless, more than 45% of the encoded proteins remain without experimental characterization or even functional prediction. To improve our knowledge regarding the biology of this parasite, we conducted a proteome-wide evolutionary analysis to provide a broad view of the S. mansoni's proteome evolution and to improve its functional annotation. Using a phylogenomic approach, we reconstructed the S. mansoni phylome, which comprises the evolutionary histories of all parasite proteins and their homologs across 12 other organisms. The analysis of a total of 7,964 phylogenies allowed a deeper understanding of genomic complexity and evolutionary adaptations to a parasitic lifestyle. In particular, the identification of lineage-specific gene duplications pointed to the diversification of several protein families that are relevant for host-parasite interaction, including proteases, tetraspanins, fucosyltransferases, venom allergen-like proteins, and tegumental-allergen-like proteins. In addition to the evolutionary knowledge, the phylome data enabled us to automatically re-annotate 3,451 proteins through a phylogenetic-based approach rather than solely sequence similarity searches. To allow further exploitation of this valuable data, all information has been made available at PhylomeDB (http://www.phylomedb.org). In this study, we used an evolutionary approach to assess S. mansoni parasite biology, improve genome/proteome functional annotation, and provide insights into host-parasite interactions. Taking advantage of a proteome-wide perspective rather than focusing on individual proteins, we identified that this parasite has experienced specific gene duplication events, particularly affecting genes that are potentially related to the parasitic lifestyle. These innovations may be related to the mechanisms that protect S. mansoni against host immune responses being important adaptations for the parasite survival in a potentially hostile environment. Continuing this work, a comparative analysis involving genomic, transcriptomic, and proteomic data from other helminth parasites, other parasites, and vectors will supply more information regarding parasite's biology as well as host-parasite interactions.

  4. The locus of sexual selection: moving sexual selection studies into the post-genomics era.

    PubMed

    Wilkinson, G S; Breden, F; Mank, J E; Ritchie, M G; Higginson, A D; Radwan, J; Jaquiery, J; Salzburger, W; Arriero, E; Barribeau, S M; Phillips, P C; Renn, S C P; Rowe, L

    2015-04-01

    Sexual selection drives fundamental evolutionary processes such as trait elaboration and speciation. Despite this importance, there are surprisingly few examples of genes unequivocally responsible for variation in sexually selected phenotypes. This lack of information inhibits our ability to predict phenotypic change due to universal behaviours, such as fighting over mates and mate choice. Here, we discuss reasons for this apparent gap and provide recommendations for how it can be overcome by adopting contemporary genomic methods, exploiting underutilized taxa that may be ideal for detecting the effects of sexual selection and adopting appropriate experimental paradigms. Identifying genes that determine variation in sexually selected traits has the potential to improve theoretical models and reveal whether the genetic changes underlying phenotypic novelty utilize common or unique molecular mechanisms. Such a genomic approach to sexual selection will help answer questions in the evolution of sexually selected phenotypes that were first asked by Darwin and can furthermore serve as a model for the application of genomics in all areas of evolutionary biology. © 2015 European Society For Evolutionary Biology. Journal of Evolutionary Biology © 2015 European Society For Evolutionary Biology.

  5. Second generation DNA sequencing of the mitogenome of the Chinstrap penguin and comparative genomics of Antarctic penguins.

    PubMed

    Subramanian, Sankar; Lingala, Syamala Gowri; Swaminathan, Siva; Huynen, Leon; Lambert, David

    2014-08-01

    The complete mitochondrial genome of the Chinstrap penguin (Pygoscelis antarcticus) was sequenced and compared with other penguin mitogenomes. The genome is 15,972 bp in length with the number and order of protein coding genes and RNAs being very similar to that of other known penguin mitogenomes. Comparative nucleotide analysis showed the Chinstrap mitogenome shares 94% homology with the mitogenome of its sister species, Pygoscelis adelie (Adélie penguin). Divergence at nonsynonymous nucleotide positions was found to be up to 23 times less than that observed in synonymous positions of protein coding genes, suggesting high selection constraints. The complete mitogenome data will be useful for genetic and evolutionary studies of penguins.

  6. Whole genome comparative studies between chicken and turkey and their implications for avian genome evolution

    PubMed Central

    Griffin, Darren K; Robertson, Lindsay B; Tempest, Helen G; Vignal, Alain; Fillon, Valérie; Crooijmans, Richard PMA; Groenen, Martien AM; Deryusheva, Svetlana; Gaginskaya, Elena; Carré, Wilfrid; Waddington, David; Talbot, Richard; Völker, Martin; Masabanda, Julio S; Burt, Dave W

    2008-01-01

    Background Comparative genomics is a powerful means of establishing inter-specific relationships between gene function/location and allows insight into genomic rearrangements, conservation and evolutionary phylogeny. The availability of the complete sequence of the chicken genome has initiated the development of detailed genomic information in other birds including turkey, an agriculturally important species where mapping has hitherto focused on linkage with limited physical information. No molecular study has yet examined conservation of avian microchromosomes, nor differences in copy number variants (CNVs) between birds. Results We present a detailed comparative cytogenetic map between chicken and turkey based on reciprocal chromosome painting and mapping of 338 chicken BACs to turkey metaphases. Two inter-chromosomal changes (both involving centromeres) and three pericentric inversions have been identified between chicken and turkey; and array CGH identified 16 inter-specific CNVs. Conclusion This is the first study to combine the modalities of zoo-FISH and array CGH between different avian species. The first insight into the conservation of microchromosomes, the first comparative cytogenetic map of any bird and the first appraisal of CNVs between birds is provided. Results suggest that avian genomes have remained relatively stable during evolution compared to mammalian equivalents. PMID:18410676

  7. The genome diversity and karyotype evolution of mammals

    PubMed Central

    2011-01-01

    The past decade has witnessed an explosion of genome sequencing and mapping in evolutionary diverse species. While full genome sequencing of mammals is rapidly progressing, the ability to assemble and align orthologous whole chromosome regions from more than a few species is still not possible. The intense focus on building of comparative maps for companion (dog and cat), laboratory (mice and rat) and agricultural (cattle, pig, and horse) animals has traditionally been used as a means to understand the underlying basis of disease-related or economically important phenotypes. However, these maps also provide an unprecedented opportunity to use multispecies analysis as a tool for inferring karyotype evolution. Comparative chromosome painting and related techniques are now considered to be the most powerful approaches in comparative genome studies. Homologies can be identified with high accuracy using molecularly defined DNA probes for fluorescence in situ hybridization (FISH) on chromosomes of different species. Chromosome painting data are now available for members of nearly all mammalian orders. In most orders, there are species with rates of chromosome evolution that can be considered as 'default' rates. The number of rearrangements that have become fixed in evolutionary history seems comparatively low, bearing in mind the 180 million years of the mammalian radiation. Comparative chromosome maps record the history of karyotype changes that have occurred during evolution. The aim of this review is to provide an overview of these recent advances in our endeavor to decipher the karyotype evolution of mammals by integrating the published results together with some of our latest unpublished results. PMID:21992653

  8. The Genome Sequence of the North-European Cucumber (Cucumis sativus L.) Unravels Evolutionary Adaptation Mechanisms in Plants

    PubMed Central

    Wóycicki, Rafał; Witkowicz, Justyna; Gawroński, Piotr; Dąbrowska, Joanna; Lomsadze, Alexandre; Pawełkowicz, Magdalena; Siedlecka, Ewa; Yagi, Kohei; Pląder, Wojciech; Seroczyńska, Anna; Śmiech, Mieczysław; Gutman, Wojciech; Niemirowicz-Szczytt, Katarzyna; Bartoszewski, Grzegorz; Tagashira, Norikazu; Hoshi, Yoshikazu; Borodovsky, Mark; Karpiński, Stanisław; Malepszy, Stefan; Przybecki, Zbigniew

    2011-01-01

    Cucumber (Cucumis sativus L.), a widely cultivated crop, has originated from Eastern Himalayas and secondary domestication regions includes highly divergent climate conditions e.g. temperate and subtropical. We wanted to uncover adaptive genome differences between the cucumber cultivars and what sort of evolutionary molecular mechanisms regulate genetic adaptation of plants to different ecosystems and organism biodiversity. Here we present the draft genome sequence of the Cucumis sativus genome of the North-European Borszczagowski cultivar (line B10) and comparative genomics studies with the known genomes of: C. sativus (Chinese cultivar – Chinese Long (line 9930)), Arabidopsis thaliana, Populus trichocarpa and Oryza sativa. Cucumber genomes show extensive chromosomal rearrangements, distinct differences in quantity of the particular genes (e.g. involved in photosynthesis, respiration, sugar metabolism, chlorophyll degradation, regulation of gene expression, photooxidative stress tolerance, higher non-optimal temperatures tolerance and ammonium ion assimilation) as well as in distributions of abscisic acid-, dehydration- and ethylene-responsive cis-regulatory elements (CREs) in promoters of orthologous group of genes, which lead to the specific adaptation features. Abscisic acid treatment of non-acclimated Arabidopsis and C. sativus seedlings induced moderate freezing tolerance in Arabidopsis but not in C. sativus. This experiment together with analysis of abscisic acid-specific CRE distributions give a clue why C. sativus is much more susceptible to moderate freezing stresses than A. thaliana. Comparative analysis of all the five genomes showed that, each species and/or cultivars has a specific profile of CRE content in promoters of orthologous genes. Our results constitute the substantial and original resource for the basic and applied research on environmental adaptations of plants, which could facilitate creation of new crops with improved growth and yield in divergent conditions. PMID:21829493

  9. The genome sequence of the North-European cucumber (Cucumis sativus L.) unravels evolutionary adaptation mechanisms in plants.

    PubMed

    Wóycicki, Rafał; Witkowicz, Justyna; Gawroński, Piotr; Dąbrowska, Joanna; Lomsadze, Alexandre; Pawełkowicz, Magdalena; Siedlecka, Ewa; Yagi, Kohei; Pląder, Wojciech; Seroczyńska, Anna; Śmiech, Mieczysław; Gutman, Wojciech; Niemirowicz-Szczytt, Katarzyna; Bartoszewski, Grzegorz; Tagashira, Norikazu; Hoshi, Yoshikazu; Borodovsky, Mark; Karpiński, Stanisław; Malepszy, Stefan; Przybecki, Zbigniew

    2011-01-01

    Cucumber (Cucumis sativus L.), a widely cultivated crop, has originated from Eastern Himalayas and secondary domestication regions includes highly divergent climate conditions e.g. temperate and subtropical. We wanted to uncover adaptive genome differences between the cucumber cultivars and what sort of evolutionary molecular mechanisms regulate genetic adaptation of plants to different ecosystems and organism biodiversity. Here we present the draft genome sequence of the Cucumis sativus genome of the North-European Borszczagowski cultivar (line B10) and comparative genomics studies with the known genomes of: C. sativus (Chinese cultivar--Chinese Long (line 9930)), Arabidopsis thaliana, Populus trichocarpa and Oryza sativa. Cucumber genomes show extensive chromosomal rearrangements, distinct differences in quantity of the particular genes (e.g. involved in photosynthesis, respiration, sugar metabolism, chlorophyll degradation, regulation of gene expression, photooxidative stress tolerance, higher non-optimal temperatures tolerance and ammonium ion assimilation) as well as in distributions of abscisic acid-, dehydration- and ethylene-responsive cis-regulatory elements (CREs) in promoters of orthologous group of genes, which lead to the specific adaptation features. Abscisic acid treatment of non-acclimated Arabidopsis and C. sativus seedlings induced moderate freezing tolerance in Arabidopsis but not in C. sativus. This experiment together with analysis of abscisic acid-specific CRE distributions give a clue why C. sativus is much more susceptible to moderate freezing stresses than A. thaliana. Comparative analysis of all the five genomes showed that, each species and/or cultivars has a specific profile of CRE content in promoters of orthologous genes. Our results constitute the substantial and original resource for the basic and applied research on environmental adaptations of plants, which could facilitate creation of new crops with improved growth and yield in divergent conditions.

  10. Light-harvesting antenna complexes in the moss Physcomitrella patens: implications for the evolutionary transition from green algae to land plants.

    PubMed

    Iwai, Masakazu; Yokono, Makio

    2017-06-01

    Plants have successfully adapted to a vast range of terrestrial environments during their evolution. To elucidate the evolutionary transition of light-harvesting antenna proteins from green algae to land plants, the moss Physcomitrella patens is ideally placed basally among land plants. Compared to the genomes of green algae and land plants, the P. patens genome codes for more diverse and redundant light-harvesting antenna proteins. It also encodes Lhcb9, which has characteristics not found in other light-harvesting antenna proteins. The unique complement of light-harvesting antenna proteins in P. patens appears to facilitate protein interactions that include those lost in both green algae and land plants with regard to stromal electron transport pathways and photoprotection mechanisms. This review will highlight unique characteristics of the P. patens light-harvesting antenna system and the resulting implications about the evolutionary transition during plant terrestrialization. Copyright © 2017 Elsevier Ltd. All rights reserved.

  11. Comparative genomics explains the evolutionary success of reef-forming corals

    PubMed Central

    Bhattacharya, Debashish; Agrawal, Shobhit; Aranda, Manuel; Baumgarten, Sebastian; Belcaid, Mahdi; Drake, Jeana L; Erwin, Douglas; Foret, Sylvian; Gates, Ruth D; Gruber, David F; Kamel, Bishoy; Lesser, Michael P; Levy, Oren; Liew, Yi Jin; MacManes, Matthew; Mass, Tali; Medina, Monica; Mehr, Shaadi; Meyer, Eli; Price, Dana C; Putnam, Hollie M; Qiu, Huan; Shinzato, Chuya; Shoguchi, Eiichi; Stokes, Alexander J; Tambutté, Sylvie; Tchernov, Dan; Voolstra, Christian R; Wagner, Nicole; Walker, Charles W; Weber, Andreas PM; Weis, Virginia; Zelzion, Ehud; Zoccola, Didier; Falkowski, Paul G

    2016-01-01

    Transcriptome and genome data from twenty stony coral species and a selection of reference bilaterians were studied to elucidate coral evolutionary history. We identified genes that encode the proteins responsible for the precipitation and aggregation of the aragonite skeleton on which the organisms live, and revealed a network of environmental sensors that coordinate responses of the host animals to temperature, light, and pH. Furthermore, we describe a variety of stress-related pathways, including apoptotic pathways that allow the host animals to detoxify reactive oxygen and nitrogen species that are generated by their intracellular photosynthetic symbionts, and determine the fate of corals under environmental stress. Some of these genes arose through horizontal gene transfer and comprise at least 0.2% of the animal gene inventory. Our analysis elucidates the evolutionary strategies that have allowed symbiotic corals to adapt and thrive for hundreds of millions of years. DOI: http://dx.doi.org/10.7554/eLife.13288.001 PMID:27218454

  12. Phylogenomic evolutionary surveys of subtilase superfamily genes in fungi.

    PubMed

    Li, Juan; Gu, Fei; Wu, Runian; Yang, JinKui; Zhang, Ke-Qin

    2017-03-30

    Subtilases belong to a superfamily of serine proteases which are ubiquitous in fungi and are suspected to have developed distinct functional properties to help fungi adapt to different ecological niches. In this study, we conducted a large-scale phylogenomic survey of subtilase protease genes in 83 whole genome sequenced fungal species in order to identify the evolutionary patterns and subsequent functional divergences of different subtilase families among the main lineages of the fungal kingdom. Our comparative genomic analyses of the subtilase superfamily indicated that extensive gene duplications, losses and functional diversifications have occurred in fungi, and that the four families of subtilase enzymes in fungi, including proteinase K-like, Pyrolisin, kexin and S53, have distinct evolutionary histories which may have facilitated the adaptation of fungi to a broad array of life strategies. Our study provides new insights into the evolution of the subtilase superfamily in fungi and expands our understanding of the evolution of fungi with different lifestyles.

  13. Two Rounds of Whole Genome Duplication in the Ancestral Vertebrate

    PubMed Central

    Dehal, Paramvir; Boore, Jeffrey L

    2005-01-01

    The hypothesis that the relatively large and complex vertebrate genome was created by two ancient, whole genome duplications has been hotly debated, but remains unresolved. We reconstructed the evolutionary relationships of all gene families from the complete gene sets of a tunicate, fish, mouse, and human, and then determined when each gene duplicated relative to the evolutionary tree of the organisms. We confirmed the results of earlier studies that there remains little signal of these events in numbers of duplicated genes, gene tree topology, or the number of genes per multigene family. However, when we plotted the genomic map positions of only the subset of paralogous genes that were duplicated prior to the fish–tetrapod split, their global physical organization provides unmistakable evidence of two distinct genome duplication events early in vertebrate evolution indicated by clear patterns of four-way paralogous regions covering a large part of the human genome. Our results highlight the potential for these large-scale genomic events to have driven the evolutionary success of the vertebrate lineage. PMID:16128622

  14. Joint scaling laws in functional and evolutionary categories in prokaryotic genomes

    PubMed Central

    Grilli, J.; Bassetti, B.; Maslov, S.; Cosentino Lagomarsino, M.

    2012-01-01

    We propose and study a class-expansion/innovation/loss model of genome evolution taking into account biological roles of genes and their constituent domains. In our model, numbers of genes in different functional categories are coupled to each other. For example, an increase in the number of metabolic enzymes in a genome is usually accompanied by addition of new transcription factors regulating these enzymes. Such coupling can be thought of as a proportional ‘recipe’ for genome composition of the type ‘a spoonful of sugar for each egg yolk’. The model jointly reproduces two known empirical laws: the distribution of family sizes and the non-linear scaling of the number of genes in certain functional categories (e.g. transcription factors) with genome size. In addition, it allows us to derive a novel relation between the exponents characterizing these two scaling laws, establishing a direct quantitative connection between evolutionary and functional categories. It predicts that functional categories that grow faster-than-linearly with genome size to be characterized by flatter-than-average family size distributions. This relation is confirmed by our bioinformatics analysis of prokaryotic genomes. This proves that the joint quantitative trends of functional and evolutionary classes can be understood in terms of evolutionary growth with proportional recipes. PMID:21937509

  15. Genome-Wide Motif Statistics are Shaped by DNA Binding Proteins over Evolutionary Time Scales

    NASA Astrophysics Data System (ADS)

    Qian, Long; Kussell, Edo

    The composition of genomes with respect to short DNA motifs impacts the ability of DNA binding proteins to locate and bind their target sites. Since nonfunctional DNA binding can be detrimental to cellular functions and ultimately to organismal fitness, organisms could benefit from reducing the number of nonfunctional binding sites genome wide. Using in vitro measurements of binding affinities for a large collection of DNA binding proteins, in multiple species, we detect a significant global avoidance of weak binding sites in genomes. The underlying evolutionary process leaves a distinct genomic hallmark in that similar words have correlated frequencies, which we detect in all species across domains of life. We hypothesize that natural selection against weak binding sites contributes to this process, and using an evolutionary model we show that the strength of selection needed to maintain global word compositions is on the order of point mutation rates. Alternative contributions may come from interference of protein-DNA binding with replication and mutational repair processes, which operates with similar rates. We conclude that genome-wide word compositions have been molded by DNA binding proteins through tiny evolutionary steps over timescales spanning millions of generations.

  16. Genomicus update 2015: KaryoView and MatrixView provide a genome-wide perspective to multispecies comparative genomics.

    PubMed

    Louis, Alexandra; Nguyen, Nga Thi Thuy; Muffato, Matthieu; Roest Crollius, Hugues

    2015-01-01

    The Genomicus web server (http://www.genomicus.biologie.ens.fr/genomicus) is a visualization tool allowing comparative genomics in four different phyla (Vertebrate, Fungi, Metazoan and Plants). It provides access to genomic information from extant species, as well as ancestral gene content and gene order for vertebrates and flowering plants. Here we present the new features available for vertebrate genome with a focus on new graphical tools. The interface to enter the database has been improved, two pairwise genome comparison tools are now available (KaryoView and MatrixView) and the multiple genome comparison tools (PhyloView and AlignView) propose three new kinds of representation and a more intuitive menu. These new developments have been implemented for Genomicus portal dedicated to vertebrates. This allows the analysis of 68 extant animal genomes, as well as 58 ancestral reconstructed genomes. The Genomicus server also provides access to ancestral gene orders, to facilitate evolutionary and comparative genomics studies, as well as computationally predicted regulatory interactions, thanks to the representation of conserved non-coding elements with their putative gene targets. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  17. Application of resequencing to rice genomics, functional genomics and evolutionary analysis

    PubMed Central

    2014-01-01

    Rice is a model system used for crop genomics studies. The completion of the rice genome draft sequences in 2002 not only accelerated functional genome studies, but also initiated a new era of resequencing rice genomes. Based on the reference genome in rice, next-generation sequencing (NGS) using the high-throughput sequencing system can efficiently accomplish whole genome resequencing of various genetic populations and diverse germplasm resources. Resequencing technology has been effectively utilized in evolutionary analysis, rice genomics and functional genomics studies. This technique is beneficial for both bridging the knowledge gap between genotype and phenotype and facilitating molecular breeding via gene design in rice. Here, we also discuss the limitation, application and future prospects of rice resequencing. PMID:25006357

  18. Short- and Long-term Evolutionary Dynamics of Bacterial Insertion Sequences: Insights from Wolbachia Endosymbionts

    PubMed Central

    Cerveau, Nicolas; Leclercq, Sébastien; Leroy, Elodie; Bouchon, Didier; Cordaux, Richard

    2011-01-01

    Transposable elements (TE) are one of the major driving forces of genome evolution, raising the question of the long-term dynamics underlying their evolutionary success. Long-term TE evolution can readily be reconstructed in eukaryotes, thanks to many degraded copies constituting genomic fossil records of past TE proliferations. By contrast, bacterial genomes usually experience high sequence turnover and short TE retention times, thereby obscuring ancient TE evolutionary patterns. We found that Wolbachia bacterial genomes contain 52–171 insertion sequence (IS) TEs. IS account for 11% of Wolbachia wRi, which is one of the highest IS genomic coverage reported in prokaryotes to date. We show that many IS groups are currently expanding in various Wolbachia genomes and that IS horizontal transfers are frequent among strains, which can explain the apparent synchronicity of these IS proliferations. Remarkably, >70% of Wolbachia IS are nonfunctional. They constitute an unusual bacterial IS genomic fossil record providing direct empirical evidence for a long-term IS evolutionary dynamics following successive periods of intense transpositional activity. Our results show that comprehensive IS annotations have the potential to provide new insights into prokaryote TE evolution and, more generally, prokaryote genome evolution. Indeed, the identification of an important IS genomic fossil record in Wolbachia demonstrates that IS elements are not always of recent origin, contrary to the conventional view of TE evolution in prokaryote genomes. Our results also raise the question whether the abundance of IS fossils is specific to Wolbachia or it may be a general, albeit overlooked, feature of prokaryote genomes. PMID:21940637

  19. Short- and long-term evolutionary dynamics of bacterial insertion sequences: insights from Wolbachia endosymbionts.

    PubMed

    Cerveau, Nicolas; Leclercq, Sébastien; Leroy, Elodie; Bouchon, Didier; Cordaux, Richard

    2011-01-01

    Transposable elements (TE) are one of the major driving forces of genome evolution, raising the question of the long-term dynamics underlying their evolutionary success. Long-term TE evolution can readily be reconstructed in eukaryotes, thanks to many degraded copies constituting genomic fossil records of past TE proliferations. By contrast, bacterial genomes usually experience high sequence turnover and short TE retention times, thereby obscuring ancient TE evolutionary patterns. We found that Wolbachia bacterial genomes contain 52-171 insertion sequence (IS) TEs. IS account for 11% of Wolbachia wRi, which is one of the highest IS genomic coverage reported in prokaryotes to date. We show that many IS groups are currently expanding in various Wolbachia genomes and that IS horizontal transfers are frequent among strains, which can explain the apparent synchronicity of these IS proliferations. Remarkably, >70% of Wolbachia IS are nonfunctional. They constitute an unusual bacterial IS genomic fossil record providing direct empirical evidence for a long-term IS evolutionary dynamics following successive periods of intense transpositional activity. Our results show that comprehensive IS annotations have the potential to provide new insights into prokaryote TE evolution and, more generally, prokaryote genome evolution. Indeed, the identification of an important IS genomic fossil record in Wolbachia demonstrates that IS elements are not always of recent origin, contrary to the conventional view of TE evolution in prokaryote genomes. Our results also raise the question whether the abundance of IS fossils is specific to Wolbachia or it may be a general, albeit overlooked, feature of prokaryote genomes.

  20. The Genomic Basis for Evolved Pollution Tolerance in Killifish (Fundulus heterclitus).

    EPA Science Inventory

    Uncovering the molecular mechanisms of adaptive variation is a leading challenge in evolutionary biology. Identifying genes that influence ecological traits can provide insight into the evolutionary processes behind genomic responses to environmental change. Here, we examine the...

  1. Evolutionary Genomics of Defense Systems in Archaea and Bacteria*

    PubMed Central

    Koonin, Eugene V.; Makarova, Kira S.; Wolf, Yuri I.

    2018-01-01

    Evolution of bacteria and archaea involves an incessant arms race against an enormous diversity of genetic parasites. Accordingly, a substantial fraction of the genes in most bacteria and archaea are dedicated to antiparasite defense. The functions of these defense systems follow several distinct strategies, including innate immunity; adaptive immunity; and dormancy induction, or programmed cell death. Recent comparative genomic studies taking advantage of the expanding database of microbial genomes and metagenomes, combined with direct experiments, resulted in the discovery of several previously unknown defense systems, including innate immunity centered on Argonaute proteins, bacteriophage exclusion, and new types of CRISPR-Cas systems of adaptive immunity. Some general principles of function and evolution of defense systems are starting to crystallize, in particular, extensive gain and loss of defense genes during the evolution of prokaryotes; formation of genomic defense islands; evolutionary connections between mobile genetic elements and defense, whereby genes of mobile elements are repeatedly recruited for defense functions; the partially selfish and addictive behavior of the defense systems; and coupling between immunity and dormancy induction/programmed cell death. PMID:28657885

  2. Genome evolution in Reptilia, the sister group of mammals.

    PubMed

    Janes, Daniel E; Organ, Christopher L; Fujita, Matthew K; Shedlock, Andrew M; Edwards, Scott V

    2010-01-01

    The genomes of birds and nonavian reptiles (Reptilia) are critical for understanding genome evolution in mammals and amniotes generally. Despite decades of study at the chromosomal and single-gene levels, and the evidence for great diversity in genome size, karyotype, and sex chromosome diversity, reptile genomes are virtually unknown in the comparative genomics era. The recent sequencing of the chicken and zebra finch genomes, in conjunction with genome scans and the online publication of the Anolis lizard genome, has begun to clarify the events leading from an ancestral amniote genome--predicted to be large and to possess a diverse repeat landscape on par with mammals and a birdlike sex chromosome system--to the small and highly streamlined genomes of birds. Reptilia exhibit a wide range of evolutionary rates of different subgenomes and, from isochores to mitochondrial DNA, provide a critical contrast to the genomic paradigms established in mammals.

  3. Evolutionary genomics of Entamoeba

    PubMed Central

    Weedall, Gareth D.; Hall, Neil

    2011-01-01

    Entamoeba histolytica is a human pathogen that causes amoebic dysentery and leads to significant morbidity and mortality worldwide. Understanding the genome and evolution of the parasite will help explain how, when and why it causes disease. Here we review current knowledge about the evolutionary genomics of Entamoeba: how differences between the genomes of different species may help explain different phenotypes, and how variation among E. histolytica parasites reveals patterns of population structure. The imminent expansion of the amount genome data will greatly improve our knowledge of the genus and of pathogenic species within it. PMID:21288488

  4. The infinite sites model of genome evolution.

    PubMed

    Ma, Jian; Ratan, Aakrosh; Raney, Brian J; Suh, Bernard B; Miller, Webb; Haussler, David

    2008-09-23

    We formalize the problem of recovering the evolutionary history of a set of genomes that are related to an unseen common ancestor genome by operations of speciation, deletion, insertion, duplication, and rearrangement of segments of bases. The problem is examined in the limit as the number of bases in each genome goes to infinity. In this limit, the chromosomes are represented by continuous circles or line segments. For such an infinite-sites model, we present a polynomial-time algorithm to find the most parsimonious evolutionary history of any set of related present-day genomes.

  5. MOSAIC: an online database dedicated to the comparative genomics of bacterial strains at the intra-species level.

    PubMed

    Chiapello, Hélène; Gendrault, Annie; Caron, Christophe; Blum, Jérome; Petit, Marie-Agnès; El Karoui, Meriem

    2008-11-27

    The recent availability of complete sequences for numerous closely related bacterial genomes opens up new challenges in comparative genomics. Several methods have been developed to align complete genomes at the nucleotide level but their use and the biological interpretation of results are not straightforward. It is therefore necessary to develop new resources to access, analyze, and visualize genome comparisons. Here we present recent developments on MOSAIC, a generalist comparative bacterial genome database. This database provides the bacteriologist community with easy access to comparisons of complete bacterial genomes at the intra-species level. The strategy we developed for comparison allows us to define two types of regions in bacterial genomes: backbone segments (i.e., regions conserved in all compared strains) and variable segments (i.e., regions that are either specific to or variable in one of the aligned genomes). Definition of these segments at the nucleotide level allows precise comparative and evolutionary analyses of both coding and non-coding regions of bacterial genomes. Such work is easily performed using the MOSAIC Web interface, which allows browsing and graphical visualization of genome comparisons. The MOSAIC database now includes 493 pairwise comparisons and 35 multiple maximal comparisons representing 78 bacterial species. Genome conserved regions (backbones) and variable segments are presented in various formats for further analysis. A graphical interface allows visualization of aligned genomes and functional annotations. The MOSAIC database is available online at http://genome.jouy.inra.fr/mosaic.

  6. Distinct retroelement classes define evolutionary breakpoints demarcating sites of evolutionary novelty

    PubMed Central

    Longo, Mark S; Carone, Dawn M; Green, Eric D; O'Neill, Michael J; O'Neill, Rachel J

    2009-01-01

    Background Large-scale genome rearrangements brought about by chromosome breaks underlie numerous inherited diseases, initiate or promote many cancers and are also associated with karyotype diversification during species evolution. Recent research has shown that these breakpoints are nonrandomly distributed throughout the mammalian genome and many, termed "evolutionary breakpoints" (EB), are specific genomic locations that are "reused" during karyotypic evolution. When the phylogenetic trajectory of orthologous chromosome segments is considered, many of these EB are coincident with ancient centromere activity as well as new centromere formation. While EB have been characterized as repeat-rich regions, it has not been determined whether specific sequences have been retained during evolution that would indicate previous centromere activity or a propensity for new centromere formation. Likewise, the conservation of specific sequence motifs or classes at EBs among divergent mammalian taxa has not been determined. Results To define conserved sequence features of EBs associated with centromere evolution, we performed comparative sequence analysis of more than 4.8 Mb within the tammar wallaby, Macropus eugenii, derived from centromeric regions (CEN), euchromatic regions (EU), and an evolutionary breakpoint (EB) that has undergone convergent breakpoint reuse and past centromere activity in marsupials. We found a dramatic enrichment for long interspersed nucleotide elements (LINE1s) and endogenous retroviruses (ERVs) and a depletion of short interspersed nucleotide elements (SINEs) shared between CEN and EBs. We analyzed the orthologous human EB (14q32.33), known to be associated with translocations in many cancers including multiple myelomas and plasma cell leukemias, and found a conserved distribution of similar repetitive elements. Conclusion Our data indicate that EBs tracked within the class Mammalia harbor sequence features retained since the divergence of marsupials and eutherians that may have predisposed these genomic regions to large-scale chromosomal instability. PMID:19630942

  7. Common functional targets of adaptive micro- and macro-evolutionary divergence in killifish.

    PubMed

    Whitehead, Andrew; Zhang, Shujun; Roach, Jennifer L; Galvez, Fernando

    2013-07-01

    Environmental salinity presents a key barrier to dispersal for most aquatic organisms, and adaptation to alternate osmotic environments likely enables species diversification. Little is known of the functional basis for derived tolerance to environmental salinity. We integrate comparative physiology and functional genomics to explore the mechanistic underpinnings of evolved variation in osmotic plasticity within and among two species of killifish; Fundulus majalis harbours the ancestral mainly salt-tolerant phenotype, whereas Fundulus heteroclitus harbours a derived physiology that retains extreme salt tolerance but with expanded osmotic plasticity towards the freshwater end of the osmotic continuum. Common-garden comparative hypo-osmotic challenge experiments show that F. heteroclitus is capable of remodelling gill epithelia more quickly and at more extreme osmotic challenge than F. majalis. We detect an unusual pattern of baseline transcriptome divergence, where neutral evolutionary processes appear to govern expression divergence within species, but patterns of divergence for these genes between species do not follow neutral expectations. During acclimation, genome expression profiling identifies mechanisms of acclimation-associated response that are conserved within the genus including regulation of paracellular permeability. In contrast, several responses vary among species including those putatively associated with cell volume regulation, and these same mechanisms are targets for adaptive physiological divergence along osmotic gradients within F. heteroclitus. As such, the genomic and physiological mechanisms that are associated with adaptive fine-tuning within species also contribute to macro-evolutionary divergence as species diversify across osmotic niches. © 2013 John Wiley & Sons Ltd.

  8. Genome-wide evolutionary dynamics of influenza B viruses on a global scale

    PubMed Central

    Langat, Pinky; Bowden, Thomas A.; Edwards, Stephanie; Gall, Astrid; Rambaut, Andrew; Daniels, Rodney S.; Russell, Colin A.; Pybus, Oliver G.; McCauley, John

    2017-01-01

    The global-scale epidemiology and genome-wide evolutionary dynamics of influenza B remain poorly understood compared with influenza A viruses. We compiled a spatio-temporally comprehensive dataset of influenza B viruses, comprising over 2,500 genomes sampled worldwide between 1987 and 2015, including 382 newly-sequenced genomes that fill substantial gaps in previous molecular surveillance studies. Our contributed data increase the number of available influenza B virus genomes in Europe, Africa and Central Asia, improving the global context to study influenza B viruses. We reveal Yamagata-lineage diversity results from co-circulation of two antigenically-distinct groups that also segregate genetically across the entire genome, without evidence of intra-lineage reassortment. In contrast, Victoria-lineage diversity stems from geographic segregation of different genetic clades, with variability in the degree of geographic spread among clades. Differences between the lineages are reflected in their antigenic dynamics, as Yamagata-lineage viruses show alternating dominance between antigenic groups, while Victoria-lineage viruses show antigenic drift of a single lineage. Structural mapping of amino acid substitutions on trunk branches of influenza B gene phylogenies further supports these antigenic differences and highlights two potential mechanisms of adaptation for polymerase activity. Our study provides new insights into the epidemiological and molecular processes shaping influenza B virus evolution globally. PMID:29284042

  9. Megacycles of atmospheric carbon dioxide concentration correlate with fossil plant genome size.

    PubMed

    Franks, Peter J; Freckleton, Rob P; Beaulieu, Jeremy M; Leitch, Ilia J; Beerling, David J

    2012-02-19

    Tectonic processes drive megacycles of atmospheric carbon dioxide (CO(2)) concentration, c(a), that force large fluctuations in global climate. With a period of several hundred million years, these megacycles have been linked to the evolution of vascular plants, but adaptation at the subcellular scale has been difficult to determine because fossils typically do not preserve this information. Here we show, after accounting for evolutionary relatedness using phylogenetic comparative methods, that plant nuclear genome size (measured as the haploid DNA amount) and the size of stomatal guard cells are correlated across a broad taxonomic range of extant species. This phylogenetic regression was used to estimate the mean genome size of fossil plants from the size of fossil stomata. For the last 400 Myr, spanning almost the full evolutionary history of vascular plants, we found a significant correlation between fossil plant genome size and c(a), modelled independently using geochemical data. The correlation is consistent with selection for stomatal size and genome size by c(a) as plants adapted towards optimal leaf gas exchange under a changing CO(2) regime. Our findings point to the possibility that major episodes of change in c(a) throughout Earth history might have selected for changes in genome size, influencing plant diversification.

  10. A model species for agricultural pest genomics: the genome of the Colorado potato beetle, Leptinotarsa decemlineata (Coleoptera: Chrysomelidae)

    DOE PAGES

    Schoville, Sean D.; Chen, Yolanda H.; Andersson, Martin N.; ...

    2018-01-31

    The Colorado potato beetle is one of the most challenging agricultural pests to manage. It has shown a spectacular ability to adapt to a variety of solanaceaeous plants and variable climates during its global invasion, and, notably, to rapidly evolve insecticide resistance. To examine evidence of rapid evolutionary change, and to understand the genetic basis of herbivory and insecticide resistance, we tested for structural and functional genomic changes relative to other arthropod species using genome sequencing, transcriptomics, and community annotation. Two factors that might facilitate rapid evolutionary change include transposable elements, which comprise at least 17% of the genome andmore » are rapidly evolving compared to other Coleoptera, and high levels of nucleotide diversity in rapidly growing pest populations. Adaptations to plant feeding are evident in gene expansions and differential expression of digestive enzymes in gut tissues, as well as expansions of gustatory receptors for bitter tasting. Surprisingly, the suite of genes involved in insecticide resistance is similar to other beetles. Finally, duplications in the RNAi pathway might explain why Leptinotarsa decemlineata has high sensitivity to dsRNA. In conclusion, the L. decemlineata genome provides opportunities to investigate a broad range of phenotypes and to develop sustainable methods to control this widely successful pest.« less

  11. A model species for agricultural pest genomics: the genome of the Colorado potato beetle, Leptinotarsa decemlineata (Coleoptera: Chrysomelidae)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Schoville, Sean D.; Chen, Yolanda H.; Andersson, Martin N.

    The Colorado potato beetle is one of the most challenging agricultural pests to manage. It has shown a spectacular ability to adapt to a variety of solanaceaeous plants and variable climates during its global invasion, and, notably, to rapidly evolve insecticide resistance. To examine evidence of rapid evolutionary change, and to understand the genetic basis of herbivory and insecticide resistance, we tested for structural and functional genomic changes relative to other arthropod species using genome sequencing, transcriptomics, and community annotation. Two factors that might facilitate rapid evolutionary change include transposable elements, which comprise at least 17% of the genome andmore » are rapidly evolving compared to other Coleoptera, and high levels of nucleotide diversity in rapidly growing pest populations. Adaptations to plant feeding are evident in gene expansions and differential expression of digestive enzymes in gut tissues, as well as expansions of gustatory receptors for bitter tasting. Surprisingly, the suite of genes involved in insecticide resistance is similar to other beetles. Finally, duplications in the RNAi pathway might explain why Leptinotarsa decemlineata has high sensitivity to dsRNA. In conclusion, the L. decemlineata genome provides opportunities to investigate a broad range of phenotypes and to develop sustainable methods to control this widely successful pest.« less

  12. The Awesome Power of Yeast Evolutionary Genetics: New Genome Sequences and Strain Resources for the Saccharomyces sensu stricto Genus

    PubMed Central

    Scannell, Devin R.; Zill, Oliver A.; Rokas, Antonis; Payen, Celia; Dunham, Maitreya J.; Eisen, Michael B.; Rine, Jasper; Johnston, Mark; Hittinger, Chris Todd

    2011-01-01

    High-quality, well-annotated genome sequences and standardized laboratory strains fuel experimental and evolutionary research. We present improved genome sequences of three species of Saccharomyces sensu stricto yeasts: S. bayanus var. uvarum (CBS 7001), S. kudriavzevii (IFO 1802T and ZP 591), and S. mikatae (IFO 1815T), and describe their comparison to the genomes of S. cerevisiae and S. paradoxus. The new sequences, derived by assembling millions of short DNA sequence reads together with previously published Sanger shotgun reads, have vastly greater long-range continuity and far fewer gaps than the previously available genome sequences. New gene predictions defined a set of 5261 protein-coding orthologs across the five most commonly studied Saccharomyces yeasts, enabling a re-examination of the tempo and mode of yeast gene evolution and improved inferences of species-specific gains and losses. To facilitate experimental investigations, we generated genetically marked, stable haploid strains for all three of these Saccharomyces species. These nearly complete genome sequences and the collection of genetically marked strains provide a valuable toolset for comparative studies of gene function, metabolism, and evolution, and render Saccharomyces sensu stricto the most experimentally tractable model genus. These resources are freely available and accessible through www.SaccharomycesSensuStricto.org. PMID:22384314

  13. A model species for agricultural pest genomics: the genome of the Colorado potato beetle, Leptinotarsa decemlineata (Coleoptera: Chrysomelidae).

    PubMed

    Schoville, Sean D; Chen, Yolanda H; Andersson, Martin N; Benoit, Joshua B; Bhandari, Anita; Bowsher, Julia H; Brevik, Kristian; Cappelle, Kaat; Chen, Mei-Ju M; Childers, Anna K; Childers, Christopher; Christiaens, Olivier; Clements, Justin; Didion, Elise M; Elpidina, Elena N; Engsontia, Patamarerk; Friedrich, Markus; García-Robles, Inmaculada; Gibbs, Richard A; Goswami, Chandan; Grapputo, Alessandro; Gruden, Kristina; Grynberg, Marcin; Henrissat, Bernard; Jennings, Emily C; Jones, Jeffery W; Kalsi, Megha; Khan, Sher A; Kumar, Abhishek; Li, Fei; Lombard, Vincent; Ma, Xingzhou; Martynov, Alexander; Miller, Nicholas J; Mitchell, Robert F; Munoz-Torres, Monica; Muszewska, Anna; Oppert, Brenda; Palli, Subba Reddy; Panfilio, Kristen A; Pauchet, Yannick; Perkin, Lindsey C; Petek, Marko; Poelchau, Monica F; Record, Éric; Rinehart, Joseph P; Robertson, Hugh M; Rosendale, Andrew J; Ruiz-Arroyo, Victor M; Smagghe, Guy; Szendrei, Zsofia; Thomas, Gregg W C; Torson, Alex S; Vargas Jentzsch, Iris M; Weirauch, Matthew T; Yates, Ashley D; Yocum, George D; Yoon, June-Sun; Richards, Stephen

    2018-01-31

    The Colorado potato beetle is one of the most challenging agricultural pests to manage. It has shown a spectacular ability to adapt to a variety of solanaceaeous plants and variable climates during its global invasion, and, notably, to rapidly evolve insecticide resistance. To examine evidence of rapid evolutionary change, and to understand the genetic basis of herbivory and insecticide resistance, we tested for structural and functional genomic changes relative to other arthropod species using genome sequencing, transcriptomics, and community annotation. Two factors that might facilitate rapid evolutionary change include transposable elements, which comprise at least 17% of the genome and are rapidly evolving compared to other Coleoptera, and high levels of nucleotide diversity in rapidly growing pest populations. Adaptations to plant feeding are evident in gene expansions and differential expression of digestive enzymes in gut tissues, as well as expansions of gustatory receptors for bitter tasting. Surprisingly, the suite of genes involved in insecticide resistance is similar to other beetles. Finally, duplications in the RNAi pathway might explain why Leptinotarsa decemlineata has high sensitivity to dsRNA. The L. decemlineata genome provides opportunities to investigate a broad range of phenotypes and to develop sustainable methods to control this widely successful pest.

  14. Y chromosome evolution: emerging insights into processes of Y chromosome degeneration

    PubMed Central

    Bachtrog, Doris

    2014-01-01

    The human Y chromosome is intriguing not only because it harbours the master-switch gene determining gender but also because of its unusual evolutionary trajectory. Previously an autosome, Y chromosome evolution has been characterized by massive gene decay. Recent whole-genome and transcriptome analyses of Y chromosomes in humans and other primates, in Drosophila species as well as in plants have shed light on the current gene content of the Y, its origins and its long-term fate. Comparative analysis of young and old Y chromosomes have given further insights into the evolutionary and molecular forces triggering Y degeneration and its evolutionary destiny. PMID:23329112

  15. Understanding the direction of evolution in Burkholderia glumae through comparative genomics.

    PubMed

    Lee, Hyun-Hee; Park, Jungwook; Kim, Jinnyun; Park, Inmyoung; Seo, Young-Su

    2016-02-01

    Members of the genus Burkholderia occupy remarkably diverse niches, with genome sizes ranging from ~3.75 to 11.29 Mbp. The genome of Burkholderia glumae ranges in size from ~5.81 to 7.89 Mbp. Unlike other plant pathogenic bacteria, B. glumae can infect a wide range of monocot and dicot plants. Comparative genome analysis of B. glumae strains can provide insight into genome variation as well as differential features of whole metabolism or pathways between multiple strains of B. glumae infecting the same host. Comparative analysis of complete genomes among B. glumae BGR1, B. glumae LMG 2196, and B. glumae PG1 revealed the largest departmentalization of genes onto separate replicons in B. glumae BGR1 and considerable downsizing of the genome in B. glumae LMG 2196. In addition, the presence of large-scale evolutionary events such as rearrangement and inversion and the development of highly specialized systems were found to be related to virulence-associated features in the three B. glumae strains. This connection may explain why this bacterium broadens its host range and reinforces its interaction with hosts.

  16. Ecological and evolutionary genomics of marine photosynthetic organisms.

    PubMed

    Coelho, Susana M; Simon, Nathalie; Ahmed, Sophia; Cock, J Mark; Partensky, Frédéric

    2013-02-01

    Environmental (ecological) genomics aims to understand the genetic basis of relationships between organisms and their abiotic and biotic environments. It is a rapidly progressing field of research largely due to recent advances in the speed and volume of genomic data being produced by next generation sequencing (NGS) technologies. Building on information generated by NGS-based approaches, functional genomic methodologies are being applied to identify and characterize genes and gene systems of both environmental and evolutionary relevance. Marine photosynthetic organisms (MPOs) were poorly represented amongst the early genomic models, but this situation is changing rapidly. Here we provide an overview of the recent advances in the application of ecological genomic approaches to both prokaryotic and eukaryotic MPOs. We describe how these approaches are being used to explore the biology and ecology of marine cyanobacteria and algae, particularly with regard to their functions in a broad range of marine ecosystems. Specifically, we review the ecological and evolutionary insights gained from whole genome and transcriptome sequencing projects applied to MPOs and illustrate how their genomes are yielding information on the specific features of these organisms. © 2012 Blackwell Publishing Ltd.

  17. Sequence and functional characterization of MIRNA164 promoters from Brassica shows copy number dependent regulatory diversification among homeologs.

    PubMed

    Jain, Aditi; Anand, Saurabh; Singh, Neer K; Das, Sandip

    2018-03-12

    The impact of polyploidy on functional diversification of cis-regulatory elements is poorly understood. This is primarily on account of lack of well-defined structure of cis-elements and a universal regulatory code. To the best of our knowledge, this is the first report on characterization of sequence and functional diversification of paralogous and homeologous promoter elements associated with MIR164 from Brassica. The availability of whole genome sequence allowed us to identify and isolate a total of 42 homologous copies of MIR164 from diploid species-Brassica rapa (A-genome), Brassica nigra (B-genome), Brassica oleracea (C-genome), and allopolyploids-Brassica juncea (AB-genome), Brassica carinata (BC-genome) and Brassica napus (AC-genome). Additionally, we retrieved homologous sequences based on comparative genomics from Arabidopsis lyrata, Capsella rubella, and Thellungiella halophila, spanning ca. 45 million years of evolutionary history of Brassicaceae. Sequence comparison across Brassicaceae revealed lineage-, karyotype, species-, and sub-genome specific changes providing a snapshot of evolutionary dynamics of miRNA promoters in polyploids. Tree topology of cis-elements associated with MIR164 was found to re-capitulate the species and family evolutionary history. Phylogenetic shadowing identified transcription factor binding sites (TFBS) conserved across Brassicaceae, of which, some are already known as regulators of MIR164 expression. Some of the TFBS were found to be distributed in a sub-genome specific (e.g., SOX specific to promoter of MIR164c from MF2 sub-genome), lineage-specific (YABBY binding motif, specific to C. rubella in MIR164b), or species-specific (e.g., VOZ in A. thaliana MIR164a) manner which might contribute towards genetic and adaptive variation. Reporter activity driven by promoters associated with MIR164 paralogs and homeologs was majorly in agreement with known role of miR164 in leaf shaping, regulation of lateral root development and senescence, and one previously un-described novel role in trichome. The impact of polyploidy was most profound when reporter activity across three MIR164c homeologs were compared that revealed negligible overlap, whereas reporter activity among two homeologs of MIR164a displays significant overlap. A copy number dependent cis-regulatory divergence thus exists in MIR164 genes in Brassica juncea. The full extent of regulatory diversification towards adaptive strategies will only be known when future endeavors analyze the promoter function under duress of stress and hormonal regimes.

  18. Sequencing of small RNAs of the fern Pleopeltis minima (Polypodiaceae) offers insight into the evolution of the microrna repertoire in land plants

    PubMed Central

    Berruezo, Florencia; de Souza, Flávio S. J.; Picca, Pablo I.; Nemirovsky, Sergio I.; Martínez Tosar, Leandro; Rivero, Mercedes; Mentaberry, Alejandro N.

    2017-01-01

    MicroRNAs (miRNAs) are short, single stranded RNA molecules that regulate the stability and translation of messenger RNAs in diverse eukaryotic groups. Several miRNA genes are of ancient origin and have been maintained in the genomes of animal and plant taxa for hundreds of millions of years, playing key roles in development and physiology. In the last decade, genome and small RNA (sRNA) sequencing of several plant species have helped unveil the evolutionary history of land plants. Among these, the fern group (monilophytes) occupies a key phylogenetic position, as it represents the closest extant cousin taxon of seed plants, i.e. gymno- and angiosperms. However, in spite of their evolutionary, economic and ecological importance, no fern genome has been sequenced yet and few genomic resources are available for this group. Here, we sequenced the small RNA fraction of an epiphytic South American fern, Pleopeltis minima (Polypodiaceae), and compared it to plant miRNA databases, allowing for the identification of miRNA families that are shared by all land plants, shared by all vascular plants (tracheophytes) or shared by euphyllophytes (ferns and seed plants) only. Using the recently described transcriptome of another fern, Lygodium japonicum, we also estimated the degree of conservation of fern miRNA targets in relation to other plant groups. Our results pinpoint the origin of several miRNA families in the land plant evolutionary tree with more precision and are a resource for future genomic and functional studies of fern miRNAs. PMID:28494025

  19. Comparative Genomic Analysis Reveals Multiple Long Terminal Repeats, Lineage-Specific Amplification, and Frequent Interelement Recombination for Cassandra Retrotransposon in Pear (Pyrus bretschneideri Rehd.)

    PubMed Central

    Yin, Hao; Du, Jianchang; Li, Leiting; Jin, Cong; Fan, Lian; Li, Meng; Wu, Jun; Zhang, Shaoling

    2014-01-01

    Cassandra transposable elements belong to a specific group of terminal-repeat retrotransposons in miniature (TRIM). Although Cassandra TRIM elements have been found in almost all vascular plants, detailed investigations on the nature, abundance, amplification timeframe, and evolution have not been performed in an individual genome. We therefore conducted a comprehensive analysis of Cassandra retrotransposons using the newly sequenced pear genome along with four other Rosaceae species, including apple, peach, mei, and woodland strawberry. Our data reveal several interesting findings for this particular retrotransposon family: 1) A large number of the intact copies contain three, four, or five long terminal repeats (LTRs) (∼20% in pear); 2) intact copies and solo LTRs with or without target site duplications are both common (∼80% vs. 20%) in each genome; 3) the elements exhibit an overall unbiased distribution among the chromosomes; 4) the elements are most successfully amplified in pear (5,032 copies); and 5) the evolutionary relationships of these elements vary among different lineages, species, and evolutionary time. These results indicate that Cassandra retrotransposons contain more complex structures (elements with multiple LTRs) than what we have known previously, and that frequent interelement unequal recombination followed by transposition may play a critical role in shaping and reshaping host genomes. Thus this study provides insights into the property, propensity, and molecular mechanisms governing the formation and amplification of Cassandra retrotransposons, and enhances our understanding of the structural variation, evolutionary history, and transposition process of LTR retrotransposons in plants. PMID:24899073

  20. Sequencing of small RNAs of the fern Pleopeltis minima (Polypodiaceae) offers insight into the evolution of the microrna repertoire in land plants.

    PubMed

    Berruezo, Florencia; de Souza, Flávio S J; Picca, Pablo I; Nemirovsky, Sergio I; Martínez Tosar, Leandro; Rivero, Mercedes; Mentaberry, Alejandro N; Zelada, Alicia M

    2017-01-01

    MicroRNAs (miRNAs) are short, single stranded RNA molecules that regulate the stability and translation of messenger RNAs in diverse eukaryotic groups. Several miRNA genes are of ancient origin and have been maintained in the genomes of animal and plant taxa for hundreds of millions of years, playing key roles in development and physiology. In the last decade, genome and small RNA (sRNA) sequencing of several plant species have helped unveil the evolutionary history of land plants. Among these, the fern group (monilophytes) occupies a key phylogenetic position, as it represents the closest extant cousin taxon of seed plants, i.e. gymno- and angiosperms. However, in spite of their evolutionary, economic and ecological importance, no fern genome has been sequenced yet and few genomic resources are available for this group. Here, we sequenced the small RNA fraction of an epiphytic South American fern, Pleopeltis minima (Polypodiaceae), and compared it to plant miRNA databases, allowing for the identification of miRNA families that are shared by all land plants, shared by all vascular plants (tracheophytes) or shared by euphyllophytes (ferns and seed plants) only. Using the recently described transcriptome of another fern, Lygodium japonicum, we also estimated the degree of conservation of fern miRNA targets in relation to other plant groups. Our results pinpoint the origin of several miRNA families in the land plant evolutionary tree with more precision and are a resource for future genomic and functional studies of fern miRNAs.

  1. Genomic identification of regulatory elements by evolutionary sequence comparison and functional analysis.

    PubMed

    Loots, Gabriela G

    2008-01-01

    Despite remarkable recent advances in genomics that have enabled us to identify most of the genes in the human genome, comparable efforts to define transcriptional cis-regulatory elements that control gene expression are lagging behind. The difficulty of this task stems from two equally important problems: our knowledge of how regulatory elements are encoded in genomes remains elementary, and there is a vast genomic search space for regulatory elements, since most of mammalian genomes are noncoding. Comparative genomic approaches are having a remarkable impact on the study of transcriptional regulation in eukaryotes and currently represent the most efficient and reliable methods of predicting noncoding sequences likely to control the patterns of gene expression. By subjecting eukaryotic genomic sequences to computational comparisons and subsequent experimentation, we are inching our way toward a more comprehensive catalog of common regulatory motifs that lie behind fundamental biological processes. We are still far from comprehending how the transcriptional regulatory code is encrypted in the human genome and providing an initial global view of regulatory gene networks, but collectively, the continued development of comparative and experimental approaches will rapidly expand our knowledge of the transcriptional regulome.

  2. Transcriptomics and molecular evolutionary rate analysis of the bladderwort (Utricularia), a carnivorous plant with a minimal genome

    PubMed Central

    2011-01-01

    Background The carnivorous plant Utricularia gibba (bladderwort) is remarkable in having a minute genome, which at ca. 80 megabases is approximately half that of Arabidopsis. Bladderworts show an incredible diversity of forms surrounding a defined theme: tiny, bladder-like suction traps on terrestrial, epiphytic, or aquatic plants with a diversity of unusual vegetative forms. Utricularia plants, which are rootless, are also anomalous in physiological features (respiration and carbon distribution), and highly enhanced molecular evolutionary rates in chloroplast, mitochondrial and nuclear ribosomal sequences. Despite great interest in the genus, no genomic resources exist for Utricularia, and the substitution rate increase has received limited study. Results Here we describe the sequencing and analysis of the Utricularia gibba transcriptome. Three different organs were surveyed, the traps, the vegetative shoot bodies, and the inflorescence stems. We also examined the bladderwort transcriptome under diverse stress conditions. We detail aspects of functional classification, tissue similarity, nitrogen and phosphorus metabolism, respiration, DNA repair, and detoxification of reactive oxygen species (ROS). Long contigs of plastid and mitochondrial genomes, as well as sequences for 100 individual nuclear genes, were compared with those of other plants to better establish information on molecular evolutionary rates. Conclusion The Utricularia transcriptome provides a detailed genomic window into processes occurring in a carnivorous plant. It contains a deep representation of the complex metabolic pathways that characterize a putative minimal plant genome, permitting its use as a source of genomic information to explore the structural, functional, and evolutionary diversity of the genus. Vegetative shoots and traps are the most similar organs by functional classification of their transcriptome, the traps expressing hydrolytic enzymes for prey digestion that were previously thought to be encoded by bacteria. Supporting physiological data, global gene expression analysis shows that traps significantly over-express genes involved in respiration and that phosphate uptake might occur mainly in traps, whereas nitrogen uptake could in part take place in vegetative parts. Expression of DNA repair and ROS detoxification enzymes may be indicative of a response to increased respiration. Finally, evidence from the bladderwort transcriptome, direct measurement of ROS in situ, and cross-species comparisons of organellar genomes and multiple nuclear genes supports the hypothesis that increased nucleotide substitution rates throughout the plant may be due to the mutagenic action of amplified ROS production. PMID:21639913

  3. Methods of Combinatorial Optimization to Reveal Factors Affecting Gene Length

    PubMed Central

    Bolshoy, Alexander; Tatarinova, Tatiana

    2012-01-01

    In this paper we present a novel method for genome ranking according to gene lengths. The main outcomes described in this paper are the following: the formulation of the genome ranking problem, presentation of relevant approaches to solve it, and the demonstration of preliminary results from prokaryotic genomes ordering. Using a subset of prokaryotic genomes, we attempted to uncover factors affecting gene length. We have demonstrated that hyperthermophilic species have shorter genes as compared with mesophilic organisms, which probably means that environmental factors affect gene length. Moreover, these preliminary results show that environmental factors group together in ranking evolutionary distant species. PMID:23300345

  4. Genomics of pear and other Rosaceae fruit trees

    PubMed Central

    Yamamoto, Toshiya; Terakami, Shingo

    2016-01-01

    The family Rosaceae includes many economically important fruit trees, such as pear, apple, peach, cherry, quince, apricot, plum, raspberry, and loquat. Over the past few years, whole-genome sequences have been released for Chinese pear, European pear, apple, peach, Japanese apricot, and strawberry. These sequences help us to conduct functional and comparative genomics studies and to develop new cultivars with desirable traits by marker-assisted selection in breeding programs. These genomics resources also allow identification of evolutionary relationships in Rosaceae, development of genome-wide SNP and SSR markers, and construction of reference genetic linkage maps, which are available through the Genome Database for the Rosaceae website. Here, we review the recent advances in genomics studies and their practical applications for Rosaceae fruit trees, particularly pear, apple, peach, and cherry. PMID:27069399

  5. Genomics of pear and other Rosaceae fruit trees.

    PubMed

    Yamamoto, Toshiya; Terakami, Shingo

    2016-01-01

    The family Rosaceae includes many economically important fruit trees, such as pear, apple, peach, cherry, quince, apricot, plum, raspberry, and loquat. Over the past few years, whole-genome sequences have been released for Chinese pear, European pear, apple, peach, Japanese apricot, and strawberry. These sequences help us to conduct functional and comparative genomics studies and to develop new cultivars with desirable traits by marker-assisted selection in breeding programs. These genomics resources also allow identification of evolutionary relationships in Rosaceae, development of genome-wide SNP and SSR markers, and construction of reference genetic linkage maps, which are available through the Genome Database for the Rosaceae website. Here, we review the recent advances in genomics studies and their practical applications for Rosaceae fruit trees, particularly pear, apple, peach, and cherry.

  6. Increased alignment sensitivity improves the usage of genome alignments for comparative gene annotation.

    PubMed

    Sharma, Virag; Hiller, Michael

    2017-08-21

    Genome alignments provide a powerful basis to transfer gene annotations from a well-annotated reference genome to many other aligned genomes. The completeness of these annotations crucially depends on the sensitivity of the underlying genome alignment. Here, we investigated the impact of the genome alignment parameters and found that parameters with a higher sensitivity allow the detection of thousands of novel alignments between orthologous exons that have been missed before. In particular, comparisons between species separated by an evolutionary distance of >0.75 substitutions per neutral site, like human and other non-placental vertebrates, benefit from increased sensitivity. To systematically test if increased sensitivity improves comparative gene annotations, we built a multiple alignment of 144 vertebrate genomes and used this alignment to map human genes to the other 143 vertebrates with CESAR. We found that higher alignment sensitivity substantially improves the completeness of comparative gene annotations by adding on average 2382 and 7440 novel exons and 117 and 317 novel genes for mammalian and non-mammalian species, respectively. Our results suggest a more sensitive alignment strategy that should generally be used for genome alignments between distantly-related species. Our 144-vertebrate genome alignment and the comparative gene annotations (https://bds.mpi-cbg.de/hillerlab/144VertebrateAlignment_CESAR/) are a valuable resource for comparative genomics. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  7. Enhanced annotations and features for comparing thousands of Pseudomonas genomes in the Pseudomonas genome database.

    PubMed

    Winsor, Geoffrey L; Griffiths, Emma J; Lo, Raymond; Dhillon, Bhavjinder K; Shay, Julie A; Brinkman, Fiona S L

    2016-01-04

    The Pseudomonas Genome Database (http://www.pseudomonas.com) is well known for the application of community-based annotation approaches for producing a high-quality Pseudomonas aeruginosa PAO1 genome annotation, and facilitating whole-genome comparative analyses with other Pseudomonas strains. To aid analysis of potentially thousands of complete and draft genome assemblies, this database and analysis platform was upgraded to integrate curated genome annotations and isolate metadata with enhanced tools for larger scale comparative analysis and visualization. Manually curated gene annotations are supplemented with improved computational analyses that help identify putative drug targets and vaccine candidates or assist with evolutionary studies by identifying orthologs, pathogen-associated genes and genomic islands. The database schema has been updated to integrate isolate metadata that will facilitate more powerful analysis of genomes across datasets in the future. We continue to place an emphasis on providing high-quality updates to gene annotations through regular review of the scientific literature and using community-based approaches including a major new Pseudomonas community initiative for the assignment of high-quality gene ontology terms to genes. As we further expand from thousands of genomes, we plan to provide enhancements that will aid data visualization and analysis arising from whole-genome comparative studies including more pan-genome and population-based approaches. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  8. Baculovirus phylogeny and evolution.

    PubMed

    Herniou, Elisabeth A; Jehle, Johannes A

    2007-10-01

    The family Baculoviridae represents one of the largest and most diverse groups of viruses and a unique model for studying the forces driving the evolution and biodiversity of double-stranded DNA viruses with large genomes. With the advent of comparative genomics, the phylogenetic relationships of baculoviruses have been put on solid bases. This, as well as improved bioinformatic approaches, has provided a detailed picture of baculovirus phylogeny and evolution. According to the present knowledge, baculoviruses can be classified into at least four evolutionary lineages: the most ancestral dipteran nucleopolyhedroviruses, the hymenopteran nucleopolyhedroviruses and the lepidopteran nucleopolyhedroviruses and granuloviruses. Despite the growing understanding of baculovirus phylogeny and macro-evolution, our knowledge of the micro-evolutionary processes within baculovirus species and virus populations is still limited. Here we present the state of the art on baculovirus phylogeny and evolution.

  9. Does the evolutionary conservation of microsatellite loci imply function?

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shriver, M.D.; Deka, R.; Ferrell, R.E.

    Microsatellites are highly polymorphic tandem arrays of short (1-6 bp) sequence motifs which have been found widely distributed in the genomes of all eukaryotes. We have analyzed allele frequency data on 16 microsatellite loci typed in the great apes (human, chimp, orangutan, and gorilla). The majority of these loci (13) were isolated from human genomic libraries; three were cloned from chimpanzee genomic DNA. Most of these loci are not only present in all apes species, but are polymorphic with comparable levels of heterozygosity and have alleles which overlap in size. The extent of divergence of allele frequencies among these fourmore » species were studies using the stepwise-weighted genetic distance (Dsw), which was previously shown to conform to linearity with evolutionary time since divergence for loci where mutations exist in a stepwise fashion. The phylogenetic tree of the great apes constructed from this distance matrix was consistent with the expected topology, with a high bootstrap confidence (82%) for the human/chimp clade. However, the allele frequency distributions of these species are 10 times more similar to each other than expected when they were calibrated with a conservative estimate of the time since separation of humans and the apes. These results are in agreement with sequence-based surveys of microsatellites which have demonstrated that they are highly (90%) conserved over short periods of evolutionary time (< 10 million years) and moderately (30%) conserved over long periods of evolutionary time (> 60-80 million years). This evolutionary conservation has prompted some authors to speculate that there are functional constraints on microsatellite loci. In contrast, the presence of directional bias of mutations with constraints and/or selection against aberrant sized alleles can explain these results.« less

  10. The Genome of the Anaerobic Fungus Orpinomyces sp. Strain C1A Reveals the Unique Evolutionary History of a Remarkable Plant Biomass Degrader

    PubMed Central

    Youssef, Noha H.; Couger, M. B.; Struchtemeyer, Christopher G.; Liggenstoffer, Audra S.; Prade, Rolf A.; Najar, Fares Z.; Atiyeh, Hasan K.; Wilkins, Mark R.

    2013-01-01

    Anaerobic gut fungi represent a distinct early-branching fungal phylum (Neocallimastigomycota) and reside in the rumen, hindgut, and feces of ruminant and nonruminant herbivores. The genome of an anaerobic fungal isolate, Orpinomyces sp. strain C1A, was sequenced using a combination of Illumina and PacBio single-molecule real-time (SMRT) technologies. The large genome (100.95 Mb, 16,347 genes) displayed extremely low G+C content (17.0%), large noncoding intergenic regions (73.1%), proliferation of microsatellite repeats (4.9%), and multiple gene duplications. Comparative genomic analysis identified multiple genes and pathways that are absent in Dikarya genomes but present in early-branching fungal lineages and/or nonfungal Opisthokonta. These included genes for posttranslational fucosylation, the production of specific intramembrane proteases and extracellular protease inhibitors, the formation of a complete axoneme and intraflagellar trafficking machinery, and a near-complete focal adhesion machinery. Analysis of the lignocellulolytic machinery in the C1A genome revealed an extremely rich repertoire, with evidence of horizontal gene acquisition from multiple bacterial lineages. Experimental analysis indicated that strain C1A is a remarkable biomass degrader, capable of simultaneous saccharification and fermentation of the cellulosic and hemicellulosic fractions in multiple untreated grasses and crop residues examined, with the process significantly enhanced by mild pretreatments. This capability, acquired during its separate evolutionary trajectory in the rumen, along with its resilience and invasiveness compared to prokaryotic anaerobes, renders anaerobic fungi promising agents for consolidated bioprocessing schemes in biofuels production. PMID:23709508

  11. The genome of the anaerobic fungus Orpinomyces sp. strain C1A reveals the unique evolutionary history of a remarkable plant biomass degrader.

    PubMed

    Youssef, Noha H; Couger, M B; Struchtemeyer, Christopher G; Liggenstoffer, Audra S; Prade, Rolf A; Najar, Fares Z; Atiyeh, Hasan K; Wilkins, Mark R; Elshahed, Mostafa S

    2013-08-01

    Anaerobic gut fungi represent a distinct early-branching fungal phylum (Neocallimastigomycota) and reside in the rumen, hindgut, and feces of ruminant and nonruminant herbivores. The genome of an anaerobic fungal isolate, Orpinomyces sp. strain C1A, was sequenced using a combination of Illumina and PacBio single-molecule real-time (SMRT) technologies. The large genome (100.95 Mb, 16,347 genes) displayed extremely low G+C content (17.0%), large noncoding intergenic regions (73.1%), proliferation of microsatellite repeats (4.9%), and multiple gene duplications. Comparative genomic analysis identified multiple genes and pathways that are absent in Dikarya genomes but present in early-branching fungal lineages and/or nonfungal Opisthokonta. These included genes for posttranslational fucosylation, the production of specific intramembrane proteases and extracellular protease inhibitors, the formation of a complete axoneme and intraflagellar trafficking machinery, and a near-complete focal adhesion machinery. Analysis of the lignocellulolytic machinery in the C1A genome revealed an extremely rich repertoire, with evidence of horizontal gene acquisition from multiple bacterial lineages. Experimental analysis indicated that strain C1A is a remarkable biomass degrader, capable of simultaneous saccharification and fermentation of the cellulosic and hemicellulosic fractions in multiple untreated grasses and crop residues examined, with the process significantly enhanced by mild pretreatments. This capability, acquired during its separate evolutionary trajectory in the rumen, along with its resilience and invasiveness compared to prokaryotic anaerobes, renders anaerobic fungi promising agents for consolidated bioprocessing schemes in biofuels production.

  12. Comparing Patterns of Natural Selection across Species Using Selective Signatures

    PubMed Central

    Shapiro, B. Jesse; Alm, Eric J

    2008-01-01

    Comparing gene expression profiles over many different conditions has led to insights that were not obvious from single experiments. In the same way, comparing patterns of natural selection across a set of ecologically distinct species may extend what can be learned from individual genome-wide surveys. Toward this end, we show how variation in protein evolutionary rates, after correcting for genome-wide effects such as mutation rate and demographic factors, can be used to estimate the level and types of natural selection acting on genes across different species. We identify unusually rapidly and slowly evolving genes, relative to empirically derived genome-wide and gene family-specific background rates for 744 core protein families in 30 γ-proteobacterial species. We describe the pattern of fast or slow evolution across species as the “selective signature” of a gene. Selective signatures represent a profile of selection across species that is predictive of gene function: pairs of genes with correlated selective signatures are more likely to share the same cellular function, and genes in the same pathway can evolve in concert. For example, glycolysis and phenylalanine metabolism genes evolve rapidly in Idiomarina loihiensis, mirroring an ecological shift in carbon source from sugars to amino acids. In a broader context, our results suggest that the genomic landscape is organized into functional modules even at the level of natural selection, and thus it may be easier than expected to understand the complex evolutionary pressures on a cell. PMID:18266472

  13. Comparative Phylogenomics Uncovers the Impact of Symbiotic Associations on Host Genome Evolution

    PubMed Central

    Delaux, Pierre-Marc; Varala, Kranthi; Edger, Patrick P.; Coruzzi, Gloria M.; Pires, J. Chris; Ané, Jean-Michel

    2014-01-01

    Mutualistic symbioses between eukaryotes and beneficial microorganisms of their microbiome play an essential role in nutrition, protection against disease, and development of the host. However, the impact of beneficial symbionts on the evolution of host genomes remains poorly characterized. Here we used the independent loss of the most widespread plant–microbe symbiosis, arbuscular mycorrhization (AM), as a model to address this question. Using a large phenotypic approach and phylogenetic analyses, we present evidence that loss of AM symbiosis correlates with the loss of many symbiotic genes in the Arabidopsis lineage (Brassicales). Then, by analyzing the genome and/or transcriptomes of nine other phylogenetically divergent non-host plants, we show that this correlation occurred in a convergent manner in four additional plant lineages, demonstrating the existence of an evolutionary pattern specific to symbiotic genes. Finally, we use a global comparative phylogenomic approach to track this evolutionary pattern among land plants. Based on this approach, we identify a set of 174 highly conserved genes and demonstrate enrichment in symbiosis-related genes. Our findings are consistent with the hypothesis that beneficial symbionts maintain purifying selection on host gene networks during the evolution of entire lineages. PMID:25032823

  14. Contrasting evolutionary genome dynamics between domesticated and wild yeasts

    PubMed Central

    Yue, Jia-Xing; Li, Jing; Aigrain, Louise; Hallin, Johan; Persson, Karl; Oliver, Karen; Bergström, Anders; Coupland, Paul; Warringer, Jonas; Lagomarsino, Marco Consentino; Fischer, Gilles; Durbin, Richard; Liti, Gianni

    2017-01-01

    Structural rearrangements have long been recognized as an important source of genetic variation with implications in phenotypic diversity and disease, yet their detailed evolutionary dynamics remain elusive. Here, we use long-read sequencing to generate end-to-end genome assemblies for 12 strains representing major subpopulations of the partially domesticated yeast Saccharomyces cerevisiae and its wild relative Saccharomyces paradoxus. These population-level high-quality genomes with comprehensive annotation allow for the first time a precise definition of chromosomal boundaries between cores and subtelomeres and a high-resolution view of evolutionary genome dynamics. In chromosomal cores, S. paradoxus exhibits faster accumulation of balanced rearrangements (inversions, reciprocal translocations and transpositions) whereas S. cerevisiae accumulates unbalanced rearrangements (novel insertions, deletions and duplications) more rapidly. In subtelomeres, both species show extensive interchromosomal reshuffling, with a higher tempo in S. cerevisiae. Such striking contrasts between wild and domesticated yeasts likely reflect the influence of human activities on structural genome evolution. PMID:28416820

  15. Structural Genomics: Correlation Blocks, Population Structure, and Genome Architecture

    PubMed Central

    Hu, Xin-Sheng; Yeh, Francis C.; Wang, Zhiquan

    2011-01-01

    An integration of the pattern of genome-wide inter-site associations with evolutionary forces is important for gaining insights into the genomic evolution in natural or artificial populations. Here, we assess the inter-site correlation blocks and their distributions along chromosomes. A correlation block is broadly termed as the DNA segment within which strong correlations exist between genetic diversities at any two sites. We bring together the population genetic structure and the genomic diversity structure that have been independently built on different scales and synthesize the existing theories and methods for characterizing genomic structure at the population level. We discuss how population structure could shape correlation blocks and their patterns within and between populations. Effects of evolutionary forces (selection, migration, genetic drift, and mutation) on the pattern of genome-wide correlation blocks are discussed. In eukaryote organisms, we briefly discuss the associations between the pattern of correlation blocks and genome assembly features in eukaryote organisms, including the impacts of multigene family, the perturbation of transposable elements, and the repetitive nongenic sequences and GC-rich isochores. Our reviews suggest that the observable pattern of correlation blocks can refine our understanding of the ecological and evolutionary processes underlying the genomic evolution at the population level. PMID:21886455

  16. Comparing genomes with rearrangements and segmental duplications.

    PubMed

    Shao, Mingfu; Moret, Bernard M E

    2015-06-15

    Large-scale evolutionary events such as genomic rearrange.ments and segmental duplications form an important part of the evolution of genomes and are widely studied from both biological and computational perspectives. A basic computational problem is to infer these events in the evolutionary history for given modern genomes, a task for which many algorithms have been proposed under various constraints. Algorithms that can handle both rearrangements and content-modifying events such as duplications and losses remain few and limited in their applicability. We study the comparison of two genomes under a model including general rearrangements (through double-cut-and-join) and segmental duplications. We formulate the comparison as an optimization problem and describe an exact algorithm to solve it by using an integer linear program. We also devise a sufficient condition and an efficient algorithm to identify optimal substructures, which can simplify the problem while preserving optimality. Using the optimal substructures with the integer linear program (ILP) formulation yields a practical and exact algorithm to solve the problem. We then apply our algorithm to assign in-paralogs and orthologs (a necessary step in handling duplications) and compare its performance with that of the state-of-the-art method MSOAR, using both simulations and real data. On simulated datasets, our method outperforms MSOAR by a significant margin, and on five well-annotated species, MSOAR achieves high accuracy, yet our method performs slightly better on each of the 10 pairwise comparisons. http://lcbb.epfl.ch/softwares/coser. © The Author 2015. Published by Oxford University Press.

  17. Whole-genome de novo sequencing reveals unique genes that contributed to the adaptive evolution of the Mikado pheasant.

    PubMed

    Lee, Chien-Yueh; Hsieh, Ping-Han; Chiang, Li-Mei; Chattopadhyay, Amrita; Li, Kuan-Yi; Lee, Yi-Fang; Lu, Tzu-Pin; Lai, Liang-Chuan; Lin, En-Chung; Lee, Hsinyu; Ding, Shih-Torng; Tsai, Mong-Hsun; Chen, Chien-Yu; Chuang, Eric Y

    2018-05-01

    The Mikado pheasant (Syrmaticus mikado) is a nearly endangered species indigenous to high-altitude regions of Taiwan. This pheasant provides an opportunity to investigate evolutionary processes following geographic isolation. Currently, the genetic background and adaptive evolution of the Mikado pheasant remain unclear. We present the draft genome of the Mikado pheasant, which consists of 1.04 Gb of DNA and 15,972 annotated protein-coding genes. The Mikado pheasant displays expansion and positive selection of genes related to features that contribute to its adaptive evolution, such as energy metabolism, oxygen transport, hemoglobin binding, radiation response, immune response, and DNA repair. To investigate the molecular evolution of the major histocompatibility complex (MHC) across several avian species, 39 putative genes spanning 227 kb on a contiguous region were annotated and manually curated. The MHC loci of the pheasant revealed a high level of synteny, several rapidly evolving genes, and inverse regions compared to the same loci in the chicken. The complete mitochondrial genome was also sequenced, assembled, and compared against four long-tailed pheasants. The results from molecular clock analysis suggest that ancestors of the Mikado pheasant migrated from the north to Taiwan about 3.47 million years ago. This study provides a valuable genomic resource for the Mikado pheasant, insights into its adaptation to high altitude, and the evolutionary history of the genus Syrmaticus, which could potentially be useful for future studies that investigate molecular evolution, genomics, ecology, and immunogenetics.

  18. Use of comparative genomics approaches to characterize interspecies differences in response to environmental chemicals: Challenges, opportunities, and research needs

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Burgess-Herbert, Sarah L., E-mail: sarah.burgess@alum.mit.edu; Euling, Susan Y.

    A critical challenge for environmental chemical risk assessment is the characterization and reduction of uncertainties introduced when extrapolating inferences from one species to another. The purpose of this article is to explore the challenges, opportunities, and research needs surrounding the issue of how genomics data and computational and systems level approaches can be applied to inform differences in response to environmental chemical exposure across species. We propose that the data, tools, and evolutionary framework of comparative genomics be adapted to inform interspecies differences in chemical mechanisms of action. We compare and contrast existing approaches, from disciplines as varied as evolutionarymore » biology, systems biology, mathematics, and computer science, that can be used, modified, and combined in new ways to discover and characterize interspecies differences in chemical mechanism of action which, in turn, can be explored for application to risk assessment. We consider how genetic, protein, pathway, and network information can be interrogated from an evolutionary biology perspective to effectively characterize variations in biological processes of toxicological relevance among organisms. We conclude that comparative genomics approaches show promise for characterizing interspecies differences in mechanisms of action, and further, for improving our understanding of the uncertainties inherent in extrapolating inferences across species in both ecological and human health risk assessment. To achieve long-term relevance and consistent use in environmental chemical risk assessment, improved bioinformatics tools, computational methods robust to data gaps, and quantitative approaches for conducting extrapolations across species are critically needed. Specific areas ripe for research to address these needs are recommended.« less

  19. Similar Ratios of Introns to Intergenic Sequence across Animal Genomes.

    PubMed

    Francis, Warren R; Wörheide, Gert

    2017-06-01

    One central goal of genome biology is to understand how the usage of the genome differs between organisms. Our knowledge of genome composition, needed for downstream inferences, is critically dependent on gene annotations, yet problems associated with gene annotation and assembly errors are usually ignored in comparative genomics. Here, we analyze the genomes of 68 species across 12 animal phyla and some single-cell eukaryotes for general trends in genome composition and transcription, taking into account problems of gene annotation. We show that, regardless of genome size, the ratio of introns to intergenic sequence is comparable across essentially all animals, with nearly all deviations dominated by increased intergenic sequence. Genomes of model organisms have ratios much closer to 1:1, suggesting that the majority of published genomes of nonmodel organisms are underannotated and consequently omit substantial numbers of genes, with likely negative impact on evolutionary interpretations. Finally, our results also indicate that most animals transcribe half or more of their genomes arguing against differences in genome usage between animal groups, and also suggesting that the transcribed portion is more dependent on genome size than previously thought. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  20. Comparative Genomics of Mycobacteria: Some Answers, Yet More New Questions

    PubMed Central

    Behr, Marcel A.

    2015-01-01

    Comparative genomic studies permit a genus-level perspective on the distinction between environmental mycobacteria and Mycobacterium tuberculosis, as well as a species-level assessment of genetic variability within M. tuberculosis. Both of these strata of evolutionary analysis serve to generate hypotheses regarding the genomic basis of M. tuberculosis virulence. In contrasting lessons from macroevolutionary study and microevolutionary study, one can form predictions about which segments of the genome are likely to be essential for or dispensable for the pathogenesis of tuberculosis. Although some of these predictions have been experimentally verified, notable exceptions challenge the direct link between these virulence factors and the capacity of M. tuberculosis to successfully cause disease and propagate between human hosts. These unexpected findings serve as the stimulus for further studies, using genomic comparisons and other approaches, to better define the remarkable success of this recalcitrant pathogen. PMID:25395374

  1. Next-Generation Sequencing of Two Mitochondrial Genomes from Family Pompilidae (Hymenoptera: Vespoidea) Reveal Novel Patterns of Gene Arrangement

    PubMed Central

    Chen, Peng-Yan; Zheng, Bo-Ying; Liu, Jing-Xian; Wei, Shu-Jun

    2016-01-01

    Animal mitochondrial genomes have provided large and diverse datasets for evolutionary studies. Here, the first two representative mitochondrial genomes from the family Pompilidae (Hymenoptera: Vespoidea) were determined using next-generation sequencing. The sequenced region of these two mitochondrial genomes from the species Auplopus sp. and Agenioideus sp. was 16,746 bp long with an A + T content of 83.12% and 16,596 bp long with an A + T content of 78.64%, respectively. In both species, all of the 37 typical mitochondrial genes were determined. The secondary structure of tRNA genes and rRNA genes were predicted and compared with those of other insects. Atypical trnS1 using abnormal anticodons TCT and lacking D-stem pairings was identified. There were 49 helices belonging to six domains in rrnL and 30 helices belonging to three domains in rrns present. Compared with the ancestral organization, four and two tRNA genes were rearranged in mitochondrial genomes of Auplopus and Agenioideus, respectively. In both species, trnM was shuffled upstream of the trnI-trnQ-trnM cluster, and trnA was translocated from the cluster trnA-trnR-trnN-trnS1-trnE-trnF to the region between nad1 and trnL1, which is novel to the Vespoidea. In Auplopus, the tRNA cluster trnW-trnC-trnY was shuffled to trnW-trnY-trnC. Phylogenetic analysis within Vespoidea revealed that Pompilidae and Mutillidae formed a sister lineage, and then sistered Formicidae. The genomes presented in this study have enriched the knowledge base of molecular markers, which is valuable in respect to studies about the gene rearrangement mechanism, genomic evolutionary processes and phylogeny of Hymenoptera. PMID:27727175

  2. Performance and Scalability of Discriminative Metrics for Comparative Gene Identification in 12 Drosophila Genomes

    PubMed Central

    Lin, Michael F.; Deoras, Ameya N.; Rasmussen, Matthew D.; Kellis, Manolis

    2008-01-01

    Comparative genomics of multiple related species is a powerful methodology for the discovery of functional genomic elements, and its power should increase with the number of species compared. Here, we use 12 Drosophila genomes to study the power of comparative genomics metrics to distinguish between protein-coding and non-coding regions. First, we study the relative power of different comparative metrics and their relationship to single-species metrics. We find that even relatively simple multi-species metrics robustly outperform advanced single-species metrics, especially for shorter exons (≤240 nt), which are common in animal genomes. Moreover, the two capture largely independent features of protein-coding genes, with different sensitivity/specificity trade-offs, such that their combinations lead to even greater discriminatory power. In addition, we study how discovery power scales with the number and phylogenetic distance of the genomes compared. We find that species at a broad range of distances are comparably effective informants for pairwise comparative gene identification, but that these are surpassed by multi-species comparisons at similar evolutionary divergence. In particular, while pairwise discovery power plateaued at larger distances and never outperformed the most advanced single-species metrics, multi-species comparisons continued to benefit even from the most distant species with no apparent saturation. Last, we find that genes in functional categories typically considered fast-evolving can nonetheless be recovered at very high rates using comparative methods. Our results have implications for comparative genomics analyses in any species, including the human. PMID:18421375

  3. A Gene Gravity Model for the Evolution of Cancer Genomes: A Study of 3,000 Cancer Genomes across 9 Cancer Types.

    PubMed

    Cheng, Feixiong; Liu, Chuang; Lin, Chen-Ching; Zhao, Junfei; Jia, Peilin; Li, Wen-Hsiung; Zhao, Zhongming

    2015-09-01

    Cancer development and progression result from somatic evolution by an accumulation of genomic alterations. The effects of those alterations on the fitness of somatic cells lead to evolutionary adaptations such as increased cell proliferation, angiogenesis, and altered anticancer drug responses. However, there are few general mathematical models to quantitatively examine how perturbations of a single gene shape subsequent evolution of the cancer genome. In this study, we proposed the gene gravity model to study the evolution of cancer genomes by incorporating the genome-wide transcription and somatic mutation profiles of ~3,000 tumors across 9 cancer types from The Cancer Genome Atlas into a broad gene network. We found that somatic mutations of a cancer driver gene may drive cancer genome evolution by inducing mutations in other genes. This functional consequence is often generated by the combined effect of genetic and epigenetic (e.g., chromatin regulation) alterations. By quantifying cancer genome evolution using the gene gravity model, we identified six putative cancer genes (AHNAK, COL11A1, DDX3X, FAT4, STAG2, and SYNE1). The tumor genomes harboring the nonsynonymous somatic mutations in these genes had a higher mutation density at the genome level compared to the wild-type groups. Furthermore, we provided statistical evidence that hypermutation of cancer driver genes on inactive X chromosomes is a general feature in female cancer genomes. In summary, this study sheds light on the functional consequences and evolutionary characteristics of somatic mutations during tumorigenesis by propelling adaptive cancer genome evolution, which would provide new perspectives for cancer research and therapeutics.

  4. A Gene Gravity Model for the Evolution of Cancer Genomes: A Study of 3,000 Cancer Genomes across 9 Cancer Types

    PubMed Central

    Lin, Chen-Ching; Zhao, Junfei; Jia, Peilin; Li, Wen-Hsiung; Zhao, Zhongming

    2015-01-01

    Cancer development and progression result from somatic evolution by an accumulation of genomic alterations. The effects of those alterations on the fitness of somatic cells lead to evolutionary adaptations such as increased cell proliferation, angiogenesis, and altered anticancer drug responses. However, there are few general mathematical models to quantitatively examine how perturbations of a single gene shape subsequent evolution of the cancer genome. In this study, we proposed the gene gravity model to study the evolution of cancer genomes by incorporating the genome-wide transcription and somatic mutation profiles of ~3,000 tumors across 9 cancer types from The Cancer Genome Atlas into a broad gene network. We found that somatic mutations of a cancer driver gene may drive cancer genome evolution by inducing mutations in other genes. This functional consequence is often generated by the combined effect of genetic and epigenetic (e.g., chromatin regulation) alterations. By quantifying cancer genome evolution using the gene gravity model, we identified six putative cancer genes (AHNAK, COL11A1, DDX3X, FAT4, STAG2, and SYNE1). The tumor genomes harboring the nonsynonymous somatic mutations in these genes had a higher mutation density at the genome level compared to the wild-type groups. Furthermore, we provided statistical evidence that hypermutation of cancer driver genes on inactive X chromosomes is a general feature in female cancer genomes. In summary, this study sheds light on the functional consequences and evolutionary characteristics of somatic mutations during tumorigenesis by propelling adaptive cancer genome evolution, which would provide new perspectives for cancer research and therapeutics. PMID:26352260

  5. Evolution and population genomics of the Lyme borreliosis pathogen, Borrelia burgdorferi.

    PubMed

    Seifert, Stephanie N; Khatchikian, Camilo E; Zhou, Wei; Brisson, Dustin

    2015-04-01

    Population genomic studies have the potential to address many unresolved questions about microbial pathogens by facilitating the identification of genes underlying ecologically important traits, such as novel virulence factors and adaptations to humans or other host species. Additionally, this framework improves estimations of population demography and evolutionary history to accurately reconstruct recent epidemics and identify the molecular and environmental factors that resulted in the outbreak. The Lyme disease bacterium, Borrelia burgdorferi, exemplifies the power and promise of the application of population genomics to microbial pathogens. We discuss here the future of evolutionary studies in B. burgdorferi, focusing on the primary evolutionary forces of horizontal gene transfer, natural selection, and migration, as investigations transition from analyses of single genes to genomes. Copyright © 2015 Elsevier Ltd. All rights reserved.

  6. Evolutionary Perspectives on Diversity of Lignocellulose Decay Mechanisms in Basidionycetes (JGI Seventh Annual User Meeting 2012: Genomics of Energy and Environment)

    ScienceCinema

    Hibbett, David

    2018-05-18

    David Hibbett from Clark University on "Evolutionary Perspectives on Diversity of Lignocellulose Decay Mechanisms in Basidiomycetes" at the 7th Annual Genomics of Energy & Environment Meeting on March 21, 2012 in Walnut Creek, California.

  7. Comparative genomic analysis of retrogene repertoire in two green algae Volvox carteri and Chlamydomonas reinhardtii.

    PubMed

    Jąkalski, Marcin; Takeshita, Kazutaka; Deblieck, Mathieu; Koyanagi, Kanako O; Makałowska, Izabela; Watanabe, Hidemi; Makałowski, Wojciech

    2016-08-04

    Retroposition, one of the processes of copying the genetic material, is an important RNA-mediated mechanism leading to the emergence of new genes. Because the transcription controlling segments are usually not copied to the new location in this mechanism, the duplicated gene copies (retrocopies) become pseudogenized. However, few can still survive, e.g. by recruiting novel regulatory elements from the region of insertion. Subsequently, these duplicated genes can contribute to the formation of lineage-specific traits and phenotypic diversity. Despite the numerous studies of the functional retrocopies (retrogenes) in animals and plants, very little is known about their presence in green algae, including morphologically diverse species. The current availability of the genomes of both uni- and multicellular algae provides a good opportunity to conduct a genome-wide investigation in order to fill the knowledge gap in retroposition phenomenon in this lineage. Here we present a comparative genomic analysis of uni- and multicellular algae, Chlamydomonas reinhardtii and Volvox carteri, respectively, to explore their retrogene complements. By adopting a computational approach, we identified 141 retrogene candidates in total in both genomes, with their fraction being significantly higher in the multicellular Volvox. Majority of the retrogene candidates showed signatures of functional constraints, thus indicating their functionality. Detailed analyses of the identified retrogene candidates, their parental genes, and homologs of both, revealed that most of the retrogene candidates were derived from ancient retroposition events in the common ancestor of the two algae and that the parental genes were subsequently lost from the respective lineages, making many retrogenes 'orphan'. We revealed that the genomes of the green algae have maintained many possibly functional retrogenes in spite of experiencing various molecular evolutionary events during a long evolutionary time after the retroposition events. Our first report about the retrogene set in the green algae provides a good foundation for any future investigation of the repertoire of retrogenes and facilitates the assessment of the evolutionary impact of retroposition on diverse morphological traits in this lineage. This article was reviewed by William Martin and Piotr Zielenkiewicz.

  8. Multi-Platform Next-Generation Sequencing of the Domestic Turkey (Meleagris gallopavo): Genome Assembly and Analysis

    PubMed Central

    Aslam, Luqman; Beal, Kathryn; Ann Blomberg, Le; Bouffard, Pascal; Burt, David W.; Crasta, Oswald; Crooijmans, Richard P. M. A.; Cooper, Kristal; Coulombe, Roger A.; De, Supriyo; Delany, Mary E.; Dodgson, Jerry B.; Dong, Jennifer J.; Evans, Clive; Frederickson, Karin M.; Flicek, Paul; Florea, Liliana; Folkerts, Otto; Groenen, Martien A. M.; Harkins, Tim T.; Herrero, Javier; Hoffmann, Steve; Megens, Hendrik-Jan; Jiang, Andrew; de Jong, Pieter; Kaiser, Pete; Kim, Heebal; Kim, Kyu-Won; Kim, Sungwon; Langenberger, David; Lee, Mi-Kyung; Lee, Taeheon; Mane, Shrinivasrao; Marcais, Guillaume; Marz, Manja; McElroy, Audrey P.; Modise, Thero; Nefedov, Mikhail; Notredame, Cédric; Paton, Ian R.; Payne, William S.; Pertea, Geo; Prickett, Dennis; Puiu, Daniela; Qioa, Dan; Raineri, Emanuele; Ruffier, Magali; Salzberg, Steven L.; Schatz, Michael C.; Scheuring, Chantel; Schmidt, Carl J.; Schroeder, Steven; Searle, Stephen M. J.; Smith, Edward J.; Smith, Jacqueline; Sonstegard, Tad S.; Stadler, Peter F.; Tafer, Hakim; Tu, Zhijian (Jake); Van Tassell, Curtis P.; Vilella, Albert J.; Williams, Kelly P.; Yorke, James A.; Zhang, Liqing; Zhang, Hong-Bin; Zhang, Xiaojun; Zhang, Yang; Reed, Kent M.

    2010-01-01

    A synergistic combination of two next-generation sequencing platforms with a detailed comparative BAC physical contig map provided a cost-effective assembly of the genome sequence of the domestic turkey (Meleagris gallopavo). Heterozygosity of the sequenced source genome allowed discovery of more than 600,000 high quality single nucleotide variants. Despite this heterozygosity, the current genome assembly (∼1.1 Gb) includes 917 Mb of sequence assigned to specific turkey chromosomes. Annotation identified nearly 16,000 genes, with 15,093 recognized as protein coding and 611 as non-coding RNA genes. Comparative analysis of the turkey, chicken, and zebra finch genomes, and comparing avian to mammalian species, supports the characteristic stability of avian genomes and identifies genes unique to the avian lineage. Clear differences are seen in number and variety of genes of the avian immune system where expansions and novel genes are less frequent than examples of gene loss. The turkey genome sequence provides resources to further understand the evolution of vertebrate genomes and genetic variation underlying economically important quantitative traits in poultry. This integrated approach may be a model for providing both gene and chromosome level assemblies of other species with agricultural, ecological, and evolutionary interest. PMID:20838655

  9. The population genomics of rhesus macaques (Macaca mulatta) based on whole-genome sequences

    PubMed Central

    Xue, Cheng; Raveendran, Muthuswamy; Harris, R. Alan; Fawcett, Gloria L.; Liu, Xiaoming; White, Simon; Dahdouli, Mahmoud; Rio Deiros, David; Below, Jennifer E.; Salerno, William; Cox, Laura; Fan, Guoping; Ferguson, Betsy; Horvath, Julie; Johnson, Zach; Kanthaswamy, Sree; Kubisch, H. Michael; Liu, Dahai; Platt, Michael; Smith, David G.; Sun, Binghua; Vallender, Eric J.; Wang, Feng; Wiseman, Roger W.; Chen, Rui; Muzny, Donna M.; Gibbs, Richard A.; Yu, Fuli; Rogers, Jeffrey

    2016-01-01

    Rhesus macaques (Macaca mulatta) are the most widely used nonhuman primate in biomedical research, have the largest natural geographic distribution of any nonhuman primate, and have been the focus of much evolutionary and behavioral investigation. Consequently, rhesus macaques are one of the most thoroughly studied nonhuman primate species. However, little is known about genome-wide genetic variation in this species. A detailed understanding of extant genomic variation among rhesus macaques has implications for the use of this species as a model for studies of human health and disease, as well as for evolutionary population genomics. Whole-genome sequencing analysis of 133 rhesus macaques revealed more than 43.7 million single-nucleotide variants, including thousands predicted to alter protein sequences, transcript splicing, and transcription factor binding sites. Rhesus macaques exhibit 2.5-fold higher overall nucleotide diversity and slightly elevated putative functional variation compared with humans. This functional variation in macaques provides opportunities for analyses of coding and noncoding variation, and its cellular consequences. Despite modestly higher levels of nonsynonymous variation in the macaques, the estimated distribution of fitness effects and the ratio of nonsynonymous to synonymous variants suggest that purifying selection has had stronger effects in rhesus macaques than in humans. Demographic reconstructions indicate this species has experienced a consistently large but fluctuating population size. Overall, the results presented here provide new insights into the population genomics of nonhuman primates and expand genomic information directly relevant to primate models of human disease. PMID:27934697

  10. Comparative and Evolutionary Analysis of Grass Pollen Allergens Using Brachypodium distachyon as a Model System

    PubMed Central

    Sharma, Akanksha; Sharma, Niharika; Bhalla, Prem; Singh, Mohan

    2017-01-01

    Comparative genomics have facilitated the mining of biological information from a genome sequence, through the detection of similarities and differences with genomes of closely or more distantly related species. By using such comparative approaches, knowledge can be transferred from the model to non-model organisms and insights can be gained in the structural and evolutionary patterns of specific genes. In the absence of sequenced genomes for allergenic grasses, this study was aimed at understanding the structure, organisation and expression profiles of grass pollen allergens using the genomic data from Brachypodium distachyon as it is phylogenetically related to the allergenic grasses. Combining genomic data with the anther RNA-Seq dataset revealed 24 pollen allergen genes belonging to eight allergen groups mapping on the five chromosomes in B. distachyon. High levels of anther-specific expression profiles were observed for the 24 identified putative allergen-encoding genes in Brachypodium. The genomic evidence suggests that gene encoding the group 5 allergen, the most potent trigger of hay fever and allergic asthma originated as a pollen specific orphan gene in a common grass ancestor of Brachypodium and Triticiae clades. Gene structure analysis showed that the putative allergen-encoding genes in Brachypodium either lack or contain reduced number of introns. Promoter analysis of the identified Brachypodium genes revealed the presence of specific cis-regulatory sequences likely responsible for high anther/pollen-specific expression. With the identification of putative allergen-encoding genes in Brachypodium, this study has also described some important plant gene families (e.g. expansin superfamily, EF-Hand family, profilins etc) for the first time in the model plant Brachypodium. Altogether, the present study provides new insights into structural characterization and evolution of pollen allergens and will further serve as a base for their functional characterization in related grass species. PMID:28103252

  11. The Variable Regions of Lactobacillus rhamnosus Genomes Reveal the Dynamic Evolution of Metabolic and Host-Adaptation Repertoires

    PubMed Central

    Ceapa, Corina; Davids, Mark; Ritari, Jarmo; Lambert, Jolanda; Wels, Michiel; Douillard, François P.; Smokvina, Tamara; de Vos, Willem M.; Knol, Jan; Kleerebezem, Michiel

    2016-01-01

    Lactobacillus rhamnosus is a diverse Gram-positive species with strains isolated from different ecological niches. Here, we report the genome sequence analysis of 40 diverse strains of L. rhamnosus and their genomic comparison, with a focus on the variable genome. Genomic comparison of 40 L. rhamnosus strains discriminated the conserved genes (core genome) and regions of plasticity involving frequent rearrangements and horizontal transfer (variome). The L. rhamnosus core genome encompasses 2,164 genes, out of 4,711 genes in total (the pan-genome). The accessory genome is dominated by genes encoding carbohydrate transport and metabolism, extracellular polysaccharides (EPS) biosynthesis, bacteriocin production, pili production, the cas system, and the associated clustered regularly interspaced short palindromic repeat (CRISPR) loci, and more than 100 transporter functions and mobile genetic elements like phages, plasmid genes, and transposons. A clade distribution based on amino acid differences between core (shared) proteins matched with the clade distribution obtained from the presence–absence of variable genes. The phylogenetic and variome tree overlap indicated that frequent events of gene acquisition and loss dominated the evolutionary segregation of the strains within this species, which is paralleled by evolutionary diversification of core gene functions. The CRISPR-Cas system could have contributed to this evolutionary segregation. Lactobacillus rhamnosus strains contain the genetic and metabolic machinery with strain-specific gene functions required to adapt to a large range of environments. A remarkable congruency of the evolutionary relatedness of the strains’ core and variome functions, possibly favoring interspecies genetic exchanges, underlines the importance of gene-acquisition and loss within the L. rhamnosus strain diversification. PMID:27358423

  12. How evolutionary principles improve the understanding of human health and disease.

    PubMed

    Gluckman, Peter D; Low, Felicia M; Buklijas, Tatjana; Hanson, Mark A; Beedle, Alan S

    2011-03-01

    An appreciation of the fundamental principles of evolutionary biology provides new insights into major diseases and enables an integrated understanding of human biology and medicine. However, there is a lack of awareness of their importance amongst physicians, medical researchers, and educators, all of whom tend to focus on the mechanistic (proximate) basis for disease, excluding consideration of evolutionary (ultimate) reasons. The key principles of evolutionary medicine are that selection acts on fitness, not health or longevity; that our evolutionary history does not cause disease, but rather impacts on our risk of disease in particular environments; and that we are now living in novel environments compared to those in which we evolved. We consider these evolutionary principles in conjunction with population genetics and describe several pathways by which evolutionary processes can affect disease risk. These perspectives provide a more cohesive framework for gaining insights into the determinants of health and disease. Coupled with complementary insights offered by advances in genomic, epigenetic, and developmental biology research, evolutionary perspectives offer an important addition to understanding disease. Further, there are a number of aspects of evolutionary medicine that can add considerably to studies in other domains of contemporary evolutionary studies.

  13. How evolutionary principles improve the understanding of human health and disease

    PubMed Central

    Gluckman, Peter D; Low, Felicia M; Buklijas, Tatjana; Hanson, Mark A; Beedle, Alan S

    2011-01-01

    An appreciation of the fundamental principles of evolutionary biology provides new insights into major diseases and enables an integrated understanding of human biology and medicine. However, there is a lack of awareness of their importance amongst physicians, medical researchers, and educators, all of whom tend to focus on the mechanistic (proximate) basis for disease, excluding consideration of evolutionary (ultimate) reasons. The key principles of evolutionary medicine are that selection acts on fitness, not health or longevity; that our evolutionary history does not cause disease, but rather impacts on our risk of disease in particular environments; and that we are now living in novel environments compared to those in which we evolved. We consider these evolutionary principles in conjunction with population genetics and describe several pathways by which evolutionary processes can affect disease risk. These perspectives provide a more cohesive framework for gaining insights into the determinants of health and disease. Coupled with complementary insights offered by advances in genomic, epigenetic, and developmental biology research, evolutionary perspectives offer an important addition to understanding disease. Further, there are a number of aspects of evolutionary medicine that can add considerably to studies in other domains of contemporary evolutionary studies. PMID:25567971

  14. Characterization of the rainbow trout transcriptome using Sanger and 454-Pyrosequencing approaches

    USDA-ARS?s Scientific Manuscript database

    BACKGROUND: Rainbow trout is an important fish species for aquaculture and a model species for research investigations associated with carcinogenesis, comparative immunology, toxicology and the evolutionary biology. However, to date there is no genome reference sequence to facilitate the development...

  15. Characterization of the rainbow trout transcriptome using Sanger and 454-pyrosequencing approaches

    USDA-ARS?s Scientific Manuscript database

    Background: Rainbow trout is an important fish for aquaculture and recreational fisheries and serves as a model species for research investigations associated with carcinogenesis, comparative immunology, toxicology and the evolutionary biology. However, to date there is no genome reference sequence...

  16. EDGAR: A software framework for the comparative analysis of prokaryotic genomes

    PubMed Central

    Blom, Jochen; Albaum, Stefan P; Doppmeier, Daniel; Pühler, Alfred; Vorhölter, Frank-Jörg; Zakrzewski, Martha; Goesmann, Alexander

    2009-01-01

    Background The introduction of next generation sequencing approaches has caused a rapid increase in the number of completely sequenced genomes. As one result of this development, it is now feasible to analyze large groups of related genomes in a comparative approach. A main task in comparative genomics is the identification of orthologous genes in different genomes and the classification of genes as core genes or singletons. Results To support these studies EDGAR – "Efficient Database framework for comparative Genome Analyses using BLAST score Ratios" – was developed. EDGAR is designed to automatically perform genome comparisons in a high throughput approach. Comparative analyses for 582 genomes across 75 genus groups taken from the NCBI genomes database were conducted with the software and the results were integrated into an underlying database. To demonstrate a specific application case, we analyzed ten genomes of the bacterial genus Xanthomonas, for which phylogenetic studies were awkward due to divergent taxonomic systems. The resultant phylogeny EDGAR provided was consistent with outcomes from traditional approaches performed recently and moreover, it was possible to root each strain with unprecedented accuracy. Conclusion EDGAR provides novel analysis features and significantly simplifies the comparative analysis of related genomes. The software supports a quick survey of evolutionary relationships and simplifies the process of obtaining new biological insights into the differential gene content of kindred genomes. Visualization features, like synteny plots or Venn diagrams, are offered to the scientific community through a web-based and therefore platform independent user interface , where the precomputed data sets can be browsed. PMID:19457249

  17. Joint assembly and genetic mapping of the Atlantic horseshoe crab genome reveals ancient whole genome duplication

    PubMed Central

    2014-01-01

    Background Horseshoe crabs are marine arthropods with a fossil record extending back approximately 450 million years. They exhibit remarkable morphological stability over their long evolutionary history, retaining a number of ancestral arthropod traits, and are often cited as examples of “living fossils.” As arthropods, they belong to the Ecdysozoa, an ancient super-phylum whose sequenced genomes (including insects and nematodes) have thus far shown more divergence from the ancestral pattern of eumetazoan genome organization than cnidarians, deuterostomes and lophotrochozoans. However, much of ecdysozoan diversity remains unrepresented in comparative genomic analyses. Results Here we apply a new strategy of combined de novo assembly and genetic mapping to examine the chromosome-scale genome organization of the Atlantic horseshoe crab, Limulus polyphemus. We constructed a genetic linkage map of this 2.7 Gbp genome by sequencing the nuclear DNA of 34 wild-collected, full-sibling embryos and their parents at a mean redundancy of 1.1x per sample. The map includes 84,307 sequence markers grouped into 1,876 distinct genetic intervals and 5,775 candidate conserved protein coding genes. Conclusions Comparison with other metazoan genomes shows that the L. polyphemus genome preserves ancestral bilaterian linkage groups, and that a common ancestor of modern horseshoe crabs underwent one or more ancient whole genome duplications 300 million years ago, followed by extensive chromosome fusion. These results provide a counter-example to the often noted correlation between whole genome duplication and evolutionary radiations. The new, low-cost genetic mapping method for obtaining a chromosome-scale view of non-model organism genomes that we demonstrate here does not require laboratory culture, and is potentially applicable to a broad range of other species. PMID:24987520

  18. Curated genome annotation of Oryza sativa ssp. japonica and comparative genome analysis with Arabidopsis thaliana

    PubMed Central

    Itoh, Takeshi; Tanaka, Tsuyoshi; Barrero, Roberto A.; Yamasaki, Chisato; Fujii, Yasuyuki; Hilton, Phillip B.; Antonio, Baltazar A.; Aono, Hideo; Apweiler, Rolf; Bruskiewich, Richard; Bureau, Thomas; Burr, Frances; Costa de Oliveira, Antonio; Fuks, Galina; Habara, Takuya; Haberer, Georg; Han, Bin; Harada, Erimi; Hiraki, Aiko T.; Hirochika, Hirohiko; Hoen, Douglas; Hokari, Hiroki; Hosokawa, Satomi; Hsing, Yue; Ikawa, Hiroshi; Ikeo, Kazuho; Imanishi, Tadashi; Ito, Yukiyo; Jaiswal, Pankaj; Kanno, Masako; Kawahara, Yoshihiro; Kawamura, Toshiyuki; Kawashima, Hiroaki; Khurana, Jitendra P.; Kikuchi, Shoshi; Komatsu, Setsuko; Koyanagi, Kanako O.; Kubooka, Hiromi; Lieberherr, Damien; Lin, Yao-Cheng; Lonsdale, David; Matsumoto, Takashi; Matsuya, Akihiro; McCombie, W. Richard; Messing, Joachim; Miyao, Akio; Mulder, Nicola; Nagamura, Yoshiaki; Nam, Jongmin; Namiki, Nobukazu; Numa, Hisataka; Nurimoto, Shin; O’Donovan, Claire; Ohyanagi, Hajime; Okido, Toshihisa; OOta, Satoshi; Osato, Naoki; Palmer, Lance E.; Quetier, Francis; Raghuvanshi, Saurabh; Saichi, Naomi; Sakai, Hiroaki; Sakai, Yasumichi; Sakata, Katsumi; Sakurai, Tetsuya; Sato, Fumihiko; Sato, Yoshiharu; Schoof, Heiko; Seki, Motoaki; Shibata, Michie; Shimizu, Yuji; Shinozaki, Kazuo; Shinso, Yuji; Singh, Nagendra K.; Smith-White, Brian; Takeda, Jun-ichi; Tanino, Motohiko; Tatusova, Tatiana; Thongjuea, Supat; Todokoro, Fusano; Tsugane, Mika; Tyagi, Akhilesh K.; Vanavichit, Apichart; Wang, Aihui; Wing, Rod A.; Yamaguchi, Kaori; Yamamoto, Mayu; Yamamoto, Naoyuki; Yu, Yeisoo; Zhang, Hao; Zhao, Qiang; Higo, Kenichi; Burr, Benjamin; Gojobori, Takashi; Sasaki, Takuji

    2007-01-01

    We present here the annotation of the complete genome of rice Oryza sativa L. ssp. japonica cultivar Nipponbare. All functional annotations for proteins and non-protein-coding RNA (npRNA) candidates were manually curated. Functions were identified or inferred in 19,969 (70%) of the proteins, and 131 possible npRNAs (including 58 antisense transcripts) were found. Almost 5000 annotated protein-coding genes were found to be disrupted in insertional mutant lines, which will accelerate future experimental validation of the annotations. The rice loci were determined by using cDNA sequences obtained from rice and other representative cereals. Our conservative estimate based on these loci and an extrapolation suggested that the gene number of rice is ∼32,000, which is smaller than previous estimates. We conducted comparative analyses between rice and Arabidopsis thaliana and found that both genomes possessed several lineage-specific genes, which might account for the observed differences between these species, while they had similar sets of predicted functional domains among the protein sequences. A system to control translational efficiency seems to be conserved across large evolutionary distances. Moreover, the evolutionary process of protein-coding genes was examined. Our results suggest that natural selection may have played a role for duplicated genes in both species, so that duplication was suppressed or favored in a manner that depended on the function of a gene. PMID:17210932

  19. Evolutionary and biotechnology implications of plastid genome variation in the inverted-repeat-lacking clade of legumes.

    PubMed

    Sabir, Jamal; Schwarz, Erika; Ellison, Nicholas; Zhang, Jin; Baeshen, Nabih A; Mutwakil, Muhammed; Jansen, Robert; Ruhlman, Tracey

    2014-08-01

    Land plant plastid genomes (plastomes) provide a tractable model for evolutionary study in that they are relatively compact and gene dense. Among the groups that display an appropriate level of variation for structural features, the inverted-repeat-lacking clade (IRLC) of papilionoid legumes presents the potential to advance general understanding of the mechanisms of genomic evolution. Here, are presented six complete plastome sequences from economically important species of the IRLC, a lineage previously represented by only five completed plastomes. A number of characters are compared across the IRLC including gene retention and divergence, synteny, repeat structure and functional gene transfer to the nucleus. The loss of clpP intron 2 was identified in one newly sequenced member of IRLC, Glycyrrhiza glabra. Using deeply sequenced nuclear transcriptomes from two species helped clarify the nature of the functional transfer of accD to the nucleus in Trifolium, which likely occurred in the lineage leading to subgenus Trifolium. Legumes are second only to cereal crops in agricultural importance based on area harvested and total production. Genetic improvement via plastid transformation of IRLC crop species is an appealing proposition. Comparative analyses of intergenic spacer regions emphasize the need for complete genome sequences for developing transformation vectors for plastid genetic engineering of legume crops. © 2014 Society for Experimental Biology, Association of Applied Biologists and John Wiley & Sons Ltd.

  20. Gleaning evolutionary insights from the genome sequence of a probiotic yeast Saccharomyces boulardii

    PubMed Central

    2013-01-01

    Background The yeast Saccharomyces boulardii is used worldwide as a probiotic to alleviate the effects of several gastrointestinal diseases and control antibiotics-associated diarrhea. While many studies report the probiotic effects of S. boulardii, no genome information for this yeast is currently available in the public domain. Results We report the 11.4 Mbp draft genome of this probiotic yeast. The draft genome was obtained by assembling Roche 454 FLX + shotgun data into 194 contigs with an N50 of 251 Kbp. We compare our draft genome with all other Saccharomyces cerevisiae genomes. Conclusions Our analysis confirms the close similarity of S. boulardii to S. cerevisiae strains and provides a framework to understand the probiotic effects of this yeast, which exhibits unique physiological and metabolic properties. PMID:24148866

  1. Gleaning evolutionary insights from the genome sequence of a probiotic yeast Saccharomyces boulardii.

    PubMed

    Khatri, Indu; Akhtar, Akil; Kaur, Kamaldeep; Tomar, Rajul; Prasad, Gandham Satyanarayana; Ramya, Thirumalai Nallan Chakravarthy; Subramanian, Srikrishna

    2013-10-22

    The yeast Saccharomyces boulardii is used worldwide as a probiotic to alleviate the effects of several gastrointestinal diseases and control antibiotics-associated diarrhea. While many studies report the probiotic effects of S. boulardii, no genome information for this yeast is currently available in the public domain. We report the 11.4 Mbp draft genome of this probiotic yeast. The draft genome was obtained by assembling Roche 454 FLX + shotgun data into 194 contigs with an N50 of 251 Kbp. We compare our draft genome with all other Saccharomyces cerevisiae genomes. Our analysis confirms the close similarity of S. boulardii to S. cerevisiae strains and provides a framework to understand the probiotic effects of this yeast, which exhibits unique physiological and metabolic properties.

  2. Evolutionary dynamics of protein domain architecture in plants

    PubMed Central

    2012-01-01

    Background Protein domains are the structural, functional and evolutionary units of the protein. Protein domain architectures are the linear arrangements of domain(s) in individual proteins. Although the evolutionary history of protein domain architecture has been extensively studied in microorganisms, the evolutionary dynamics of domain architecture in the plant kingdom remains largely undefined. To address this question, we analyzed the lineage-based protein domain architecture content in 14 completed green plant genomes. Results Our analyses show that all 14 plant genomes maintain similar distributions of species-specific, single-domain, and multi-domain architectures. Approximately 65% of plant domain architectures are universally present in all plant lineages, while the remaining architectures are lineage-specific. Clear examples are seen of both the loss and gain of specific protein architectures in higher plants. There has been a dynamic, lineage-wise expansion of domain architectures during plant evolution. The data suggest that this expansion can be largely explained by changes in nuclear ploidy resulting from rounds of whole genome duplications. Indeed, there has been a decrease in the number of unique domain architectures when the genomes were normalized into a presumed ancestral genome that has not undergone whole genome duplications. Conclusions Our data show the conservation of universal domain architectures in all available plant genomes, indicating the presence of an evolutionarily conserved, core set of protein components. However, the occurrence of lineage-specific domain architectures indicates that domain architecture diversity has been maintained beyond these core components in plant genomes. Although several features of genome-wide domain architecture content are conserved in plants, the data clearly demonstrate lineage-wise, progressive changes and expansions of individual protein domain architectures, reinforcing the notion that plant genomes have undergone dynamic evolution. PMID:22252370

  3. Comparative genomics of enterohemorrhagic Escherichia coli O145:H28 demonstrates a common evolutionary lineage with Escherichia coli O157:H7

    PubMed Central

    2014-01-01

    Background Although serotype O157:H7 is the predominant enterohemorrhagic Escherichia coli (EHEC), outbreaks of non-O157 EHEC that cause severe foodborne illness, including hemolytic uremic syndrome have increased worldwide. In fact, non-O157 serotypes are now estimated to cause over half of all the Shiga toxin-producing Escherichia coli (STEC) cases, and outbreaks of non-O157 EHEC infections are frequently associated with serotypes O26, O45, O103, O111, O121, and O145. Currently, there are no complete genomes for O145 in public databases. Results We determined the complete genome sequences of two O145 strains (EcO145), one linked to a US lettuce-associated outbreak (RM13514) and one to a Belgium ice-cream-associated outbreak (RM13516). Both strains contain one chromosome and two large plasmids, with genome sizes of 5,737,294 bp for RM13514 and 5,559,008 bp for RM13516. Comparative analysis of the two EcO145 genomes revealed a large core (5,173 genes) and a considerable amount of strain-specific genes. Additionally, the two EcO145 genomes display distinct chromosomal architecture, virulence gene profile, phylogenetic origin of Stx2a prophage, and methylation profile (methylome). Comparative analysis of EcO145 genomes to other completely sequenced STEC and other E. coli and Shigella genomes revealed that, unlike any other known non-O157 EHEC strain, EcO145 ascended from a common lineage with EcO157/EcO55. This evolutionary relationship was further supported by the pangenome analysis of the 10 EHEC str ains. Of the 4,192 EHEC core genes, EcO145 shares more genes with EcO157 than with the any other non-O157 EHEC strains. Conclusions Our data provide evidence that EcO145 and EcO157 evolved from a common lineage, but ultimately each serotype evolves via a lineage-independent nature to EHEC by acquisition of the core set of EHEC virulence factors, including the genes encoding Shiga toxin and the large virulence plasmid. The large variation between the two EcO145 genomes suggests a distinctive evolutionary path between the two outbreak strains. The distinct methylome between the two EcO145 strains is likely due to the presence of a BsuBI/PstI methyltransferase gene cassette in the Stx2a prophage of the strain RM13514, suggesting a role of horizontal gene transfer-mediated epigenetic alteration in the evolution of individual EHEC strains. PMID:24410921

  4. Phylogeny and comparative genome analysis of a Basidiomycete fungi

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Riley, Robert W.; Salamov, Asaf; Grigoriev, Igor

    2011-03-14

    Fungi of the phylum Basidiomycota, make up some 37percent of the described fungi, and are important from the perspectives of forestry, agriculture, medicine, and bioenergy. This diverse phylum includes the mushrooms, wood rots, plant pathogenic rusts and smuts, and some human pathogens. To better understand these important fungi, we have undertaken a comparative genomic analysis of the Basidiomycetes with available sequenced genomes. We report a phylogeny that sheds light on previously unclear evolutionary relationships among the Basidiomycetes. We also define a `core proteome? based on protein families conserved in all Basidiomycetes. We identify key expansions and contractions in protein familiesmore » that may be responsible for the degradation of plant biomass such as cellulose, hemicellulose, and lignin. Finally, we speculate as to the genomic changes that drove such expansions and contractions.« less

  5. Y-chromosome evolution: emerging insights into processes of Y-chromosome degeneration.

    PubMed

    Bachtrog, Doris

    2013-02-01

    The human Y chromosome is intriguing not only because it harbours the master-switch gene that determines gender but also because of its unusual evolutionary history. The Y chromosome evolved from an autosome, and its evolution has been characterized by massive gene decay. Recent whole-genome and transcriptome analyses of Y chromosomes in humans and other primates, in Drosophila species and in plants have shed light on the current gene content of the Y chromosome, its origins and its long-term fate. Furthermore, comparative analysis of young and old Y chromosomes has given further insights into the evolutionary and molecular forces triggering Y-chromosome degeneration and into the evolutionary destiny of the Y chromosome.

  6. Investigation of the Evolutionary Development of the Genus Bifidobacterium by Comparative Genomics

    PubMed Central

    Lugli, Gabriele Andrea; Milani, Christian; Turroni, Francesca; Duranti, Sabrina; Ferrario, Chiara; Viappiani, Alice; Mancabelli, Leonardo; Mangifesta, Marta; Taminiau, Bernard; Delcenserie, Véronique; van Sinderen, Douwe

    2014-01-01

    The Bifidobacterium genus currently encompasses 48 recognized taxa, which have been isolated from different ecosystems. However, the current phylogeny of bifidobacteria is hampered by the relative paucity of genotypic data. Here, we reassessed the taxonomy of this bacterial genus using genome-based approaches, which demonstrated that the previous taxonomic view of bifidobacteria contained several inconsistencies. In particular, high levels of genetic relatedness were shown to exist between particular Bifidobacterium taxa which would not justify their status as separate species. The results presented are here based on average nucleotide identity analysis involving the genome sequences for each type strain of the 48 bifidobacterial taxa, as well as phylogenetic comparative analysis of the predicted core genome of the Bifidobacterium genus. The results of this study demonstrate that the availability of complete genome sequences allows the reconstruction of a more robust bifidobacterial phylogeny than that obtained from a single gene-based sequence comparison, thus discouraging the assignment of a new or separate bifidobacterial taxon without such a genome-based validation. PMID:25107967

  7. Comparative Study of Lectin Domains in Model Species: New Insights into Evolutionary Dynamics

    PubMed Central

    Van Holle, Sofie; De Schutter, Kristof; Eggermont, Lore; Tsaneva, Mariya; Dang, Liuyi; Van Damme, Els J. M.

    2017-01-01

    Lectins are present throughout the plant kingdom and are reported to be involved in diverse biological processes. In this study, we provide a comparative analysis of the lectin families from model species in a phylogenetic framework. The analysis focuses on the different plant lectin domains identified in five representative core angiosperm genomes (Arabidopsis thaliana, Glycine max, Cucumis sativus, Oryza sativa ssp. japonica and Oryza sativa ssp. indica). The genomes were screened for genes encoding lectin domains using a combination of Basic Local Alignment Search Tool (BLAST), hidden Markov models, and InterProScan analysis. Additionally, phylogenetic relationships were investigated by constructing maximum likelihood phylogenetic trees. The results demonstrate that the majority of the lectin families are present in each of the species under study. Domain organization analysis showed that most identified proteins are multi-domain proteins, owing to the modular rearrangement of protein domains during evolution. Most of these multi-domain proteins are widespread, while others display a lineage-specific distribution. Furthermore, the phylogenetic analyses reveal that some lectin families evolved to be similar to the phylogeny of the plant species, while others share a closer evolutionary history based on the corresponding protein domain architecture. Our results yield insights into the evolutionary relationships and functional divergence of plant lectins. PMID:28587095

  8. The Genetic Architecture Underlying the Evolution of a Rare Piscivorous Life History Form in Brown Trout after Secondary Contact and Strong Introgression.

    PubMed

    Jacobs, Arne; Hughes, Martin R; Robinson, Paige C; Adams, Colin E; Elmer, Kathryn R

    2018-05-31

    Identifying the genetic basis underlying phenotypic divergence and reproductive isolation is a longstanding problem in evolutionary biology. Genetic signals of adaptation and reproductive isolation are often confounded by a wide range of factors, such as variation in demographic history or genomic features. Brown trout ( Salmo trutta ) in the Loch Maree catchment, Scotland, exhibit reproductively isolated divergent life history morphs, including a rare piscivorous (ferox) life history form displaying larger body size, greater longevity and delayed maturation compared to sympatric benthivorous brown trout. Using a dataset of 16,066 SNPs, we analyzed the evolutionary history and genetic architecture underlying this divergence. We found that ferox trout and benthivorous brown trout most likely evolved after recent secondary contact of two distinct glacial lineages, and identified 33 genomic outlier windows across the genome, of which several have most likely formed through selection. We further identified twelve candidate genes and biological pathways related to growth, development and immune response potentially underpinning the observed phenotypic differences. The identification of clear genomic signals divergent between life history phenotypes and potentially linked to reproductive isolation, through size assortative mating, as well as the identification of the underlying demographic history, highlights the power of genomic studies of young species pairs for understanding the factors shaping genetic differentiation.

  9. Reconstructing the complex evolutionary history of mobile plasmids in red algal genomes

    PubMed Central

    Lee, JunMo; Kim, Kyeong Mi; Yang, Eun Chan; Miller, Kathy Ann; Boo, Sung Min; Bhattacharya, Debashish; Yoon, Hwan Su

    2016-01-01

    The integration of foreign DNA into algal and plant plastid genomes is a rare event, with only a few known examples of horizontal gene transfer (HGT). Plasmids, which are well-studied drivers of HGT in prokaryotes, have been reported previously in red algae (Rhodophyta). However, the distribution of these mobile DNA elements and their sites of integration into the plastid (ptDNA), mitochondrial (mtDNA), and nuclear genomes of Rhodophyta remain unknown. Here we reconstructed the complex evolutionary history of plasmid-derived DNAs in red algae. Comparative analysis of 21 rhodophyte ptDNAs, including new genome data for 5 species, turned up 22 plasmid-derived open reading frames (ORFs) that showed syntenic and copy number variation among species, but were conserved within different individuals in three lineages. Several plasmid-derived homologs were found not only in ptDNA but also in mtDNA and in the nuclear genome of green plants, stramenopiles, and rhizarians. Phylogenetic and plasmid-derived ORF analyses showed that the majority of plasmid DNAs originated within red algae, whereas others were derived from cyanobacteria, other bacteria, and viruses. Our results elucidate the evolution of plasmid DNAs in red algae and suggest that they spread as parasitic genetic elements. This hypothesis is consistent with their sporadic distribution within Rhodophyta. PMID:27030297

  10. Evolutionary characterization of Ty3/gypsy-like LTR retrotransposons in the parasitic cestode Echinococcus granulosus.

    PubMed

    Bae, Young-An

    2016-11-01

    Cyclophyllidean cestodes including Echinococcus granulosus have a smaller genome and show characteristics such as loss of the gut, a segmented body plan, and accelerated growth rate in hosts compared with other tissue-invading helminths. In an effort to address the molecular mechanism relevant to genome shrinkage, the evolutionary status of long-terminal-repeat (LTR) retrotransposons, which are known as the most potent genomic modulators, was investigated in the E. granulosus draft genome. A majority of the E. granulosus LTR retrotransposons were classified into a novel characteristic clade, named Saci-2, of the Ty3/gypsy family, while the remaining elements belonged to the CsRn1 clade of identical family. Their nucleotide sequences were heavily corrupted by frequent base substitutions and segmental losses. The ceased mobile activity of the major retrotransposons and the following intrinsic DNA loss in their inactive progenies might have contributed to decrease in genome size. Apart from the degenerate copies, a gag gene originating from a CsRn1-like element exhibited substantial evidences suggesting its domestication including a preserved coding profile and transcriptional activity, the presence of syntenic orthologues in cestodes, and selective pressure acting on the gene. To my knowledge, the endogenized gag gene is reported for the first time in invertebrates, though its biological function remains elusive.

  11. Draft genome of the medaka fish: a comprehensive resource for medaka developmental genetics and vertebrate evolutionary biology.

    PubMed

    Takeda, Hiroyuki

    2008-06-01

    The medaka Oryzias latipes is a small egg-laying freshwater teleost, and has become an excellent model system for developmental genetics and evolutionary biology. The medaka genome is relatively small in size, approximately 800 Mb, and the genome sequencing project was recently completed by Japanese research groups, providing a high-quality draft genome sequence of the inbred Hd-rR strain of medaka. In this review, I present an overview of the medaka genome project including genome resources, followed by specific findings obtained with the medaka draft genome. In particular, I focus on the analysis that was done by taking advantage of the medaka system, such as the sex chromosome differentiation and the regional history of medaka species using single nucleotide polymorphisms as genomic markers.

  12. Rapid sequencing of the bamboo mitochondrial genome using Illumina technology and parallel episodic evolution of organelle genomes in grasses.

    PubMed

    Ma, Peng-Fei; Guo, Zhen-Hua; Li, De-Zhu

    2012-01-01

    Compared to their counterparts in animals, the mitochondrial (mt) genomes of angiosperms exhibit a number of unique features. However, unravelling their evolution is hindered by the few completed genomes, of which are essentially Sanger sequenced. While next-generation sequencing technologies have revolutionized chloroplast genome sequencing, they are just beginning to be applied to angiosperm mt genomes. Chloroplast genomes of grasses (Poaceae) have undergone episodic evolution and the evolutionary rate was suggested to be correlated between chloroplast and mt genomes in Poaceae. It is interesting to investigate whether correlated rate change also occurred in grass mt genomes as expected under lineage effects. A time-calibrated phylogenetic tree is needed to examine rate change. We determined a largely completed mt genome from a bamboo, Ferrocalamus rimosivaginus (Poaceae), through Illumina sequencing of total DNA. With combination of de novo and reference-guided assembly, 39.5-fold coverage Illumina reads were finally assembled into scaffolds totalling 432,839 bp. The assembled genome contains nearly the same genes as the completed mt genomes in Poaceae. For examining evolutionary rate in grass mt genomes, we reconstructed a phylogenetic tree including 22 taxa based on 31 mt genes. The topology of the well-resolved tree was almost identical to that inferred from chloroplast genome with only minor difference. The inconsistency possibly derived from long branch attraction in mtDNA tree. By calculating absolute substitution rates, we found significant rate change (∼4-fold) in mt genome before and after the diversification of Poaceae both in synonymous and nonsynonymous terms. Furthermore, the rate change was correlated with that of chloroplast genomes in grasses. Our result demonstrates that it is a rapid and efficient approach to obtain angiosperm mt genome sequences using Illumina sequencing technology. The parallel episodic evolution of mt and chloroplast genomes in grasses is consistent with lineage effects.

  13. Rapid Sequencing of the Bamboo Mitochondrial Genome Using Illumina Technology and Parallel Episodic Evolution of Organelle Genomes in Grasses

    PubMed Central

    Ma, Peng-Fei; Guo, Zhen-Hua; Li, De-Zhu

    2012-01-01

    Background Compared to their counterparts in animals, the mitochondrial (mt) genomes of angiosperms exhibit a number of unique features. However, unravelling their evolution is hindered by the few completed genomes, of which are essentially Sanger sequenced. While next-generation sequencing technologies have revolutionized chloroplast genome sequencing, they are just beginning to be applied to angiosperm mt genomes. Chloroplast genomes of grasses (Poaceae) have undergone episodic evolution and the evolutionary rate was suggested to be correlated between chloroplast and mt genomes in Poaceae. It is interesting to investigate whether correlated rate change also occurred in grass mt genomes as expected under lineage effects. A time-calibrated phylogenetic tree is needed to examine rate change. Methodology/Principal Findings We determined a largely completed mt genome from a bamboo, Ferrocalamus rimosivaginus (Poaceae), through Illumina sequencing of total DNA. With combination of de novo and reference-guided assembly, 39.5-fold coverage Illumina reads were finally assembled into scaffolds totalling 432,839 bp. The assembled genome contains nearly the same genes as the completed mt genomes in Poaceae. For examining evolutionary rate in grass mt genomes, we reconstructed a phylogenetic tree including 22 taxa based on 31 mt genes. The topology of the well-resolved tree was almost identical to that inferred from chloroplast genome with only minor difference. The inconsistency possibly derived from long branch attraction in mtDNA tree. By calculating absolute substitution rates, we found significant rate change (∼4-fold) in mt genome before and after the diversification of Poaceae both in synonymous and nonsynonymous terms. Furthermore, the rate change was correlated with that of chloroplast genomes in grasses. Conclusions/Significance Our result demonstrates that it is a rapid and efficient approach to obtain angiosperm mt genome sequences using Illumina sequencing technology. The parallel episodic evolution of mt and chloroplast genomes in grasses is consistent with lineage effects. PMID:22272330

  14. Initial sequence and comparative analysis of the cat genome

    PubMed Central

    Pontius, Joan U.; Mullikin, James C.; Smith, Douglas R.; Lindblad-Toh, Kerstin; Gnerre, Sante; Clamp, Michele; Chang, Jean; Stephens, Robert; Neelam, Beena; Volfovsky, Natalia; Schäffer, Alejandro A.; Agarwala, Richa; Narfström, Kristina; Murphy, William J.; Giger, Urs; Roca, Alfred L.; Antunes, Agostinho; Menotti-Raymond, Marilyn; Yuhki, Naoya; Pecon-Slattery, Jill; Johnson, Warren E.; Bourque, Guillaume; Tesler, Glenn; O’Brien, Stephen J.

    2007-01-01

    The genome sequence (1.9-fold coverage) of an inbred Abyssinian domestic cat was assembled, mapped, and annotated with a comparative approach that involved cross-reference to annotated genome assemblies of six mammals (human, chimpanzee, mouse, rat, dog, and cow). The results resolved chromosomal positions for 663,480 contigs, 20,285 putative feline gene orthologs, and 133,499 conserved sequence blocks (CSBs). Additional annotated features include repetitive elements, endogenous retroviral sequences, nuclear mitochondrial (numt) sequences, micro-RNAs, and evolutionary breakpoints that suggest historic balancing of translocation and inversion incidences in distinct mammalian lineages. Large numbers of single nucleotide polymorphisms (SNPs), deletion insertion polymorphisms (DIPs), and short tandem repeats (STRs), suitable for linkage or association studies were characterized in the context of long stretches of chromosome homozygosity. In spite of the light coverage capturing ∼65% of euchromatin sequence from the cat genome, these comparative insights shed new light on the tempo and mode of gene/genome evolution in mammals, promise several research applications for the cat, and also illustrate that a comparative approach using more deeply covered mammals provides an informative, preliminary annotation of a light (1.9-fold) coverage mammal genome sequence. PMID:17975172

  15. Complete Chloroplast Genome of the Multifunctional Crop Globe Artichoke and Comparison with Other Asteraceae

    PubMed Central

    Curci, Pasquale L.; De Paola, Domenico; Danzi, Donatella; Vendramin, Giovanni G.; Sonnante, Gabriella

    2015-01-01

    With over 20,000 species, Asteraceae is the second largest plant family. High-throughput sequencing of nuclear and chloroplast genomes has allowed for a better understanding of the evolutionary relationships within large plant families. Here, the globe artichoke chloroplast (cp) genome was obtained by a combination of whole-genome and BAC clone high-throughput sequencing. The artichoke cp genome is 152,529 bp in length, consisting of two single-copy regions separated by a pair of inverted repeats (IRs) of 25,155 bp, representing the longest IRs found in the Asteraceae family so far. The large (LSC) and the small (SSC) single-copy regions span 83,578 bp and 18,641 bp, respectively. The artichoke cp sequence was compared to the other eight Asteraceae complete cp genomes available, revealing an IR expansion at the SSC/IR boundary. This expansion consists of 17 bp of the ndhF gene generating an overlap between the ndhF and ycf1 genes. A total of 127 cp simple sequence repeats (cpSSRs) were identified in the artichoke cp genome, potentially suitable for future population studies in the Cynara genus. Parsimony-informative regions were evaluated and allowed to place a Cynara species within the Asteraceae family tree. The eight most informative coding regions were also considered and tested for “specific barcode” purpose in the Asteraceae family. Our results highlight the usefulness of cp genome sequencing in exploring plant genome diversity and retrieving reliable molecular resources for phylogenetic and evolutionary studies, as well as for specific barcodes in plants. PMID:25774672

  16. Complete chloroplast genome of the multifunctional crop globe artichoke and comparison with other Asteraceae.

    PubMed

    Curci, Pasquale L; De Paola, Domenico; Danzi, Donatella; Vendramin, Giovanni G; Sonnante, Gabriella

    2015-01-01

    With over 20,000 species, Asteraceae is the second largest plant family. High-throughput sequencing of nuclear and chloroplast genomes has allowed for a better understanding of the evolutionary relationships within large plant families. Here, the globe artichoke chloroplast (cp) genome was obtained by a combination of whole-genome and BAC clone high-throughput sequencing. The artichoke cp genome is 152,529 bp in length, consisting of two single-copy regions separated by a pair of inverted repeats (IRs) of 25,155 bp, representing the longest IRs found in the Asteraceae family so far. The large (LSC) and the small (SSC) single-copy regions span 83,578 bp and 18,641 bp, respectively. The artichoke cp sequence was compared to the other eight Asteraceae complete cp genomes available, revealing an IR expansion at the SSC/IR boundary. This expansion consists of 17 bp of the ndhF gene generating an overlap between the ndhF and ycf1 genes. A total of 127 cp simple sequence repeats (cpSSRs) were identified in the artichoke cp genome, potentially suitable for future population studies in the Cynara genus. Parsimony-informative regions were evaluated and allowed to place a Cynara species within the Asteraceae family tree. The eight most informative coding regions were also considered and tested for "specific barcode" purpose in the Asteraceae family. Our results highlight the usefulness of cp genome sequencing in exploring plant genome diversity and retrieving reliable molecular resources for phylogenetic and evolutionary studies, as well as for specific barcodes in plants.

  17. Low Frequency Variants, Collapsed Based on Biological Knowledge, Uncover Complexity of Population Stratification in 1000 Genomes Project Data

    PubMed Central

    Moore, Carrie B.; Wallace, John R.; Wolfe, Daniel J.; Frase, Alex T.; Pendergrass, Sarah A.; Weiss, Kenneth M.; Ritchie, Marylyn D.

    2013-01-01

    Analyses investigating low frequency variants have the potential for explaining additional genetic heritability of many complex human traits. However, the natural frequencies of rare variation between human populations strongly confound genetic analyses. We have applied a novel collapsing method to identify biological features with low frequency variant burden differences in thirteen populations sequenced by the 1000 Genomes Project. Our flexible collapsing tool utilizes expert biological knowledge from multiple publicly available database sources to direct feature selection. Variants were collapsed according to genetically driven features, such as evolutionary conserved regions, regulatory regions genes, and pathways. We have conducted an extensive comparison of low frequency variant burden differences (MAF<0.03) between populations from 1000 Genomes Project Phase I data. We found that on average 26.87% of gene bins, 35.47% of intergenic bins, 42.85% of pathway bins, 14.86% of ORegAnno regulatory bins, and 5.97% of evolutionary conserved regions show statistically significant differences in low frequency variant burden across populations from the 1000 Genomes Project. The proportion of bins with significant differences in low frequency burden depends on the ancestral similarity of the two populations compared and types of features tested. Even closely related populations had notable differences in low frequency burden, but fewer differences than populations from different continents. Furthermore, conserved or functionally relevant regions had fewer significant differences in low frequency burden than regions under less evolutionary constraint. This degree of low frequency variant differentiation across diverse populations and feature elements highlights the critical importance of considering population stratification in the new era of DNA sequencing and low frequency variant genomic analyses. PMID:24385916

  18. Ancient Origin of the Tryptophan Operon and the Dynamics of Evolutionary Change†

    PubMed Central

    Xie, Gary; Keyhani, Nemat O.; Bonner; Jensen, Roy A.

    2003-01-01

    The seven conserved enzymatic domains required for tryptophan (Trp) biosynthesis are encoded in seven genetic regions that are organized differently (whole-pathway operons, multiple partial-pathway operons, and dispersed genes) in prokaryotes. A comparative bioinformatics evaluation of the conservation and organization of the genes of Trp biosynthesis in prokaryotic operons should serve as an excellent model for assessing the feasibility of predicting the evolutionary histories of genes and operons associated with other biochemical pathways. These comparisons should provide a better understanding of possible explanations for differences in operon organization in different organisms at a genomics level. These analyses may also permit identification of some of the prevailing forces that dictated specific gene rearrangements during the course of evolution. Operons concerned with Trp biosynthesis in prokaryotes have been in a dynamic state of flux. Analysis of closely related organisms among the Bacteria at various phylogenetic nodes reveals many examples of operon scission, gene dispersal, gene fusion, gene scrambling, and gene loss from which the direction of evolutionary events can be deduced. Two milestone evolutionary events have been mapped to the 16S rRNA tree of Bacteria, one splitting the operon in two, and the other rejoining it by gene fusion. The Archaea, though less resolved due to a lesser genome representation, appear to exhibit more gene scrambling than the Bacteria. The trp operon appears to have been an ancient innovation; it was already present in the common ancestor of Bacteria and Archaea. Although the operon has been subjected, even in recent times, to dynamic changes in gene rearrangement, the ancestral gene order can be deduced with confidence. The evolutionary history of the genes of the pathway is discernible in rough outline as a vertical line of descent, with events of lateral gene transfer or paralogy enriching the analysis as interesting features that can be distinguished. As additional genomes are thoroughly analyzed, an increasingly refined resolution of the sequential evolutionary steps is clearly possible. These comparisons suggest that present-day trp operons that possess finely tuned regulatory features are under strong positive selection and are able to resist the disruptive evolutionary events that may be experienced by simpler, poorly regulated operons. PMID:12966138

  19. Ensembl comparative genomics resources.

    PubMed

    Herrero, Javier; Muffato, Matthieu; Beal, Kathryn; Fitzgerald, Stephen; Gordon, Leo; Pignatelli, Miguel; Vilella, Albert J; Searle, Stephen M J; Amode, Ridwan; Brent, Simon; Spooner, William; Kulesha, Eugene; Yates, Andrew; Flicek, Paul

    2016-01-01

    Evolution provides the unifying framework with which to understand biology. The coherent investigation of genic and genomic data often requires comparative genomics analyses based on whole-genome alignments, sets of homologous genes and other relevant datasets in order to evaluate and answer evolutionary-related questions. However, the complexity and computational requirements of producing such data are substantial: this has led to only a small number of reference resources that are used for most comparative analyses. The Ensembl comparative genomics resources are one such reference set that facilitates comprehensive and reproducible analysis of chordate genome data. Ensembl computes pairwise and multiple whole-genome alignments from which large-scale synteny, per-base conservation scores and constrained elements are obtained. Gene alignments are used to define Ensembl Protein Families, GeneTrees and homologies for both protein-coding and non-coding RNA genes. These resources are updated frequently and have a consistent informatics infrastructure and data presentation across all supported species. Specialized web-based visualizations are also available including synteny displays, collapsible gene tree plots, a gene family locator and different alignment views. The Ensembl comparative genomics infrastructure is extensively reused for the analysis of non-vertebrate species by other projects including Ensembl Genomes and Gramene and much of the information here is relevant to these projects. The consistency of the annotation across species and the focus on vertebrates makes Ensembl an ideal system to perform and support vertebrate comparative genomic analyses. We use robust software and pipelines to produce reference comparative data and make it freely available. Database URL: http://www.ensembl.org. © The Author(s) 2016. Published by Oxford University Press.

  20. Ensembl comparative genomics resources

    PubMed Central

    Muffato, Matthieu; Beal, Kathryn; Fitzgerald, Stephen; Gordon, Leo; Pignatelli, Miguel; Vilella, Albert J.; Searle, Stephen M. J.; Amode, Ridwan; Brent, Simon; Spooner, William; Kulesha, Eugene; Yates, Andrew; Flicek, Paul

    2016-01-01

    Evolution provides the unifying framework with which to understand biology. The coherent investigation of genic and genomic data often requires comparative genomics analyses based on whole-genome alignments, sets of homologous genes and other relevant datasets in order to evaluate and answer evolutionary-related questions. However, the complexity and computational requirements of producing such data are substantial: this has led to only a small number of reference resources that are used for most comparative analyses. The Ensembl comparative genomics resources are one such reference set that facilitates comprehensive and reproducible analysis of chordate genome data. Ensembl computes pairwise and multiple whole-genome alignments from which large-scale synteny, per-base conservation scores and constrained elements are obtained. Gene alignments are used to define Ensembl Protein Families, GeneTrees and homologies for both protein-coding and non-coding RNA genes. These resources are updated frequently and have a consistent informatics infrastructure and data presentation across all supported species. Specialized web-based visualizations are also available including synteny displays, collapsible gene tree plots, a gene family locator and different alignment views. The Ensembl comparative genomics infrastructure is extensively reused for the analysis of non-vertebrate species by other projects including Ensembl Genomes and Gramene and much of the information here is relevant to these projects. The consistency of the annotation across species and the focus on vertebrates makes Ensembl an ideal system to perform and support vertebrate comparative genomic analyses. We use robust software and pipelines to produce reference comparative data and make it freely available. Database URL: http://www.ensembl.org. PMID:26896847

  1. Human evolutionary genomics: ethical and interpretive issues.

    PubMed

    Vitti, Joseph J; Cho, Mildred K; Tishkoff, Sarah A; Sabeti, Pardis C

    2012-03-01

    Genome-wide computational studies can now identify targets of natural selection. The unique information about humans these studies reveal, and the media attention they attract, indicate the need for caution and precision in communicating results. This need is exacerbated by ways in which evolutionary and genetic considerations have been misapplied to support discriminatory policies, by persistent misconceptions of these fields and by the social sensitivity surrounding discussions of racial ancestry. We discuss the foundations, accomplishments and future directions of human evolutionary genomics, attending to ways in which the interpretation of good science can go awry, and offer suggestions for researchers to prevent misapplication of their work. Copyright © 2011 Elsevier Ltd. All rights reserved.

  2. The Evolution of the Human Genome

    PubMed Central

    Simonti, Corinne N.; Capra, John A.

    2015-01-01

    Human genomes hold a record of the evolutionary forces that have shaped our species. Advances in DNA sequencing, functional genomics, and population genetic modeling have deepened our understanding of human demographic history, natural selection, and many other long-studied topics. These advances have also revealed many previously underappreciated factors that influence the evolution of the human genome, including functional modifications to DNA and histones, conserved 3D topological chromatin domains, structural variation, and heterogeneous mutation patterns along the genome. Using evolutionary theory as a lens to study these phenomena will lead to significant breakthroughs in understanding what makes us human and why we get sick. PMID:26338498

  3. Persea americana (avocado): bringing ancient flowers to fruit in the genomics era.

    PubMed

    Chanderbali, André S; Albert, Victor A; Ashworth, Vanessa E T M; Clegg, Michael T; Litz, Richard E; Soltis, Douglas E; Soltis, Pamela S

    2008-04-01

    The avocado (Persea americana) is a major crop commodity worldwide. Moreover, avocado, a paleopolyploid, is an evolutionary "outpost" among flowering plants, representing a basal lineage (the magnoliid clade) near the origin of the flowering plants themselves. Following centuries of selective breeding, avocado germplasm has been characterized at the level of microsatellite and RFLP markers. Nonetheless, little is known beyond these general diversity estimates, and much work remains to be done to develop avocado as a major subtropical-zone crop. Among the goals of avocado improvement are to develop varieties with fruit that will "store" better on the tree, show uniform ripening and have better post-harvest storage. Avocado transcriptome sequencing, genome mapping and partial genomic sequencing will represent a major step toward the goal of sequencing the entire avocado genome, which is expected to aid in improving avocado varieties and production, as well as understanding the evolution of flowers from non-flowering seed plants (gymnosperms). Additionally, continued evolutionary and other comparative studies of flower and fruit development in different avocado strains can be accomplished at the gene expression level, including in comparison with avocado relatives, and these should provide important insights into the genetic regulation of fruit development in basal angiosperms.

  4. Genetic markers, genotyping methods & next generation sequencing in Mycobacterium tuberculosis

    PubMed Central

    Desikan, Srinidhi; Narayanan, Sujatha

    2015-01-01

    Molecular epidemiology (ME) is one of the main areas in tuberculosis research which is widely used to study the transmission epidemics and outbreaks of tubercle bacilli. It exploits the presence of various polymorphisms in the genome of the bacteria that can be widely used as genetic markers. Many DNA typing methods apply these genetic markers to differentiate various strains and to study the evolutionary relationships between them. The three widely used genotyping tools to differentiate Mycobacterium tuberculosis strains are IS6110 restriction fragment length polymorphism (RFLP), spacer oligotyping (Spoligotyping), and mycobacterial interspersed repeat units - variable number of tandem repeats (MIRU-VNTR). A new prospect towards ME was introduced with the development of whole genome sequencing (WGS) and the next generation sequencing (NGS) methods, where the entire genome is sequenced that not only helps in pointing out minute differences between the various sequences but also saves time and the cost. NGS is also found to be useful in identifying single nucleotide polymorphisms (SNPs), comparative genomics and also various aspects about transmission dynamics. These techniques enable the identification of mycobacterial strains and also facilitate the study of their phylogenetic and evolutionary traits. PMID:26205019

  5. Looking for sexual selection in the female brain.

    PubMed

    Cummings, Molly E

    2012-08-19

    Female mate choice behaviour has significant evolutionary consequences, yet its mechanistic origins are not fully understood. Recent studies of female sensory systems have made great strides in identifying internal mechanisms governing female preferences. Only recently, however, have we begun to identify the dynamic genomic response associated with mate choice behaviour. Poeciliids provide a powerful comparative system to examine genomic responses governing mate choice and female preference behaviour, given the great range of mating systems: from female mate choice taxa with ornamental courting males to species lacking male ornamentation and exhibiting only male coercion. Furthermore, they exhibit laboratory-tractable preference responses without sexual contact that are decoupled from reproductive state, allowing investigators to isolate mechanisms in the brain without physiological confounds. Early investigations with poeciliid species (Xiphophorus nigrensis and Gambusia affinis) have identified putative candidate genes associated with female preference response and highlight a possible genomic pathway underlying female social interactions with males linked functionally with synaptic plasticity and learning processes. This network is positively correlated with female preference behaviour in the female mate choice species, but appears inhibited in the male coercive species. This behavioural genomics approach provides opportunity to elucidate the fundamental building blocks, and evolutionary dynamics, of sexual selection.

  6. The genome of the Antarctic-endemic copepod, Tigriopus kingsejongensis.

    PubMed

    Kang, Seunghyun; Ahn, Do-Hwan; Lee, Jun Hyuck; Lee, Sung Gu; Shin, Seung Chul; Lee, Jungeun; Min, Gi-Sik; Lee, Hyoungseok; Kim, Hyun-Woo; Kim, Sanghee; Park, Hyun

    2017-01-01

    The Antarctic intertidal zone is continuously subjected to extremely fluctuating biotic and abiotic stressors. The West Antarctic Peninsula is the most rapidly warming region on Earth. Organisms living in Antarctic intertidal pools are therefore interesting for research into evolutionary adaptation to extreme environments and the effects of climate change. We report the whole genome sequence of the Antarctic-endemic harpacticoid copepod Tigriopus kingsejongensi . The 37 Gb raw DNA sequence was generated using the Illumina Miseq platform. Libraries were prepared with 65-fold coverage and a total length of 295 Mb. The final assembly consists of 48 368 contigs with an N50 contig length of 17.5 kb, and 27 823 scaffolds with an N50 contig length of 159.2 kb. A total of 12 772 coding genes were inferred using the MAKER annotation pipeline. Comparative genome analysis revealed that T. kingsejongensis -specific genes are enriched in transport and metabolism processes. Furthermore, rapidly evolving genes related to energy metabolism showed positive selection signatures. The T. kingsejongensis genome provides an interesting example of an evolutionary strategy for Antarctic cold adaptation, and offers new genetic insights into Antarctic intertidal biota. © The Author 2017. Published by Oxford University Press.

  7. The genome of the Antarctic-endemic copepod, Tigriopus kingsejongensis

    PubMed Central

    Kang, Seunghyun; Ahn, Do-Hwan; Lee, Jun Hyuck; Lee, Sung Gu; Shin, Seung Chul; Lee, Jungeun; Min, Gi-Sik; Lee, Hyoungseok

    2017-01-01

    Abstract Background: The Antarctic intertidal zone is continuously subjected to extremely fluctuating biotic and abiotic stressors. The West Antarctic Peninsula is the most rapidly warming region on Earth. Organisms living in Antarctic intertidal pools are therefore interesting for research into evolutionary adaptation to extreme environments and the effects of climate change. Findings: We report the whole genome sequence of the Antarctic-endemic harpacticoid copepod Tigriopus kingsejongensi. The 37 Gb raw DNA sequence was generated using the Illumina Miseq platform. Libraries were prepared with 65-fold coverage and a total length of 295 Mb. The final assembly consists of 48 368 contigs with an N50 contig length of 17.5 kb, and 27 823 scaffolds with an N50 contig length of 159.2 kb. A total of 12 772 coding genes were inferred using the MAKER annotation pipeline. Comparative genome analysis revealed that T. kingsejongensis-specific genes are enriched in transport and metabolism processes. Furthermore, rapidly evolving genes related to energy metabolism showed positive selection signatures. Conclusions: The T. kingsejongensis genome provides an interesting example of an evolutionary strategy for Antarctic cold adaptation, and offers new genetic insights into Antarctic intertidal biota. PMID:28369352

  8. Evolutionary modes of emergence of short interspersed nuclear element (SINE) families in grasses.

    PubMed

    Kögler, Anja; Schmidt, Thomas; Wenke, Torsten

    2017-11-01

    Short interspersed nuclear elements (SINEs) are non-autonomous transposable elements which are propagated by retrotransposition and constitute an inherent part of the genome of most eukaryotic species. Knowledge of heterogeneous and highly abundant SINEs is crucial for de novo (or improvement of) annotation of whole genome sequences. We scanned Poaceae genome sequences of six important cereals (Oryza sativa, Triticum aestivum, Hordeum vulgare, Panicum virgatum, Sorghum bicolor, Zea mays) and Brachypodium distachyon to examine the diversity and evolution of SINE populations. We comparatively analyzed the structural features, distribution, evolutionary relation and abundance of 32 SINE families and subfamilies within grasses, comprising 11 052 individual copies. The investigation of activity profiles within the Poaceae provides insights into their species-specific diversification and amplification. We found that Poaceae SINEs (PoaS) fall into two length categories: simple SINEs of up to 180 bp and dimeric SINEs larger than 240 bp. Detailed analysis at the nucleotide level revealed that multimerization of related and unrelated SINE copies is an important evolutionary mechanism of SINE formation. We conclude that PoaS families diversify by massive reshuffling between SINE families, likely caused by insertion of truncated copies, and provide a model for this evolutionary scenario. Twenty-eight of 32 PoaS families and subfamilies show significant conservation, in particular either in the 5' or 3' regions, across Poaceae species and share large sequence stretches with one or more other PoaS families. © 2017 The Authors The Plant Journal © 2017 John Wiley & Sons Ltd.

  9. Evolutionary growth process of highly conserved sequences in vertebrate genomes.

    PubMed

    Ishibashi, Minaka; Noda, Akiko Ogura; Sakate, Ryuichi; Imanishi, Tadashi

    2012-08-01

    Genome sequence comparison between evolutionarily distant species revealed ultraconserved elements (UCEs) among mammals under strong purifying selection. Most of them were also conserved among vertebrates. Because they tend to be located in the flanking regions of developmental genes, they would have fundamental roles in creating vertebrate body plans. However, the evolutionary origin and selection mechanism of these UCEs remain unclear. Here we report that UCEs arose in primitive vertebrates, and gradually grew in vertebrate evolution. We searched for UCEs in two teleost fishes, Tetraodon nigroviridis and Oryzias latipes, and found 554 UCEs with 100% identity over 100 bps. Comparison of teleost and mammalian UCEs revealed 43 pairs of common, jawed-vertebrate UCEs (jUCE) with high sequence identities, ranging from 83.1% to 99.2%. Ten of them retain lower similarities to the Petromyzon marinus genome, and the substitution rates of four non-exonic jUCEs were reduced after the teleost-mammal divergence, suggesting that robust conservation had been acquired in the jawed vertebrate lineage. Our results indicate that prototypical UCEs originated before the divergence of jawed and jawless vertebrates and have been frozen as perfect conserved sequences in the jawed vertebrate lineage. In addition, our comparative sequence analyses of UCEs and neighboring regions resulted in a discovery of lineage-specific conserved sequences. They were added progressively to prototypical UCEs, suggesting step-wise acquisition of novel regulatory roles. Our results indicate that conserved non-coding elements (CNEs) consist of blocks with distinct evolutionary history, each having been frozen since different evolutionary era along the vertebrate lineage. Copyright © 2012 Elsevier B.V. All rights reserved.

  10. Under pressure: evolutionary engineering of yeast strains for improved performance in fuels and chemicals production.

    PubMed

    Mans, Robert; Daran, Jean-Marc G; Pronk, Jack T

    2018-04-01

    Evolutionary engineering, which uses laboratory evolution to select for industrially relevant traits, is a popular strategy in the development of high-performing yeast strains for industrial production of fuels and chemicals. By integrating whole-genome sequencing, bioinformatics, classical genetics and genome-editing techniques, evolutionary engineering has also become a powerful approach for identification and reverse engineering of molecular mechanisms that underlie industrially relevant traits. New techniques enable acceleration of in vivo mutation rates, both across yeast genomes and at specific loci. Recent studies indicate that phenotypic trade-offs, which are often observed after evolution under constant conditions, can be mitigated by using dynamic cultivation regimes. Advances in research on synthetic regulatory circuits offer exciting possibilities to extend the applicability of evolutionary engineering to products of yeasts whose synthesis requires a net input of cellular energy. Copyright © 2017 The Authors. Published by Elsevier Ltd.. All rights reserved.

  11. Incorporating evolutionary processes into population viability models.

    PubMed

    Pierson, Jennifer C; Beissinger, Steven R; Bragg, Jason G; Coates, David J; Oostermeijer, J Gerard B; Sunnucks, Paul; Schumaker, Nathan H; Trotter, Meredith V; Young, Andrew G

    2015-06-01

    We examined how ecological and evolutionary (eco-evo) processes in population dynamics could be better integrated into population viability analysis (PVA). Complementary advances in computation and population genomics can be combined into an eco-evo PVA to offer powerful new approaches to understand the influence of evolutionary processes on population persistence. We developed the mechanistic basis of an eco-evo PVA using individual-based models with individual-level genotype tracking and dynamic genotype-phenotype mapping to model emergent population-level effects, such as local adaptation and genetic rescue. We then outline how genomics can allow or improve parameter estimation for PVA models by providing genotypic information at large numbers of loci for neutral and functional genome regions. As climate change and other threatening processes increase in rate and scale, eco-evo PVAs will become essential research tools to evaluate the effects of adaptive potential, evolutionary rescue, and locally adapted traits on persistence. © 2014 Society for Conservation Biology.

  12. Evidence of evolutionary history and selective sweeps in the genome of Meishan pig reveals its genetic and phenotypic characterization

    USDA-ARS?s Scientific Manuscript database

    Meishan is a famous Chinese indigenous pig breed known for its extremely high fecundity. To explore if Meishan has unique evolutionary process and genome characteristics differing from other pig breeds, we systematically analyzed its genetic divergence, and demographic history by large-scale reseque...

  13. Understanding of evolutionary genomics of invasive species of rice

    USDA-ARS?s Scientific Manuscript database

    Red rice is an aggressive, weedy form of cultivated rice (Oryza sativa) that infests crop fields and is a primary factor limiting rice productivity in the U.S. and worldwide. As the weedy relative of a genomic model species, red rice is a model for understanding the genetic and evolutionary mechani...

  14. Deciphering the Diploid Ancestral Genome of the Mesohexaploid Brassica rapa[C][W

    PubMed Central

    Cheng, Feng; Mandáková, Terezie; Wu, Jian; Xie, Qi; Lysak, Martin A.; Wang, Xiaowu

    2013-01-01

    The genus Brassica includes several important agricultural and horticultural crops. Their current genome structures were shaped by whole-genome triplication followed by extensive diploidization. The availability of several crucifer genome sequences, especially that of Chinese cabbage (Brassica rapa), enables study of the evolution of the mesohexaploid Brassica genomes from their diploid progenitors. We reconstructed three ancestral subgenomes of B. rapa (n = 10) by comparing its whole-genome sequence to ancestral and extant Brassicaceae genomes. All three B. rapa paleogenomes apparently consisted of seven chromosomes, similar to the ancestral translocation Proto-Calepineae Karyotype (tPCK; n = 7), which is the evolutionarily younger variant of the Proto-Calepineae Karyotype (n = 7). Based on comparative analysis of genome sequences or linkage maps of Brassica oleracea, Brassica nigra, radish (Raphanus sativus), and other closely related species, we propose a two-step merging of three tPCK-like genomes to form the hexaploid ancestor of the tribe Brassiceae with 42 chromosomes. Subsequent diversification of the Brassiceae was marked by extensive genome reshuffling and chromosome number reduction mediated by translocation events and followed by loss and/or inactivation of centromeres. Furthermore, via interspecies genome comparison, we refined intervals for seven of the genomic blocks of the Ancestral Crucifer Karyotype (n = 8), thus revising the key reference genome for evolutionary genomics of crucifers. PMID:23653472

  15. Evolutionary Inference across Eukaryotes Identifies Specific Pressures Favoring Mitochondrial Gene Retention.

    PubMed

    Johnston, Iain G; Williams, Ben P

    2016-02-24

    Since their endosymbiotic origin, mitochondria have lost most of their genes. Although many selective mechanisms underlying the evolution of mitochondrial genomes have been proposed, a data-driven exploration of these hypotheses is lacking, and a quantitatively supported consensus remains absent. We developed HyperTraPS, a methodology coupling stochastic modeling with Bayesian inference, to identify the ordering of evolutionary events and suggest their causes. Using 2015 complete mitochondrial genomes, we inferred evolutionary trajectories of mtDNA gene loss across the eukaryotic tree of life. We find that proteins comprising the structural cores of the electron transport chain are preferentially encoded within mitochondrial genomes across eukaryotes. A combination of high GC content and high protein hydrophobicity is required to explain patterns of mtDNA gene retention; a model that accounts for these selective pressures can also predict the success of artificial gene transfer experiments in vivo. This work provides a general method for data-driven inference of the ordering of evolutionary and progressive events, here identifying the distinct features shaping mitochondrial genomes of present-day species. Copyright © 2016 Elsevier Inc. All rights reserved.

  16. Genome size evolution in Ontario ferns (Polypodiidae): evolutionary correlations with cell size, spore size, and habitat type and an absence of genome downsizing.

    PubMed

    Henry, Thomas A; Bainard, Jillian D; Newmaster, Steven G

    2014-10-01

    Genome size is known to correlate with a number of traits in angiosperms, but less is known about the phenotypic correlates of genome size in ferns. We explored genome size variation in relation to a suite of morphological and ecological traits in ferns. Thirty-six fern taxa were collected from wild populations in Ontario, Canada. 2C DNA content was measured using flow cytometry. We tested for genome downsizing following polyploidy using a phylogenetic comparative analysis to explore the correlation between 1Cx DNA content and ploidy. There was no compelling evidence for the occurrence of widespread genome downsizing during the evolution of Ontario ferns. The relationship between genome size and 11 morphological and ecological traits was explored using a phylogenetic principal component regression analysis. Genome size was found to be significantly associated with cell size, spore size, spore type, and habitat type. These results are timely as past and recent studies have found conflicting support for the association between ploidy/genome size and spore size in fern polyploid complexes; this study represents the first comparative analysis of the trend across a broad taxonomic group of ferns.

  17. Landscape genomics reveals altered genome wide diversity within revegetated stands of Eucalyptus microcarpa (Grey Box).

    PubMed

    Jordan, Rebecca; Dillon, Shannon K; Prober, Suzanne M; Hoffmann, Ary A

    2016-12-01

    In order to contribute to evolutionary resilience and adaptive potential in highly modified landscapes, revegetated areas should ideally reflect levels of genetic diversity within and across natural stands. Landscape genomic analyses enable such diversity patterns to be characterized at genome and chromosomal levels. Landscape-wide patterns of genomic diversity were assessed in Eucalyptus microcarpa, a dominant tree species widely used in revegetation in Southeastern Australia. Trees from small and large patches within large remnants, small isolated remnants and revegetation sites were assessed across the now highly fragmented distribution of this species using the DArTseq genomic approach. Genomic diversity was similar within all three types of remnant patches analysed, although often significantly but only slightly lower in revegetation sites compared with natural remnants. Differences in diversity between stand types varied across chromosomes. Genomic differentiation was higher between small, isolated remnants, and among revegetated sites compared with natural stands. We conclude that small remnants and revegetated sites of our E. microcarpa samples largely but not completely capture patterns in genomic diversity across the landscape. Genomic approaches provide a powerful tool for assessing restoration efforts across the landscape. © 2016 The Authors. New Phytologist © 2016 New Phytologist Trust.

  18. Genomics of bacteria and archaea: the emerging dynamic view of the prokaryotic world

    PubMed Central

    Koonin, Eugene V.; Wolf, Yuri I.

    2008-01-01

    The first bacterial genome was sequenced in 1995, and the first archaeal genome in 1996. Soon after these breakthroughs, an exponential rate of genome sequencing was established, with a doubling time of approximately 20 months for bacteria and approximately 34 months for archaea. Comparative analysis of the hundreds of sequenced bacterial and dozens of archaeal genomes leads to several generalizations on the principles of genome organization and evolution. A crucial finding that enables functional characterization of the sequenced genomes and evolutionary reconstruction is that the majority of archaeal and bacterial genes have conserved orthologs in other, often, distant organisms. However, comparative genomics also shows that horizontal gene transfer (HGT) is a dominant force of prokaryotic evolution, along with the loss of genetic material resulting in genome contraction. A crucial component of the prokaryotic world is the mobilome, the enormous collection of viruses, plasmids and other selfish elements, which are in constant exchange with more stable chromosomes and serve as HGT vehicles. Thus, the prokaryotic genome space is a tightly connected, although compartmentalized, network, a novel notion that undermines the ‘Tree of Life’ model of evolution and requires a new conceptual framework and tools for the study of prokaryotic evolution. PMID:18948295

  19. Characterization of irritans mariner-like elements in the olive fruit fly Bactrocera oleae (Diptera: Tephritidae): evolutionary implications.

    PubMed

    Ben Lazhar-Ajroud, Wafa; Caruso, Aurore; Mezghani, Maha; Bouallegue, Maryem; Tastard, Emmanuelle; Denis, Françoise; Rouault, Jacques-Deric; Makni, Hanem; Capy, Pierre; Chénais, Benoît; Makni, Mohamed; Casse, Nathalie

    2016-08-01

    Genomic variation among species is commonly driven by transposable element (TE) invasion; thus, the pattern of TEs in a genome allows drawing an evolutionary history of the studied species. This paper reports in vitro and in silico detection and characterization of irritans mariner-like elements (MLEs) in the genome and transcriptome of Bactrocera oleae (Rossi) (Diptera: Tephritidae). Eleven irritans MLE sequences have been isolated in vitro using terminal inverted repeats (TIRs) as primers, and 215 have been extracted in silico from the sequenced genome of B. oleae. Additionally, the sequenced genomes of Bactrocera tryoni (Froggatt) and Bactrocera cucurbitae (Diptera: Tephritidae) have been explored to identify irritans MLEs. A total of 129 sequences from B. tryoni have been extracted, while the genome of B. cucurbitae appears probably devoid of irritans MLEs. All detected irritans MLEs are defective due to several mutations and are clustered together in a monophyletic group suggesting a common ancestor. The evolutionary history and dynamics of these TEs are discussed in relation with the phylogenetic distribution of their hosts. The knowledge on the structure, distribution, dynamic, and evolution of irritans MLEs in Bactrocera species contributes to the understanding of both their evolutionary history and the invasion history of their hosts. This could also be the basis for genetic control strategies using transposable elements.

  20. Prevalent Role of Gene Features in Determining Evolutionary Fates of Whole-Genome Duplication Duplicated Genes in Flowering Plants1[W][OA

    PubMed Central

    Jiang, Wen-kai; Liu, Yun-long; Xia, En-hua; Gao, Li-zhi

    2013-01-01

    The evolution of genes and genomes after polyploidization has been the subject of extensive studies in evolutionary biology and plant sciences. While a significant number of duplicated genes are rapidly removed during a process called fractionation, which operates after the whole-genome duplication (WGD), another considerable number of genes are retained preferentially, leading to the phenomenon of biased gene retention. However, the evolutionary mechanisms underlying gene retention after WGD remain largely unknown. Through genome-wide analyses of sequence and functional data, we comprehensively investigated the relationships between gene features and the retention probability of duplicated genes after WGDs in six plant genomes, Arabidopsis (Arabidopsis thaliana), poplar (Populus trichocarpa), soybean (Glycine max), rice (Oryza sativa), sorghum (Sorghum bicolor), and maize (Zea mays). The results showed that multiple gene features were correlated with the probability of gene retention. Using a logistic regression model based on principal component analysis, we resolved evolutionary rate, structural complexity, and GC3 content as the three major contributors to gene retention. Cluster analysis of these features further classified retained genes into three distinct groups in terms of gene features and evolutionary behaviors. Type I genes are more prone to be selected by dosage balance; type II genes are possibly subject to subfunctionalization; and type III genes may serve as potential targets for neofunctionalization. This study highlights that gene features are able to act jointly as primary forces when determining the retention and evolution of WGD-derived duplicated genes in flowering plants. These findings thus may help to provide a resolution to the debate on different evolutionary models of gene fates after WGDs. PMID:23396833

  1. Finding the Genomic Basis of Local Adaptation: Pitfalls, Practical Solutions, and Future Directions.

    PubMed

    Hoban, Sean; Kelley, Joanna L; Lotterhos, Katie E; Antolin, Michael F; Bradburd, Gideon; Lowry, David B; Poss, Mary L; Reed, Laura K; Storfer, Andrew; Whitlock, Michael C

    2016-10-01

    Uncovering the genetic and evolutionary basis of local adaptation is a major focus of evolutionary biology. The recent development of cost-effective methods for obtaining high-quality genome-scale data makes it possible to identify some of the loci responsible for adaptive differences among populations. Two basic approaches for identifying putatively locally adaptive loci have been developed and are broadly used: one that identifies loci with unusually high genetic differentiation among populations (differentiation outlier methods) and one that searches for correlations between local population allele frequencies and local environments (genetic-environment association methods). Here, we review the promises and challenges of these genome scan methods, including correcting for the confounding influence of a species' demographic history, biases caused by missing aspects of the genome, matching scales of environmental data with population structure, and other statistical considerations. In each case, we make suggestions for best practices for maximizing the accuracy and efficiency of genome scans to detect the underlying genetic basis of local adaptation. With attention to their current limitations, genome scan methods can be an important tool in finding the genetic basis of adaptive evolutionary change.

  2. Peregrine and saker falcon genome sequences provide insights into evolution of a predatory lifestyle.

    PubMed

    Zhan, Xiangjiang; Pan, Shengkai; Wang, Junyi; Dixon, Andrew; He, Jing; Muller, Margit G; Ni, Peixiang; Hu, Li; Liu, Yuan; Hou, Haolong; Chen, Yuanping; Xia, Jinquan; Luo, Qiong; Xu, Pengwei; Chen, Ying; Liao, Shengguang; Cao, Changchang; Gao, Shukun; Wang, Zhaobao; Yue, Zhen; Li, Guoqing; Yin, Ye; Fox, Nick C; Wang, Jun; Bruford, Michael W

    2013-05-01

    As top predators, falcons possess unique morphological, physiological and behavioral adaptations that allow them to be successful hunters: for example, the peregrine is renowned as the world's fastest animal. To examine the evolutionary basis of predatory adaptations, we sequenced the genomes of both the peregrine (Falco peregrinus) and saker falcon (Falco cherrug), and we present parallel, genome-wide evidence for evolutionary innovation and selection for a predatory lifestyle. The genomes, assembled using Illumina deep sequencing with greater than 100-fold coverage, are both approximately 1.2 Gb in length, with transcriptome-assisted prediction of approximately 16,200 genes for both species. Analysis of 8,424 orthologs in both falcons, chicken, zebra finch and turkey identified consistent evidence for genome-wide rapid evolution in these raptors. SNP-based inference showed contrasting recent demographic trajectories for the two falcons, and gene-based analysis highlighted falcon-specific evolutionary novelties for beak development and olfaction and specifically for homeostasis-related genes in the arid environment-adapted saker.

  3. Evolutionary signals of selection on cognition from the great tit genome and methylome

    PubMed Central

    Laine, Veronika N.; Gossmann, Toni I.; Schachtschneider, Kyle M.; Garroway, Colin J.; Madsen, Ole; Verhoeven, Koen J. F.; de Jager, Victor; Megens, Hendrik-Jan; Warren, Wesley C.; Minx, Patrick; Crooijmans, Richard P. M. A.; Corcoran, Pádraic; Adriaensen, Frank; Belda, Eduardo; Bushuev, Andrey; Cichon, Mariusz; Charmantier, Anne; Dingemanse, Niels; Doligez, Blandine; Eeva, Tapio; Erikstad, Kjell Einar; Fedorov, Slava; Hau, Michaela; Hille, Sabine; Hinde, Camilla; Kempenaers, Bart; Kerimov, Anvar; Krist, Milos; Mand, Raivo; Matthysen, Erik; Nager, Reudi; Norte, Claudia; Orell, Markku; Richner, Heinz; Slagsvold, Tore; Tilgar, Vallo; Tinbergen, Joost; Torok, Janos; Tschirren, Barbara; Yuta, Tera; Sheldon, Ben C.; Slate, Jon; Zeng, Kai; van Oers, Kees; Visser, Marcel E.; Groenen, Martien A. M.

    2016-01-01

    For over 50 years, the great tit (Parus major) has been a model species for research in evolutionary, ecological and behavioural research; in particular, learning and cognition have been intensively studied. Here, to provide further insight into the molecular mechanisms behind these important traits, we de novo assemble a great tit reference genome and whole-genome re-sequence another 29 individuals from across Europe. We show an overrepresentation of genes related to neuronal functions, learning and cognition in regions under positive selection, as well as increased CpG methylation in these regions. In addition, great tit neuronal non-CpG methylation patterns are very similar to those observed in mammals, suggesting a universal role in neuronal epigenetic regulation which can affect learning-, memory- and experience-induced plasticity. The high-quality great tit genome assembly will play an instrumental role in furthering the integration of ecological, evolutionary, behavioural and genomic approaches in this model species. PMID:26805030

  4. Genomics of Actinobacteria: Tracing the Evolutionary History of an Ancient Phylum†

    PubMed Central

    Ventura, Marco; Canchaya, Carlos; Tauch, Andreas; Chandra, Govind; Fitzgerald, Gerald F.; Chater, Keith F.; van Sinderen, Douwe

    2007-01-01

    Summary: Actinobacteria constitute one of the largest phyla among Bacteria and represent gram-positive bacteria with a high G+C content in their DNA. This bacterial group includes microorganisms exhibiting a wide spectrum of morphologies, from coccoid to fragmenting hyphal forms, as well as possessing highly variable physiological and metabolic properties. Furthermore, Actinobacteria members have adopted different lifestyles, and can be pathogens (e.g., Corynebacterium, Mycobacterium, Nocardia, Tropheryma, and Propionibacterium), soil inhabitants (Streptomyces), plant commensals (Leifsonia), or gastrointestinal commensals (Bifidobacterium). The divergence of Actinobacteria from other bacteria is ancient, making it impossible to identify the phylogenetically closest bacterial group to Actinobacteria. Genome sequence analysis has revolutionized every aspect of bacterial biology by enhancing the understanding of the genetics, physiology, and evolutionary development of bacteria. Various actinobacterial genomes have been sequenced, revealing a wide genomic heterogeneity probably as a reflection of their biodiversity. This review provides an account of the recent explosion of actinobacterial genomics data and an attempt to place this in a biological and evolutionary context. PMID:17804669

  5. Genomic organization and evolution of the Atlantic salmon hemoglobin repertoire

    PubMed Central

    2010-01-01

    Background The genomes of salmonids are considered pseudo-tetraploid undergoing reversion to a stable diploid state. Given the genome duplication and extensive biological data available for salmonids, they are excellent model organisms for studying comparative genomics, evolutionary processes, fates of duplicated genes and the genetic and physiological processes associated with complex behavioral phenotypes. The evolution of the tetrapod hemoglobin genes is well studied; however, little is known about the genomic organization and evolution of teleost hemoglobin genes, particularly those of salmonids. The Atlantic salmon serves as a representative salmonid species for genomics studies. Given the well documented role of hemoglobin in adaptation to varied environmental conditions as well as its use as a model protein for evolutionary analyses, an understanding of the genomic structure and organization of the Atlantic salmon α and β hemoglobin genes is of great interest. Results We identified four bacterial artificial chromosomes (BACs) comprising two hemoglobin gene clusters spanning the entire α and β hemoglobin gene repertoire of the Atlantic salmon genome. Their chromosomal locations were established using fluorescence in situ hybridization (FISH) analysis and linkage mapping, demonstrating that the two clusters are located on separate chromosomes. The BACs were sequenced and assembled into scaffolds, which were annotated for putatively functional and pseudogenized hemoglobin-like genes. This revealed that the tail-to-tail organization and alternating pattern of the α and β hemoglobin genes are well conserved in both clusters, as well as that the Atlantic salmon genome houses substantially more hemoglobin genes, including non-Bohr β globin genes, than the genomes of other teleosts that have been sequenced. Conclusions We suggest that the most parsimonious evolutionary path leading to the present organization of the Atlantic salmon hemoglobin genes involves the loss of a single hemoglobin gene cluster after the whole genome duplication (WGD) at the base of the teleost radiation but prior to the salmonid-specific WGD, which then produced the duplicated copies seen today. We also propose that the relatively high number of hemoglobin genes as well as the presence of non-Bohr β hemoglobin genes may be due to the dynamic life history of salmon and the diverse environmental conditions that the species encounters. Data deposition: BACs S0155C07 and S0079J05 (fps135): GenBank GQ898924; BACs S0055H05 and S0014B03 (fps1046): GenBank GQ898925 PMID:20923558

  6. Comparative genomics in chicken and Pekin duck using FISH mapping and microarray analysis

    PubMed Central

    2009-01-01

    Background The availability of the complete chicken (Gallus gallus) genome sequence as well as a large number of chicken probes for fluorescent in-situ hybridization (FISH) and microarray resources facilitate comparative genomic studies between chicken and other bird species. In a previous study, we provided a comprehensive cytogenetic map for the turkey (Meleagris gallopavo) and the first analysis of copy number variants (CNVs) in birds. Here, we extend this approach to the Pekin duck (Anas platyrhynchos), an obvious target for comparative genomic studies due to its agricultural importance and resistance to avian flu. Results We provide a detailed molecular cytogenetic map of the duck genome through FISH assignment of 155 chicken clones. We identified one inter- and six intrachromosomal rearrangements between chicken and duck macrochromosomes and demonstrated conserved synteny among all microchromosomes analysed. Array comparative genomic hybridisation revealed 32 CNVs, of which 5 overlap previously designated "hotspot" regions between chicken and turkey. Conclusion Our results suggest extensive conservation of avian genomes across 90 million years of evolution in both macro- and microchromosomes. The data on CNVs between chicken and duck extends previous analyses in chicken and turkey and supports the hypotheses that avian genomes contain fewer CNVs than mammalian genomes and that genomes of evolutionarily distant species share regions of copy number variation ("CNV hotspots"). Our results will expedite duck genomics, assist marker development and highlight areas of interest for future evolutionary and functional studies. PMID:19656363

  7. Annotation and Classification of CRISPR-Cas Systems

    PubMed Central

    Makarova, Kira S.; Koonin, Eugene V.

    2018-01-01

    The clustered regularly interspaced short palindromic repeats (CRISPR)-Cas (CRISPR-associated proteins) is a prokaryotic adaptive immune system that is represented in most archaea and many bacteria. Among the currently known prokaryotic defense systems, the CRISPR-Cas genomic loci show unprecedented complexity and diversity. Classification of CRISPR-Cas variants that would capture their evolutionary relationships to the maximum possible extent is essential for comparative genomic and functional characterization of this theoretically and practically important system of adaptive immunity. To this end, a multipronged approach has been developed that combines phylogenetic analysis of the conserved Cas proteins with comparison of gene repertoires and arrangements in CRISPR-Cas loci. This approach led to the current classification of CRISPR-Cas systems into three distinct types and ten subtypes for each of which signature genes have been identified. Comparative genomic analysis of the CRISPR-Cas systems in new archaeal and bacterial genomes performed over the 3 years elapsed since the development of this classification makes it clear that new types and subtypes of CRISPR-Cas need to be introduced. Moreover, this classification system captures only part of the complexity of CRISPR-Cas organization and evolution, due to the intrinsic modularity and evolutionary mobility of these immunity systems, resulting in numerous recombinant variants. Moreover, most of the cas genes evolve rapidly, complicating the family assignment for many Cas proteins and the use of family profiles for the recognition of CRISPR-Cas subtype signatures. Further progress in the comparative analysis of CRISPR-Cas systems requires integration of the most sensitive sequence comparison tools, protein structure comparison, and refined approaches for comparison of gene neighborhoods. PMID:25981466

  8. Annotation and Classification of CRISPR-Cas Systems.

    PubMed

    Makarova, Kira S; Koonin, Eugene V

    2015-01-01

    The clustered regularly interspaced short palindromic repeats (CRISPR)-Cas (CRISPR-associated proteins) is a prokaryotic adaptive immune system that is represented in most archaea and many bacteria. Among the currently known prokaryotic defense systems, the CRISPR-Cas genomic loci show unprecedented complexity and diversity. Classification of CRISPR-Cas variants that would capture their evolutionary relationships to the maximum possible extent is essential for comparative genomic and functional characterization of this theoretically and practically important system of adaptive immunity. To this end, a multipronged approach has been developed that combines phylogenetic analysis of the conserved Cas proteins with comparison of gene repertoires and arrangements in CRISPR-Cas loci. This approach led to the current classification of CRISPR-Cas systems into three distinct types and ten subtypes for each of which signature genes have been identified. Comparative genomic analysis of the CRISPR-Cas systems in new archaeal and bacterial genomes performed over the 3 years elapsed since the development of this classification makes it clear that new types and subtypes of CRISPR-Cas need to be introduced. Moreover, this classification system captures only part of the complexity of CRISPR-Cas organization and evolution, due to the intrinsic modularity and evolutionary mobility of these immunity systems, resulting in numerous recombinant variants. Moreover, most of the cas genes evolve rapidly, complicating the family assignment for many Cas proteins and the use of family profiles for the recognition of CRISPR-Cas subtype signatures. Further progress in the comparative analysis of CRISPR-Cas systems requires integration of the most sensitive sequence comparison tools, protein structure comparison, and refined approaches for comparison of gene neighborhoods.

  9. Comparative Genome Analysis Reveals Adaptation to the Ectophytic Lifestyle of Sooty Blotch and Flyspeck Fungi.

    PubMed

    Xu, Chao; Zhang, Rong; Sun, Guangyu; Gleason, Mark L

    2017-11-01

    Sooty blotch and flyspeck (SBFS) fungi are a distinctive group of plant pathogens which, although phylogenetically diverse, occupy an exclusively surface-dwelling niche. They cause economic losses by superficially blemishing the fruit of several tree crops, principally apple, in moist temperate regions worldwide. In this study, we performed genome-wide comparative analyses separately within three pairs of species of ascomycete pathogens; each pair contained an SBFS species as well as a closely related but plant-penetrating parasite (PPP) species. Our results showed that all three of the SBFS pathogens had significantly smaller genome sizes, gene numbers and repeat ratios than their counterpart PPPs. The pathogenicity-related genes encoding MFS transporters, secreted proteins (mainly effectors and peptidases), plant cell wall degrading enzymes, and secondary metabolism enzymes were also drastically reduced in the SBFS fungi compared with their PPP relatives. We hypothesize that the above differences in genome composition are due largely to different levels of acquisition, loss, expansion, and contraction of gene families and emergence of orphan genes. Furthermore, results suggested that horizontal gene transfer may have played a role, although limited, in the divergent evolutionary paths of SBFS pathogens and PPPs; repeat-induced point mutation could have inhibited the propagation of transposable elements and expansion of gene families in the SBFS group, given that this mechanism is stronger in the SBFS fungi than in their PPP relatives. These results substantially broaden understanding of evolutionary mechanisms of adaptation of fungi to the epicuticular niche of plants. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  10. The Phylogeny of Rickettsia Using Different Evolutionary Signatures: How Tree-Like is Bacterial Evolution?

    PubMed Central

    Murray, Gemma G. R.; Weinert, Lucy A.; Rhule, Emma L.; Welch, John J.

    2016-01-01

    Rickettsia is a genus of intracellular bacteria whose hosts and transmission strategies are both impressively diverse, and this is reflected in a highly dynamic genome. Some previous studies have described the evolutionary history of Rickettsia as non-tree-like, due to incongruity between phylogenetic reconstructions using different portions of the genome. Here, we reconstruct the Rickettsia phylogeny using whole-genome data, including two new genomes from previously unsampled host groups. We find that a single topology, which is supported by multiple sources of phylogenetic signal, well describes the evolutionary history of the core genome. We do observe extensive incongruence between individual gene trees, but analyses of simulations over a single topology and interspersed partitions of sites show that this is more plausibly attributed to systematic error than to horizontal gene transfer. Some conflicting placements also result from phylogenetic analyses of accessory genome content (i.e., gene presence/absence), but we argue that these are also due to systematic error, stemming from convergent genome reduction, which cannot be accommodated by existing phylogenetic methods. Our results show that, even within a single genus, tests for gene exchange based on phylogenetic incongruence may be susceptible to false positives. PMID:26559010

  11. The mitochondrial genome of booklouse, Liposcelis sculptilis (Psocoptera: Liposcelididae) and the evolutionary timescale of Liposcelis

    PubMed Central

    Shi, Yan; Chu, Qing; Wei, Dan-Dan; Qiu, Yuan-Jian; Shang, Feng; Dou, Wei; Wang, Jin-Jun

    2016-01-01

    Bilateral animals are featured by an extremely compact mitochondrial (mt) genome with 37 genes on a single circular chromosome. To date, the complete mt genome has only been determined for four species of Liposcelis, a genus with economic importance, including L. entomophila, L. decolor, L. bostrychophila, and L. paeta. They belong to A, B, or D group of Liposcelis, respectively. Unlike most bilateral animals, L. bostrychophila, L. entomophila and L. paeta have a bitipartite mt genome with genes on two chromosomes. However, the mt genome of L. decolor has the typical mt chromosome of bilateral animals. Here, we sequenced the mt genome of L. sculptilis, and identified 35 genes, which were on a single chromosome. The mt genome fragmentation is not shared by the D group of Liposcelis and the single chromosome of L. sculptilis differed from those of booklice known in gene content and gene arrangement. We inferred that different evolutionary patterns and rate existed in Liposcelis. Further, we reconstructed the evolutionary history of 21 psocodean taxa with phylogenetic analyses, which suggested that Liposcelididae and Phthiraptera have evolved 134 Ma and the sucking lice diversified in the Late Cretaceous. PMID:27470659

  12. Understanding Vegetative Desiccation Tolerance using Integrated Functional Genomics Approaches within a Comparative Evolutionary Framework

    USDA-ARS?s Scientific Manuscript database

    Desiccation tolerance (DT) is defined as the equilibration of protoplasmic water potential with that of the surrounding air (generally dry) without loss of viability upon rehydration. Vegetative DT is widespread amongst mosses and lichens, but is relatively rare in vascular plants (0.15%). Recent st...

  13. Extreme sensitivity to ultraviolet light in the fungal pathogen causing white-nose syndrome of bats

    Treesearch

    Jonathan M. Palmer; Kevin P. Drees; Jeffrey T. Foster; Daniel L. Lindner

    2018-01-01

    Bat white-nose syndrome (WNS), caused by the fungal pathogen Pseudogymnoascus destructans, has decimated North American hibernating bats since its emergence in 2006. Here, we utilize comparative genomics to examine the evolutionary history of this pathogen in comparison to six closely related nonpathogenic species....

  14. The evolutionary processes of mitochondrial and chloroplast genomes differ from those of nuclear genomes

    NASA Astrophysics Data System (ADS)

    Korpelainen, Helena

    2004-11-01

    This paper first introduces our present knowledge of the origin of mitochondria and chloroplasts, and the organization and inheritance patterns of their genomes, and then carries on to review the evolutionary processes influencing mitochondrial and chloroplast genomes. The differences in evolutionary phenomena between the nuclear and cytoplasmic genomes are highlighted. It is emphasized that varying inheritance patterns and copy numbers among different types of genomes, and the potential advantage achieved through the transfer of many cytoplasmic genes to the nucleus, have important implications for the evolution of nuclear, mitochondrial and chloroplast genomes. Cytoplasmic genes transferred to the nucleus have joined the more strictly controlled genetic system of the nuclear genome, including also sexual recombination, while genes retained within the cytoplasmic organelles can be involved in selection and drift processes both within and among individuals. Within-individual processes can be either intra- or intercellular. In the case of heteroplasmy, which is attributed to mutations or biparental inheritance, within-individual selection on cytoplasmic DNA may provide a mechanism by which the organism can adapt rapidly. The inheritance of cytoplasmic genomes is not universally maternal. The presence of a range of inheritance patterns indicates that different strategies have been adopted by different organisms. On the other hand, the variability occasionally observed in the inheritance mechanisms of cytoplasmic genomes reduces heritability and increases environmental components in phenotypic features and, consequently, decreases the potential for adaptive evolution.

  15. High-throughput sequencing of three Lemnoideae (duckweeds) chloroplast genomes from total DNA.

    PubMed

    Wang, Wenqin; Messing, Joachim

    2011-01-01

    Chloroplast genomes provide a wealth of information for evolutionary and population genetic studies. Chloroplasts play a particularly important role in the adaption for aquatic plants because they float on water and their major surface is exposed continuously to sunlight. The subfamily of Lemnoideae represents such a collection of aquatic species that because of photosynthesis represents one of the fastest growing plant species on earth. We sequenced the chloroplast genomes from three different genera of Lemnoideae, Spirodela polyrhiza, Wolffiella lingulata and Wolffia australiana by high-throughput DNA sequencing of genomic DNA using the SOLiD platform. Unfractionated total DNA contains high copies of plastid DNA so that sequences from the nucleus and mitochondria can easily be filtered computationally. Remaining sequence reads were assembled into contiguous sequences (contigs) using SOLiD software tools. Contigs were mapped to a reference genome of Lemna minor and gaps, selected by PCR, were sequenced on the ABI3730xl platform. This combinatorial approach yielded whole genomic contiguous sequences in a cost-effective manner. Over 1,000-time coverage of chloroplast from total DNA were reached by the SOLiD platform in a single spot on a quadrant slide without purification. Comparative analysis indicated that the chloroplast genome was conserved in gene number and organization with respect to the reference genome of L. minor. However, higher nucleotide substitution, abundant deletions and insertions occurred in non-coding regions of these genomes, indicating a greater genomic dynamics than expected from the comparison of other related species in the Pooideae. Noticeably, there was no transition bias over transversion in Lemnoideae. The data should have immediate applications in evolutionary biology and plant taxonomy with increased resolution and statistical power.

  16. High-Throughput Sequencing of Three Lemnoideae (Duckweeds) Chloroplast Genomes from Total DNA

    PubMed Central

    Wang, Wenqin; Messing, Joachim

    2011-01-01

    Background Chloroplast genomes provide a wealth of information for evolutionary and population genetic studies. Chloroplasts play a particularly important role in the adaption for aquatic plants because they float on water and their major surface is exposed continuously to sunlight. The subfamily of Lemnoideae represents such a collection of aquatic species that because of photosynthesis represents one of the fastest growing plant species on earth. Methods We sequenced the chloroplast genomes from three different genera of Lemnoideae, Spirodela polyrhiza, Wolffiella lingulata and Wolffia australiana by high-throughput DNA sequencing of genomic DNA using the SOLiD platform. Unfractionated total DNA contains high copies of plastid DNA so that sequences from the nucleus and mitochondria can easily be filtered computationally. Remaining sequence reads were assembled into contiguous sequences (contigs) using SOLiD software tools. Contigs were mapped to a reference genome of Lemna minor and gaps, selected by PCR, were sequenced on the ABI3730xl platform. Conclusions This combinatorial approach yielded whole genomic contiguous sequences in a cost-effective manner. Over 1,000-time coverage of chloroplast from total DNA were reached by the SOLiD platform in a single spot on a quadrant slide without purification. Comparative analysis indicated that the chloroplast genome was conserved in gene number and organization with respect to the reference genome of L. minor. However, higher nucleotide substitution, abundant deletions and insertions occurred in non-coding regions of these genomes, indicating a greater genomic dynamics than expected from the comparison of other related species in the Pooideae. Noticeably, there was no transition bias over transversion in Lemnoideae. The data should have immediate applications in evolutionary biology and plant taxonomy with increased resolution and statistical power. PMID:21931804

  17. Evolutionary force of AT-rich repeats to trap genomic and episomal DNAs into the rice genome: lessons from endogenous pararetrovirus.

    PubMed

    Liu, Ruifang; Koyanagi, Kanako O; Chen, Sunlu; Kishima, Yuji

    2012-12-01

    In plant genomes, the incorporation of DNA segments is not a common method of artificial gene transfer. Nevertheless, various segments of pararetroviruses have been found in plant genomes in recent decades. The rice genome contains a number of segments of endogenous rice tungro bacilliform virus-like sequences (ERTBVs), many of which are present between AT dinucleotide repeats (ATrs). Comparison of genomic sequences between two closely related rice subspecies, japonica and indica, allowed us to verify the preferential insertion of ERTBVs into ATrs. In addition to ERTBVs, the comparative analyses showed that ATrs occasionally incorporate repeat sequences including transposable elements, and a wide range of other sequences. Besides the known genomic sequences, the insertion sequences also represented DNAs of unclear origins together with ERTBVs, suggesting that ATrs have integrated episomal DNAs that would have been suspended in the nucleus. Such insertion DNAs might be trapped by ATrs in the genome in a host-dependent manner. Conversely, other simple mono- and dinucleotide sequence repeats (SSR) were less frequently involved in insertion events relative to ATrs. Therefore, ATrs could be regarded as hot spots of double-strand breaks that induce non-homologous end joining. The insertions within ATrs occasionally generated new gene-related sequences or involved structural modifications of existing genes. Likewise, in a comparison between Arabidopsis thaliana and Arabidopsis lyrata, the insertions preferred ATrs to other SSRs. Therefore ATrs in plant genomes could be considered as genomic dumping sites that have trapped various DNA molecules and may have exerted a powerful evolutionary force. © 2012 The Authors. The Plant Journal © 2012 Blackwell Publishing Ltd.

  18. Genome sequence of Plasmopara viticola and insight into the pathogenic mechanism

    PubMed Central

    Yin, Ling; An, Yunhe; Qu, Junjie; Li, Xinlong; Zhang, Yali; Dry, Ian; Wu, Huijuan; Lu, Jiang

    2017-01-01

    Plasmopara viticola causes downy mildew disease of grapevine which is one of the most devastating diseases of viticulture worldwide. Here we report a 101.3 Mb whole genome sequence of P. viticola isolate ‘JL-7-2’ obtained by a combination of Illumina and PacBio sequencing technologies. The P. viticola genome contains 17,014 putative protein-coding genes and has ~26% repetitive sequences. A total of 1,301 putative secreted proteins, including 100 putative RXLR effectors and 90 CRN effectors were identified in this genome. In the secretome, 261 potential pathogenicity genes and 95 carbohydrate-active enzymes were predicted. Transcriptional analysis revealed that most of the RXLR effectors, pathogenicity genes and carbohydrate-active enzymes were significantly up-regulated during infection. Comparative genomic analysis revealed that P. viticola evolved independently from the Arabidopsis downy mildew pathogen Hyaloperonospora arabidopsidis. The availability of the P. viticola genome provides a valuable resource not only for comparative genomic analysis and evolutionary studies among oomycetes, but also enhance our knowledge on the mechanism of interactions between this biotrophic pathogen and its host. PMID:28417959

  19. Phylogenetics and Differentiation of Salmonella Newport Lineages by Whole Genome Sequencing

    PubMed Central

    Cao, Guojie; Meng, Jianghong; Strain, Errol; Stones, Robert; Pettengill, James; Zhao, Shaohua; McDermott, Patrick; Brown, Eric; Allard, Marc

    2013-01-01

    Salmonella Newport has ranked in the top three Salmonella serotypes associated with foodborne outbreaks from 1995 to 2011 in the United States. In the current study, we selected 26 S. Newport strains isolated from diverse sources and geographic locations and then conducted 454 shotgun pyrosequencing procedures to obtain 16–24 × coverage of high quality draft genomes for each strain. Comparative genomic analysis of 28 S. Newport strains (including 2 reference genomes) and 15 outgroup genomes identified more than 140,000 informative SNPs. A resulting phylogenetic tree consisted of four sublineages and indicated that S. Newport had a clear geographic structure. Strains from Asia were divergent from those from the Americas. Our findings demonstrated that analysis using whole genome sequencing data resulted in a more accurate picture of phylogeny compared to that using single genes or small sets of genes. We selected loci around the mutS gene of S. Newport to differentiate distinct lineages, including those between invH and mutS genes at the 3′ end of Salmonella Pathogenicity Island 1 (SPI-1), ste fimbrial operon, and Clustered, Regularly Interspaced, Short Palindromic Repeats (CRISPR) associated-proteins (cas). These genes in the outgroup genomes held high similarity with either S. Newport Lineage II or III at the same loci. S. Newport Lineages II and III have different evolutionary histories in this region and our data demonstrated genetic flow and homologous recombination events around mutS. The findings suggested that S. Newport Lineages II and III diverged early in the serotype evolution and have evolved largely independently. Moreover, we identified genes that could delineate sublineages within the phylogenetic tree and that could be used as potential biomarkers for trace-back investigations during outbreaks. Thus, whole genome sequencing data enabled us to better understand the genetic background of pathogenicity and evolutionary history of S. Newport and also provided additional markers for epidemiological response. PMID:23409020

  20. The function and evolution of the Aspergillus genome

    PubMed Central

    Gibbons, John G.; Rokas, Antonis

    2012-01-01

    Species in the filamentous fungal genus Aspergillus display a wide diversity of lifestyles and are of great importance to humans. The decoding of genome sequences from a dozen species that vary widely in their degree of evolutionary affinity has galvanized studies of the function and evolution of the Aspergillus genome in clinical, industrial, and agricultural environments. Here, we synthesize recent key findings that shed light on the architecture of the Aspergillus genome, on the molecular foundations of the genus’ astounding dexterity and diversity in secondary metabolism, and on the genetic underpinnings of virulence in Aspergillus fumigatus, one of the most lethal fungal pathogens. Many of these insights dramatically expand our knowledge of fungal and microbial eukaryote genome evolution and function and argue that Aspergillus constitutes a superb model clade for the study of functional and comparative genomics. PMID:23084572

  1. Single cell genome analysis of an uncultured heterotrophic stramenopile

    NASA Astrophysics Data System (ADS)

    Roy, Rajat S.; Price, Dana C.; Schliep, Alexander; Cai, Guohong; Korobeynikov, Anton; Yoon, Hwan Su; Yang, Eun Chan; Bhattacharya, Debashish

    2014-04-01

    A broad swath of eukaryotic microbial biodiversity cannot be cultivated in the lab and is therefore inaccessible to conventional genome-wide comparative methods. One promising approach to study these lineages is single cell genomics (SCG), whereby an individual cell is captured from nature and genome data are produced from the amplified total DNA. Here we tested the efficacy of SCG to generate a draft genome assembly from a single sample, in this case a cell belonging to the broadly distributed MAST-4 uncultured marine stramenopiles. Using de novo gene prediction, we identified 6,996 protein-encoding genes in the MAST-4 genome. This genetic inventory was sufficient to place the cell within the ToL using multigene phylogenetics and provided preliminary insights into the complex evolutionary history of horizontal gene transfer (HGT) in the MAST-4 lineage.

  2. Complete mitochondrial genome sequences of Atlantic representatives of the invasive Pacific coral species Tubastraea coccinea and T. tagusensis (Scleractinia, Dendrophylliidae): Implications for species identification.

    PubMed

    Capel, K C C; Migotto, A E; Zilberberg, C; Lin, M F; Forsman, Z; Miller, D J; Kitahara, M V

    2016-09-30

    Members of the azooxanthellate coral genus Tubastraea are invasive species with particular concern because they have become established and are fierce competitors in the invaded areas in many parts of the world. Pacific Tubastraea species are spreading fast throughout the Atlantic Ocean, occupying over 95% of the available substrate in some areas and out-competing native endemic species. Approximately half of all known coral species are azooxanthellate but these are seriously under-represented compared to zooxanthellate corals in terms of the availability of mitochondrial (mt) genome data. In the present study, the complete mt DNA sequences of Atlantic individuals of the invasive scleractinian species Tubastraea coccinea and Tubastraea tagusensis were determined and compared to the GenBank reference sequence available for a Pacific "T. coccinea" individual. At 19,094bp (compared to 19,070bp for the GenBank specimen), the mt genomes assembled for the Atlantic T. coccinea and T. tagusensis were among the longest sequence determined to date for "Complex" scleractinians. Comparisons of genomes data showed that the "T. coccinea" sequence deposited on GenBank was more closely related to that from Dendrophyllia arbuscula than to the Atlantic Tubastraea spp., in terms of genome length and base pair similarities. This was confirmed by phylogenetic analysis, suggesting that the former was misidentified and might actually be a member from the genus Dendrophyllia. In addition, although in general the COX1 locus has a slow evolutionary rate in Scleractinia, it was the most variable region of the Tubastraea mt genome and can be used as markers for genus or species identification. Given the limited data available for azooxanthellate corals, the results presented here represent an important contribution to our understanding of phylogenetic relationships and the evolutionary history of the Scleractinia. Copyright © 2016 Elsevier B.V. All rights reserved.

  3. Complete chloroplast genome sequence of a major invasive species, crofton weed (Ageratina adenophora).

    PubMed

    Nie, Xiaojun; Lv, Shuzuo; Zhang, Yingxin; Du, Xianghong; Wang, Le; Biradar, Siddanagouda S; Tan, Xiufang; Wan, Fanghao; Weining, Song

    2012-01-01

    Crofton weed (Ageratina adenophora) is one of the most hazardous invasive plant species, which causes serious economic losses and environmental damages worldwide. However, the sequence resource and genome information of A. adenophora are rather limited, making phylogenetic identification and evolutionary studies very difficult. Here, we report the complete sequence of the A. adenophora chloroplast (cp) genome based on Illumina sequencing. The A. adenophora cp genome is 150, 689 bp in length including a small single-copy (SSC) region of 18, 358 bp and a large single-copy (LSC) region of 84, 815 bp separated by a pair of inverted repeats (IRs) of 23, 755 bp. The genome contains 130 unique genes and 18 duplicated in the IR regions, with the gene content and organization similar to other Asteraceae cp genomes. Comparative analysis identified five DNA regions (ndhD-ccsA, psbI-trnS, ndhF-ycf1, ndhI-ndhG and atpA-trnR) containing parsimony-informative characters higher than 2%, which may be potential informative markers for barcoding and phylogenetic analysis. Repeat structure, codon usage and contraction of the IR were also investigated to reveal the pattern of evolution. Phylogenetic analysis demonstrated a sister relationship between A. adenophora and Guizotia abyssinica and supported a monophyly of the Asterales. We have assembled and analyzed the chloroplast genome of A. adenophora in this study, which was the first sequenced plastome in the Eupatorieae tribe. The complete chloroplast genome information is useful for plant phylogenetic and evolutionary studies within this invasive species and also within the Asteraceae family.

  4. Exploring Pandora's Box: Potential and Pitfalls of Low Coverage Genome Surveys for Evolutionary Biology

    PubMed Central

    Leese, Florian; Mayer, Christoph; Agrawal, Shobhit; Dambach, Johannes; Dietz, Lars; Doemel, Jana S.; Goodall-Copstake, William P.; Held, Christoph; Jackson, Jennifer A.; Lampert, Kathrin P.; Linse, Katrin; Macher, Jan N.; Nolzen, Jennifer; Raupach, Michael J.; Rivera, Nicole T.; Schubart, Christoph D.; Striewski, Sebastian; Tollrian, Ralph; Sands, Chester J.

    2012-01-01

    High throughput sequencing technologies are revolutionizing genetic research. With this “rise of the machines”, genomic sequences can be obtained even for unknown genomes within a short time and for reasonable costs. This has enabled evolutionary biologists studying genetically unexplored species to identify molecular markers or genomic regions of interest (e.g. micro- and minisatellites, mitochondrial and nuclear genes) by sequencing only a fraction of the genome. However, when using such datasets from non-model species, it is possible that DNA from non-target contaminant species such as bacteria, viruses, fungi, or other eukaryotic organisms may complicate the interpretation of the results. In this study we analysed 14 genomic pyrosequencing libraries of aquatic non-model taxa from four major evolutionary lineages. We quantified the amount of suitable micro- and minisatellites, mitochondrial genomes, known nuclear genes and transposable elements and searched for contamination from various sources using bioinformatic approaches. Our results show that in all sequence libraries with estimated coverage of about 0.02–25%, many appropriate micro- and minisatellites, mitochondrial gene sequences and nuclear genes from different KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways could be identified and characterized. These can serve as markers for phylogenetic and population genetic analyses. A central finding of our study is that several genomic libraries suffered from different biases owing to non-target DNA or mobile elements. In particular, viruses, bacteria or eukaryote endosymbionts contributed significantly (up to 10%) to some of the libraries analysed. If not identified as such, genetic markers developed from high-throughput sequencing data for non-model organisms may bias evolutionary studies or fail completely in experimental tests. In conclusion, our study demonstrates the enormous potential of low-coverage genome survey sequences and suggests bioinformatic analysis workflows. The results also advise a more sophisticated filtering for problematic sequences and non-target genome sequences prior to developing markers. PMID:23185309

  5. Divergence time, historical biogeography and evolutionary rate estimation of the order Bangiales (Rhodophyta) inferred from multilocus data

    NASA Astrophysics Data System (ADS)

    Xu, Kuipeng; Tang, Xianghai; Wang, Lu; Yu, Xinzi; Sun, Peipei; Mao, Yunxiang

    2017-08-01

    Bangiales is the only order of the Bangiophyceae and has been suggested to be monophyletic. This order contains approximately 190 species and is distributed worldwide. Previous molecular studies have produced robust phylogenies among the red algae, but the divergence times, historical biogeography and evolutionary rates of Bangiales have rarely been studied. Phylogenetic relationships within the Bangiales were examined using the concatenated gene sets from all available organellar genomes. This analysis has revealed the topology ((( Bangia, Porphyra ) Pyropia ) Wildemania ). Molecular dating indicates that Bangiales diversified approximately 246.40 million years ago (95% highest posterior density (HPD)= 194.78u2013318.24 Ma, posterior probability (PP)=0.99) in the Late Permian and Early Triassic, and that the ancestral species most likely originated from eastern Gondwanaland (currently New Zealand and Australia) and subsequently began to spread and evolve worldwide. Based on pairwise comparisons, we found a slower rate of nucleotide substitutions and lower rates of diversification in Bangiales relative to Florideophyceae. Compared with Viridiplantae (green algae and land plants), the evolutionary rates of Bangiales and other Rhodophyte groups were found to be dramatically faster, by more than 3-fold for plastid genome (ptDNA) and 15-fold for mitochondrial genome (mtDNA). In addition, an average 2.5-fold lower dN/dS was found for the algae than for the land plants, which indicates purifying selection of the algae.

  6. Genetic Architecture of Parallel Pelvic Reduction in Ninespine Sticklebacks

    PubMed Central

    Shikano, Takahito; Laine, Veronika N.; Herczeg, Gábor; Vilkki, Johanna; Merilä, Juha

    2013-01-01

    Teleost fish genomes are known to be evolving faster than those of other vertebrate taxa. Thus, fish are suited to address the extent to which the same vs. different genes are responsible for similar phenotypic changes in rapidly evolving genomes of evolutionary independent lineages. To gain insights into the genetic basis and evolutionary processes behind parallel phenotypic changes within and between species, we identified the genomic regions involved in pelvic reduction in Northern European ninespine sticklebacks (Pungitius pungitius) and compared them to those of North American ninespine and threespine sticklebacks (Gasterosteus aculeatus). To this end, we conducted quantitative trait locus (QTL) mapping using 283 F2 progeny from an interpopulation cross. Phenotypic analyses indicated that pelvic reduction is a recessive trait and is inherited in a simple Mendelian fashion. Significant QTL for pelvic spine and girdle lengths were identified in the region of the Pituitary homeobox transcription factor 1 (Pitx1) gene, also responsible for pelvic reduction in threespine sticklebacks. The fact that no QTL was observed in the region identified in the mapping study of North American ninespine sticklebacks suggests that an alternative QTL for pelvic reduction has emerged in this species within the past 1.6 million years after the split between Northern European and North American populations. In general, our study provides empirical support for the view that alternative genetic mechanisms that lead to similar phenotypes can evolve over short evolutionary time scales. PMID:23979937

  7. Genomic Changes Associated with the Evolutionary Transitions of Nostoc to a Plant Symbiont.

    PubMed

    Warshan, Denis; Liaimer, Anton; Pederson, Eric; Kim, Sea-Yong; Shapiro, Nicole; Woyke, Tanja; Altermark, Bjørn; Pawlowski, Katharina; Weyman, Philip D; Dupont, Christopher L; Rasmussen, Ulla

    2018-05-01

    Cyanobacteria belonging to the genus Nostoc comprise free-living strains and also facultative plant symbionts. Symbiotic strains can enter into symbiosis with taxonomically diverse range of host plants. Little is known about genomic changes associated with evolutionary transition of Nostoc from free-living to plant symbiont. Here, we compared the genomes derived from 11 symbiotic Nostoc strains isolated from different host plants and infer phylogenetic relationships between strains. Phylogenetic reconstructions of 89 Nostocales showed that symbiotic Nostoc strains with a broad host range, entering epiphytic and intracellular or extracellular endophytic interactions, form a monophyletic clade indicating a common evolutionary history. A polyphyletic origin was found for Nostoc strains which enter only extracellular symbioses, and inference of transfer events implied that this trait was likely acquired several times in the evolution of the Nostocales. Symbiotic Nostoc strains showed enriched functions in transport and metabolism of organic sulfur, chemotaxis and motility, as well as the uptake of phosphate, branched-chain amino acids, and ammonium. The genomes of the intracellular clade differ from that of other Nostoc strains, with a gain/enrichment of genes encoding proteins to generate l-methionine from sulfite and pathways for the degradation of the plant metabolites vanillin and vanillate, and of the macromolecule xylan present in plant cell walls. These compounds could function as C-sources for members of the intracellular clade. Molecular clock analysis indicated that the intracellular clade emerged ca. 600 Ma, suggesting that intracellular Nostoc symbioses predate the origin of land plants and the emergence of their extant hosts.

  8. Genomic Changes Associated with the Evolutionary Transitions of Nostoc to a Plant Symbiont

    PubMed Central

    Liaimer, Anton; Pederson, Eric; Kim, Sea-Yong; Shapiro, Nicole; Woyke, Tanja; Altermark, Bjørn; Pawlowski, Katharina; Weyman, Philip D; Dupont, Christopher L

    2018-01-01

    Abstract Cyanobacteria belonging to the genus Nostoc comprise free-living strains and also facultative plant symbionts. Symbiotic strains can enter into symbiosis with taxonomically diverse range of host plants. Little is known about genomic changes associated with evolutionary transition of Nostoc from free-living to plant symbiont. Here, we compared the genomes derived from 11 symbiotic Nostoc strains isolated from different host plants and infer phylogenetic relationships between strains. Phylogenetic reconstructions of 89 Nostocales showed that symbiotic Nostoc strains with a broad host range, entering epiphytic and intracellular or extracellular endophytic interactions, form a monophyletic clade indicating a common evolutionary history. A polyphyletic origin was found for Nostoc strains which enter only extracellular symbioses, and inference of transfer events implied that this trait was likely acquired several times in the evolution of the Nostocales. Symbiotic Nostoc strains showed enriched functions in transport and metabolism of organic sulfur, chemotaxis and motility, as well as the uptake of phosphate, branched-chain amino acids, and ammonium. The genomes of the intracellular clade differ from that of other Nostoc strains, with a gain/enrichment of genes encoding proteins to generate l-methionine from sulfite and pathways for the degradation of the plant metabolites vanillin and vanillate, and of the macromolecule xylan present in plant cell walls. These compounds could function as C-sources for members of the intracellular clade. Molecular clock analysis indicated that the intracellular clade emerged ca. 600 Ma, suggesting that intracellular Nostoc symbioses predate the origin of land plants and the emergence of their extant hosts. PMID:29554291

  9. Complete mitochondrial genome and taxonomic revision of Cardiodactylus muiri Otte, 2007 (Gryllidae: Eneopterinae: Lebinthini).

    PubMed

    Dong, Jiajia; Vicente, Natallia; Chintauan-Marquier, Ioana C; Ramadi, Cahyo; Dettai, Agnès; Robillard, Tony

    2017-05-15

    In the present study, we report the high-coverage complete mitochondrial genome (mitogenome) of the cricket Cardiodactylus muiri Otte, 2007. The mitogenome was sequenced using a long-PCR approach on an Ion Torrent Personal Genome Machine (PGM) for next generation sequencing technology. The total length of the amplified mitogenome is 16,328 bp, representing 13 protein-coding genes, 22 transfer RNA genes, two ribosomal RNA genes and one noncoding region (D-loop region). The new sets of long-PCR primers reported here are invaluable resources for future comparative evolutionary genomic studies in Orthopteran insects. The new mitogenome sequence is compared with published cricket mitogenomes. In the taxonomic part, we present new records for the species and describe life-history traits, habitat and male calling song of the species; based on observation of new material, the species Cardiodactylus buru Gorochov & Robillard, 2014 is synonymized under C. muiri.

  10. Meta genome-wide network from functional linkages of genes in human gut microbial ecosystems.

    PubMed

    Ji, Yan; Shi, Yixiang; Wang, Chuan; Dai, Jianliang; Li, Yixue

    2013-03-01

    The human gut microbial ecosystem (HGME) exerts an important influence on the human health. In recent researches, meta-genomics provided deep insights into the HGME in terms of gene contents, metabolic processes and genome constitutions of meta-genome. Here we present a novel methodology to investigate the HGME on the basis of a set of functionally coupled genes regardless of their genome origins when considering the co-evolution properties of genes. By analyzing these coupled genes, we showed some basic properties of HGME significantly associated with each other, and further constructed a protein interaction map of human gut meta-genome to discover some functional modules that may relate with essential metabolic processes. Compared with other studies, our method provides a new idea to extract basic function elements from meta-genome systems and investigate complex microbial environment by associating its biological traits with co-evolutionary fingerprints encoded in it.

  11. Comparative genome-wide polymorphic microsatellite markers in Antarctic penguins through next generation sequencing

    PubMed Central

    Vianna, Juliana A.; Noll, Daly; Mura-Jornet, Isidora; Valenzuela-Guerra, Paulina; González-Acuña, Daniel; Navarro, Cristell; Loyola, David E.; Dantas, Gisele P. M.

    2017-01-01

    Abstract Microsatellites are valuable molecular markers for evolutionary and ecological studies. Next generation sequencing is responsible for the increasing number of microsatellites for non-model species. Penguins of the Pygoscelis genus are comprised of three species: Adélie (P. adeliae), Chinstrap (P. antarcticus) and Gentoo penguin (P. papua), all distributed around Antarctica and the sub-Antarctic. The species have been affected differently by climate change, and the use of microsatellite markers will be crucial to monitor population dynamics. We characterized a large set of genome-wide microsatellites and evaluated polymorphisms in all three species. SOLiD reads were generated from the libraries of each species, identifying a large amount of microsatellite loci: 33,677, 35,265 and 42,057 for P. adeliae, P. antarcticus and P. papua, respectively. A large number of dinucleotide (66,139), trinucleotide (29,490) and tetranucleotide (11,849) microsatellites are described. Microsatellite abundance, diversity and orthology were characterized in penguin genomes. We evaluated polymorphisms in 170 tetranucleotide loci, obtaining 34 polymorphic loci in at least one species and 15 polymorphic loci in all three species, which allow to perform comparative studies. Polymorphic markers presented here enable a number of ecological, population, individual identification, parentage and evolutionary studies of Pygoscelis, with potential use in other penguin species. PMID:28898354

  12. Metabolism and evolution: A comparative study of reconstructed genome-level metabolic networks

    NASA Astrophysics Data System (ADS)

    Almaas, Eivind

    2008-03-01

    The availability of high-quality annotations of sequenced genomes has made it possible to generate organism-specific comprehensive maps of cellular metabolism. Currently, more than twenty such metabolic reconstructions are publicly available, with the majority focused on bacteria. A typical metabolic reconstruction for a bacterium results in a complex network containing hundreds of metabolites (nodes) and reactions (links), while some even contain more than a thousand. The constrain-based optimization approach of flux-balance analysis (FBA) is used to investigate the functional characteristics of such large-scale metabolic networks, making it possible to estimate an organism's growth behavior in a wide variety of nutrient environments, as well as its robustness to gene loss. We have recently completed the genome-level metabolic reconstruction of Yersinia pseudotuberculosis, as well as the three Yersinia pestis biovars Antiqua, Mediaevalis, and Orientalis. While Y. pseudotuberculosis typically only causes fever and abdominal pain that can mimic appendicitis, the evolutionary closely related Y. pestis strains are the aetiological agents of the bubonic plague. In this presentation, I will discuss our results and conclusions from a comparative study on the evolution of metabolic function in the four Yersiniae networks using FBA and related techniques, and I will give particular focus to the interplay between metabolic network topology and evolutionary flexibility.

  13. Analyses of Evolutionary Characteristics of the Hemagglutinin-Esterase Gene of Influenza C Virus during a Period of 68 Years Reveals Evolutionary Patterns Different from Influenza A and B Viruses.

    PubMed

    Furuse, Yuki; Matsuzaki, Yoko; Nishimura, Hidekazu; Oshitani, Hitoshi

    2016-11-26

    Infections with the influenza C virus causing respiratory symptoms are common, particularly among children. Since isolation and detection of the virus are rarely performed, compared with influenza A and B viruses, the small number of available sequences of the virus makes it difficult to analyze its evolutionary dynamics. Recently, we reported the full genome sequence of 102 strains of the virus. Here, we exploited the data to elucidate the evolutionary characteristics and phylodynamics of the virus compared with influenza A and B viruses. Along with our data, we obtained public sequence data of the hemagglutinin-esterase gene of the virus; the dataset consists of 218 unique sequences of the virus collected from 14 countries between 1947 and 2014. Informatics analyses revealed that (1) multiple lineages have been circulating globally; (2) there have been weak and infrequent selective bottlenecks; (3) the evolutionary rate is low because of weak positive selection and a low capability to induce mutations; and (4) there is no significant positive selection although a few mutations affecting its antigenicity have been induced. The unique evolutionary dynamics of the influenza C virus must be shaped by multiple factors, including virological, immunological, and epidemiological characteristics.

  14. Analyses of Evolutionary Characteristics of the Hemagglutinin-Esterase Gene of Influenza C Virus during a Period of 68 Years Reveals Evolutionary Patterns Different from Influenza A and B Viruses

    PubMed Central

    Furuse, Yuki; Matsuzaki, Yoko; Nishimura, Hidekazu; Oshitani, Hitoshi

    2016-01-01

    Infections with the influenza C virus causing respiratory symptoms are common, particularly among children. Since isolation and detection of the virus are rarely performed, compared with influenza A and B viruses, the small number of available sequences of the virus makes it difficult to analyze its evolutionary dynamics. Recently, we reported the full genome sequence of 102 strains of the virus. Here, we exploited the data to elucidate the evolutionary characteristics and phylodynamics of the virus compared with influenza A and B viruses. Along with our data, we obtained public sequence data of the hemagglutinin-esterase gene of the virus; the dataset consists of 218 unique sequences of the virus collected from 14 countries between 1947 and 2014. Informatics analyses revealed that (1) multiple lineages have been circulating globally; (2) there have been weak and infrequent selective bottlenecks; (3) the evolutionary rate is low because of weak positive selection and a low capability to induce mutations; and (4) there is no significant positive selection although a few mutations affecting its antigenicity have been induced. The unique evolutionary dynamics of the influenza C virus must be shaped by multiple factors, including virological, immunological, and epidemiological characteristics. PMID:27898037

  15. GreenPhylDB v2.0: comparative and functional genomics in plants.

    PubMed

    Rouard, Mathieu; Guignon, Valentin; Aluome, Christelle; Laporte, Marie-Angélique; Droc, Gaëtan; Walde, Christian; Zmasek, Christian M; Périn, Christophe; Conte, Matthieu G

    2011-01-01

    GreenPhylDB is a database designed for comparative and functional genomics based on complete genomes. Version 2 now contains sixteen full genomes of members of the plantae kingdom, ranging from algae to angiosperms, automatically clustered into gene families. Gene families are manually annotated and then analyzed phylogenetically in order to elucidate orthologous and paralogous relationships. The database offers various lists of gene families including plant, phylum and species specific gene families. For each gene cluster or gene family, easy access to gene composition, protein domains, publications, external links and orthologous gene predictions is provided. Web interfaces have been further developed to improve the navigation through information related to gene families. New analysis tools are also available, such as a gene family ontology browser that facilitates exploration. GreenPhylDB is a component of the South Green Bioinformatics Platform (http://southgreen.cirad.fr/) and is accessible at http://greenphyl.cirad.fr. It enables comparative genomics in a broad taxonomy context to enhance the understanding of evolutionary processes and thus tends to speed up gene discovery.

  16. Viruses and mobile elements as drivers of evolutionary transitions

    PubMed Central

    2016-01-01

    The history of life is punctuated by evolutionary transitions which engender emergence of new levels of biological organization that involves selection acting at increasingly complex ensembles of biological entities. Major evolutionary transitions include the origin of prokaryotic and then eukaryotic cells, multicellular organisms and eusocial animals. All or nearly all cellular life forms are hosts to diverse selfish genetic elements with various levels of autonomy including plasmids, transposons and viruses. I present evidence that, at least up to and including the origin of multicellularity, evolutionary transitions are driven by the coevolution of hosts with these genetic parasites along with sharing of ‘public goods’. Selfish elements drive evolutionary transitions at two distinct levels. First, mathematical modelling of evolutionary processes, such as evolution of primitive replicator populations or unicellular organisms, indicates that only increasing organizational complexity, e.g. emergence of multicellular aggregates, can prevent the collapse of the host–parasite system under the pressure of parasites. Second, comparative genomic analysis reveals numerous cases of recruitment of genes with essential functions in cellular life forms, including those that enable evolutionary transitions. This article is part of the themed issue ‘The major synthetic evolutionary transitions’. PMID:27431520

  17. Viruses and mobile elements as drivers of evolutionary transitions.

    PubMed

    Koonin, Eugene V

    2016-08-19

    The history of life is punctuated by evolutionary transitions which engender emergence of new levels of biological organization that involves selection acting at increasingly complex ensembles of biological entities. Major evolutionary transitions include the origin of prokaryotic and then eukaryotic cells, multicellular organisms and eusocial animals. All or nearly all cellular life forms are hosts to diverse selfish genetic elements with various levels of autonomy including plasmids, transposons and viruses. I present evidence that, at least up to and including the origin of multicellularity, evolutionary transitions are driven by the coevolution of hosts with these genetic parasites along with sharing of 'public goods'. Selfish elements drive evolutionary transitions at two distinct levels. First, mathematical modelling of evolutionary processes, such as evolution of primitive replicator populations or unicellular organisms, indicates that only increasing organizational complexity, e.g. emergence of multicellular aggregates, can prevent the collapse of the host-parasite system under the pressure of parasites. Second, comparative genomic analysis reveals numerous cases of recruitment of genes with essential functions in cellular life forms, including those that enable evolutionary transitions.This article is part of the themed issue 'The major synthetic evolutionary transitions'. © 2016 The Authors.

  18. Genome-wide comparisons of phylogenetic similarities between partial genomic regions and the full-length genome in Hepatitis E virus genotyping.

    PubMed

    Wang, Shuai; Wei, Wei; Luo, Xuenong; Cai, Xuepeng

    2014-01-01

    Besides the complete genome, different partial genomic sequences of Hepatitis E virus (HEV) have been used in genotyping studies, making it difficult to compare the results based on them. No commonly agreed partial region for HEV genotyping has been determined. In this study, we used a statistical method to evaluate the phylogenetic performance of each partial genomic sequence from a genome wide, by comparisons of evolutionary distances between genomic regions and the full-length genomes of 101 HEV isolates to identify short genomic regions that can reproduce HEV genotype assignments based on full-length genomes. Several genomic regions, especially one genomic region at the 3'-terminal of the papain-like cysteine protease domain, were detected to have relatively high phylogenetic correlations with the full-length genome. Phylogenetic analyses confirmed the identical performances between these regions and the full-length genome in genotyping, in which the HEV isolates involved could be divided into reasonable genotypes. This analysis may be of value in developing a partial sequence-based consensus classification of HEV species.

  19. Lineage-specific evolution of the vertebrate Otopetrin gene family revealed by comparative genomic analyses

    PubMed Central

    2011-01-01

    Background Mutations in the Otopetrin 1 gene (Otop1) in mice and fish produce an unusual bilateral vestibular pathology that involves the absence of otoconia without hearing impairment. The encoded protein, Otop1, is the only functionally characterized member of the Otopetrin Domain Protein (ODP) family; the extended sequence and structural preservation of ODP proteins in metazoans suggest a conserved functional role. Here, we use the tools of sequence- and cytogenetic-based comparative genomics to study the Otop1 and the Otop2-Otop3 genes and to establish their genomic context in 25 vertebrates. We extend our evolutionary study to include the gene mutated in Usher syndrome (USH) subtype 1G (Ush1g), both because of the head-to-tail clustering of Ush1g with Otop2 and because Otop1 and Ush1g mutations result in inner ear phenotypes. Results We established that OTOP1 is the boundary gene of an inversion polymorphism on human chromosome 4p16 that originated in the common human-chimpanzee lineage more than 6 million years ago. Other lineage-specific evolutionary events included a three-fold expansion of the Otop genes in Xenopus tropicalis and of Ush1g in teleostei fish. The tight physical linkage between Otop2 and Ush1g is conserved in all vertebrates. To further understand the functional organization of the Ushg1-Otop2 locus, we deduced a putative map of binding sites for CCCTC-binding factor (CTCF), a mammalian insulator transcription factor, from genome-wide chromatin immunoprecipitation-sequencing (ChIP-seq) data in mouse and human embryonic stem (ES) cells combined with detection of CTCF-binding motifs. Conclusions The results presented here clarify the evolutionary history of the vertebrate Otop and Ush1g families, and establish a framework for studying the possible interaction(s) of Ush1g and Otop in developmental pathways. PMID:21261979

  20. Consistency of gene starts among Burkholderia genomes

    PubMed Central

    2011-01-01

    Background Evolutionary divergence in the position of the translational start site among orthologous genes can have significant functional impacts. Divergence can alter the translation rate, degradation rate, subcellular location, and function of the encoded proteins. Results Existing Genbank gene maps for Burkholderia genomes suggest that extensive divergence has occurred--53% of ortholog sets based on Genbank gene maps had inconsistent gene start sites. However, most of these inconsistencies appear to be gene-calling errors. Evolutionary divergence was the most plausible explanation for only 17% of the ortholog sets. Correcting probable errors in the Genbank gene maps decreased the percentage of ortholog sets with inconsistent starts by 68%, increased the percentage of ortholog sets with extractable upstream intergenic regions by 32%, increased the sequence similarity of intergenic regions and predicted proteins, and increased the number of proteins with identifiable signal peptides. Conclusions Our findings highlight an emerging problem in comparative genomics: single-digit percent errors in gene predictions can lead to double-digit percentages of inconsistent ortholog sets. The work demonstrates a simple approach to evaluate and improve the quality of gene maps. PMID:21342528

  1. Elucidating the triplicated ancestral genome structure of radish based on chromosome-level comparison with the Brassica genomes.

    PubMed

    Jeong, Young-Min; Kim, Namshin; Ahn, Byung Ohg; Oh, Mijin; Chung, Won-Hyong; Chung, Hee; Jeong, Seongmun; Lim, Ki-Byung; Hwang, Yoon-Jung; Kim, Goon-Bo; Baek, Seunghoon; Choi, Sang-Bong; Hyung, Dae-Jin; Lee, Seung-Won; Sohn, Seong-Han; Kwon, Soo-Jin; Jin, Mina; Seol, Young-Joo; Chae, Won Byoung; Choi, Keun Jin; Park, Beom-Seok; Yu, Hee-Ju; Mun, Jeong-Hwan

    2016-07-01

    This study presents a chromosome-scale draft genome sequence of radish that is assembled into nine chromosomal pseudomolecules. A comprehensive comparative genome analysis with the Brassica genomes provides genomic evidences on the evolution of the mesohexaploid radish genome. Radish (Raphanus sativus L.) is an agronomically important root vegetable crop and its origin and phylogenetic position in the tribe Brassiceae is controversial. Here we present a comprehensive analysis of the radish genome based on the chromosome sequences of R. sativus cv. WK10039. The radish genome was sequenced and assembled into 426.2 Mb spanning >98 % of the gene space, of which 344.0 Mb were integrated into nine chromosome pseudomolecules. Approximately 36 % of the genome was repetitive sequences and 46,514 protein-coding genes were predicted and annotated. Comparative mapping of the tPCK-like ancestral genome revealed that the radish genome has intermediate characteristics between the Brassica A/C and B genomes in the triplicated segments, suggesting an internal origin from the genus Brassica. The evolutionary characteristics shared between radish and other Brassica species provided genomic evidences that the current form of nine chromosomes in radish was rearranged from the chromosomes of hexaploid progenitor. Overall, this study provides a chromosome-scale draft genome sequence of radish as well as novel insight into evolution of the mesohexaploid genomes in the tribe Brassiceae.

  2. Pan-Genome Analysis Links the Hereditary Variation of Leptospirillum ferriphilum With Its Evolutionary Adaptation

    PubMed Central

    Zhang, Xian; Liu, Xueduan; Yang, Fei; Chen, Lv

    2018-01-01

    Niche adaptation has long been recognized to drive intra-species differentiation and speciation, yet knowledge about its relatedness with hereditary variation of microbial genomes is relatively limited. Using Leptospirillum ferriphilum species as a case study, we present a detailed analysis of genomic features of five recognized strains. Genome-to-genome distance calculation preliminarily determined the roles of spatial distance and environmental heterogeneity that potentially contribute to intra-species variation within L. ferriphilum species at the genome level. Mathematical models were further constructed to extrapolate the expansion of L. ferriphilum genomes (an ‘open’ pan-genome), indicating the emergence of novel genes with new sequenced genomes. The identification of diverse mobile genetic elements (MGEs) (such as transposases, integrases, and phage-associated genes) revealed the prevalence of horizontal gene transfer events, which is an important evolutionary mechanism that provides avenues for the recruitment of novel functionalities and further for the genetic divergence of microbial genomes. Comprehensive analysis also demonstrated that the genome reduction by gene loss in a broad sense might contribute to the observed diversification. We thus inferred a plausible explanation to address this observation: the community-dependent adaptation that potentially economizes the limiting resources of the entire community. Now that the introduction of new genes is accompanied by a parallel abandonment of some other ones, our results provide snapshots on the biological fitness cost of environmental adaptation within the L. ferriphilum genomes. In short, our genome-wide analyses bridge the relation between genetic variation of L. ferriphilum with its evolutionary adaptation. PMID:29636744

  3. Prasinoviruses reveal a complex evolutionary history and a patchy environmental distribution

    NASA Astrophysics Data System (ADS)

    Finke, J. F.; Suttle, C.

    2016-02-01

    Prasinophytes constitute a group of eukaryotic phytoplankton that has a global distribution and is a major component of coastal and oceanic communities. Members of this group are infected by large double-stranded DNA viruses that can be significant agents of mortality, and which show evidence of substantial horizontal transfer of genes from their hosts and other organisms. However, information on the genetic diversity of these viruses and their environmental distribution is limited. This study examines the genetic repertoire, phylogeny and environmental distribution of large double-stranded DNA viruses infecting Micromonas pusilla and other prasinophytes. The genomes of viruses infecting M. pusilla were sequenced and compared to those of viruses infecting other prasinophytes, revealing a relatively small set of core genes and a larger flexible pan genome. Comparing genomes among prasinoviruses highlights their variable genetic content and complex evolutionary history. While some of the pan genome is clearly host derived, many open reading frames are most similar to those found in other eukaryotes and bacteria. Gene content of the viruses is is congruent with phylogenetic analysis of viral DNA polymerase sequences and indicates that two clades of M. pusilla viruses are less related to each other than to other prasinoviruses. Moreover, the environmental distribution of prasinovirus DNA polymerase sequences indicates a complex pattern of virus-host interactions in nature. Ultimately, these patterns are influenced by the genetic repertoire encoded by prasinoviruses, and the distribution of the hosts they infect.

  4. The origin and evolution of Basigin(BSG) gene: A comparative genomic and phylogenetic analysis.

    PubMed

    Zhu, Xinyan; Wang, Shenglan; Shao, Mingjie; Yan, Jie; Liu, Fei

    2017-07-01

    Basigin (BSG), also known as extracellular matrix metalloproteinase inducer (EMMPRIN) or cluster of differentiation 147 (CD147), plays various fundamental roles in the intercellular recognition involved in immunologic phenomena, differentiation, and development. In this study, we aimed to compare the similarities and differences of BSG among organisms and explore possible evolutionary relationships based on the comparison result. We used the extensive BLAST tool to search the metazoan genomes, N-glycosylation sites, the transmembrane region and other functional sites. We then identified BSG homologs from genomic sequences and analyzed their phylogenetic relationships. We identified that BSG genes exist not only in the vertebrate metazoans but also in the invertebrate metazoans such as Amphioxus B. floridae, D. melanogaster, A. mellifera, S. japonicum, C. gigas, and T. patagoniensis. After sequence analysis, we confirmed that only vertebrate metazoans and Cephalochordate (amphioxus B. floridae) have the classic structure (a signal peptide, two Ig-like domains (IgC2 and IgI), a transmembrane region, and an intracellular domain). The invertebrate metazoans (excluding amphioxus B. floridae) lack the N-terminal signal peptides and IgC2 domain. We then generated a phylogenetic tree, genome organization comparison, and chromosomal disposition analysis based on the biological information obtained from the NCBI and Ensembl databases. Finally, we established the possible evolutionary scenario of the BSG gene, which showed the restricted exon rearrangement that has occurred during evolution, forming the present-day BSG gene. Copyright © 2017 Elsevier Ltd. All rights reserved.

  5. Genetic Variation and Adaptation in Africa: Implications for Human Evolution and Disease

    PubMed Central

    Gomez, Felicia; Hirbo, Jibril; Tishkoff, Sarah A.

    2014-01-01

    Because modern humans originated in Africa and have adapted to diverse environments, African populations have high levels of genetic and phenotypic diversity. Thus, genomic studies of diverse African ethnic groups are essential for understanding human evolutionary history and how this leads to differential disease risk in all humans. Comparative studies of genetic diversity within and between African ethnic groups creates an opportunity to reconstruct some of the earliest events in human population history and are useful for identifying patterns of genetic variation that have been influenced by recent natural selection. Here we describe what is currently known about genetic variation and evolutionary history of diverse African ethnic groups. We also describe examples of recent natural selection in African genomes and how these data are informative for understanding the frequency of many genetic traits, including those that cause disease susceptibility in African populations and populations of recent African descent. PMID:24984772

  6. Genomic Features of the Damselfly Calopteryx splendens Representing a Sister Clade to Most Insect Orders

    PubMed Central

    Ioannidis, Panagiotis; Simao, Felipe A.; Waterhouse, Robert M.; Manni, Mosè; Seppey, Mathieu; Robertson, Hugh M.; Misof, Bernhard; Niehuis, Oliver

    2017-01-01

    Insects comprise the most diverse and successful animal group with over one million described species that are found in almost every terrestrial and limnic habitat, with many being used as important models in genetics, ecology, and evolutionary research. Genome sequencing projects have greatly expanded the sampling of species from many insect orders, but genomic resources for species of certain insect lineages have remained relatively limited to date. To address this paucity, we sequenced the genome of the banded demoiselle, Calopteryx splendens, a damselfly (Odonata: Zygoptera) belonging to Palaeoptera, the clade containing the first winged insects. The 1.6 Gbp C. splendens draft genome assembly is one of the largest insect genomes sequenced to date and encodes a predicted set of 22,523 protein-coding genes. Comparative genomic analyses with other sequenced insects identified a relatively small repertoire of C. splendens detoxification genes, which could explain its previously noted sensitivity to habitat pollution. Intriguingly, this repertoire includes a cytochrome P450 gene not previously described in any insect genome. The C. splendens immune gene repertoire appears relatively complete and features several genes encoding novel multi-domain peptidoglycan recognition proteins. Analysis of chemosensory genes revealed the presence of both gustatory and ionotropic receptors, as well as the insect odorant receptor coreceptor gene (OrCo) and at least four partner odorant receptors (ORs). This represents the oldest known instance of a complete OrCo/OR system in insects, and provides the molecular underpinning for odonate olfaction. The C. splendens genome improves the sampling of insect lineages that diverged before the radiation of Holometabola and offers new opportunities for molecular-level evolutionary, ecological, and behavioral studies. PMID:28137743

  7. Comparative analysis of gene regulatory networks: from network reconstruction to evolution.

    PubMed

    Thompson, Dawn; Regev, Aviv; Roy, Sushmita

    2015-01-01

    Regulation of gene expression is central to many biological processes. Although reconstruction of regulatory circuits from genomic data alone is therefore desirable, this remains a major computational challenge. Comparative approaches that examine the conservation and divergence of circuits and their components across strains and species can help reconstruct circuits as well as provide insights into the evolution of gene regulatory processes and their adaptive contribution. In recent years, advances in genomic and computational tools have led to a wealth of methods for such analysis at the sequence, expression, pathway, module, and entire network level. Here, we review computational methods developed to study transcriptional regulatory networks using comparative genomics, from sequence to functional data. We highlight how these methods use evolutionary conservation and divergence to reliably detect regulatory components as well as estimate the extent and rate of divergence. Finally, we discuss the promise and open challenges in linking regulatory divergence to phenotypic divergence and adaptation.

  8. Insights on the evolution of metabolic networks of unicellular translationally biased organisms from transcriptomic data and sequence analysis.

    PubMed

    Carbone, Alessandra; Madden, Richard

    2005-10-01

    Codon bias is related to metabolic functions in translationally biased organisms, and two facts are argued about. First, genes with high codon bias describe in meaningful ways the metabolic characteristics of the organism; important metabolic pathways corresponding to crucial characteristics of the lifestyle of an organism, such as photosynthesis, nitrification, anaerobic versus aerobic respiration, sulfate reduction, methanogenesis, and others, happen to involve especially biased genes. Second, gene transcriptional levels of sets of experiments representing a significant variation of biological conditions strikingly confirm, in the case of Saccharomyces cerevisiae, that metabolic preferences are detectable by purely statistical analysis: the high metabolic activity of yeast during fermentation is encoded in the high bias of enzymes involved in the associated pathways, suggesting that this genome was affected by a strong evolutionary pressure that favored a predominantly fermentative metabolism of yeast in the wild. The ensemble of metabolic pathways involving enzymes with high codon bias is rather well defined and remains consistent across many species, even those that have not been considered as translationally biased, such as Helicobacter pylori, for instance, reveal some weak form of translational bias for this genome. We provide numerical evidence, supported by experimental data, of these facts and conclude that the metabolic networks of translationally biased genomes, observable today as projections of eons of evolutionary pressure, can be analyzed numerically and predictions of the role of specific pathways during evolution can be derived. The new concepts of Comparative Pathway Index, used to compare organisms with respect to their metabolic networks, and Evolutionary Pathway Index, used to detect evolutionarily meaningful bias in the genetic code from transcriptional data, are introduced.

  9. Gene loss, adaptive evolution and the co-evolution of plumage coloration genes with opsins in birds.

    PubMed

    Borges, Rui; Khan, Imran; Johnson, Warren E; Gilbert, M Thomas P; Zhang, Guojie; Jarvis, Erich D; O'Brien, Stephen J; Antunes, Agostinho

    2015-10-06

    The wide range of complex photic systems observed in birds exemplifies one of their key evolutionary adaptions, a well-developed visual system. However, genomic approaches have yet to be used to disentangle the evolutionary mechanisms that govern evolution of avian visual systems. We performed comparative genomic analyses across 48 avian genomes that span extant bird phylogenetic diversity to assess evolutionary changes in the 17 representatives of the opsin gene family and five plumage coloration genes. Our analyses suggest modern birds have maintained a repertoire of up to 15 opsins. Synteny analyses indicate that PARA and PARIE pineal opsins were lost, probably in conjunction with the degeneration of the parietal organ. Eleven of the 15 avian opsins evolved in a non-neutral pattern, confirming the adaptive importance of vision in birds. Visual conopsins sw1, sw2 and lw evolved under negative selection, while the dim-light RH1 photopigment diversified. The evolutionary patterns of sw1 and of violet/ultraviolet sensitivity in birds suggest that avian ancestors had violet-sensitive vision. Additionally, we demonstrate an adaptive association between the RH2 opsin and the MC1R plumage color gene, suggesting that plumage coloration has been photic mediated. At the intra-avian level we observed some unique adaptive patterns. For example, barn owl showed early signs of pseudogenization in RH2, perhaps in response to nocturnal behavior, and penguins had amino acid deletions in RH2 sites responsible for the red shift and retinal binding. These patterns in the barn owl and penguins were convergent with adaptive strategies in nocturnal and aquatic mammals, respectively. We conclude that birds have evolved diverse opsin adaptations through gene loss, adaptive selection and coevolution with plumage coloration, and that differentiated selective patterns at the species level suggest novel photic pressures to influence evolutionary patterns of more-recent lineages.

  10. Adaptive introgression from distant Caribbean islands contributed to the diversification of a microendemic adaptive radiation of trophic specialist pupfishes

    PubMed Central

    2017-01-01

    Rapid diversification often involves complex histories of gene flow that leave variable and conflicting signatures of evolutionary relatedness across the genome. Identifying the extent and source of variation in these evolutionary relationships can provide insight into the evolutionary mechanisms involved in rapid radiations. Here we compare the discordant evolutionary relationships associated with species phenotypes across 42 whole genomes from a sympatric adaptive radiation of Cyprinodon pupfishes endemic to San Salvador Island, Bahamas and several outgroup pupfish species in order to understand the rarity of these trophic specialists within the larger radiation of Cyprinodon. 82% of the genome depicts close evolutionary relationships among the San Salvador Island species reflecting their geographic proximity, but the vast majority of variants fixed between specialist species lie in regions with discordant topologies. Top candidate adaptive introgression regions include signatures of selective sweeps and adaptive introgression of genetic variation from a single population in the northwestern Bahamas into each of the specialist species. Hard selective sweeps of genetic variation on San Salvador Island contributed 5 times more to speciation of trophic specialists than adaptive introgression of Caribbean genetic variation; however, four of the 11 introgressed regions came from a single distant island and were associated with the primary axis of oral jaw divergence within the radiation. For example, standing variation in a proto-oncogene (ski) known to have effects on jaw size introgressed into one San Salvador Island specialist from an island 300 km away approximately 10 kya. The complex emerging picture of the origins of adaptive radiation on San Salvador Island indicates that multiple sources of genetic variation contributed to the adaptive phenotypes of novel trophic specialists on the island. Our findings suggest that a suite of factors, including rare adaptive introgression, may be necessary for adaptive radiation in addition to ecological opportunity. PMID:28796803

  11. gmos: Rapid Detection of Genome Mosaicism over Short Evolutionary Distances.

    PubMed

    Domazet-Lošo, Mirjana; Domazet-Lošo, Tomislav

    2016-01-01

    Prokaryotic and viral genomes are often altered by recombination and horizontal gene transfer. The existing methods for detecting recombination are primarily aimed at viral genomes or sets of loci, since the expensive computation of underlying statistical models often hinders the comparison of complete prokaryotic genomes. As an alternative, alignment-free solutions are more efficient, but cannot map (align) a query to subject genomes. To address this problem, we have developed gmos (Genome MOsaic Structure), a new program that determines the mosaic structure of query genomes when compared to a set of closely related subject genomes. The program first computes local alignments between query and subject genomes and then reconstructs the query mosaic structure by choosing the best local alignment for each query region. To accomplish the analysis quickly, the program mostly relies on pairwise alignments and constructs multiple sequence alignments over short overlapping subject regions only when necessary. This fine-tuned implementation achieves an efficiency comparable to an alignment-free tool. The program performs well for simulated and real data sets of closely related genomes and can be used for fast recombination detection; for instance, when a new prokaryotic pathogen is discovered. As an example, gmos was used to detect genome mosaicism in a pathogenic Enterococcus faecium strain compared to seven closely related genomes. The analysis took less than two minutes on a single 2.1 GHz processor. The output is available in fasta format and can be visualized using an accessory program, gmosDraw (freely available with gmos).

  12. gmos: Rapid Detection of Genome Mosaicism over Short Evolutionary Distances

    PubMed Central

    Domazet-Lošo, Mirjana; Domazet-Lošo, Tomislav

    2016-01-01

    Prokaryotic and viral genomes are often altered by recombination and horizontal gene transfer. The existing methods for detecting recombination are primarily aimed at viral genomes or sets of loci, since the expensive computation of underlying statistical models often hinders the comparison of complete prokaryotic genomes. As an alternative, alignment-free solutions are more efficient, but cannot map (align) a query to subject genomes. To address this problem, we have developed gmos (Genome MOsaic Structure), a new program that determines the mosaic structure of query genomes when compared to a set of closely related subject genomes. The program first computes local alignments between query and subject genomes and then reconstructs the query mosaic structure by choosing the best local alignment for each query region. To accomplish the analysis quickly, the program mostly relies on pairwise alignments and constructs multiple sequence alignments over short overlapping subject regions only when necessary. This fine-tuned implementation achieves an efficiency comparable to an alignment-free tool. The program performs well for simulated and real data sets of closely related genomes and can be used for fast recombination detection; for instance, when a new prokaryotic pathogen is discovered. As an example, gmos was used to detect genome mosaicism in a pathogenic Enterococcus faecium strain compared to seven closely related genomes. The analysis took less than two minutes on a single 2.1 GHz processor. The output is available in fasta format and can be visualized using an accessory program, gmosDraw (freely available with gmos). PMID:27846272

  13. Mouse regulatory DNA landscapes reveal global principles of cis-regulatory evolution.

    PubMed

    Vierstra, Jeff; Rynes, Eric; Sandstrom, Richard; Zhang, Miaohua; Canfield, Theresa; Hansen, R Scott; Stehling-Sun, Sandra; Sabo, Peter J; Byron, Rachel; Humbert, Richard; Thurman, Robert E; Johnson, Audra K; Vong, Shinny; Lee, Kristen; Bates, Daniel; Neri, Fidencio; Diegel, Morgan; Giste, Erika; Haugen, Eric; Dunn, Douglas; Wilken, Matthew S; Josefowicz, Steven; Samstein, Robert; Chang, Kai-Hsin; Eichler, Evan E; De Bruijn, Marella; Reh, Thomas A; Skoultchi, Arthur; Rudensky, Alexander; Orkin, Stuart H; Papayannopoulou, Thalia; Treuting, Piper M; Selleri, Licia; Kaul, Rajinder; Groudine, Mark; Bender, M A; Stamatoyannopoulos, John A

    2014-11-21

    To study the evolutionary dynamics of regulatory DNA, we mapped >1.3 million deoxyribonuclease I-hypersensitive sites (DHSs) in 45 mouse cell and tissue types, and systematically compared these with human DHS maps from orthologous compartments. We found that the mouse and human genomes have undergone extensive cis-regulatory rewiring that combines branch-specific evolutionary innovation and loss with widespread repurposing of conserved DHSs to alternative cell fates, and that this process is mediated by turnover of transcription factor (TF) recognition elements. Despite pervasive evolutionary remodeling of the location and content of individual cis-regulatory regions, within orthologous mouse and human cell types the global fraction of regulatory DNA bases encoding recognition sites for each TF has been strictly conserved. Our findings provide new insights into the evolutionary forces shaping mammalian regulatory DNA landscapes. Copyright © 2014, American Association for the Advancement of Science.

  14. Determination of evolutionary relationships of outbreak-associated Listeria monocytogenes strains of serotypes 1/2a and 1/2b by whole-genome sequencing

    USDA-ARS?s Scientific Manuscript database

    We used whole-genome sequencing to determine evolutionary relationships among 20 outbreak-associated clinical isolates of Listeria monocytogenes serotypes 1/2a and 1/2b. Isolates from 6 of 11 outbreaks fell outside the clonal groups or “epidemic clones” that have been previously associated with outb...

  15. Tools for Accurate and Efficient Analysis of Complex Evolutionary Mechanisms in Microbial Genomes. Final Report

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nakhleh, Luay

    I proposed to develop computationally efficient tools for accurate detection and reconstruction of microbes' complex evolutionary mechanisms, thus enabling rapid and accurate annotation, analysis and understanding of their genomes. To achieve this goal, I proposed to address three aspects. (1) Mathematical modeling. A major challenge facing the accurate detection of HGT is that of distinguishing between these two events on the one hand and other events that have similar "effects." I proposed to develop a novel mathematical approach for distinguishing among these events. Further, I proposed to develop a set of novel optimization criteria for the evolutionary analysis of microbialmore » genomes in the presence of these complex evolutionary events. (2) Algorithm design. In this aspect of the project, I proposed to develop an array of e cient and accurate algorithms for analyzing microbial genomes based on the formulated optimization criteria. Further, I proposed to test the viability of the criteria and the accuracy of the algorithms in an experimental setting using both synthetic as well as biological data. (3) Software development. I proposed the nal outcome to be a suite of software tools which implements the mathematical models as well as the algorithms developed.« less

  16. The house spider genome reveals an ancient whole-genome duplication during arachnid evolution.

    PubMed

    Schwager, Evelyn E; Sharma, Prashant P; Clarke, Thomas; Leite, Daniel J; Wierschin, Torsten; Pechmann, Matthias; Akiyama-Oda, Yasuko; Esposito, Lauren; Bechsgaard, Jesper; Bilde, Trine; Buffry, Alexandra D; Chao, Hsu; Dinh, Huyen; Doddapaneni, HarshaVardhan; Dugan, Shannon; Eibner, Cornelius; Extavour, Cassandra G; Funch, Peter; Garb, Jessica; Gonzalez, Luis B; Gonzalez, Vanessa L; Griffiths-Jones, Sam; Han, Yi; Hayashi, Cheryl; Hilbrant, Maarten; Hughes, Daniel S T; Janssen, Ralf; Lee, Sandra L; Maeso, Ignacio; Murali, Shwetha C; Muzny, Donna M; Nunes da Fonseca, Rodrigo; Paese, Christian L B; Qu, Jiaxin; Ronshaugen, Matthew; Schomburg, Christoph; Schönauer, Anna; Stollewerk, Angelika; Torres-Oliva, Montserrat; Turetzek, Natascha; Vanthournout, Bram; Werren, John H; Wolff, Carsten; Worley, Kim C; Bucher, Gregor; Gibbs, Richard A; Coddington, Jonathan; Oda, Hiroki; Stanke, Mario; Ayoub, Nadia A; Prpic, Nikola-Michael; Flot, Jean-François; Posnien, Nico; Richards, Stephen; McGregor, Alistair P

    2017-07-31

    The duplication of genes can occur through various mechanisms and is thought to make a major contribution to the evolutionary diversification of organisms. There is increasing evidence for a large-scale duplication of genes in some chelicerate lineages including two rounds of whole genome duplication (WGD) in horseshoe crabs. To investigate this further, we sequenced and analyzed the genome of the common house spider Parasteatoda tepidariorum. We found pervasive duplication of both coding and non-coding genes in this spider, including two clusters of Hox genes. Analysis of synteny conservation across the P. tepidariorum genome suggests that there has been an ancient WGD in spiders. Comparison with the genomes of other chelicerates, including that of the newly sequenced bark scorpion Centruroides sculpturatus, suggests that this event occurred in the common ancestor of spiders and scorpions, and is probably independent of the WGDs in horseshoe crabs. Furthermore, characterization of the sequence and expression of the Hox paralogs in P. tepidariorum suggests that many have been subject to neo-functionalization and/or sub-functionalization since their duplication. Our results reveal that spiders and scorpions are likely the descendants of a polyploid ancestor that lived more than 450 MYA. Given the extensive morphological diversity and ecological adaptations found among these animals, rivaling those of vertebrates, our study of the ancient WGD event in Arachnopulmonata provides a new comparative platform to explore common and divergent evolutionary outcomes of polyploidization events across eukaryotes.

  17. Integrative View of α2,3-Sialyltransferases (ST3Gal) Molecular and Functional Evolution in Deuterostomes: Significance of Lineage-Specific Losses

    PubMed Central

    Petit, Daniel; Teppa, Elin; Mir, Anne-Marie; Vicogne, Dorothée; Thisse, Christine; Thisse, Bernard; Filloux, Cyril; Harduin-Lepers, Anne

    2015-01-01

    Sialyltransferases are responsible for the synthesis of a diverse range of sialoglycoconjugates predicted to be pivotal to deuterostomes’ evolution. In this work, we reconstructed the evolutionary history of the metazoan α2,3-sialyltransferases family (ST3Gal), a subset of sialyltransferases encompassing six subfamilies (ST3Gal I–ST3Gal VI) functionally characterized in mammals. Exploration of genomic and expressed sequence tag databases and search of conserved sialylmotifs led to the identification of a large data set of st3gal-related gene sequences. Molecular phylogeny and large scale sequence similarity network analysis identified four new vertebrate subfamilies called ST3Gal III-r, ST3Gal VII, ST3Gal VIII, and ST3Gal IX. To address the issue of the origin and evolutionary relationships of the st3gal-related genes, we performed comparative syntenic mapping of st3gal gene loci combined to ancestral genome reconstruction. The ten vertebrate ST3Gal subfamilies originated from genome duplication events at the base of vertebrates and are organized in three distinct and ancient groups of genes predating the early deuterostomes. Inferring st3gal gene family history identified also several lineage-specific gene losses, the significance of which was explored in a functional context. Toward this aim, spatiotemporal distribution of st3gal genes was analyzed in zebrafish and bovine tissues. In addition, molecular evolutionary analyses using specificity determining position and coevolved amino acid predictions led to the identification of amino acid residues with potential implication in functional divergence of vertebrate ST3Gal. We propose a detailed scenario of the evolutionary relationships of st3gal genes coupled to a conceptual framework of the evolution of ST3Gal functions. PMID:25534026

  18. Multiple capacitors for natural genetic variation in Drosophila melanogaster.

    PubMed

    Takahashi, Kazuo H

    2013-03-01

    Cryptic genetic variation (CGV) or a standing genetic variation that is not ordinarily expressed as a phenotype is released when the robustness of organisms is impaired under environmental or genetic perturbations. Evolutionary capacitors modulate the amount of genetic variation exposed to natural selection and hidden cryptically; they have a fundamental effect on the evolvability of traits on evolutionary timescales. In this study, I have demonstrated the effects of multiple genomic regions of Drosophila melanogaster on CGV in wing shape. I examined the effects of 61 genomic deficiencies on quantitative and qualitative natural genetic variation in the wing shape of D. melanogaster. I have identified 10 genomic deficiencies that do not encompass a known candidate evolutionary capacitor, Hsp90, exposing natural CGV differently depending on the location of the deficiencies in the genome. Furthermore, five genomic deficiencies uncovered qualitative CGV in wing morphology. These findings suggest that CGV in wing shape of wild-type D. melanogaster is regulated by multiple capacitors with divergent functions. Future analysis of genes encompassed by these genomic regions would help elucidate novel capacitor genes and better understand the general features of capacitors regarding natural genetic variation. © 2012 Blackwell Publishing Ltd.

  19. Comparative evolutionary genomics of Corynebacterium with special reference to codon and amino acid usage diversities.

    PubMed

    Pal, Shilpee; Sarkar, Indrani; Roy, Ayan; Mohapatra, Pradeep K Das; Mondal, Keshab C; Sen, Arnab

    2018-02-01

    The present study has been aimed to the comparative analysis of high GC composition containing Corynebacterium genomes and their evolutionary study by exploring codon and amino acid usage patterns. Phylogenetic study by MLSA approach, indel analysis and BLAST matrix differentiated Corynebacterium species in pathogenic and non-pathogenic clusters. Correspondence analysis on synonymous codon usage reveals that, gene length, optimal codon frequencies and tRNA abundance affect the gene expression of Corynebacterium. Most of the optimal codons as well as translationally optimal codons are C ending i.e. RNY (R-purine, N-any nucleotide base, and Y-pyrimidine) and reveal translational selection pressure on codon bias of Corynebacterium. Amino acid usage is affected by hydrophobicity, aromaticity, protein energy cost, etc. Highly expressed genes followed the cost minimization hypothesis and are less diverged at their synonymous positions of codons. Functional analysis of core genes shows significant difference in pathogenic and non-pathogenic Corynebacterium. The study reveals close relationship between non-pathogenic and opportunistic pathogenic Corynebaterium as well as between molecular evolution and survival niches of the organism.

  20. Neurogenomics and the role of a large mutational target on rapid behavioral change.

    PubMed

    Stanley, Craig E; Kulathinal, Rob J

    2016-11-08

    Behavior, while complex and dynamic, is among the most diverse, derived, and rapidly evolving traits in animals. The highly labile nature of heritable behavioral change is observed in such evolutionary phenomena as the emergence of converged behaviors in domesticated animals, the rapid evolution of preferences, and the routine development of ethological isolation between diverging populations and species. In fact, it is believed that nervous system development and its potential to evolve a seemingly infinite array of behavioral innovations played a major role in the successful diversification of metazoans, including our own human lineage. However, unlike other rapidly evolving functional systems such as sperm-egg interactions and immune defense, the genetic basis of rapid behavioral change remains elusive. Here we propose that the rapid divergence and widespread novelty of innate and adaptive behavior is primarily a function of its genomic architecture. Specifically, we hypothesize that the broad diversity of behavioral phenotypes present at micro- and macroevolutionary scales is promoted by a disproportionately large mutational target of neurogenic genes. We present evidence that these large neuro-behavioral targets are significant and ubiquitous in animal genomes and suggest that behavior's novelty and rapid emergence are driven by a number of factors including more selection on a larger pool of variants, a greater role of phenotypic plasticity, and/or unique molecular features present in large genes. We briefly discuss the origins of these large neurogenic genes, as they relate to the remarkable diversity of metazoan behaviors, and highlight key consequences on both behavioral traits and neurogenic disease across, respectively, evolutionary and ontogenetic time scales. Current approaches to studying the genetic mechanisms underlying rapid phenotypic change primarily focus on identifying signatures of Darwinian selection in protein-coding regions. In contrast, the large mutational target hypothesis places genomic architecture and a larger allelic pool at the forefront of rapid evolutionary change, particularly in genetic systems that are polygenic and regulatory in nature. Genomic data from brain and neural tissues in mammals as well as a preliminary survey of neurogenic genes from comparative genomic data support this hypothesis while rejecting both positive and relaxed selection on proteins or higher mutation rates. In mammals and invertebrates, neurogenic genes harbor larger protein-coding regions and possess a richer regulatory repertoire of miRNA targets and transcription factor binding sites. Overall, neurogenic genes cover a disproportionately large genomic fraction, providing a sizeable substrate for evolutionary, genetic, and molecular mechanisms to act upon. Readily available comparative and functional genomic data provide unexplored opportunities to test whether a distinct neurogenomic architecture can promote rapid behavioral change via several mechanisms unique to large genes, and which components of this large footprint are uniquely metazoan. The large mutational target hypothesis highlights the eminent roles of mutation and functional genomic architecture in generating rapid developmental and evolutionary change. It has broad implications on our understanding of the genetics of complex adaptive traits such as behavior by focusing on the importance of mutational input, from SNPs to alternative transcripts to transposable elements, on driving evolutionary rates of functional systems. Such functional divergence has important implications in promoting behavioral isolation across short- and long-term timescales. Due to genome-scaled polygenic adaptation, the large target effect also contributes to our inability to identify adapted behavioral candidate genes. The presence of large neurogenic genes, particularly in the mammalian brain and other neural tissues, further offers emerging insight into the etiology of neurodevelopmental and neurodegenerative diseases. The well-known correlation between neurological spectrum disorders in children and paternal age may simply be a direct result of aging fathers accumulating mutations across these large neurodevelopmental genes. The large mutational target hypothesis can also explain the rapid evolution of other functional systems covering a large genomic fraction such as male fertility and its preferential association with hybrid male sterility among closely related taxa. Overall, a focus on mutational potential may increase our power in understanding the genetic basis of complex phenotypes such as behavior while filling a general gap in understanding their evolution.

  1. Origin and Reticulate Evolutionary Process of Wheatgrass Elymus trachycaulus (Triticeae: Poaceae)

    PubMed Central

    Zuo, Hongwei; Wu, Panpan; Wu, Dexiang; Sun, Genlou

    2015-01-01

    To study origin and evolutionary dynamics of tetraploid Elymus trachycaulus that has been cytologically defined as containing StH genomes, thirteen accessions of E. trachycaulus were analyzed using two low-copy nuclear gene Pepc (phosphoenolpyruvate carboxylase) and Rpb2 (the second largest subunit of RNA polymerase II), and one chloroplast region trnL–trnF (spacer between the tRNA Leu (UAA) gene and the tRNA-Phe (GAA) gene). Our chloroplast data indicated that Pseudoroegneria (St genome) was the maternal donor of E. trachycaulus. Rpb2 data indicated that the St genome in E. trachycaulus was originated from either P. strigosa, P. stipifolia, P. spicata or P. geniculate. The Hordeum (H genome)-like sequences of E. trachycaulus are polyphyletic in the Pepc tree, suggesting that the H genome in E. trachycaulus was contributed by multiple sources, whether due to multiple origins or introgression resulting from subsequent hybridization. Failure to recovering St copy of Pepc sequence in most accessions of E. trachycaulus might be caused by genome convergent evolution in allopolyploids. Multiple copies of H-like Pepc sequence from each accession with relative large deletions and insertions might be caused by either instability of Pepc sequence in H- genome or incomplete concerted evolution. Our results highlighted complex evolutionary history of E. trachycaulus. PMID:25946188

  2. The genomic and epidemiological dynamics of human influenza A virus.

    PubMed

    Rambaut, Andrew; Pybus, Oliver G; Nelson, Martha I; Viboud, Cecile; Taubenberger, Jeffery K; Holmes, Edward C

    2008-05-29

    The evolutionary interaction between influenza A virus and the human immune system, manifest as 'antigenic drift' of the viral haemagglutinin, is one of the best described patterns in molecular evolution. However, little is known about the genome-scale evolutionary dynamics of this pathogen. Similarly, how genomic processes relate to global influenza epidemiology, in which the A/H3N2 and A/H1N1 subtypes co-circulate, is poorly understood. Here through an analysis of 1,302 complete viral genomes sampled from temperate populations in both hemispheres, we show that the genomic evolution of influenza A virus is characterized by a complex interplay between frequent reassortment and periodic selective sweeps. The A/H3N2 and A/H1N1 subtypes exhibit different evolutionary dynamics, with diverse lineages circulating in A/H1N1, indicative of weaker antigenic drift. These results suggest a sink-source model of viral ecology in which new lineages are seeded from a persistent influenza reservoir, which we hypothesize to be located in the tropics, to sink populations in temperate regions.

  3. Adaptation of Drosophila to a novel laboratory environment reveals temporally heterogeneous trajectories of selected alleles.

    PubMed

    Orozco-terWengel, Pablo; Kapun, Martin; Nolte, Viola; Kofler, Robert; Flatt, Thomas; Schlötterer, Christian

    2012-10-01

    The genomic basis of adaptation to novel environments is a fundamental problem in evolutionary biology that has gained additional importance in the light of the recent global change discussion. Here, we combined laboratory natural selection (experimental evolution) in Drosophila melanogaster with genome-wide next generation sequencing of DNA pools (Pool-Seq) to identify alleles that are favourable in a novel laboratory environment and traced their trajectories during the adaptive process. Already after 15 generations, we identified a pronounced genomic response to selection, with almost 5000 single nucleotide polymorphisms (SNP; genome-wide false discovery rates < 0.005%) deviating from neutral expectation. Importantly, the evolutionary trajectories of the selected alleles were heterogeneous, with the alleles falling into two distinct classes: (i) alleles that continuously rise in frequency; and (ii) alleles that at first increase rapidly but whose frequencies then reach a plateau. Our data thus suggest that the genomic response to selection can involve a large number of selected SNPs that show unexpectedly complex evolutionary trajectories, possibly due to nonadditive effects. © 2012 Blackwell Publishing Ltd.

  4. Genome-Based Microbial Taxonomy Coming of Age.

    PubMed

    Hugenholtz, Philip; Skarshewski, Adam; Parks, Donovan H

    2016-06-01

    Reconstructing the complete evolutionary history of extant life on our planet will be one of the most fundamental accomplishments of scientific endeavor, akin to the completion of the periodic table, which revolutionized chemistry. The road to this goal is via comparative genomics because genomes are our most comprehensive and objective evolutionary documents. The genomes of plant and animal species have been systematically targeted over the past decade to provide coverage of the tree of life. However, multicellular organisms only emerged in the last 550 million years of more than three billion years of biological evolution and thus comprise a small fraction of total biological diversity. The bulk of biodiversity, both past and present, is microbial. We have only scratched the surface in our understanding of the microbial world, as most microorganisms cannot be readily grown in the laboratory and remain unknown to science. Ground-breaking, culture-independent molecular techniques developed over the past 30 years have opened the door to this so-called microbial dark matter with an accelerating momentum driven by exponential increases in sequencing capacity. We are on the verge of obtaining representative genomes across all life for the first time. However, historical use of morphology, biochemical properties, behavioral traits, and single-marker genes to infer organismal relationships mean that the existing highly incomplete tree is riddled with taxonomic errors. Concerted efforts are now needed to synthesize and integrate the burgeoning genomic data resources into a coherent universal tree of life and genome-based taxonomy. Copyright © 2016 Cold Spring Harbor Laboratory Press; all rights reserved.

  5. Genome sequence, comparative analysis and haplotype structure of the domestic dog.

    PubMed

    Lindblad-Toh, Kerstin; Wade, Claire M; Mikkelsen, Tarjei S; Karlsson, Elinor K; Jaffe, David B; Kamal, Michael; Clamp, Michele; Chang, Jean L; Kulbokas, Edward J; Zody, Michael C; Mauceli, Evan; Xie, Xiaohui; Breen, Matthew; Wayne, Robert K; Ostrander, Elaine A; Ponting, Chris P; Galibert, Francis; Smith, Douglas R; DeJong, Pieter J; Kirkness, Ewen; Alvarez, Pablo; Biagi, Tara; Brockman, William; Butler, Jonathan; Chin, Chee-Wye; Cook, April; Cuff, James; Daly, Mark J; DeCaprio, David; Gnerre, Sante; Grabherr, Manfred; Kellis, Manolis; Kleber, Michael; Bardeleben, Carolyne; Goodstadt, Leo; Heger, Andreas; Hitte, Christophe; Kim, Lisa; Koepfli, Klaus-Peter; Parker, Heidi G; Pollinger, John P; Searle, Stephen M J; Sutter, Nathan B; Thomas, Rachael; Webber, Caleb; Baldwin, Jennifer; Abebe, Adal; Abouelleil, Amr; Aftuck, Lynne; Ait-Zahra, Mostafa; Aldredge, Tyler; Allen, Nicole; An, Peter; Anderson, Scott; Antoine, Claudel; Arachchi, Harindra; Aslam, Ali; Ayotte, Laura; Bachantsang, Pasang; Barry, Andrew; Bayul, Tashi; Benamara, Mostafa; Berlin, Aaron; Bessette, Daniel; Blitshteyn, Berta; Bloom, Toby; Blye, Jason; Boguslavskiy, Leonid; Bonnet, Claude; Boukhgalter, Boris; Brown, Adam; Cahill, Patrick; Calixte, Nadia; Camarata, Jody; Cheshatsang, Yama; Chu, Jeffrey; Citroen, Mieke; Collymore, Alville; Cooke, Patrick; Dawoe, Tenzin; Daza, Riza; Decktor, Karin; DeGray, Stuart; Dhargay, Norbu; Dooley, Kimberly; Dooley, Kathleen; Dorje, Passang; Dorjee, Kunsang; Dorris, Lester; Duffey, Noah; Dupes, Alan; Egbiremolen, Osebhajajeme; Elong, Richard; Falk, Jill; Farina, Abderrahim; Faro, Susan; Ferguson, Diallo; Ferreira, Patricia; Fisher, Sheila; FitzGerald, Mike; Foley, Karen; Foley, Chelsea; Franke, Alicia; Friedrich, Dennis; Gage, Diane; Garber, Manuel; Gearin, Gary; Giannoukos, Georgia; Goode, Tina; Goyette, Audra; Graham, Joseph; Grandbois, Edward; Gyaltsen, Kunsang; Hafez, Nabil; Hagopian, Daniel; Hagos, Birhane; Hall, Jennifer; Healy, Claire; Hegarty, Ryan; Honan, Tracey; Horn, Andrea; Houde, Nathan; Hughes, Leanne; Hunnicutt, Leigh; Husby, M; Jester, Benjamin; Jones, Charlien; Kamat, Asha; Kanga, Ben; Kells, Cristyn; Khazanovich, Dmitry; Kieu, Alix Chinh; Kisner, Peter; Kumar, Mayank; Lance, Krista; Landers, Thomas; Lara, Marcia; Lee, William; Leger, Jean-Pierre; Lennon, Niall; Leuper, Lisa; LeVine, Sarah; Liu, Jinlei; Liu, Xiaohong; Lokyitsang, Yeshi; Lokyitsang, Tashi; Lui, Annie; Macdonald, Jan; Major, John; Marabella, Richard; Maru, Kebede; Matthews, Charles; McDonough, Susan; Mehta, Teena; Meldrim, James; Melnikov, Alexandre; Meneus, Louis; Mihalev, Atanas; Mihova, Tanya; Miller, Karen; Mittelman, Rachel; Mlenga, Valentine; Mulrain, Leonidas; Munson, Glen; Navidi, Adam; Naylor, Jerome; Nguyen, Tuyen; Nguyen, Nga; Nguyen, Cindy; Nguyen, Thu; Nicol, Robert; Norbu, Nyima; Norbu, Choe; Novod, Nathaniel; Nyima, Tenchoe; Olandt, Peter; O'Neill, Barry; O'Neill, Keith; Osman, Sahal; Oyono, Lucien; Patti, Christopher; Perrin, Danielle; Phunkhang, Pema; Pierre, Fritz; Priest, Margaret; Rachupka, Anthony; Raghuraman, Sujaa; Rameau, Rayale; Ray, Verneda; Raymond, Christina; Rege, Filip; Rise, Cecil; Rogers, Julie; Rogov, Peter; Sahalie, Julie; Settipalli, Sampath; Sharpe, Theodore; Shea, Terrance; Sheehan, Mechele; Sherpa, Ngawang; Shi, Jianying; Shih, Diana; Sloan, Jessie; Smith, Cherylyn; Sparrow, Todd; Stalker, John; Stange-Thomann, Nicole; Stavropoulos, Sharon; Stone, Catherine; Stone, Sabrina; Sykes, Sean; Tchuinga, Pierre; Tenzing, Pema; Tesfaye, Senait; Thoulutsang, Dawa; Thoulutsang, Yama; Topham, Kerri; Topping, Ira; Tsamla, Tsamla; Vassiliev, Helen; Venkataraman, Vijay; Vo, Andy; Wangchuk, Tsering; Wangdi, Tsering; Weiand, Michael; Wilkinson, Jane; Wilson, Adam; Yadav, Shailendra; Yang, Shuli; Yang, Xiaoping; Young, Geneva; Yu, Qing; Zainoun, Joanne; Zembek, Lisa; Zimmer, Andrew; Lander, Eric S

    2005-12-08

    Here we report a high-quality draft genome sequence of the domestic dog (Canis familiaris), together with a dense map of single nucleotide polymorphisms (SNPs) across breeds. The dog is of particular interest because it provides important evolutionary information and because existing breeds show great phenotypic diversity for morphological, physiological and behavioural traits. We use sequence comparison with the primate and rodent lineages to shed light on the structure and evolution of genomes and genes. Notably, the majority of the most highly conserved non-coding sequences in mammalian genomes are clustered near a small subset of genes with important roles in development. Analysis of SNPs reveals long-range haplotypes across the entire dog genome, and defines the nature of genetic diversity within and across breeds. The current SNP map now makes it possible for genome-wide association studies to identify genes responsible for diseases and traits, with important consequences for human and companion animal health.

  6. Minke whale genome and aquatic adaptation in cetaceans

    PubMed Central

    Yim, Hyung-Soon; Cho, Yun Sung; Guang, Xuanmin; Kang, Sung Gyun; Jeong, Jae-Yeon; Cha, Sun-Shin; Oh, Hyun-Myung; Lee, Jae-Hak; Yang, Eun Chan; Kwon, Kae Kyoung; Kim, Yun Jae; Kim, Tae Wan; Kim, Wonduck; Jeon, Jeong Ho; Kim, Sang-Jin; Choi, Dong Han; Jho, Sungwoong; Kim, Hak-Min; Ko, Junsu; Kim, Hyunmin; Shin, Young-Ah; Jung, Hyun-Ju; Zheng, Yuan; Wang, Zhuo; Chen, Yan

    2014-01-01

    The shift from terrestrial to aquatic life by whales was a substantial evolutionary event. Here we report the whole-genome sequencing and de novo assembly of the minke whale genome, as well as the whole-genome sequences of three minke whales, a fin whale, a bottlenose dolphin and a finless porpoise. Our comparative genomic analysis identified an expansion in the whale lineage of gene families associated with stress-responsive proteins and anaerobic metabolism, whereas gene families related to body hair and sensory receptors were contracted. Our analysis also identified whale-specific mutations in genes encoding antioxidants and enzymes controlling blood pressure and salt concentration. Overall the whale-genome sequences exhibited distinct features that are associated with the physiological and morphological changes needed for life in an aquatic environment, marked by resistance to physiological stresses caused by a lack of oxygen, increased amounts of reactive oxygen species and high salt levels. PMID:24270359

  7. Minke whale genome and aquatic adaptation in cetaceans.

    PubMed

    Yim, Hyung-Soon; Cho, Yun Sung; Guang, Xuanmin; Kang, Sung Gyun; Jeong, Jae-Yeon; Cha, Sun-Shin; Oh, Hyun-Myung; Lee, Jae-Hak; Yang, Eun Chan; Kwon, Kae Kyoung; Kim, Yun Jae; Kim, Tae Wan; Kim, Wonduck; Jeon, Jeong Ho; Kim, Sang-Jin; Choi, Dong Han; Jho, Sungwoong; Kim, Hak-Min; Ko, Junsu; Kim, Hyunmin; Shin, Young-Ah; Jung, Hyun-Ju; Zheng, Yuan; Wang, Zhuo; Chen, Yan; Chen, Ming; Jiang, Awei; Li, Erli; Zhang, Shu; Hou, Haolong; Kim, Tae Hyung; Yu, Lili; Liu, Sha; Ahn, Kung; Cooper, Jesse; Park, Sin-Gi; Hong, Chang Pyo; Jin, Wook; Kim, Heui-Soo; Park, Chankyu; Lee, Kyooyeol; Chun, Sung; Morin, Phillip A; O'Brien, Stephen J; Lee, Hang; Kimura, Jumpei; Moon, Dae Yeon; Manica, Andrea; Edwards, Jeremy; Kim, Byung Chul; Kim, Sangsoo; Wang, Jun; Bhak, Jong; Lee, Hyun Sook; Lee, Jung-Hyun

    2014-01-01

    The shift from terrestrial to aquatic life by whales was a substantial evolutionary event. Here we report the whole-genome sequencing and de novo assembly of the minke whale genome, as well as the whole-genome sequences of three minke whales, a fin whale, a bottlenose dolphin and a finless porpoise. Our comparative genomic analysis identified an expansion in the whale lineage of gene families associated with stress-responsive proteins and anaerobic metabolism, whereas gene families related to body hair and sensory receptors were contracted. Our analysis also identified whale-specific mutations in genes encoding antioxidants and enzymes controlling blood pressure and salt concentration. Overall the whale-genome sequences exhibited distinct features that are associated with the physiological and morphological changes needed for life in an aquatic environment, marked by resistance to physiological stresses caused by a lack of oxygen, increased amounts of reactive oxygen species and high salt levels.

  8. Genomic Encyclopedia of Type Strains of the Genus Bifidobacterium

    PubMed Central

    Milani, Christian; Lugli, Gabriele Andrea; Duranti, Sabrina; Turroni, Francesca; Bottacini, Francesca; Mangifesta, Marta; Sanchez, Borja; Viappiani, Alice; Mancabelli, Leonardo; Taminiau, Bernard; Delcenserie, Véronique; Barrangou, Rodolphe; Margolles, Abelardo; van Sinderen, Douwe

    2014-01-01

    Bifidobacteria represent one of the dominant microbial groups that are present in the gut of various animals, being particularly prevalent during the suckling stage of life of humans and other mammals. However, the overall genome structure of this group of microorganisms remains largely unexplored. Here, we sequenced the genomes of 42 representative (sub)species across the Bifidobacterium genus and used this information to explore the overall genetic picture of this bacterial group. Furthermore, the genomic data described here were used to reconstruct the evolutionary development of the Bifidobacterium genus. This reconstruction suggests that its evolution was substantially influenced by genetic adaptations to obtain access to glycans, thereby representing a common and potent evolutionary force in shaping bifidobacterial genomes. PMID:25085493

  9. Evolution of the myosin heavy chain gene MYH14 and its intronic microRNA miR-499: muscle-specific miR-499 expression persists in the absence of the ancestral host gene.

    PubMed

    Bhuiyan, Sharmin Siddique; Kinoshita, Shigeharu; Wongwarangkana, Chaninya; Asaduzzaman, Md; Asakawa, Shuichi; Watabe, Shugo

    2013-07-06

    A novel sarcomeric myosin heavy chain gene, MYH14, was identified following the completion of the human genome project. MYH14 contains an intronic microRNA, miR-499, which is expressed in a slow/cardiac muscle specific manner along with its host gene; it plays a key role in muscle fiber-type specification in mammals. Interestingly, teleost fish genomes contain multiple MYH14 and miR-499 paralogs. However, the evolutionary history of MYH14 and miR-499 has not been studied in detail. In the present study, we identified MYH14/miR-499 loci on various teleost fish genomes and examined their evolutionary history by sequence and expression analyses. Synteny and phylogenetic analyses depict the evolutionary history of MYH14/miR-499 loci where teleost specific duplication and several subsequent rounds of species-specific gene loss events took place. Interestingly, miR-499 was not located in the MYH14 introns of certain teleost fish. An MYH14 paralog, lacking miR-499, exhibited an accelerated rate of evolution compared with those containing miR-499, suggesting a putative functional relationship between MYH14 and miR-499. In medaka, Oryzias latipes, miR-499 is present where MYH14 is completely absent in the genome. Furthermore, by using in situ hybridization and small RNA sequencing, miR-499 was expressed in the notochord at the medaka embryonic stage and slow/cardiac muscle at the larval and adult stages. Comparing the flanking sequences of MYH14/miR-499 loci between torafugu Takifugu rubripes, zebrafish Danio rerio, and medaka revealed some highly conserved regions, suggesting that cis-regulatory elements have been functionally conserved in medaka miR-499 despite the loss of its host gene. This study reveals the evolutionary history of the MYH14/miRNA-499 locus in teleost fish, indicating divergent distribution and expression of MYH14 and miR-499 genes in different teleost fish lineages. We also found that medaka miR-499 was even expressed in the absence of its host gene. To our knowledge, this is the first report that shows the conversion of intronic into non-intronic miRNA during the evolution of a teleost fish lineage.

  10. Evolutionary Meta-Analysis of Association Studies Reveals Ancient Constraints Affecting Disease Marker Discovery

    PubMed Central

    Dudley, Joel T.; Chen, Rong; Sanderford, Maxwell; Butte, Atul J.; Kumar, Sudhir

    2012-01-01

    Genome-wide disease association studies contrast genetic variation between disease cohorts and healthy populations to discover single nucleotide polymorphisms (SNPs) and other genetic markers revealing underlying genetic architectures of human diseases. Despite scores of efforts over the past decade, many reproducible genetic variants that explain substantial proportions of the heritable risk of common human diseases remain undiscovered. We have conducted a multispecies genomic analysis of 5,831 putative human risk variants for more than 230 disease phenotypes reported in 2,021 studies. We find that the current approaches show a propensity for discovering disease-associated SNPs (dSNPs) at conserved genomic positions because the effect size (odds ratio) and allelic P value of genetic association of an SNP relates strongly to the evolutionary conservation of their genomic position. We propose a new measure for ranking SNPs that integrates evolutionary conservation scores and the P value (E-rank). Using published data from a large case-control study, we demonstrate that E-rank method prioritizes SNPs with a greater likelihood of bona fide and reproducible genetic disease associations, many of which may explain greater proportions of genetic variance. Therefore, long-term evolutionary histories of genomic positions offer key practical utility in reassessing data from existing disease association studies, and in the design and analysis of future studies aimed at revealing the genetic basis of common human diseases. PMID:22389448

  11. RELATIONSHIP BETWEEN PHYLOGENETIC DISTRIBUTION AND GENOMIC FEATURES IN NEUROSPORA CRASSA

    USDA-ARS?s Scientific Manuscript database

    In the post-genome era, insufficient functional annotation of predicted genes greatly restricts the potential of mining genome data. We demonstrate that an evolutionary approach, which is independent of functional annotation, has great potential as a tool for genome analysis. We chose the genome o...

  12. Towards the delineation of the ancestral eutherian genome organization: comparative genome maps of human and the African elephant (Loxodonta africana) generated by chromosome painting.

    PubMed Central

    Frönicke, Lutz; Wienberg, Johannes; Stone, Gary; Adams, Lisa; Stanyon, Roscoe

    2003-01-01

    This study presents a whole-genome comparison of human and a representative of the Afrotherian clade, the African elephant, generated by reciprocal Zoo-FISH. An analysis of Afrotheria genomes is of special interest, because recent DNA sequence comparisons identify them as the oldest placental mammalian clade. Complete sets of whole-chromosome specific painting probes for the African elephant and human were constructed by degenerate oligonucleotide-primed PCR amplification of flow-sorted chromosomes. Comparative genome maps are presented based on their hybridization patterns. These maps show that the elephant has a moderately rearranged chromosome complement when compared to humans. The human paint probes identified 53 evolutionary conserved segments on the 27 autosomal elephant chromosomes and the X chromosome. Reciprocal experiments with elephant probes delineated 68 conserved segments in the human genome. The comparison with a recent aardvark and elephant Zoo-FISH study delineates new chromosomal traits which link the two Afrotherian species phylogenetically. In the absence of any morphological evidence the chromosome painting data offer the first non-DNA sequence support for an Afrotherian clade. The comparative human and elephant genome maps provide new insights into the karyotype organization of the proto-afrotherian, the ancestor of extant placental mammals, which most probably consisted of 2n=46 chromosomes. PMID:12965023

  13. The painted turtle, Chrysemys picta: a model system for vertebrate evolution, ecology, and human health.

    PubMed

    Valenzuela, Nicole

    2009-07-01

    Painted turtles (Chrysemys picta) are representatives of a vertebrate clade whose biology and phylogenetic position hold a key to our understanding of fundamental aspects of vertebrate evolution. These features make them an ideal emerging model system. Extensive ecological and physiological research provide the context in which to place new research advances in evolutionary genetics, genomics, evolutionary developmental biology, and ecological developmental biology which are enabled by current resources, such as a bacterial artificial chromosome (BAC) library of C. picta, and the imminent development of additional ones such as genome sequences and cDNA and expressed sequence tag (EST) libraries. This integrative approach will allow the research community to continue making advances to provide functional and evolutionary explanations for the lability of biological traits found not only among reptiles but vertebrates in general. Moreover, because humans and reptiles share a common ancestor, and given the ease of using nonplacental vertebrates in experimental biology compared with mammalian embryos, painted turtles are also an emerging model system for biomedical research. For example, painted turtles have been studied to understand many biological responses to overwintering and anoxia, as potential sentinels for environmental xenobiotics, and as a model to decipher the ecology and evolution of sexual development and reproduction. Thus, painted turtles are an excellent reptilian model system for studies with human health, environmental, ecological, and evolutionary significance.

  14. The molecular biology and evolution of feline immunodeficiency viruses of cougars

    PubMed Central

    Poss, Mary; Ross, Howard; Rodrigo, Allen; Terwee, Julie; VandeWoude, Sue; Biek, Roman

    2008-01-01

    Feline immunodeficiency virus (FIV) is a lentivirus that has been identified in many members of the family Felidae but domestic cats are the only FIV host in which infection results in disease. We studied FIVpco infection of cougars (Puma concolor) as a model for asymptomatic lentivirus infections to understand the mechanisms of host-virus coexistence. Several natural cougar populations were evaluated to determine if there are any consequences of FIVpco infection on cougar fecundity, survival, or susceptibility to other infections. We have sequenced full length viral genomes and conducted a detailed analysis of viral molecular evolution on these sequences and on genome fragments of serially sampled animals to determine the evolutionary forces experienced by this virus in cougars. In addition, we have evaluated the molecular genetics of FIVpco in a new host, domestic cats, to determine the evolutionary consequences to a host-adapted virus associated with cross-species infection. Our results indicate that there are no significant differences in survival, fecundity or susceptibility to other infections between FIVpco-infected and uninfected cougars. The molecular evolution of FIVpco is characterized by a slower evolutionary rate and an absence of positive selection, but also by proviral and plasma viral loads comparable to those of epidemic lentiviruses such as HIV-1 or FIVfca. Evolutionary and recombination rates and selection profiles change significantly when FIVpco replicates in a new host. PMID:18295904

  15. Neutral polymorphisms in putative housekeeping genes and tandem repeats unravels the population genetics and evolutionary history of Plasmodium vivax in India.

    PubMed

    Prajapati, Surendra K; Joshi, Hema; Carlton, Jane M; Rizvi, M Alam

    2013-01-01

    The evolutionary history and age of Plasmodium vivax has been inferred as both recent and ancient by several studies, mainly using mitochondrial genome diversity. Here we address the age of P. vivax on the Indian subcontinent using selectively neutral housekeeping genes and tandem repeat loci. Analysis of ten housekeeping genes revealed a substantial number of SNPs (n = 75) from 100 P. vivax isolates collected from five geographical regions of India. Neutrality tests showed a majority of the housekeeping genes were selectively neutral, confirming the suitability of housekeeping genes for inferring the evolutionary history of P. vivax. In addition, a genetic differentiation test using housekeeping gene polymorphism data showed a lack of geographical structuring between the five regions of India. The coalescence analysis of the time to the most recent common ancestor estimate yielded an ancient TMRCA (232,228 to 303,030 years) and long-term population history (79,235 to 104,008) of extant P. vivax on the Indian subcontinent. Analysis of 18 tandem repeat loci polymorphisms showed substantial allelic diversity and heterozygosity per locus, and analysis of potential bottlenecks revealed the signature of a stable P. vivax population, further corroborating our ancient age estimates. For the first time we report a comparable evolutionary history of P. vivax inferred by nuclear genetic markers (putative housekeeping genes) to that inferred from mitochondrial genome diversity.

  16. Gene context conservation of a higher order than operons.

    PubMed

    Lathe, W C; Snel, B; Bork, P

    2000-10-01

    Operons, co-transcribed and co-regulated contiguous sets of genes, are poorly conserved over short periods of evolutionary time. The gene order, gene content and regulatory mechanisms of operons can be very different, even in closely related species. Here, we present several lines of evidence which suggest that, although an operon and its individual genes and regulatory structures are rearranged when comparing the genomes of different species, this rearrangement is a conservative process. Genomic rearrangements invariably maintain individual genes in very specific functional and regulatory contexts. We call this conserved context an uber-operon.

  17. The Evolution of COP9 Signalosome in Unicellular and Multicellular Organisms.

    PubMed

    Barth, Emanuel; Hübler, Ron; Baniahmad, Aria; Marz, Manja

    2016-05-02

    The COP9 signalosome (CSN) is a highly conserved protein complex, recently being crystallized for human. In mammals and plants the COP9 complex consists of nine subunits, CSN 1-8 and CSNAP. The CSN regulates the activity of culling ring E3 ubiquitin and plays central roles in pleiotropy, cell cycle, and defense of pathogens. Despite the interesting and essential functions, a thorough analysis of the CSN subunits in evolutionary comparative perspective is missing. Here we compared 61 eukaryotic genomes including plants, animals, and yeasts genomes and show that the most conserved subunits of eukaryotes among the nine subunits are CSN2 and CSN5. This may indicate a strong evolutionary selection for these two subunits. Despite the strong conservation of the protein sequence, the genomic structures of the intron/exon boundaries indicate no conservation at genomic level. This suggests that the gene structure is exposed to a much less selection compared with the protein sequence. We also show the conservation of important active domains, such as PCI (proteasome lid-CSN-initiation factor) and MPN (MPR1/PAD1 amino-terminal). We identified novel exons and alternative splicing variants for all CSN subunits. This indicates another level of complexity of the CSN. Notably, most COP9-subunits were identified in all multicellular and unicellular eukaryotic organisms analyzed, but not in prokaryotes or archaeas. Thus, genes encoding CSN subunits present in all analyzed eukaryotes indicate the invention of the signalosome at the root of eukaryotes. The identification of alternative splice variants indicates possible "mini-complexes" or COP9 complexes with independent subunits containing potentially novel and not yet identified functions. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  18. The Evolution of COP9 Signalosome in Unicellular and Multicellular Organisms

    PubMed Central

    Barth, Emanuel; Hübler, Ron; Baniahmad, Aria; Marz, Manja

    2016-01-01

    The COP9 signalosome (CSN) is a highly conserved protein complex, recently being crystallized for human. In mammals and plants the COP9 complex consists of nine subunits, CSN 1–8 and CSNAP. The CSN regulates the activity of culling ring E3 ubiquitin and plays central roles in pleiotropy, cell cycle, and defense of pathogens. Despite the interesting and essential functions, a thorough analysis of the CSN subunits in evolutionary comparative perspective is missing. Here we compared 61 eukaryotic genomes including plants, animals, and yeasts genomes and show that the most conserved subunits of eukaryotes among the nine subunits are CSN2 and CSN5. This may indicate a strong evolutionary selection for these two subunits. Despite the strong conservation of the protein sequence, the genomic structures of the intron/exon boundaries indicate no conservation at genomic level. This suggests that the gene structure is exposed to a much less selection compared with the protein sequence. We also show the conservation of important active domains, such as PCI (proteasome lid-CSN-initiation factor) and MPN (MPR1/PAD1 amino-terminal). We identified novel exons and alternative splicing variants for all CSN subunits. This indicates another level of complexity of the CSN. Notably, most COP9-subunits were identified in all multicellular and unicellular eukaryotic organisms analyzed, but not in prokaryotes or archaeas. Thus, genes encoding CSN subunits present in all analyzed eukaryotes indicate the invention of the signalosome at the root of eukaryotes. The identification of alternative splice variants indicates possible “mini-complexes” or COP9 complexes with independent subunits containing potentially novel and not yet identified functions. PMID:27044515

  19. Visualization of genome signatures of eukaryote genomes by batch-learning self-organizing map with a special emphasis on Drosophila genomes.

    PubMed

    Abe, Takashi; Hamano, Yuta; Ikemura, Toshimichi

    2014-01-01

    A strategy of evolutionary studies that can compare vast numbers of genome sequences is becoming increasingly important with the remarkable progress of high-throughput DNA sequencing methods. We previously established a sequence alignment-free clustering method "BLSOM" for di-, tri-, and tetranucleotide compositions in genome sequences, which can characterize sequence characteristics (genome signatures) of a wide range of species. In the present study, we generated BLSOMs for tetra- and pentanucleotide compositions in approximately one million sequence fragments derived from 101 eukaryotes, for which almost complete genome sequences were available. BLSOM recognized phylotype-specific characteristics (e.g., key combinations of oligonucleotide frequencies) in the genome sequences, permitting phylotype-specific clustering of the sequences without any information regarding the species. In our detailed examination of 12 Drosophila species, the correlation between their phylogenetic classification and the classification on the BLSOMs was observed to visualize oligonucleotides diagnostic for species-specific clustering.

  20. Ensembl Genomes: an integrative resource for genome-scale data from non-vertebrate species.

    PubMed

    Kersey, Paul J; Staines, Daniel M; Lawson, Daniel; Kulesha, Eugene; Derwent, Paul; Humphrey, Jay C; Hughes, Daniel S T; Keenan, Stephan; Kerhornou, Arnaud; Koscielny, Gautier; Langridge, Nicholas; McDowall, Mark D; Megy, Karine; Maheswari, Uma; Nuhn, Michael; Paulini, Michael; Pedro, Helder; Toneva, Iliana; Wilson, Derek; Yates, Andrew; Birney, Ewan

    2012-01-01

    Ensembl Genomes (http://www.ensemblgenomes.org) is an integrative resource for genome-scale data from non-vertebrate species. The project exploits and extends technology (for genome annotation, analysis and dissemination) developed in the context of the (vertebrate-focused) Ensembl project and provides a complementary set of resources for non-vertebrate species through a consistent set of programmatic and interactive interfaces. These provide access to data including reference sequence, gene models, transcriptional data, polymorphisms and comparative analysis. Since its launch in 2009, Ensembl Genomes has undergone rapid expansion, with the goal of providing coverage of all major experimental organisms, and additionally including taxonomic reference points to provide the evolutionary context in which genes can be understood. Against the backdrop of a continuing increase in genome sequencing activities in all parts of the tree of life, we seek to work, wherever possible, with the communities actively generating and using data, and are participants in a growing range of collaborations involved in the annotation and analysis of genomes.

  1. Structure of the germline genome of Tetrahymena thermophila and relationship to the massively rearranged somatic genome

    PubMed Central

    Hamilton, Eileen P; Kapusta, Aurélie; Huvos, Piroska E; Bidwell, Shelby L; Zafar, Nikhat; Tang, Haibao; Hadjithomas, Michalis; Krishnakumar, Vivek; Badger, Jonathan H; Caler, Elisabet V; Russ, Carsten; Zeng, Qiandong; Fan, Lin; Levin, Joshua Z; Shea, Terrance; Young, Sarah K; Hegarty, Ryan; Daza, Riza; Gujja, Sharvari; Wortman, Jennifer R; Birren, Bruce W; Nusbaum, Chad; Thomas, Jainy; Carey, Clayton M; Pritham, Ellen J; Feschotte, Cédric; Noto, Tomoko; Mochizuki, Kazufumi; Papazyan, Romeo; Taverna, Sean D; Dear, Paul H; Cassidy-Hanley, Donna M; Xiong, Jie; Miao, Wei; Orias, Eduardo; Coyne, Robert S

    2016-01-01

    The germline genome of the binucleated ciliate Tetrahymena thermophila undergoes programmed chromosome breakage and massive DNA elimination to generate the somatic genome. Here, we present a complete sequence assembly of the germline genome and analyze multiple features of its structure and its relationship to the somatic genome, shedding light on the mechanisms of genome rearrangement as well as the evolutionary history of this remarkable germline/soma differentiation. Our results strengthen the notion that a complex, dynamic, and ongoing interplay between mobile DNA elements and the host genome have shaped Tetrahymena chromosome structure, locally and globally. Non-standard outcomes of rearrangement events, including the generation of short-lived somatic chromosomes and excision of DNA interrupting protein-coding regions, may represent novel forms of developmental gene regulation. We also compare Tetrahymena’s germline/soma differentiation to that of other characterized ciliates, illustrating the wide diversity of adaptations that have occurred within this phylum. DOI: http://dx.doi.org/10.7554/eLife.19090.001 PMID:27892853

  2. Microbial genome analysis: the COG approach.

    PubMed

    Galperin, Michael Y; Kristensen, David M; Makarova, Kira S; Wolf, Yuri I; Koonin, Eugene V

    2017-09-14

    For the past 20 years, the Clusters of Orthologous Genes (COG) database had been a popular tool for microbial genome annotation and comparative genomics. Initially created for the purpose of evolutionary classification of protein families, the COG have been used, apart from straightforward functional annotation of sequenced genomes, for such tasks as (i) unification of genome annotation in groups of related organisms; (ii) identification of missing and/or undetected genes in complete microbial genomes; (iii) analysis of genomic neighborhoods, in many cases allowing prediction of novel functional systems; (iv) analysis of metabolic pathways and prediction of alternative forms of enzymes; (v) comparison of organisms by COG functional categories; and (vi) prioritization of targets for structural and functional characterization. Here we review the principles of the COG approach and discuss its key advantages and drawbacks in microbial genome analysis. Published by Oxford University Press 2017. This work is written by US Government employees and is in the public domain in the US.

  3. On an early gene for membrane-integral inorganic pyrophosphatase in the genome of an apparently pre-luca extremophile, the archaeon Candidatus Korarchaeum cryptofilum.

    PubMed

    Baltscheffsky, Herrick; Persson, Bengt

    2014-02-01

    A gene for membrane-integral inorganic pyrophosphatase (miPPase) was found in the composite genome of the extremophile archaeon Candidatus Korarchaeum cryptofilum (CKc). This korarchaeal genome shows unusual partial similarity to both major archaeal phyla Crenarchaeota and Euryarchaeota. Thus this Korarchaeote might have retained features that represent an ancestral archaeal form, existing before the occurrence of the evolutionary bifurcation into Crenarchaeota and Euryarchaeota. In addition, CKc lacks five genes that are common to early genomes at the LUCA border. These two properties independently suggest a pre-LUCA evolutionary position of this extremophile. Our finding of the miPPase gene in the CKc genome points to a role for the enzyme in the energy conversion of this very early archaeon. The structural features of its miPPase indicate that it can pump protons through membranes. An miPPase from the extremophile bacterium Caldicellulosiruptor saccharolyticus also has a sequence indicating a proton pump. Recent analysis of the three-dimensional structure of the miPPase from Vigna radiata has resulted in the recognition of a strongly acidic substrate (orthophosphate: Pi, pyrophosphate: PPi) binding pocket, containing 11 Asp and one Glu residues. Asp (aspartic acid) is an evolutionarily very early proteinaceous amino acid as compared to the later appearing Glu (glutamic acid). All the Asp residues are conserved in the miPPase of CKc, V. radiata and other miPPases. The high proportion of Asp, as compared to Glu, seems to strengthen our argument that biological energy conversion with binding and activities of orthophosphate (Pi) and energy-rich pyrophosphate (PPi) in connection with the origin and early evolution of life may have started with similar or even more primitive acidic peptide funnels and/or pockets.

  4. Organization and evolution of highly repeated satellite DNA sequences in plant chromosomes.

    PubMed

    Sharma, S; Raina, S N

    2005-01-01

    A major component of the plant nuclear genome is constituted by different classes of repetitive DNA sequences. The structural, functional and evolutionary aspects of the satellite repetitive DNA families, and their organization in the chromosomes is reviewed. The tandem satellite DNA sequences exhibit characteristic chromosomal locations, usually at subtelomeric and centromeric regions. The repetitive DNA family(ies) may be widely distributed in a taxonomic family or a genus, or may be specific for a species, genome or even a chromosome. They may acquire large-scale variations in their sequence and copy number over an evolutionary time-scale. These features have formed the basis of extensive utilization of repetitive sequences for taxonomic and phylogenetic studies. Hybrid polyploids have especially proven to be excellent models for studying the evolution of repetitive DNA sequences. Recent studies explicitly show that some repetitive DNA families localized at the telomeres and centromeres have acquired important structural and functional significance. The repetitive elements are under different evolutionary constraints as compared to the genes. Satellite DNA families are thought to arise de novo as a consequence of molecular mechanisms such as unequal crossing over, rolling circle amplification, replication slippage and mutation that constitute "molecular drive". Copyright 2005 S. Karger AG, Basel.

  5. Comparative Genomics in Drosophila.

    PubMed

    Oti, Martin; Pane, Attilio; Sammeth, Michael

    2018-01-01

    Since the pioneering studies of Thomas Hunt Morgan and coworkers at the dawn of the twentieth century, Drosophila melanogaster and its sister species have tremendously contributed to unveil the rules underlying animal genetics, development, behavior, evolution, and human disease. Recent advances in DNA sequencing technologies launched Drosophila into the post-genomic era and paved the way for unprecedented comparative genomics investigations. The complete sequencing and systematic comparison of the genomes from 12 Drosophila species represents a milestone achievement in modern biology, which allowed a plethora of different studies ranging from the annotation of known and novel genomic features to the evolution of chromosomes and, ultimately, of entire genomes. Despite the efforts of countless laboratories worldwide, the vast amount of data that were produced over the past 15 years is far from being fully explored.In this chapter, we will review some of the bioinformatic approaches that were developed to interrogate the genomes of the 12 Drosophila species. Setting off from alignments of the entire genomic sequences, the degree of conservation can be separately evaluated for every region of the genome, providing already first hints about elements that are under purifying selection and therefore likely functional. Furthermore, the careful analysis of repeated sequences sheds light on the evolutionary dynamics of transposons, an enigmatic and fascinating class of mobile elements housed in the genomes of animals and plants. Comparative genomics also aids in the computational identification of the transcriptionally active part of the genome, first and foremost of protein-coding loci, but also of transcribed nevertheless apparently noncoding regions, which were once considered "junk" DNA. Eventually, the synergy between functional and comparative genomics also facilitates in silico and in vivo studies on cis-acting regulatory elements, like transcription factor binding sites, that due to the high degree of sequence variability usually impose increased challenges for bioinformatics approaches.

  6. Exploiting Genomic Knowledge in Optimising Molecular Breeding Programmes: Algorithms from Evolutionary Computing

    PubMed Central

    O'Hagan, Steve; Knowles, Joshua; Kell, Douglas B.

    2012-01-01

    Comparatively few studies have addressed directly the question of quantifying the benefits to be had from using molecular genetic markers in experimental breeding programmes (e.g. for improved crops and livestock), nor the question of which organisms should be mated with each other to best effect. We argue that this requires in silico modelling, an approach for which there is a large literature in the field of evolutionary computation (EC), but which has not really been applied in this way to experimental breeding programmes. EC seeks to optimise measurable outcomes (phenotypic fitnesses) by optimising in silico the mutation, recombination and selection regimes that are used. We review some of the approaches from EC, and compare experimentally, using a biologically relevant in silico landscape, some algorithms that have knowledge of where they are in the (genotypic) search space (G-algorithms) with some (albeit well-tuned ones) that do not (F-algorithms). For the present kinds of landscapes, F- and G-algorithms were broadly comparable in quality and effectiveness, although we recognise that the G-algorithms were not equipped with any ‘prior knowledge’ of epistatic pathway interactions. This use of algorithms based on machine learning has important implications for the optimisation of experimental breeding programmes in the post-genomic era when we shall potentially have access to the full genome sequence of every organism in a breeding population. The non-proprietary code that we have used is made freely available (via Supplementary information). PMID:23185279

  7. Alignment-free microbial phylogenomics under scenarios of sequence divergence, genome rearrangement and lateral genetic transfer.

    PubMed

    Bernard, Guillaume; Chan, Cheong Xin; Ragan, Mark A

    2016-07-01

    Alignment-free (AF) approaches have recently been highlighted as alternatives to methods based on multiple sequence alignment in phylogenetic inference. However, the sensitivity of AF methods to genome-scale evolutionary scenarios is little known. Here, using simulated microbial genome data we systematically assess the sensitivity of nine AF methods to three important evolutionary scenarios: sequence divergence, lateral genetic transfer (LGT) and genome rearrangement. Among these, AF methods are most sensitive to the extent of sequence divergence, less sensitive to low and moderate frequencies of LGT, and most robust against genome rearrangement. We describe the application of AF methods to three well-studied empirical genome datasets, and introduce a new application of the jackknife to assess node support. Our results demonstrate that AF phylogenomics is computationally scalable to multi-genome data and can generate biologically meaningful phylogenies and insights into microbial evolution.

  8. Genomic Diversity and Evolution of the Lyssaviruses

    PubMed Central

    Delmas, Olivier; Holmes, Edward C.; Talbi, Chiraz; Larrous, Florence; Dacheux, Laurent; Bouchier, Christiane; Bourhy, Hervé

    2008-01-01

    Lyssaviruses are RNA viruses with single-strand, negative-sense genomes responsible for rabies-like diseases in mammals. To date, genomic and evolutionary studies have most often utilized partial genome sequences, particularly of the nucleoprotein and glycoprotein genes, with little consideration of genome-scale evolution. Herein, we report the first genomic and evolutionary analysis using complete genome sequences of all recognised lyssavirus genotypes, including 14 new complete genomes of field isolates from 6 genotypes and one genotype that is completely sequenced for the first time. In doing so we significantly increase the extent of genome sequence data available for these important viruses. Our analysis of these genome sequence data reveals that all lyssaviruses have the same genomic organization. A phylogenetic analysis reveals strong geographical structuring, with the greatest genetic diversity in Africa, and an independent origin for the two known genotypes that infect European bats. We also suggest that multiple genotypes may exist within the diversity of viruses currently classified as ‘Lagos Bat’. In sum, we show that rigorous phylogenetic techniques based on full length genome sequence provide the best discriminatory power for genotype classification within the lyssaviruses. PMID:18446239

  9. Genome size variation in the genus Avena.

    PubMed

    Yan, Honghai; Martin, Sara L; Bekele, Wubishet A; Latta, Robert G; Diederichsen, Axel; Peng, Yuanying; Tinker, Nicholas A

    2016-03-01

    Genome size is an indicator of evolutionary distance and a metric for genome characterization. Here, we report accurate estimates of genome size in 99 accessions from 26 species of Avena. We demonstrate that the average genome size of C genome diploid species (2C = 10.26 pg) is 15% larger than that of A genome species (2C = 8.95 pg), and that this difference likely accounts for a progression of size among tetraploid species, where AB < AC < CC (average 2C = 16.76, 18.60, and 21.78 pg, respectively). All accessions from three hexaploid species with the ACD genome configuration had similar genome sizes (average 2C = 25.74 pg). Genome size was mostly consistent within species and in general agreement with current information about evolutionary distance among species. Results also suggest that most of the polyploid species in Avena have experienced genome downsizing in relation to their diploid progenitors. Genome size measurements could provide additional quality control for species identification in germplasm collections, especially in cases where diploid and polyploid species have similar morphology.

  10. Complete genome of the cellulolytic thermophile Acidothermus cellulolyticus 11B provides insights into its ecophysiological and evolutionary adaptations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Xie, Gary; Detter, John C; Bruce, David C

    We present here the complete 2.4 MB genome of the actinobacterial thermophile, Acidothermus cellulolyticus 11B, that surprisingly reveals thermophilic amino acid usage in only the cytosolic subproteome rather than its whole proteome. Thermophilic amino acid usage in the partial proteome implies a recent, ongoing evolution of the A. cellulolyticus genome since its divergence about 200-250 million years ago from its closest phylogenetic neighbor Frankia, a mesophilic plant symbiont. Differential amino acid usage in the predicted subproteomes of A. cellulolyticus likely reflects a stepwise evolutionary process of modern thermophiles in general. An unusual occurrence of higher G+C in the non-coding DNAmore » than in the transcribed genome reinforces a late evolution from a higher G+C common ancestor. Comparative analyses of the A. cellulolyticus genome with those of Frankia and other closely-related actinobacteria revealed that A. cellulolyticus genes exhibit reciprocal purine preferences at the first and third codon positions, perhaps reflecting a subtle preference for the dinucleotide AG in its mRNAs, a possible adaptation to a thermophilic environment. Other interesting features in the genome of this cellulolytic, hot-springs dwelling prokaryote reveal streamlining for adaptation to its specialized ecological niche. These include a low occurrence of pseudo genes or mobile genetic elements, a flagellar gene complement previously unknown in this organism, and presence of laterally-acquired genomic islands of likely ecophysiological value. New glycoside hydrolases relevant for lignocellulosic biomass deconstruction were identified in the genome, indicating a diverse biomass-degrading enzyme repertoire several-fold greater than previously characterized, and significantly elevating the industrial value of this organism.« less

  11. Complete genome of the cellulolytic thermophile Acidothermus cellulolyticus 11B provides insights into its ecophysiological and evolutionary adaptations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Xie, Gary; Detter, Chris; Bruce, David

    We present here the complete 2.4 MB genome of the actinobacterial thermophile, Acidothermus cellulolyticus lIB, that surprisingly reveals thermophilic amino acid usage in only the cytosolic subproteome rather than its whole proteome. Thermophilic amino acid usage in the partial proteome implies a recent, ongoing evolution of the A. cellulolyticus genome since its divergence about 200-250 million years ago from its closest phylogenetic neighbor Frankia, a mesophilic plant symbiont. Differential amino acid usage in the predicted subproteomes of A. cellulolyticus likely reflects a stepwise evolutionary process of modern thermophiles in general. An unusual occurrence of higher G+C in the non-coding DNAmore » than in the transcribed genome reinforces a late evolution from a higher G+C common ancestor. Comparative analyses of the A. cellulolyticus genome with those of Frankia and other closely-related actinobacteria revealed that A. cellulolyticus genes exhibit reciprocal purine preferences at the first and third codon positions, perhaps reflecting a subtle preference for the dinucleotide AG in its mRNAs, a possible adaptation to a thermophilic environment. Other interesting features in the genome of this cellulolytic, hot-springs dwelling prokaryote reveal streamlining for adaptation to its specialized ecological niche. These include a low occurrence of pseudogenes or mobile genetic elements, a flagellar gene complement previously unknown in this organism, and presence of laterally-acquired genomic islands of likely ecophysiological value. New glycoside hydrolases relevant for lignocellulosic biomass deconstruction were identified in the genome, indicating a diverse biomass-degrading enzyme repertoire several-fold greater than previously characterized, and significantly elevating the industrial value of this organism.« less

  12. The landscape of inherited and de novo copy number variants in a plasmodium falciparum genetic cross

    PubMed Central

    2011-01-01

    Background Copy number is a major source of genome variation with important evolutionary implications. Consequently, it is essential to determine copy number variant (CNV) behavior, distributions and frequencies across genomes to understand their origins in both evolutionary and generational time frames. We use comparative genomic hybridization (CGH) microarray and the resolution provided by a segregating population of cloned progeny lines of the malaria parasite, Plasmodium falciparum, to identify and analyze the inheritance of 170 genome-wide CNVs. Results We describe CNVs in progeny clones derived from both Mendelian (i.e. inherited) and non-Mendelian mechanisms. Forty-five CNVs were present in the parent lines and segregated in the progeny population. Furthermore, extensive variation that did not conform to strict Mendelian inheritance patterns was observed. 124 CNVs were called in one or more progeny but in neither parent: we observed CNVs in more than one progeny clone that were not identified in either parent, located more frequently in the telomeric-subtelomeric regions of chromosomes and singleton de novo CNVs distributed evenly throughout the genome. Linkage analysis of CNVs revealed dynamic copy number fluctuations and suggested mechanisms that could have generated them. Five of 12 previously identified expression quantitative trait loci (eQTL) hotspots coincide with CNVs, demonstrating the potential for broad influence of CNV on the transcriptional program and phenotypic variation. Conclusions CNVs are a significant source of segregating and de novo genome variation involving hundreds of genes. Examination of progeny genome segments provides a framework to assess the extent and possible origins of CNVs. This segregating genetic system reveals the breadth, distribution and dynamics of CNVs in a surprisingly plastic parasite genome, providing a new perspective on the sources of diversity in parasite populations. PMID:21936954

  13. Clusters of ancestrally related genes that show paralogy in whole or in part are a major feature of the genomes of humans and other species.

    PubMed

    Walker, Michael B; King, Benjamin L; Paigen, Kenneth

    2012-01-01

    Arrangements of genes along chromosomes are a product of evolutionary processes, and we can expect that preferable arrangements will prevail over the span of evolutionary time, often being reflected in the non-random clustering of structurally and/or functionally related genes. Such non-random arrangements can arise by two distinct evolutionary processes: duplications of DNA sequences that give rise to clusters of genes sharing both sequence similarity and common sequence features and the migration together of genes related by function, but not by common descent. To provide a background for distinguishing between the two, which is important for future efforts to unravel the evolutionary processes involved, we here provide a description of the extent to which ancestrally related genes are found in proximity.Towards this purpose, we combined information from five genomic datasets, InterPro, SCOP, PANTHER, Ensembl protein families, and Ensembl gene paralogs. The results are provided in publicly available datasets (http://cgd.jax.org/datasets/clustering/paraclustering.shtml) describing the extent to which ancestrally related genes are in proximity beyond what is expected by chance (i.e. form paraclusters) in the human and nine other vertebrate genomes, as well as the D. melanogaster, C. elegans, A. thaliana, and S. cerevisiae genomes. With the exception of Saccharomyces, paraclusters are a common feature of the genomes we examined. In the human genome they are estimated to include at least 22% of all protein coding genes. Paraclusters are far more prevalent among some gene families than others, are highly species or clade specific and can evolve rapidly, sometimes in response to environmental cues. Altogether, they account for a large portion of the functional clustering previously reported in several genomes.

  14. Evolutionary versatility of eukaryotic protein domains revealed by their bigram networks

    PubMed Central

    2011-01-01

    Background Protein domains are globular structures of independently folded polypeptides that exert catalytic or binding activities. Their sequences are recognized as evolutionary units that, through genome recombination, constitute protein repertoires of linkage patterns. Via mutations, domains acquire modified functions that contribute to the fitness of cells and organisms. Recent studies have addressed the evolutionary selection that may have shaped the functions of individual domains and the emergence of particular domain combinations, which led to new cellular functions in multi-cellular animals. This study focuses on modeling domain linkage globally and investigates evolutionary implications that may be revealed by novel computational analysis. Results A survey of 77 completely sequenced eukaryotic genomes implies a potential hierarchical and modular organization of biological functions in most living organisms. Domains in a genome or multiple genomes are modeled as a network of hetero-duplex covalent linkages, termed bigrams. A novel computational technique is introduced to decompose such networks, whereby the notion of domain "networking versatility" is derived and measured. The most and least "versatile" domains (termed "core domains" and "peripheral domains" respectively) are examined both computationally via sequence conservation measures and experimentally using selected domains. Our study suggests that such a versatility measure extracted from the bigram networks correlates with the adaptivity of domains during evolution, where the network core domains are highly adaptive, significantly contrasting the network peripheral domains. Conclusions Domain recombination has played a major part in the evolution of eukaryotes attributing to genome complexity. From a system point of view, as the results of selection and constant refinement, networks of domain linkage are structured in a hierarchical modular fashion. Domains with high degree of networking versatility appear to be evolutionary adaptive, potentially through functional innovations. Domain bigram networks are informative as a model of biological functions. The networking versatility indices extracted from such networks for individual domains reflect the strength of evolutionary selection that the domains have experienced. PMID:21849086

  15. Evolutionary versatility of eukaryotic protein domains revealed by their bigram networks.

    PubMed

    Xie, Xueying; Jin, Jing; Mao, Yongyi

    2011-08-18

    Protein domains are globular structures of independently folded polypeptides that exert catalytic or binding activities. Their sequences are recognized as evolutionary units that, through genome recombination, constitute protein repertoires of linkage patterns. Via mutations, domains acquire modified functions that contribute to the fitness of cells and organisms. Recent studies have addressed the evolutionary selection that may have shaped the functions of individual domains and the emergence of particular domain combinations, which led to new cellular functions in multi-cellular animals. This study focuses on modeling domain linkage globally and investigates evolutionary implications that may be revealed by novel computational analysis. A survey of 77 completely sequenced eukaryotic genomes implies a potential hierarchical and modular organization of biological functions in most living organisms. Domains in a genome or multiple genomes are modeled as a network of hetero-duplex covalent linkages, termed bigrams. A novel computational technique is introduced to decompose such networks, whereby the notion of domain "networking versatility" is derived and measured. The most and least "versatile" domains (termed "core domains" and "peripheral domains" respectively) are examined both computationally via sequence conservation measures and experimentally using selected domains. Our study suggests that such a versatility measure extracted from the bigram networks correlates with the adaptivity of domains during evolution, where the network core domains are highly adaptive, significantly contrasting the network peripheral domains. Domain recombination has played a major part in the evolution of eukaryotes attributing to genome complexity. From a system point of view, as the results of selection and constant refinement, networks of domain linkage are structured in a hierarchical modular fashion. Domains with high degree of networking versatility appear to be evolutionary adaptive, potentially through functional innovations. Domain bigram networks are informative as a model of biological functions. The networking versatility indices extracted from such networks for individual domains reflect the strength of evolutionary selection that the domains have experienced.

  16. Phylomedicine: An evolutionary telescope to explore and diagnose the universe of disease mutations

    PubMed Central

    Kumar, Sudhir; Dudley, Joel T.; Filipski, Alan; Liu, Li

    2011-01-01

    Modern technologies have made the sequencing of personal genomes routine. They have revealed thousands of nonsynonymous (amino-acid altering) single nucleotide variants (nSNVs) of protein coding DNA per genome. What do these variants foretell about an individual’s predisposition to diseases? The experimental technologies required to carry out such evaluations at a genomic scale are not yet available. Fortunately, the process of natural selection has lent us an almost infinite set of tests in nature. During the long-term evolution, new mutations and existing variations have been evaluated for their biological consequences in countless species, and outcomes were readily revealed by multispecies genome comparisons. We review studies that have investigated evolutionary characteristics and in silico functional diagnoses of nSNVs found in thousands of disease-associated genes. We conclude that the patterns of long-term evolutionary conservation and permissible divergence are essential and instructive modalities for functional assessment of human genetic variations. PMID:21764165

  17. Segmental Duplications in Euchromatic Regions of Human Chromosome 5: A Source of Evolutionary Instability and Transcriptional Innovation

    PubMed Central

    Courseaux, Anouk; Richard, Florence; Grosgeorge, Josiane; Ortola, Christine; Viale, Agnes; Turc-Carel, Claude; Dutrillaux, Bernard; Gaudray, Patrick; Nahon, Jean-Louis

    2003-01-01

    Recent analyses of the structure of pericentromeric and subtelomeric regions have revealed that these particular regions of human chromosomes are often composed of blocks of duplicated genomic segments that have been associated with rapid evolutionary turnover among the genomes of closely related primates. In the present study, we show that euchromatic regions of human chromosome 5—5p14, 5p13, 5q13, 5q15–5q21—also display such an accumulation of segmental duplications. The structure, organization and evolution of those primate-specific sequences were studied in detail by combining in silico and comparative FISH analyses on human, chimpanzee, gorilla, orangutang, macaca, and capuchin chromosomes. Our results lend support to a two-step model of transposition duplication in the euchromatic regions, with a founder insertional event at the time of divergence between Platyrrhini and Catarrhini (25–35 million years ago) and an apparent burst of inter- and intrachromosomal duplications in the Hominidae lineage. Furthermore, phylogenetic analysis suggests that the chronology and, likely, molecular mechanisms, differ regarding the region of primary insertion—euchromatic versus pericentromeric regions. Lastly, we show that as their counterparts located near the heterochromatic region, the euchromatic segmental duplications have consistently reshaped their region of insertion during primate evolution, creating putative mosaic genes, and they are obvious candidates for causing ectopic rearrangements that have contributed to evolutionary/genomic instability. [Supplemental material is available online at www.genome.org. The following individuals kindly provided reagents, samples, or unpublished information as indicated in the paper: D. Le Paslier, A. McKenzie, J. Melki, C. Sargent, J. Scharf and S. Selig.] PMID:12618367

  18. Bioinformatics analysis and genetic diversity of the poliovirus.

    PubMed

    Liu, Yanhan; Ma, Tengfei; Liu, Jianzhu; Zhao, Xiaona; Cheng, Ziqiang; Guo, Huijun; Wang, Shujing; Xu, Ruixue

    2014-12-01

    Poliomyelitis, a disease which can manifest as muscle paralysis, is caused by the poliovirus, which is a human enterovirus and member of the family Picornaviridae that usually transmits by the faecal-oral route. The viruses of the OPV (oral poliovirus attenuated-live vaccine) strains can mutate in the human intestine during replication and some of these mutations can lead to the recovery of serious neurovirulence. Informatics research of the poliovirus genome can be used to explain further the characteristics of this virus. In this study, sequences from 100 poliovirus isolates were acquired from GenBank. To determine the evolutionary relationship between the strains, we compared and analysed the sequences of the complete poliovirus genome and the VP1 region. The reconstructed phylogenetic trees for the complete sequences and the VP1 sequences were both divided into two branches, indicating that the genetic relationships of the whole poliovirus genome and the VP1 sequences are very similar. This branching indicates that the virulence and pathogenicity of poliomyelitis may be associated with the VP1 region. Sequence alignment of the VP1 region revealed numerous mutation sites in which mutation rates of >30 % were detected. In a group of strains recorded in the USA, mutation sites and mutation types were the same and this may be associated with their distribution in the evolutionary tree and their genetic relationship. In conclusion, the genetic evolutionary relationships of poliovirus isolate sequences are determined to a great extent by the VP1 protein, and poliovirus strains located on the same branch of the phylogenetic tree contain the same mutation spots and mutation types. Hence, the genetic characteristics of the VP1 region in the poliovirus genome should be analysed to identify the transmission route of poliovirus and provide the basis of viral immunity development. © 2014 The Authors.

  19. A Surrogate Approach to Study the Evolution of Noncoding DNA Elements That Organize Eukaryotic Genomes

    PubMed Central

    Vermaak, Danielle; Bayes, Joshua J.

    2009-01-01

    Comparative genomics provides a facile way to address issues of evolutionary constraint acting on different elements of the genome. However, several important DNA elements have not reaped the benefits of this new approach. Some have proved intractable to current day sequencing technology. These include centromeric and heterochromatic DNA, which are essential for chromosome segregation as well as gene regulation, but the highly repetitive nature of the DNA sequences in these regions make them difficult to assemble into longer contigs. Other sequences, like dosage compensation X chromosomal sites, origins of DNA replication, or heterochromatic sequences that encode piwi-associated RNAs, have proved difficult to study because they do not have recognizable DNA features that allow them to be described functionally or computationally. We have employed an alternate approach to the direct study of these DNA elements. By using proteins that specifically bind these noncoding DNAs as surrogates, we can indirectly assay the evolutionary constraints acting on these important DNA elements. We review the impact that such “surrogate strategies” have had on our understanding of the evolutionary constraints shaping centromeres, origins of DNA replication, and dosage compensation X chromosomal sites. These have begun to reveal that in contrast to the view that such structural DNA elements are either highly constrained (under purifying selection) or free to drift (under neutral evolution), some of them may instead be shaped by adaptive evolution and genetic conflicts (these are not mutually exclusive). These insights also help to explain why the same elements (e.g., centromeres and replication origins), which are so complex in some eukaryotic genomes, can be simple and well defined in other where similar conflicts do not exist. PMID:19635763

  20. G-Anchor: a novel approach for whole-genome comparative mapping utilizing evolutionary conserved DNA sequences.

    PubMed

    Lenis, Vasileios Panagiotis E; Swain, Martin; Larkin, Denis M

    2018-05-01

    Cross-species whole-genome sequence alignment is a critical first step for genome comparative analyses, ranging from the detection of sequence variants to studies of chromosome evolution. Animal genomes are large and complex, and whole-genome alignment is a computationally intense process, requiring expensive high-performance computing systems due to the need to explore extensive local alignments. With hundreds of sequenced animal genomes available from multiple projects, there is an increasing demand for genome comparative analyses. Here, we introduce G-Anchor, a new, fast, and efficient pipeline that uses a strictly limited but highly effective set of local sequence alignments to anchor (or map) an animal genome to another species' reference genome. G-Anchor makes novel use of a databank of highly conserved DNA sequence elements. We demonstrate how these elements may be aligned to a pair of genomes, creating anchors. These anchors enable the rapid mapping of scaffolds from a de novo assembled genome to chromosome assemblies of a reference species. Our results demonstrate that G-Anchor can successfully anchor a vertebrate genome onto a phylogenetically related reference species genome using a desktop or laptop computer within a few hours and with comparable accuracy to that achieved by a highly accurate whole-genome alignment tool such as LASTZ. G-Anchor thus makes whole-genome comparisons accessible to researchers with limited computational resources. G-Anchor is a ready-to-use tool for anchoring a pair of vertebrate genomes. It may be used with large genomes that contain a significant fraction of evolutionally conserved DNA sequences and that are not highly repetitive, polypoid, or excessively fragmented. G-Anchor is not a substitute for whole-genome aligning software but can be used for fast and accurate initial genome comparisons. G-Anchor is freely available and a ready-to-use tool for the pairwise comparison of two genomes.

  1. Eco-Evo PVAs: Incorporating Eco-Evolutionary Processes into Population Viability Models

    EPA Science Inventory

    We synthesize how advances in computational methods and population genomics can be combined within an Ecological-Evolutionary (Eco-Evo) PVA model. Eco-Evo PVA models are powerful new tools for understanding the influence of evolutionary processes on plant and animal population pe...

  2. Dynamic Nucleotide Mutation Gradients and Control Region Usage in Squamate Reptile Mitochondrial Genomes

    PubMed Central

    Castoe, T.A.; Gu, W.; de Koning, A.P.J.; Daza, J.M.; Jiang, Z.J.; Parkinson, C.L.; Pollock, D.D.

    2010-01-01

    Gradients of nucleotide bias and substitution rates occur in vertebrate mitochondrial genomes due to the asymmetric nature of the replication process. The evolution of these gradients has previously been studied in detail in primates, but not in other vertebrate groups. From the primate study, the strengths of these gradients are known to evolve in ways that can substantially alter the substitution process, but it is unclear how rapidly they evolve over evolutionary time or how different they may be in different lineages or groups of vertebrates. Given the importance of mitochondrial genomes in phylogenetics and molecular evolutionary research, a better understanding of how asymmetric mitochondrial substitution gradients evolve would contribute key insights into how this gradient evolution may mislead evolutionary inferences, and how it may also be incorporated into new evolutionary models. Most snake mitochondrial genomes have an additional interesting feature, 2 nearly identical control regions, which vary among different species in the extent that they are used as origins of replication. Given the expanded sampling of complete snake genomes currently available, together with 2 additional snakes sequenced in this study, we reexamined gradient strength and CR usage in alethinophidian snakes as well as several lizards that possess dual CRs. Our results suggest that nucleotide substitution gradients (and corresponding nucleotide bias) and CR usage is highly labile over the ∼200 m.y. of squamate evolution, and demonstrates greater overall variability than previously shown in primates. The evidence for the existence of such gradients, and their ability to evolve rapidly and converge among unrelated species suggests that gradient dynamics could easily mislead phylogenetic and molecular evolutionary inferences, and argues strongly that these dynamics should be incorporated into phylogenetic models. PMID:20215734

  3. Evolutionary genomics: is Buchnera a bacterium or an organelle?

    PubMed

    Andersson, J O

    2000-11-30

    The first genome sequence of an intracellular bacterial symbiont of a eukaryotic cell has been determined. The Buchnera genome shares features with the genomes of both intracellular pathogenic bacteria and eukaryotic organelles, and it may represent an intermediate between the two.

  4. Evolutionary and comparative analyses of the soybean genome

    PubMed Central

    Cannon, Steven B.; Shoemaker, Randy C.

    2012-01-01

    The soybean genome assembly has been available since the end of 2008. Significant features of the genome include large, gene-poor, repeat-dense pericentromeric regions, spanning roughly 57% of the genome sequence; a relatively large genome size of ~1.15 billion bases; remnants of a genome duplication that occurred ~13 million years ago (Mya); and fainter remnants of older polyploidies that occurred ~58 Mya and >130 Mya. The genome sequence has been used to identify the genetic basis for numerous traits, including disease resistance, nutritional characteristics, and developmental features. The genome sequence has provided a scaffold for placement of many genomic feature elements, both from within soybean and from related species. These may be accessed at several websites, including http://www.phytozome.net, http://soybase.org, http://comparative-legumes.org, and http://www.legumebase.brc.miyazaki-u.ac.jp. The taxonomic position of soybean in the Phaseoleae tribe of the legumes means that there are approximately two dozen other beans and relatives that have undergone independent domestication, and which may have traits that will be useful for transfer to soybean. Methods of translating information between species in the Phaseoleae range from design of markers for marker assisted selection, to transformation with Agrobacterium or with other experimental transformation methods. PMID:23136483

  5. Intraspecific and heteroplasmic variations, gene losses and inversions in the chloroplast genome of Astragalus membranaceus.

    PubMed

    Lei, Wanjun; Ni, Dapeng; Wang, Yujun; Shao, Junjie; Wang, Xincun; Yang, Dan; Wang, Jinsheng; Chen, Haimei; Liu, Chang

    2016-02-22

    Astragalus membranaceus is an important medicinal plant in Asia. Several of its varieties have been used interchangeably as raw materials for commercial production. High resolution genetic markers are in urgent need to distinguish these varieties. Here, we sequenced and analyzed the chloroplast genome of A. membranaceus (Fisch.) Bunge var. mongholicus (Bunge) P.K. Hsiao using the next generation DNA sequencing technology. The genome was assembled using Abyss and then subjected to gene prediction using CPGAVAS and repeat analysis using MISA, Tandem Repeats Finder, and REPuter. Finally, the genome was subjected phylogenetic and comparative genomic analyses. The complete genome is 123,582 bp long, containing only one copy of the inverted repeat. Gene prediction revealed 110 genes encoding 76 proteins, 30 tRNAs, and four rRNAs. Five intra-specific hypermutation loci were identified, three of which are heteroplasmic. Furthermore, three gene losses and two large inversions were identified. Comparative genomic analyses demonstrated the dynamic nature of the Papilionoideae chloroplast genomes, which showed occurrence of numerous hypermutation loci, frequent gene losses, and fragment inversions. Results obtained herein elucidate the complex evolutionary history of chloroplast genomes and have laid the foundation for the identification of genetic markers to distinguish A. membranaceus varieties.

  6. COGNAT: a web server for comparative analysis of genomic neighborhoods.

    PubMed

    Klimchuk, Olesya I; Konovalov, Kirill A; Perekhvatov, Vadim V; Skulachev, Konstantin V; Dibrova, Daria V; Mulkidjanian, Armen Y

    2017-11-22

    In prokaryotic genomes, functionally coupled genes can be organized in conserved gene clusters enabling their coordinated regulation. Such clusters could contain one or several operons, which are groups of co-transcribed genes. Those genes that evolved from a common ancestral gene by speciation (i.e. orthologs) are expected to have similar genomic neighborhoods in different organisms, whereas those copies of the gene that are responsible for dissimilar functions (i.e. paralogs) could be found in dissimilar genomic contexts. Comparative analysis of genomic neighborhoods facilitates the prediction of co-regulated genes and helps to discern different functions in large protein families. We intended, building on the attribution of gene sequences to the clusters of orthologous groups of proteins (COGs), to provide a method for visualization and comparative analysis of genomic neighborhoods of evolutionary related genes, as well as a respective web server. Here we introduce the COmparative Gene Neighborhoods Analysis Tool (COGNAT), a web server for comparative analysis of genomic neighborhoods. The tool is based on the COG database, as well as the Pfam protein families database. As an example, we show the utility of COGNAT in identifying a new type of membrane protein complex that is formed by paralog(s) of one of the membrane subunits of the NADH:quinone oxidoreductase of type 1 (COG1009) and a cytoplasmic protein of unknown function (COG3002). This article was reviewed by Drs. Igor Zhulin, Uri Gophna and Igor Rogozin.

  7. Favorable genomic environments for cis-regulatory evolution: A novel theoretical framework.

    PubMed

    Maeso, Ignacio; Tena, Juan J

    2016-09-01

    Cis-regulatory changes are arguably the primary evolutionary source of animal morphological diversity. With the recent explosion of genome-wide comparisons of the cis-regulatory content in different animal species is now possible to infer general principles underlying enhancer evolution. However, these studies have also revealed numerous discrepancies and paradoxes, suggesting that the mechanistic causes and modes of cis-regulatory evolution are still not well understood and are probably much more complex than generally appreciated. Here, we argue that the mutational mechanisms and genomic regions generating new regulatory activities must comply with the constraints imposed by the molecular properties of cis-regulatory elements (CREs) and the organizational features of long-range chromatin interactions. Accordingly, we propose a new integrative evolutionary framework for cis-regulatory evolution based on two major premises for the origin of novel enhancer activity: (i) an accessible chromatin environment and (ii) compatibility with the 3D structure and interactions of pre-existing CREs. Mechanisms and DNA sequences not fulfilling these premises, will be less likely to have a measurable impact on gene expression and as such, will have a minor contribution to the evolution of gene regulation. Finally, we discuss current comparative cis-regulatory data under the light of this new evolutionary model, and propose that the two most prominent mechanisms for the evolution of cis-regulatory changes are the overprinting of ancestral CREs and the exaptation of transposable elements. Copyright © 2015 Elsevier Ltd. All rights reserved.

  8. Detecting signatures of positive selection associated with musical aptitude in the human genome

    PubMed Central

    Liu, Xuanyao; Kanduri, Chakravarthi; Oikkonen, Jaana; Karma, Kai; Raijas, Pirre; Ukkola-Vuoti, Liisa; Teo, Yik-Ying; Järvelä, Irma

    2016-01-01

    Abilities related to musical aptitude appear to have a long history in human evolution. To elucidate the molecular and evolutionary background of musical aptitude, we compared genome-wide genotyping data (641 K SNPs) of 148 Finnish individuals characterized for musical aptitude. We assigned signatures of positive selection in a case-control setting using three selection methods: haploPS, XP-EHH and FST. Gene ontology classification revealed that the positive selection regions contained genes affecting inner-ear development. Additionally, literature survey has shown that several of the identified genes were known to be involved in auditory perception (e.g. GPR98, USH2A), cognition and memory (e.g. GRIN2B, IL1A, IL1B, RAPGEF5), reward mechanisms (RGS9), and song perception and production of songbirds (e.g. FOXP1, RGS9, GPR98, GRIN2B). Interestingly, genes related to inner-ear development and cognition were also detected in a previous genome-wide association study of musical aptitude. However, the candidate genes detected in this study were not reported earlier in studies of musical abilities. Identification of genes related to language development (FOXP1 and VLDLR) support the popular hypothesis that music and language share a common genetic and evolutionary background. The findings are consistent with the evolutionary conservation of genes related to auditory processes in other species and provide first empirical evidence for signatures of positive selection for abilities that contribute to musical aptitude. PMID:26879527

  9. Detecting signatures of positive selection associated with musical aptitude in the human genome.

    PubMed

    Liu, Xuanyao; Kanduri, Chakravarthi; Oikkonen, Jaana; Karma, Kai; Raijas, Pirre; Ukkola-Vuoti, Liisa; Teo, Yik-Ying; Järvelä, Irma

    2016-02-16

    Abilities related to musical aptitude appear to have a long history in human evolution. To elucidate the molecular and evolutionary background of musical aptitude, we compared genome-wide genotyping data (641 K SNPs) of 148 Finnish individuals characterized for musical aptitude. We assigned signatures of positive selection in a case-control setting using three selection methods: haploPS, XP-EHH and FST. Gene ontology classification revealed that the positive selection regions contained genes affecting inner-ear development. Additionally, literature survey has shown that several of the identified genes were known to be involved in auditory perception (e.g. GPR98, USH2A), cognition and memory (e.g. GRIN2B, IL1A, IL1B, RAPGEF5), reward mechanisms (RGS9), and song perception and production of songbirds (e.g. FOXP1, RGS9, GPR98, GRIN2B). Interestingly, genes related to inner-ear development and cognition were also detected in a previous genome-wide association study of musical aptitude. However, the candidate genes detected in this study were not reported earlier in studies of musical abilities. Identification of genes related to language development (FOXP1 and VLDLR) support the popular hypothesis that music and language share a common genetic and evolutionary background. The findings are consistent with the evolutionary conservation of genes related to auditory processes in other species and provide first empirical evidence for signatures of positive selection for abilities that contribute to musical aptitude.

  10. Recurrent Innovation at Genes Required for Telomere Integrity in Drosophila.

    PubMed

    Lee, Yuh Chwen G; Leek, Courtney; Levine, Mia T

    2017-02-01

    Telomeres are nucleoprotein complexes at the ends of linear chromosomes. These specialized structures ensure genome integrity and faithful chromosome inheritance. Recurrent addition of repetitive, telomere-specific DNA elements to chromosome ends combats end-attrition, while specialized telomere-associated proteins protect naked, double-stranded chromosome ends from promiscuous repair into end-to-end fusions. Although telomere length homeostasis and end-protection are ubiquitous across eukaryotes, there is sporadic but building evidence that the molecular machinery supporting these essential processes evolves rapidly. Nevertheless, no global analysis of the evolutionary forces that shape these fast-evolving proteins has been performed on any eukaryote. The abundant population and comparative genomic resources of Drosophila melanogaster and its close relatives offer us a unique opportunity to fill this gap. Here we leverage population genetics, molecular evolution, and phylogenomics to define the scope and evolutionary mechanisms driving fast evolution of genes required for telomere integrity. We uncover evidence of pervasive positive selection across multiple evolutionary timescales. We also document prolific expansion, turnover, and expression evolution in gene families founded by telomeric proteins. Motivated by the mutant phenotypes and molecular roles of these fast-evolving genes, we put forward four alternative, but not mutually exclusive, models of intra-genomic conflict that may play out at very termini of eukaryotic chromosomes. Our findings set the stage for investigating both the genetic causes and functional consequences of telomere protein evolution in Drosophila and beyond. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  11. Using evolutionary computations to understand the design and evolution of gene and cell regulatory networks.

    PubMed

    Spirov, Alexander; Holloway, David

    2013-07-15

    This paper surveys modeling approaches for studying the evolution of gene regulatory networks (GRNs). Modeling of the design or 'wiring' of GRNs has become increasingly common in developmental and medical biology, as a means of quantifying gene-gene interactions, the response to perturbations, and the overall dynamic motifs of networks. Drawing from developments in GRN 'design' modeling, a number of groups are now using simulations to study how GRNs evolve, both for comparative genomics and to uncover general principles of evolutionary processes. Such work can generally be termed evolution in silico. Complementary to these biologically-focused approaches, a now well-established field of computer science is Evolutionary Computations (ECs), in which highly efficient optimization techniques are inspired from evolutionary principles. In surveying biological simulation approaches, we discuss the considerations that must be taken with respect to: (a) the precision and completeness of the data (e.g. are the simulations for very close matches to anatomical data, or are they for more general exploration of evolutionary principles); (b) the level of detail to model (we proceed from 'coarse-grained' evolution of simple gene-gene interactions to 'fine-grained' evolution at the DNA sequence level); (c) to what degree is it important to include the genome's cellular context; and (d) the efficiency of computation. With respect to the latter, we argue that developments in computer science EC offer the means to perform more complete simulation searches, and will lead to more comprehensive biological predictions. Copyright © 2013 Elsevier Inc. All rights reserved.

  12. Mountain gorilla genomes reveal the impact of long-term population decline and inbreeding

    PubMed Central

    Ayub, Qasim; Szpak, Michal; Frandsen, Peter; Chen, Yuan; Yngvadottir, Bryndis; Cooper, David N.; de Manuel, Marc; Hernandez-Rodriguez, Jessica; Lobon, Irene; Siegismund, Hans R.; Pagani, Luca; Quail, Michael A.; Hvilsom, Christina; Mudakikwa, Antoine; Eichler, Evan E.; Cranfield, Michael R.; Marques-Bonet, Tomas; Tyler-Smith, Chris; Scally, Aylwyn

    2015-01-01

    Mountain gorillas are an endangered great ape subspecies and a prominent focus for conservation, yet we know little about their genomic diversity and evolutionary past. We sequenced whole genomes from multiple wild individuals and compared the genomes of all four Gorilla subspecies. We found that the two eastern subspecies have experienced a prolonged population decline over the past 100,000 years, resulting in very low genetic diversity and an increased overall burden of deleterious variation. A further recent decline in the mountain gorilla population has led to extensive inbreeding, such that individuals are typically homozygous at 34% of their sequence, leading to the purging of severely deleterious recessive mutations from the population. We discuss the causes of their decline and the consequences for their future survival. PMID:25859046

  13. Comparative Chloroplast Genome Analyses of Streptophyte Green Algae Uncover Major Structural Alterations in the Klebsormidiophyceae, Coleochaetophyceae and Zygnematophyceae.

    PubMed

    Lemieux, Claude; Otis, Christian; Turmel, Monique

    2016-01-01

    The Streptophyta comprises all land plants and six main lineages of freshwater green algae: Mesostigmatophyceae, Chlorokybophyceae, Klebsormidiophyceae, Charophyceae, Coleochaetophyceae and Zygnematophyceae. Previous comparisons of the chloroplast genome from nine streptophyte algae (including four zygnematophyceans) revealed that, although land plant chloroplast DNAs (cpDNAs) inherited most of their highly conserved structural features from green algal ancestors, considerable cpDNA changes took place during the evolution of the Zygnematophyceae, the sister group of land plants. To gain deeper insights into the evolutionary dynamics of the chloroplast genome in streptophyte algae, we sequenced the cpDNAs of nine additional taxa: two klebsormidiophyceans (Entransia fimbriata and Klebsormidium sp. SAG 51.86), one coleocheatophycean (Coleochaete scutata) and six zygnematophyceans (Cylindrocystis brebissonii, Netrium digitus, Roya obtusa, Spirogyra maxima, Cosmarium botrytis and Closterium baillyanum). Our comparative analyses of these genomes with their streptophyte algal counterparts indicate that the large inverted repeat (IR) encoding the rDNA operon experienced loss or expansion/contraction in all three sampled classes and that genes were extensively shuffled in both the Klebsormidiophyceae and Zygnematophyceae. The klebsormidiophycean genomes boast greatly expanded IRs, with the Entransia 60,590-bp IR being the largest known among green algae. The 206,025-bp Entransia cpDNA, which is one of the largest genome among streptophytes, encodes 118 standard genes, i.e., four additional genes compared to its Klebsormidium flaccidum homolog. We inferred that seven of the 21 group II introns usually found in land plants were already present in the common ancestor of the Klebsormidiophyceae and its sister lineages. At 107,236 bp and with 117 standard genes, the Coleochaete IR-less genome is both the smallest and most compact among the streptophyte algal cpDNAs analyzed thus far; it lacks eight genes relative to its Chaetosphaeridium globosum homolog, four of which represent unique events in the evolutionary scenario of gene losses we reconstructed for streptophyte algae. The 10 compared zygnematophycean cpDNAs display tremendous variations at all levels, except gene content. During zygnematophycean evolution, the IR disappeared a minimum of five times, the rDNA operon was broken at four distinct sites, group II introns were lost on at least 43 occasions, and putative foreign genes, mainly of phage/viral origin, were gained.

  14. Comparative Chloroplast Genome Analyses of Streptophyte Green Algae Uncover Major Structural Alterations in the Klebsormidiophyceae, Coleochaetophyceae and Zygnematophyceae

    PubMed Central

    Lemieux, Claude; Otis, Christian; Turmel, Monique

    2016-01-01

    The Streptophyta comprises all land plants and six main lineages of freshwater green algae: Mesostigmatophyceae, Chlorokybophyceae, Klebsormidiophyceae, Charophyceae, Coleochaetophyceae and Zygnematophyceae. Previous comparisons of the chloroplast genome from nine streptophyte algae (including four zygnematophyceans) revealed that, although land plant chloroplast DNAs (cpDNAs) inherited most of their highly conserved structural features from green algal ancestors, considerable cpDNA changes took place during the evolution of the Zygnematophyceae, the sister group of land plants. To gain deeper insights into the evolutionary dynamics of the chloroplast genome in streptophyte algae, we sequenced the cpDNAs of nine additional taxa: two klebsormidiophyceans (Entransia fimbriata and Klebsormidium sp. SAG 51.86), one coleocheatophycean (Coleochaete scutata) and six zygnematophyceans (Cylindrocystis brebissonii, Netrium digitus, Roya obtusa, Spirogyra maxima, Cosmarium botrytis and Closterium baillyanum). Our comparative analyses of these genomes with their streptophyte algal counterparts indicate that the large inverted repeat (IR) encoding the rDNA operon experienced loss or expansion/contraction in all three sampled classes and that genes were extensively shuffled in both the Klebsormidiophyceae and Zygnematophyceae. The klebsormidiophycean genomes boast greatly expanded IRs, with the Entransia 60,590-bp IR being the largest known among green algae. The 206,025-bp Entransia cpDNA, which is one of the largest genome among streptophytes, encodes 118 standard genes, i.e., four additional genes compared to its Klebsormidium flaccidum homolog. We inferred that seven of the 21 group II introns usually found in land plants were already present in the common ancestor of the Klebsormidiophyceae and its sister lineages. At 107,236 bp and with 117 standard genes, the Coleochaete IR-less genome is both the smallest and most compact among the streptophyte algal cpDNAs analyzed thus far; it lacks eight genes relative to its Chaetosphaeridium globosum homolog, four of which represent unique events in the evolutionary scenario of gene losses we reconstructed for streptophyte algae. The 10 compared zygnematophycean cpDNAs display tremendous variations at all levels, except gene content. During zygnematophycean evolution, the IR disappeared a minimum of five times, the rDNA operon was broken at four distinct sites, group II introns were lost on at least 43 occasions, and putative foreign genes, mainly of phage/viral origin, were gained. PMID:27252715

  15. Evolutionary genomics and population structure of Entamoeba histolytica

    PubMed Central

    Das, Koushik; Ganguly, Sandipan

    2014-01-01

    Amoebiasis caused by the gastrointestinal parasite Entamoeba histolytica has diverse disease outcomes. Study of genome and evolution of this fascinating parasite will help us to understand the basis of its virulence and explain why, when and how it causes diseases. In this review, we have summarized current knowledge regarding evolutionary genomics of E. histolytica and discussed their association with parasite phenotypes and its differential pathogenic behavior. How genetic diversity reveals parasite population structure has also been discussed. Queries concerning their evolution and population structure which were required to be addressed have also been highlighted. This significantly large amount of genomic data will improve our knowledge about this pathogenic species of Entamoeba. PMID:25505504

  16. Interpreting the genomic landscape of speciation: a road map for finding barriers to gene flow.

    PubMed

    Ravinet, M; Faria, R; Butlin, R K; Galindo, J; Bierne, N; Rafajlović, M; Noor, M A F; Mehlig, B; Westram, A M

    2017-08-01

    Speciation, the evolution of reproductive isolation among populations, is continuous, complex, and involves multiple, interacting barriers. Until it is complete, the effects of this process vary along the genome and can lead to a heterogeneous genomic landscape with peaks and troughs of differentiation and divergence. When gene flow occurs during speciation, barriers restricting gene flow locally in the genome lead to patterns of heterogeneity. However, genomic heterogeneity can also be produced or modified by variation in factors such as background selection and selective sweeps, recombination and mutation rate variation, and heterogeneous gene density. Extracting the effects of gene flow, divergent selection and reproductive isolation from such modifying factors presents a major challenge to speciation genomics. We argue one of the principal aims of the field is to identify the barrier loci involved in limiting gene flow. We first summarize the expected signatures of selection at barrier loci, at the genomic regions linked to them and across the entire genome. We then discuss the modifying factors that complicate the interpretation of the observed genomic landscape. Finally, we end with a road map for future speciation research: a proposal for how to account for these modifying factors and to progress towards understanding the nature of barrier loci. Despite the difficulties of interpreting empirical data, we argue that the availability of promising technical and analytical methods will shed further light on the important roles that gene flow and divergent selection have in shaping the genomic landscape of speciation. © 2017 European Society For Evolutionary Biology. Journal of Evolutionary Biology © 2017 European Society For Evolutionary Biology.

  17. The dynamic evolutionary history of genome size in North American woodland salamanders.

    PubMed

    Newman, Catherine E; Gregory, T Ryan; Austin, Christopher C

    2017-04-01

    The genus Plethodon is the most species-rich salamander genus in North America, and nearly half of its species face an uncertain future. It is also one of the most diverse families in terms of genome sizes, which range from 1C = 18.2 to 69.3 pg, or 5-20 times larger than the human genome. Large genome size in salamanders results in part from accumulation of transposable elements and is associated with various developmental and physiological traits. However, genome sizes have been reported for only 25% of the species of Plethodon (14 of 55). We collected genome size data for Plethodon serratus to supplement an ongoing phylogeographic study, reconstructed the evolutionary history of genome size in Plethodontidae, and inferred probable genome sizes for the 41 species missing empirical data. Results revealed multiple genome size changes in Plethodon: genomes of western Plethodon increased, whereas genomes of eastern Plethodon decreased, followed by additional decreases or subsequent increases. The estimated genome size of P. serratus was 21 pg. New understanding of variation in genome size evolution, along with genome size inferences for previously unstudied taxa, provide a foundation for future studies on the biology of plethodontid salamanders.

  18. Human Variation in Short Regions Predisposed to Deep Evolutionary Conservation

    PubMed Central

    Loots, Gabriela G.; Ovcharenko, Ivan

    2010-01-01

    The landscape of the human genome consists of millions of short islands of conservation that are 100% conserved across multiple vertebrate genomes (termed “bricks”), the majority of which are located in noncoding regions. Several hundred thousand bricks are deeply conserved reaching the genomes of amphibians and fish. Deep phylogenetic conservation of noncoding DNA has been reported to be strongly associated with the presence of gene regulatory elements, introducing bricks as a proxy to the functional noncoding landscape of the human genome. Here, we report a significant overrepresentation of bricks in the promoters of transcription factors and developmental genes, where the high level of phylogenetic conservation correlates with an increase in brick overrepresentation. We also found that the presence of a brick dictates a predisposition to evolutionary constraint, with only 0.7% of the amniota brick central nucleotides being diverged within the primate lineage—an 11-fold reduction in the divergence rate compared with random expectation. Human single-nucleotide polymorphism (SNP) data explains only 3% of primate-specific variation in amniota bricks, thus arguing for a widespread fixation of brick mutations within the primate lineage and prior to human radiation. This variation, in turn, might have been utilized as a driving force for primate- and hominoid-specific adaptation. We also discovered a pronounced deviation from the evolutionary predisposition in the human lineage, with over 20-fold increase in the substitution rate at brick SNP sites over expected values. In addition, contrary to typical brick mutations, brick variation commonly encountered in the human population displays limited, if any, signatures of negative selection as measured by the minor allele frequency and population differentiation (F-statistical measure) measures. These observations argue for the plasticity of gene regulatory mechanisms in vertebrates—with evidence of strong purifying selection acting on the gene regulatory landscape of the human genome, where widespread advantageous mutations in putative regulatory elements are likely utilized in functional diversification and adaptation of species. PMID:20093432

  19. Experimental evidence supports a sex-specific selective sieve in mitochondrial genome evolution.

    PubMed

    Innocenti, Paolo; Morrow, Edward H; Dowling, Damian K

    2011-05-13

    Mitochondria are maternally transmitted; hence, their genome can only make a direct and adaptive response to selection through females, whereas males represent an evolutionary dead end. In theory, this creates a sex-specific selective sieve, enabling deleterious mutations to accumulate in mitochondrial genomes if they exert male-specific effects. We tested this hypothesis, expressing five mitochondrial variants alongside a standard nuclear genome in Drosophila melanogaster, and found striking sexual asymmetry in patterns of nuclear gene expression. Mitochondrial polymorphism had few effects on nuclear gene expression in females but major effects in males, modifying nearly 10% of transcripts. These were mostly male-biased in expression, with enrichment hotspots in the testes and accessory glands. Our results suggest an evolutionary mechanism that results in mitochondrial genomes harboring male-specific mutation loads.

  20. Evolutionary history and positional shift of a rice centromere.

    PubMed

    Ma, Jianxin; Wing, Rod A; Bennetzen, Jeffrey L; Jackson, Scott A

    2007-10-01

    Rice centromere 8 was previously proposed to be an "immature" centromere that recently arose from a genic region. Our comparative genomics analysis indicates that Cen8 was formed at its current location at least 7-9 million years ago and was physically shifted by a more recent inversion of a segment spanning centromeric and pericentromeric regions.

  1. Comprehensive analysis of the flowering genes in Chinese cabbage and examination of evolutionary pattern of CO-like genes in plant kingdom

    NASA Astrophysics Data System (ADS)

    Song, Xiaoming; Duan, Weike; Huang, Zhinan; Liu, Gaofeng; Wu, Peng; Liu, Tongkun; Li, Ying; Hou, Xilin

    2015-09-01

    In plants, flowering is the most important transition from vegetative to reproductive growth. The flowering patterns of monocots and eudicots are distinctly different, but few studies have described the evolutionary patterns of the flowering genes in them. In this study, we analysed the evolutionary pattern, duplication and expression level of these genes. The main results were as follows: (i) characterization of flowering genes in monocots and eudicots, including the identification of family-specific, orthologous and collinear genes; (ii) full characterization of CONSTANS-like genes in Brassica rapa (BraCOL genes), the key flowering genes; (iii) exploration of the evolution of COL genes in plant kingdom and construction of the evolutionary pattern of COL genes; (iv) comparative analysis of CO and FT genes between Brassicaceae and Grass, which identified several family-specific amino acids, and revealed that CO and FT protein structures were similar in B. rapa and Arabidopsis but different in rice; and (v) expression analysis of photoperiod pathway-related genes in B. rapa under different photoperiod treatments by RT-qPCR. This analysis will provide resources for understanding the flowering mechanisms and evolutionary pattern of COL genes. In addition, this genome-wide comparative study of COL genes may also provide clues for evolution of other flowering genes.

  2. Mitochondrial DNA plays an equal role in influencing female and male longevity in centenarians.

    PubMed

    He, Yong-Han; Lu, Xiang; Tian, Jiao-Yang; Yan, Dong-Jing; Li, Yu-Chun; Lin, Rong; Perry, Benjamin; Chen, Xiao-Qiong; Yu, Qin; Cai, Wang-Wei; Kong, Qing-Peng

    2016-10-01

    The mitochondrion is a double membrane-bound organelle which plays important functional roles in aging and many other complex phenotypes. Transmission of the mitochondrial genome in the matrilineal line causes the evolutionary selection sieve only in females. Theoretically, beneficial or neutral variations are more likely to accumulate and be retained in the female mitochondrial genome during evolution, which may be an initial trigger of gender dimorphism in aging. The asymmetry of evolutionary processes between gender could lead to males and females aging in different ways. If so, gender specific variation loads could be an evolutionary result of maternal heritage of mitochondrial genomes, especially in centenarians who live to an extreme age and are considered as good models for healthy aging. Here, we tested whether the mitochondrial variation loads were associated with altered aging patterns by investigating the mtDNA haplogroup distribution and genetic diversity between female and male centenarians. We found no evidence of differences in aging patterns between genders in centenarians. Our results indicate that the evolutionary consequence of gender dimorphism in mitochondrial genomes is not a factor in the altered aging patterns in human, and that mitochondrial DNA contributes equally to longevity in males and females. Copyright © 2016 Elsevier Inc. All rights reserved.

  3. Unlocking Triticeae genomics to sustainably feed the future

    PubMed Central

    Mochida, Keiichi; Shinozaki, Kazuo

    2013-01-01

    The tribe Triticeae includes the major crops wheat and barley. Within the last few years, the whole genomes of four Triticeae species—barley, wheat, Tausch’s goatgrass (Aegilops tauschii) and wild einkorn wheat (Triticum urartu)—have been sequenced. The availability of these genomic resources for Triticeae plants and innovative analytical applications using next-generation sequencing technologies are helping to revitalize our approaches in genetic work and to accelerate improvement of the Triticeae crops. Comparative genomics and integration of genomic resources from Triticeae plants and the model grass Brachypodium distachyon are aiding the discovery of new genes and functional analyses of genes in Triticeae crops. Innovative approaches and tools such as analysis of next-generation populations, evolutionary genomics and systems approaches with mathematical modeling are new strategies that will help us discover alleles for adaptive traits to future agronomic environments. In this review, we provide an update on genomic tools for use with Triticeae plants and Brachypodium and describe emerging approaches toward crop improvements in Triticeae. PMID:24204022

  4. The Ciona intestinalis genome: when the constraints are off

    NASA Technical Reports Server (NTRS)

    Holland, Linda Z.; Gibson-Brown, Jeremy J.

    2003-01-01

    The recent genome sequencing of a non-vertebrate deuterostome, the ascidian tunicate Ciona intestinalis, makes a substantial contribution to the fields of evolutionary and developmental biology.1 Tunicates have some of the smallest bilaterian genomes, embryos with relatively few cells, fixed lineages and early determination of cell fates. Initial analyses of the C. intestinalis genome indicate that it has been evolving rapidly. Comparisons with other bilaterians show that C. intestinalis has lost a number of genes, and that many genes linked together in most other bilaterians have become uncoupled. In addition, a number of independent, lineage-specific gene duplications have been detected. These new results, although interesting in themselves, will take on a deeper significance once the genomes of additional invertebrate deuterostomes (e.g. echinoderms, hemichordates and amphioxus) have been sequenced. With such a broadened database, comparative genomics can begin to ask pointed questions about the relationship between the evolution of genomes and the evolution of body plans. Copyright 2003 Wiley Periodicals, Inc.

  5. A draft genome assembly of the army worm, Spodoptera frugiperda.

    PubMed

    Kakumani, Pavan Kumar; Malhotra, Pawan; Mukherjee, Sunil K; Bhatnagar, Raj K

    2014-08-01

    Spodoptera is an agriculturally important pest insect and studies in understanding its biology have been limited by the unavailability of its genome. In the present study, the genomic DNA was sequenced and assembled into 37,243 scaffolds of size, 358 Mb with N50 of 53.7 kb. Based on degree of identity, we could anchor 305 Mb of the genome onto all the 28 chromosomes of Bombyx mori. Repeat elements were identified, which accounts for 20.28% of the total genome. Further, we predicted 11,595 genes, with an average intron length of 726 bp. The genes were annotated and domain analysis revealed that Sf genes share a significant homology and expression pattern with B. mori, despite differences in KOG gene categories and representation of certain protein families. The present study on Sf genome would help in the characterization of cellular pathways to understand its biology and comparative evolutionary studies among lepidopteran family members to help annotate their genomes. Copyright © 2014 Elsevier Inc. All rights reserved.

  6. Comparative genomics and evolution of eukaryotic phospholipidbiosynthesis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lykidis, Athanasios

    2006-12-01

    Phospholipid biosynthetic enzymes produce diverse molecular structures and are often present in multiple forms encoded by different genes. This work utilizes comparative genomics and phylogenetics for exploring the distribution, structure and evolution of phospholipid biosynthetic genes and pathways in 26 eukaryotic genomes. Although the basic structure of the pathways was formed early in eukaryotic evolution, the emerging picture indicates that individual enzyme families followed unique evolutionary courses. For example, choline and ethanolamine kinases and cytidylyltransferases emerged in ancestral eukaryotes, whereas, multiple forms of the corresponding phosphatidyltransferases evolved mainly in a lineage specific manner. Furthermore, several unicellular eukaryotes maintain bacterial-type enzymesmore » and reactions for the synthesis of phosphatidylglycerol and cardiolipin. Also, base-exchange phosphatidylserine synthases are widespread and ancestral enzymes. The multiplicity of phospholipid biosynthetic enzymes has been largely generated by gene expansion in a lineage specific manner. Thus, these observations suggest that phospholipid biosynthesis has been an actively evolving system. Finally, comparative genomic analysis indicates the existence of novel phosphatidyltransferases and provides a candidate for the uncharacterized eukaryotic phosphatidylglycerol phosphate phosphatase.« less

  7. A tutorial of diverse genome analysis tools found in the CoGe web-platform using Plasmodium spp. as a model

    PubMed Central

    Castillo, Andreina I; Nelson, Andrew D L; Haug-Baltzell, Asher K; Lyons, Eric

    2018-01-01

    Abstract Integrated platforms for storage, management, analysis and sharing of large quantities of omics data have become fundamental to comparative genomics. CoGe (https://genomevolution.org/coge/) is an online platform designed to manage and study genomic data, enabling both data- and hypothesis-driven comparative genomics. CoGe’s tools and resources can be used to organize and analyse both publicly available and private genomic data from any species. Here, we demonstrate the capabilities of CoGe through three example workflows using 17 Plasmodium genomes as a model. Plasmodium genomes present unique challenges for comparative genomics due to their rapidly evolving and highly variable genomic AT/GC content. These example workflows are intended to serve as templates to help guide researchers who would like to use CoGe to examine diverse aspects of genome evolution. In the first workflow, trends in genome composition and amino acid usage are explored. In the second, changes in genome structure and the distribution of synonymous (Ks) and non-synonymous (Kn) substitution values are evaluated across species with different levels of evolutionary relatedness. In the third workflow, microsyntenic analyses of multigene families’ genomic organization are conducted using two Plasmodium-specific gene families—serine repeat antigen, and cytoadherence-linked asexual gene—as models. In general, these example workflows show how to achieve quick, reproducible and shareable results using the CoGe platform. We were able to replicate previously published results, as well as leverage CoGe’s tools and resources to gain additional insight into various aspects of Plasmodium genome evolution. Our results highlight the usefulness of the CoGe platform, particularly in understanding complex features of genome evolution. Database URL: https://genomevolution.org/coge/

  8. Living Organisms Author Their Read-Write Genomes in Evolution

    PubMed Central

    2017-01-01

    Evolutionary variations generating phenotypic adaptations and novel taxa resulted from complex cellular activities altering genome content and expression: (i) Symbiogenetic cell mergers producing the mitochondrion-bearing ancestor of eukaryotes and chloroplast-bearing ancestors of photosynthetic eukaryotes; (ii) interspecific hybridizations and genome doublings generating new species and adaptive radiations of higher plants and animals; and, (iii) interspecific horizontal DNA transfer encoding virtually all of the cellular functions between organisms and their viruses in all domains of life. Consequently, assuming that evolutionary processes occur in isolated genomes of individual species has become an unrealistic abstraction. Adaptive variations also involved natural genetic engineering of mobile DNA elements to rewire regulatory networks. In the most highly evolved organisms, biological complexity scales with “non-coding” DNA content more closely than with protein-coding capacity. Coincidentally, we have learned how so-called “non-coding” RNAs that are rich in repetitive mobile DNA sequences are key regulators of complex phenotypes. Both biotic and abiotic ecological challenges serve as triggers for episodes of elevated genome change. The intersections of cell activities, biosphere interactions, horizontal DNA transfers, and non-random Read-Write genome modifications by natural genetic engineering provide a rich molecular and biological foundation for understanding how ecological disruptions can stimulate productive, often abrupt, evolutionary transformations. PMID:29211049

  9. Living Organisms Author Their Read-Write Genomes in Evolution.

    PubMed

    Shapiro, James A

    2017-12-06

    Evolutionary variations generating phenotypic adaptations and novel taxa resulted from complex cellular activities altering genome content and expression: (i) Symbiogenetic cell mergers producing the mitochondrion-bearing ancestor of eukaryotes and chloroplast-bearing ancestors of photosynthetic eukaryotes; (ii) interspecific hybridizations and genome doublings generating new species and adaptive radiations of higher plants and animals; and, (iii) interspecific horizontal DNA transfer encoding virtually all of the cellular functions between organisms and their viruses in all domains of life. Consequently, assuming that evolutionary processes occur in isolated genomes of individual species has become an unrealistic abstraction. Adaptive variations also involved natural genetic engineering of mobile DNA elements to rewire regulatory networks. In the most highly evolved organisms, biological complexity scales with "non-coding" DNA content more closely than with protein-coding capacity. Coincidentally, we have learned how so-called "non-coding" RNAs that are rich in repetitive mobile DNA sequences are key regulators of complex phenotypes. Both biotic and abiotic ecological challenges serve as triggers for episodes of elevated genome change. The intersections of cell activities, biosphere interactions, horizontal DNA transfers, and non-random Read-Write genome modifications by natural genetic engineering provide a rich molecular and biological foundation for understanding how ecological disruptions can stimulate productive, often abrupt, evolutionary transformations.

  10. High-throughput physical mapping of chromosomes using automated in situ hybridization.

    PubMed

    George, Phillip; Sharakhova, Maria V; Sharakhov, Igor V

    2012-06-28

    Projects to obtain whole-genome sequences for 10,000 vertebrate species and for 5,000 insect and related arthropod species are expected to take place over the next 5 years. For example, the sequencing of the genomes for 15 malaria mosquitospecies is currently being done using an Illumina platform. This Anopheles species cluster includes both vectors and non-vectors of malaria. When the genome assemblies become available, researchers will have the unique opportunity to perform comparative analysis for inferring evolutionary changes relevant to vector ability. However, it has proven difficult to use next-generation sequencing reads to generate high-quality de novo genome assemblies. Moreover, the existing genome assemblies for Anopheles gambiae, although obtained using the Sanger method, are gapped or fragmented. Success of comparative genomic analyses will be limited if researchers deal with numerous sequencing contigs, rather than with chromosome-based genome assemblies. Fragmented, unmapped sequences create problems for genomic analyses because: (i) unidentified gaps cause incorrect or incomplete annotation of genomic sequences; (ii) unmapped sequences lead to confusion between paralogous genes and genes from different haplotypes; and (iii) the lack of chromosome assignment and orientation of the sequencing contigs does not allow for reconstructing rearrangement phylogeny and studying chromosome evolution. Developing high-resolution physical maps for species with newly sequenced genomes is a timely and cost-effective investment that will facilitate genome annotation, evolutionary analysis, and re-sequencing of individual genomes from natural populations. Here, we present innovative approaches to chromosome preparation, fluorescent in situ hybridization (FISH), and imaging that facilitate rapid development of physical maps. Using An. gambiae as an example, we demonstrate that the development of physical chromosome maps can potentially improve genome assemblies and, thus, the quality of genomic analyses. First, we use a high-pressure method to prepare polytene chromosome spreads. This method, originally developed for Drosophila, allows the user to visualize more details on chromosomes than the regular squashing technique. Second, a fully automated, front-end system for FISH is used for high-throughput physical genome mapping. The automated slide staining system runs multiple assays simultaneously and dramatically reduces hands-on time. Third, an automatic fluorescent imaging system, which includes a motorized slide stage, automatically scans and photographs labeled chromosomes after FISH. This system is especially useful for identifying and visualizing multiple chromosomal plates on the same slide. In addition, the scanning process captures a more uniform FISH result. Overall, the automated high-throughput physical mapping protocol is more efficient than a standard manual protocol.

  11. Bipartite Network Analysis of the Archaeal Virosphere: Evolutionary Connections between Viruses and Capsidless Mobile Elements

    PubMed Central

    Prangishvili, David

    2016-01-01

    ABSTRACT Archaea and particularly hyperthermophilic crenarchaea are hosts to many unusual viruses with diverse virion shapes and distinct gene compositions. As is typical of viruses in general, there are no universal genes in the archaeal virosphere. Therefore, to obtain a comprehensive picture of the evolutionary relationships between viruses, network analysis methods are more productive than traditional phylogenetic approaches. Here we present a comprehensive comparative analysis of genomes and proteomes from all currently known taxonomically classified and unclassified, cultivated and uncultivated archaeal viruses. We constructed a bipartite network of archaeal viruses that includes two classes of nodes, the genomes and gene families that connect them. Dissection of this network using formal community detection methods reveals strong modularity, with 10 distinct modules and 3 putative supermodules. However, compared to similar previously analyzed networks of eukaryotic and bacterial viruses, the archaeal virus network is sparsely connected. With the exception of the tailed viruses related to bacteriophages of the order Caudovirales and the families Turriviridae and Sphaerolipoviridae that are linked to a distinct supermodule of eukaryotic and bacterial viruses, there are few connector genes shared by different archaeal virus modules. In contrast, most of these modules include, in addition to viruses, capsidless mobile elements, emphasizing tight evolutionary connections between the two types of entities in archaea. The relative contributions of distinct evolutionary origins, in particular from nonviral elements, and insufficient sampling to the sparsity of the archaeal virus network remain to be determined by further exploration of the archaeal virosphere. IMPORTANCE Viruses infecting archaea are among the most mysterious denizens of the virosphere. Many of these viruses display no genetic or even morphological relationship to viruses of bacteria and eukaryotes, raising questions regarding their origins and position in the global virosphere. Analysis of 5,740 protein sequences from 116 genomes allowed dissection of the archaeal virus network and showed that most groups of archaeal viruses are evolutionarily connected to capsidless mobile genetic elements, including various plasmids and transposons. This finding could reflect actual independent origins of the distinct groups of archaeal viruses from different nonviral elements, providing important insights into the emergence and evolution of the archaeal virome. PMID:27681128

  12. Tracing the peopling of the world through genomics

    PubMed Central

    Nielsen, Rasmus; Akey, Joshua M.; Jakobsson, Mattias; Pritchard, Jonathan K.; Tishkoff, Sarah; Willerslev, Eske

    2018-01-01

    Advances in the sequencing and the analysis of the genomes of both modern and ancient peoples have facilitated a number of breakthroughs in our understanding of human evolutionary history. These include the discovery of interbreeding between anatomically modern humans and extinct hominins; the development of an increasingly detailed description of the complex dispersal of modern humans out of Africa and their population expansion worldwide; and the characterization of many of the genetic adaptions of humans to local environmental conditions. Our interpretation of the evolutionary history and adaptation of humans is being transformed by analyses of these new genomic data. PMID:28102248

  13. In silico Comparison of 19 Porphyromonas gingivalis Strains in Genomics, Phylogenetics, Phylogenomics and Functional Genomics.

    PubMed

    Chen, Tsute; Siddiqui, Huma; Olsen, Ingar

    2017-01-01

    Currently, genome sequences of a total of 19 Porphyromonas gingivalis strains are available, including eight completed genomes (strains W83, ATCC 33277, TDC60, HG66, A7436, AJW4, 381, and A7A1-28) and 11 high-coverage draft sequences (JCVI SC001, F0185, F0566, F0568, F0569, F0570, SJD2, W4087, W50, Ando, and MP4-504) that are assembled into fewer than 300 contigs. The objective was to compare these genomes at both nucleotide and protein sequence levels in order to understand their phylogenetic and functional relatedness. Four copies of 16S rRNA gene sequences were identified in each of the eight complete genomes and one in the other 11 unfinished genomes. These 43 16S rRNA sequences represent only 24 unique sequences and the derived phylogenetic tree suggests a possible evolutionary history for these strains. Phylogenomic comparison based on shared proteins and whole genome nucleotide sequences consistently showed two groups with closely related members: one consisted of ATCC 33277, 381, and HG66, another of W83, W50, and A7436. At least 1,037 core/shared proteins were identified in the 19 P. gingivalis genomes based on the most stringent detecting parameters. Comparative functional genomics based on genome-wide comparisons between NCBI and RAST annotations, as well as additional approaches, revealed functions that are unique or missing in individual P. gingivalis strains, or species-specific in all P. gingivalis strains, when compared to a neighboring species P. asaccharolytica . All the comparative results of this study are available online for download at ftp://www.homd.org/publication_data/20160425/.

  14. In silico Comparison of 19 Porphyromonas gingivalis Strains in Genomics, Phylogenetics, Phylogenomics and Functional Genomics

    PubMed Central

    Chen, Tsute; Siddiqui, Huma; Olsen, Ingar

    2017-01-01

    Currently, genome sequences of a total of 19 Porphyromonas gingivalis strains are available, including eight completed genomes (strains W83, ATCC 33277, TDC60, HG66, A7436, AJW4, 381, and A7A1-28) and 11 high-coverage draft sequences (JCVI SC001, F0185, F0566, F0568, F0569, F0570, SJD2, W4087, W50, Ando, and MP4-504) that are assembled into fewer than 300 contigs. The objective was to compare these genomes at both nucleotide and protein sequence levels in order to understand their phylogenetic and functional relatedness. Four copies of 16S rRNA gene sequences were identified in each of the eight complete genomes and one in the other 11 unfinished genomes. These 43 16S rRNA sequences represent only 24 unique sequences and the derived phylogenetic tree suggests a possible evolutionary history for these strains. Phylogenomic comparison based on shared proteins and whole genome nucleotide sequences consistently showed two groups with closely related members: one consisted of ATCC 33277, 381, and HG66, another of W83, W50, and A7436. At least 1,037 core/shared proteins were identified in the 19 P. gingivalis genomes based on the most stringent detecting parameters. Comparative functional genomics based on genome-wide comparisons between NCBI and RAST annotations, as well as additional approaches, revealed functions that are unique or missing in individual P. gingivalis strains, or species-specific in all P. gingivalis strains, when compared to a neighboring species P. asaccharolytica. All the comparative results of this study are available online for download at ftp://www.homd.org/publication_data/20160425/. PMID:28261563

  15. Polyploidy and its effect on evolutionary success: old questions revisited with new tools

    PubMed Central

    Madlung, A

    2013-01-01

    Polyploidy, the condition of possessing more than two complete genomes in a cell, has intrigued biologists for almost a century. Polyploidy is found in many plants and some animal species and today we know that polyploidy has had a role in the evolution of all angiosperms. Despite its widespread occurrence, the direct effect of polyploidy on evolutionary success of a species is still largely unknown. Over the years many attractive hypotheses have been proposed in an attempt to assign functionality to the increased content of a duplicated genome. Among these hypotheses are the proposal that genome doubling confers distinct advantages to a polyploid and that these advantages allow polyploids to thrive in environments that pose challenges to the polyploid's diploid progenitors. This article revisits these long-standing questions and explores how the integration of recent genomic developments with ecological, physiological and evolutionary perspectives has contributed to addressing unresolved problems about the role of polyploidy. Although unsatisfactory, the current conclusion has to be that despite significant progress, there still isn't enough information to unequivocally answer many unresolved questions about cause and effect of polyploidy on evolutionary success of a species. There is, however, reason to believe that the increasingly integrative approaches discussed here should allow us in the future to make more direct connections between the effects of polyploidy on the genome and the responses this condition elicits from the organism living in its natural environment. PMID:23149459

  16. Genome of the Asian longhorned beetle (Anoplophora glabripennis), a globally significant invasive species, reveals key functional and evolutionary innovations at the beetle-plant interface.

    PubMed

    McKenna, Duane D; Scully, Erin D; Pauchet, Yannick; Hoover, Kelli; Kirsch, Roy; Geib, Scott M; Mitchell, Robert F; Waterhouse, Robert M; Ahn, Seung-Joon; Arsala, Deanna; Benoit, Joshua B; Blackmon, Heath; Bledsoe, Tiffany; Bowsher, Julia H; Busch, André; Calla, Bernarda; Chao, Hsu; Childers, Anna K; Childers, Christopher; Clarke, Dave J; Cohen, Lorna; Demuth, Jeffery P; Dinh, Huyen; Doddapaneni, HarshaVardhan; Dolan, Amanda; Duan, Jian J; Dugan, Shannon; Friedrich, Markus; Glastad, Karl M; Goodisman, Michael A D; Haddad, Stephanie; Han, Yi; Hughes, Daniel S T; Ioannidis, Panagiotis; Johnston, J Spencer; Jones, Jeffery W; Kuhn, Leslie A; Lance, David R; Lee, Chien-Yueh; Lee, Sandra L; Lin, Han; Lynch, Jeremy A; Moczek, Armin P; Murali, Shwetha C; Muzny, Donna M; Nelson, David R; Palli, Subba R; Panfilio, Kristen A; Pers, Dan; Poelchau, Monica F; Quan, Honghu; Qu, Jiaxin; Ray, Ann M; Rinehart, Joseph P; Robertson, Hugh M; Roehrdanz, Richard; Rosendale, Andrew J; Shin, Seunggwan; Silva, Christian; Torson, Alex S; Jentzsch, Iris M Vargas; Werren, John H; Worley, Kim C; Yocum, George; Zdobnov, Evgeny M; Gibbs, Richard A; Richards, Stephen

    2016-11-11

    Relatively little is known about the genomic basis and evolution of wood-feeding in beetles. We undertook genome sequencing and annotation, gene expression assays, studies of plant cell wall degrading enzymes, and other functional and comparative studies of the Asian longhorned beetle, Anoplophora glabripennis, a globally significant invasive species capable of inflicting severe feeding damage on many important tree species. Complementary studies of genes encoding enzymes involved in digestion of woody plant tissues or detoxification of plant allelochemicals were undertaken with the genomes of 14 additional insects, including the newly sequenced emerald ash borer and bull-headed dung beetle. The Asian longhorned beetle genome encodes a uniquely diverse arsenal of enzymes that can degrade the main polysaccharide networks in plant cell walls, detoxify plant allelochemicals, and otherwise facilitate feeding on woody plants. It has the metabolic plasticity needed to feed on diverse plant species, contributing to its highly invasive nature. Large expansions of chemosensory genes involved in the reception of pheromones and plant kairomones are consistent with the complexity of chemical cues it uses to find host plants and mates. Amplification and functional divergence of genes associated with specialized feeding on plants, including genes originally obtained via horizontal gene transfer from fungi and bacteria, contributed to the addition, expansion, and enhancement of the metabolic repertoire of the Asian longhorned beetle, certain other phytophagous beetles, and to a lesser degree, other phytophagous insects. Our results thus begin to establish a genomic basis for the evolutionary success of beetles on plants.

  17. Comparative genomic analysis of Genlisea (corkscrew plants—Lentibulariaceae) chloroplast genomes reveals an increasing loss of the ndh genes

    PubMed Central

    Silva, Saura R.; Michael, Todd P.; Meer, Elliott J.; Pinheiro, Daniel G.; Miranda, Vitor F. O.

    2018-01-01

    In the carnivorous plant family Lentibulariaceae, all three genome compartments (nuclear, chloroplast, and mitochondria) have some of the highest rates of nucleotide substitutions across angiosperms. While the genera Genlisea and Utricularia have the smallest known flowering plant nuclear genomes, the chloroplast genomes (cpDNA) are mostly structurally conserved except for deletion and/or pseudogenization of the NAD(P)H-dehydrogenase complex (ndh) genes known to be involved in stress conditions of low light or CO2 concentrations. In order to determine how the cpDNA are changing, and to better understand the evolutionary history within the Genlisea genus, we sequenced, assembled and analyzed complete cpDNA from six species (G. aurea, G. filiformis, G. pygmaea, G. repens, G. tuberosa and G. violacea) together with the publicly available G. margaretae cpDNA. In general, the cpDNA structure among the analyzed Genlisea species is highly similar. However, we found that the plastidial ndh genes underwent a progressive process of degradation similar to the other terrestrial Lentibulariaceae cpDNA analyzed to date, but in contrast to the aquatic species. Contrary to current thinking that the terrestrial environment is a more stressful environment and thus requiring the ndh genes, we provide evidence that in the Lentibulariaceae the terrestrial forms have progressive loss while the aquatic forms have the eleven plastidial ndh genes intact. Therefore, the Lentibulariaceae system provides an important opportunity to understand the evolutionary forces that govern the transition to an aquatic environment and may provide insight into how plants manage water stress at a genome scale. PMID:29293597

  18. Opposing Forces of A/T-Biased Mutations and G/C-Biased Gene Conversions Shape the Genome of the Nematode Pristionchus pacificus

    PubMed Central

    Weller, Andreas M.; Rödelsperger, Christian; Eberhardt, Gabi; Molnar, Ruxandra I.; Sommer, Ralf J.

    2014-01-01

    Base substitution mutations are a major source of genetic novelty and mutation accumulation line (MAL) studies revealed a nearly universal AT bias in de novo mutation spectra. While a comparison of de novo mutation spectra with the actual nucleotide composition in the genome suggests the existence of general counterbalancing mechanisms, little is known about the evolutionary and historical details of these opposing forces. Here, we correlate MAL-derived mutation spectra with patterns observed from population resequencing. Variation observed in natural populations has already been subject to evolutionary forces. Distinction between rare and common alleles, the latter of which are close to fixation and of presumably older age, can provide insight into mutational processes and their influence on genome evolution. We provide a genome-wide analysis of de novo mutations in 22 MALs of the nematode Pristionchus pacificus and compare the spectra with natural variants observed in resequencing of 104 natural isolates. MALs show an AT bias of 5.3, one of the highest values observed to date. In contrast, the AT bias in natural variants is much lower. Specifically, rare derived alleles show an AT bias of 2.4, whereas common derived alleles close to fixation show no AT bias at all. These results indicate the existence of a strong opposing force and they suggest that the GC content of the P. pacificus genome is in equilibrium. We discuss GC-biased gene conversion as a potential mechanism acting against AT-biased mutations. This study provides insight into genome evolution by combining MAL studies with natural variation. PMID:24414549

  19. Comparative genomic analysis of Genlisea (corkscrew plants-Lentibulariaceae) chloroplast genomes reveals an increasing loss of the ndh genes.

    PubMed

    Silva, Saura R; Michael, Todd P; Meer, Elliott J; Pinheiro, Daniel G; Varani, Alessandro M; Miranda, Vitor F O

    2018-01-01

    In the carnivorous plant family Lentibulariaceae, all three genome compartments (nuclear, chloroplast, and mitochondria) have some of the highest rates of nucleotide substitutions across angiosperms. While the genera Genlisea and Utricularia have the smallest known flowering plant nuclear genomes, the chloroplast genomes (cpDNA) are mostly structurally conserved except for deletion and/or pseudogenization of the NAD(P)H-dehydrogenase complex (ndh) genes known to be involved in stress conditions of low light or CO2 concentrations. In order to determine how the cpDNA are changing, and to better understand the evolutionary history within the Genlisea genus, we sequenced, assembled and analyzed complete cpDNA from six species (G. aurea, G. filiformis, G. pygmaea, G. repens, G. tuberosa and G. violacea) together with the publicly available G. margaretae cpDNA. In general, the cpDNA structure among the analyzed Genlisea species is highly similar. However, we found that the plastidial ndh genes underwent a progressive process of degradation similar to the other terrestrial Lentibulariaceae cpDNA analyzed to date, but in contrast to the aquatic species. Contrary to current thinking that the terrestrial environment is a more stressful environment and thus requiring the ndh genes, we provide evidence that in the Lentibulariaceae the terrestrial forms have progressive loss while the aquatic forms have the eleven plastidial ndh genes intact. Therefore, the Lentibulariaceae system provides an important opportunity to understand the evolutionary forces that govern the transition to an aquatic environment and may provide insight into how plants manage water stress at a genome scale.

  20. MANTIS: a phylogenetic framework for multi-species genome comparisons.

    PubMed

    Tzika, Athanasia C; Helaers, Raphaël; Van de Peer, Yves; Milinkovitch, Michel C

    2008-01-15

    Practitioners of comparative genomics face huge analytical challenges as whole genome sequences and functional/expression data accumulate. Furthermore, the field would greatly benefit from a better integration of this wealth of data with evolutionary concepts. Here, we present MANTIS, a relational database for the analysis of (i) gains and losses of genes on specific branches of the metazoan phylogeny, (ii) reconstructed genome content of ancestral species and (iii) over- or under-representation of functions/processes and tissue specificity of gained, duplicated and lost genes. MANTIS estimates the most likely positions of gene losses on the true phylogeny using a maximum-likelihood function. A user-friendly interface and an extensive query system allow to investigate questions pertaining to gene identity, phylogenetic mapping and function/expression parameters. MANTIS is freely available at http://www.mantisdb.org and constitutes the missing link between multi-species genome comparisons and functional analyses.

Top