Phylogenomic analyses data of the avian phylogenomics project.
Jarvis, Erich D; Mirarab, Siavash; Aberer, Andre J; Li, Bo; Houde, Peter; Li, Cai; Ho, Simon Y W; Faircloth, Brant C; Nabholz, Benoit; Howard, Jason T; Suh, Alexander; Weber, Claudia C; da Fonseca, Rute R; Alfaro-Núñez, Alonzo; Narula, Nitish; Liu, Liang; Burt, Dave; Ellegren, Hans; Edwards, Scott V; Stamatakis, Alexandros; Mindell, David P; Cracraft, Joel; Braun, Edward L; Warnow, Tandy; Jun, Wang; Gilbert, M Thomas Pius; Zhang, Guojie
2015-01-01
Determining the evolutionary relationships among the major lineages of extant birds has been one of the biggest challenges in systematic biology. To address this challenge, we assembled or collected the genomes of 48 avian species spanning most orders of birds, including all Neognathae and two of the five Palaeognathae orders. We used these genomes to construct a genome-scale avian phylogenetic tree and perform comparative genomic analyses. Here we present the datasets associated with the phylogenomic analyses, which include sequence alignment files consisting of nucleotides, amino acids, indels, and transposable elements, as well as tree files containing gene trees and species trees. Inferring an accurate phylogeny required generating: 1) A well annotated data set across species based on genome synteny; 2) Alignments with unaligned or incorrectly overaligned sequences filtered out; and 3) Diverse data sets, including genes and their inferred trees, indels, and transposable elements. Our total evidence nucleotide tree (TENT) data set (consisting of exons, introns, and UCEs) gave what we consider our most reliable species tree when using the concatenation-based ExaML algorithm or when using statistical binning with the coalescence-based MP-EST algorithm (which we refer to as MP-EST*). Other data sets, such as the coding sequence of some exons, revealed other properties of genome evolution, namely convergence. The Avian Phylogenomics Project is the largest vertebrate phylogenomics project to date that we are aware of. The sequence, alignment, and tree data are expected to accelerate analyses in phylogenomics and other related areas.
OrthoSelect: a protocol for selecting orthologous groups in phylogenomics.
Schreiber, Fabian; Pick, Kerstin; Erpenbeck, Dirk; Wörheide, Gert; Morgenstern, Burkhard
2009-07-16
Phylogenetic studies using expressed sequence tags (EST) are becoming a standard approach to answer evolutionary questions. Such studies are usually based on large sets of newly generated, unannotated, and error-prone EST sequences from different species. A first crucial step in EST-based phylogeny reconstruction is to identify groups of orthologous sequences. From these data sets, appropriate target genes are selected, and redundant sequences are eliminated to obtain suitable sequence sets as input data for tree-reconstruction software. Generating such data sets manually can be very time consuming. Thus, software tools are needed that carry out these steps automatically. We developed a flexible and user-friendly software pipeline, running on desktop machines or computer clusters, that constructs data sets for phylogenomic analyses. It automatically searches assembled EST sequences against databases of orthologous groups (OG), assigns ESTs to these predefined OGs, translates the sequences into proteins, eliminates redundant sequences assigned to the same OG, creates multiple sequence alignments of identified orthologous sequences and offers the possibility to further process this alignment in a last step by excluding potentially homoplastic sites and selecting sufficiently conserved parts. Our software pipeline can be used as it is, but it can also be adapted by integrating additional external programs. This makes the pipeline useful for non-bioinformaticians as well as to bioinformatic experts. The software pipeline is especially designed for ESTs, but it can also handle protein sequences. OrthoSelect is a tool that produces orthologous gene alignments from assembled ESTs. Our tests show that OrthoSelect detects orthologs in EST libraries with high accuracy. In the absence of a gold standard for orthology prediction, we compared predictions by OrthoSelect to a manually created and published phylogenomic data set. 
Our tool was not only able to rebuild the data set with a specificity of 98%, but it also detected four percent more orthologous sequences. Furthermore, the results OrthoSelect produces are in absolute agreement with the results of other programs, but our tool offers a significant speedup and additional functionality, e.g. handling of ESTs, computing sequence alignments, and refining them. To our knowledge, there is currently no fully automated and freely available tool for this purpose. Thus, OrthoSelect is a valuable tool for researchers in the field of phylogenomics who deal with large quantities of EST sequences. OrthoSelect is written in Perl and runs on Linux/Mac OS X. The tool can be downloaded at (http://gobics.de/fabian/orthoselect.php).
A Consistent Phylogenetic Backbone for the Fungi
Ebersberger, Ingo; de Matos Simoes, Ricardo; Kupczok, Anne; Gube, Matthias; Kothe, Erika; Voigt, Kerstin; von Haeseler, Arndt
2012-01-01
The kingdom of fungi provides model organisms for biotechnology, cell biology, genetics, and life sciences in general. Only when their phylogenetic relationships are stably resolved, can individual results from fungal research be integrated into a holistic picture of biology. However, and despite recent progress, many deep relationships within the fungi remain unclear. Here, we present the first phylogenomic study of an entire eukaryotic kingdom that uses a consistency criterion to strengthen phylogenetic conclusions. We reason that branches (splits) recovered with independent data and different tree reconstruction methods are likely to reflect true evolutionary relationships. Two complementary phylogenomic data sets based on 99 fungal genomes and 109 fungal expressed sequence tag (EST) sets analyzed with four different tree reconstruction methods shed light from different angles on the fungal tree of life. Eleven additional data sets address specifically the phylogenetic position of Blastocladiomycota, Ustilaginomycotina, and Dothideomycetes, respectively. The combined evidence from the resulting trees supports the deep-level stability of the fungal groups toward a comprehensive natural system of the fungi. In addition, our analysis reveals methodologically interesting aspects. Enrichment for EST encoded data—a common practice in phylogenomic analyses—introduces a strong bias toward slowly evolving and functionally correlated genes. Consequently, the generalization of phylogenomic data sets as collections of randomly selected genes cannot be taken for granted. A thorough characterization of the data to assess possible influences on the tree reconstruction should therefore become a standard in phylogenomic analyses. PMID:22114356
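The consistency criterion described above (keeping only branches, i.e. splits, that are recovered across independent data sets and methods) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the nested-tuple tree representation, the function names `splits` and `consistent_splits`, and the five-taxon toy trees are all assumptions made for illustration.

```python
def splits(tree):
    """Return the non-trivial splits of a tree given as nested tuples of taxon names.

    Each split is stored as the frozenset of taxa on the side that does NOT
    contain a fixed reference taxon, so the same branch always gets the same
    representation regardless of how the tree is rooted.
    """
    found = set()

    def collect(node):
        # Gather the taxa below a node (a leaf is a plain string).
        if isinstance(node, str):
            return frozenset([node])
        clade = frozenset()
        for child in node:
            clade |= collect(child)
        return clade

    all_taxa = collect(tree)
    reference = min(all_taxa)  # arbitrary but deterministic reference taxon

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        clade = frozenset()
        for child in node:
            clade |= visit(child)
        side = all_taxa - clade if reference in clade else clade
        # Skip trivial splits (single taxon versus the rest) and the root.
        if 1 < len(side) < len(all_taxa) - 1:
            found.add(side)
        return clade

    visit(tree)
    return found


def consistent_splits(trees):
    """Return the splits that are present in every tree of the collection."""
    if not trees:
        raise ValueError("at least one tree is required")
    return set.intersection(*(splits(t) for t in trees))


# Hypothetical toy trees standing in for results from different data sets/methods.
tree_a = ((("A", "B"), "C"), ("D", "E"))
tree_b = (("A", "B"), ("C", ("D", "E")))
tree_c = ((("A", "C"), "B"), ("D", "E"))

shared = consistent_splits([tree_a, tree_b, tree_c])
```

In this toy case only the branch separating D and E from the remaining taxa appears in all three trees, so it is the only split that would count as consistently recovered.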
Steinke, Dirk; Salzburger, Walter; Meyer, Axel
2006-06-01
The power of comparative phylogenomic analyses also depends on the amount of data that are included in such studies. We used expressed sequence tags (ESTs) from fish model species as a proof-of-principle approach in order to test the reliability of using ESTs for phylogenetic inference. As expected, the robustness increases with the amount of sequences. Although some progress has been made in the elucidation of the phylogeny of teleosts, relationships among the main lineages of the derived fish (Euteleostei) remain poorly defined and are still debated. We performed a phylogenomic analysis of a set of 42 orthologous genes from 10 available fish model systems from seven different orders (Salmoniformes, Siluriformes, Cypriniformes, Tetraodontiformes, Cyprinodontiformes, Beloniformes, and Perciformes) of euteleostean fish to estimate divergence times and evolutionary relationships among those lineages. All 10 fish species serve as models for developmental, aquaculture, genomic, and comparative genetic studies. The phylogenetic signal and the strength of the contribution of each of the 42 orthologous genes were estimated with randomly chosen data subsets. Our study revealed a molecular phylogeny of higher-level relationships of derived teleosts, which indicates that the use of multiple genes produces robust phylogenies, a finding that is expected to apply to other phylogenetic issues among distantly related taxa. Our phylogenomic analyses confirm that the euteleostean superorders Ostariophysi and Acanthopterygii are monophyletic and the Protacanthopterygii and Ostariophysi are sister clades. In addition, and contrary to the traditional phylogenetic hypothesis, our analyses determine that killifish (Cyprinodontiformes), medaka (Beloniformes), and cichlids (Perciformes) appear to be more closely related to each other than either of them is to pufferfish (Tetraodontiformes). 
All 10 lineages split before or during the fragmentation of the supercontinent Pangea in the Jurassic.
Evolution of Rhizaria: new insights from phylogenomic analysis of uncultivated protists.
Burki, Fabien; Kudryavtsev, Alexander; Matz, Mikhail V; Aglyamova, Galina V; Bulman, Simon; Fiers, Mark; Keeling, Patrick J; Pawlowski, Jan
2010-12-02
Recent phylogenomic analyses have revolutionized our view of eukaryote evolution by revealing unexpected relationships between and within the eukaryotic supergroups. However, for several groups of uncultivable protists, only the ribosomal RNA genes and a handful of proteins are available, often leading to unresolved evolutionary relationships. A striking example concerns the supergroup Rhizaria, which comprises several groups of uncultivable free-living protists such as radiolarians, foraminiferans and gromiids, as well as the parasitic plasmodiophorids and haplosporids. Thus far, the relationships within this supergroup have been inferred almost exclusively from rRNA, actin, and polyubiquitin genes, and remain poorly resolved. To address this, we have generated large Expressed Sequence Tag (EST) datasets for 5 species of Rhizaria belonging to 3 important groups: Acantharea (Astrolonche sp., Phyllostaurus sp.), Phytomyxea (Spongospora subterranea, Plasmodiophora brassicae) and Gromiida (Gromia sphaerica). 167 genes were selected for phylogenetic analyses based on the representation of at least one rhizarian species for each gene. Concatenation of these genes produced a supermatrix composed of 36,735 amino acid positions, including 10 rhizarians, 9 stramenopiles, and 9 alveolates. Phylogenomic analyses of this large dataset revealed a strongly supported clade grouping Foraminifera and Acantharea. The position of this clade within Rhizaria was sensitive to the method employed and the taxon sampling: Maximum Likelihood (ML) and Bayesian analyses using an empirical model of evolution favoured an early divergence, whereas the CAT model and ML analyses with fast-evolving sites or the foraminiferan species Reticulomyxa filosa removed suggested a derived position, closely related to Gromia and Phytomyxea. In contrast to what has been previously reported, our analyses also uncovered the presence of the rhizarian-specific polyubiquitin insertion in Acantharea. 
Finally, this work reveals another possible rhizarian signature in the 60S ribosomal protein L10a. Our study provides new insights into the evolution of Rhizaria based on phylogenomic analyses of ESTs from three groups of previously under-sampled protists. It was enabled through the application of a recently developed method of transcriptome analysis, requiring a very small amount of starting material. Our study illustrates the potential of this method to elucidate the early evolution of eukaryotes by providing large amounts of data for uncultivable free-living and parasitic protists.
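The concatenation step mentioned in this abstract (joining per-gene alignments into one supermatrix) can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the function name `concatenate`, the use of "?" to pad taxa absent from a gene, and the toy sequences are assumptions.

```python
def concatenate(gene_alignments, missing="?"):
    """Concatenate per-gene alignments (dicts of taxon -> aligned sequence).

    Taxa that lack a gene are padded with the `missing` symbol so that every
    row of the resulting supermatrix has the same length.
    """
    taxa = sorted({taxon for aln in gene_alignments for taxon in aln})
    rows = {taxon: [] for taxon in taxa}
    for aln in gene_alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within one gene alignment must have equal length")
        (length,) = lengths
        for taxon in taxa:
            rows[taxon].append(aln.get(taxon, missing * length))
    return {taxon: "".join(parts) for taxon, parts in rows.items()}


# Hypothetical toy alignments; real inputs would be full amino acid alignments.
genes = [
    {"Gromia": "MKV", "Reticulomyxa": "MRV"},
    {"Gromia": "LL", "Plasmodiophora": "LI"},
]
supermatrix = concatenate(genes)
```

Padding with a missing-data symbol keeps column positions aligned across genes, which is why a supermatrix can include taxa that are represented in only some of the selected genes.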
Pyron, R Alexander; Hendry, Catriona R; Chou, Vincent M; Lemmon, Emily M; Lemmon, Alan R; Burbrink, Frank T
2014-12-01
Next-generation genomic sequencing promises to quickly and cheaply resolve remaining contentious nodes in the Tree of Life, and facilitates species-tree estimation while taking into account stochastic genealogical discordance among loci. Recent methods for estimating species trees bypass full likelihood-based estimates of the multi-species coalescent, and approximate the true species-tree using simpler summary metrics. These methods converge on the true species-tree with sufficient genomic sampling, even in the anomaly zone. However, no studies have yet evaluated their efficacy on a large-scale phylogenomic dataset, and compared them to previous concatenation strategies. Here, we generate such a dataset for Caenophidian snakes, a group with >2500 species that contains several rapid radiations that were poorly resolved with fewer loci. We generate sequence data for 333 single-copy nuclear loci with ∼100% coverage (∼0% missing data) for 31 major lineages. We estimate phylogenies using neighbor joining, maximum parsimony, maximum likelihood, and three summary species-tree approaches (NJst, STAR, and MP-EST). All methods yield similar resolution and support for most nodes. However, not all methods support monophyly of Caenophidia, with Acrochordidae placed as the sister taxon to Pythonidae in some analyses. Thus, phylogenomic species-tree estimation may occasionally disagree with well-supported relationships from concatenated analyses of small numbers of nuclear or mitochondrial genes, a consideration for future studies. In contrast, for at least two diverse, rapid radiations (Lamprophiidae and Colubridae), phylogenomic data and species-tree inference do little to improve resolution and support. Thus, certain nodes may lack strong signal, and larger datasets and more sophisticated analyses may still fail to resolve them. Copyright © 2014 Elsevier Inc. All rights reserved.
Agent of Whirling Disease Meets Orphan Worm: Phylogenomic Analyses Firmly Place Myxozoa in Cnidaria
Nesnidal, Maximilian P.; Helmkampf, Martin; Bruchhaus, Iris; El-Matbouli, Mansour; Hausdorf, Bernhard
2013-01-01
Myxozoa are microscopic obligate endoparasites with complex life cycles. Representatives are Myxobolus cerebralis, the causative agent of whirling disease in salmonids, and the enigmatic “orphan worm” Buddenbrockia plumatellae parasitizing in Bryozoa. Originally, Myxozoa were classified as protists, but later several metazoan characteristics were reported. However, their phylogenetic relationships remained doubtful. Some molecular phylogenetic analyses placed them as sister group to or even within Bilateria, whereas the possession of polar capsules that are similar to nematocysts of Cnidaria and of minicollagen genes suggests a close relationship between Myxozoa and Cnidaria. EST data of Buddenbrockia also indicated a cnidarian origin of Myxozoa, but were not sufficient to reject a closer relationship to bilaterians. Phylogenomic analyses of new genomic sequences of Myxobolus cerebralis firmly place Myxozoa as sister group to Medusozoa within Cnidaria. Based on the new dataset, the alternative hypothesis that Myxozoa form a clade with Bilateria can be rejected using topology tests. Sensitivity analyses indicate that this result is not affected by long branch attraction artifacts or compositional bias. PMID:23382916
Phylogenomics provides strong evidence for relationships of butterflies and moths
Kawahara, Akito Y.; Breinholt, Jesse W.
2014-01-01
Butterflies and moths constitute some of the most popular and charismatic insects. Lepidoptera include approximately 160 000 described species, many of which are important model organisms. Previous studies on the evolution of Lepidoptera did not confidently place butterflies, and many relationships among superfamilies in the megadiverse clade Ditrysia remain largely uncertain. We generated a molecular dataset with 46 taxa, combining 33 new transcriptomes with 13 available genomes, transcriptomes and expressed sequence tags (ESTs). Using HaMStR with a Lepidoptera-specific core-orthologue set of single copy loci, we identified 2696 genes for inclusion into the phylogenomic analysis. Nucleotides and amino acids of the all-gene, all-taxon dataset yielded nearly identical, well-supported trees. Monophyly of butterflies (Papilionoidea) was strongly supported, and the group included skippers (Hesperiidae) and the enigmatic butterfly–moths (Hedylidae). Butterflies were placed sister to the remaining obtectomeran Lepidoptera, and the latter was grouped with greater than or equal to 87% bootstrap support. Establishing confident relationships among the four most diverse macroheteroceran superfamilies was previously challenging, but we recovered 100% bootstrap support for the following relationships: ((Geometroidea, Noctuoidea), (Bombycoidea, Lasiocampoidea)). We present the first robust, transcriptome-based tree of Lepidoptera that strongly contradicts historical placement of butterflies, and provide an evolutionary framework for genomic, developmental and ecological studies on this diverse insect order. PMID:24966318
Yu, Xiaoyu; Reva, Oleg N
2018-01-01
Modern phylogenetic studies may benefit from the analysis of complete genome sequences of various microorganisms. Evolutionary inferences based on genome-scale analysis are believed to be more accurate than the gene-based alternative. However, the computational complexity of current phylogenomic procedures, the unsuitability of standard phylogenetic tools for processing genome-wide data, and the lack of reliable substitution models suited to alignment-free phylogenomic approaches deter microbiologists from using these opportunities. For example, the super-matrix and super-tree approaches of phylogenomics use multiple integrated genomic loci or individual gene-based trees to infer an overall consensus tree. However, these approaches potentially multiply errors of gene annotation and sequence alignment, not to mention the computational complexity and laboriousness of the methods. In this article, we demonstrate that the annotation- and alignment-free comparison of genome-wide tetranucleotide frequencies, termed oligonucleotide usage patterns (OUPs), allowed a fast and reliable inference of phylogenetic trees. These were congruent to the corresponding whole genome super-matrix trees in terms of tree topology when compared with other known approaches including 16S ribosomal RNA and GyrA protein sequence comparison, complete genome-based MAUVE, and CVTree methods. A Web-based program to perform the alignment-free OUP-based phylogenomic inferences was implemented at http://swphylo.bi.up.ac.za/. Applicability of the tool was tested on different taxa from subspecies to intergeneric levels. Distinguishing between closely related taxonomic units may be strengthened by providing the program with alignments of marker protein sequences, e.g., GyrA. PMID:29511354
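The idea of comparing genomes by their tetranucleotide frequencies, without annotation or alignment, can be sketched in a few lines of Python. This is a simplified stand-in, not the OUP method of the article or the code behind the web tool: plain relative frequencies and a Euclidean distance are assumptions, as are the function names `tetra_freqs` and `tetra_distance`.

```python
import math
from collections import Counter
from itertools import product

# All 256 possible tetranucleotides in a fixed order.
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetra_freqs(sequence):
    """Relative frequencies of overlapping tetranucleotides in a DNA sequence.

    Windows containing characters other than A, C, G, T are ignored.
    """
    sequence = sequence.upper()
    counts = Counter(sequence[i:i + 4] for i in range(len(sequence) - 3))
    total = sum(counts[t] for t in TETRAMERS)
    if total == 0:
        raise ValueError("sequence contains no valid tetranucleotide")
    return [counts[t] / total for t in TETRAMERS]


def tetra_distance(seq_a, seq_b):
    """Euclidean distance between two tetranucleotide frequency profiles."""
    freq_a, freq_b = tetra_freqs(seq_a), tetra_freqs(seq_b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(freq_a, freq_b)))
```

A matrix of such pairwise distances could then be passed to a distance-based tree method such as neighbor joining; that step is omitted here.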
Phylogenomics provides strong evidence for relationships of butterflies and moths.
Kawahara, Akito Y; Breinholt, Jesse W
2014-08-07
Butterflies and moths constitute some of the most popular and charismatic insects. Lepidoptera include approximately 160 000 described species, many of which are important model organisms. Previous studies on the evolution of Lepidoptera did not confidently place butterflies, and many relationships among superfamilies in the megadiverse clade Ditrysia remain largely uncertain. We generated a molecular dataset with 46 taxa, combining 33 new transcriptomes with 13 available genomes, transcriptomes and expressed sequence tags (ESTs). Using HaMStR with a Lepidoptera-specific core-orthologue set of single copy loci, we identified 2696 genes for inclusion into the phylogenomic analysis. Nucleotides and amino acids of the all-gene, all-taxon dataset yielded nearly identical, well-supported trees. Monophyly of butterflies (Papilionoidea) was strongly supported, and the group included skippers (Hesperiidae) and the enigmatic butterfly-moths (Hedylidae). Butterflies were placed sister to the remaining obtectomeran Lepidoptera, and the latter was grouped with greater than or equal to 87% bootstrap support. Establishing confident relationships among the four most diverse macroheteroceran superfamilies was previously challenging, but we recovered 100% bootstrap support for the following relationships: ((Geometroidea, Noctuoidea), (Bombycoidea, Lasiocampoidea)). We present the first robust, transcriptome-based tree of Lepidoptera that strongly contradicts historical placement of butterflies, and provide an evolutionary framework for genomic, developmental and ecological studies on this diverse insect order. © 2014 The Author(s) Published by the Royal Society. All rights reserved.
Alignment-free inference of hierarchical and reticulate phylogenomic relationships.
Bernard, Guillaume; Chan, Cheong Xin; Chan, Yao-Ban; Chua, Xin-Yi; Cong, Yingnan; Hogan, James M; Maetschke, Stefan R; Ragan, Mark A
2017-06-30
We are amidst an ongoing flood of sequence data arising from the application of high-throughput technologies, and a concomitant fundamental revision in our understanding of how genomes evolve individually and within the biosphere. Workflows for phylogenomic inference must accommodate data that are not only much larger than before, but often more error prone and perhaps misassembled, or not assembled in the first place. Moreover, genomes of microbes, viruses and plasmids evolve not only by tree-like descent with modification but also by incorporating stretches of exogenous DNA. Thus, next-generation phylogenomics must address computational scalability while rethinking the nature of orthogroups, the alignment of multiple sequences and the inference and comparison of trees. New phylogenomic workflows have begun to take shape based on so-called alignment-free (AF) approaches. Here, we review the conceptual foundations of AF phylogenetics for the hierarchical (vertical) and reticulate (lateral) components of genome evolution, focusing on methods based on k-mers. We reflect on what seems to be successful, and on where further development is needed. © The Author 2017. Published by Oxford University Press.
The Impact of Missing Data on Species Tree Estimation.
Xi, Zhenxiang; Liu, Liang; Davis, Charles C
2016-03-01
Phylogeneticists are increasingly assembling genome-scale data sets that include hundreds of genes to resolve their focal clades. Although these data sets commonly include a moderate to high amount of missing data, there remains no consensus on their impact on species tree estimation. Here, using several simulated and empirical data sets, we assess the effects of missing data on species tree estimation under varying degrees of incomplete lineage sorting (ILS) and gene rate heterogeneity. We demonstrate that concatenation (RAxML), gene-tree-based coalescent (ASTRAL, MP-EST, and STAR), and supertree (matrix representation with parsimony [MRP]) methods perform reliably, so long as missing data are randomly distributed (by gene and/or by species) and that a sufficiently large number of genes are sampled. When data sets are indecisive sensu Sanderson et al. (2010. Phylogenomics with incomplete taxon coverage: the limits to inference. BMC Evol Biol. 10:155) and/or ILS is high, however, high amounts of missing data that are randomly distributed require exhaustive levels of gene sampling, likely exceeding most empirical studies to date. Moreover, missing data become especially problematic when they are nonrandomly distributed. We demonstrate that STAR produces inconsistent results when the amount of nonrandom missing data is high, regardless of the degree of ILS and gene rate heterogeneity. Similarly, concatenation methods using maximum likelihood can be misled by nonrandom missing data in the presence of gene rate heterogeneity, which becomes further exacerbated when combined with high ILS. In contrast, ASTRAL, MP-EST, and MRP are more robust under all of these scenarios. These results underscore the importance of understanding the influence of missing data in the phylogenomics era. © The Author 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved.
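Because the abstract distinguishes missing data distributed by gene from missing data distributed by species, a quick way to inspect a data set is to tabulate both proportions. The sketch below is only an illustration under assumed inputs: the function name `missing_proportions`, the gene-to-taxa occupancy dictionary, and the toy values are hypothetical and not taken from the study.

```python
def missing_proportions(gene_taxa, all_taxa):
    """Summarize missing data from a gene -> set-of-sampled-taxa mapping.

    Returns two dicts: the proportion of taxa missing from each gene, and
    the proportion of genes missing for each taxon.
    """
    n_taxa = len(all_taxa)
    n_genes = len(gene_taxa)
    if n_taxa == 0 or n_genes == 0:
        raise ValueError("need at least one taxon and one gene")
    by_gene = {
        gene: 1 - len(present & all_taxa) / n_taxa
        for gene, present in gene_taxa.items()
    }
    by_taxon = {
        taxon: sum(taxon not in present for present in gene_taxa.values()) / n_genes
        for taxon in all_taxa
    }
    return by_gene, by_taxon


# Hypothetical occupancy table: which taxa were sequenced for which gene.
taxa = {"A", "B", "C", "D"}
occupancy = {
    "g1": {"A", "B", "C", "D"},
    "g2": {"A", "B", "C"},
    "g3": {"A", "B"},
    "g4": {"A", "B", "C", "D"},
}
by_gene, by_taxon = missing_proportions(occupancy, taxa)
```

In the toy table the gaps are concentrated in taxa C and D rather than spread evenly, which is the kind of nonrandom pattern the abstract flags as problematic.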
Chen, Meng-Yun; Liang, Dan; Zhang, Peng
2015-11-01
Incongruence between different phylogenomic analyses is the main challenge faced by phylogeneticists in the genomic era. To reduce incongruence, phylogenomic studies normally adopt some data filtering approaches, such as reducing missing data or using slowly evolving genes, to improve the signal quality of data. Here, we assembled a phylogenomic data set of 58 jawed vertebrate taxa and 4682 genes to investigate the backbone phylogeny of jawed vertebrates under both concatenation and coalescent-based frameworks. To evaluate the efficiency of extracting phylogenetic signals among different data filtering methods, we chose six highly intractable internodes within the backbone phylogeny of jawed vertebrates as our test questions. We found that our phylogenomic data set exhibits substantial conflicting signal among genes for these questions. Our analyses showed that non-specific data sets that are generated without bias toward specific questions are not sufficient to produce consistent results when there are several difficult nodes within a phylogeny. Moreover, phylogenetic accuracy based on non-specific data is considerably influenced by the size of data and the choice of tree inference methods. To address such incongruences, we selected genes that resolve a given internode but not the entire phylogeny. Notably, not only can this strategy yield correct relationships for the question, but it also reduces inconsistency associated with data sizes and inference methods. Our study highlights the importance of gene selection in phylogenomic analyses, suggesting that simply using a large amount of data cannot guarantee correct results. Constructing question-specific data sets may be more powerful for resolving problematic nodes. © The Author(s) 2015. Published by Oxford University Press, on behalf of the Society of Systematic Biologists. All rights reserved.
Vatanparast, Mohammad; Powell, Adrian; Doyle, Jeff J; Egan, Ashley N
2018-03-01
The development of pipelines for locus discovery has spurred the use of target enrichment for plant phylogenomics. However, few studies have compared pipelines from locus discovery and bait design, through validation, to tree inference. We compared three methods within Leguminosae (Fabaceae) and present a workflow for future efforts. Using 30 transcriptomes, we compared Hyb-Seq, MarkerMiner, and the Yang and Smith (Y&S) pipelines for locus discovery, validated 7501 baits targeting 507 loci across 25 genera via Illumina sequencing, and inferred gene and species trees via concatenation- and coalescent-based methods. Hyb-Seq discovered loci with the longest mean length. MarkerMiner discovered the most conserved loci with the least flagged as paralogous. Y&S offered the most parsimony-informative sites and putative orthologs. Target recovery averaged 93% across taxa. We optimized our targeted locus set based on a workflow designed to minimize paralog/ortholog conflation and thus present 423 loci for legume phylogenomics. Methods differed across criteria important for phylogenetic marker development. We recommend Hyb-Seq as a method that may be useful for most phylogenomic projects. Our targeted locus set is a resource for future, community-driven efforts to reconstruct the legume tree of life.
Contentious relationships in phylogenomic studies can be driven by a handful of genes
Shen, Xing-Xing; Hittinger, Chris Todd; Rokas, Antonis
2017-01-01
Phylogenomic studies have resolved countless branches of the tree of life (ToL), but remain strongly contradictory on certain, contentious relationships. Here, we employ a maximum likelihood framework to quantify the distribution of phylogenetic signal among genes and sites for 17 contentious branches and 6 well-established control branches in plant, animal, and fungal phylogenomic data matrices. We find that resolution in some of these 17 branches rests on a single gene or a few sites, and that removal of a single gene in concatenation analyses or a single site from every gene in coalescence-based analyses diminishes support and can alter the inferred topology. These results suggest that tiny subsets of very large data matrices drive the resolution of specific internodes, providing a dissection of the distribution of support and observed incongruence in phylogenomic analyses. We submit that quantifying the distribution of phylogenetic signal in phylogenomic data is essential for evaluating whether branches, especially contentious ones, are truly resolved. Finally, we offer one detailed example of such an evaluation for the controversy regarding the earliest-branching metazoan phylum, where examination of the distributions of gene-wise and site-wise phylogenetic signal across 8 data matrices consistently supports ctenophores as sister group to all other metazoans. PMID:28812701
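Quantifying gene-wise phylogenetic signal, as described above, amounts to comparing each gene's log-likelihood under two competing topologies. The sketch below assumes those per-gene log-likelihoods have already been computed with a maximum likelihood program; the function names and the numbers are hypothetical placeholders, not results from the paper.

```python
def gene_wise_signal(lnl_t1, lnl_t2):
    """Per-gene difference in log-likelihood between topology T1 and T2.

    Positive values favor T1, negative values favor T2.
    """
    if lnl_t1.keys() != lnl_t2.keys():
        raise ValueError("both topologies must be scored on the same genes")
    return {gene: lnl_t1[gene] - lnl_t2[gene] for gene in lnl_t1}


def summarize(delta):
    """Total signal, the single most influential gene, and the total without it."""
    total = sum(delta.values())
    strongest = max(delta, key=lambda gene: abs(delta[gene]))
    return total, strongest, total - delta[strongest]


# Hypothetical per-gene log-likelihoods, invented purely for illustration.
lnl_t1 = {"g1": -1000.0, "g2": -2000.0, "g3": -1500.0}
lnl_t2 = {"g1": -1001.0, "g2": -1998.0, "g3": -1540.0}

delta = gene_wise_signal(lnl_t1, lnl_t2)
total, strongest, total_without = summarize(delta)
```

In this toy input the summed signal favors T1 only because of a single gene; dropping that gene flips the sign of the total, which mirrors the sensitivity the abstract reports for some contentious branches.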
Hedin, Marshal; Derkarabetian, Shahan; Ramírez, Martín J; Vink, Cor; Bond, Jason E
2018-01-26
Here we show that the most venomous spiders in the world are phylogenetically misplaced. Australian atracine spiders (family Hexathelidae), including the notorious Sydney funnel-web spider Atrax robustus, produce venom peptides that can kill people. Intriguingly, eastern Australian mouse spiders (family Actinopodidae) are also medically dangerous, possessing venom peptides strikingly similar to Atrax hexatoxins. Based on the standing morphology-based classification, mouse spiders are hypothesized distant relatives of atracines, having diverged over 200 million years ago. Using sequence-capture phylogenomics, we instead show convincingly that hexathelids are non-monophyletic, and that atracines are sister to actinopodids. Three new mygalomorph lineages are elevated to the family level, and a revised circumscription of Hexathelidae is presented. Re-writing this phylogenetic story has major implications for how we study venom evolution in these spiders, and potentially genuine consequences for antivenom development and bite treatment research. More generally, our research provides a textbook example of the applied importance of modern phylogenomic research.
Phylogenomics of Lophotrochozoa with Consideration of Systematic Error.
Kocot, Kevin M; Struck, Torsten H; Merkel, Julia; Waits, Damien S; Todt, Christiane; Brannock, Pamela M; Weese, David A; Cannon, Johanna T; Moroz, Leonid L; Lieb, Bernhard; Halanych, Kenneth M
2017-03-01
Phylogenomic studies have improved understanding of deep metazoan phylogeny and show promise for resolving incongruences among analyses based on limited numbers of loci. One region of the animal tree that has been especially difficult to resolve, even with phylogenomic approaches, is relationships within Lophotrochozoa (the animal clade that includes molluscs, annelids, and flatworms among others). Lack of resolution in phylogenomic analyses could be due to insufficient phylogenetic signal, limitations in taxon and/or gene sampling, or systematic error. Here, we investigated why lophotrochozoan phylogeny has been such a difficult question to answer by identifying and reducing sources of systematic error. We supplemented existing data with 32 new transcriptomes spanning the diversity of Lophotrochozoa and constructed a new set of Lophotrochozoa-specific core orthologs. Of these, 638 orthologous groups (OGs) passed strict screening for paralogy using a tree-based approach. In order to reduce possible sources of systematic error, we calculated branch-length heterogeneity, evolutionary rate, percent missing data, compositional bias, and saturation for each OG and analyzed increasingly stricter subsets of only the most stringent (best) OGs for these five variables. Principal component analysis of the values for each factor examined for each OG revealed that compositional heterogeneity and average patristic distance contributed most to the variance observed along the first principal component while branch-length heterogeneity and, to a lesser extent, saturation contributed most to the variance observed along the second. Missing data did not strongly contribute to either. Additional sensitivity analyses examined effects of removing taxa with heterogeneous branch lengths, large amounts of missing data, and compositional heterogeneity. 
Although our analyses do not unambiguously resolve lophotrochozoan phylogeny, we advance the field by reducing the list of viable hypotheses. Moreover, our systematic approach for dissection of phylogenomic data can be applied to explore sources of incongruence and poor support in any phylogenomic data set. [Annelida; Brachiopoda; Bryozoa; Entoprocta; Mollusca; Nemertea; Phoronida; Platyzoa; Polyzoa; Spiralia; Trochozoa.] © The Author(s) 2016. Published by Oxford University Press, on behalf of the Society of Systematic Biologists.
Campbell, Lahcen I; Rota-Stabelli, Omar; Edgecombe, Gregory D; Marchioro, Trevor; Longhorn, Stuart J; Telford, Maximilian J; Philippe, Hervé; Rebecchi, Lorena; Peterson, Kevin J; Pisani, Davide
2011-09-20
Morphological data traditionally group Tardigrada (water bears), Onychophora (velvet worms), and Arthropoda (e.g., spiders, insects, and their allies) into a monophyletic group of invertebrates with walking appendages known as the Panarthropoda. However, molecular data generally do not support the inclusion of tardigrades within the Panarthropoda, but instead place them closer to Nematoda (roundworms). Here we present results from the analyses of two independent genomic datasets, expressed sequence tags (ESTs) and microRNAs (miRNAs), which congruently resolve the phylogenetic relationships of Tardigrada. Our EST analyses, based on 49,023 amino acid sites from 255 proteins, significantly support a monophyletic Panarthropoda including Tardigrada and suggest a sister group relationship between Arthropoda and Onychophora. Using careful experimental manipulations--comparisons of model fit, signal dissection, and taxonomic pruning--we show that support for a Tardigrada + Nematoda group derives from the phylogenetic artifact of long-branch attraction. Our small RNA libraries fully support our EST results; no miRNAs were found to link Tardigrada and Nematoda, whereas all panarthropods were found to share one unique miRNA (miR-276). In addition, Onychophora and Arthropoda were found to share a second miRNA (miR-305). Our study confirms the monophyly of the legged ecdysozoans, shows that past support for a Tardigrada + Nematoda group was due to long-branch attraction, and suggests that the velvet worms are the sister group to the arthropods. PMID:21896763
Morphometrics of Daucus (Apiaceae): A counterpart to a phylogenomic study
USDA-ARS's Scientific Manuscript database
Molecular phylogenetics of genome-scale data sets (phylogenomics) often produces phylogenetic trees with unprecedented resolution. A companion phylogenomics analysis of Daucus (carrots) using 94 conserved nuclear orthologs supported many of the traditional species but showed unexpected results that ...
Analyzing contentious relationships and outlier genes in phylogenomics.
Walker, Joseph F; Brown, Joseph W; Smith, Stephen A
2018-06-08
Recent studies have demonstrated that conflict is common among gene trees in phylogenomic studies, and that less than one percent of genes may ultimately drive species tree inference in supermatrix analyses. Here, we examined two datasets where supermatrix and coalescent-based species trees conflict. We identified two highly influential "outlier" genes in each dataset. When removed from each dataset, the inferred supermatrix trees matched the topologies obtained from coalescent analyses. We also demonstrate that, while the outlier genes in the vertebrate dataset have been shown in a previous study to be the result of errors in orthology detection, the outlier genes from a plant dataset did not exhibit any obvious systematic error and therefore may be the result of some biological process yet to be determined. While topological comparisons among a small set of alternate topologies can be helpful in discovering outlier genes, they can be limited in several ways, such as assuming all genes share the same topology. Coalescent species tree methods relax this assumption but do not explicitly facilitate the examination of specific edges. Coalescent methods often also assume that conflict is the result of incomplete lineage sorting (ILS). Here we explored a framework that allows for quickly examining alternative edges and support for large phylogenomic datasets that does not assume a single topology for all genes. For both datasets, these analyses provided detailed results confirming the support for coalescent-based topologies. This framework suggests that we can improve our understanding of the underlying signal in phylogenomic datasets by asking more targeted edge-based questions.
Phylogenomics from Whole Genome Sequences Using aTRAM.
Allen, Julie M; Boyd, Bret; Nguyen, Nam-Phuong; Vachaspati, Pranjal; Warnow, Tandy; Huang, Daisie I; Grady, Patrick G S; Bell, Kayce C; Cronk, Quentin C B; Mugisha, Lawrence; Pittendrigh, Barry R; Leonardi, M Soledad; Reed, David L; Johnson, Kevin P
2017-09-01
Novel sequencing technologies are rapidly expanding the size of data sets that can be applied to phylogenetic studies. Currently the most commonly used phylogenomic approaches involve some form of genome reduction. While these approaches make assembling phylogenomic data sets more economical for organisms with large genomes, they reduce the genomic coverage and thereby the long-term utility of the data. Currently, for organisms with moderate to small genomes (<1000 Mbp) it is feasible to sequence the entire genome at modest coverage (10-30×). Computational challenges for handling these large data sets can be alleviated by assembling targeted reads, rather than assembling the entire genome, to produce a phylogenomic data matrix. Here we demonstrate the use of automated Target Restricted Assembly Method (aTRAM) to assemble 1107 single-copy ortholog genes from whole genome sequencing of sucking lice (Anoplura) and out-groups. We developed a pipeline to extract exon sequences from the aTRAM assemblies by annotating them with respect to the original target protein. We aligned these protein sequences with the inferred amino acids and then performed phylogenetic analyses on both the concatenated matrix of genes and on each gene separately in a coalescent analysis. Finally, we tested the limits of successful assembly in aTRAM by assembling 100 genes from close- to distantly related taxa at high to low levels of coverage. Both the concatenated analysis and the coalescent-based analysis produced the same tree topology, which was consistent with previously published results and resolved weakly supported nodes. These results demonstrate that this approach is successful at developing phylogenomic data sets from raw genome sequencing reads. Further, we found that with coverages above 5-10×, aTRAM was successful at assembling 80-90% of the contigs for both close and distantly related taxa. 
As sequencing costs continue to decline, we expect full genome sequencing will become more feasible for a wider array of organisms, and aTRAM will enable mining of these genomic data sets for an extensive variety of applications, including phylogenomics. [aTRAM; gene assembly; genome sequencing; phylogenomics.] © The Author(s) 2017. Published by Oxford University Press, on behalf of the Society of Systematic Biologists.
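The coverage figures quoted in the abstract above can be related to sequencing effort with a back-of-the-envelope sketch. It assumes the usual approximation that mean coverage equals total sequenced bases divided by genome size. The function names, the 150 bp read length, and the read counts are hypothetical and do not come from the study.

```python
import math


def expected_coverage(n_reads: int, read_length: int, genome_size: int) -> float:
    """Mean depth of coverage: total sequenced bases divided by genome size."""
    return n_reads * read_length / genome_size


def reads_needed(coverage: float, read_length: int, genome_size: int) -> int:
    """Smallest number of reads that reaches the requested mean coverage."""
    return math.ceil(coverage * genome_size / read_length)


# Hypothetical example: a 1000 Mbp genome sequenced with 150 bp reads.
genome_size = 1_000_000_000
read_length = 150

print(expected_coverage(100_000_000, read_length, genome_size))  # mean depth
print(reads_needed(10, read_length, genome_size))                # reads for 10x
```

This ignores uneven coverage, duplicates, and contamination, so real projects would typically budget more reads than the estimate suggests.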
Phylogenomics of plant genomes: a methodology for genome-wide searches for orthologs in plants
Conte, Matthieu G; Gaillard, Sylvain; Droc, Gaetan; Perin, Christophe
2008-01-01
Background: Gene ortholog identification is now a major objective for mining the increasing amount of sequence data generated by complete or partial genome sequencing projects. Comparative and functional genomics urgently need a method for ortholog detection to reduce gene function inference and to aid in the identification of conserved or divergent genetic pathways between several species. As gene functions change during evolution, reconstructing the evolutionary history of genes should be a more accurate way to differentiate orthologs from paralogs. Phylogenomics takes into account phylogenetic information from high-throughput genome annotation and is the most straightforward way to infer orthologs. However, procedures for automatic detection of orthologs are still scarce and suffer from several limitations. Results: We developed a procedure for ortholog prediction between Oryza sativa and Arabidopsis thaliana. Firstly, we established an efficient method to cluster A. thaliana and O. sativa full proteomes into gene families. Then, we developed an optimized phylogenomics pipeline for ortholog inference. We validated the full procedure using test sets of orthologs and paralogs to demonstrate that our method outperforms pairwise methods for ortholog predictions. Conclusion: Our procedure achieved a high level of accuracy in predicting ortholog and paralog relationships. Phylogenomic predictions for all validated gene families in both species were easily achieved and we can conclude that our methodology outperforms similarly based methods. PMID:18426584
Torruella, Guifré; Derelle, Romain; Paps, Jordi; Lang, B. Franz; Roger, Andrew J.; Shalchian-Tabrizi, Kamran; Ruiz-Trillo, Iñaki
2012-01-01
Many of the eukaryotic phylogenomic analyses published to date were based on alignments of hundreds to thousands of genes. In such analyses, the most realistic evolutionary models currently available are often used to minimize the impact of systematic error. However, controversy remains over whether or not idiosyncratic gene family dynamics (i.e., gene duplications and losses) and incorrect orthology assignments are always appropriately taken into account. In this paper, we present an innovative strategy for overcoming orthology assignment problems. Rather than identifying and eliminating genes with paralogy problems, we have constructed a data set composed exclusively of conserved single-copy protein domains that, unlike most of the commonly used phylogenomic data sets, should be less confounded by orthology misassignments. To evaluate the power of this approach, we performed maximum likelihood and Bayesian analyses to infer the evolutionary relationships within the opisthokonts (which include Metazoa, Fungi, and related unicellular lineages). We used this approach to test 1) whether Filasterea and Ichthyosporea form a clade, 2) the interrelationships of early-branching metazoans, and 3) the relationships among early-branching fungi. We also assessed the impact of some methods that are known to minimize systematic error, including reducing the distance between the outgroup and ingroup taxa or using the CAT evolutionary model. Overall, our analyses support the Filozoa hypothesis in which Ichthyosporea are the first holozoan lineage to emerge followed by Filasterea, Choanoflagellata, and Metazoa. Blastocladiomycota appears as a lineage separate from Chytridiomycota, although this result is not strongly supported. These results represent independent tests of previous phylogenetic hypotheses, highlighting the importance of sophisticated approaches for orthology assignment in phylogenomic analyses. PMID:21771718
Improved phylogenomic taxon sampling noticeably affects nonbilaterian relationships.
Pick, K S; Philippe, H; Schreiber, F; Erpenbeck, D; Jackson, D J; Wrede, P; Wiens, M; Alié, A; Morgenstern, B; Manuel, M; Wörheide, G
2010-09-01
Despite expanding data sets and advances in phylogenomic methods, deep-level metazoan relationships remain highly controversial. Recent phylogenomic analyses depart from classical concepts in recovering ctenophores as the earliest branching metazoan taxon and propose a sister-group relationship between sponges and cnidarians (e.g., Dunn CW, Hejnol A, Matus DQ, et al. (18 co-authors). 2008. Broad phylogenomic sampling improves resolution of the animal tree of life. Nature 452:745-749). Here, we argue that these results are artifacts stemming from insufficient taxon sampling and long-branch attraction (LBA). By increasing taxon sampling from previously unsampled nonbilaterians and using an identical gene set to that reported by Dunn et al., we recover monophyletic Porifera as the sister group to all other Metazoa. This suggests that the basal position of the fast-evolving Ctenophora proposed by Dunn et al. was due to LBA and that broad taxon sampling is of fundamental importance to metazoan phylogenomic analyses. Additionally, saturation in the Dunn et al. character set is comparatively high, possibly contributing to the poor support for some nonbilaterian nodes.
Leaché, Adam D; Banbury, Barbara L; Linkem, Charles W; de Oca, Adrián Nieto-Montes
2016-03-22
Resolving the short phylogenetic branches that result from rapid evolutionary diversification often requires large numbers of loci. We collected targeted sequence capture data from 585 nuclear loci (541 ultraconserved elements and 44 protein-coding genes) to estimate the phylogenetic relationships among iguanian lizards in the North American genus Sceloporus. We tested for diversification rate shifts to determine if rapid radiation in the genus is correlated with chromosomal evolution. The phylogenomic trees that we obtained for Sceloporus using concatenation and coalescent-based species tree inference provide strong support for the monophyly and interrelationships among nearly all major groups. The diversification analysis supported one rate shift on the Sceloporus phylogeny approximately 20-25 million years ago that is associated with the doubling of the speciation rate from 0.06 species/million years (Ma) to 0.15 species/Ma. The posterior probability for this rate shift occurring on the branch leading to the Sceloporus species groups exhibiting increased chromosomal diversity is high (posterior probability = 0.997). Despite high levels of gene tree discordance, we were able to estimate a phylogenomic tree for Sceloporus that solves some of the taxonomic problems caused by previous analyses of fewer loci. The taxonomic changes that we propose using this new phylogenomic tree help clarify the number and composition of the major species groups in the genus. Our study provides new evidence for a putative link between chromosomal evolution and the rapid divergence and radiation of Sceloporus across North America.
Phylogenomic analyses reveal novel relationships among snake families.
Streicher, Jeffrey W; Wiens, John J
2016-07-01
Snakes are a diverse and important group of vertebrates. However, relationships among the major groups of snakes have remained highly uncertain, with recent studies hypothesizing very different (and typically weakly supported) relationships. Here, we address family-level snake relationships with new phylogenomic data from 3776 nuclear loci from ultraconserved elements (1.40 million aligned base pairs, 52% missing data overall) sampled from 29 snake species that together represent almost all families, a dataset ∼100 times larger than used in previous studies. We found relatively strong support from species-tree analyses (NJst) for most relationships, including three largely novel clades: (1) a clade uniting the boas, pythons and their relatives, (2) a clade placing cylindrophiids and uropeltids with this clade, and (3) a clade uniting bolyeriids (Round Island boas) with pythonids and their relatives (xenopeltids and loxocemids). Relationships among families of advanced snakes (caenophidians) were also strongly supported. The results show the potential for phylogenomic analyses to resolve difficult groups, but also show a surprising sensitivity of the analyses to the inclusion or exclusion of outgroups. Copyright © 2016 Elsevier Inc.
Oueslati, Amel; Salhi-Hannachi, Amel; Luro, François; Vignes, Hélène; Mournet, Pierre; Ollitrault, Patrick
2017-01-01
The mandarin horticultural group is an important component of world citrus production for the fresh fruit market. This group formerly classified as C. reticulata is highly polymorphic and recent molecular studies have suggested that numerous cultivated mandarins were introgressed by C. maxima (the pummelos). C. maxima and C. reticulata are also the ancestors of sweet and sour oranges, grapefruit, and therefore of all the "small citrus" modern varieties (mandarins, tangors, tangelos) derived from sexual hybridization between these horticultural groups. Recently, NGS technologies have greatly modified how plant evolution and genomic structure are analyzed, moving from phylogenetics to phylogenomics. The objective of this work was to develop a workflow for phylogenomic inference from Genotyping By Sequencing (GBS) data and to analyze the interspecific admixture along the nine citrus chromosomes for horticultural groups and recent varieties resulting from the combination of the C. reticulata and C. maxima gene pools. A GBS library was established from 55 citrus varieties, using the ApeKI restriction enzyme and selective PCR to improve the read depth. Diagnostic polymorphisms (DPs) of C. reticulata/C. maxima differentiation were identified and used to decipher the phylogenomic structure of the 55 varieties. The GBS approach was powerful and revealed 30,289 SNPs and 8,794 Indels with 12.6% of missing data. 11,133 DPs were selected covering the nine chromosomes with a higher density in genic regions. GBS combined with the detection of DPs was powerful for deciphering the "phylogenomic karyotypes" of cultivars derived from admixture of the two ancestral species after a limited number of interspecific recombinations. All the mandarins, mandarin hybrids, tangelos and tangors analyzed displayed introgression of C. maxima in different parts of the genome. C. reticulata/C. maxima admixture should be a major component of the high phenotypic variability of this germplasm opening up the way for association studies based on phylogenomics. PMID:28982157
Brown, Tyler S; Narechania, Apurva; Walker, John R; Planet, Paul J; Bifani, Pablo J; Kolokotronis, Sergios-Orestis; Kreiswirth, Barry N; Mathema, Barun
2016-11-21
Whole genome sequencing (WGS) has rapidly become an important research tool in tuberculosis epidemiology and is likely to replace many existing methods in public health microbiology in the near future. WGS-based methods may be particularly useful in areas with less diverse Mycobacterium tuberculosis populations, such as New York City, where conventional genotyping is often uninformative and field epidemiology often difficult. This study applies four candidate strategies for WGS-based identification of emerging M. tuberculosis subpopulations, employing both phylogenomic and population genetics methods. M. tuberculosis subpopulations in New York City and New Jersey can be distinguished via phylogenomic reconstruction, evidence of demographic expansion and subpopulation-specific signatures of selection, and by determination of subgroup-defining nucleotide substitutions. These methods identified known historical outbreak clusters and previously unidentified subpopulations within relatively monomorphic M. tuberculosis endemic clone groups. Neutrality statistics based on the site frequency spectrum were less useful for identifying M. tuberculosis subpopulations, likely due to the low levels of informative genetic variation in recently diverged isolate groups. In addition, we observed that isolates from New York City endemic clone groups have acquired multiple non-synonymous SNPs in virulence- and growth-associated pathways, and relatively few mutations in drug resistance-associated genes, suggesting that overall pathoadaptive fitness, rather than the acquisition of drug resistance mutations, has played a central role in the evolutionary history and epidemiology of M. tuberculosis subpopulations in New York City. Our results demonstrate that some but not all WGS-based methods are useful for detection of emerging M. 
tuberculosis clone groups, and support the use of phylogenomic reconstruction in routine tuberculosis laboratory surveillance, particularly in areas with relatively less diverse M. tuberculosis populations. Our study also supports the use of wider-reaching phylogenomic and population genomic methods in tuberculosis public health practice, which can support tuberculosis control activities by identifying genetic polymorphisms contributing to epidemiological success in local M. tuberculosis populations and possibly explain why certain isolate groups are apparently more successful in specific host populations.
Panzera, Alejandra; Leaché, Adam D; D'Elía, Guillermo; Victoriano, Pedro F
2017-01-01
The genus Liolaemus is one of the most ecologically diverse and species-rich genera of lizards worldwide. It currently includes more than 250 recognized species, which have been subject to many ecological and evolutionary studies. Nevertheless, Liolaemus lizards have a complex taxonomic history, mainly due to the incongruence between morphological and genetic data, incomplete taxon sampling, incomplete lineage sorting and hybridization. In addition, as many species have restricted and remote distributions, this has hampered their examination and inclusion in molecular systematic studies. The aims of this study are to infer a robust phylogeny for a subsample of lizards representing the Chilean clade (subgenus Liolaemus sensu stricto), and to test the monophyly of several of the major species groups. We use a phylogenomic approach, targeting 541 ultra-conserved elements (UCEs) and 44 protein-coding genes for 16 taxa. We conduct a comparison of phylogenetic analyses using maximum-likelihood and several species tree inference methods. The UCEs provide stronger support for phylogenetic relationships compared to the protein-coding genes; however, the UCEs outnumber the protein-coding genes by 10-fold. On average, the protein-coding genes contain over twice the number of informative sites. Based on our phylogenomic analyses, all the groups sampled are polyphyletic. Liolaemus tenuis tenuis is difficult to place in the phylogeny, because only a few loci (nine) were recovered for this species. Topologies or support values did not change dramatically upon exclusion of L. t. tenuis from analyses, suggesting that missing data did not have a significant impact on phylogenetic inference in this data set. The phylogenomic analyses provide strong support for sister group relationships between L. fuscus, L. monticola, L. nigroviridis and L. nitidus, and L. platei and L. velosoi. 
Despite our limited taxon sampling, we have provided a reliable starting hypothesis for the relationships among many major groups of the Chilean clade of Liolaemus that will help future work aimed at resolving the Liolaemus phylogeny.
Burke, Sean V; Wysocki, William P; Zuloaga, Fernando O; Craine, Joseph M; Pires, J Chris; Edger, Patrick P; Mayfield-Jones, Dustin; Clark, Lynn G; Kelchner, Scot A; Duvall, Melvin R
2016-06-18
Panicoideae are the second largest subfamily in Poaceae (grass family), with 212 genera and approximately 3316 species. Previous studies have begun to reveal relationships within the subfamily, but largely lack resolution and/or robust support for certain tribal and subtribal groups. This study aims to resolve these relationships, as well as characterize a putative mitochondrial insert in one lineage. 35 newly sequenced Panicoideae plastomes were combined in a phylogenomic study with 37 other species: 15 Panicoideae and 22 from outgroups. A robust Panicoideae topology largely congruent with previous studies was obtained, but with some incongruences with previously reported subtribal relationships. A mitochondrial DNA (mtDNA) to plastid DNA (ptDNA) transfer was discovered in the Paspalum lineage. The phylogenomic analysis returned a topology that largely supports previous studies. Five previously recognized subtribes appear on the topology to be non-monophyletic. Additionally, evidence for mtDNA to ptDNA transfer was identified in both Paspalum fimbriatum and P. dilatatum, and suggests a single rare event that took place in a common progenitor. Finally, the framework from this study can guide larger whole plastome sampling to discern the relationships in Cyperochloeae, Steyermarkochloeae, Gynerieae, and other incertae sedis taxa that are weakly supported or unresolved.
Phylogenomics of the carrot genus (Daucus, Apiaceae)
USDA-ARS's Scientific Manuscript database
Molecular phylogenetics of genome-scale data sets (phylogenomics) often produces phylogenetic trees with unprecedented resolution. We here explore the utility of multiple nuclear orthologs for the taxonomic resolution of a wide variety of Daucus species and outgroups. We studied the phylogeny of 89 ...
Reddy, Sushma; Kimball, Rebecca T; Pandey, Akanksha; Hosner, Peter A; Braun, Michael J; Hackett, Shannon J; Han, Kin-Lan; Harshman, John; Huddleston, Christopher J; Kingston, Sarah; Marks, Ben D; Miglia, Kathleen J; Moore, William S; Sheldon, Frederick H; Witt, Christopher C; Yuri, Tamaki; Braun, Edward L
2017-09-01
Phylogenomics, the use of large-scale data matrices in phylogenetic analyses, has been viewed as the ultimate solution to the problem of resolving difficult nodes in the tree of life. However, it has become clear that analyses of these large genomic data sets can also result in conflicting estimates of phylogeny. Here, we use the early divergences in Neoaves, the largest clade of extant birds, as a "model system" to understand the basis for incongruence among phylogenomic trees. We were motivated by the observation that trees from two recent avian phylogenomic studies exhibit conflicts. Those studies used different strategies: 1) collecting many characters [~42 megabase pairs (Mbp) of sequence data] from 48 birds, sometimes including only one taxon for each major clade; and 2) collecting fewer characters (~0.4 Mbp) from 198 birds, selected to subdivide long branches. However, the studies also used different data types: the taxon-poor data matrix comprised 68% non-coding sequences whereas coding exons dominated the taxon-rich data matrix. This difference raises the question of whether the primary reason for incongruence is the number of sites, the number of taxa, or the data type. To test among these alternative hypotheses we assembled a novel, large-scale data matrix comprising 90% non-coding sequences from 235 bird species. Although increased taxon sampling appeared to have a positive impact on phylogenetic analyses, the most important variable was data type. Indeed, by analyzing different subsets of the taxa in our data matrix we found that increased taxon sampling actually resulted in increased congruence with the tree from the previous taxon-poor study (which had a majority of non-coding data) instead of the taxon-rich study (which largely used coding data).
We suggest that the observed differences in the estimates of topology for these studies reflect data-type effects due to violations of the models used in phylogenetic analyses, some of which may be difficult to detect. If incongruence among trees estimated using phylogenomic methods largely reflects problems with model fit, developing more "biologically-realistic" models is likely to be critical for efforts to reconstruct the tree of life. [Birds; coding exons; GTR model; model fit; Neoaves; non-coding DNA; phylogenomics; taxon sampling.] © The Author(s) 2017. Published by Oxford University Press, on behalf of the Society of Systematic Biologists.
Timme, Ruth E.; Rand, Hugh; Shumway, Martin; Trees, Eija K.; Simmons, Mustafa; Agarwala, Richa; Davis, Steven; Tillman, Glenn E.; Defibaugh-Chavez, Stephanie; Carleton, Heather A.; Klimke, William A.; Katz, Lee S.
2017-01-01
Background: As next generation sequence technology has advanced, there have been parallel advances in genome-scale analysis programs for determining evolutionary relationships as proxies for epidemiological relationship in public health. Most new programs skip traditional steps of ortholog determination and multi-gene alignment, instead identifying variants across a set of genomes, then summarizing results in a matrix of single-nucleotide polymorphisms or alleles for standard phylogenetic analysis. However, public health authorities need to document the performance of these methods with appropriate and comprehensive datasets so they can be validated for specific purposes, e.g., outbreak surveillance. Here we propose a set of benchmark datasets to be used for comparison and validation of phylogenomic pipelines. Methods: We identified four well-documented foodborne pathogen events in which the epidemiology was concordant with routine phylogenomic analyses (reference-based SNP and wgMLST approaches). These are ideal benchmark datasets, as the trees, WGS data, and epidemiological data for each are all in agreement. We have placed these sequence data, sample metadata, and “known” phylogenetic trees in publicly-accessible databases and developed a standard descriptive spreadsheet format describing each dataset. To facilitate easy downloading of these benchmarks, we developed an automated script that uses the standard descriptive spreadsheet format. Results: Our “outbreak” benchmark datasets represent the four major foodborne bacterial pathogens (Listeria monocytogenes, Salmonella enterica, Escherichia coli, and Campylobacter jejuni) and one simulated dataset where the “known tree” can be accurately called the “true tree”. The downloading script and associated table files are available on GitHub: https://github.com/WGS-standards-and-analysis/datasets.
Discussion: These five benchmark datasets will help standardize comparison of current and future phylogenomic pipelines, and facilitate important cross-institutional collaborations. Our work is part of a global effort to provide collaborative infrastructure for sequence data and analytic tools; we welcome additional benchmark datasets in our recommended format, and, if relevant, we will add these on our GitHub site. Together, these datasets, dataset format, and the underlying GitHub infrastructure present a recommended path for worldwide standardization of phylogenomic pipelines. PMID:29372115
Timme, Ruth E; Rand, Hugh; Shumway, Martin; Trees, Eija K; Simmons, Mustafa; Agarwala, Richa; Davis, Steven; Tillman, Glenn E; Defibaugh-Chavez, Stephanie; Carleton, Heather A; Klimke, William A; Katz, Lee S
2017-01-01
As next generation sequence technology has advanced, there have been parallel advances in genome-scale analysis programs for determining evolutionary relationships as proxies for epidemiological relationship in public health. Most new programs skip traditional steps of ortholog determination and multi-gene alignment, instead identifying variants across a set of genomes, then summarizing results in a matrix of single-nucleotide polymorphisms or alleles for standard phylogenetic analysis. However, public health authorities need to document the performance of these methods with appropriate and comprehensive datasets so they can be validated for specific purposes, e.g., outbreak surveillance. Here we propose a set of benchmark datasets to be used for comparison and validation of phylogenomic pipelines. We identified four well-documented foodborne pathogen events in which the epidemiology was concordant with routine phylogenomic analyses (reference-based SNP and wgMLST approaches). These are ideal benchmark datasets, as the trees, WGS data, and epidemiological data for each are all in agreement. We have placed these sequence data, sample metadata, and "known" phylogenetic trees in publicly-accessible databases and developed a standard descriptive spreadsheet format describing each dataset. To facilitate easy downloading of these benchmarks, we developed an automated script that uses the standard descriptive spreadsheet format. Our "outbreak" benchmark datasets represent the four major foodborne bacterial pathogens (Listeria monocytogenes, Salmonella enterica, Escherichia coli, and Campylobacter jejuni) and one simulated dataset where the "known tree" can be accurately called the "true tree". The downloading script and associated table files are available on GitHub: https://github.com/WGS-standards-and-analysis/datasets.
These five benchmark datasets will help standardize comparison of current and future phylogenomic pipelines, and facilitate important cross-institutional collaborations. Our work is part of a global effort to provide collaborative infrastructure for sequence data and analytic tools; we welcome additional benchmark datasets in our recommended format, and, if relevant, we will add these on our GitHub site. Together, these datasets, dataset format, and the underlying GitHub infrastructure present a recommended path for worldwide standardization of phylogenomic pipelines.
Gomila, Margarita; Busquets, Antonio; Mulet, Magdalena; García-Valdés, Elena; Lalucat, Jorge
2017-01-01
The Pseudomonas syringae phylogenetic group comprises 15 recognized bacterial species and more than 60 pathovars. The classification and identification of strains is relevant for practical reasons but also for understanding the epidemiology and ecology of this group of plant pathogenic bacteria. Genome-based taxonomic analyses have been introduced recently to clarify the taxonomy of the whole genus. A set of 139 draft and complete genome sequences of strains belonging to all species of the P. syringae group available in public databases were analyzed, together with the genomes of closely related species used as outgroups. Comparative genomics based on the genome sequences of the species type strains in the group allowed the delineation of phylogenomic species and demonstrated that a high proportion of strains included in the study are misclassified. Furthermore, representatives of at least 7 putative novel species were detected. It was also confirmed that P. ficuserectae, P. meliae, and P. savastanoi are later synonyms of P. amygdali and that "P. coronafaciens" should be revived as a nomenspecies.
GPSit: An automated method for evolutionary analysis of nonculturable ciliated microeukaryotes.
Chen, Xiao; Wang, Yurui; Sheng, Yalan; Warren, Alan; Gao, Shan
2018-05-01
Microeukaryotes are among the most important components of the microbial food web in almost all aquatic and terrestrial ecosystems worldwide. In order to gain a better understanding of their roles and functions in ecosystems, sequencing coupled with phylogenomic analyses of entire genomes or transcriptomes is increasingly used to reconstruct the evolutionary history and classification of these microeukaryotes and thus provide a more robust framework for determining their systematics and diversity. More importantly, phylogenomic research usually requires high levels of hands-on bioinformatics experience. Here, we propose an efficient automated method, "Guided Phylogenomic Search in trees" (GPSit), which starts from predicted protein sequences of newly sequenced species and a well-defined customized orthologous database. Compared with previous protocols, our method streamlines the entire workflow by integrating all essential and other optional operations. In so doing, the manual operation time for reconstructing phylogenetic relationships is reduced from days to several hours. Furthermore, GPSit supports user-defined parameters in most steps and thus allows users to adapt it to their studies. The effectiveness of GPSit is demonstrated by incorporating available online data and new single-cell data of three nonculturable marine ciliates (Anteholosticha monilata, Deviata sp. and Diophrys scutum) under moderate sequencing coverage (~5×). Our results indicate that the former could reconstruct robust "deep" phylogenetic relationships while the latter reveals the presence of intermediate taxa in shallow relationships. Based on empirical phylogenomic data, we also used GPSit to evaluate the impact of different levels of missing data on two commonly used methods of phylogenetic analyses, maximum likelihood (ML) and Bayesian inference (BI) methods. We found that BI is less sensitive to missing data when fast-evolving sites are removed.
© 2018 John Wiley & Sons Ltd.
Zhou, Xiaofan; Shen, Xing-Xing; Hittinger, Chris Todd
2018-01-01
The sizes of the data matrices assembled to resolve branches of the tree of life have increased dramatically, motivating the development of programs for fast, yet accurate, inference. For example, several different fast programs have been developed in the very popular maximum likelihood framework, including RAxML/ExaML, PhyML, IQ-TREE, and FastTree. Although these programs are widely used, a systematic evaluation and comparison of their performance using empirical genome-scale data matrices has so far been lacking. To address this question, we evaluated these four programs on 19 empirical phylogenomic data sets with hundreds to thousands of genes and up to 200 taxa with respect to likelihood maximization, tree topology, and computational speed. For single-gene tree inference, we found that the more exhaustive and slower strategies (ten searches per alignment) outperformed faster strategies (one tree search per alignment) using RAxML, PhyML, or IQ-TREE. Interestingly, single-gene trees inferred by the three programs yielded comparable coalescent-based species tree estimations. For concatenation-based species tree inference, IQ-TREE consistently achieved the best-observed likelihoods for all data sets, and RAxML/ExaML was a close second. In contrast, PhyML often failed to complete concatenation-based analyses, whereas FastTree was the fastest but generated lower likelihood values and more dissimilar tree topologies in both types of analyses. Finally, data matrix properties, such as the number of taxa and the strength of phylogenetic signal, sometimes substantially influenced the programs’ relative performance. Our results provide real-world gene and species tree phylogenetic inference benchmarks to inform the design and execution of large-scale phylogenomic data analyses. PMID:29177474
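One common way to quantify how "dissimilar" two tree topologies are is the Robinson-Foulds (RF) distance, which counts the bipartitions found in one tree but not the other. The abstract does not say which topological measure the authors used, so the following is only a minimal sketch of the general idea. It assumes trees are given as nested Python tuples of taxon names; the function names (leaves, bipartitions, robinson_foulds) are hypothetical and do not come from any of the programs named above.

```python
from typing import FrozenSet, Set, Tuple, Union

# Assumed toy representation: a tree is a taxon name or a tuple of subtrees.
Tree = Union[str, Tuple["Tree", ...]]


def leaves(tree: Tree) -> FrozenSet[str]:
    """Return the set of taxon names at the tips of the tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def bipartitions(tree: Tree) -> Set[FrozenSet[str]]:
    """Collect the non-trivial splits of the tree, treated as unrooted.

    Each split is stored as the side that does NOT contain a fixed
    anchor taxon, so the same split always gets the same key.
    """
    all_taxa = leaves(tree)
    anchor = min(all_taxa)
    splits: Set[FrozenSet[str]] = set()

    def visit(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(visit(child) for child in node))
        side = all_taxa - below if anchor in below else below
        # Keep only splits with at least two taxa on each side.
        if 1 < len(side) < len(all_taxa) - 1:
            splits.add(side)
        return below

    visit(tree)
    return splits


def robinson_foulds(a: Tree, b: Tree) -> int:
    """Number of splits present in exactly one of the two trees."""
    if leaves(a) != leaves(b):
        raise ValueError("trees must share the same taxon set")
    return len(bipartitions(a) ^ bipartitions(b))


# Two five-taxon trees that disagree on whether B or C is sister to A.
tree_1 = ((("A", "B"), "C"), ("D", "E"))
tree_2 = ((("A", "C"), "B"), ("D", "E"))
print(robinson_foulds(tree_1, tree_2))  # prints 2
```

A distance of 0 means identical unrooted topologies; larger values mean more conflicting splits. Real analyses would normally parse Newick files with an established phylogenetics library rather than hand-built tuples.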
Evolution of the bamboos (Bambusoideae; Poaceae): a full plastome phylogenomic analysis.
Wysocki, William P; Clark, Lynn G; Attigala, Lakshmi; Ruiz-Sanchez, Eduardo; Duvall, Melvin R
2015-03-18
Bambusoideae (Poaceae) comprise three distinct and well-supported lineages: tropical woody bamboos (Bambuseae), temperate woody bamboos (Arundinarieae) and herbaceous bamboos (Olyreae). Phylogenetic studies using chloroplast markers have generally supported a sister relationship between Bambuseae and Olyreae. This suggests either at least two origins of the woody bamboo syndrome in this subfamily or its loss in Olyreae. Here a full chloroplast genome (plastome) phylogenomic study is presented using the coding and noncoding regions of 13 complete plastomes from the Bambuseae, eight from Olyreae and 10 from Arundinarieae. Trees generated using full plastome sequences support the previously recovered monophyletic relationship between Bambuseae and Olyreae. In addition to these relationships, several unique plastome features are uncovered including the first mitogenome-to-plastome horizontal gene transfer observed in monocots. Phylogenomic agreement with previous published phylogenies reinforces the validity of these studies. Additionally, this study presents the first published plastomes from Neotropical woody bamboos and the first full plastome phylogenomic study performed within the herbaceous bamboos. Although the phylogenomic tree presented in this study is largely robust, additional studies using nuclear genes support monophyly in woody bamboos as well as hybridization among previous woody bamboo lineages. The evolutionary history of the Bambusoideae could be further clarified using transcriptomic techniques to increase sampling among nuclear orthologues and investigate the molecular genetics underlying the development of woody and floral tissues.
The Opiliones tree of life: shedding light on harvestmen relationships through transcriptomics.
Fernández, Rosa; Sharma, Prashant P; Tourinho, Ana Lúcia; Giribet, Gonzalo
2017-02-22
Opiliones are iconic arachnids with a Palaeozoic origin and a diversity that reflects ancient biogeographic patterns dating back at least to the times of Pangea. Owing to interest in harvestman diversity, evolution and biogeography, their relationships have been thoroughly studied using morphology and PCR-based Sanger approaches to infer their systematic relationships. More recently, two studies utilized transcriptomics-based phylogenomics to explore their basal relationships and diversification, but sampling was limiting for understanding deep evolutionary patterns, as they lacked good taxon representation at the family level. Here, we analysed a set of the 14 existing transcriptomes with 40 additional ones generated for this study, representing approximately 80% of the extant familial diversity in Opiliones. Our phylogenetic analyses, including a set of data matrices with different gene occupancy and evolutionary rates, and using a multitude of methods correcting for a diversity of factors affecting phylogenomic data matrices, provide a robust and stable Opiliones tree of life, where most families and higher taxa are precisely placed. Our dating analyses using alternative calibration points, methods and analytical parameters provide well-resolved old divergences, consistent with ancient regionalization in Pangea in some groups, and Pangean vicariance in others. The integration of state-of-the-art molecular techniques and analyses, together with the broadest taxonomic sampling to date presented in a phylogenomic study of harvestmen, provide new insights into harvestmen interrelationships, as well as an overview of the general biogeographic patterns of this ancient arthropod group. © 2017 The Author(s).
The Opiliones tree of life: shedding light on harvestmen relationships through transcriptomics
Sharma, Prashant P.; Tourinho, Ana Lúcia
2017-01-01
Opiliones are iconic arachnids with a Palaeozoic origin and a diversity that reflects ancient biogeographic patterns dating back at least to the times of Pangea. Owing to interest in harvestman diversity, evolution and biogeography, their relationships have been thoroughly studied using morphology and PCR-based Sanger approaches to infer their systematic relationships. More recently, two studies utilized transcriptomics-based phylogenomics to explore their basal relationships and diversification, but sampling was limiting for understanding deep evolutionary patterns, as they lacked good taxon representation at the family level. Here, we analysed a set of the 14 existing transcriptomes with 40 additional ones generated for this study, representing approximately 80% of the extant familial diversity in Opiliones. Our phylogenetic analyses, including a set of data matrices with different gene occupancy and evolutionary rates, and using a multitude of methods correcting for a diversity of factors affecting phylogenomic data matrices, provide a robust and stable Opiliones tree of life, where most families and higher taxa are precisely placed. Our dating analyses using alternative calibration points, methods and analytical parameters provide well-resolved old divergences, consistent with ancient regionalization in Pangea in some groups, and Pangean vicariance in others. The integration of state-of-the-art molecular techniques and analyses, together with the broadest taxonomic sampling to date presented in a phylogenomic study of harvestmen, provide new insights into harvestmen interrelationships, as well as an overview of the general biogeographic patterns of this ancient arthropod group. PMID:28228511
Insect Phylogenomics: Exploring the Source of Incongruence Using New Transcriptomic Data
Simon, Sabrina; Narechania, Apurva; DeSalle, Rob; Hadrys, Heike
2012-01-01
The evolution of the diverse insect lineages is one of the most fascinating issues in evolutionary biology. Despite extensive research in this area, the resolution of insect phylogeny, especially of interordinal relationships, remains a great challenge. One of the challenges for insect systematics is the radiation of the polyneopteran lineages with several contradictory and/or unresolved relationships. Here, we provide the first transcriptomic data for three enigmatic polyneopteran orders (Dermaptera, Plecoptera, and Zoraptera) to clarify one of the most debated issues among higher insect systematics. We applied different approaches to generate 3 data sets comprising 78 species and 1,579 clusters of orthologous genes. Using these three matrices, we explored several key mechanistic problems of phylogenetic reconstruction including missing data, matrix selection, gene and taxa number/choice, and the biological function of the genes. Based on the first phylogenomic approach including these three ambiguous polyneopteran orders, we provide here conclusive support for monophyletic Polyneoptera, contesting the hypothesis of Zoraptera + Paraneoptera and Plecoptera + remaining Neoptera. In addition, we employ various approaches to evaluate data quality and highlight problematic nodes within the Insect Tree that still exist despite our phylogenomic approach. We further show how the support for these nodes or alternative hypotheses might depend on the taxon- and/or gene-sampling. PMID:23175716
Diversification of Rosaceae since the Late Cretaceous based on plastid phylogenomics.
Zhang, Shu-Dong; Jin, Jian-Jun; Chen, Si-Yun; Chase, Mark W; Soltis, Douglas E; Li, Hong-Tao; Yang, Jun-Bo; Li, De-Zhu; Yi, Ting-Shuang
2017-05-01
Phylogenetic relationships in Rosaceae have long been problematic because of frequent hybridisation, apomixis and presumed rapid radiation, and their historical diversification has not been clarified. With 87 genera representing all subfamilies and tribes of Rosaceae and six of the other eight families of Rosales (outgroups), we analysed 130 newly sequenced plastomes together with 12 from GenBank in an attempt to reconstruct deep relationships and reveal temporal diversification of this family. Our results highlight the importance of improving sequence alignment and the use of appropriate substitution models in plastid phylogenomics. Three subfamilies and 16 tribes (as previously delimited) were strongly supported as monophyletic, and their relationships were fully resolved and strongly supported at most nodes. Rosaceae were estimated to have originated during the Late Cretaceous with evidence for rapid diversification events during several geological periods. The major lineages rapidly diversified in warm and wet habitats during the Late Cretaceous, and the rapid diversification of genera from the early Oligocene onwards occurred in colder and drier environments. Plastid phylogenomics offers new and important insights into deep phylogenetic relationships and the diversification history of Rosaceae. The robust phylogenetic backbone and time estimates we provide establish a framework for future comparative studies on rosaceous evolution. © 2017 The Authors. New Phytologist © 2017 New Phytologist Trust.
USDA-ARS's Scientific Manuscript database
The importance of taxon sampling in phylogenetic accuracy is a topic of active debate. We investigated the role of taxon sampling in causing incongruent results between two recent phylogenomic studies of stinging wasps (Hymenoptera: Aculeata), a diverse lineage that includes ants, bees and the major...
So many genes, so little time: A practical approach to divergence-time estimation in the genomic era
2018-01-01
Phylogenomic datasets have been successfully used to address questions involving evolutionary relationships, patterns of genome structure, signatures of selection, and gene and genome duplications. However, despite the recent explosion in genomic and transcriptomic data, the utility of these data sources for efficient divergence-time inference remains unexamined. Phylogenomic datasets pose two distinct problems for divergence-time estimation: (i) the volume of data makes inference of the entire dataset intractable, and (ii) the extent of underlying topological and rate heterogeneity across genes makes model mis-specification a real concern. “Gene shopping”, wherein a phylogenomic dataset is winnowed to a set of genes with desirable properties, represents an alternative approach that holds promise in alleviating these issues. We implemented an approach for phylogenomic datasets (available in SortaDate) that filters genes by three criteria: (i) clock-likeness, (ii) reasonable tree length (i.e., discernible information content), and (iii) least topological conflict with a focal species tree (presumed to have already been inferred). Such a winnowing procedure ensures that errors associated with model (both clock and topology) mis-specification are minimized, therefore reducing error in divergence-time estimation. We demonstrated the efficacy of this approach through simulation and applied it to published animal (Aves, Diplopoda, and Hymenoptera) and plant (carnivorous Caryophyllales, broad Caryophyllales, and Vitales) phylogenomic datasets. By quantifying rate heterogeneity across both genes and lineages we found that every empirical dataset examined included genes with clock-like, or nearly clock-like, behavior. Moreover, many datasets had genes that were clock-like, exhibited reasonable evolutionary rates, and were mostly compatible with the species tree. 
We identified overlap in age estimates when analyzing these filtered genes under strict clock and uncorrelated lognormal (UCLN) models. However, this overlap was often due to imprecise estimates from the UCLN model. We find that “gene shopping” can be an efficient approach to divergence-time inference for phylogenomic datasets that may otherwise be characterized by extensive gene tree heterogeneity. PMID:29772020
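The three filtering criteria described above (clock-likeness, reasonable tree length, and low conflict with the species tree) can be illustrated with a small ranking routine. This is a hedged sketch only, not SortaDate's code or interface: it assumes the per-gene statistics have already been computed elsewhere, and the GeneStats fields, the shop_genes function, the length thresholds, and the sort order are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class GeneStats:
    """Precomputed summary statistics for one gene tree (assumed inputs)."""
    name: str
    root_to_tip_variance: float  # lower = more clock-like
    tree_length: float           # total branch length, a proxy for information content
    concordance: float           # fraction of gene-tree splits matching the species tree


def shop_genes(
    genes: List[GeneStats],
    min_length: float = 0.1,   # illustrative threshold, not from the paper
    max_length: float = 10.0,  # illustrative threshold, not from the paper
    keep: int = 2,
) -> List[GeneStats]:
    """Drop genes with unreasonable tree length, then rank the rest.

    Ranking here puts least topological conflict first and breaks ties
    by clock-likeness; other orderings are equally defensible.
    """
    informative = [g for g in genes if min_length <= g.tree_length <= max_length]
    ranked = sorted(
        informative,
        key=lambda g: (-g.concordance, g.root_to_tip_variance),
    )
    return ranked[:keep]


# Made-up numbers purely to exercise the function.
candidates = [
    GeneStats("g1", root_to_tip_variance=0.002, tree_length=1.2, concordance=0.95),
    GeneStats("g2", root_to_tip_variance=0.050, tree_length=0.9, concordance=0.95),
    GeneStats("g3", root_to_tip_variance=0.001, tree_length=0.01, concordance=1.00),
    GeneStats("g4", root_to_tip_variance=0.010, tree_length=2.0, concordance=0.60),
]
selected = shop_genes(candidates)
print([g.name for g in selected])  # prints ['g1', 'g2']
```

In this toy input, g3 is the most clock-like and most concordant gene, yet it is excluded because its tree is too short to carry discernible information, which is the kind of trade-off the three criteria are meant to capture.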
Smith, Stephen A; Brown, Joseph W; Walker, Joseph F
2018-01-01
Phylogenomic datasets have been successfully used to address questions involving evolutionary relationships, patterns of genome structure, signatures of selection, and gene and genome duplications. However, despite the recent explosion in genomic and transcriptomic data, the utility of these data sources for efficient divergence-time inference remains unexamined. Phylogenomic datasets pose two distinct problems for divergence-time estimation: (i) the volume of data makes inference of the entire dataset intractable, and (ii) the extent of underlying topological and rate heterogeneity across genes makes model mis-specification a real concern. "Gene shopping", wherein a phylogenomic dataset is winnowed to a set of genes with desirable properties, represents an alternative approach that holds promise in alleviating these issues. We implemented an approach for phylogenomic datasets (available in SortaDate) that filters genes by three criteria: (i) clock-likeness, (ii) reasonable tree length (i.e., discernible information content), and (iii) least topological conflict with a focal species tree (presumed to have already been inferred). Such a winnowing procedure ensures that errors associated with model (both clock and topology) mis-specification are minimized, therefore reducing error in divergence-time estimation. We demonstrated the efficacy of this approach through simulation and applied it to published animal (Aves, Diplopoda, and Hymenoptera) and plant (carnivorous Caryophyllales, broad Caryophyllales, and Vitales) phylogenomic datasets. By quantifying rate heterogeneity across both genes and lineages we found that every empirical dataset examined included genes with clock-like, or nearly clock-like, behavior. Moreover, many datasets had genes that were clock-like, exhibited reasonable evolutionary rates, and were mostly compatible with the species tree. 
We identified overlap in age estimates when analyzing these filtered genes under strict clock and uncorrelated lognormal (UCLN) models. However, this overlap was often due to imprecise estimates from the UCLN model. We find that "gene shopping" can be an efficient approach to divergence-time inference for phylogenomic datasets that may otherwise be characterized by extensive gene tree heterogeneity.
Peng Zhao; Hui-Juan Zhou; Daniel Potter; Yi-Heng Hu; Xiao-Jia Feng; Meng Dang; Li Feng; Saman Zulfiqar; Wen-Zhe Liu; Gui-Fang Zhao; Keith Woeste
2018-01-01
Genomic data are a powerful tool for elucidating the processes involved in the evolution and divergence of species. The speciation and phylogenetic relationships among Chinese Juglans remain unclear. Here, we used results from phylogenomic and population genetic analyses, transcriptomics, Genotyping-By-Sequencing (GBS), and whole chloroplast...
Wu, Hao-Yang; Wang, Yan-Hui; Xie, Qiang; Ke, Yun-Ling; Bu, Wen-Jun
2016-06-17
With the great development of sequencing technologies and systematic methods, our understanding of evolutionary relationships at deeper levels within the tree of life has greatly improved over the last decade. However, the current taxonomic methodology is insufficient to describe the growing levels of diversity in both a standardised and general way due to the limitations of using only morphological traits to describe clades. Herein, we propose the idea of a molecular classification based on hierarchical and discrete amino acid characters. Clades are classified based on the results of phylogenetic analyses and described using amino acids with group specificity in phylograms. Practices based on the recently published phylogenomic datasets of insects together with 15 de novo sequenced transcriptomes in this study demonstrate that such a methodology can accommodate various higher ranks of taxonomy. Such an approach has the advantage of describing organisms in a standard and discrete way within a phylogenetic framework, thereby facilitating the recognition of clades from the view of the whole lineage, as indicated by PhyloCode. By combining identification keys and phylogenies, the molecular classification based on hierarchical and discrete characters may greatly boost the progress of integrative taxonomy.
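Describing a clade by "amino acids with group specificity" amounts to finding alignment columns where every member of the clade shares one residue that no taxon outside the clade has. The sketch below shows that idea on a toy alignment. It is an illustration under stated assumptions, not the authors' procedure: the input format (a dict of equal-length aligned strings), the function name diagnostic_sites, and the treatment of '-', 'X' and '?' as uninformative are all assumptions.

```python
from typing import Dict, Set


def diagnostic_sites(alignment: Dict[str, str], clade: Set[str]) -> Dict[int, str]:
    """Return {1-based column: residue} for columns that diagnose the clade.

    A column qualifies when all clade members share a single residue and
    that residue appears in no sequence outside the clade.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must be aligned to the same length")
    (n_columns,) = lengths

    inside = [seq for taxon, seq in alignment.items() if taxon in clade]
    outside = [seq for taxon, seq in alignment.items() if taxon not in clade]
    if not inside or not outside:
        raise ValueError("need at least one taxon inside and one outside the clade")

    sites: Dict[int, str] = {}
    for col in range(n_columns):
        in_states = {seq[col] for seq in inside}
        out_states = {seq[col] for seq in outside}
        if len(in_states) == 1:
            (state,) = in_states
            # Gaps and ambiguity codes are treated as uninformative (assumption).
            if state not in "-X?" and state not in out_states:
                sites[col + 1] = state
    return sites


# Toy protein alignment; taxa a and b form the clade of interest.
toy_alignment = {
    "a": "MKLV",
    "b": "MKLI",
    "c": "MRAV",
    "d": "MRAV",
}
print(diagnostic_sites(toy_alignment, {"a", "b"}))  # prints {2: 'K', 3: 'L'}
```

Column 1 is invariant across all taxa and column 4 varies within the clade, so only columns 2 and 3 serve as discrete characters for this group.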
Wu, Hao-Yang; Wang, Yan-Hui; Xie, Qiang; Ke, Yun-Ling; Bu, Wen-Jun
2016-01-01
With the great development of sequencing technologies and systematic methods, our understanding of evolutionary relationships at deeper levels within the tree of life has greatly improved over the last decade. However, the current taxonomic methodology is insufficient to describe the growing levels of diversity in both a standardised and general way due to the limitations of using only morphological traits to describe clades. Herein, we propose the idea of a molecular classification based on hierarchical and discrete amino acid characters. Clades are classified based on the results of phylogenetic analyses and described using amino acids with group specificity in phylograms. Practices based on the recently published phylogenomic datasets of insects together with 15 de novo sequenced transcriptomes in this study demonstrate that such a methodology can accommodate various higher ranks of taxonomy. Such an approach has the advantage of describing organisms in a standard and discrete way within a phylogenetic framework, thereby facilitating the recognition of clades from the view of the whole lineage, as indicated by PhyloCode. By combining identification keys and phylogenies, the molecular classification based on hierarchical and discrete characters may greatly boost the progress of integrative taxonomy. PMID:27312960
Gomila, Margarita; Busquets, Antonio; Mulet, Magdalena; García-Valdés, Elena; Lalucat, Jorge
2017-01-01
The Pseudomonas syringae phylogenetic group comprises 15 recognized bacterial species and more than 60 pathovars. The classification and identification of strains is relevant for practical reasons but also for understanding the epidemiology and ecology of this group of plant pathogenic bacteria. Genome-based taxonomic analyses have been introduced recently to clarify the taxonomy of the whole genus. A set of 139 draft and complete genome sequences of strains belonging to all species of the P. syringae group available in public databases were analyzed, together with the genomes of closely related species used as outgroups. Comparative genomics based on the genome sequences of the species type strains in the group allowed the delineation of phylogenomic species and demonstrated that a high proportion of strains included in the study are misclassified. Furthermore, representatives of at least 7 putative novel species were detected. It was also confirmed that P. ficuserectae, P. meliae, and P. savastanoi are later synonyms of P. amygdali and that “P. coronafaciens” should be revived as a nomenspecies. PMID:29270162
Tucker, Derek B; Colli, Guarino R; Giugliano, Lilian G; Hedges, S Blair; Hendry, Catriona R; Lemmon, Emily Moriarty; Lemmon, Alan R; Sites, Jack W; Pyron, R Alexander
2016-10-01
A well-known issue in phylogenetics is discordance among gene trees, species trees, morphology, and other data types. Gene-tree discordance is often caused by incomplete lineage sorting, lateral gene transfer, and gene duplication. Multispecies-coalescent methods can account for incomplete lineage sorting and are believed by many to be more accurate than concatenation. However, simulation studies and empirical data have demonstrated that concatenation and species tree methods often recover similar topologies. We use three popular methods of phylogenetic reconstruction (one concatenation, two species tree) to evaluate relationships within Teiidae. These lizards are distributed across the United States to Argentina and the West Indies, and their classification has been controversial due to incomplete sampling and the discordance among various character types (chromosomes, DNA, musculature, osteology, etc.) used to reconstruct phylogenetic relationships. Recent morphological and molecular analyses of the group resurrected three genera and created five new genera to resolve non-monophyly in three historically ill-defined genera: Ameiva, Cnemidophorus, and Tupinambis. Here, we assess the phylogenetic relationships of the Teiidae using "next-generation" anchored-phylogenomics sequencing. Our final alignment includes 316 loci (488,656bp DNA) for 244 individuals (56 species of teiids, representing all currently recognized genera) and all three methods (ExaML, MP-EST, and ASTRAL-II) recovered essentially identical topologies. Our results are basically in agreement with recent results from morphology and smaller molecular datasets, showing support for monophyly of the eight new genera. Interestingly, even with hundreds of loci, the relationships among some genera in Tupinambinae remain ambiguous (i.e. low nodal support for the position of Salvator and Dracaena). Copyright © 2016 Elsevier Inc. All rights reserved.
Fernández, Rosa; Kallal, Robert J; Dimitrov, Dimitar; Ballesteros, Jesús A; Arnedo, Miquel A; Giribet, Gonzalo; Hormiga, Gustavo
2018-05-07
Dating back to almost 400 mya, spiders are among the most diverse terrestrial predators [1]. However, despite considerable effort [1-9], their phylogenetic relationships and diversification dynamics remain poorly understood. Here, we use a synergistic approach to study spider evolution through phylogenomics, comparative transcriptomics, and lineage diversification analyses. Our analyses, based on ca. 2,500 genes from 159 spider species, reject a single origin of the orb web (the "ancient orb-web hypothesis") and suggest that orb webs evolved multiple times since the late Triassic-Jurassic. We find no significant association between the loss of foraging webs and increases in diversification rates, suggesting that other factors (e.g., habitat heterogeneity or biotic interactions) potentially played a key role in spider diversification. Finally, we report notable genomic differences in the main spider lineages: while araneoids (ecribellate orb-weavers and their allies) reveal an enrichment in genes related to behavior and sensory reception, the retrolateral tibial apophysis (RTA) clade, the most diverse araneomorph spider lineage, shows enrichment in genes related to immune responses and polyphenic determination. This study, one of the largest invertebrate phylogenomic analyses to date, highlights the usefulness of transcriptomic data not only to build a robust backbone for the Spider Tree of Life, but also to address the genetic basis of diversification in the spider evolutionary chronicle. Copyright © 2018 Elsevier Ltd. All rights reserved.
Curk, Franck; Ancillo, Gema; Ollitrault, Frédérique; Perrier, Xavier; Jacquemoud-Collet, Jean-Pierre; Garcia-Lor, Andres; Navarro, Luis; Ollitrault, Patrick
2015-01-01
Most cultivated Citrus species originated from interspecific hybridisation between four ancestral taxa (C. reticulata, C. maxima, C. medica, and C. micrantha) with limited further interspecific recombination due to vegetative propagation. This evolution resulted in admixture genomes with frequent interspecific heterozygosity. Moreover, a major part of the phenotypic diversity of edible citrus results from the initial differentiation between these taxa. Deciphering the phylogenomic structure of citrus germplasm is therefore essential for an efficient utilization of citrus biodiversity in breeding schemes. The objective of this work was to develop a set of species-diagnostic single nucleotide polymorphism (SNP) markers for the four Citrus ancestral taxa covering the nine chromosomes, and to use these markers to infer the phylogenomic structure of secondary species and modern cultivars. Species-diagnostic SNPs were mined from 454 amplicon sequencing of 57 gene fragments from 26 genotypes of the four basic taxa. Of the 1,053 SNPs mined from 28,507 kb sequence, 273 were found to be highly diagnostic for a single basic taxon. Species-diagnostic SNP markers (105) were used to analyse the admixture structure of varieties and rootstocks. This revealed C. maxima introgressions in most of the old and in all recent selections of mandarins, and suggested that C. reticulata × C. maxima reticulation and introgression processes were important in edible mandarin domestication. The large range of phylogenomic constitutions between C. reticulata and C. maxima revealed in mandarins, tangelos, tangors, sweet oranges, sour oranges, grapefruits, and orangelos is favourable for genetic association studies based on phylogenomic structures of the germplasm. Inferred admixture structures were in agreement with previous hypotheses regarding the origin of several secondary species and also revealed the probable origin of several acid citrus varieties. The developed species-diagnostic SNP marker set will be useful for systematic estimation of admixture structure of citrus germplasm and for diverse genetic studies. PMID:25973611
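The idea of a species-diagnostic SNP in the abstract above can be illustrated with a small sketch. The abstract does not state the criterion behind "highly diagnostic", so the rule used here (an allele carried by every sample of one taxon and by no sample of any other) is an assumption, and the function name `diagnostic_alleles` and the toy genotypes are hypothetical.

```python
def diagnostic_alleles(genotypes, taxon_of):
    """Find alleles at one SNP that are fixed in a single taxon and private to it.

    genotypes: sample name -> set of alleles observed at the SNP
    taxon_of:  sample name -> taxon label
    Returns taxon -> set of alleles found in all of its samples and nowhere else.
    The strict fixed-and-private rule is an illustrative assumption.
    """
    result = {}
    for taxon in set(taxon_of.values()):
        inside = [genotypes[s] for s in genotypes if taxon_of[s] == taxon]
        outside = [genotypes[s] for s in genotypes if taxon_of[s] != taxon]
        fixed = set.intersection(*inside) if inside else set()
        seen_elsewhere = set().union(*outside) if outside else set()
        alleles = fixed - seen_elsewhere
        if alleles:
            result[taxon] = alleles
    return result


# Toy data (hypothetical): one SNP scored in five samples from three taxa.
genotypes = {"m1": {"A"}, "m2": {"A"}, "p1": {"G"}, "p2": {"G"}, "c1": {"G"}}
taxon_of = {"m1": "reticulata", "m2": "reticulata",
            "p1": "maxima", "p2": "maxima", "c1": "medica"}
diagnostic = diagnostic_alleles(genotypes, taxon_of)
```

In this toy case only the `A` allele qualifies, because `G` is shared by two taxa. A real screen would also need to handle missing data and sampling error, which this sketch ignores.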
Bernard, Guillaume; Chan, Cheong Xin; Ragan, Mark A
2016-07-01
Alignment-free (AF) approaches have recently been highlighted as alternatives to methods based on multiple sequence alignment in phylogenetic inference. However, the sensitivity of AF methods to genome-scale evolutionary scenarios is little known. Here, using simulated microbial genome data we systematically assess the sensitivity of nine AF methods to three important evolutionary scenarios: sequence divergence, lateral genetic transfer (LGT) and genome rearrangement. Among these, AF methods are most sensitive to the extent of sequence divergence, less sensitive to low and moderate frequencies of LGT, and most robust against genome rearrangement. We describe the application of AF methods to three well-studied empirical genome datasets, and introduce a new application of the jackknife to assess node support. Our results demonstrate that AF phylogenomics is computationally scalable to multi-genome data and can generate biologically meaningful phylogenies and insights into microbial evolution.
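As a rough illustration of what an alignment-free comparison looks like, the sketch below computes a k-mer based distance between sequences. The abstract does not name the nine AF methods it assessed, so this cosine distance on k-mer counts is a generic example and not necessarily one of them; the function names and the default `k=3` are assumptions.

```python
from collections import Counter
from math import sqrt


def kmer_profile(seq, k=3):
    """Count overlapping k-mers; windows containing non-ACGT symbols are skipped."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def cosine_distance(p, q):
    """1 - cosine similarity of two k-mer count profiles (1.0 if either is empty)."""
    dot = sum(p[kmer] * q[kmer] for kmer in p.keys() & q.keys())
    norm = sqrt(sum(v * v for v in p.values())) * sqrt(sum(v * v for v in q.values()))
    return 1.0 - dot / norm if norm else 1.0


def distance_matrix(seqs, k=3):
    """Pairwise distances for a dict of name -> sequence, with no alignment step."""
    names = sorted(seqs)
    profiles = {n: kmer_profile(seqs[n], k) for n in names}
    return {(a, b): cosine_distance(profiles[a], profiles[b])
            for a in names for b in names}
```

A matrix like this could then be passed to a distance-based tree builder such as neighbour joining. Real genome-scale use would need a much larger `k` and a more careful choice of distance.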
Phylogenomics of the killer whale indicates ecotype divergence in sympatry
Moura, A E; Kenny, J G; Chaudhuri, R R; Hughes, M A; Reisinger, R R; de Bruyn, P J N; Dahlheim, M E; Hall, N; Hoelzel, A R
2015-01-01
For many highly mobile species, the marine environment presents few obvious barriers to gene flow. Even so, there is considerable diversity within and among species, referred to by some as the ‘marine speciation paradox’. The recent and diverse radiation of delphinid cetaceans (dolphins) represents a good example of this. Delphinids are capable of extensive dispersion and yet many show fine-scale genetic differentiation among populations. Proposed mechanisms include the division and isolation of populations based on habitat dependence and resource specializations, and habitat release or changing dispersal corridors during glacial cycles. Here we use a phylogenomic approach to investigate the origin of differentiated sympatric populations of killer whales (Orcinus orca). Killer whales show strong specialization on prey choice in populations of stable matrifocal social groups (ecotypes), associated with genetic and phenotypic differentiation. Our data suggest evolution in sympatry among populations of resource specialists. PMID:25052415
Wen, Dingqiao; Yu, Yun; Hahn, Matthew W.; Nakhleh, Luay
2016-01-01
The role of hybridization and subsequent introgression has been demonstrated in an increasing number of species. Recently, Fontaine et al. (Science, 347, 2015, 1258524) conducted a phylogenomic analysis of six members of the Anopheles gambiae species complex. Their analysis revealed a reticulate evolutionary history and pointed to extensive introgression on all four autosomal arms. The study further highlighted the complex evolutionary signals that the co-occurrence of incomplete lineage sorting (ILS) and introgression can give rise to in phylogenomic analyses. While tree-based methodologies were used in the study, phylogenetic networks provide a more natural model to capture reticulate evolutionary histories. In this work, we reanalyse the Anopheles data using a recently devised framework that combines the multispecies coalescent with phylogenetic networks. This framework allows us to capture ILS and introgression simultaneously, and forms the basis for statistical methods for inferring reticulate evolutionary histories. The new analysis reveals a phylogenetic network with multiple hybridization events, some of which differ from those reported in the original study. To elucidate the extent and patterns of introgression across the genome, we devise a new method that quantifies the use of reticulation branches in the phylogenetic network by each genomic region. Applying the method to the mosquito data set reveals the evolutionary history of all the chromosomes. This study highlights the utility of ‘network thinking’ and the new insights it can uncover, in particular in phylogenomic analyses of large data sets with extensive gene tree incongruence. PMID:26808290
Romiguier, Jonathan; Cameron, Sydney A; Woodard, S Hollis; Fischman, Brielle J; Keller, Laurent; Praz, Christophe J
2016-03-01
As increasingly large molecular data sets are collected for phylogenomics, the conflicting phylogenetic signal among gene trees poses challenges to resolve some difficult nodes of the Tree of Life. Among these nodes, the phylogenetic position of the honey bees (Apini) within the corbiculate bee group remains controversial, despite its considerable importance for understanding the emergence and maintenance of eusociality. Here, we show that this controversy stems in part from pervasive phylogenetic conflicts among GC-rich gene trees. GC-rich genes typically have a high nucleotidic heterogeneity among species, which can induce topological conflicts among gene trees. When retaining only the most GC-homogeneous genes or using a nonhomogeneous model of sequence evolution, our analyses reveal a monophyletic group of the three lineages with a eusocial lifestyle (honey bees, bumble bees, and stingless bees). These phylogenetic relationships strongly suggest a single origin of eusociality in the corbiculate bees, with no reversal to solitary living in this group. To accurately reconstruct other important evolutionary steps across the Tree of Life, we suggest removing GC-rich and GC-heterogeneous genes from large phylogenomic data sets. Interpreted as a consequence of genome-wide variations in recombination rates, this GC effect can affect all taxa featuring GC-biased gene conversion, which is common in eukaryotes. © The Author 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
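The filtering step suggested in the abstract above (retaining the most GC-homogeneous genes) could be sketched roughly as follows. The abstract does not give the heterogeneity measure or the cutoff, so the standard deviation of GC content across species and the `keep_fraction` parameter are assumptions, and all function names are hypothetical.

```python
from statistics import pstdev


def gc_content(seq):
    """Fraction of G or C among unambiguous bases; gaps and N are ignored."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    return (seq.count("G") + seq.count("C")) / acgt if acgt else 0.0


def gc_heterogeneity(alignment):
    """Spread of GC content across the species of one gene alignment.

    alignment: species name -> aligned sequence.
    The population standard deviation is an assumed proxy for heterogeneity.
    """
    return pstdev([gc_content(seq) for seq in alignment.values()])


def most_homogeneous(genes, keep_fraction=0.5):
    """Return the names of the genes with the lowest GC heterogeneity."""
    ranked = sorted(genes, key=lambda name: gc_heterogeneity(genes[name]))
    n_keep = max(1, int(len(ranked) * keep_fraction))
    return ranked[:n_keep]


# Toy data (hypothetical): g1 has identical GC content in both species, g2 does not.
genes = {
    "g1": {"sp_a": "ATATGC", "sp_b": "TATACG"},
    "g2": {"sp_a": "GGCCGC", "sp_b": "ATATAT"},
}
kept = most_homogeneous(genes, keep_fraction=0.5)
```

Only the gene names are returned, so the caller decides how to subset the alignments before tree inference.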
A Phylogenomic Investigation of CYCLOIDEA-Like TCP Genes in the Leguminosae
Citerne, Hélène L.; Luo, Da; Pennington, R. Toby; Coen, Enrico; Cronk, Quentin C.B.
2003-01-01
Numerous TCP genes (transcription factors with a TCP domain) occur in legumes. Genes of this class in Arabidopsis (TCP1) and snapdragon (Antirrhinum majus; CYCLOIDEA) have been shown to be asymmetrically expressed in developing floral primordia, and in snapdragon, they are required for floral zygomorphy (bilaterally symmetrical flowers). These genes are therefore particularly interesting in Leguminosae, a family that is thought to have evolved zygomorphy independently from other zygomorphic angiosperm lineages. Using a phylogenomic approach, we show that homologs of TCP1/CYCLOIDEA occur in legumes and may be divided into two main classes (LEGCYC group I and II), apparently the result of an early duplication, and each class is characterized by a typical amino acid signature in the TCP domain. Furthermore, group I genes in legumes may be divided into two subclasses (LEGCYC IA and IB), apparently the result of a duplication near the base of the papilionoid legumes or below. Most papilionoid legumes investigated have all three genes present (LEGCYC IA, IB, and II), inviting further work to investigate possible functional difference between the three types. However, within these three major gene groups, the precise relationships of the paralogs between species are difficult to determine probably because of a complex history of duplication and loss with lineage sorting or heterotachy (within-site rate variation) due to functional differentiation. The results illustrate both the potential and the difficulties of orthology determination in variable gene families, on which the phylogenomic approach to formulating hypotheses of function depends. PMID:12644657
Hampl, Vladimir; Hug, Laura; Leigh, Jessica W; Dacks, Joel B; Lang, B Franz; Simpson, Alastair G B; Roger, Andrew J
2009-03-10
Nearly all of eukaryotic diversity has been classified into 6 suprakingdom-level groups (supergroups) based on molecular and morphological/cell-biological evidence; these are Opisthokonta, Amoebozoa, Archaeplastida, Rhizaria, Chromalveolata, and Excavata. However, molecular phylogeny has not provided clear evidence that either Chromalveolata or Excavata is monophyletic, nor has it resolved the relationships among the supergroups. To establish the affinities of Excavata, which contains parasites of global importance and organisms regarded previously as primitive eukaryotes, we conducted a phylogenomic analysis of a dataset of 143 proteins and 48 taxa, including 19 excavates. Previous phylogenomic studies have not included all major subgroups of Excavata, and thus have not definitively addressed their interrelationships. The enigmatic flagellate Andalucia is sister to typical jakobids. Jakobids (including Andalucia), Euglenozoa and Heterolobosea form a major clade that we name Discoba. Analyses of the complete dataset group Discoba with the mitochondrion-lacking excavates or "metamonads" (diplomonads, parabasalids, and Preaxostyla), but not with the final excavate group, Malawimonas. This separation likely results from a long-branch attraction artifact. Gradual removal of rapidly-evolving taxa from the dataset leads to moderate bootstrap support (69%) for the monophyly of all Excavata, and 90% support once all metamonads are removed. Most importantly, Excavata robustly emerges between unikonts (Amoebozoa + Opisthokonta) and "megagrouping" of Archaeplastida, Rhizaria, and chromalveolates. Our analyses indicate that Excavata forms a monophyletic suprakingdom-level group that is one of the 3 primary divisions within eukaryotes, along with unikonts and a megagroup of Archaeplastida, Rhizaria, and the chromalveolate lineages.
Alves, João M.P.; Serrano, Myrna G.; Maia da Silva, Flávia; Voegtly, Logan J.; Matveyev, Andrey V.; Teixeira, Marta M.G.; Camargo, Erney P.; Buck, Gregory A.
2013-01-01
It has long been known that insect-infecting trypanosomatid flagellates from the genera Angomonas and Strigomonas harbor bacterial endosymbionts (Candidatus Kinetoplastibacterium or TPE [trypanosomatid proteobacterial endosymbiont]) that supplement the host metabolism. Based on previous analyses of other bacterial endosymbiont genomes from other lineages, a stereotypical path of genome evolution in such bacteria over the duration of their association with the eukaryotic host has been characterized. In this work, we sequence and analyze the genomes of five TPEs, perform their metabolic reconstruction, perform extensive phylogenomic analyses with all available Betaproteobacteria, and compare the TPEs with their nearest betaproteobacterial relatives. We also identify a number of housekeeping and central metabolism genes that seem to have undergone positive selection. Our genome structure analyses show total synteny among the five TPEs despite millions of years of divergence, and that this lineage follows the common path of genome evolution observed in other endosymbionts of diverse ancestries. As previously suggested by cell biology and biochemistry experiments, Ca. Kinetoplastibacterium spp. preferentially maintain those genes necessary for the biosynthesis of compounds needed by their hosts. We have also shown that metabolic and informational genes related to the cooperation with the host are overrepresented amongst genes shown to be under positive selection. Finally, our phylogenomic analysis shows that, while being in the Alcaligenaceae family of Betaproteobacteria, the closest relatives of these endosymbionts are not in the genus Bordetella as previously reported, but more likely in the Taylorella genus. PMID:23345457
DOE Office of Scientific and Technical Information (OSTI.GOV)
Weighill, Deborah A; Jacobson, Daniel A
We explore the use of a network meta-modeling approach to compare the effects of similarity metrics used to construct biological networks on the topology of the resulting networks. This work reviews various similarity metrics for the construction of networks and various topology measures for the characterization of resulting network topology, demonstrating the use of these metrics in the construction and comparison of phylogenomic and transcriptomic networks.
LEEBENS-MACK, JIM; VISION, TODD; BRENNER, ERIC; BOWERS, JOHN E.; CANNON, STEVEN; CLEMENT, MARK J.; CUNNINGHAM, CLIFFORD W.; dePAMPHILIS, CLAUDE; deSALLE, ROB; DOYLE, JEFF J.; EISEN, JONATHAN A.; GU, XUN; HARSHMAN, JOHN; JANSEN, ROBERT K.; KELLOGG, ELIZABETH A.; KOONIN, EUGENE V.; MISHLER, BRENT D.; PHILIPPE, HERVÉ; PIRES, J. CHRIS; QIU, YIN-LONG; RHEE, SEUNG Y.; SJÖLANDER, KIMMEN; SOLTIS, DOUGLAS E.; SOLTIS, PAMELA S.; STEVENSON, DENNIS W.; WALL, KERR; WARNOW, TANDY; ZMASEK, CHRISTIAN
2011-01-01
In the eight years since phylogenomics was introduced as the intersection of genomics and phylogenetics, the field has provided fundamental insights into gene function, genome history and organismal relationships. The utility of phylogenomics is growing with the increase in the number and diversity of taxa for which whole genome and large transcriptome sequence sets are being generated. We assert that the synergy between genomic and phylogenetic perspectives in comparative biology would be enhanced by the development and refinement of minimal reporting standards for phylogenetic analyses. Encouraged by the development of the Minimum Information About a Microarray Experiment (MIAME) standard, we propose a similar roadmap for the development of a Minimal Information About a Phylogenetic Analysis (MIAPA) standard. Key in the successful development and implementation of such a standard will be broad participation by developers of phylogenetic analysis software, phylogenetic database developers, practitioners of phylogenomics, and journal editors. PMID:16901231
Genomic and Genetic Diversity within the Pseudomonas fluorescens Complex
Garrido-Sanz, Daniel; Meier-Kolthoff, Jan P.; Göker, Markus; Martín, Marta; Rivilla, Rafael; Redondo-Nieto, Miguel
2016-01-01
The Pseudomonas fluorescens complex includes Pseudomonas strains that have been taxonomically assigned to more than fifty different species, many of which have been described as plant growth-promoting rhizobacteria (PGPR) with potential applications in biocontrol and biofertilization. So far the phylogeny of this complex has been analyzed according to phenotypic traits, 16S rDNA, MLSA and inferred by whole-genome analysis. However, since most of the type strains have not been fully sequenced and new species are frequently described, correlation between taxonomy and phylogenomic analysis is missing. In recent years, the genomes of a large number of strains have been sequenced, showing important genomic heterogeneity and providing information suitable for genomic studies that are important to understand the genomic and genetic diversity shown by strains of this complex. Based on MLSA and several whole-genome sequence-based analyses of 93 sequenced strains, we have divided the P. fluorescens complex into eight phylogenomic groups that agree with previous works based on type strains. Digital DDH (dDDH) identified 69 species and 75 subspecies within the 93 genomes. The eight groups corresponded to clustering with a threshold of 31.8% dDDH, in full agreement with our MLSA. The Average Nucleotide Identity (ANI) approach showed inconsistencies regarding the assignment to species and to the eight groups. The small core genome of 1,334 CDSs and the large pan-genome of 30,848 CDSs, show the large diversity and genetic heterogeneity of the P. fluorescens complex. However, a low number of strains were enough to explain most of the CDSs diversity at core and strain-specific genomic fractions. Finally, the identification and analysis of group-specific genome and the screening for distinctive characters revealed a phylogenomic distribution of traits among the groups that provided insights into biocontrol and bioremediation applications as well as their role as PGPR. PMID:26915094
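Grouping genomes at a similarity threshold, as in the 31.8% dDDH figure reported above, can be illustrated with a small sketch. The abstract does not say which clustering algorithm was used, so single-linkage clustering is an assumption here, and the function name and toy similarity values are hypothetical.

```python
def cluster_by_threshold(similarity, threshold):
    """Single-linkage clustering of genomes from pairwise similarity values.

    similarity: (name_a, name_b) -> similarity in percent (e.g. a dDDH estimate)
    Two genomes share a group if a chain of pairs at or above the threshold
    connects them. Single linkage is an illustrative assumption.
    """
    names = sorted({name for pair in similarity for name in pair})
    parent = {name: name for name in names}

    def find(x):
        # Follow parent links to the group representative, halving the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), value in similarity.items():
        if value >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(groups.values())


# Toy values (hypothetical), not taken from the study.
toy = {("s1", "s2"): 80.0, ("s2", "s3"): 35.0, ("s1", "s3"): 20.0,
       ("s1", "s4"): 5.0, ("s2", "s4"): 8.0, ("s3", "s4"): 10.0}
broad_groups = cluster_by_threshold(toy, 31.8)
narrow_groups = cluster_by_threshold(toy, 70.0)
```

Raising the threshold splits the broad group into narrower ones, which mirrors how a low cutoff can delimit groups while a higher one delimits species.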
A functional phylogenomic view of the seed plants.
Lee, Ernest K; Cibrian-Jaramillo, Angelica; Kolokotronis, Sergios-Orestis; Katari, Manpreet S; Stamatakis, Alexandros; Ott, Michael; Chiu, Joanna C; Little, Damon P; Stevenson, Dennis Wm; McCombie, W Richard; Martienssen, Robert A; Coruzzi, Gloria; Desalle, Rob
2011-12-01
A novel result of the current research is the development and implementation of a unique functional phylogenomic approach that explores the genomic origins of seed plant diversification. We first use 22,833 sets of orthologs from the nuclear genomes of 101 genera across land plants to reconstruct their phylogenetic relationships. One of the more salient results is the resolution of some enigmatic relationships in seed plant phylogeny, such as the placement of Gnetales as sister to the rest of the gymnosperms. In using this novel phylogenomic approach, we were also able to identify overrepresented functional gene ontology categories in genes that provide positive branch support for major nodes prompting new hypotheses for genes associated with the diversification of angiosperms. For example, RNA interference (RNAi) has played a significant role in the divergence of monocots from other angiosperms, which has experimental support in Arabidopsis and rice. This analysis also implied that the second largest subunit of RNA polymerase IV and V (NRPD2) played a prominent role in the divergence of gymnosperms. This hypothesis is supported by the lack of 24nt siRNA in conifers, the maternal control of small RNA in the seeds of flowering plants, and the emergence of double fertilization in angiosperms. Our approach takes advantage of genomic data to define orthologs, reconstruct relationships, and narrow down candidate genes involved in plant evolution within a phylogenomic view of species' diversification.
A Functional Phylogenomic View of the Seed Plants
Katari, Manpreet S.; Stamatakis, Alexandros; Ott, Michael; Chiu, Joanna C.; Little, Damon P.; Stevenson, Dennis Wm.; McCombie, W. Richard; Martienssen, Robert A.; Coruzzi, Gloria; DeSalle, Rob
2011-01-01
A novel result of the current research is the development and implementation of a unique functional phylogenomic approach that explores the genomic origins of seed plant diversification. We first use 22,833 sets of orthologs from the nuclear genomes of 101 genera across land plants to reconstruct their phylogenetic relationships. One of the more salient results is the resolution of some enigmatic relationships in seed plant phylogeny, such as the placement of Gnetales as sister to the rest of the gymnosperms. In using this novel phylogenomic approach, we were also able to identify overrepresented functional gene ontology categories in genes that provide positive branch support for major nodes prompting new hypotheses for genes associated with the diversification of angiosperms. For example, RNA interference (RNAi) has played a significant role in the divergence of monocots from other angiosperms, which has experimental support in Arabidopsis and rice. This analysis also implied that the second largest subunit of RNA polymerase IV and V (NRPD2) played a prominent role in the divergence of gymnosperms. This hypothesis is supported by the lack of 24nt siRNA in conifers, the maternal control of small RNA in the seeds of flowering plants, and the emergence of double fertilization in angiosperms. Our approach takes advantage of genomic data to define orthologs, reconstruct relationships, and narrow down candidate genes involved in plant evolution within a phylogenomic view of species' diversification. PMID:22194700
Large-scale phylogenomic analyses indicate a deep origin of primary plastids within cyanobacteria.
Criscuolo, Alexis; Gribaldo, Simonetta
2011-11-01
The emergence of photosynthetic eukaryotes has played a crucial role in evolution and has strongly modified earth's ecology. Several phylogenetic analyses have established that primary plastids arose from a cyanobacterium through endosymbiosis. However, the question of which present-day cyanobacterial lineage is most closely related to primary plastids has been unclear. Here, we have performed an extensive phylogenomic investigation on the origin of primary plastids based on the analysis of up to 191 protein markers and over 30,000 aligned amino acid sites from 22 primary photosynthetic eukaryotes and 61 cyanobacteria representing a wide taxonomic sampling of this phylum. By using a number of solutions to circumvent a large range of systematic errors, we have reconstructed a robust global phylogeny of cyanobacteria and studied the placement of primary plastids within it. Our results strongly support an early emergence of primary plastids within cyanobacteria, prior to the diversification of most present-day cyanobacterial lineages for which genomic data are available.
Hyb-Seq: Combining target enrichment and genome skimming for plant phylogenomics
Weitemier, Kevin; Straub, Shannon C. K.; Cronn, Richard C.; Fishbein, Mark; Schmickl, Roswitha; McDonnell, Angela; Liston, Aaron
2014-01-01
• Premise of the study: Hyb-Seq, the combination of target enrichment and genome skimming, allows simultaneous data collection for low-copy nuclear genes and high-copy genomic targets for plant systematics and evolution studies. • Methods and Results: Genome and transcriptome assemblies for milkweed (Asclepias syriaca) were used to design enrichment probes for 3385 exons from 768 genes (>1.6 Mbp) followed by Illumina sequencing of enriched libraries. Hyb-Seq of 12 individuals (10 Asclepias species and two related genera) resulted in at least partial assembly of 92.6% of exons and 99.7% of genes and an average assembly length >2 Mbp. Importantly, complete plastomes and nuclear ribosomal DNA cistrons were assembled using off-target reads. Phylogenomic analyses demonstrated signal conflict between genomes. • Conclusions: The Hyb-Seq approach enables targeted sequencing of thousands of low-copy nuclear exons and flanking regions, as well as genome skimming of high-copy repeats and organellar genomes, to efficiently produce genome-scale data sets for phylogenomics. PMID:25225629
Cruz-Morales, Pablo; Ramos-Aboites, Hilda E; Licona-Cassani, Cuauhtémoc; Selem-Mójica, Nelly; Mejía-Ponce, Paulina M; Souza-Saldívar, Valeria; Barona-Gómez, Francisco
2017-09-01
Desferrioxamines are hydroxamate siderophores widely conserved in both aquatic and soil-dwelling Actinobacteria. While the genetic and enzymatic bases of siderophore biosynthesis and their transport in model families of this phylum are well understood, evolutionary studies are lacking. Here, we perform a comprehensive desferrioxamine-centric (des genes) phylogenomic analysis, which includes the genomes of six novel strains isolated from an iron- and phosphorus-depleted oasis in the Chihuahuan desert of Mexico. Our analyses reveal previously unnoticed desferrioxamine evolutionary patterns, involving both biosynthetic and transport genes, likely to be related to desferrioxamine chemical diversity. The identified patterns were used to postulate experimentally testable hypotheses after phenotypic characterization, including profiling of siderophore production and growth stimulation of co-cultures under iron deficiency. Based on our results, we propose a novel des gene, which we term desG, as responsible for incorporation of phenylacetyl moieties during biosynthesis of previously reported arylated desferrioxamines. Moreover, a genomic-based classification of the siderophore-binding proteins responsible for specific and generalist siderophore assimilation is postulated. This report provides a much-needed evolutionary framework, with specific insights supported by experimental data, to direct the future ecological and functional analysis of desferrioxamines in the environment. © FEMS 2017.
Kocot, Kevin M; Citarella, Mathew R; Moroz, Leonid L; Halanych, Kenneth M
2013-01-01
Molecular phylogenetics relies on accurate identification of orthologous sequences among the taxa of interest. Most orthology inference programs available for use in phylogenomics rely on small sets of pre-defined orthologs from model organisms or phenetic approaches such as all-versus-all sequence comparisons followed by Markov graph-based clustering. Such approaches have high sensitivity but may erroneously include paralogous sequences. We developed PhyloTreePruner, a software utility that uses a phylogenetic approach to refine orthology inferences made using phenetic methods. PhyloTreePruner checks single-gene trees for evidence of paralogy and generates a new alignment for each group containing only sequences inferred to be orthologs. Importantly, PhyloTreePruner takes into account support values on the tree and avoids unnecessarily deleting sequences in cases where a weakly supported tree topology incorrectly indicates paralogy. A test of PhyloTreePruner on a dataset generated from 11 completely sequenced arthropod genomes identified 2,027 orthologous groups sampled for all taxa. Phylogenetic analysis of the concatenated supermatrix yielded a generally well-supported topology that was consistent with the current understanding of arthropod phylogeny. PhyloTreePruner is freely available from http://sourceforge.net/projects/phylotreepruner/.
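The pruning step described in this abstract can be illustrated with a minimal sketch. It is not PhyloTreePruner's code: it assumes a gene tree written as nested Python tuples, a hypothetical leaf-label convention of "Taxon|sequence_id", and hypothetical function names. It also ignores support values, rerooting, and in-paralogs, all of which the real tool is described as handling. The sketch simply returns the largest subtree in which every taxon is represented by at most one sequence.

```python
def leaves(tree):
    """Return all leaf labels under a nested-tuple tree."""
    if isinstance(tree, str):
        return [tree]
    out = []
    for child in tree:
        out.extend(leaves(child))
    return out


def taxon(label):
    """Extract the taxon from a label (assumed convention: 'Taxon|sequence_id')."""
    return label.split("|")[0]


def largest_orthologous_subtree(tree):
    """Return the leaves of the largest subtree with at most one sequence per taxon."""
    best = []

    def visit(node):
        nonlocal best
        labels = leaves(node)
        taxa = [taxon(label) for label in labels]
        # A subtree qualifies only if no taxon appears more than once in it.
        if len(set(taxa)) == len(taxa) and len(labels) > len(best):
            best = labels
        if not isinstance(node, str):
            for child in node:
                visit(child)

    visit(tree)
    return best


# Hypothetical gene tree: taxa A and B each contribute two sequences.
gene_tree = ((("A|1", "B|1"), ("C|1", "D|1")), ("A|2", "B|2"))
print(largest_orthologous_subtree(gene_tree))
```

In this toy tree the whole tree is rejected because A and B appear twice, and the four-leaf subtree containing one sequence each of A, B, C, and D is kept.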
Phylogenomics resolves the evolutionary chronicle of our squirting closest relatives.
Giribet, Gonzalo
2018-04-27
A recent paper in BMC Biology has resolved the family relationships of sea squirts, one of our closest invertebrate relatives, by using a large phylogenomic data set derived from available genomes and newly generated transcriptomes. The work confirms previous ideas that ascidians (the sea squirts) are not monophyletic, as they include some pelagic jelly-like relatives, and proposes a chronogram for a group that has been difficult to resolve due to their accelerated genome evolution. See research article: https://bmcbiol.biomedcentral.com/articles/10.1186/s12915-018-0499-2.
Makarova, Kira S.; Wolf, Yuri I.; Koonin, Eugene V.
2015-01-01
With the continuously accelerating genome sequencing from diverse groups of archaea and bacteria, accurate identification of gene orthology and availability of readily expandable clusters of orthologous genes are essential for the functional annotation of new genomes. We report an update of the collection of archaeal Clusters of Orthologous Genes (arCOGs) to cover, on average, 91% of the protein-coding genes in 168 archaeal genomes. The new arCOGs were constructed using refined algorithms for orthology identification combined with extensive manual curation, including incorporation of the results of several completed and ongoing research projects in archaeal genomics. A new level of classification is introduced: superclusters that unite two or more arCOGs and more completely reflect gene family evolution than individual, disconnected arCOGs. Assessment of the current archaeal genome annotation in public databases indicates that consistent use of arCOGs can significantly improve the annotation quality. In addition to their utility for genome annotation, arCOGs also are a platform for phylogenomic analysis. We explore this aspect of arCOGs by performing a phylogenomic study of the Thermococci that are traditionally viewed as the basal branch of the Euryarchaeota. The results of phylogenomic analysis that involved both comparison of multiple phylogenetic trees and a search for putative derived shared characters by using phyletic patterns extracted from the arCOGs reveal a likely evolutionary relationship between the Thermococci, Methanococci, and Methanobacteria. The arCOGs are expected to be instrumental for a comprehensive phylogenomic study of the archaea. PMID:25764277
Phylogenomics and comparative genomics of Lactobacillus salivarius, a mammalian gut commensal.
Harris, Hugh M B; Bourin, Maxence J B; Claesson, Marcus J; O'Toole, Paul W
2017-08-01
The genus Lactobacillus is a diverse group with a combined species count of over 200. They are the largest group within the lactic acid bacteria and one of the most important bacterial groups involved in food microbiology and human nutrition because of their fermentative and probiotic properties. Lactobacillus salivarius, a species commonly isolated from the gastrointestinal tract of humans and animals, has been described as having potential probiotic properties, and results of previous studies have revealed considerable functional diversity existing on both the chromosomes and plasmids. Our study consists of comparative genomic analyses of the functional and phylogenomic diversity of 42 genomes of strains of L. salivarius using bioinformatic techniques. The main aim of the study was to describe intra-species diversity and to determine how this diversity is spread across the replicons. We found that multiple phylogenomic and non-phylogenomic methods used for reconstructing trees all converge on similar tree topologies, showing that different metrics largely agree on the evolutionary history of the species. The greatest genomic variation lies on the small plasmids, followed by the repA-type circular megaplasmid, with the chromosome varying least of all. Additionally, the presence of extra linear and circular megaplasmids is noted in several strains, while small plasmids are not always present. Glycosyl hydrolases, bacteriocins and proteases vary considerably on all replicons, while two exopolysaccharide clusters and several clustered regularly interspaced short palindromic repeats-associated systems show a lot of variation on the chromosome. Overall, despite its reputation as a mammalian gastrointestinal tract specialist, the intra-specific variation of L. salivarius reveals potential strain-dependent effects on human health.
Sharma, Rita; Cao, Peijian; Jung, Ki-Hong; Sharma, Manoj K.; Ronald, Pamela C.
2013-01-01
Glycoside hydrolases (GH) catalyze the hydrolysis of glycosidic bonds in cell wall polymers and can have major effects on cell wall architecture. Taking advantage of the massive datasets available in public databases, we have constructed a rice phylogenomic database of GHs (http://ricephylogenomics.ucdavis.edu/cellwalls/gh/). This database integrates multiple data types including the structural features, orthologous relationships, mutant availability, and gene expression patterns for each GH family in a phylogenomic context. The rice genome encodes 437 GH genes classified into 34 families. Based on pairwise comparison with eight dicot and four monocot genomes, we identified 138 GH genes that are highly diverged between monocots and dicots, 57 of which have diverged further in rice as compared with four monocot genomes scanned in this study. Chromosomal localization and expression analysis suggest a role for both whole-genome and localized gene duplications in expansion and diversification of GH families in rice. We examined the meta-profiles of expression patterns of GH genes in twenty different anatomical tissues of rice. Transcripts of 51 genes exhibit tissue or developmental stage-preferential expression, whereas seventeen other genes preferentially accumulate in actively growing tissues. When queried in RiceNet, a probabilistic functional gene network that facilitates functional gene predictions, nine out of seventeen genes form a regulatory network with the well-characterized genes involved in biosynthesis of cell wall polymers, including cellulose synthase and cellulose synthase-like genes of rice. Two-thirds of the GH genes in rice are upregulated in response to biotic and abiotic stress treatments, indicating a role in stress adaptation. Our analyses identify potential GH targets for cell wall modification. PMID:23986771
Wu, Chung-Shien; Wang, Ya-Nan; Hsu, Chi-Yao; Lin, Ching-Ping; Chaw, Shu-Miaw
2011-01-01
The relationships among the five extant gymnosperm groups (gnetophytes, Pinaceae, non-Pinaceae conifers [cupressophytes], Ginkgo, and cycads) remain equivocal. To clarify this issue, we sequenced the chloroplast genomes (cpDNAs) from two cupressophytes, Cephalotaxus wilsoniana and Taiwania cryptomerioides, and 53 common chloroplast protein-coding genes from another three cupressophytes, Agathis dammara, Nageia nagi, and Sciadopitys verticillata, and a non-Cycadaceae cycad, Bowenia serrulata. Comparative analyses of 11 conifer cpDNAs revealed that Pinaceae and cupressophytes each lost a different copy of inverted repeats (IRs), which contrasts with the view that the same IR has been lost in all conifers. Based on our structural finding, the character of an IR loss no longer conflicts with the "gnepines" hypothesis (gnetophytes sister to Pinaceae). Chloroplast phylogenomic analyses of amino acid sequences recovered incongruent topologies using different tree-building methods; however, we demonstrated that high heterotachous genes (genes that have highly different rates in different lineages) contributed to the long-branch attraction (LBA) artifact, resulting in incongruence of phylogenomic estimates. Additionally, amino acid compositions appear more heterogeneous in high than low heterotachous genes among the five gymnosperm groups. Removal of high heterotachous genes alleviated the LBA artifact and yielded congruent and robust tree topologies in which gnetophytes and Pinaceae formed a sister clade to cupressophytes (the gnepines hypothesis) and Ginkgo clustered with cycads. Adding more cupressophyte taxa could not improve the accuracy of chloroplast phylogenomics for the five gymnosperm groups. In contrast, removal of high heterotachous genes from data sets is simple and can increase confidence in evaluating the phylogeny of gymnosperms. PMID:21933779
Comparative genomic data of the Avian Phylogenomics Project.
Zhang, Guojie; Li, Bo; Li, Cai; Gilbert, M Thomas P; Jarvis, Erich D; Wang, Jun
2014-01-01
The evolutionary relationships of modern birds are among the most challenging to understand in systematic biology and have been debated for centuries. To address this challenge, we assembled or collected the genomes of 48 avian species spanning most orders of birds, including all Neognathae and two of the five Palaeognathae orders, and used the genomes to construct a genome-scale avian phylogenetic tree and perform comparative genomics analyses (Jarvis et al. in press; Zhang et al. in press). Here we release assemblies and datasets associated with the comparative genome analyses, which include 38 newly sequenced avian genomes plus previously released or simultaneously released genomes of Chicken, Zebra finch, Turkey, Pigeon, Peregrine falcon, Duck, Budgerigar, Adelie penguin, Emperor penguin and the Medium Ground Finch. We hope that this resource will serve future efforts in phylogenomics and comparative genomics. The 38 bird genomes were sequenced using the Illumina HiSeq 2000 platform and assembled using a whole genome shotgun strategy. The 48 genomes were categorized into two groups according to the N50 scaffold size of the assemblies: a high depth group comprising 23 species sequenced at high coverage (>50X) with multiple insert size libraries, resulting in N50 scaffold sizes greater than 1 Mb (except the White-throated Tinamou and Bald Eagle); and a low depth group comprising 25 species sequenced at low coverage (~30X) with two insert size libraries, resulting in an average N50 scaffold size of about 50 kb. Repetitive elements comprised 4%-22% of the bird genomes. The assembled scaffolds allowed the homology-based annotation of 13,000 to 17,000 protein-coding genes in each avian genome relative to chicken, zebra finch and human, as well as comparative and sequence conservation analyses.
Here we release full genome assemblies of 38 newly sequenced avian species, link to genome assembly downloads for 7 of the remaining 10 species, and provide a guide to the genomic data that have been generated and used in our Avian Phylogenomics Project. To the best of our knowledge, the Avian Phylogenomics Project is the biggest vertebrate comparative genomics project to date. The genomic data presented here are expected to accelerate further analyses in many fields, including phylogenetics, comparative genomics, evolution, neurobiology, developmental biology, and other related areas.
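The abstract above groups assemblies by N50 scaffold size. As a reminder of what that statistic measures, here is a minimal sketch of the standard N50 definition: the length L such that scaffolds of length L or longer contain at least half of the assembled bases. The function name and the toy scaffold lengths are illustrative assumptions, not values from the project.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths.

    Scaffolds are visited from longest to shortest; N50 is the length of the
    scaffold at which the running total first reaches half of the assembly.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no scaffold lengths given")


# Toy assembly (hypothetical lengths, arbitrary units): total is 300,
# and the two longest scaffolds (80 + 70 = 150) already cover half of it.
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_scaffolds))
```

Because N50 depends only on the length distribution, two assemblies of the same genome can differ greatly in N50 while containing a similar number of bases, which is why it is used here as a contiguity measure rather than a completeness measure.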
Molecular diversity of early foraminifera
NASA Astrophysics Data System (ADS)
Holzmann, Maria; Pawlowski, Jan
2017-04-01
Monothalamid foraminifera are a diverse group characterized by a single-chambered agglutinated or organic test. They occur in all marine habitats and are also present in terrestrial and freshwater environments. Monothalamids branch at the base of the foraminiferal tree, as a paraphyletic group with some clades branching at the base of Globothalamea and Tubothalamea. We currently have more than 1500 monothalamid sequences in our database, which can be divided into at least 20 clades, some of which are particularly well represented in terms of sequence numbers and/or number of species. These are clade BM, which contains Bathysiphon and Micrometula; clade C, which contains, among others, xenophyophorans, saccamminids, and a large variety of organic-walled or agglutinated genera; clade E, which contains the genera Psammophaga, Vellaria and Nellya; and four clades that contain freshwater foraminifera. In general, the monothalamid clades comprise both agglutinated and organic-walled genera. Some common genera, such as Crithionina, Saccammina, and Hippocrepina, are polyphyletic. Our results clearly show that monothalamids are highly diverse and that their molecular diversity by far surpasses their morphological variety. Based on phylogenomic studies, monothalamids evolved early in the evolution of eukaryotes, as a part of the supergroup Rhizaria, which also comprises radiolarians and other amoeboid protists. The monothalamids diverged from ancestral radiolarians probably about 1000 million years ago, but the exact time is difficult to infer because of uncertainties concerning the calibration of a eukaryotic phylogenomic tree.
Genome-Based Analyses of Six Hexacorallian Species Reject the "Naked Coral" Hypothesis.
Wang, Xin; Drillon, Guénola; Ryu, Taewoo; Voolstra, Christian R; Aranda, Manuel
2017-10-01
Scleractinian corals are the foundation species of the coral-reef ecosystem. Their calcium carbonate skeletons form extensive structures that are home to millions of species, making coral reefs one of the most diverse ecosystems of our planet. However, our understanding of how reef-building corals have evolved the ability to calcify and become the ecosystem builders they are today is hampered by uncertain relationships within their subclass Hexacorallia. Corallimorpharians have been proposed to originate from a complex scleractinian ancestor that lost the ability to calcify in response to increasing ocean acidification, suggesting the possibility for corals to lose and gain the ability to calcify in response to increasing ocean acidification. Here, we employed a phylogenomic approach using whole-genome data from six hexacorallian species to resolve the evolutionary relationship between reef-building corals and their noncalcifying relatives. Phylogenetic analysis based on 1,421 single-copy orthologs, as well as gene presence/absence and synteny information, converged on the same topologies, showing strong support for scleractinian monophyly and a corallimorpharian sister clade. Our broad phylogenomic approach using sequence-based and sequence-independent analyses provides unambiguous evidence for the monophyly of scleractinian corals and the rejection of corallimorpharians as descendants of a complex coral ancestor. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
László G. Nagy; Robert Riley; Philip J. Bergmann; Krisztina Krizsán; Francis M. Martin; Igor V. Grigoriev; Dan Cullen; David S. Hibbett
2016-01-01
Fungal decomposition of plant cell walls (PCW) is a complex process that has diverse industrial applications and huge impacts on the carbon cycle. White rot (WR) is a powerful mode of PCW decay in which lignin and carbohydrates are both degraded. Mechanistic studies of decay coupled with comparative genomic analyses have provided clues to the enzymatic components of WR...
Fernández, Rosa; Giribet, Gonzalo
2015-01-01
Ricinulei are among the most obscure and cryptic arachnid orders, constituting a micro-diverse group with extreme endemism. The 76 extant species described to date are grouped in three genera: Ricinoides, from tropical Western and Central Africa, and the two Neotropical genera Cryptocellus and Pseudocellus. Until now, a single molecular phylogeny of Ricinulei has been published, recovering the African Ricinoides as the sister group of the American Pseudocellus and providing evidence for the diversification of the order pre-dating the fragmentation of Gondwana. Here, we present, to our knowledge, the first phylogenomic study of this neglected arachnid order based on data from five transcriptomes obtained from the five major mitochondrial lineages of Ricinulei. Our results, based on up to more than 2000 genes, strongly support a clade containing Pseudocellus and Cryptocellus, constituting the American group of Ricinulei, with the African Ricinoides nesting outside. Our dating of the diversification of the African and American clades using a 76 gene data matrix with 90% gene occupancy indicates that this arachnid lineage was distributed in the South American, North American and African plates of Gondwana and that its diversification is concordant with a biogeographic scenario (both for pattern and tempo) of Gondwanan vicariance. PMID:26543583
Ecogenomics and Taxonomy of Cyanobacteria Phylum
Walter, Juline M.; Coutinho, Felipe H.; Dutilh, Bas E.; Swings, Jean; Thompson, Fabiano L.; Thompson, Cristiane C.
2017-01-01
Cyanobacteria are major contributors to global biogeochemical cycles. The genetic diversity among Cyanobacteria enables them to thrive across many habitats, although only a few studies have analyzed the association of phylogenomic clades to specific environmental niches. In this study, we adopted an ecogenomics strategy with the aim to delineate ecological niche preferences of Cyanobacteria and integrate them to the genomic taxonomy of these bacteria. First, an appropriate phylogenomic framework was established using a set of genomic taxonomy signatures (including a tree based on conserved gene sequences, genome-to-genome distance, and average amino acid identity) to analyse ninety-nine publicly available cyanobacterial genomes. Next, the relative abundances of these genomes were determined throughout diverse global marine and freshwater ecosystems, using metagenomic data sets. The whole-genome-based taxonomy of the ninety-nine genomes allowed us to identify 57 (of which 28 are new genera) and 87 (of which 32 are new species) different cyanobacterial genera and species, respectively. The ecogenomic analysis allowed the distinction of three major ecological groups of Cyanobacteria (named as i. Low Temperature; ii. Low Temperature Copiotroph; and iii. High Temperature Oligotroph) that were coherently linked to the genomic taxonomy. This work establishes a new taxonomic framework for Cyanobacteria in the light of genomic taxonomy and ecogenomic approaches. PMID:29184540
Conserved Nonexonic Elements: A Novel Class of Marker for Phylogenomics.
Edwards, Scott V; Cloutier, Alison; Baker, Allan J
2017-11-01
Noncoding markers have a particular appeal as tools for phylogenomic analysis because, at least in vertebrates, they appear less subject to strong variation in GC content among lineages. Thus far, ultraconserved elements (UCEs) and introns have been the most widely used noncoding markers. Here we analyze and study the evolutionary properties of a new type of noncoding marker, conserved nonexonic elements (CNEEs), which consists of noncoding elements that are estimated to evolve slower than the neutral rate across a set of species. Although they often include UCEs, CNEEs are distinct from UCEs because they are not ultraconserved, and, most importantly, the core region alone is analyzed, rather than both the core and its flanking regions. Using a data set of 16 birds plus an alligator outgroup, and ∼3600 to ∼3800 loci per marker type, we found that although CNEEs were less variable than bioinformatically derived UCEs or introns and in some cases exhibited a slower approach to branch resolution as determined by phylogenomic subsampling, the quality of CNEE alignments was superior to those of the other markers, with fewer gaps and missing species. Phylogenetic resolution using coalescent approaches was comparable among the three marker types, with most nodes being fully and congruently resolved. Comparison of phylogenetic results across the three marker types indicated that one branch, the sister group to the passerine + falcon clade, was resolved differently and with moderate (>70%) bootstrap support between CNEEs and UCEs or introns. Overall, CNEEs appear to be promising as phylogenomic markers, yielding phylogenetic resolution as high as for UCEs and introns but with fewer gaps, less ambiguity in alignments and with patterns of nucleotide substitution more consistent with the assumptions of commonly used methods of phylogenetic analysis. © The Author(s) 2017. Published by Oxford University Press on behalf of the Society of Systematic Biologists. PMID:28637293
Chavda, Kalyan D.; Chen, Liang; Fouts, Derrick E.; Sutton, Granger; Brinkac, Lauren; Jenkins, Stephen G.; Bonomo, Robert A.
2016-01-01
Knowledge regarding the genomic structure of Enterobacter spp., the second most prevalent carbapenemase-producing Enterobacteriaceae, remains limited. Here we sequenced 97 clinical Enterobacter species isolates, both carbapenem susceptible and resistant, from various geographic regions to decipher the molecular origins of carbapenem resistance and to understand the changing phylogeny of these emerging and drug-resistant pathogens. Of the carbapenem-resistant isolates, 30 possessed blaKPC-2, 40 had blaKPC-3, 2 had blaKPC-4, and 2 had blaNDM-1. Twenty-three isolates were carbapenem susceptible. Six genomes were sequenced to completion, and their sizes ranged from 4.6 to 5.1 Mbp. Phylogenomic analysis placed 96 of these genomes, 351 additional Enterobacter genomes downloaded from NCBI GenBank, and six newly sequenced type strains into 19 phylogenomic groups: 18 groups (A to R) in the Enterobacter cloacae complex and Enterobacter aerogenes. Diverse mechanisms underlying the molecular evolutionary trajectory of these drug-resistant Enterobacter spp. were revealed, including the acquisition of an antibiotic resistance plasmid, followed by clonal spread, horizontal transfer of blaKPC-harboring plasmids between different phylogenomic groups, and repeated transposition of the blaKPC gene among different plasmid backbones. Group A, which comprises multilocus sequence type 171 (ST171), was the most commonly identified (23% of isolates). Genomic analysis showed that ST171 isolates evolved from a common ancestor and formed two different major clusters, each acquiring unique blaKPC-harboring plasmids, followed by clonal expansion.
The data presented here represent the first comprehensive study of phylogenomic interrogation and the relationship between antibiotic resistance and plasmid discrimination among carbapenem-resistant Enterobacter spp., demonstrating the genetic diversity and complexity of the molecular mechanisms driving antibiotic resistance in this genus. PMID:27965456
Brown, Jeremy M; Thomson, Robert C
2017-07-01
As the application of genomic data in phylogenetics has become routine, a number of cases have arisen where alternative data sets strongly support conflicting conclusions. This sensitivity to analytical decisions has prevented firm resolution of some of the most recalcitrant nodes in the tree of life. To better understand the causes and nature of this sensitivity, we analyzed several phylogenomic data sets using an alternative measure of topological support (the Bayes factor) that both demonstrates and averts several limitations of more frequently employed support measures (such as Markov chain Monte Carlo estimates of posterior probabilities). Bayes factors reveal important, previously hidden, differences across six "phylogenomic" data sets collected to resolve the phylogenetic placement of turtles within Amniota. These data sets vary substantially in their support for well-established amniote relationships, particularly in the proportion of genes that contain extreme amounts of information as well as the proportion that strongly reject these uncontroversial relationships. All six data sets contain little information to resolve the phylogenetic placement of turtles relative to other amniotes. Bayes factors also reveal that a very small number of extremely influential genes (less than 1% of genes in a data set) can fundamentally change significant phylogenetic conclusions. In one example, these genes are shown to contain previously unrecognized paralogs. This study demonstrates both that the resolution of difficult phylogenomic problems remains sensitive to seemingly minor analysis details and that Bayes factors are a valuable tool for identifying and solving these challenges. © The Author(s) 2016. Published by Oxford University Press, on behalf of the Society of Systematic Biologists. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
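The per-gene use of Bayes factors described above can be sketched in a few lines. A Bayes factor is the ratio of the marginal likelihoods of two hypotheses, so on the log scale it is a simple difference, often reported as 2 ln BF. The sketch below is an illustration only and not the authors' pipeline: the gene names and log marginal likelihoods are invented, the function names are hypothetical, and the cutoff of 10 on the 2 ln BF scale is borrowed from a commonly cited rule of thumb for "very strong" support rather than from this study.

```python
def two_ln_bf(log_ml_a, log_ml_b):
    """Return 2 ln BF for topology A over topology B from log marginal likelihoods."""
    return 2.0 * (log_ml_a - log_ml_b)


def influential_genes(log_mls, threshold=10.0):
    """List genes whose support for either topology exceeds the threshold.

    log_mls maps gene name -> (log marginal likelihood under A, under B).
    The threshold is an assumed convention, not a value taken from the study.
    """
    flagged = []
    for gene, (log_ml_a, log_ml_b) in log_mls.items():
        if abs(two_ln_bf(log_ml_a, log_ml_b)) > threshold:
            flagged.append(gene)
    return flagged


# Invented values for three hypothetical genes.
example = {
    "gene1": (-1000.0, -1001.5),  # weak preference for topology A
    "gene2": (-2500.0, -2440.0),  # very strong preference for topology B
    "gene3": (-800.0, -800.2),    # essentially uninformative
}
print(influential_genes(example))
```

A gene such as the hypothetical gene2, which alone contributes far more than the rest combined, is the kind of locus the abstract suggests inspecting for problems such as unrecognized paralogy before trusting the combined result.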
Phylotranscriptomic consolidation of the jawed vertebrate timetree.
Irisarri, Iker; Baurain, Denis; Brinkmann, Henner; Delsuc, Frédéric; Sire, Jean-Yves; Kupfer, Alexander; Petersen, Jörn; Jarek, Michael; Meyer, Axel; Vences, Miguel; Philippe, Hervé
2017-09-01
Phylogenomics is extremely powerful but introduces new challenges as no agreement exists on "standards" for data selection, curation and tree inference. We use jawed vertebrates (Gnathostomata) as model to address these issues. Despite considerable efforts in resolving their evolutionary history and macroevolution, few studies have included a full phylogenetic diversity of gnathostomes and some relationships remain controversial. We tested a novel bioinformatic pipeline to assemble large and accurate phylogenomic datasets from RNA sequencing and find this phylotranscriptomic approach successful and highly cost-effective. Increased sequencing effort up to ca. 10Gbp allows recovering more genes, but shallower sequencing (1.5Gbp) is sufficient to obtain thousands of full-length orthologous transcripts. We reconstruct a robust and strongly supported timetree of jawed vertebrates using 7,189 nuclear genes from 100 taxa, including 23 new transcriptomes from previously unsampled key species. Gene jackknifing of genomic data corroborates the robustness of our tree and allows calculating genome-wide divergence times by overcoming gene sampling bias. Mitochondrial genomes prove insufficient to resolve the deepest relationships because of limited signal and among-lineage rate heterogeneity. Our analyses emphasize the importance of large curated nuclear datasets to increase the accuracy of phylogenomics and provide a reference framework for the evolutionary history of jawed vertebrates.
Evaluating Phylogenetic Congruence in the Post-Genomic Era
Leigh, Jessica W.; Lapointe, François-Joseph; Lopez, Philippe; Bapteste, Eric
2011-01-01
Congruence is a broadly applied notion in evolutionary biology used to justify multigene phylogeny or phylogenomics, as well as in studies of coevolution, lateral gene transfer, and as evidence for common descent. Existing methods for identifying incongruence or heterogeneity using character data were designed for data sets that are both small and expected to be rarely incongruent. At the same time, methods that assess incongruence using comparison of trees test a null hypothesis of uncorrelated tree structures, which may be inappropriate for phylogenomic studies. As such, they are ill-suited for the growing number of available genome sequences, most of which are from prokaryotes and viruses, either for phylogenomic analysis or for studies of the evolutionary forces and events that have shaped these genomes. Specifically, many existing methods scale poorly with large numbers of genes, cannot accommodate high levels of incongruence, and do not adequately model patterns of missing taxa for different markers. We propose the development of novel incongruence assessment methods suitable for the analysis of the molecular evolution of the vast majority of life and support the investigation of homogeneity of evolutionary process in cases where markers do not share identical tree structures. PMID:21712432
Phylogenomic detection and functional prediction of genes potentially important for plant meiosis.
Zhang, Luoyan; Kong, Hongzhi; Ma, Hong; Yang, Ji
2018-02-15
Meiosis is a specialized type of cell division necessary for sexual reproduction in eukaryotes. A better understanding of the cytological procedures of meiosis has been achieved by comprehensive cytogenetic studies in plants, while the genetic mechanisms regulating meiotic progression remain incompletely understood. The increasing accumulation of complete genome sequences and large-scale gene expression datasets has provided a powerful resource for phylogenomic inference and unsupervised identification of genes involved in plant meiosis. By integrating sequence homology and expression data, 164, 131, 124 and 162 genes potentially important for meiosis were identified in the genomes of Arabidopsis thaliana, Oryza sativa, Selaginella moellendorffii and Pogonatum aloides, respectively. The predicted genes were assigned to 45 meiotic GO terms, and their functions were related to different processes occurring during meiosis in various organisms. Most of the predicted meiotic genes underwent lineage-specific duplication events during plant evolution, with about 30% of the predicted genes retaining only a single copy in higher plant genomes. The results of this study provided clues to design experiments for better functional characterization of meiotic genes in plants, promoting the phylogenomic approach to the evolutionary dynamics of the plant meiotic machineries.
Gupta, Radhey S.; Lo, Brian; Son, Jeen
2018-01-01
The genus Mycobacterium contains 188 species including several major human pathogens as well as numerous other environmental species. We report here comprehensive phylogenomics and comparative genomic analyses on 150 genomes of Mycobacterium species to understand their interrelationships. Phylogenetic trees were constructed for the 150 species based on 1941 core proteins for the genus Mycobacterium, 136 core proteins for the phylum Actinobacteria and 8 other conserved proteins. Additionally, the overall genome similarity amongst the Mycobacterium species was determined based on average amino acid identity of the conserved protein families. The results from these analyses consistently support the existence of five distinct monophyletic groups within the genus Mycobacterium at the highest level, which are designated as the “Tuberculosis-Simiae,” “Terrae,” “Triviale,” “Fortuitum-Vaccae,” and “Abscessus-Chelonae” clades. Some of these clades have also been observed in earlier phylogenetic studies. Of these clades, the “Abscessus-Chelonae” clade forms the deepest branching lineage and does not form a monophyletic grouping with the “Fortuitum-Vaccae” clade of fast-growing species. In parallel, our comparative analyses of proteins from mycobacterial genomes have identified 172 molecular signatures in the form of conserved signature indels and conserved signature proteins, which are uniquely shared by either all Mycobacterium species or by members of the five identified clades. The identified molecular signatures (or synapomorphies) provide strong independent evidence for the monophyly of the genus Mycobacterium and the five described clades and they provide reliable means for the demarcation of these clades and for their diagnostics. 
Based on the results of our comprehensive phylogenomic analyses and numerous identified molecular signatures, which consistently and strongly support the division of known mycobacterial species into the five described clades, we propose here division of the genus Mycobacterium into an emended genus Mycobacterium encompassing the “Tuberculosis-Simiae” clade, which includes all of the major human pathogens, and four novel genera viz. Mycolicibacterium gen. nov., Mycolicibacter gen. nov., Mycolicibacillus gen. nov. and Mycobacteroides gen. nov. corresponding to the “Fortuitum-Vaccae,” “Terrae,” “Triviale,” and “Abscessus-Chelonae” clades, respectively. With the division of mycobacterial species into these five distinct groups, attention can now be focused on unique genetic and molecular characteristics that differentiate members of these groups. PMID:29497402
Blom, Mozes P K; Bragg, Jason G; Potter, Sally; Moritz, Craig
2017-05-01
Accurate gene tree inference is an important aspect of species tree estimation in a summary-coalescent framework. Yet, in empirical studies, inferred gene trees differ in accuracy due to stochastic variation in phylogenetic signal between targeted loci. Empiricists should, therefore, examine the consistency of species tree inference, while accounting for the observed heterogeneity in gene tree resolution of phylogenomic data sets. Here, we assess the impact of gene tree estimation error on summary-coalescent species tree inference by screening ~2000 exonic loci based on gene tree resolution prior to phylogenetic inference. We focus on a phylogenetically challenging radiation of Australian lizards (genus Cryptoblepharus, Scincidae) and explore effects on topology and support. We identify a well-supported topology based on all loci and find that a relatively small number of high-resolution gene trees can be sufficient to converge on the same topology. Adding gene trees with decreasing resolution produced a generally consistent topology, and increased confidence for specific bipartitions that were poorly supported when using a small number of informative loci. This corroborates coalescent-based simulation studies that have highlighted the need for a large number of loci to confidently resolve challenging relationships and refutes the notion that low-resolution gene trees introduce phylogenetic noise. Further, our study also highlights the value of quantifying changes in nodal support across locus subsets of increasing size (but decreasing gene tree resolution). Such detailed analyses can reveal anomalous fluctuations in support at some nodes, suggesting the possibility of model violation. By characterizing the heterogeneity in phylogenetic signal among loci, we can account for uncertainty in gene tree estimation and assess its effect on the consistency of the species tree estimate.
We suggest that the evaluation of gene tree resolution should be incorporated in the analysis of empirical phylogenomic data sets. This will ultimately increase our confidence in species tree estimation using summary-coalescent methods and enable us to exploit genomic data for phylogenetic inference. [Coalescence; concatenation; Cryptoblepharus; exon capture; gene tree; phylogenomics; species tree.]
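Screening loci by gene tree resolution, as this abstract describes, might look like the sketch below. The abstract does not state how resolution was measured, so the definition used here is an assumption: the fraction of possible non-root internal nodes present in a rooted tree. Trees are written as nested tuples of leaf names, and the locus names are hypothetical.

```python
def count_leaves(tree):
    """Number of leaves in a tree given as nested tuples of leaf-name strings."""
    if isinstance(tree, str):
        return 1
    return sum(count_leaves(child) for child in tree)


def count_internal(tree):
    """Number of internal nodes, including the root."""
    if isinstance(tree, str):
        return 0
    return 1 + sum(count_internal(child) for child in tree)


def resolution(tree):
    """Assumed measure: (internal nodes - 1) / (leaves - 2), from 0 (star) to 1 (binary)."""
    n = count_leaves(tree)
    if n < 3:
        return 1.0  # trees with fewer than three leaves cannot be unresolved
    return (count_internal(tree) - 1) / (n - 2)


def screen_by_resolution(gene_trees, threshold):
    """Return the sorted names of loci whose gene tree meets the resolution threshold."""
    return sorted(name for name, tree in gene_trees.items()
                  if resolution(tree) >= threshold)


# Hypothetical gene trees: fully resolved, partly resolved, and a star tree.
gene_trees = {
    "locusA": (("A", "B"), ("C", "D")),
    "locusB": (("A", "B"), "C", "D"),
    "locusC": ("A", "B", "C", "D"),
}
kept = screen_by_resolution(gene_trees, 0.5)
print(kept)
```

Lowering the threshold step by step and re-estimating the species tree at each step would reproduce the style of analysis in the abstract, where loci are added in order of decreasing resolution.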
Sorenson, Laurie; Santini, Francesco
2013-01-01
Ray-finned fishes constitute the dominant radiation of vertebrates with over 32,000 species. Although molecular phylogenetics has begun to disentangle major evolutionary relationships within this vast section of the Tree of Life, there is no widely available approach for efficiently collecting phylogenomic data within fishes, leaving much of the enormous potential of massively parallel sequencing technologies for resolving major radiations in ray-finned fishes unrealized. Here, we provide a genomic perspective on longstanding questions regarding the diversification of major groups of ray-finned fishes through targeted enrichment of ultraconserved nuclear DNA elements (UCEs) and their flanking sequence. Our workflow efficiently and economically generates data sets that are orders of magnitude larger than those produced by traditional approaches and is well-suited to working with museum specimens. Analysis of the UCE data set recovers a well-supported phylogeny at both shallow and deep time-scales that supports a monophyletic relationship between Amia and Lepisosteus (Holostei) and reveals elopomorphs and then osteoglossomorphs to be the earliest diverging teleost lineages. Our approach additionally reveals that sequence capture of UCE regions and their flanking sequence offers enormous potential for resolving phylogenetic relationships within ray-finned fishes. PMID:23824177
Trajectories and Drivers of Genome Evolution in Surface-Associated Marine Phaeobacter
Sikorski, Johannes; Bunk, Boyke; Scheuner, Carmen; Meier-Kolthoff, Jan P; Spröer, Cathrin; Gram, Lone; Overmann, Jörg
2017-01-01
The extent of genome divergence and the evolutionary events leading to speciation of marine bacteria have mostly been studied for (locally) abundant, free-living groups. The genus Phaeobacter is found on different marine surfaces, seems to occupy geographically disjunct habitats, and is involved in different biotic interactions, and was therefore targeted in the present study. The analysis of the chromosomes of 32 closely related but geographically spread Phaeobacter strains revealed an exceptionally large, highly syntenic core genome. The flexible gene pool is constantly but slightly expanding across all Phaeobacter lineages. The horizontally transferred genes mostly originated from bacteria of the Roseobacter group and horizontal transfer most likely was mediated by gene transfer agents. No evidence for geographic isolation and habitat specificity of the different phylogenomic Phaeobacter clades was detected based on the sources of isolation. In contrast, the functional gene repertoire and physiological traits of different phylogenomic Phaeobacter clades were sufficiently distinct to suggest an adaptation to an associated lifestyle with algae, to additional nutrient sources, or toxic heavy metals. Our study reveals that the evolutionary trajectories of surface-associated marine bacteria can differ significantly from free-living marine bacteria or marine generalists. PMID:29194520
Comparative Phylogenomics Uncovers the Impact of Symbiotic Associations on Host Genome Evolution
Delaux, Pierre-Marc; Varala, Kranthi; Edger, Patrick P.; Coruzzi, Gloria M.; Pires, J. Chris; Ané, Jean-Michel
2014-01-01
Mutualistic symbioses between eukaryotes and beneficial microorganisms of their microbiome play an essential role in nutrition, protection against disease, and development of the host. However, the impact of beneficial symbionts on the evolution of host genomes remains poorly characterized. Here we used the independent loss of the most widespread plant–microbe symbiosis, arbuscular mycorrhization (AM), as a model to address this question. Using a large phenotypic approach and phylogenetic analyses, we present evidence that loss of AM symbiosis correlates with the loss of many symbiotic genes in the Arabidopsis lineage (Brassicales). Then, by analyzing the genome and/or transcriptomes of nine other phylogenetically divergent non-host plants, we show that this correlation occurred in a convergent manner in four additional plant lineages, demonstrating the existence of an evolutionary pattern specific to symbiotic genes. Finally, we use a global comparative phylogenomic approach to track this evolutionary pattern among land plants. Based on this approach, we identify a set of 174 highly conserved genes and demonstrate enrichment in symbiosis-related genes. Our findings are consistent with the hypothesis that beneficial symbionts maintain purifying selection on host gene networks during the evolution of entire lineages. PMID:25032823
Deep Whole-Genome Sequencing to Detect Mixed Infection of Mycobacterium tuberculosis
Gan, Mingyu; Liu, Qingyun; Yang, Chongguang; Gao, Qian; Luo, Tao
2016-01-01
Mixed infection by multiple Mycobacterium tuberculosis (MTB) strains is associated with poor treatment outcome of tuberculosis (TB). Traditional genotyping methods have been used to detect mixed infections of MTB; however, their sensitivity and resolution are limited. Deep whole-genome sequencing (WGS) has proved highly sensitive and discriminative for studying population heterogeneity of MTB. Here, we developed a phylogenetic-based method to detect MTB mixed infections using WGS data. We collected published WGS data of 782 global MTB strains from public databases. We called homogeneous and heterogeneous single nucleotide variations (SNVs) of individual strains by mapping short reads to the ancestral MTB reference genome. We constructed a phylogenomic database based on 68,639 homogeneous SNVs of 652 MTB strains. Mixed infections were determined if multiple evolutionary paths were identified by mapping the SNVs of individual samples to the phylogenomic database. By simulation, our method could specifically detect mixed infections when the sequencing depth of minor strains was as low as 1× coverage, and when the genomic distance of two mixed strains was as small as 16 SNVs. By applying our methods to all 782 samples, we detected 47 mixed infections and 45 of them were caused by locally endemic strains. The results indicate that our method is highly sensitive and discriminative for identifying mixed infections from deep WGS data of MTB isolates. PMID:27391214
Bond, Jason E; Garrison, Nicole L; Hamilton, Chris A; Godwin, Rebecca L; Hedin, Marshal; Agnarsson, Ingi
2014-08-04
Spiders represent an ancient predatory lineage known for their extraordinary biomaterials, including venoms and silks. These adaptations make spiders key arthropod predators in most terrestrial ecosystems. Despite ecological, biomedical, and biomaterial importance, relationships among major spider lineages remain unresolved or poorly supported. Current working hypotheses for a spider "backbone" phylogeny are largely based on morphological evidence, as most molecular markers currently employed are generally inadequate for resolving deeper-level relationships. We present here a phylogenomic analysis of spiders including taxa representing all major spider lineages. Our robust phylogenetic hypothesis recovers some fundamental and uncontroversial spider clades, but rejects the prevailing paradigm of a monophyletic Orbiculariae, the most diverse lineage, containing orb-weaving spiders. Based on our results, the orb web either evolved much earlier than previously hypothesized and is ancestral for a majority of spiders or else it has multiple independent origins, as hypothesized by precladistic authors. Cribellate deinopoid orb weavers that use mechanically adhesive silk are more closely related to a diverse clade of mostly webless spiders than to the araneoid orb-weaving spiders that use adhesive droplet silks. The fundamental shift in our understanding of spider phylogeny proposed here has broad implications for interpreting the evolution of spiders, their remarkable biomaterials, and a key extended phenotype, the spider web.
Supertrees Based on the Subtree Prune-and-Regraft Distance
Whidden, Christopher; Zeh, Norbert; Beiko, Robert G.
2014-01-01
Supertree methods reconcile a set of phylogenetic trees into a single structure that is often interpreted as a branching history of species. A key challenge is combining conflicting evolutionary histories that are due to artifacts of phylogenetic reconstruction and phenomena such as lateral gene transfer (LGT). Many supertree approaches use optimality criteria that do not reflect underlying processes, have known biases, and may be unduly influenced by LGT. We present the first method to construct supertrees by using the subtree prune-and-regraft (SPR) distance as an optimality criterion. Although calculating the rooted SPR distance between a pair of trees is NP-hard, our new maximum agreement forest-based methods can reconcile trees with hundreds of taxa and > 50 transfers in fractions of a second, which enables repeated calculations during the course of an iterative search. Our approach can accommodate trees in which uncertain relationships have been collapsed to multifurcating nodes. Using a series of benchmark datasets simulated under plausible rates of LGT, we show that SPR supertrees are more similar to correct species histories than supertrees based on parsimony or Robinson–Foulds distance criteria. We successfully constructed an SPR supertree from a phylogenomic dataset of 40,631 gene trees that covered 244 genomes representing several major bacterial phyla. Our SPR-based approach also allowed direct inference of highways of gene transfer between bacterial classes and genera. A small number of these highways connect genera in different phyla and can highlight specific genes implicated in long-distance LGT. [Lateral gene transfer; matrix representation with parsimony; phylogenomics; prokaryotic phylogeny; Robinson–Foulds; subtree prune-and-regraft; supertrees.] PMID:24695589
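For readers unfamiliar with the Robinson–Foulds criterion that this abstract uses as a point of comparison, the sketch below computes the rooted (clade-based) variant of that distance. It does not implement the SPR distance or the authors' maximum agreement forest method. Trees are assumed to be nested tuples of leaf names over the same leaf set, and all names are illustrative.

```python
def clades(tree):
    """Return the non-trivial clades of a rooted tree as a set of frozensets.

    The tree is a nested tuple whose leaves are strings. Single leaves and the
    full leaf set are excluded because every tree on the same taxa shares them.
    """
    found = set()

    def collect(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(collect(child) for child in node))
        found.add(members)
        return members

    all_leaves = collect(tree)
    found.discard(all_leaves)
    return found


def rf_distance(tree_a, tree_b):
    """Rooted Robinson-Foulds distance: clades present in exactly one tree."""
    return len(clades(tree_a) ^ clades(tree_b))


# Illustrative four-taxon trees.
t1 = ((("A", "B"), "C"), "D")
t2 = ((("A", "C"), "B"), "D")
t3 = (("A", "B"), ("C", "D"))

print(rf_distance(t1, t1))  # 0
print(rf_distance(t1, t2))  # 2
print(rf_distance(t1, t3))  # 2
```

Moving a single subtree can change many clades at once, so a single transfer event may produce a large Robinson–Foulds distance. That sensitivity is one reason a transfer-aware criterion such as SPR distance can be attractive when LGT is expected.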
Bowden, Deborah L; Vargas-Caro, Carolina; Ovenden, Jennifer R; Bennett, Michael B; Bustamante, Carlos
2016-11-01
The complete mitochondrial genome of the grey nurse shark Carcharias taurus is described from 25 963 828 sequences obtained using Illumina NGS technology. Total length of the mitogenome is 16 715 bp, consisting of 2 rRNAs, 13 protein-coding regions, 22 tRNAs and 2 non-coding regions, thus updating the previously published mitogenome for this species. The phylogenomic reconstruction inferred from the mitogenome of 15 species of Lamniform and Carcharhiniform sharks supports the inclusion of C. taurus in a clade with the Lamnidae and Cetorhinidae. This complete mitogenome contributes to ongoing investigation into the monophyly of the Family Odontaspididae.
Complete genome of Cobetia marina JCM 21022T and phylogenomic analysis of the family Halomonadaceae
NASA Astrophysics Data System (ADS)
Tang, Xianghai; Xu, Kuipeng; Han, Xiaojuan; Mo, Zhaolan; Mao, Yunxiang
2018-03-01
Cobetia marina is a model proteobacterium in research on marine biofouling. Its taxonomic nomenclature has been revised many times over the past few decades. To better understand the role of the surface-associated lifestyle of C. marina and the phylogeny of the family Halomonadaceae, we sequenced the entire genome of C. marina JCM 21022T using single molecule real-time sequencing technology (SMRT) and performed comparative genomics and phylogenomics analyses. The circular chromosome was 4 176 300 bp with an average GC content of 62.44% and contained 3 611 predicted coding sequences, 72 tRNA genes, and 21 rRNA genes. The C. marina JCM 21022T genome contained a set of crucial genes involved in surface colonization processes. The comparative genome analysis indicated that the significant differences between C. marina JCM 21022T and Cobetia amphilecti KMM 296 (formerly named C. marina KMM 296) resulted from sequence insertions or deletions and chromosomal recombination. Despite these differences, pan and core genome analysis showed similar gene functions between the two strains. The phylogenomic study of the family Halomonadaceae is reported here for the first time. We found that the relationships were well resolved among all genera tested, including Chromohalobacter, Halomonas, Cobetia, Kushneria, Zymobacter, and Halotalea.
Czech, Laura; Hermann, Lucas; Stöveken, Nadine; Richter, Alexandra A.; Smits, Sander H. J.; Heider, Johann
2018-01-01
Fluctuations in environmental osmolarity are ubiquitous stress factors in many natural habitats of microorganisms, as they inevitably trigger osmotically instigated fluxes of water across the semi-permeable cytoplasmic membrane. Under hyperosmotic conditions, many microorganisms fend off the detrimental effects of water efflux and the ensuing dehydration of the cytoplasm and drop in turgor through the accumulation of a restricted class of organic osmolytes, the compatible solutes. Ectoine and its derivative 5-hydroxyectoine are prominent members of these compounds and are synthesized widely by members of the Bacteria and a few Archaea and Eukarya in response to high salinity/osmolarity and/or growth temperature extremes. Ectoines have excellent function-preserving properties, attributes that have led to their description as chemical chaperones and fostered the development of an industrial-scale biotechnological production process for their exploitation in biotechnology, skin care, and medicine. We review, here, the current knowledge on the biochemistry of the ectoine/hydroxyectoine biosynthetic enzymes and the available crystal structures of some of them, explore the genetics of the underlying biosynthetic genes and their transcriptional regulation, and present an extensive phylogenomic analysis of the ectoine/hydroxyectoine biosynthetic genes. In addition, we address the biochemistry, phylogenomics, and genetic regulation for the alternative use of ectoines as nutrients. PMID:29565833
Ghatak, Sandeep; Blom, Jochen; Das, Samir; Sanjukta, Rajkumari; Puro, Kekungu; Mawlong, Michael; Shakuntala, Ingudam; Sen, Arnab; Goesmann, Alexander; Kumar, Ashok; Ngachan, S V
2016-07-01
Aeromonas species are important pathogens of fishes and aquatic animals capable of infecting humans and other animals via food. Due to the paucity of pan-genomic studies on aeromonads, the present study was undertaken to analyse the pan-genome of three clinically important Aeromonas species (A. hydrophila, A. veronii, A. caviae). Results of pan-genome analysis revealed an open pan-genome for all three species with pan-genome sizes of 9181, 7214 and 6884 genes for A. hydrophila, A. veronii and A. caviae, respectively. Core-genome: pan-genome ratio (RCP) indicated greater genomic diversity for A. hydrophila and interestingly RCP emerged as an effective indicator to gauge genomic diversity which could possibly be extended to other organisms too. Phylogenomic network analysis highlighted the influence of homologous recombination and lateral gene transfer in the evolution of Aeromonas spp. Prediction of virulence factors indicated no significant difference among the three species though analysis of pathogenic potential and acquired antimicrobial resistance genes revealed greater hazards from A. hydrophila. In conclusion, the present study highlighted the usefulness of whole genome analyses to infer evolutionary cues for Aeromonas species which indicated considerable phylogenomic diversity for A. hydrophila and hitherto unknown genomic evidence for pathogenic potential of A. hydrophila compared to A. veronii and A. caviae.
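The core-genome to pan-genome ratio (RCP) mentioned in this abstract can be illustrated with gene presence sets. The abstract does not give the formula, so this sketch assumes RCP is the number of genes shared by every genome divided by the number of distinct genes across all genomes. The strain and gene names are hypothetical.

```python
def core_pan_ratio(genomes):
    """Return (core size, pan size, core/pan ratio) for a dict of gene sets.

    genomes maps a strain name to the set of gene families it carries.
    Core = genes present in every strain; pan = genes present in any strain.
    """
    gene_sets = [set(genes) for genes in genomes.values()]
    if not gene_sets:
        raise ValueError("at least one genome is required")
    core = set.intersection(*gene_sets)
    pan = set.union(*gene_sets)
    if not pan:
        raise ValueError("genomes contain no genes")
    return len(core), len(pan), len(core) / len(pan)


# Hypothetical gene content for three strains.
strains = {
    "strain1": {"geneA", "geneB", "geneC"},
    "strain2": {"geneA", "geneB", "geneD"},
    "strain3": {"geneA", "geneB", "geneC", "geneE"},
}
core_size, pan_size, rcp = core_pan_ratio(strains)
print(core_size, pan_size, rcp)
```

Under this assumed definition, a lower ratio means a smaller shared core relative to the total gene pool, which is one way to read the abstract's statement that RCP gauges genomic diversity.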
Phylogenomic Data Yield New and Robust Insights into the Phylogeny and Evolution of Weevils.
Shin, Seunggwan; Clarke, Dave J; Lemmon, Alan R; Moriarty Lemmon, Emily; Aitken, Alexander L; Haddad, Stephanie; Farrell, Brian D; Marvaldi, Adriana E; Oberprieler, Rolf G; McKenna, Duane D
2018-04-01
The phylogeny and evolution of weevils (the beetle superfamily Curculionoidea) have been extensively studied, but many relationships, especially in the large family Curculionidae (true weevils; > 50,000 species), remain uncertain. We used phylogenomic methods to obtain DNA sequences from 522 protein-coding genes for representatives of all families of weevils and all subfamilies of Curculionidae. Most of our phylogenomic results had strong statistical support, and the inferred relationships were generally congruent with those reported in previous studies, but with some interesting exceptions. Notably, the backbone relationships of the weevil phylogeny were consistently strongly supported, and the former Nemonychidae (pine flower snout beetles) were polyphyletic, with the subfamily Cimberidinae (here elevated to Cimberididae) placed as sister group of all other weevils. The clade comprising the sister families Brentidae (straight-snouted weevils) and Curculionidae was maximally supported and the composition of both families was firmly established. The contributions of substitution modeling, codon usage and/or mutational bias to differences between trees reconstructed from amino acid and nucleotide sequences were explored. A reconstructed timetree for weevils is consistent with a Mesozoic radiation of gymnosperm-associated taxa to form most extant families and diversification of Curculionidae alongside flowering plants (first monocots, then other groups) beginning in the Cretaceous.
Genomic characterization reconfirms the taxonomic status of Lactobacillus parakefiri
TANIZAWA, Yasuhiro; KOBAYASHI, Hisami; KAMINUMA, Eli; SAKAMOTO, Mitsuo; OHKUMA, Moriya; NAKAMURA, Yasukazu; ARITA, Masanori; TOHNO, Masanori
2017-01-01
Whole-genome sequencing was performed for Lactobacillus parakefiri JCM 8573T to confirm its hitherto controversial taxonomic position. Here, we report its first reliable reference genome. Genome-wide metrics, such as average nucleotide identity and digital DNA-DNA hybridization, and phylogenomic analysis based on multiple genes supported its taxonomic status as a distinct species in the genus Lactobacillus. The availability of a reliable genome sequence will aid future investigations on the industrial applications of L. parakefiri in functional foods such as kefir grains. PMID:28748134
Phylogenomics and barcoding of Panax: toward the identification of ginseng species.
Manzanilla, V; Kool, A; Nguyen Nhat, L; Nong Van, H; Le Thi Thu, H; de Boer, H J
2018-04-03
The economic value of ginseng in the global medicinal plant trade is estimated to be in excess of US$2.1 billion. At the same time, the evolutionary placement of ginseng (Panax ginseng) and the complex evolutionary history of the genus are poorly understood despite several molecular phylogenetic studies. In this study, we use a full plastome phylogenomic framework to resolve relationships in Panax and to identify molecular markers for species discrimination. We used high-throughput sequencing of MBD2-Fc fractionated Panax DNA to supplement publicly available plastid genomes to create a phylogeny based on fully assembled and annotated plastid genomes from 60 accessions of 8 species. The plastome phylogeny based on a 163 kbp matrix resolves the sister relationship of Panax ginseng with P. quinquefolius. The closely related species P. vietnamensis is supported as sister of P. japonicus. The plastome matrix also shows that the markers trnC-rps16, trnS-trnG, and trnE-trnM could be used for unambiguous molecular identification of all the represented species in the genus. MBD2 depletion reduces the cost of plastome sequencing, which makes it a cost-effective alternative to Sanger sequencing based DNA barcoding for molecular identification. The plastome phylogeny provides a robust framework that can be used to study the evolution of morphological characters and biosynthesis pathways of ginsengosides for phylogenetic bioprospecting. Molecular identification of ginseng species is essential for authenticating ginseng in international trade and it provides an incentive for manufacturers to create authentic products with verified ingredients.
Andersson, Jan O; Sjögren, Åsa M; Horner, David S; Murphy, Colleen A; Dyal, Patricia L; Svärd, Staffan G; Logsdon, John M; Ragan, Mark A; Hirt, Robert P; Roger, Andrew J
2007-01-01
Background: Comparative genomic studies of the mitochondrion-lacking protist group Diplomonadida (diplomonads) have been lacking, although Giardia lamblia has been intensively studied. We have performed a sequence survey project resulting in 2341 expressed sequence tags (EST) corresponding to 853 unique clones, 5275 genome survey sequences (GSS), and eleven finished contigs from the diplomonad fish parasite Spironucleus salmonicida (previously described as S. barkhanus). Results: The analyses revealed a compact genome with few, if any, introns and very short 3' untranslated regions. Strikingly different patterns of codon usage were observed in genes corresponding to frequently sampled ESTs versus genes poorly sampled, indicating that translational selection is influencing the codon usage of highly expressed genes. Rigorous phylogenomic analyses identified 84 genes – mostly encoding metabolic proteins – that have been acquired by diplomonads or their relatively close ancestors via lateral gene transfer (LGT). Although most acquisitions were from prokaryotes, more than a dozen represent likely transfers of genes between eukaryotic lineages. Many genes that provide novel insights into the genetic basis of the biology and pathogenicity of this parasitic protist were identified, including 149 that putatively encode variant-surface cysteine-rich proteins which are candidate virulence factors. A number of genomic properties that distinguish S. salmonicida from its human parasitic relative G. lamblia were identified, such as nineteen putative lineage-specific gene acquisitions, distinct mutational biases and codon usage and distinct polyadenylation signals. Conclusion: Our results highlight the power of comparative genomic studies to yield insights into the biology of parasitic protists and the evolution of their genomes, and suggest that genetic exchange between distantly-related protist lineages may be occurring at an appreciable rate in eukaryote genome evolution.
PMID:17298675
Parks, Matthew B; Wickett, Norman J; Alverson, Andrew J
2018-01-01
Diatoms (Bacillariophyta) are a species-rich group of eukaryotic microbes diverse in morphology, ecology, and metabolism. Previous reconstructions of the diatom phylogeny based on one or a few genes have resulted in inconsistent resolution or low support for critical nodes. We applied phylogenetic paralog pruning techniques to a data set of 94 diatom genomes and transcriptomes to infer perennially difficult species relationships, using concatenation and summary-coalescent methods to reconstruct species trees from data sets spanning a wide range of thresholds for taxon and column occupancy in gene alignments. Conflicts between gene and species trees decreased with both increasing taxon occupancy and bootstrap cutoffs applied to gene trees. Concordance between gene and species trees was lowest for short internodes and increased logarithmically with increasing edge length, suggesting that incomplete lineage sorting disproportionately affects species tree inference at short internodes, which are a common feature of the diatom phylogeny. Although species tree topologies were largely consistent across many data treatments, concatenation methods appeared to outperform summary-coalescent methods for sparse alignments. Our results underscore that approaches to species-tree inference based on few loci are likely to be misled by unrepresentative sampling of gene histories, particularly in lineages that may have diversified rapidly. In addition, phylogenomic studies of diatoms, and potentially other hyperdiverse groups, should maximize the number of gene trees with high taxon occupancy, though there is clearly a limit to how many of these genes will be available. PMID:29040712
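The taxon-occupancy thresholds mentioned in the abstract above can be illustrated with a minimal sketch. It assumes each gene alignment is a mapping from taxon name to aligned sequence, and it treats a taxon as present when its sequence contains at least one character other than '-' or '?'. The function names, input layout, and threshold value are hypothetical and are not taken from the study.

```python
def taxon_occupancy(alignment, all_taxa):
    """Fraction of taxa with at least one non-missing character.

    alignment: dict mapping taxon name -> aligned sequence (assumed layout).
    all_taxa:  list of every taxon in the study.
    """
    present = sum(
        1
        for taxon in all_taxa
        if any(ch not in "-?" for ch in alignment.get(taxon, ""))
    )
    return present / len(all_taxa)


def filter_by_occupancy(gene_alignments, all_taxa, min_occupancy):
    """Keep only genes whose taxon occupancy meets the threshold."""
    return {
        gene: aln
        for gene, aln in gene_alignments.items()
        if taxon_occupancy(aln, all_taxa) >= min_occupancy
    }


# Toy data: four taxa, two genes (hypothetical).
taxa = ["A", "B", "C", "D"]
genes = {
    "gene1": {"A": "ACGT", "B": "AC-T", "C": "ACGA"},  # 3 of 4 taxa present
    "gene2": {"A": "ACGT", "B": "----"},               # 1 of 4 taxa present
}
kept = filter_by_occupancy(genes, taxa, min_occupancy=0.5)
print(sorted(kept))
```

Raising min_occupancy trades the number of retained genes against how completely each gene samples the taxa, which is the trade-off the abstract describes.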
Phylogenomic Insights into Mouse Evolution Using a Pseudoreference Approach
Sarver, Brice A.J.; Keeble, Sara; Cosart, Ted; Tucker, Priscilla K.; Dean, Matthew D.
2017-01-01
Comparative genomic studies are now possible across a broad range of evolutionary timescales, but the generation and analysis of genomic data across many different species still present a number of challenges. The most sophisticated genotyping and down-stream analytical frameworks are still predominantly based on comparisons to high-quality reference genomes. However, established genomic resources are often limited within a given group of species, necessitating comparisons to divergent reference genomes that could restrict or bias comparisons across a phylogenetic sample. Here, we develop a scalable pseudoreference approach to iteratively incorporate sample-specific variation into a genome reference and reduce the effects of systematic mapping bias in downstream analyses. To characterize this framework, we used targeted capture to sequence whole exomes (∼54 Mbp) in 12 lineages (ten species) of mice spanning the Mus radiation. We generated whole exome pseudoreferences for all species and show that this iterative reference-based approach improved basic genomic analyses that depend on mapping accuracy while preserving the associated annotations of the mouse reference genome. We then use these pseudoreferences to resolve evolutionary relationships among these lineages while accounting for phylogenetic discordance across the genome, contributing an important resource for comparative studies in the mouse system. We also describe patterns of genomic introgression among lineages and compare our results to previous studies. Our general approach can be applied to whole or partitioned genomic data and is easily portable to any system with sufficient genomic resources, providing a useful framework for phylogenomic studies in mice and other taxa. PMID:28338821
Large-scale phylogenomic analysis resolves a backbone phylogeny in ferns.
Shen, Hui; Jin, Dongmei; Shu, Jiang-Ping; Zhou, Xi-Le; Lei, Ming; Wei, Ran; Shang, Hui; Wei, Hong-Jin; Zhang, Rui; Liu, Li; Gu, Yu-Feng; Zhang, Xian-Chun; Yan, Yue-Hong
2018-02-01
Ferns, which originated about 360 million years ago, are the sister group of seed plants. Despite the remarkable progress in our understanding of fern phylogeny, with conflicting molecular evidence and different morphological interpretations, relationships among major fern lineages remain controversial. With the aim of obtaining a robust fern phylogeny, we carried out a large-scale phylogenomic analysis using high-quality transcriptome sequencing data, which covered 69 fern species from 38 families and 11 orders. Both coalescent-based and concatenation-based methods were applied to both nucleotide and amino acid sequences in species tree estimation. The resulting topologies are largely congruent with each other, except for the placement of Angiopteris fokiensis, Cheiropleuria bicuspis, Diplaziopsis brunoniana, Matteuccia struthiopteris, Elaphoglossum mcclurei, and Tectaria subpedata. Our results confirmed that Equisetales is sister to the rest of ferns, and Dennstaedtiaceae is sister to eupolypods. Moreover, our results strongly supported some relationships different from the current view of fern phylogeny, including that Marattiaceae may be sister to the monophyletic clade of Psilotaceae and Ophioglossaceae; that Gleicheniaceae and Hymenophyllaceae form a monophyletic clade sister to Dipteridaceae; and that Aspleniaceae is sister to the rest of the groups in eupolypods II. These results were interpreted with morphological traits, especially sporangia characters, and a new evolutionary route of sporangial annulus in ferns was suggested. This backbone phylogeny in ferns sets a foundation for further studies in biology and evolution in ferns, and therefore in plants. © The Authors 2017. Published by Oxford University Press.
Large-scale phylogenomic analysis resolves a backbone phylogeny in ferns
Shen, Hui; Jin, Dongmei; Shu, Jiang-Ping; Zhou, Xi-Le; Lei, Ming; Wei, Ran; Shang, Hui; Wei, Hong-Jin; Zhang, Rui; Liu, Li; Gu, Yu-Feng; Zhang, Xian-Chun; Yan, Yue-Hong
2018-01-01
Background Ferns, which originated about 360 million years ago, are the sister group of seed plants. Despite the remarkable progress in our understanding of fern phylogeny, with conflicting molecular evidence and different morphological interpretations, relationships among major fern lineages remain controversial. Results With the aim of obtaining a robust fern phylogeny, we carried out a large-scale phylogenomic analysis using high-quality transcriptome sequencing data, which covered 69 fern species from 38 families and 11 orders. Both coalescent-based and concatenation-based methods were applied to both nucleotide and amino acid sequences in species tree estimation. The resulting topologies are largely congruent with each other, except for the placement of Angiopteris fokiensis, Cheiropleuria bicuspis, Diplaziopsis brunoniana, Matteuccia struthiopteris, Elaphoglossum mcclurei, and Tectaria subpedata. Conclusions Our results confirmed that Equisetales is sister to the rest of ferns, and Dennstaedtiaceae is sister to eupolypods. Moreover, our results strongly supported some relationships different from the current view of fern phylogeny, including that Marattiaceae may be sister to the monophyletic clade of Psilotaceae and Ophioglossaceae; that Gleicheniaceae and Hymenophyllaceae form a monophyletic clade sister to Dipteridaceae; and that Aspleniaceae is sister to the rest of the groups in eupolypods II. These results were interpreted with morphological traits, especially sporangia characters, and a new evolutionary route of sporangial annulus in ferns was suggested. This backbone phylogeny in ferns sets a foundation for further studies in biology and evolution in ferns, and therefore in plants. PMID:29186447
Phylogenomic analyses of malaria parasites and evolution of their exported proteins
2011-01-01
Background Plasmodium falciparum is the most malignant agent of human malaria. It belongs to the taxon Laverania, which includes other ape-infecting Plasmodium species. The origin of the Laverania is still debated. P. falciparum exports pathogenicity-related proteins into the host cell using the Plasmodium export element (PEXEL). Predictions based on the presence of a PEXEL motif suggest that more than 300 proteins are exported by P. falciparum, while there are many fewer exported proteins in non-Laverania. Results A whole-genome approach was applied to resolve the phylogeny of eight Plasmodium species and four outgroup taxa. By using 218 orthologous proteins we received unanimous support for a sister group position of Laverania and avian malaria parasites. This observation was corroborated by the analyses of 28 exported proteins with orthologs present in all Plasmodium species. Most interestingly, several deviations from the P. falciparum PEXEL motif were found to be present in the orthologous sequences of non-Laverania. Conclusion Our phylogenomic analyses strongly support the hypotheses that the Laverania have been founded by a single Plasmodium species switching from birds to African great apes or vice versa. The deviations from the canonical PEXEL motif in orthologs may explain the comparably low number of exported proteins that have been predicted in non-Laverania. PMID:21676252
Kück, Patrick; Struck, Torsten H
2014-01-01
BaCoCa (BAse COmposition CAlculator) is user-friendly software that combines multiple statistical approaches (like RCFV and C value calculations) to identify biases in aligned sequence data which potentially mislead phylogenetic reconstructions. As a result of its speed and flexibility, the program provides the possibility to analyze hundreds of pre-defined gene partitions and taxon subsets in a single run. BaCoCa is command-line driven and can be easily integrated into automatic process pipelines of phylogenomic studies. Moreover, given the tab-delimited output style, the results can be easily used for further analyses in programs like Excel or statistical packages like R. A built-in option of BaCoCa is the generation of heat maps with hierarchical clustering of certain results using R. As input files BaCoCa can handle FASTA and relaxed PHYLIP, which are commonly used in phylogenomic pipelines. BaCoCa is implemented in Perl and works on Windows PCs, Macs and Linux operating systems. The executable source code as well as example test files and detailed documentation of BaCoCa are freely available at http://software.zfmk.de. Copyright © 2013 Elsevier Inc. All rights reserved.
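As an illustration of the kind of compositional statistic named in the abstract above, the following sketch computes RCFV (relative composition frequency variability) for a nucleotide alignment. It assumes the commonly cited definition: the sum, over taxa and characters, of the absolute difference between a taxon's character frequency and the mean frequency across taxa, divided by the number of taxa. This is an independent Python sketch, not BaCoCa's Perl implementation; the function name and input layout are hypothetical.

```python
from collections import Counter


def rcfv(alignment, alphabet="ACGT"):
    """RCFV under the assumed definition:
    sum_i sum_j |freq_ij - mean_j| / n, for n taxa i and characters j.

    alignment: dict mapping taxon name -> aligned sequence.
    Characters outside `alphabet` (gaps, ambiguity codes) are ignored.
    """
    n = len(alignment)
    freqs = []
    for taxon, seq in alignment.items():
        counts = Counter(ch for ch in seq.upper() if ch in alphabet)
        total = sum(counts.values())
        if total == 0:
            raise ValueError(f"taxon {taxon!r} has no scorable characters")
        freqs.append({ch: counts[ch] / total for ch in alphabet})
    # Mean frequency of each character across taxa.
    mean = {ch: sum(f[ch] for f in freqs) / n for ch in alphabet}
    return sum(abs(f[ch] - mean[ch]) for f in freqs for ch in alphabet) / n


# Identical compositions give 0; maximally different ones give larger values.
print(rcfv({"t1": "AACC", "t2": "CACA"}))  # 0.0
print(rcfv({"t1": "AAAA", "t2": "CCCC"}))  # 1.0
```

Higher values indicate stronger compositional heterogeneity among taxa, the type of bias the abstract says can mislead phylogenetic reconstruction.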
Defining the phylogenomics of Shigella species: a pathway to diagnostics.
Sahl, Jason W; Morris, Carolyn R; Emberger, Jennifer; Fraser, Claire M; Ochieng, John Benjamin; Juma, Jane; Fields, Barry; Breiman, Robert F; Gilmour, Matthew; Nataro, James P; Rasko, David A
2015-03-01
Shigellae cause significant diarrheal disease and mortality in humans, as there are approximately 163 million episodes of shigellosis and 1.1 million deaths annually. While significant strides have been made in the understanding of the pathogenesis, few studies on the genomic content of the Shigella species have been completed. The goal of this study was to characterize the genomic diversity of Shigella species through sequencing of 55 isolates representing members of each of the four Shigella species: S. flexneri, S. sonnei, S. boydii, and S. dysenteriae. Phylogeny inferred from 336 available Shigella and Escherichia coli genomes defined exclusive clades of Shigella; conserved genomic markers that can identify each clade were then identified. PCR assays were developed for each clade-specific marker, which was combined with an amplicon for the conserved Shigella invasion antigen, IpaH3, into a multiplex PCR assay. This assay demonstrated high specificity, correctly identifying 218 of 221 presumptive Shigella isolates, and sensitivity, by not identifying any of 151 diverse E. coli isolates incorrectly as Shigella. This new phylogenomics-based PCR assay represents a valuable tool for rapid typing of uncharacterized Shigella isolates and provides a framework that can be utilized for the identification of novel genomic markers from genomic data. Copyright © 2015, American Society for Microbiology. All Rights Reserved.
Defining the Phylogenomics of Shigella Species: a Pathway to Diagnostics
Sahl, Jason W.; Morris, Carolyn R.; Emberger, Jennifer; Fraser, Claire M.; Ochieng, John Benjamin; Juma, Jane; Fields, Barry; Breiman, Robert F.; Gilmour, Matthew; Nataro, James P.
2015-01-01
Shigellae cause significant diarrheal disease and mortality in humans, as there are approximately 163 million episodes of shigellosis and 1.1 million deaths annually. While significant strides have been made in the understanding of the pathogenesis, few studies on the genomic content of the Shigella species have been completed. The goal of this study was to characterize the genomic diversity of Shigella species through sequencing of 55 isolates representing members of each of the four Shigella species: S. flexneri, S. sonnei, S. boydii, and S. dysenteriae. Phylogeny inferred from 336 available Shigella and Escherichia coli genomes defined exclusive clades of Shigella; conserved genomic markers that can identify each clade were then identified. PCR assays were developed for each clade-specific marker, which was combined with an amplicon for the conserved Shigella invasion antigen, IpaH3, into a multiplex PCR assay. This assay demonstrated high specificity, correctly identifying 218 of 221 presumptive Shigella isolates, and sensitivity, by not identifying any of 151 diverse E. coli isolates incorrectly as Shigella. This new phylogenomics-based PCR assay represents a valuable tool for rapid typing of uncharacterized Shigella isolates and provides a framework that can be utilized for the identification of novel genomic markers from genomic data. PMID:25588655
Machado, Lilian de Oliveira; Vieira, Leila do Nascimento; Stefenon, Valdir Marcos; Oliveira Pedrosa, Fábio de; Souza, Emanuel Maltempi de; Guerra, Miguel Pedro; Nodari, Rubens Onofre
2017-04-01
Given their distribution, importance, and richness, Myrtaceae species comprise a model system for studying the evolution of tropical plant diversity. In addition, chloroplast (cp) genome sequencing is an efficient tool for phylogenetic relationship studies. Feijoa [Acca sellowiana (O. Berg) Burret; CN: pineapple-guava] is a Myrtaceae species that occurs naturally in southern Brazil and northern Uruguay. Feijoa is known for its exquisite perfume and flavorful fruits, pharmacological properties, ornamental value and increasing economic relevance. In the present work, we report the complete cp genome of feijoa. The feijoa cp genome is a circular molecule of 159,370 bp with a quadripartite structure containing two single copy regions, a Large Single Copy region (LSC 88,028 bp) and a Small Single Copy region (SSC 18,598 bp), separated by Inverted Repeat regions (IRs 26,372 bp). The genome structure, gene order, GC content and codon usage are similar to those of typical angiosperm cp genomes. When compared to other cp genome sequences of Myrtaceae, feijoa showed the closest relationship with pitanga (Eugenia uniflora L.). Furthermore, a comparison of pitanga synonymous (Ks) and nonsynonymous (Ka) substitution rates revealed extremely low values. Maximum Likelihood and Bayesian Inference analyses produced phylogenomic trees identical in topology. These trees supported monophyly of three Myrtoideae clades.
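The region sizes reported in the abstract above can be checked against the stated total with a short calculation. The sketch assumes that the 26,372 bp figure is the length of each of the two inverted repeat copies, which is the reading under which the parts add up to the reported genome size.

```python
# Region lengths in bp, as reported in the abstract.
lsc = 88_028   # Large Single Copy region
ssc = 18_598   # Small Single Copy region
ir = 26_372    # one Inverted Repeat copy (assumed to be the per-copy length)

# A quadripartite chloroplast genome has LSC + SSC + two IR copies.
total = lsc + ssc + 2 * ir
print(total)  # 159370
```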
Spider phylogenomics: untangling the Spider Tree of Life.
Garrison, Nicole L; Rodriguez, Juanita; Agnarsson, Ingi; Coddington, Jonathan A; Griswold, Charles E; Hamilton, Christopher A; Hedin, Marshal; Kocot, Kevin M; Ledford, Joel M; Bond, Jason E
2016-01-01
Spiders (Order Araneae) are massively abundant generalist arthropod predators that are found in nearly every ecosystem on the planet and have persisted for over 380 million years. Spiders have long served as evolutionary models for studying complex mating and web spinning behaviors, key innovation and adaptive radiation hypotheses, and have been inspiration for important theories like sexual selection by female choice. Unfortunately, past major attempts to reconstruct spider phylogeny typically employing the "usual suspect" genes have been unable to produce a well-supported phylogenetic framework for the entire order. To further resolve spider evolutionary relationships we have assembled a transcriptome-based data set comprising 70 ingroup spider taxa. Using maximum likelihood and shortcut coalescence-based approaches, we analyze eight data sets, the largest of which contains 3,398 gene regions and 696,652 amino acid sites forming the largest phylogenomic analysis of spider relationships produced to date. Contrary to long held beliefs that the orb web is the crowning achievement of spider evolution, ancestral state reconstructions of web type support a phylogenetically ancient origin of the orb web, and diversification analyses show that the mostly ground-dwelling, web-less RTA clade diversified faster than orb weavers. Consistent with molecular dating estimates we report herein, this may reflect a major increase in biomass of non-flying insects during the Cretaceous Terrestrial Revolution 125-90 million years ago favoring diversification of spiders that feed on cursorial rather than flying prey. Our results also have major implications for our understanding of spider systematics. Phylogenomic analyses corroborate several well-accepted high level groupings: Opisthothele, Mygalomorphae, Atypoidina, Avicularoidea, Theraphosoidina, Araneomorphae, Entelegynae, Araneoidea, the RTA clade, Dionycha and the Lycosoidea. 
Alternatively, our results challenge the monophyly of Eresoidea, Orbiculariae, and Deinopoidea. The composition of the major paleocribellate and neocribellate clades, the basal divisions of Araneomorphae, appear to be falsified. Traditional Haplogynae is in need of revision, as our findings appear to support the newly conceived concept of Synspermiata. The sister pairing of filistatids with hypochilids implies that some peculiar features of each family may in fact be synapomorphic for the pair. Leptonetids now are seen as a possible sister group to the Entelegynae, illustrating possible intermediates in the evolution of the more complex entelegyne genitalic condition, spinning organs and respiratory organs.
Spider phylogenomics: untangling the Spider Tree of Life
Garrison, Nicole L.; Rodriguez, Juanita; Agnarsson, Ingi; Coddington, Jonathan A.; Griswold, Charles E.; Hamilton, Christopher A.; Hedin, Marshal; Kocot, Kevin M.; Ledford, Joel M.
2016-01-01
Spiders (Order Araneae) are massively abundant generalist arthropod predators that are found in nearly every ecosystem on the planet and have persisted for over 380 million years. Spiders have long served as evolutionary models for studying complex mating and web spinning behaviors, key innovation and adaptive radiation hypotheses, and have been inspiration for important theories like sexual selection by female choice. Unfortunately, past major attempts to reconstruct spider phylogeny typically employing the “usual suspect” genes have been unable to produce a well-supported phylogenetic framework for the entire order. To further resolve spider evolutionary relationships we have assembled a transcriptome-based data set comprising 70 ingroup spider taxa. Using maximum likelihood and shortcut coalescence-based approaches, we analyze eight data sets, the largest of which contains 3,398 gene regions and 696,652 amino acid sites forming the largest phylogenomic analysis of spider relationships produced to date. Contrary to long held beliefs that the orb web is the crowning achievement of spider evolution, ancestral state reconstructions of web type support a phylogenetically ancient origin of the orb web, and diversification analyses show that the mostly ground-dwelling, web-less RTA clade diversified faster than orb weavers. Consistent with molecular dating estimates we report herein, this may reflect a major increase in biomass of non-flying insects during the Cretaceous Terrestrial Revolution 125–90 million years ago favoring diversification of spiders that feed on cursorial rather than flying prey. Our results also have major implications for our understanding of spider systematics. Phylogenomic analyses corroborate several well-accepted high level groupings: Opisthothele, Mygalomorphae, Atypoidina, Avicularoidea, Theraphosoidina, Araneomorphae, Entelegynae, Araneoidea, the RTA clade, Dionycha and the Lycosoidea. 
Alternatively, our results challenge the monophyly of Eresoidea, Orbiculariae, and Deinopoidea. The composition of the major paleocribellate and neocribellate clades, the basal divisions of Araneomorphae, appear to be falsified. Traditional Haplogynae is in need of revision, as our findings appear to support the newly conceived concept of Synspermiata. The sister pairing of filistatids with hypochilids implies that some peculiar features of each family may in fact be synapomorphic for the pair. Leptonetids now are seen as a possible sister group to the Entelegynae, illustrating possible intermediates in the evolution of the more complex entelegyne genitalic condition, spinning organs and respiratory organs. PMID:26925338
Chavda, Kalyan D; Chen, Liang; Fouts, Derrick E; Sutton, Granger; Brinkac, Lauren; Jenkins, Stephen G; Bonomo, Robert A; Adams, Mark D; Kreiswirth, Barry N
2016-12-13
Knowledge regarding the genomic structure of Enterobacter spp., the second most prevalent carbapenemase-producing Enterobacteriaceae, remains limited. Here we sequenced 97 clinical Enterobacter species isolates that were both carbapenem susceptible and resistant from various geographic regions to decipher the molecular origins of carbapenem resistance and to understand the changing phylogeny of these emerging and drug-resistant pathogens. Of the carbapenem-resistant isolates, 30 possessed blaKPC-2, 40 had blaKPC-3, 2 had blaKPC-4, and 2 had blaNDM-1. Twenty-three isolates were carbapenem susceptible. Six genomes were sequenced to completion, and their sizes ranged from 4.6 to 5.1 Mbp. Phylogenomic analysis placed 96 of these genomes, 351 additional Enterobacter genomes downloaded from NCBI GenBank, and six newly sequenced type strains into 19 phylogenomic groups: 18 groups (A to R) in the Enterobacter cloacae complex, plus Enterobacter aerogenes. Diverse mechanisms underlying the molecular evolutionary trajectory of these drug-resistant Enterobacter spp. were revealed, including the acquisition of an antibiotic resistance plasmid, followed by clonal spread, horizontal transfer of blaKPC-harboring plasmids between different phylogenomic groups, and repeated transposition of the blaKPC gene among different plasmid backbones. Group A, which comprises multilocus sequence type 171 (ST171), was the most commonly identified (23% of isolates). Genomic analysis showed that ST171 isolates evolved from a common ancestor and formed two different major clusters, each acquiring unique blaKPC-harboring plasmids, followed by clonal expansion.
The data presented here represent the first comprehensive study of phylogenomic interrogation and the relationship between antibiotic resistance and plasmid discrimination among carbapenem-resistant Enterobacter spp., demonstrating the genetic diversity and complexity of the molecular mechanisms driving antibiotic resistance in this genus. Enterobacter spp., especially carbapenemase-producing Enterobacter spp., have emerged as a clinically significant cause of nosocomial infections. However, only limited information is available on the distribution of carbapenem resistance across this genus. Augmenting this problem is an erroneous identification of Enterobacter strains because of ambiguous typing methods and imprecise taxonomy. In this study, we used a whole-genome-based comparative phylogenetic approach to (i) revisit and redefine the genus Enterobacter and (ii) unravel the emergence and evolution of the Klebsiella pneumoniae carbapenemase-harboring Enterobacter spp. Using genomic analysis of 447 sequenced strains, we developed an improved understanding of the species designations within this complex genus and identified the diverse mechanisms driving the molecular evolution of carbapenem resistance. The findings in this study provide a solid genomic framework that will serve as an important resource in the future development of molecular diagnostics and in supporting drug discovery programs. Copyright © 2016 Chavda et al.
Jarvis, Erich D
2016-01-01
The rapid pace of advances in genome technology, with concomitant reductions in cost, makes it feasible that one day in our lifetime we will have available extant genomes of entire classes of species, including vertebrates. I recently helped cocoordinate the large-scale Avian Phylogenomics Project, which collected and sequenced genomes of 48 bird species representing most currently classified orders to address a range of questions in phylogenomics and comparative genomics. The consortium was able to answer questions not previously possible with just a few genomes. This success spurred on the creation of a project to sequence the genomes of at least one individual of all extant ∼10,500 bird species. The initiation of this project has led us to consider what questions now impossible to answer could be answered with all genomes, and could drive new questions now unimaginable. These include the generation of a highly resolved family tree of extant species, genome-wide association studies across species to identify genetic substrates of many complex traits, redefinition of species and the species concept, reconstruction of the genomes of common ancestors, and generation of new computational tools to address these questions. Here I present visions for the future by posing and answering questions regarding what scientists could potentially do with available genomes of an entire vertebrate class.
Phylogenomics of Colombian Helicobacter pylori isolates.
Gutiérrez-Escobar, Andrés Julián; Trujillo, Esperanza; Acevedo, Orlando; Bravo, María Mercedes
2017-01-01
During the Spanish colonisation of South America, African slaves and Europeans arrived in the continent with their corresponding load of pathogens, including Helicobacter pylori. Colombian strains have been clustered with the hpEurope population and with the hspWestAfrica subpopulation in multilocus sequence typing (MLST) studies. However, ancestry studies have revealed the presence of population components specific to H. pylori in Colombia. The aim of this study was to perform a thorough phylogenomic analysis to describe the evolution of the Colombian urban H. pylori isolates. A total of 115 genomes of H. pylori were sequenced with Illumina technology from H. pylori isolates obtained in Colombia in a region of high risk for gastric cancer. The genomes were assembled, annotated and underwent phylogenomic analysis with 36 reference strains. Additionally, population differentiation analyses were performed for two bacterial genes. The phylogenetic tree revealed clustering of the Colombian strains with hspWestAfrica and hpEurope, along with three clades formed exclusively by Colombian strains, suggesting the presence of independent evolutionary lines for Colombia. Additionally, the nucleotide diversity of horB and vacA genes from Colombian isolates was lower than in the reference strains and showed a significant genetic differentiation supporting the hypothesis of independent clades with recent evolution. The presence of specific lineages suggests the existence of an hspColombia subtype that emerged from a small and relatively isolated ancestral population that accompanied crossbreeding of human population in Colombia.
Differences in Performance Among Test Statistics for Assessing Phylogenomic Model Adequacy.
Duchêne, David A; Duchêne, Sebastian; Ho, Simon Y W
2018-05-18
Statistical phylogenetic analyses of genomic data depend on models of nucleotide or amino acid substitution. The adequacy of these substitution models can be assessed using a number of test statistics, allowing the model to be rejected when it is found to provide a poor description of the evolutionary process. A potentially valuable use of model-adequacy test statistics is to identify when data sets are likely to produce unreliable phylogenetic estimates, but their differences in performance are rarely explored. We performed a comprehensive simulation study to identify test statistics that are sensitive to some of the most commonly cited sources of phylogenetic estimation error. Our results show that, for many test statistics, traditional thresholds for assessing model adequacy can fail to reject the model when the phylogenetic inferences are inaccurate and imprecise. This is particularly problematic when analysing loci that have few variable informative sites. We propose new thresholds for assessing substitution model adequacy and demonstrate their effectiveness in analyses of three phylogenomic data sets. These thresholds lead to frequent rejection of the model for loci that yield topological inferences that are imprecise and are likely to be inaccurate. We also propose the use of a summary statistic that provides a practical assessment of overall model adequacy. Our approach offers a promising means of enhancing model choice in genome-scale data sets, potentially leading to improvements in the reliability of phylogenomic inference.
Phylogenomic and Domain Analysis of Iterative Polyketide Synthases in Aspergillus Species
Lin, Shu-Hsi; Yoshimoto, Miwa; Lyu, Ping-Chiang; Tang, Chuan-Yi; Arita, Masanori
2012-01-01
Aspergillus species are industrially and agriculturally important as fermentors and as producers of various secondary metabolites. Among them, fungal polyketides such as lovastatin and melanin are considered a gold mine for bioactive compounds. We used a phylogenomic approach to investigate the distribution of iterative polyketide synthases (PKS) in eight sequenced Aspergilli and classified over 250 fungal genes. Their genealogy by the conserved ketosynthase (KS) domain revealed three large groups of nonreducing PKS, one group inside bacterial PKS, and more than 9 small groups of reducing PKS. Polyphyly of nonribosomal peptide synthase (NRPS)-PKS genes raised questions regarding the recruitment of the elegant conjugation machinery. High rates of gene duplication and divergence were frequent. All data are accessible through our web database at http://metabolomics.jp/wiki/Category:PK. PMID:22844193
Weighted Statistical Binning: Enabling Statistically Consistent Genome-Scale Phylogenetic Analyses
Bayzid, Md Shamsuzzoha; Mirarab, Siavash; Boussau, Bastien; Warnow, Tandy
2015-01-01
Because biological processes can result in different loci having different evolutionary histories, species tree estimation requires multiple loci from across multiple genomes. While many processes can result in discord between gene trees and species trees, incomplete lineage sorting (ILS), modeled by the multi-species coalescent, is considered to be a dominant cause for gene tree heterogeneity. Coalescent-based methods have been developed to estimate species trees, many of which operate by combining estimated gene trees, and so are called "summary methods". Because summary methods are generally fast (and much faster than more complicated coalescent-based methods that co-estimate gene trees and species trees), they have become very popular techniques for estimating species trees from multiple loci. However, recent studies have established that summary methods can have reduced accuracy in the presence of gene tree estimation error, and also that many biological datasets have substantial gene tree estimation error, so that summary methods may not be highly accurate in biologically realistic conditions. Mirarab et al. (Science 2014) presented the "statistical binning" technique to improve gene tree estimation in multi-locus analyses, and showed that it improved the accuracy of MP-EST, one of the most popular coalescent-based summary methods. Statistical binning, which uses a simple heuristic to evaluate "combinability" and then uses the larger sets of genes to re-calculate gene trees, has good empirical performance, but using statistical binning within a phylogenomic pipeline does not have the desirable property of being statistically consistent. We show that weighting the re-calculated gene trees by the bin sizes makes statistical binning statistically consistent under the multispecies coalescent, and maintains the good empirical performance. 
Thus, "weighted statistical binning" enables highly accurate genome-scale species tree estimation, and is also statistically consistent under the multi-species coalescent model. New data used in this study are available at DOI: http://dx.doi.org/10.6084/m9.figshare.1411146, and the software is available at https://github.com/smirarab/binning. PMID:26086579
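The weighting step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each bin's supergene tree has already been estimated and is held as a Newick string, and the names `bins`, `supergene_trees`, and `weighted_tree_list` are hypothetical. Weighting is done by repeating each bin's tree once per gene in the bin before the list is passed to a summary method.

```python
from typing import Dict, List


def weighted_tree_list(bins: Dict[str, List[str]],
                       supergene_trees: Dict[str, str]) -> List[str]:
    """Repeat each bin's supergene tree once per gene assigned to that bin.

    bins            -- hypothetical mapping: bin id -> gene names in the bin
    supergene_trees -- hypothetical mapping: bin id -> Newick string of the
                       tree re-estimated from that bin's combined genes
    """
    weighted: List[str] = []
    for bin_id, genes in bins.items():
        # A bin holding k genes contributes k copies of its tree,
        # so larger bins carry proportionally more weight.
        weighted.extend([supergene_trees[bin_id]] * len(genes))
    return weighted


# Toy input (made-up gene names and topologies).
bins = {"bin1": ["g1", "g2", "g3"], "bin2": ["g4"]}
supergene_trees = {"bin1": "((A,B),C);", "bin2": "((A,C),B);"}
trees_for_summary_method = weighted_tree_list(bins, supergene_trees)
```

The resulting list has one tree per original gene, which is the form a summary method such as MP-EST expects; unweighted binning would instead pass one tree per bin.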
Yang, Ya; Moore, Michael J.; Brockington, Samuel F.; Soltis, Douglas E.; Wong, Gane Ka-Shu; Carpenter, Eric J.; Zhang, Yong; Chen, Li; Yan, Zhixiang; Xie, Yinlong; Sage, Rowan F.; Covshoff, Sarah; Hibberd, Julian M.; Nelson, Matthew N.; Smith, Stephen A.
2015-01-01
Many phylogenomic studies based on transcriptomes have been limited to “single-copy” genes due to methodological challenges in homology and orthology inferences. Only a relatively small number of studies have explored analyses beyond reconstructing species relationships. We sampled 69 transcriptomes in the hyperdiverse plant clade Caryophyllales and 27 outgroups from annotated genomes across eudicots. Using a combined similarity- and phylogenetic tree-based approach, we recovered 10,960 homolog groups, where each was represented by at least eight ingroup taxa. By decomposing these homolog trees, and taking gene duplications into account, we obtained 17,273 ortholog groups, where each was represented by at least ten ingroup taxa. We reconstructed the species phylogeny using a 1,122-gene data set with a gene occupancy of 92.1%. From the homolog trees, we found that both synonymous and nonsynonymous substitution rates in herbaceous lineages are up to three times as fast as in their woody relatives. This is the first time such a pattern has been shown across thousands of nuclear genes with dense taxon sampling. We also pinpointed regions of the Caryophyllales tree that were characterized by relatively high frequencies of gene duplication, including three previously unrecognized whole-genome duplications. By further combining information from homolog tree topology and synonymous distance between paralog pairs, phylogenetic locations for 13 putative genome duplication events were identified. Genes that experienced the greatest gene family expansion were concentrated among those involved in signal transduction and oxidoreduction, including a cytochrome P450 gene that encodes a key enzyme in the betalain synthesis pathway. Our study demonstrates a new approach for functional phylogenomic analysis in nonmodel species that is based on homolog groups in addition to inferred ortholog groups. PMID:25837578
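The "gene occupancy" figure quoted in the abstract can be illustrated with a small calculation. This sketch assumes occupancy means the fraction of taxon-by-gene cells in the data matrix that are filled; the abstract does not define the term, and the function name and toy data below are hypothetical.

```python
def gene_occupancy(presence):
    """Fraction of taxon-by-gene cells that contain a sequence.

    presence -- hypothetical mapping: taxon name -> set of genes recovered
                for that taxon
    """
    # The matrix columns are all genes seen in at least one taxon.
    genes = set().union(*presence.values())
    filled = sum(len(found) for found in presence.values())
    return filled / (len(presence) * len(genes))


# Toy matrix: 4 taxa x 3 genes with 10 of 12 cells filled.
presence = {
    "taxonA": {"g1", "g2", "g3"},
    "taxonB": {"g1", "g2"},
    "taxonC": {"g1", "g3"},
    "taxonD": {"g1", "g2", "g3"},
}
occupancy = gene_occupancy(presence)
```

Under this definition, a 92.1% occupancy means that roughly 8% of the taxon-by-gene cells in the 1,122-gene matrix are missing.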
Tan, Joon Liang; Khang, Tsung Fei; Ngeow, Yun Fong; Choo, Siew Woh
2013-12-13
Mycobacterium abscessus is a rapidly growing mycobacterium that is often associated with human infections. The taxonomy of this species has undergone several revisions and is still being debated. In this study, we sequenced the genomes of 12 M. abscessus strains and used phylogenomic analysis to perform subspecies classification. A data mining approach was used to rank and select informative genes based on the relative entropy metric for the construction of a phylogenetic tree. The resulting tree topology was similar to that generated using the concatenation of five classical housekeeping genes: rpoB, hsp65, secA, recA and sodA. Additional support for the reliability of the subspecies classification came from the analysis of erm41 and ITS gene sequences, single nucleotide polymorphism (SNP)-based classification, and strain clustering demonstrated by a variable number tandem repeat (VNTR) assay and a multilocus sequence analysis (MLSA). We subsequently found that the concatenation of a minimal set of three median-ranked genes, DNA polymerase III subunit alpha (polC), 4-hydroxy-2-ketovalerate aldolase (Hoa) and cell division protein FtsZ (ftsZ), is sufficient to recover the same tree topology. PCR assays designed specifically for these genes showed that all three genes could be amplified in the reference strain of M. abscessus ATCC 19977T. This study provides proof of concept that a whole-genome sequence-based data mining approach can provide confirmatory evidence of the phylogenetic informativeness of existing markers, as well as lead to the discovery of a more economical and informative set of markers that produces similar subspecies classification in M. abscessus. The systematic procedure used in this study to choose the informative minimal set of gene markers can potentially be applied to species or subspecies classification of other bacteria.
Chen, Tsute; Siddiqui, Huma; Olsen, Ingar
2017-01-01
Currently, genome sequences of a total of 19 Porphyromonas gingivalis strains are available, including eight completed genomes (strains W83, ATCC 33277, TDC60, HG66, A7436, AJW4, 381, and A7A1-28) and 11 high-coverage draft sequences (JCVI SC001, F0185, F0566, F0568, F0569, F0570, SJD2, W4087, W50, Ando, and MP4-504) that are assembled into fewer than 300 contigs. The objective was to compare these genomes at both nucleotide and protein sequence levels in order to understand their phylogenetic and functional relatedness. Four copies of 16S rRNA gene sequences were identified in each of the eight complete genomes and one in the other 11 unfinished genomes. These 43 16S rRNA sequences represent only 24 unique sequences and the derived phylogenetic tree suggests a possible evolutionary history for these strains. Phylogenomic comparison based on shared proteins and whole genome nucleotide sequences consistently showed two groups with closely related members: one consisted of ATCC 33277, 381, and HG66, another of W83, W50, and A7436. At least 1,037 core/shared proteins were identified in the 19 P. gingivalis genomes based on the most stringent detecting parameters. Comparative functional genomics based on genome-wide comparisons between NCBI and RAST annotations, as well as additional approaches, revealed functions that are unique or missing in individual P. gingivalis strains, or species-specific in all P. gingivalis strains, when compared to a neighboring species P. asaccharolytica. All the comparative results of this study are available online for download at ftp://www.homd.org/publication_data/20160425/. PMID:28261563
Tekle, Yonas I; Anderson, O Roger; Katz, Laura A; Maurer-Alcalá, Xyrus X; Romero, Mario Alberto Cerón; Molestina, Robert
2016-06-01
The majority of amoeboid lineages with flattened body forms are placed under a hypothetical taxonomic class 'Discosea' sensu Smirnov et al. (2011), which encompasses some of the most diverse morphs within Amoebozoa. However, its taxonomy and phylogeny are poorly understood. This is partly due to lack of support in studies that are based on limited gene sampling. In this study we use a phylogenomic approach including newly generated RNA-Seq data and comprehensive taxon sampling to resolve the phylogeny of 'Discosea'. Our analysis included representatives from all orders of 'Discosea' and up to 550 genes, the largest gene sampling in Amoebozoa to date. We conducted extensive analyses to assess the robustness of our resulting phylogenies to effects of missing data and outgroup choice using probabilistic methods. All of our analyses, which explore the impact of varying amounts of missing data, consistently recover well-resolved and supported groups of Amoebozoa. Our results support neither the monophyly nor the dichotomy of 'Discosea' as defined by Smirnov et al. (2011). Rather, we recover a robust well-resolved clade referred to as Eudiscosea encompassing the majority of discosean orders (seven of the nine studied here), while the Dactylopodida, Thecamoebida and Himatismenida, previously included in 'Discosea,' are non-monophyletic. We also recover novel relationships within the Eudiscosea that are largely congruent with morphology. Our analyses enabled us to place some incertae sedis lineages and previously unstable lineages such as Vermistella, Mayorella, Gocevia, and Stereomyxa. We recommend some phylogeny-based taxonomic amendments highlighting the new findings of this study and discuss the evolution of the group based on our current understanding. Copyright © 2016 Elsevier Inc. All rights reserved.
Lemieux, Claude; Otis, Christian; Turmel, Monique
2014-10-01
The green algae represent one of the most successful groups of photosynthetic eukaryotes, but compared to their land plant relatives, surprisingly little is known about their evolutionary history. This is in great part due to the difficulty of recognizing species diversity behind morphologically similar organisms. The Trebouxiophyceae is a species-rich class of the Chlorophyta that includes symbionts (e.g. lichenized algae) as well as free-living green algae. Members of this group display remarkable ecological variation, occurring in aquatic, terrestrial and aeroterrestrial environments. Because a reliable backbone phylogeny is essential to understand the evolutionary history of the Trebouxiophyceae, we sought to identify the relationships among the major trebouxiophycean lineages that have been previously recognized in nuclear-encoded 18S rRNA phylogenies. To this end, we used a chloroplast phylogenomic approach. We determined the sequences of 29 chlorophyte chloroplast genomes and assembled amino acid and nucleotide data sets derived from 79 chloroplast genes of 61 chlorophytes, including 35 trebouxiophyceans. The amino acid- and nucleotide-based phylogenies inferred using maximum likelihood and Bayesian methods and various models of sequence evolution revealed essentially the same relationships for the trebouxiophyceans. Two major groups were identified: a strongly supported clade of 29 taxa (core trebouxiophyceans) that is sister to the Chlorophyceae + Ulvophyceae and a clade comprising the Chlorellales and Pedinophyceae that represents a basal divergence relative to the former group. The core trebouxiophyceans form a grade of strongly supported clades that include a novel lineage represented by the desert crust alga Pleurastrosarcina brevispinosa. The assemblage composed of the Oocystis and Geminella clades is the deepest divergence of the core trebouxiophyceans. 
Like most of the chlorellaleans, early-diverging core trebouxiophyceans are predominantly planktonic species, whereas core trebouxiophyceans occupying more derived lineages are mostly terrestrial or aeroterrestrial algae. Our phylogenomic study provides a solid foundation for addressing fundamental questions related to the biology and ecology of the Trebouxiophyceae. The inferred trees reveal that this class is not monophyletic; they offer new insights not only into the internal structure of the class but also into the lifestyle of its founding members and subsequent adaptations to changing environments.
Lemieux, Claude; Vincent, Antony T; Labarre, Aurélie; Otis, Christian; Turmel, Monique
2015-12-01
The class Chlorophyceae (Chlorophyta) includes morphologically and ecologically diverse green algae. Most of the documented species belong to the clade formed by the Chlamydomonadales (also called Volvocales) and Sphaeropleales. Although studies based on the nuclear 18S rRNA gene or a few combined genes have shed light on the diversity and phylogenetic structure of the Chlamydomonadales, the positions of many of the monophyletic groups identified remain uncertain. Here, we used a chloroplast phylogenomic approach to delineate the relationships among these lineages. To generate the analyzed amino acid and nucleotide data sets, we sequenced the chloroplast DNAs (cpDNAs) of 24 chlorophycean taxa; these included representatives from 16 of the 21 primary clades previously recognized in the Chlamydomonadales, two taxa from a coccoid lineage (Jenufa) that was suspected to be sister to the Golenkiniaceae, and two sphaeroplealeans. Using Bayesian and/or maximum likelihood inference methods, we analyzed an amino acid data set that was assembled from 69 cpDNA-encoded proteins of 73 core chlorophytes (including 33 chlorophyceans), as well as two nucleotide data sets that were generated from the 69 genes coding for these proteins and 29 RNA-coding genes. The protein and gene phylogenies were congruent and robustly resolved the branching order of most of the investigated lineages. Within the Chlamydomonadales, 22 taxa formed an assemblage of five major clades/lineages. The earliest-diverging clade comprised Hafniomonas laevis and the Crucicarteria, and was followed by the Radicarteria and then by the Chloromonadinia. The latter lineage was sister to two superclades, one consisting of the Oogamochlamydinia and Reinhardtinia and the other of the Caudivolvoxa and Xenovolvoxa. 
To our surprise, the Jenufa species and the two spine-bearing green algae belonging to the Golenkinia and Treubaria genera were recovered in a highly supported monophyletic group that also included three taxa representing distinct families of the Sphaeropleales (Bracteacoccaceae, Mychonastaceae, and Scenedesmaceae). Our phylogenomic study advances our knowledge regarding the circumscription and internal structure of the Chlamydomonadales, suggesting that a previously unrecognized lineage is sister to the Sphaeropleales. In addition, it offers new insights into the flagellar structures of the founding members of both the Chlamydomonadales and Sphaeropleales.
The Comparative Genomics and Phylogenomics of Leishmania amazonensis Parasite
Tschoeke, Diogo A; Nunes, Gisele L; Jardim, Rodrigo; Lima, Joana; Dumaresq, Aline SR; Gomes, Monete R; de Mattos Pereira, Leandro; Loureiro, Daniel R; Stoco, Patricia H; de Matos Guedes, Herbert Leonel; de Miranda, Antonio Basilio; Ruiz, Jeronimo; Pitaluga, André; Silva, Floriano P; Probst, Christian M; Dickens, Nicholas J; Mottram, Jeremy C; Grisard, Edmundo C; Dávila, Alberto MR
2014-01-01
Leishmaniasis is an infectious disease caused by Leishmania species. Leishmania amazonensis is a New World Leishmania species belonging to the Mexicana complex, which is able to cause all types of leishmaniasis infections. The L. amazonensis reference strain MHOM/BR/1973/M2269 was sequenced, identifying 8,802 coding sequences (CDS), most of them of hypothetical function. Comparative analysis using six Leishmania species showed a core set of 7,016 orthologs. L. amazonensis and Leishmania mexicana share the largest number of distinct orthologs, while Leishmania braziliensis presented the largest number of inparalogs. Additionally, phylogenomic analysis confirmed the taxonomic position for L. amazonensis within the “Mexicana complex”, reinforcing understanding of the split of New and Old World Leishmania. Potential non-homologous isofunctional enzymes (NISE) were identified between L. amazonensis and Homo sapiens that could provide new drug targets for development. PMID:25336895
Welch, Andreanna J; Collins, Katherine; Ratan, Aakrosh; Drautz-Moses, Daniela I; Schuster, Stephan C; Lindqvist, Charlotte
2016-06-01
These data are presented in support of a plastid phylogenomic analysis of the recent radiation of the Hawaiian endemic mints (Lamiaceae), and their close relatives in the genus Stachys, "The quest to resolve recent radiations: Plastid phylogenomics of extinct and endangered Hawaiian endemic mints (Lamiaceae)" [1]. Here we describe the chloroplast genome sequences for 12 mint taxa. Data presented include summaries of gene content and length for these taxa, structural comparison of the mint chloroplast genomes with published sequences from other species in the order Lamiales, and comparisons of variability among three Hawaiian taxa vs. three outgroup taxa. Finally, we provide a list of 108 primer pairs targeting the most variable regions within this group and designed specifically for amplification of DNA extracted from degraded herbarium material.
Consequences of Common Topological Rearrangements for Partition Trees in Phylogenomic Inference.
Chernomor, Olga; Minh, Bui Quang; von Haeseler, Arndt
2015-12-01
In phylogenomic analysis, collections of trees with identical scores (maximum likelihood or parsimony) may hamper tree search algorithms. Such collections are termed phylogenetic terraces. For sparse supermatrices with a lot of missing data, the number of terraces and the number of trees on the terraces can be very large. If terraces are not taken into account, a lot of computation time might be spent unnecessarily evaluating many trees that in fact have identical scores. To save computation time during the tree search, it is worthwhile to quickly identify such cases. The score of a species tree is the sum of scores for all the so-called induced partition trees. Therefore, if the topological rearrangement applied to a species tree does not change the induced partition trees, the score of these partition trees is unchanged. Here, we provide the conditions under which the three most widely used topological rearrangements (nearest neighbor interchange, subtree pruning and regrafting, and tree bisection and reconnection) change the topologies of induced partition trees. During the tree search, these conditions allow us to quickly identify whether we can save computation time on the evaluation of newly encountered trees. We also introduce the concept of partial terraces and demonstrate that they occur more frequently than the original "full" terrace. Hence, partial terraces contribute more to time savings than full terraces. Therefore, taking into account the above conditions and the partial terrace concept will help to speed up the tree search in phylogenomic inference.
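The idea that a rearrangement can leave every induced partition tree unchanged can be sketched in a few lines. This is an illustrative brute-force check, not the conditions derived in the paper: it assumes that an induced partition tree is the species tree restricted to the taxa that have data in that partition, represents trees as nested tuples of taxon names, and compares unrooted topologies through their non-trivial splits. All function names and the example trees are hypothetical.

```python
def leaf_sets(tree):
    """Return (leaves of this subtree, leaf sets of every subtree within it)."""
    if isinstance(tree, str):
        leaf = frozenset([tree])
        return leaf, [leaf]
    all_leaves = frozenset()
    clades = []
    for child in tree:
        leaves, sub_clades = leaf_sets(child)
        all_leaves |= leaves
        clades.extend(sub_clades)
    clades.append(all_leaves)
    return all_leaves, clades


def induced_splits(tree, taxa):
    """Non-trivial splits of the tree after restricting it to `taxa`."""
    taxa = frozenset(taxa)
    _, clades = leaf_sets(tree)
    splits = set()
    for clade in clades:
        side = clade & taxa
        other = taxa - side
        # A split is informative only if both sides keep at least two taxa.
        if len(side) >= 2 and len(other) >= 2:
            splits.add(frozenset([side, other]))
    return splits


def same_partition_trees(tree_a, tree_b, partitions):
    """True if both trees induce identical topologies on every partition."""
    return all(induced_splits(tree_a, taxa) == induced_splits(tree_b, taxa)
               for taxa in partitions)


# tree_b differs from tree_a by swapping C and D (an NNI-style move).
tree_a = (("A", "B"), ("C", ("D", "E")))
tree_b = (("A", "B"), ("D", ("C", "E")))

# Taxon coverage of two hypothetical partitions with missing data.
sparse = [{"A", "B", "C", "D"}, {"A", "B", "D", "E"}]
unchanged = same_partition_trees(tree_a, tree_b, sparse)            # True
changed = same_partition_trees(tree_a, tree_b, [{"A", "C", "D", "E"}])  # False
```

In the first case both trees induce the same partition trees, so under the sum-of-partition-scores formulation their total scores are equal and the second tree need not be re-evaluated. The paper's contribution is to decide this from the rearrangement itself rather than by recomputing and comparing the induced trees as done here.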
Nagy, László G; Riley, Robert; Bergmann, Philip J; Krizsán, Krisztina; Martin, Francis M; Grigoriev, Igor V; Cullen, Dan; Hibbett, David S
2017-01-01
Fungal decomposition of plant cell walls (PCW) is a complex process that has diverse industrial applications and huge impacts on the carbon cycle. White rot (WR) is a powerful mode of PCW decay in which lignin and carbohydrates are both degraded. Mechanistic studies of decay coupled with comparative genomic analyses have provided clues to the enzymatic components of WR systems and their evolutionary origins, but the complete suite of genes necessary for WR remains undetermined. Here, we use phylogenomic comparative methods, which we validate through simulations, to identify shifts in gene family diversification rates that are correlated with evolution of WR, using data from 62 fungal genomes. We detected 409 gene families that appear to be evolutionarily correlated with WR. The identified gene families encode well-characterized decay enzymes, e.g., fungal class II peroxidases and cellobiohydrolases, and enzymes involved in import and detoxification pathways, as well as 73 gene families that have no functional annotation. About 310 of the 409 identified gene families are present in the genome of the model WR fungus Phanerochaete chrysosporium and 192 of these (62%) have been shown to be upregulated under ligninolytic culture conditions, which corroborates the phylogeny-based functional inferences. These results illuminate the complexity of WR and suggest that its evolution has involved a general elaboration of the decay apparatus, including numerous gene families with as-yet unknown exact functions. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Phylogenomics and Morphology of Extinct Paleognaths Reveal the Origin and Evolution of the Ratites.
Yonezawa, Takahiro; Segawa, Takahiro; Mori, Hiroshi; Campos, Paula F; Hongoh, Yuichi; Endo, Hideki; Akiyoshi, Ayumi; Kohno, Naoki; Nishida, Shin; Wu, Jiaqi; Jin, Haofei; Adachi, Jun; Kishino, Hirohisa; Kurokawa, Ken; Nogi, Yoshifumi; Tanabe, Hideyuki; Mukoyama, Harutaka; Yoshida, Kunio; Rasoamiaramanana, Armand; Yamagishi, Satoshi; Hayashi, Yoshihiro; Yoshida, Akira; Koike, Hiroko; Akishinonomiya, Fumihito; Willerslev, Eske; Hasegawa, Masami
2017-01-09
The Palaeognathae comprise the flightless ratites and the volant tinamous, and together with the Neognathae constitute the extant members of class Aves. It is commonly believed that Palaeognathae originated in Gondwana since most of the living species are found in the Southern Hemisphere [1-3]. However, this hypothesis has been questioned because the fossil paleognaths are mostly from the Northern Hemisphere in their earliest time (Paleocene) and possessed many putative ancestral characters [4]. Uncertainties regarding the origin and evolution of Palaeognathae stem from the difficulty in estimating their divergence times [1, 2] and their remarkable morphological convergence. Here, we recovered nuclear genome fragments from extinct elephant birds, which enabled us to reconstruct a reliable phylogenomic time tree for the Palaeognathae. Based on the tree, we identified homoplasies in morphological traits of paleognaths and reconstructed their morphology-based phylogeny including fossil species without molecular data. In contrast to the prevailing theories, the fossil paleognaths from the Northern Hemisphere were placed as the basal lineages. Combined with our stable divergence time estimates that enabled a valid argument regarding the correlation with geological events, we propose a new evolutionary scenario that contradicts the traditional view. The ancestral Palaeognathae were volant, as estimated from their molecular evolutionary rates, and originated during the Late Cretaceous in the Northern Hemisphere. They migrated to the Southern Hemisphere and speciated explosively around the Cretaceous-Paleogene boundary. They then extended their distribution to the Gondwana-derived landmasses, such as New Zealand and Madagascar, by overseas dispersal. Gigantism subsequently occurred independently on each landmass. Copyright © 2017 Elsevier Ltd. All rights reserved.
Leclercq, Sébastien; Dittmer, Jessica; Bouchon, Didier; Cordaux, Richard
2014-02-01
Bacterial gut communities of arthropods are highly diverse and tightly related to host feeding habits. However, our understanding of the origin and role of the symbionts is often hindered by the lack of genetic information. "Candidatus Hepatoplasma crinochetorum" is a Mollicutes symbiont found in the midgut glands of terrestrial isopods. The only available nucleotide sequence for this symbiont is a partial 16S rRNA gene sequence. Here, we present the 657,101 bp assembled genome of Candidatus Hepatoplasma crinochetorum isolated from the terrestrial isopod Armadillidium vulgare. While previous 16S rRNA gene-based analyses have provided inconclusive results regarding the phylogenetic position of Candidatus Hepatoplasma crinochetorum within Mollicutes, we performed a phylogenomic analysis of 127 Mollicutes orthologous genes which confidently branches the species as a sister group to the Hominis group of Mycoplasma. Several genome properties of Candidatus Hepatoplasma crinochetorum are also highlighted compared with other Mollicutes genomes, including adjacent tryptophan tRNA genes, which further our understanding of the evolutionary dynamics of these genes in Mollicutes, and the presence of a probably inactivated CRISPR/Cas system, which constitutes a testimony of past interactions between Candidatus Hepatoplasma crinochetorum and mobile genetic elements, despite their current lack in this streamlined genome. Overall, the availability of the complete genome sequence of Candidatus Hepatoplasma crinochetorum paves the way for further investigation of its ecology and evolution.
To Be or Not to Be a Flatworm: The Acoel Controversy
Arendt, Detlev; Borgonie, Gaëtan; Funayama, Noriko; Gschwentner, Robert; Hartenstein, Volker; Hobmayer, Bert; Hooge, Matthew; Hrouda, Martina; Ishida, Sachiko; Kobayashi, Chiyoko; Kuales, Georg; Nishimura, Osamu; Pfister, Daniela; Rieger, Reinhard; Salvenmoser, Willi; Smith, Julian; Technau, Ulrich; Tyler, Seth; Agata, Kiyokazu; Salzburger, Walter; Ladurner, Peter
2009-01-01
Since first described, acoels were considered members of the flatworms (Platyhelminthes). However, no clear synapomorphies among the three large flatworm taxa - the Catenulida, the Acoelomorpha and the Rhabditophora - have been characterized to date. Molecular phylogenies, on the other hand, commonly positioned acoels separate from other flatworms. Accordingly, our own multi-locus phylogenetic analysis using 43 genes and 23 animal species places the acoel flatworm Isodiametra pulchra at the base of all Bilateria, distant from other flatworms. By contrast, novel data on the distribution and proliferation of stem cells and the specific mode of epidermal replacement constitute a strong synapomorphy for the Acoela plus the major group of flatworms, the Rhabditophora. The expression of a piwi-like gene not only in gonadal, but also in adult somatic stem cells is another unique feature among bilaterians. These two independent stem-cell-related characters put the Acoela into the Platyhelminthes-Lophotrochozoa clade and account for the most parsimonious evolutionary explanation of epidermal cell renewal in the Bilateria. Most available multigene analyses produce conflicting results regarding the position of the acoels in the tree of life. Given these phylogenomic conflicts and the contradiction of developmental and morphological data with phylogenomic results, the monophyly of the phylum Platyhelminthes and the position of the Acoela remain unresolved. By these data, both the inclusion of Acoela within Platyhelminthes, and their separation from flatworms as basal bilaterians are well-supported alternatives. PMID:19430533
MultiPhyl: a high-throughput phylogenomics webserver using distributed computing
Keane, Thomas M.; Naughton, Thomas J.; McInerney, James O.
2007-01-01
With the number of fully sequenced genomes increasing steadily, there is greater interest in performing large-scale phylogenomic analyses from large numbers of individual gene families. Maximum likelihood (ML) has been shown repeatedly to be one of the most accurate methods for phylogenetic construction. Recently, there have been a number of algorithmic improvements in maximum-likelihood-based tree search methods. However, it can still take a long time to analyse the evolutionary history of many gene families using a single computer. Distributed computing refers to a method of combining the computing power of multiple computers in order to perform some larger overall calculation. In this article, we present the first high-throughput implementation of a distributed phylogenetics platform, MultiPhyl, capable of using the idle computational resources of many heterogeneous non-dedicated machines to form a phylogenetics supercomputer. MultiPhyl allows a user to upload hundreds or thousands of amino acid or nucleotide alignments simultaneously and perform computationally intensive tasks such as model selection, tree searching and bootstrapping of each of the alignments using many desktop machines. The program implements a set of 88 amino acid models and 56 nucleotide maximum likelihood models and a variety of statistical methods for choosing between alternative models. A MultiPhyl webserver is available for public use at: http://www.cs.nuim.ie/distributed/multiphyl.php. PMID:17553837
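Model selection of the kind mentioned in the abstract can be illustrated with the Akaike information criterion (AIC). The abstract does not say which criteria MultiPhyl offers, so AIC is used here purely as a familiar example; the log-likelihood values are invented, and the parameter counts cover only substitution-model parameters on the assumption that branch lengths are shared by all candidates on the same tree.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def best_model(fits):
    """Return the model name with the lowest AIC.

    fits -- hypothetical mapping: model name -> (log-likelihood, free parameters)
    """
    return min(fits, key=lambda name: aic(*fits[name]))


# Invented fits for one nucleotide alignment.
fits = {
    "JC69": (-1250.0, 0),   # equal rates, equal base frequencies
    "HKY85": (-1200.0, 4),  # ts/tv ratio + 3 free base frequencies
    "GTR": (-1198.5, 8),    # 5 free rates + 3 free base frequencies
}
chosen = best_model(fits)  # "HKY85": AIC 2408 vs 2500 (JC69) and 2413 (GTR)
```

Because each alignment is scored independently, this step parallelizes naturally across gene families, which is the property a distributed platform exploits.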
The origin of modern metabolic networks inferred from phylogenomic analysis of protein architecture.
Caetano-Anollés, Gustavo; Kim, Hee Shin; Mittenthal, Jay E
2007-05-29
Metabolism represents a complex collection of enzymatic reactions and transport processes that convert metabolites into molecules capable of supporting cellular life. Here we explore the origins and evolution of modern metabolism. Using phylogenomic information linked to the structure of metabolic enzymes, we sort out recruitment processes and discover that most enzymatic activities were associated with the nine most ancient and widely distributed protein fold architectures. An analysis of newly discovered functions showed enzymatic diversification occurred early, during the onset of the modern protein world. Most importantly, phylogenetic reconstruction exercises and other evidence suggest strongly that metabolism originated in enzymes with the P-loop hydrolase fold in nucleotide metabolism, probably in pathways linked to the purine metabolic subnetwork. Consequently, the first enzymatic takeover of an ancient biochemistry or prebiotic chemistry was related to the synthesis of nucleotides for the RNA world.
Replicated divergence in cichlid radiations mirrors a major vertebrate innovation.
McGee, Matthew D; Faircloth, Brant C; Borstein, Samuel R; Zheng, Jimmy; Darrin Hulsey, C; Wainwright, Peter C; Alfaro, Michael E
2016-01-13
Decoupling of the upper jaw bones--jaw kinesis--is a distinctive feature of the ray-finned fishes, but it is not clear how the innovation is related to the extraordinary diversity of feeding behaviours and feeding ecology in this group. We address this issue in a lineage of ray-finned fishes that is well known for its ecological and functional diversity--African rift lake cichlids. We sequenced ultraconserved elements to generate a phylogenomic tree of the Lake Tanganyika and Lake Malawi cichlid radiations. We filmed a diverse array of over 50 cichlid species capturing live prey and quantified the extent of jaw kinesis in the premaxillary and maxillary bones. Our combination of phylogenomic and kinematic data reveals a strong association between biting modes of feeding and reduced jaw kinesis, suggesting that the contrasting demands of biting and suction feeding have strongly influenced cranial evolution in both cichlid radiations. © 2016 The Author(s).
Hellmuth, Marc; Wieseke, Nicolas; Lechner, Marcus; Lenhof, Hans-Peter; Middendorf, Martin; Stadler, Peter F.
2015-01-01
Phylogenomics relies heavily on well-curated sequence data sets that comprise, for each gene, exclusively 1:1 orthologs. Paralogs are treated as a dangerous nuisance that has to be detected and removed. We show here that this severe restriction of the data sets is not necessary. Building upon recent advances in mathematical phylogenetics, we demonstrate that gene duplications convey meaningful phylogenetic information and allow the inference of plausible phylogenetic trees, provided orthologs and paralogs can be distinguished with a degree of certainty. Starting from tree-free estimates of orthology, cograph editing can sufficiently reduce the noise to find correct event-annotated gene trees. The information of gene trees can then directly be translated into constraints on the species trees. Although the resolution is very poor for individual gene families, we show that genome-wide data sets are sufficient to generate fully resolved phylogenetic trees, even in the presence of horizontal gene transfer. PMID:25646426
An expanded nuclear phylogenomic PCR toolkit for Sapindales
Collins, Elizabeth S.; Gostel, Morgan R.; Weeks, Andrea
2016-01-01
Premise of the study: We tested PCR amplification of 91 low-copy nuclear gene loci in taxa from Sapindales using primers developed for Bursera simaruba (Burseraceae). Methods and Results: Cross-amplification of these markers among 10 taxa tested was related to their phylogenetic distance from B. simaruba. On average, each Sapindalean taxon yielded product for 53 gene regions (range: 16–90). Arabidopsis thaliana (Brassicales), by contrast, yielded product for two. Single representatives of Anacardiaceae and Rutaceae yielded 34 and 26 products, respectively. Twenty-six primer pairs worked for all Burseraceae species tested if highly divergent Aucoumea klaineana is excluded, and eight of these amplified product in every Sapindalean taxon. Conclusions: Our study demonstrates that customized primers for Bursera can amplify product in a range of Sapindalean taxa. This collection of primer pairs, therefore, is a valuable addition to the toolkit for nuclear phylogenomic analyses of Sapindales and warrants further investigation. PMID:28101434
Caetano-Anollés, Gustavo; Kim, Kyung Mo; Caetano-Anollés, Derek
2012-02-01
The complexity of modern biochemistry developed gradually on early Earth as new molecules and structures populated the emerging cellular systems. Here, we generate a historical account of the gradual discovery of primordial proteins, cofactors, and molecular functions using phylogenomic information in the sequence of 420 genomes. We focus on structural and functional annotations of the 54 most ancient protein domains. We show how primordial functions are linked to folded structures and how their interaction with cofactors expanded the functional repertoire. We also reveal protocell membranes played a crucial role in early protein evolution and show translation started with RNA and thioester cofactor-mediated aminoacylation. Our findings allow elaboration of an evolutionary model of early biochemistry that is firmly grounded in phylogenomic information and biochemical, biophysical, and structural knowledge. The model describes how primordial α-helical bundles stabilized membranes, how these were decorated by layered arrangements of β-sheets and α-helices, and how these arrangements became globular. Ancient forms of aminoacyl-tRNA synthetase (aaRS) catalytic domains and ancient non-ribosomal protein synthetase (NRPS) modules gave rise to primordial protein synthesis and the ability to generate a code for specificity in their active sites. These structures diversified producing cofactor-binding molecular switches and barrel structures. Accretion of domains and molecules gave rise to modern aaRSs, NRPS, and ribosomal ensembles, first organized around novel emerging cofactors (tRNA and carrier proteins) and then more complex cofactor structures (rRNA). The model explains how the generation of protein structures acted as scaffold for nucleic acids and resulted in crystallization of modern translation.
Genomic anatomy of Escherichia coli O157:H7 outbreaks
Eppinger, Mark; Mammel, Mark K.; Leclerc, Joseph E.; Ravel, Jacques; Cebula, Thomas A.
2011-01-01
The rapid emergence of Escherichia coli O157:H7 from an unknown strain in 1982 to the dominant hemorrhagic E. coli serotype in the United States and the cause of widespread outbreaks of human food-borne illness highlights a need to evaluate critically the extent to which genomic plasticity of this important enteric pathogen contributes to its pathogenic potential and its evolution as well as its adaptation in different ecological niches. Aimed at a better understanding of the evolution of the E. coli O157:H7 pathogenome, the present study presents the high-quality sequencing and comparative phylogenomic analysis of a comprehensive panel of 25 E. coli O157:H7 strains associated with three nearly simultaneous food-borne outbreaks of human disease in the United States. Here we present a population genetic analysis of more than 200 related strains recovered from patients, contaminated produce, and zoonotic sources. High-resolution phylogenomic approaches allow the dynamics of pathogenome evolution to be followed at a high level of phylogenetic accuracy and resolution. SNP discovery and study of genome architecture and prophage content identified numerous biomarkers to assess the extent of genetic diversity within a set of clinical and environmental strains. A total of 1,225 SNPs were identified in the present study and are now available for typing of the E. coli O157:H7 lineage. These data should prove useful for the development of a refined phylogenomic framework for forensic, diagnostic, and epidemiological studies to define better risk in response to novel and emerging E. coli O157:H7 resistance and virulence phenotypes. PMID:22135463
Consequences of Common Topological Rearrangements for Partition Trees in Phylogenomic Inference
Minh, Bui Quang; von Haeseler, Arndt
2015-01-01
In phylogenomic analysis the collection of trees with identical score (maximum likelihood or parsimony score) may hamper tree search algorithms. Such collections are termed phylogenetic terraces. For sparse supermatrices with a lot of missing data, the number of terraces and the number of trees on the terraces can be very large. If terraces are not taken into account, a lot of computation time might be unnecessarily spent to evaluate many trees that in fact have identical score. To save computation time during the tree search, it is worthwhile to quickly identify such cases. The score of a species tree is the sum of scores for all the so-called induced partition trees. Therefore, if the topological rearrangement applied to a species tree does not change the induced partition trees, the score of these partition trees is unchanged. Here, we provide the conditions under which the three most widely used topological rearrangements (nearest neighbor interchange, subtree pruning and regrafting, and tree bisection and reconnection) change the topologies of induced partition trees. During the tree search, these conditions allow us to quickly identify whether we can save computation time on the evaluation of newly encountered trees. We also introduce the concept of partial terraces and demonstrate that they occur more frequently than the original “full” terrace. Hence, partial terraces are a more important source of time savings than full terraces. Therefore, taking into account the above conditions and the partial terrace concept will help to speed up the tree search in phylogenomic inference. PMID:26448206
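The additive score described in this abstract can be written out explicitly. The notation below is an assumption introduced for illustration and is not taken from the paper: T is a species tree, Y_i is the set of taxa with data in partition i, T|Y_i is the partition tree induced by restricting T to Y_i, and S is the set of partitions whose induced trees are unchanged by a rearrangement of T into T'.

```latex
% Score of a species tree as a sum over k induced partition trees
\ell(T) = \sum_{i=1}^{k} \ell_i\!\left(T|_{Y_i}\right)

% After a rearrangement T \to T', let S = \{\, i : T'|_{Y_i} = T|_{Y_i} \,\}.
% Only partitions outside S need to be re-evaluated:
\ell(T') = \sum_{i \in S} \ell_i\!\left(T|_{Y_i}\right)
         + \sum_{i \notin S} \ell_i\!\left(T'|_{Y_i}\right)

% If S contains every partition, then \ell(T') = \ell(T):
% T and T' have identical score, i.e. they lie on the same (full) terrace.
```

Read this way, a partial terrace corresponds to the case where S is non-empty but does not contain every partition, so some cached partition scores can be reused.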
Eyun, Seong-Il
2017-01-19
Copepods play a critical role in marine ecosystems but have been poorly investigated in phylogenetic studies. Morphological evidence supports the monophyly of copepods, whereas interordinal relationships continue to be debated. In particular, the phylogenetic position of the order Harpacticoida is still ambiguous and inconsistent among studies. Until now, only a small number of molecular studies have been done, using a limited number of genes or even partial genes, and thus there is so far no consensus at the order level. This study attempted to resolve phylogenetic relationships among and within four major copepod orders including Harpacticoida and the phylogenetic position of Copepoda among five other crustacean groups (Anostraca, Cladocera, Sessilia, Amphipoda, and Decapoda) using 24 nuclear protein-coding genes. Phylogenomics has confirmed the monophyly of Copepoda and Podoplea. However, this study reveals surprising differences with the majority of the copepod phylogenies and unexpected similarities with postembryonic characters and earlier proposed morphological phylogenies; more precisely, Cyclopoida is more closely related to Siphonostomatoida than to Harpacticoida, which is likely the most basally-branching group of Podoplea. Divergence time estimation suggests that the origin of Harpacticoida can be traced back to the Devonian, corresponding well with recently discovered fossil evidence. Copepoda has a close affinity to the clade of Malacostraca and Thecostraca but not to Branchiopoda. This result supports the hypothesis of the newly proposed clades, Communostraca, Multicrustacea, and Allotriocarida but further challenges the validity of Hexanauplia and Vericrustacea. The first phylogenomic study of Copepoda provides new insights into taxonomic relationships and represents a valuable resource that improves our understanding of copepod evolution and their wide range of ecological adaptations.
Single-Copy Genes as Molecular Markers for Phylogenomic Studies in Seed Plants
De La Torre, Amanda R.; Sterck, Lieven; Cánovas, Francisco M.; Avila, Concepción; Merino, Irene; Cabezas, José Antonio; Cervera, María Teresa; Ingvarsson, Pär K.
2017-01-01
Phylogenetic relationships among seed plant taxa, especially within the gymnosperms, remain contested. In contrast to angiosperms, for which several genomic, transcriptomic and phylogenetic resources are available, there are few, if any, molecular markers that allow broad comparisons among gymnosperm species. With few gymnosperm genomes available, recently obtained transcriptomes in gymnosperms are a great addition to identifying single-copy gene families as molecular markers for phylogenomic analysis in seed plants. Taking advantage of an increasing number of available genomes and transcriptomes, we identified single-copy genes in a broad collection of seed plants and used these to infer phylogenetic relationships between major seed plant taxa. This study aims at extending the current phylogenetic toolkit for seed plants, assessing its ability for resolving seed plant phylogeny, and discussing potential factors affecting phylogenetic reconstruction. In total, we identified 3,072 single-copy genes in 31 gymnosperms and 2,156 single-copy genes in 34 angiosperms. All studied seed plants shared 1,469 single-copy genes, which are generally involved in functions like DNA metabolism, cell cycle, and photosynthesis. A selected set of 106 single-copy genes provided good resolution for the seed plant phylogeny except for gnetophytes. Although some of our analyses support a sister relationship between gnetophytes and other gymnosperms, phylogenetic trees from concatenated alignments without 3rd codon positions and amino acid alignments under the CAT + GTR model, support gnetophytes as a sister group to Pinaceae. Our phylogenomic analyses demonstrate that, in general, single-copy genes can uncover both recent and deep divergences of seed plant phylogeny. PMID:28460034
Phylogenomic Reconstruction of the Oomycete Phylogeny Derived from 37 Genomes
McCarthy, Charley G. P.
2017-01-01
ABSTRACT The oomycetes are a class of microscopic, filamentous eukaryotes within the Stramenopiles-Alveolata-Rhizaria (SAR) supergroup which includes ecologically significant animal and plant pathogens, most infamously the causative agent of potato blight Phytophthora infestans. Single-gene and concatenated phylogenetic studies both of individual oomycete genera and of members of the larger class have resulted in conflicting conclusions concerning species phylogenies within the oomycetes, particularly for the large Phytophthora genus. Genome-scale phylogenetic studies have successfully resolved many eukaryotic relationships by using supertree methods, which combine large numbers of potentially disparate trees to determine evolutionary relationships that cannot be inferred from individual phylogenies alone. With a sufficient amount of genomic data now available, we have undertaken the first whole-genome phylogenetic analysis of the oomycetes using data from 37 oomycete species and 6 SAR species. In our analysis, we used established supertree methods to generate phylogenies from 8,355 homologous oomycete and SAR gene families and have complemented those analyses with both phylogenomic network and concatenated supermatrix analyses. Our results show that a genome-scale approach to oomycete phylogeny resolves oomycete classes and individual clades within the problematic Phytophthora genus. Support for the resolution of the inferred relationships between individual Phytophthora clades varies depending on the methodology used. Our analysis represents an important first step in large-scale phylogenomic analysis of the oomycetes. IMPORTANCE The oomycetes are a class of eukaryotes and include ecologically significant animal and plant pathogens. 
Single-gene and multigene phylogenetic studies of individual oomycete genera and of members of the larger classes have resulted in conflicting conclusions concerning interspecies relationships among these species, particularly for the Phytophthora genus. The onset of next-generation sequencing techniques now means that a wealth of oomycete genomic data is available. For the first time, we have used genome-scale phylogenetic methods to resolve oomycete phylogenetic relationships. We used supertree methods to generate single-gene and multigene species phylogenies. Overall, our supertree analyses utilized phylogenetic data from 8,355 oomycete gene families. We have also complemented our analyses with superalignment phylogenies derived from 131 single-copy ubiquitous gene families. Our results show that a genome-scale approach to oomycete phylogeny resolves oomycete classes and clades. Our analysis represents an important first step in large-scale phylogenomic analysis of the oomycetes. PMID:28435885
Petitjean, Céline; Deschamps, Philippe; López-García, Purificación; Moreira, David
2015-01-01
The first 16S rRNA-based phylogenies of the Archaea showed a deep division between two groups, the kingdoms Euryarchaeota and Crenarchaeota. This bipartite classification has been challenged by the recent discovery of new deeply branching lineages (e.g., Thaumarchaeota, Aigarchaeota, Nanoarchaeota, Korarchaeota, Parvarchaeota, Aenigmarchaeota, Diapherotrites, and Nanohaloarchaeota) which have also been given the same taxonomic status of kingdoms. However, the phylogenetic position of some of these lineages is controversial. In addition, phylogenetic analyses of the Archaea have often been carried out without outgroup sequences, making it difficult to determine if these taxa actually define lineages at the same level as the Euryarchaeota and Crenarchaeota. We have addressed the question of the position of the root of the Archaea by reconstructing rooted archaeal phylogenetic trees using bacterial sequences as outgroup. These trees were based on commonly used conserved protein markers (32 ribosomal proteins) as well as on 38 new markers identified through phylogenomic analysis. We thus gathered a total of 70 conserved markers that we analyzed as a concatenated data set. In contrast with previous analyses, our trees consistently placed the root of the archaeal tree between the Euryarchaeota (including the Nanoarchaeota and other fast-evolving lineages) and the rest of archaeal species, which we propose to class within the new kingdom Proteoarchaeota. This implies the relegation of several groups previously classified as kingdoms (e.g., Crenarchaeota, Thaumarchaeota, Aigarchaeota, and Korarchaeota) to a lower taxonomic rank. In addition to taxonomic implications, this profound reorganization of the archaeal phylogeny has also consequences on our appraisal of the nature of the last archaeal ancestor, which most likely was a complex organism with a gene-rich genome. PMID:25527841
Whole Genome Sequencing for Genomics-Guided Investigations of Escherichia coli O157:H7 Outbreaks
Rusconi, Brigida; Sanjar, Fatemeh; Koenig, Sara S. K.; Mammel, Mark K.; Tarr, Phillip I.; Eppinger, Mark
2016-01-01
Multi-isolate whole genome sequencing (WGS) and typing for outbreak investigations has become a reality in the post-genomics era. We applied this technology to strains from Escherichia coli O157:H7 outbreaks. These include isolates from seven North American outbreaks, as well as multiple isolates from the same patient and from different infected individuals in the same household. Customized high-resolution bioinformatics sequence typing strategies were developed to assess the core genome and mobilome plasticity. Sequence typing was performed using an in-house single nucleotide polymorphism (SNP) discovery and validation pipeline. Discriminatory power becomes of particular importance for the investigation of isolates from outbreaks in which macrogenomic techniques such as pulsed-field gel electrophoresis or multiple locus variable number tandem repeat analysis do not differentiate closely related organisms. We also characterized differences in the phage inventory, allowing us to identify plasticity among outbreak strains that is not detectable at the core genome level. Our comprehensive analysis of the mobilome identified multiple plasmids that have not previously been associated with this lineage. Applied phylogenomics approaches provide strong molecular evidence for exceptionally little heterogeneity of strains within outbreaks and demonstrate the value of intra-cluster comparisons, rather than basing the analysis on archetypal reference strains. Next generation sequencing and whole genome typing strategies provide the technological foundation for genomic epidemiology outbreak investigation utilizing its significantly higher sample throughput, cost efficiency, and phylogenetic relatedness accuracy. These phylogenomics approaches have major public health relevance in translating information from the sequence-based survey to support timely and informed countermeasures. 
Polymorphisms identified in this work offer robust phylogenetic signals that index both short- and long-term evolution and can complement currently employed typing schemes for outbreak ex- and inclusion, diagnostics, surveillance, and forensic studies. PMID:27446025
Whelan, Nathan V; Halanych, Kenneth M
2017-03-01
As phylogenetic datasets have increased in size, site-heterogeneous substitution models such as CAT-F81 and CAT-GTR have been advocated in favor of other models because they purportedly suppress long-branch attraction (LBA). These models are two of the most commonly used models in phylogenomics, and they have been applied to a variety of taxa, ranging from Drosophila to land plants. However, many arguments in favor of CAT models have been based on tenuous assumptions about the true phylogeny, rather than rigorous testing with known trees via simulation. Moreover, CAT models have not been compared to other approaches for handling substitutional heterogeneity such as data partitioning with site-homogeneous substitution models. We simulated amino acid sequence datasets with substitutional heterogeneity on a variety of tree shapes including those susceptible to LBA. Data were analyzed with both CAT models and partitioning to explore model performance; in total over 670,000 CPU hours were used, of which over 97% was spent running analyses with CAT models. In many cases, all models recovered branching patterns that were identical to the known tree. However, CAT-F81 consistently performed worse than other models in inferring the correct branching patterns, and both CAT models often overestimated substitutional heterogeneity. Additionally, reanalysis of two empirical metazoan datasets supports the notion that CAT-F81 tends to recover less accurate trees than data partitioning and CAT-GTR. Given these results, we conclude that partitioning and CAT-GTR perform similarly in recovering accurate branching patterns. However, computation time can be orders of magnitude less for data partitioning, with commonly used implementations of CAT-GTR often failing to reach completion in a reasonable time frame (i.e., for Bayesian analyses to converge). 
Practices such as removing constant sites and parsimony uninformative characters, or using CAT-F81 when CAT-GTR is deemed too computationally expensive, cannot be logically justified. Given clear problems with CAT-F81, phylogenies previously inferred with this model should be reassessed. [Data partitioning; phylogenomics, simulation, site-heterogeneity, substitution models.]
Zhao, Lei; Zhang, Ning; Ma, Peng-Fei; Liu, Qi; Li, De-Zhu; Guo, Zhen-Hua
2013-01-01
The BEP clade of the grass family (Poaceae) is composed of three subfamilies, i.e. Bambusoideae, Ehrhartoideae, and Pooideae. Controversies on the phylogenetic relationships among the three subfamilies still persist in spite of great efforts. However, previous evidence was mainly provided from plastid genes with only a few nuclear genes utilized. Given the different evolutionary histories recorded by plastid and nuclear genes, it is indispensable to uncover their relationships based on nuclear genes. Here, eleven species with fully sequenced genomes and six species with transcriptomic data were included in this study. A total of 121 one-to-one orthologous groups (OGs) were identified and phylogenetic trees were reconstructed by different tree-building methods. Genes which might have undergone positive selection and played important roles in adaptive evolution were also investigated from 314 and 173 one-to-one OGs in two bamboo species and 14 grass species, respectively. Our results support the ((B, P) E) topology with high support values. Our findings also indicate that 24 and nine orthologs from two bamboo species and 14 grass species, respectively, show statistically significant evidence of positive selection and are mainly involved in abiotic and biotic stress response, reproduction and development, plant metabolism, and enzymes. In summary, this study demonstrates the power of the phylogenomic approach to shed light on the evolutionary relationships within the BEP clade, and offers valuable insights into adaptive evolution of the grass family. PMID:23734211
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhang, Weiwen; Culley, David E.; Gritsenko, Marina A.
2006-11-03
In the previous study, the whole-genome gene expression profiles of D. vulgaris in response to oxidative stress and heat shock were determined. The results showed 24-28% of the responsive genes were hypothetical proteins that have not been experimentally characterized or whose function cannot be deduced by simple sequence comparison. To further explore the protective mechanisms employed in D. vulgaris against oxidative stress and heat shock, an attempt was made in this study to infer functions of these hypothetical proteins by phylogenomic profiling along with detailed sequence comparison against various publicly available databases. By this approach we were able to assign possible functions to 25 responsive hypothetical proteins. The findings included that DVU0725, induced by oxidative stress, may be involved in lipopolysaccharide biosynthesis, implying that the alteration of lipopolysaccharide on the cell surface might serve as a mechanism against oxidative stress in D. vulgaris. In addition, two responsive proteins, DVU0024 encoding a putative transcriptional regulator and DVU1670 encoding a predicted redox protein, shared co-evolution patterns with rubrerythrin in Archaeoglobus fulgidus and Clostridium perfringens, respectively, implying that they might be part of the stress response and protective systems in D. vulgaris. The study demonstrated that phylogenomic profiling is a useful tool in interpretation of experimental genomics data, and also provided further insight on cellular response to oxidative stress and heat shock in D. vulgaris.
Ruane, Sara; Raxworthy, Christopher J; Lemmon, Alan R; Lemmon, Emily Moriarty; Burbrink, Frank T
2015-10-12
Using molecular data generated by high throughput next generation sequencing (NGS) platforms to infer phylogeny is becoming common as costs go down and the ability to capture loci from across the genome goes up. While there is a general consensus that greater numbers of independent loci should result in more robust phylogenetic estimates, few studies have compared phylogenies resulting from smaller datasets for commonly used genetic markers with the large datasets captured using NGS. Here, we determine how a 5-locus Sanger dataset compares with a 377-locus anchored genomics dataset for understanding the evolutionary history of the pseudoxyrhophiine snake radiation centered in Madagascar. The Pseudoxyrhophiinae comprise ~86 % of Madagascar's serpent diversity, yet they are poorly known with respect to ecology, behavior, and systematics. Using the 377-locus NGS dataset and the summary statistics species-tree methods STAR and MP-EST, we estimated a well-supported species tree that provides new insights concerning intergeneric relationships for the pseudoxyrhophiines. We also compared how these and other methods performed with respect to estimating tree topology using datasets with varying numbers of loci. Using Sanger sequencing and an anchored phylogenomics approach, we sequenced datasets comprised of 5 and 377 loci, respectively, for 23 pseudoxyrhophiine taxa. For each dataset, we estimated phylogenies using both gene-tree (concatenation) and species-tree (STAR, MP-EST) approaches. We determined the similarity of resulting tree topologies from the different datasets using Robinson-Foulds distances. 
In addition, we examined how subsets of these data performed compared to the complete Sanger and anchored datasets for phylogenetic accuracy using the same tree inference methodologies, as well as the program *BEAST to determine if a full coalescent model for species tree estimation could generate robust results with fewer loci compared to the summary statistics species tree approaches. We also examined the individual gene trees in comparison to the 377-locus species tree using the program MetaTree. Using the full anchored dataset under a variety of methods gave us the same, well-supported phylogeny for pseudoxyrhophiines. The African pseudoxyrhophiine Duberria is the sister taxon to the Malagasy pseudoxyrhophiine genera, providing evidence for a monophyletic radiation in Madagascar. In addition, within Madagascar, the two major clades inferred correspond largely to the aglyphous and opisthoglyphous genera, suggesting that feeding specializations associated with tooth venom delivery may have played a major role in the early diversification of this radiation. The comparison of tree topologies from the concatenated and species-tree methods using different datasets indicated the 5-locus dataset cannot be used to infer a correct phylogeny for the pseudoxyrhophiines under any method tested here and that summary statistics methods require 50 or more loci to consistently recover the species tree inferred using the complete anchored dataset. However, as few as 15 loci may infer the correct topology when using the full coalescent species tree method *BEAST. MetaTree analyses of each gene tree from the Sanger and anchored datasets found that none of the individual gene trees matched the 377-locus species tree, and that no gene trees were identical with respect to topology.
Our results suggest that ≥50 loci may be necessary to confidently infer phylogenies when using summary species-tree methods, but that the coalescent-based method *BEAST consistently recovers the same topology using only 15 loci. These results reinforce that datasets with small numbers of markers may result in misleading topologies, and further, that the method of inference used to generate a phylogeny also has a major influence on the number of loci necessary to infer robust species trees.
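The Robinson-Foulds comparison mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' pipeline: the nested-tuple tree encoding and the function names `leaf_sets`, `splits`, and `robinson_foulds` are assumptions made here for illustration. The sketch counts the nontrivial bipartitions (splits) found in one unrooted topology but not the other.

```python
def leaf_sets(tree, out):
    """Return the leaf names under `tree`; append each internal node's leaf set to `out`."""
    if isinstance(tree, str):          # a leaf is a taxon name
        return frozenset([tree])
    leaves = frozenset().union(*(leaf_sets(child, out) for child in tree))
    out.append(leaves)
    return leaves


def splits(tree):
    """Collect the nontrivial bipartitions of a tree given as nested tuples."""
    clades = []
    all_leaves = leaf_sets(tree, clades)
    ref = min(all_leaves)              # fixed reference taxon to orient every split
    result = set()
    for clade in clades:
        # Store the side of the split that does NOT contain the reference taxon.
        side = clade if ref not in clade else all_leaves - clade
        if 1 < len(side) < len(all_leaves) - 1:   # skip trivial splits
            result.add(side)
    return result


def robinson_foulds(tree_a, tree_b):
    """Symmetric-difference (Robinson-Foulds) distance between two topologies."""
    return len(splits(tree_a) ^ splits(tree_b))


# Hypothetical five-taxon example: the two trees differ in one internal branch.
tree_1 = ((("A", "B"), "C"), ("D", "E"))
tree_2 = ((("A", "C"), "B"), ("D", "E"))
distance = robinson_foulds(tree_1, tree_2)
```

Both trees must contain the same taxa. A distance of 0 means identical topologies; larger values mean more conflicting internal branches.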
Phylogenomic and biogeographic reconstruction of the Trichinella complex
USDA-ARS's Scientific Manuscript database
Trichinellosis is a globally important food-borne parasitic disease of humans. It is caused by roundworms of the Trichinella complex. Extensive biodiversity is reflected in substantial ecological and genetic variability within and among taxa, and major controversy surrounds the systematics of this c...
Phylogenomics of the Zygomycete lineages: Exploring phylogeny and genome evolution
USDA-ARS's Scientific Manuscript database
The Zygomycete lineages mark the major transition from zoosporic life histories of the common ancestors of Fungi and the earliest diverging chytrid lineages (Chytridiomycota and Blastocladiomycota). Genome comparisons from these lineages may reveal gene content changes that reflect the transition to...
Duvall, Melvin R; Yadav, Shrirang R; Burke, Sean V; Wysocki, William P
2017-02-01
We investigated the little-studied Arundinoideae/Micrairoideae clade of grasses with an innovative plastome phylogenomic approach. This method gives robust results for taxa of uncertain phylogenetic placement. Arundinoideae comprise ∼45 species, although historically the subfamily was much larger. Arundinoideae is notable for the widely invasive Phragmites australis. Micrairoideae comprise nine genera and ∼200 species. Some are threatened with extinction, including Hubbardia, some Isachne spp., and Limnopoa. Two micrairoid genera, Eriachne and Pheidochloa, exhibit C4 photosynthesis in this otherwise C3 subfamily and represent an independent origin of the C4 pathway among grasses. Five new plastomes were sequenced with next-generation sequencing-by-synthesis methods. Plastomes were assembled by de novo methods and phylogenetically analyzed with eight other recently published arundinoid or micrairoid plastomes and 11 outgroup species. Stable carbon isotope ratios were determined for micrairoid and arundinoid species to investigate ambiguities in the proxy evidence for C4 photosynthesis. Phylogenomic analyses showed strong support for ingroup nodes in the Arundinoideae/Micrairoideae subtree, including a paraphyletic clade of Hubbardieae with Isachneae. Anatomical, biochemical, and positively selected sites data are ambiguous with regard to the photosynthetic pathways in Micrairoideae. Species of Hubbardia, Isachne, and Limnopoa were definitively shown by δ13C measurements to be C3 and Eriachne to be C4. Our plastome phylogenomic analyses for Micrairoideae are the first phylogenetic results to indicate paraphyly between Isachneae and Hubbardieae. The definitive δ13C data for four genera of Micrairoideae indicate the breadth of variation possible in the proxy evidence for photosynthetic pathways of both C3 and C4 taxa. © 2017 Duvall et al. Published by the Botanical Society of America. This work is licensed under a Creative Commons Attribution License (CC-BY-NC).
Guilliams, C Matt; Hasenstab-Lehman, Kristen E; Mabry, Makenzie E; Simpson, Michael G
2017-11-23
American amphitropical disjunction (AAD) is an important but understudied New World biogeographic pattern in which related plants occur in extratropical North America and South America, but are absent in the intervening tropics. Subtribe Amsinckiinae (Boraginaceae) is one of the richest groups of plants displaying the AAD pattern. Here, we infer a time-calibrated molecular phylogeny of the group to evaluate the number, timing, and directionality of AAD events, which yields generalizable insights into the mechanism of AAD. We perform a phylogenomic analysis of 139 samples of subtribe Amsinckiinae and infer divergence times using two calibration schemes: with only fossil calibrations and with fossils plus a secondary calibration from a recent family level analysis. Biogeographic analysis was performed in the R package BioGeoBEARS. We document 18 examples of AAD in the Amsinckiinae. Inferred divergence times of these AAD examples were strongly asynchronous, ranging from Miocene (17.1 million years ago [Ma]) to Pleistocene (0.33 Ma), with most (12) occurring <5 Ma. Four events occurred 10-5 Ma, during the second rise of the Andes. All AAD examples had a North America to South America directionality. Second only to the hyperdiverse Poaceae in number of documented AAD examples, the Amsinckiinae is an ideal system for the study of AAD. Asynchronous divergence times support the hypothesis of long-distance dispersal by birds as the mechanism of AAD in the subtribe and more generally. Further comparative phylogenomic studies may permit biogeographic hypothesis testing and examination of the relationship between AAD and fruit morphology, reproductive biology, and ploidy. © 2017 Botanical Society of America.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chai, Juanjuan; Kora, Guruprasad; Ahn, Tae-Hyuk
2014-10-09
To supply some background, phylogenetic studies have provided detailed knowledge on the evolutionary mechanisms of genes and species in Bacteria and Archaea. However, the evolution of cellular functions, represented by metabolic pathways and biological processes, has not been systematically characterized. Many clades in the prokaryotic tree of life have now been covered by sequenced genomes in GenBank. This enables a large-scale functional phylogenomics study of many computationally inferred cellular functions across all sequenced prokaryotes. Our results show a total of 14,727 GenBank prokaryotic genomes were re-annotated using a new protein family database, UniFam, to obtain consistent functional annotations for accurate comparison. The functional profile of a genome was represented by the biological process Gene Ontology (GO) terms in its annotation. The GO term enrichment analysis differentiated the functional profiles between selected archaeal taxa. 706 prokaryotic metabolic pathways were inferred from these genomes using Pathway Tools and MetaCyc. The consistency between the distribution of metabolic pathways in the genomes and the phylogenetic tree of the genomes was measured using parsimony scores and retention indices. The ancestral functional profiles at the internal nodes of the phylogenetic tree were reconstructed to track the gains and losses of metabolic pathways in evolutionary history. In conclusion, our functional phylogenomics analysis shows divergent functional profiles of taxa and clades. Such function-phylogeny correlation stems from a set of clade-specific cellular functions with low parsimony scores. On the other hand, many cellular functions are sparsely dispersed across many clades with high parsimony scores. These different types of cellular functions have distinct evolutionary patterns reconstructed from the prokaryotic tree.
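As a rough illustration of how a parsimony score and a retention index can measure the fit between a presence/absence trait (such as a metabolic pathway) and a tree, here is a minimal sketch. It assumes a rooted, strictly bifurcating tree encoded as nested tuples and uses Fitch parsimony; the abstract does not state which parsimony variant or software was used, so the functions `fitch` and `retention_index` and the toy data are hypothetical.

```python
from collections import Counter


def fitch(tree, states):
    """Return (candidate state set, minimum number of changes) for a subtree."""
    if isinstance(tree, str):                 # leaf: its observed state, no changes
        return {states[tree]}, 0
    left, right = tree                        # assumes a strictly binary tree
    left_set, left_steps = fitch(left, states)
    right_set, right_steps = fitch(right, states)
    common = left_set & right_set
    if common:                                # children agree: no extra change
        return common, left_steps + right_steps
    return left_set | right_set, left_steps + right_steps + 1


def retention_index(tree, states):
    """RI = (g - s) / (g - m); returns None for an uninformative character."""
    s = fitch(tree, states)[1]                # observed parsimony score
    counts = Counter(states.values())
    m = len(counts) - 1                       # fewest changes on any tree
    g = len(states) - max(counts.values())    # most changes on any tree
    if g == m:
        return None
    return (g - s) / (g - m)


# Hypothetical six-genome tree and two pathway presence (1) / absence (0) profiles.
tree = ((("A", "B"), ("C", "D")), ("E", "F"))
clade_specific = {"A": 1, "B": 1, "C": 0, "D": 0, "E": 0, "F": 0}
scattered = {"A": 1, "B": 0, "C": 1, "D": 0, "E": 0, "F": 0}

score_clade = fitch(tree, clade_specific)[1]
score_scattered = fitch(tree, scattered)[1]
ri_clade = retention_index(tree, clade_specific)
ri_scattered = retention_index(tree, scattered)
```

In this toy case the clade-specific profile needs fewer changes and has a higher retention index than the scattered one, which matches the abstract's contrast between clade-specific and sparsely dispersed functions.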
Walker, Joseph F; Yang, Ya; Feng, Tao; Timoneda, Alfonso; Mikenas, Jessica; Hutchison, Vera; Edwards, Caroline; Wang, Ning; Ahluwalia, Sonia; Olivieri, Julia; Walker-Hale, Nathanael; Majure, Lucas C; Puente, Raúl; Kadereit, Gudrun; Lauterbach, Maximilian; Eggli, Urs; Flores-Olvera, Hilda; Ochoterena, Helga; Brockington, Samuel F; Moore, Michael J; Smith, Stephen A
2018-03-01
The Caryophyllales contain ~12,500 species and are known for their cosmopolitan distribution, convergence of trait evolution, and extreme adaptations. Some relationships within the Caryophyllales, like those of many large plant clades, remain unclear, and phylogenetic studies often recover alternative hypotheses. We explore the utility of broad and dense transcriptome sampling across the order for resolving evolutionary relationships in Caryophyllales. We generated 84 transcriptomes and combined these with 224 publicly available transcriptomes to perform a phylogenomic analysis of Caryophyllales. To overcome the computational challenge of ortholog detection in such a large data set, we developed an approach for clustering gene families that allowed us to analyze >300 transcriptomes and genomes. We then inferred the species relationships using multiple methods and performed gene-tree conflict analyses. Our phylogenetic analyses resolved many clades with strong support, but also showed significant gene-tree discordance. This discordance is not only a common feature of phylogenomic studies, but also represents an opportunity to understand processes that have structured phylogenies. We also found taxon sampling influences species-tree inference, highlighting the importance of more focused studies with additional taxon sampling. Transcriptomes are useful both for species-tree inference and for uncovering evolutionary complexity within lineages. Through analyses of gene-tree conflict and multiple methods of species-tree inference, we demonstrate that phylogenomic data can provide unparalleled insight into the evolutionary history of Caryophyllales. We also discuss a method for overcoming computational challenges associated with homolog clustering in large data sets. © 2018 The Authors. American Journal of Botany is published by Wiley Periodicals, Inc. on behalf of the Botanical Society of America.
The First Mitochondrial Genome for Caddisfly (Insecta: Trichoptera) with Phylogenetic Implications
Wang, Yuyu; Liu, Xingyue; Yang, Ding
2014-01-01
The Trichoptera (caddisflies) is a holometabolous insect order with 14,300 described species forming the second most species-rich monophyletic group of animals in freshwater. Hitherto, there is no mitochondrial genome reported of this order. Herein, we describe the complete mitochondrial (mt) genome of a caddisfly species, Eubasilissa regina (McLachlan, 1871). A phylogenomic analysis was carried out based on the mt genomic sequences of 13 mt protein coding genes (PCGs) and two rRNA genes of 24 species belonging to eight holometabolous orders. Both maximum likelihood and Bayesian inference analyses highly support the sister relationship between Trichoptera and Lepidoptera. PMID:24391451
Li, De-Zhu
2011-01-01
Background: Bambusoideae is the only subfamily that contains woody members in the grass family, Poaceae. In phylogenetic analyses, Bambusoideae, Pooideae and Ehrhartoideae formed the BEP clade, yet the internal relationships of this clade are controversial. The distinctive life history (infrequent flowering and predominance of asexual reproduction) of woody bamboos makes them an interesting but taxonomically difficult group. Phylogenetic analyses based on large DNA fragments could only provide a moderate resolution of woody bamboo relationships, although a robust phylogenetic tree is needed to elucidate their evolutionary history. Phylogenomics is an alternative choice for resolving difficult phylogenies. Methodology/Principal Findings: Here we present the complete nucleotide sequences of six woody bamboo chloroplast (cp) genomes using Illumina sequencing. These genomes are similar to those of other grasses and rather conservative in evolution. We constructed a phylogeny of Poaceae from 24 complete cp genomes including 21 grass species. Within the BEP clade, we found strong support for a sister relationship between Bambusoideae and Pooideae. In a substantial improvement over prior studies, all six nodes within Bambusoideae were supported with ≥0.95 posterior probability from Bayesian inference and 5/6 nodes resolved with 100% bootstrap support in maximum parsimony and maximum likelihood analyses. We found that repeats in the cp genome could provide phylogenetic information, while caution is needed when using indels in phylogenetic analyses based on few selected genes. We also identified relatively rapidly evolving cp genome regions that have the potential to be used for further phylogenetic study in Bambusoideae. Conclusions/Significance: The cp genome of Bambusoideae evolved slowly, and phylogenomics based on whole cp genome could be used to resolve major relationships within the subfamily.
The difficulty in resolving the diversification among three clades of temperate woody bamboos, even with complete cp genome sequences, suggests that these lineages may have diverged very rapidly. PMID:21655229
Phylogenomics of zygomycete fungi: impacts on a phylogenetic classification of Kingdom Fungi
USDA-ARS's Scientific Manuscript database
The zygomycetous fungi (“zygomycetes”) mark the major transition from zoosporic life histories of the common ancestor of Fungi and the earliest diverging chytrid lineages (Chytridiomycota and Blastocladiomycota). Their ecological and economic importance range from the earliest documented symbionts o...
Sun, Zhihong; Harris, Hugh M B; McCann, Angela; Guo, Chenyi; Argimón, Silvia; Zhang, Wenyi; Yang, Xianwei; Jeffery, Ian B; Cooney, Jakki C; Kagawa, Todd F; Liu, Wenjun; Song, Yuqin; Salvetti, Elisa; Wrobel, Agnieszka; Rasinkangas, Pia; Parkhill, Julian; Rea, Mary C; O'Sullivan, Orla; Ritari, Jarmo; Douillard, François P; Paul Ross, R; Yang, Ruifu; Briner, Alexandra E; Felis, Giovanna E; de Vos, Willem M; Barrangou, Rodolphe; Klaenhammer, Todd R; Caufield, Page W; Cui, Yujun; Zhang, Heping; O'Toole, Paul W
2015-09-29
Lactobacilli are a diverse group of species that occupy diverse nutrient-rich niches associated with humans, animals, plants and food. They are used widely in biotechnology and food preservation, and are being explored as therapeutics. Exploiting lactobacilli has been complicated by metabolic diversity, unclear species identity and uncertain relationships between them and other commercially important lactic acid bacteria. The capacity for biotransformations catalysed by lactobacilli is an untapped biotechnology resource. Here we report the genome sequences of 213 Lactobacillus strains and associated genera, and their encoded genetic catalogue for modifying carbohydrates and proteins. In addition, we describe broad and diverse presence of novel CRISPR-Cas immune systems in lactobacilli that may be exploited for genome editing. We rationalize the phylogenomic distribution of host interaction factors and bacteriocins that affect their natural and industrial environments, and mechanisms to withstand stress during technological processes. We present a robust phylogenomic framework of existing species and for classifying new species.
Papazisi, Leka; Ratnayake, Shashikala; Remortel, Brian G.; Bock, Geoffrey R.; Liang, Wei; Saeed, Alexander I.; Liu, Jia; Fleischmann, Robert D.; Kilian, Mogens; Peterson, Scott N.
2010-01-01
Here we report the use of a multi-genome DNA microarray to elucidate the genomic events associated with the emergence of the clonal variants of H. influenzae biogroup aegyptius causing Brazilian Purpuric Fever (BPF), an important pediatric disease with a high mortality rate. We performed directed genome sequencing of strain HK1212 unique loci to construct a species DNA microarray. Comparative genome hybridization using this microarray enabled us to determine and compare gene complements, and infer reliable phylogenomic relationships among members of the species. The higher genomic variability observed in the genomes of BPF-related strains (clones) and their close relatives may be characterized by significant gene flux related to a subset of functional role categories. We found that the acquisition of a large number of virulence determinants featuring numerous cell membrane proteins coupled to the loss of genes involved in transport, central biosynthetic pathways and in particular, energy production pathways to be characteristics of the BPF genomic variants. PMID:20654709
Phylogenomic evolutionary surveys of subtilase superfamily genes in fungi.
Li, Juan; Gu, Fei; Wu, Runian; Yang, JinKui; Zhang, Ke-Qin
2017-03-30
Subtilases belong to a superfamily of serine proteases which are ubiquitous in fungi and are suspected to have developed distinct functional properties to help fungi adapt to different ecological niches. In this study, we conducted a large-scale phylogenomic survey of subtilase protease genes in 83 whole genome sequenced fungal species in order to identify the evolutionary patterns and subsequent functional divergences of different subtilase families among the main lineages of the fungal kingdom. Our comparative genomic analyses of the subtilase superfamily indicated that extensive gene duplications, losses and functional diversifications have occurred in fungi, and that the four families of subtilase enzymes in fungi, including proteinase K-like, Pyrolisin, kexin and S53, have distinct evolutionary histories which may have facilitated the adaptation of fungi to a broad array of life strategies. Our study provides new insights into the evolution of the subtilase superfamily in fungi and expands our understanding of the evolution of fungi with different lifestyles.
Sun, Zhihong; Harris, Hugh M. B.; McCann, Angela; Guo, Chenyi; Argimón, Silvia; Zhang, Wenyi; Yang, Xianwei; Jeffery, Ian B; Cooney, Jakki C.; Kagawa, Todd F.; Liu, Wenjun; Song, Yuqin; Salvetti, Elisa; Wrobel, Agnieszka; Rasinkangas, Pia; Parkhill, Julian; Rea, Mary C.; O'Sullivan, Orla; Ritari, Jarmo; Douillard, François P.; Paul Ross, R.; Yang, Ruifu; Briner, Alexandra E.; Felis, Giovanna E.; de Vos, Willem M.; Barrangou, Rodolphe; Klaenhammer, Todd R.; Caufield, Page W.; Cui, Yujun; Zhang, Heping; O'Toole, Paul W.
2015-01-01
Lactobacilli are a diverse group of species that occupy diverse nutrient-rich niches associated with humans, animals, plants and food. They are used widely in biotechnology and food preservation, and are being explored as therapeutics. Exploiting lactobacilli has been complicated by metabolic diversity, unclear species identity and uncertain relationships between them and other commercially important lactic acid bacteria. The capacity for biotransformations catalysed by lactobacilli is an untapped biotechnology resource. Here we report the genome sequences of 213 Lactobacillus strains and associated genera, and their encoded genetic catalogue for modifying carbohydrates and proteins. In addition, we describe broad and diverse presence of novel CRISPR-Cas immune systems in lactobacilli that may be exploited for genome editing. We rationalize the phylogenomic distribution of host interaction factors and bacteriocins that affect their natural and industrial environments, and mechanisms to withstand stress during technological processes. We present a robust phylogenomic framework of existing species and for classifying new species. PMID:26415554
Peters, Ralph S; Meusemann, Karen; Petersen, Malte; Mayer, Christoph; Wilbrandt, Jeanne; Ziesmann, Tanja; Donath, Alexander; Kjer, Karl M; Aspöck, Ulrike; Aspöck, Horst; Aberer, Andre; Stamatakis, Alexandros; Friedrich, Frank; Hünefeld, Frank; Niehuis, Oliver; Beutel, Rolf G; Misof, Bernhard
2014-03-20
Despite considerable progress in systematics, a comprehensive scenario of the evolution of phenotypic characters in the mega-diverse Holometabola based on a solid phylogenetic hypothesis was still missing. We addressed this issue by de novo sequencing transcriptome libraries of representatives of all orders of holometabolan insects (13 species in total) and by using a previously published extensive morphological dataset. We tested competing phylogenetic hypotheses by analyzing various specifically designed sets of amino acid sequence data, using maximum likelihood (ML) based tree inference and Four-cluster Likelihood Mapping (FcLM). By maximum parsimony-based mapping of the morphological data on the phylogenetic relationships we traced evolutionary transformations at the phenotypic level and reconstructed the groundplan of Holometabola and of selected subgroups. In our analysis of the amino acid sequence data of 1,343 single-copy orthologous genes, Hymenoptera are placed as sister group to all remaining holometabolan orders, i.e., to a clade Aparaglossata, comprising two monophyletic subunits Mecopterida (Amphiesmenoptera + Antliophora) and Neuropteroidea (Neuropterida + Coleopterida). The monophyly of Coleopterida (Coleoptera and Strepsiptera) remains ambiguous in the analyses of the transcriptome data, but appears likely based on the morphological data. Highly supported relationships within Neuropterida and Antliophora are Raphidioptera + (Neuroptera + monophyletic Megaloptera), and Diptera + (Siphonaptera + Mecoptera). ML tree inference and FcLM yielded largely congruent results. However, FcLM, which was applied here for the first time to large phylogenomic supermatrices, displayed additional signal in the datasets that was not identified in the ML trees. Our phylogenetic results imply that an orthognathous larva belongs to the groundplan of Holometabola, with compound eyes and well-developed thoracic legs, externally feeding on plants or fungi. 
Ancestral larvae of Aparaglossata were prognathous, equipped with single larval eyes (stemmata), and possibly agile and predacious. Ancestral holometabolan adults likely resembled in their morphology the groundplan of adult neopteran insects. Within Aparaglossata, the adult's flight apparatus and ovipositor underwent strong modifications. We show that the combination of well-resolved phylogenies obtained by phylogenomic analyses and well-documented extensive morphological datasets is an appropriate basis for reconstructing complex morphological transformations and for the inference of evolutionary histories.
USDA-ARS's Scientific Manuscript database
We describe new methods for characterizing gene tree discordance in phylogenomic datasets, which screen for deviations from neutral expectations, summarize variation in statistical support among gene trees, and allow comparison of the patterns of discordance induced by various analysis choices. Usin...
Resolving the Mortierellaceae phylogeny through Multi-Locus Sequence Typing (MLST) and phylogenomics
USDA-ARS's Scientific Manuscript database
The Mortierellaceae (Mortierellomycotina) are a diverse family of fungi that are of evolutionary and ecological relevance. They are the closest lineage to the arbuscular mycorrhizae (Glomeromycotina) and include some of the first species to evolve fruiting body production. The Mortierellaceae are es...
Resolving the Evolution of Extant and Extinct Ruminants With High-Throughput Phylogenomics
USDA-ARS's Scientific Manuscript database
The Pecorans (higher ruminants) are believed to have rapidly speciated in the Mid-Eocene, resulting in five distinct extant families; Antilocapridae, Giraffidae, Moschidae, Cervidae, and Bovidae. Due to the rapid radiation, the Pecoran phylogeny has proven difficult to resolve and eleven of the fift...
Patterns and processes of Mycobacterium bovis evolution revealed by phylogenomic analyses
USDA-ARS's Scientific Manuscript database
Mycobacterium bovis is an important animal pathogen worldwide that parasitizes wild and domesticated vertebrate livestock as well as humans. A comparison of the five M. bovis complete genomes from UK, South Korea, Brazil and USA revealed four novel large-scale structural variations of at least 2,000...
Insights into transcriptomes of Big and Low sagebrush
Mark D. Huynh; Justin T. Page; Bryce A. Richardson; Joshua A. Udall
2015-01-01
We report the sequencing and assembly of three transcriptomes from Big (Artemisia tridentata ssp. wyomingensis and A. tridentata ssp. tridentata) and Low (A. arbuscula ssp. arbuscula) sagebrush. The sequence reads are available in the Sequence Read Archive of NCBI. We demonstrate the utilities of these transcriptomes for gene discovery and phylogenomic analysis. An...
HIGH-THROUGHPUT PHYLOGENOMICS: FROM ANCIENT DNA TO SIGNATURES OF HUMAN ANIMAL HUSBANDRY
USDA-ARS's Scientific Manuscript database
We utilized the Illumina BovineSNP50 BeadChip with 54,693 single nucleotide polymorphism loci developed for Bos taurus taurus to rapidly genotype 677 individuals representing 61 Pecoran (horned ruminant) species diverged by up to 29 million years. We produced a completely bifurcating tree, the first...
Lischer, Heidi E L; Excoffier, Laurent; Heckel, Gerald
2014-04-01
Phylogenetic reconstruction of the evolutionary history of closely related organisms may be difficult because of the presence of unsorted lineages and of a relatively high proportion of heterozygous sites that are usually not handled well by phylogenetic programs. Genomic data may provide enough fixed polymorphisms to resolve phylogenetic trees, but the diploid nature of sequence data remains analytically challenging. Here, we performed a phylogenomic reconstruction of the evolutionary history of the common vole (Microtus arvalis) with a focus on the influence of heterozygosity on the estimation of intraspecific divergence times. We used genome-wide sequence information from 15 voles distributed across the European range. We provide a novel approach to integrate heterozygous information in existing phylogenetic programs by repeated random haplotype sampling from sequences with multiple unphased heterozygous sites. We evaluated the impact of the use of full, partial, or no heterozygous information for tree reconstructions on divergence time estimates. All results consistently showed four deep and strongly supported evolutionary lineages in the vole data. These lineages undergoing divergence processes split only at the end or after the last glacial maximum based on calibration with radiocarbon-dated paleontological material. However, the incorporation of information from heterozygous sites had a significant impact on absolute and relative branch length estimations. Ignoring heterozygous information led to an overestimation of divergence times between the evolutionary lineages of M. arvalis. We conclude that the exclusion of heterozygous sites from evolutionary analyses may cause biased and misleading divergence time estimates in closely related taxa.
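The repeated random haplotype sampling described in the abstract above might look roughly like the following sketch. It assumes that unphased heterozygous sites are encoded with two-base IUPAC ambiguity codes and that each site is resolved independently; the abstract does not give these details, so the encoding and the function names `sample_haplotype` and `sample_replicates` are assumptions for illustration only.

```python
import random

# Two-base IUPAC ambiguity codes and the alleles each one stands for.
IUPAC_HET = {"R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC"}


def sample_haplotype(seq, rng):
    """Resolve every heterozygous site to one of its two alleles at random."""
    return "".join(
        rng.choice(IUPAC_HET[base]) if base in IUPAC_HET else base
        for base in seq
    )


def sample_replicates(sequences, n_replicates, seed=0):
    """Build `n_replicates` haploid alignments from unphased diploid sequences."""
    rng = random.Random(seed)          # seeded so that replicates are reproducible
    return [
        {name: sample_haplotype(seq, rng) for name, seq in sequences.items()}
        for _ in range(n_replicates)
    ]


# Hypothetical unphased genotypes for two individuals.
diploid = {"vole_1": "ACRTYG", "vole_2": "ACGTCG"}
replicates = sample_replicates(diploid, n_replicates=10)
```

Each replicate alignment can then be passed to an ordinary phylogenetic program, and the spread of results across replicates reflects the uncertainty contributed by the heterozygous sites.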
Genomic differentiation among wild cyanophages despite widespread horizontal gene transfer
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gregory, Ann C.; Solonenko, Sergei A.; Ignacio-Espinoza, J. Cesar
Genetic recombination is a driving force in genome evolution. Among viruses it has a dual role. For genomes with higher fitness, it maintains genome integrity in the face of high mutation rates. Conversely, for genomes with lower fitness, it provides immediate access to sequence space that cannot be reached by mutation alone. How recombination impacts the cohesion and dissolution of individual whole genomes within viral sequence space is poorly understood across double-stranded DNA bacteriophages (a.k.a. phages) due to the challenges of obtaining appropriately scaled genomic datasets. In this study we explore the role of recombination in both maintaining and differentiating whole genomes of 142 wild double-stranded DNA marine cyanophages. Phylogenomic analysis across the 51 core genes revealed ten lineages, six of which were well represented. These phylogenomic lineages represent discrete genotypic populations based on comparisons of intra- and inter-lineage shared gene content, genome-wide average nucleotide identity, as well as detected gaps in the distribution of pairwise differences between genomes. McDonald-Kreitman selection tests identified putative niche-differentiating genes under positive selection that differed across the six well-represented genotypic populations and that may have driven initial divergence. Concurrent with patterns of recombination of discrete populations, recombination analyses of both genic and intergenic regions largely revealed decreased genetic exchange across individual genomes between populations relative to within populations. Lastly, these findings suggest that discrete double-stranded DNA marine cyanophage populations occur in nature and are maintained by patterns of recombination akin to those observed in bacteria, archaea and in sexual eukaryotes.
Terán, Lucrecia C; Coeuret, Gwendoline; Raya, Raúl; Zagorec, Monique; Champomier-Vergès, Marie-Christine; Chaillou, Stéphane
2018-06-01
Lactobacillus curvatus is a lactic acid bacterium encountered in many different types of fermented food (meat, seafood, vegetables, and cereals). Although this species plays an important role in the preservation of these foods, few attempts have been made to assess its genomic diversity. This study uses comparative analyses of 13 published genomes (complete or draft) to better understand the evolutionary processes acting on the genome of this species. Phylogenomic analysis, based on a coalescent model of evolution, revealed that the 6,742 sites of single nucleotide polymorphism within the L. curvatus core genome delineate two major groups, with lineage 1 represented by the newly sequenced strain FLEC03, and lineage 2 represented by the type-strain DSM20019. The two lineages could also be distinguished by the content of their accessory genome, which sheds light on a long-term evolutionary process of lineage-dependent genetic acquisition and the possibility of population structure. Interestingly, one clade from lineage 2 shared more accessory genes with strains of lineage 1 than with other strains of lineage 2, indicating recent convergence in carbohydrate catabolism. Both lineages had a wide repertoire of accessory genes involved in the fermentation of plant-derived carbohydrates that are released from polymers of α/β-glucans, α/β-fructans, and N-acetylglucosan. Other gene clusters were distributed among strains according to the type of food from which the strains were isolated. These results give new insight into the ecological niches in which L. curvatus may naturally thrive (such as silage or compost heaps) in addition to fermented food.
Genomic differentiation among wild cyanophages despite widespread horizontal gene transfer
Gregory, Ann C.; Solonenko, Sergei A.; Ignacio-Espinoza, J. Cesar; ...
2016-11-16
Genetic recombination is a driving force in genome evolution. Among viruses it has a dual role. For genomes with higher fitness, it maintains genome integrity in the face of high mutation rates. Conversely, for genomes with lower fitness, it provides immediate access to sequence space that cannot be reached by mutation alone. How recombination impacts the cohesion and dissolution of individual whole genomes within viral sequence space is poorly understood across double-stranded DNA bacteriophages (a.k.a. phages) due to the challenges of obtaining appropriately scaled genomic datasets. In this study, we explore the role of recombination in both maintaining and differentiating whole genomes of 142 wild double-stranded DNA marine cyanophages. Phylogenomic analysis across the 51 core genes revealed ten lineages, six of which were well represented. These phylogenomic lineages represent discrete genotypic populations based on comparisons of intra- and inter-lineage shared gene content, genome-wide average nucleotide identity, as well as detected gaps in the distribution of pairwise differences between genomes. McDonald-Kreitman selection tests identified putative niche-differentiating genes under positive selection that differed across the six well-represented genotypic populations and that may have driven initial divergence. Concurrent with patterns of recombination of discrete populations, recombination analyses of both genic and intergenic regions largely revealed decreased genetic exchange across individual genomes between relative to within populations. Lastly, these findings suggest that discrete double-stranded DNA marine cyanophage populations occur in nature and are maintained by patterns of recombination akin to those observed in bacteria, archaea and in sexual eukaryotes.
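The McDonald-Kreitman test mentioned in this abstract compares the ratio of nonsynonymous to synonymous changes fixed between populations with the same ratio among polymorphisms within them. The following is a minimal sketch of that calculation, not the authors' pipeline: the function names and the example counts are hypothetical, and the 2x2 table is evaluated with a hand-written two-sided Fisher's exact test so that the block needs only the standard library.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum over every table that is no more probable than the observed one.
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_obs * (1 + 1e-9)
    )


def mcdonald_kreitman(dn, ds, pn, ps):
    """Return (neutrality index, alpha, p-value).

    dn, ds: fixed nonsynonymous / synonymous differences between populations.
    pn, ps: nonsynonymous / synonymous polymorphisms within populations.
    Assumes dn, ds and ps are all non-zero (no pseudocount correction).
    """
    neutrality_index = (pn / ps) / (dn / ds)
    alpha = 1 - neutrality_index  # estimated share of adaptive substitutions
    return neutrality_index, alpha, fisher_exact_two_sided(dn, ds, pn, ps)


# Hypothetical counts for a single gene, chosen only for illustration.
ni, alpha, p_value = mcdonald_kreitman(dn=7, ds=17, pn=2, ps=42)
print(f"NI={ni:.3f} alpha={alpha:.3f} p={p_value:.4f}")
```

A neutrality index below 1 together with a small p-value indicates an excess of fixed nonsynonymous differences, which is the pattern usually read as positive selection.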
The construction of an EST database for Bombyx mori and its application
Mita, Kazuei; Morimyo, Mitsuoki; Okano, Kazuhiro; Koike, Yoshiko; Nohata, Junko; Kawasaki, Hideki; Kadono-Okuda, Keiko; Yamamoto, Kimiko; Suzuki, Masataka G.; Shimada, Toru; Goldsmith, Marian R.; Maeda, Susumu
2003-01-01
To build a foundation for the complete genome analysis of Bombyx mori, we have constructed an EST database. Because gene expression patterns deeply depend on tissues as well as developmental stages, we analyzed many cDNA libraries prepared from various tissues and different developmental stages to cover the entire set of Bombyx genes. So far, the Bombyx EST database contains 35,000 ESTs from 36 cDNA libraries, which are grouped into ≈11,000 nonredundant ESTs with the average length of 1.25 kb. The comparison with FlyBase suggests that the present EST database, SilkBase, covers >55% of all genes of Bombyx. The fraction of library-specific ESTs in each cDNA library indicates that we have not yet reached saturation, showing the validity of our strategy for constructing an EST database to cover all genes. To tackle the coming saturation problem, we have checked two methods, subtraction and normalization, to increase coverage and decrease the number of housekeeping genes, resulting in a 5–11% increase of library-specific ESTs. The identification of a number of genes and comprehensive cloning of gene families have already emerged from the SilkBase search. Direct links of SilkBase with FlyBase and WormBase provide ready identification of candidate Lepidoptera-specific genes. PMID:14614147
Building the tree of life from scratch: an end-to-end work flow for phylogenomic studies
USDA-ARS's Scientific Manuscript database
Whole genome sequences are rich sources of information about organisms that are superbly useful for addressing a wide variety of evolutionary questions. Recent progress in genomics has enabled the de novo decoding of the genome of virtually any organism, greatly expanding its potential for understan...
Papazisi, Leka; Ratnayake, Shashikala; Remortel, Brian G; Bock, Geoffrey R; Liang, Wei; Saeed, Alexander I; Liu, Jia; Fleischmann, Robert D; Kilian, Mogens; Peterson, Scott N
2010-11-01
Here we report the use of a multi-genome DNA microarray to elucidate the genomic events associated with the emergence of the clonal variants of Haemophilus influenzae biogroup aegyptius causing Brazilian Purpuric Fever (BPF), an important pediatric disease with a high mortality rate. We performed directed genome sequencing of strain HK1212 unique loci to construct a species DNA microarray. Comparative genome hybridization using this microarray enabled us to determine and compare gene complements, and infer reliable phylogenomic relationships among members of the species. The higher genomic variability observed in the genomes of BPF-related strains (clones) and their close relatives may be characterized by significant gene flux related to a subset of functional role categories. We found that the acquisition of a large number of virulence determinants featuring numerous cell membrane proteins, coupled to the loss of genes involved in transport, central biosynthetic pathways and, in particular, energy production pathways, is characteristic of the BPF genomic variants. Copyright © 2010 Elsevier Inc. All rights reserved.
Hagopian, Raffi; Davidson, John R; Datta, Ruchira S; Samad, Bushra; Jarvis, Glen R; Sjölander, Kimmen
2010-07-01
We present the jump-start simultaneous alignment and tree construction using hidden Markov models (SATCHMO-JS) web server for simultaneous estimation of protein multiple sequence alignments (MSAs) and phylogenetic trees. The server takes as input a set of sequences in FASTA format, and outputs a phylogenetic tree and MSA; these can be viewed online or downloaded from the website. SATCHMO-JS is an extension of the SATCHMO algorithm, and employs a divide-and-conquer strategy to jump-start SATCHMO at a higher point in the phylogenetic tree, reducing the computational complexity of the progressive all-versus-all HMM-HMM scoring and alignment. Results on a benchmark dataset of 983 structurally aligned pairs from the PREFAB benchmark dataset show that SATCHMO-JS provides a statistically significant improvement in alignment accuracy over MUSCLE, Multiple Alignment using Fast Fourier Transform (MAFFT), ClustalW and the original SATCHMO algorithm. The SATCHMO-JS webserver is available at http://phylogenomics.berkeley.edu/satchmo-js. The datasets used in these experiments are available for download at http://phylogenomics.berkeley.edu/satchmo-js/supplementary/.
Domingos, Fabricius M C B; Colli, Guarino R; Lemmon, Alan; Lemmon, Emily Moriarty; Beheregaray, Luciano B
2017-02-01
The recognition of cryptic diversity within geographically widespread species is gradually becoming a trend in the highly speciose Neotropical biomes. The statistical methods to recognise such cryptic lineages are rapidly advancing, but have rarely been applied to genomic-scale datasets. Herein, we used phylogenomic data to investigate phylogenetic history and cryptic diversity within Tropidurus itambere, a lizard endemic to the Cerrado biodiversity hotspot. We applied a series of phylogenetic methods to reconstruct evolutionary relationships and a coalescent Bayesian species delimitation approach (BPP) to clarify species limits. The BPP results suggest that the widespread nominal taxon comprises a complex of 5 highly supported and geographically structured cryptic species. We highlight and discuss the different topological patterns recovered by concatenated and coalescent species tree methods for these closely related lineages. Finally, we suggest that the existence of cryptic lineages in the Cerrado is much more common than traditionally thought, highlighting the value of using NGS data and coalescent techniques to investigate patterns of species diversity. Copyright © 2016 Elsevier Inc. All rights reserved.
Phylogenomics of nonavian reptiles and the structure of the ancestral amniote genome
Shedlock, Andrew M.; Botka, Christopher W.; Zhao, Shaying; Shetty, Jyoti; Zhang, Tingting; Liu, Jun S.; Deschavanne, Patrick J.; Edwards, Scott V.
2007-01-01
We report results of a megabase-scale phylogenomic analysis of the Reptilia, the sister group of mammals. Large-scale end-sequence scanning of genomic clones of a turtle, alligator, and lizard reveals diverse, mammal-like landscapes of retroelements and simple sequence repeats (SSRs) not found in the chicken. Several global genomic traits, including distinctive phylogenetic lineages of CR1-like long interspersed elements (LINEs) and a paucity of A-T rich SSRs, characterize turtles and archosaur genomes, whereas higher frequencies of tandem repeats and a lower global GC content reveal mammal-like features in Anolis. Nonavian reptile genomes also possess a high frequency of diverse and novel 50-bp unit tandem duplications not found in chicken or mammals. The frequency distributions of ≈65,000 8-mer oligonucleotides suggest that rates of DNA-word frequency change are an order of magnitude slower in reptiles than in mammals. These results suggest a diverse array of interspersed and SSRs in the common ancestor of amniotes and a genomic conservatism and gradual loss of retroelements in reptiles that culminated in the minimalist chicken genome. PMID:17307883
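The "DNA-word" comparison in this abstract rests on counting how often each 8-mer occurs in a sequence. The ≈65,000 figure presumably reflects the 4^8 = 65,536 possible 8-mers over the alphabet A, C, G, T. Below is a minimal sketch of such a count; the function name and the toy sequence are hypothetical, and it does not reproduce the authors' statistical comparison between lineages.

```python
from collections import Counter

# Number of distinct 8-letter words over a 4-letter alphabet.
NUM_POSSIBLE_8MERS = 4 ** 8  # 65536


def kmer_frequencies(sequence, k=8):
    """Return the relative frequency of every k-mer observed in `sequence`.

    Windows containing anything other than A, C, G or T (for example N)
    are skipped rather than counted.
    """
    seq = sequence.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}


# Toy sequence for illustration only.
freqs = kmer_frequencies("ACGTACGTAC")
print(len(freqs), "distinct 8-mers out of", NUM_POSSIBLE_8MERS, "possible")
```

Frequency vectors built this way for two genomes can then be compared word by word, which is the kind of input a rate-of-change analysis would start from.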
Systematic Error in Seed Plant Phylogenomics
Zhong, Bojian; Deusch, Oliver; Goremykin, Vadim V.; Penny, David; Biggs, Patrick J.; Atherton, Robin A.; Nikiforova, Svetlana V.; Lockhart, Peter James
2011-01-01
Resolving the closest relatives of Gnetales has been an enigmatic problem in seed plant phylogeny. The problem is known to be difficult because of the extent of divergence between this diverse group of gymnosperms and their closest phylogenetic relatives. Here, we investigate the evolutionary properties of conifer chloroplast DNA sequences. To improve taxon sampling of Cupressophyta (non-Pinaceae conifers), we report sequences from three new chloroplast (cp) genomes of Southern Hemisphere conifers. We have applied a site pattern sorting criterion to study compositional heterogeneity, heterotachy, and the fit of conifer chloroplast genome sequences to a general time reversible + G substitution model. We show that non-time reversible properties of aligned sequence positions in the chloroplast genomes of Gnetales mislead phylogenetic reconstruction of these seed plants. When 2,250 of the most varied sites in our concatenated alignment are excluded, phylogenetic analyses favor a close evolutionary relationship between the Gnetales and Pinaceae—the Gnepine hypothesis. Our analytical protocol provides a useful approach for evaluating the robustness of phylogenomic inferences. Our findings highlight the importance of goodness of fit between substitution model and data for understanding seed plant phylogeny. PMID:22016337
An Expanded Genomic Representation of the Phylum Cyanobacteria
Soo, Rochelle M.; Skennerton, Connor T.; Sekiguchi, Yuji; Imelfort, Michael; Paech, Samuel J.; Dennis, Paul G.; Steen, Jason A.; Parks, Donovan H.; Tyson, Gene W.; Hugenholtz, Philip
2014-01-01
Molecular surveys of aphotic habitats have indicated the presence of major uncultured lineages phylogenetically classified as members of the Cyanobacteria. One of these lineages has recently been proposed as a nonphotosynthetic sister phylum to the Cyanobacteria, the Melainabacteria, based on recovery of population genomes from human gut and groundwater samples. Here, we expand the phylogenomic representation of the Melainabacteria through sequencing of six diverse population genomes from gut and bioreactor samples supporting the inference that this lineage is nonphotosynthetic, but not the assertion that they are strictly fermentative. We propose that the Melainabacteria is a class within the phylogenetically defined Cyanobacteria based on robust monophyly and shared ancestral traits with photosynthetic representatives. Our findings are consistent with theories that photosynthesis occurred late in the Cyanobacteria and involved extensive lateral gene transfer and extends the recognized functionality of members of this phylum. PMID:24709563
Burke, Sean V.; Wysocki, William P.; Clark, Lynn G.
2018-01-01
The systematics of grasses has advanced through applications of plastome phylogenomics, although studies have been largely limited to subfamilies or other subgroups of Poaceae. Here we present a plastome phylogenomic analysis of 250 complete plastomes (179 genera) sampled from 44 of the 52 tribes of Poaceae. Plastome sequences were determined from high throughput sequencing libraries and the assemblies represent over 28.7 Mbases of sequence data. Phylogenetic signal was characterized in 14 partitions, including (1) complete plastomes; (2) protein coding regions; (3) noncoding regions; and (4) three loci commonly used in single and multi-gene studies of grasses. Each of the four main partitions was further refined, alternatively including or excluding positively selected codons and also the gaps introduced by the alignment. All 76 protein coding plastome loci were found to be predominantly under purifying selection, but specific codons were found to be under positive selection in 65 loci. The loci that have been widely used in multi-gene phylogenetic studies had among the highest proportions of positively selected codons, suggesting caution in the interpretation of these earlier results. Plastome phylogenomic analyses confirmed the backbone topology for Poaceae with maximum bootstrap support (BP). Among the 14 analyses, 82 clades out of 309 resolved were maximally supported in all trees. Analyses of newly sequenced plastomes were in agreement with current classifications. Five of seven partitions in which alignment gaps were removed retrieved Panicoideae as sister to the remaining PACMAD subfamilies. Alternative topologies were recovered in trees from partitions that included alignment gaps. This suggests that ambiguities in aligning these uncertain regions might introduce a false signal. 
Resolution of these and other critical branch points in the phylogeny of Poaceae will help to better understand the selective forces that drove the radiation of the BOP and PACMAD clades comprising more than 99.9% of grass diversity. PMID:29416954
USDA-ARS's Scientific Manuscript database
Virulence determines the impact a pathogen has on the fitness of its host, yet current understanding of the evolutionary origins and causes of virulence of many pathogens is surprisingly incomplete. Here, we explore the evolution of Marek’s disease virus (MDV), a herpesvirus commonly afflicting chic...
de Santana Lopes, Amanda; Pacheco, Túlio Gomes; do Nascimento Vieira, Leila; Guerra, Miguel Pedro; Nodari, Rubens Onofre; de Souza, Emanuel Maltempi; de Oliveira Pedrosa, Fábio; Rogalski, Marcelo
2018-05-23
Crambe abyssinica is an important oilseed crop that accumulates high levels of erucic acid, which is being recognized as a potential oil platform for several industrial purposes. It belongs to the family Brassicaceae, assigned within the tribe Brassiceae. Both family and tribe have been the subject of several phylogenetic studies, but the relationship between some lineages and genera remains unclear. Here, we report the complete sequencing and characterization of the C. abyssinica plastome. Plastome structure, gene order, and gene content of C. abyssinica are similar to other species of the family Brassicaceae. The only exception is the rps16 gene, which is absent in many genera within the family Brassicaceae, but seems to be functional in the tribe Brassiceae, including C. abyssinica. However, the analysis of gene divergence shows that the rps16 is the most divergent gene in C. abyssinica and within the tribe Brassiceae. In addition, species of the tribe Brassiceae also show similar SSR loci distribution, with some regions containing a high number of SSRs, which are located mainly at the single copy regions. Six hotspots of nucleotide divergence among Brassiceae species were located in the single copy regions by sliding window analysis. Brassicaceae phylogenomic analysis, based on the complete plastomes of 72 taxa, resulted in a well-supported and well-resolved tree. The genus Crambe is positioned within the Brassiceae clade together with the genera Brassica, Raphanus, Sinapis, Cakile, Orychophragmus and Sinalliaria. Moreover, we report several losses and gains of RNA editing sites that occurred in plastomes of Brassiceae species during evolution. Copyright © 2017. Published by Elsevier B.V.
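Sliding window analysis, as used in this abstract to locate hotspots of nucleotide divergence, slides a fixed-width window along an alignment and summarizes the differences inside each window. The sketch below is one simple way to do this, assuming pre-aligned sequences of equal length and using mean pairwise difference as the summary. The function name, window size and step are hypothetical, since the abstract does not state the settings used.

```python
from itertools import combinations


def sliding_window_divergence(aligned, window=600, step=200):
    """Mean pairwise proportion of differing sites per window.

    `aligned` is a list of equal-length aligned sequences; columns where
    either sequence of a pair has a gap ('-') are ignored for that pair.
    Returns a list of (window_start, divergence) tuples.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all sequences must have the same aligned length")

    results = []
    for start in range(0, length - window + 1, step):
        pair_values = []
        for a, b in combinations(aligned, 2):
            columns = [
                (x, y)
                for x, y in zip(a[start:start + window], b[start:start + window])
                if x != "-" and y != "-"
            ]
            if columns:
                pair_values.append(sum(x != y for x, y in columns) / len(columns))
        mean = sum(pair_values) / len(pair_values) if pair_values else 0.0
        results.append((start, mean))
    return results


# Toy alignment for illustration only: the second half is fully divergent.
print(sliding_window_divergence(["AAAAAAAA", "AAAATTTT"], window=4, step=4))
```

Windows whose value stands well above the rest of the alignment are the candidate divergence hotspots.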
The natural history of biocatalytic mechanisms.
Nath, Neetika; Mitchell, John B O; Caetano-Anollés, Gustavo
2014-05-01
Phylogenomic analysis of the occurrence and abundance of protein domains in proteomes has recently shown that the α/β architecture is probably the oldest fold design. This holds important implications for the origins of biochemistry. Here we explore structure-function relationships addressing the use of chemical mechanisms by ancestral enzymes. We test the hypothesis that the oldest folds used the most mechanisms. We start by tracing biocatalytic mechanisms operating in metabolic enzymes along a phylogenetic timeline of the first appearance of homologous superfamilies of protein domain structures from CATH. A total of 335 enzyme reactions were retrieved from MACiE and were mapped over fold age. We define a mechanistic step type as one of the 51 mechanistic annotations given in MACiE, and each step of each of the 335 mechanisms was described using one or more of these annotations. We find that the first two folds, the P-loop containing nucleotide triphosphate hydrolase and the NAD(P)-binding Rossmann-like homologous superfamilies, were α/β architectures responsible for introducing 35% (18/51) of the known mechanistic step types. We find that these two oldest structures in the phylogenomic analysis of protein domains introduced many mechanistic step types that were later combinatorially spread in catalytic history. The most common mechanistic step types included fundamental building blocks of enzyme chemistry: "Proton transfer," "Bimolecular nucleophilic addition," "Bimolecular nucleophilic substitution," and "Unimolecular elimination by the conjugate base." They were associated with the most ancestral fold structure typical of P-loop containing nucleotide triphosphate hydrolases. Over half of the mechanistic step types were introduced in the evolutionary timeline before the appearance of structures specific to diversified organisms, during a period of architectural diversification. 
The other half unfolded gradually after organismal diversification and during a period that spanned ∼2 billion years of evolutionary history.
Phylogenomics reveals an extensive history of genome duplication in diatoms (Bacillariophyta).
Parks, Matthew B; Nakov, Teofil; Ruck, Elizabeth C; Wickett, Norman J; Alverson, Andrew J
2018-03-01
Diatoms are one of the most species-rich lineages of microbial eukaryotes. Similarities in clade age, species richness, and primary productivity motivate comparisons to angiosperms, whose genomes have been inordinately shaped by whole-genome duplication (WGD). WGDs have been linked to speciation, increased rates of lineage diversification, and identified as a principal driver of angiosperm evolution. We synthesized a large but scattered body of evidence that suggests polyploidy may be common in diatoms as well. We used gene counts, gene trees, and distributions of synonymous divergence to carry out a phylogenomic analysis of WGD across a diverse set of 37 diatom species. Several methods identified WGDs of varying age across diatoms. Determining the occurrence, exact number, and placement of events was greatly impacted by uncertainty in gene trees. WGDs inferred from synonymous divergence of paralogs varied depending on how redundancy in transcriptomes was assessed, gene families were assembled, and synonymous distances (Ks) were calculated. Our results highlighted a need for systematic evaluation of key methodological aspects of Ks-based approaches to WGD inference. Gene tree reconciliations supported allopolyploidy as the predominant mode of polyploid formation, with strong evidence for ancient allopolyploid events in the thalassiosiroid and pennate diatom clades. Our results suggest that WGD has played a major role in the evolution of diatom genomes. We outline challenges in reconstructing paleopolyploid events in diatoms that, together with these results, offer a framework for understanding the impact of genome duplication in a group that likely harbors substantial genomic diversity. © 2018 The Authors. American Journal of Botany is published by Wiley Periodicals, Inc. on behalf of the Botanical Society of America.
Phylogenomic Analysis and Dynamic Evolution of Chloroplast Genomes in Salicaceae
Huang, Yuan; Wang, Jun; Yang, Yongping; Fan, Chuanzhu; Chen, Jiahui
2017-01-01
Chloroplast genomes of plants are highly conserved in both gene order and gene content. Analysis of the whole chloroplast genome is known to provide many more informative DNA sites and thus generates high resolution for plant phylogenies. Here, we report the complete chloroplast genomes of three Salix species in family Salicaceae. Phylogeny of Salicaceae inferred from complete chloroplast genomes is generally consistent with previous studies but resolved with higher statistical support. Incongruences of phylogeny, however, are observed in genus Populus, which most likely result from homoplasy. By comparing three Salix chloroplast genomes with the published chloroplast genomes of other Salicaceae species, we demonstrate that the synteny and length of chloroplast genomes in Salicaceae are highly conserved but have experienced dynamic evolution among species. We identify seven positively selected chloroplast genes in Salicaceae, which might be related to the adaptive evolution of Salicaceae species. Comparative chloroplast genome analysis within the family also indicates that some chloroplast genes have been lost or have become pseudogenes, from which we infer that these chloroplast genes were horizontally transferred to the nuclear genome. Based on the complete nuclear genome sequences from two Salicaceae species, we find that the entire chloroplast genome has indeed been transferred and integrated into the nuclear genome at least once in the individual used for the P. trichocarpa reference genome. This observation, along with the presence of large nuclear plastid DNA (NUPTs) and of NUPTs containing multiple chloroplast genes in their original chloroplast-genome order, favors the DNA-mediated hypothesis of organelle to nucleus DNA transfer. Overall, the phylogenomic analysis using complete chloroplast genomes clearly elucidates the phylogeny of Salicaceae. 
The identification of positively selected chloroplast genes and dynamic chloroplast-to-nucleus gene transfers in Salicaceae provide resources to better understand the successful adaptation of Salicaceae species. PMID:28676809
Longo, S J; Faircloth, B C; Meyer, A; Westneat, M W; Alfaro, M E; Wainwright, P C
2017-08-01
Phylogenetics is undergoing a revolution as large-scale molecular datasets reveal unexpected but repeatable rearrangements of clades that were previously thought to be disparate lineages. One of the most unusual clades of fishes that has been found using large-scale molecular datasets is an expanded Syngnathiformes including traditional long-snouted syngnathiform lineages (Aulostomidae, Centriscidae, Fistulariidae, Solenostomidae, Syngnathidae), as well as a diverse set of largely benthic-associated fishes (Callionymoidei, Dactylopteridae, Mullidae, Pegasidae) that were previously dispersed across three orders. The monophyly of this surprising clade of fishes has been upheld by recent studies utilizing both nuclear and mitogenomic data, but the relationships among major lineages within Syngnathiformes remain ambiguous; previous analyses have inconsistent topologies and are plagued by low support at deep divergences between the major lineages. In this study, we use a dataset of ultraconserved elements (UCEs) to conduct the first phylogenomic study of Syngnathiformes. UCEs have been effective markers for resolving deep phylogenetic relationships in fishes and, combined with increased taxon sampling, we expected UCEs to resolve problematic syngnathiform relationships. Overall, UCEs were effective at resolving relationships within Syngnathiformes at a range of evolutionary timescales. We find consistent support for the monophyly of traditional long-snouted syngnathiform lineages (Aulostomidae, Centriscidae, Fistulariidae, Solenostomidae, Syngnathidae), which better agrees with morphological hypotheses than previously published topologies from molecular data. This result was supported by all Bayesian and maximum likelihood analyses, was robust to differences in matrix completeness and potential sources of bias, and was highly supported in coalescent-based analyses in ASTRAL when matrices were filtered to contain the most phylogenetically informative loci. 
While Bayesian and maximum likelihood analyses found support for a benthic-associated clade (Callionymidae, Dactylopteridae, Mullidae, and Pegasidae) as sister to the long-snouted clade, this result was not replicated in the ASTRAL analyses. The base of our phylogeny is characterized by short internodes separating major syngnathiform lineages and is consistent with the hypothesis of an ancient rapid radiation at the base of Syngnathiformes. Syngnathiformes therefore present an exciting opportunity to study patterns of morphological variation and functional innovation arising from rapid but ancient radiation. Copyright © 2017 Elsevier Inc. All rights reserved.
Structure and Evolution of Insect Sperm: New Interpretations in the Age of Phylogenomics.
Dallai, Romano; Gottardo, Marco; Beutel, Rolf Georg
2016-01-01
This comprehensive review of the structure of sperm in all orders of insects evaluates phylogenetic implications, with the background of a phylogeny based on transcriptomes. Sperm characters strongly support several major branches of the phylogeny of insects (for instance, Cercophora, Dicondylia, and Psocodea) and also different infraordinal groups. Some closely related taxa, such as Trichoptera and Lepidoptera (Amphiesmenoptera), differ greatly in sperm structure. Sperm characters are very conservative in some groups (Heteroptera, Odonata) but highly variable in others, including Zoraptera, a small and morphologically uniform group with a tremendously accelerated rate of sperm evolution. Unusual patterns such as sperm dimorphism, the formation of bundles, or aflagellate and immotile sperm have evolved independently in several groups.
Hyb-Seq: combining target enrichment and genome skimming for plant phylogenomics
Kevin Weitemier; Shannon C.K. Straub; Richard C. Cronn; Mark Fishbein; Roswitha Schmickl; Angela McDonnell; Aaron Liston
2014-01-01
• Premise of the study: Hyb-Seq, the combination of target enrichment and genome skimming, allows simultaneous data collection for low-copy nuclear genes and high-copy genomic targets for plant systematics and evolution studies. • Methods and Results: Genome and transcriptome assemblies for milkweed (Asclepias syriaca) were used to design enrichment probes for 3385...
USDA-ARS's Scientific Manuscript database
‘Bacillus vanillea’ XY18T (=CGMCC 8629 T =NCCB 100507 T) was isolated from cured vanilla beans and involved in the formation of vanilla aroma compounds. A draft genome of this type strain was assembled and yielded a length of 3.72 Mbp and a GC content of 46.3%. Comparative genomic analysis with its ...
Sudianto, Edi; Wu, Chung-Shien; Lin, Ching-Ping; Chaw, Shu-Miaw
2016-01-01
Phylogeny of the ten Pinaceous genera has long been contentious. Plastid genomes (plastomes) provide an opportunity to resolve this problem because they contain rich evolutionary information. To comprehend the plastid phylogenomics of all ten Pinaceous genera, we sequenced the plastomes of two previously unavailable genera, Pseudolarix amabilis (122,234 bp) and Tsuga chinensis (120,859 bp). Both plastomes share a similar gene repertoire and order. Here, for the first time, we report a unique insertion of tandem repeats in accD of T. chinensis. From the 65 plastid protein-coding genes common to all Pinaceous genera, we re-examined the phylogenetic relationship among all Pinaceous genera. Our two phylogenetic trees are congruent in an identical tree topology, with the five genera of the Abietoideae subfamily constituting a monophyletic clade separate from the other three subfamilies: Pinoideae, Piceoideae, and Laricoideae. The five genera of Abietoideae were grouped into two sister clades consisting of (1) Cedrus alone and (2) two sister subclades of Pseudolarix–Tsuga and Abies–Keteleeria, with the former uniquely losing the gene psaM and the latter specifically excluding the 3′ psbA from the residual inverted repeat. PMID:27352945
Tsagkogeorga, Georgia; McGowen, Michael R; Davies, Kalina T J; Jarman, Simon; Polanowski, Andrea; Bertelsen, Mads F; Rossiter, Stephen J
2015-09-01
Recent studies have reported multiple cases of molecular adaptation in cetaceans related to their aquatic abilities. However, none of these has included the hippopotamus, precluding an understanding of whether molecular adaptations in cetaceans occurred before or after they split from their semi-aquatic sister taxa. Here, we obtained new transcriptomes from the hippopotamus and humpback whale, and analysed these together with available data from eight other cetaceans. We identified more than 11 000 orthologous genes and compiled a genome-wide dataset of 6845 coding DNA sequences among 23 mammals, to our knowledge the largest phylogenomic dataset to date for cetaceans. We found positive selection in nine genes on the branch leading to the common ancestor of hippopotamus and whales, and 461 genes in cetaceans compared to 64 in hippopotamus. Functional annotation revealed adaptations in diverse processes, including lipid metabolism, hypoxia, muscle and brain function. By combining these findings with data on protein-protein interactions, we found evidence suggesting clustering among gene products relating to nervous and muscular systems in cetaceans. We found little support for shared ancestral adaptations in the two taxa; most molecular adaptations in extant cetaceans occurred after their split with hippopotamids.
Architecture of a Species: Phylogenomics of Staphylococcus aureus.
Planet, Paul J; Narechania, Apurva; Chen, Liang; Mathema, Barun; Boundy, Sam; Archer, Gordon; Kreiswirth, Barry
2017-02-01
A deluge of whole-genome sequencing has begun to give insights into the patterns and processes of microbial evolution, but genome sequences have accrued in a haphazard manner, with biased sampling of natural variation that is driven largely by medical and epidemiological priorities. For instance, there is a strong bias for sequencing epidemic lineages of methicillin-resistant Staphylococcus aureus (MRSA) over sensitive isolates (methicillin-sensitive S. aureus: MSSA). As more diverse genomes are sequenced the emerging picture is of a highly subdivided species with a handful of relatively clonal groups (complexes) that, at any given moment, dominate in particular geographical regions. The establishment of hegemony of particular clones appears to be a dynamic process of successive waves of replacement of the previously dominant clone. Here we review the phylogenomic structure of a diverse range of S. aureus, including both MRSA and MSSA. We consider the utility of the concept of the 'core' genome and the impact of recombination and horizontal transfer. We argue that whole-genome surveillance of S. aureus populations could lead to better forecasting of antibiotic resistance and virulence of emerging clones, and a better understanding of the elusive biological factors that determine repeated strain replacement. Copyright © 2016. Published by Elsevier Ltd.
A phylogenomic data-driven exploration of viral origins and evolution
Nasir, Arshan; Caetano-Anollés, Gustavo
2015-01-01
The origin of viruses remains mysterious because of their diverse and patchy molecular and functional makeup. Although numerous hypotheses have attempted to explain viral origins, none is backed by substantive data. We take full advantage of the wealth of available protein structural and functional data to explore the evolution of the proteomic makeup of thousands of cells and viruses. Despite the extremely reduced nature of viral proteomes, we established an ancient origin of the “viral supergroup” and the existence of widespread episodes of horizontal transfer of genetic information. Viruses harboring different replicon types and infecting distantly related hosts shared many metabolic and informational protein structural domains of ancient origin that were also widespread in cellular proteomes. Phylogenomic analysis uncovered a universal tree of life and revealed that modern viruses reduced from multiple ancient cells that harbored segmented RNA genomes and coexisted with the ancestors of modern cells. The model for the origin and evolution of viruses and cells is backed by strong genomic and structural evidence and can be reconciled with existing models of viral evolution if one considers viruses to have originated from ancient cells and not from modern counterparts. PMID:26601271
Gutiérrez-García, Karina; Neira-González, Adriana; Pérez-Gutiérrez, Rosa Martha; Granados-Ramírez, Giovana; Zarraga, Ramon; Wrobel, Kazimierz; Barona-Gómez, Francisco; Flores-Cotera, Luis B
2017-07-28
2,4-Diacetylphloroglucinol (DAPG) (1) is a phenolic polyketide produced by some plant-associated Pseudomonas species, with many biological activities and ecological functions. Here, we aimed at reconstructing the natural history of DAPG using phylogenomics focused on its biosynthetic gene cluster or phl genes. In addition to around 1500 publicly available genomes, we obtained and analyzed the sequences of nine novel Pseudomonas endophytes isolated from the antidiabetic medicinal plant Piper auritum. We found that 29 organisms belonging to six Pseudomonas species contain the phl genes at different frequencies depending on the species. The evolution of the phl genes was then reconstructed, leading to at least two clades postulated to correlate with the known chemical diversity surrounding DAPG biosynthesis. Moreover, two of the newly obtained Pseudomonas endophytes with high antiglycation activity were shown to exert their inhibitory activity against the formation of advanced glycation end-products via DAPG and related congeners. Its isomer, 5-hydroxyferulic acid (2), detected during bioactivity-guided fractionation, together with other DAPG congeners, were found to enhance the detected inhibitory activity. This report provides evidence of a link between the evolution and chemical diversity of DAPG and congeners.
New phylogenomic and comparative analyses provide corroborating evidence that Myxozoa is Cnidaria.
Feng, Jin-Mei; Xiong, Jie; Zhang, Jin-Yong; Yang, Ya-Lin; Yao, Bin; Zhou, Zhi-Gang; Miao, Wei
2014-12-01
Myxozoa, a diverse group of morphologically simplified endoparasites, are well-known fish parasites causing substantial economic losses in aquaculture. Despite active research, the phylogenetic position of Myxozoa remains ambiguous. After obtaining the genome and transcriptome data of the myxozoan Thelohanellus kitauei, we examined the phylogenetic position of Myxozoa from three different perspectives. First, phylogenomic analyses with the newly sequenced genomic data strongly supported the monophyly of Myxozoa and that Myxozoa is sister to Medusozoa within Cnidaria. Second, we detected two homologs to cnidarian-specific minicollagens in the T. kitauei genome with molecular characteristics similar to cnidarian-specific minicollagens, suggesting that the minicollagen homologs in T. kitauei may have functions similar to those in Cnidaria and that Myxozoa is Cnidaria. Additionally, phylogenetic analyses revealed that the minicollagens in myxozoans and medusozoans have a common ancestor. Third, we detected 11 of the 19 proto-mesodermal genes in the T. kitauei genome, which were also present in the cnidarian Hydra magnipapillata, indicating Myxozoa is within Cnidaria. Thus, our results robustly support Myxozoa as a derived cnidarian taxon with an affinity to Medusozoa, helping to understand the diversity of the morphology, development and life cycle of Cnidaria and its evolution. Copyright © 2014 Elsevier Inc. All rights reserved.
Tsagkogeorga, Georgia; McGowen, Michael R.; Davies, Kalina T. J.; Jarman, Simon; Polanowski, Andrea; Bertelsen, Mads F.; Rossiter, Stephen J.
2015-01-01
Recent studies have reported multiple cases of molecular adaptation in cetaceans related to their aquatic abilities. However, none of these has included the hippopotamus, precluding an understanding of whether molecular adaptations in cetaceans occurred before or after they split from their semi-aquatic sister taxa. Here, we obtained new transcriptomes from the hippopotamus and humpback whale, and analysed these together with available data from eight other cetaceans. We identified more than 11 000 orthologous genes and compiled a genome-wide dataset of 6845 coding DNA sequences among 23 mammals, to our knowledge the largest phylogenomic dataset to date for cetaceans. We found positive selection in nine genes on the branch leading to the common ancestor of hippopotamus and whales, and 461 genes in cetaceans compared to 64 in hippopotamus. Functional annotation revealed adaptations in diverse processes, including lipid metabolism, hypoxia, muscle and brain function. By combining these findings with data on protein–protein interactions, we found evidence suggesting clustering among gene products relating to nervous and muscular systems in cetaceans. We found little support for shared ancestral adaptations in the two taxa; most molecular adaptations in extant cetaceans occurred after their split with hippopotamids. PMID:26473040
The Chlamydomonas Genome Reveals the Evolution of Key Animal and Plant Functions
Merchant, Sabeeha S.; Prochnik, Simon E.; Vallon, Olivier; Harris, Elizabeth H.; Karpowicz, Steven J.; Witman, George B.; Terry, Astrid; Salamov, Asaf; Fritz-Laylin, Lillian K.; Maréchal-Drouard, Laurence; Marshall, Wallace F.; Qu, Liang-Hu; Nelson, David R.; Sanderfoot, Anton A.; Spalding, Martin H.; Kapitonov, Vladimir V.; Ren, Qinghu; Ferris, Patrick; Lindquist, Erika; Shapiro, Harris; Lucas, Susan M.; Grimwood, Jane; Schmutz, Jeremy; Cardol, Pierre; Cerutti, Heriberto; Chanfreau, Guillaume; Chen, Chun-Long; Cognat, Valérie; Croft, Martin T.; Dent, Rachel; Dutcher, Susan; Fernández, Emilio; Ferris, Patrick; Fukuzawa, Hideya; González-Ballester, David; González-Halphen, Diego; Hallmann, Armin; Hanikenne, Marc; Hippler, Michael; Inwood, William; Jabbari, Kamel; Kalanon, Ming; Kuras, Richard; Lefebvre, Paul A.; Lemaire, Stéphane D.; Lobanov, Alexey V.; Lohr, Martin; Manuell, Andrea; Meier, Iris; Mets, Laurens; Mittag, Maria; Mittelmeier, Telsa; Moroney, James V.; Moseley, Jeffrey; Napoli, Carolyn; Nedelcu, Aurora M.; Niyogi, Krishna; Novoselov, Sergey V.; Paulsen, Ian T.; Pazour, Greg; Purton, Saul; Ral, Jean-Philippe; Riaño-Pachón, Diego Mauricio; Riekhof, Wayne; Rymarquis, Linda; Schroda, Michael; Stern, David; Umen, James; Willows, Robert; Wilson, Nedra; Zimmer, Sara Lana; Allmer, Jens; Balk, Janneke; Bisova, Katerina; Chen, Chong-Jian; Elias, Marek; Gendler, Karla; Hauser, Charles; Lamb, Mary Rose; Ledford, Heidi; Long, Joanne C.; Minagawa, Jun; Page, M. Dudley; Pan, Junmin; Pootakham, Wirulda; Roje, Sanja; Rose, Annkatrin; Stahlberg, Eric; Terauchi, Aimee M.; Yang, Pinfen; Ball, Steven; Bowler, Chris; Dieckmann, Carol L.; Gladyshev, Vadim N.; Green, Pamela; Jorgensen, Richard; Mayfield, Stephen; Mueller-Roeber, Bernd; Rajamani, Sathish; Sayre, Richard T.; Brokstein, Peter; Dubchak, Inna; Goodstein, David; Hornick, Leila; Huang, Y. Wayne; Jhaveri, Jinal; Luo, Yigong; Martínez, Diego; Ngau, Wing Chi Abby; Otillar, Bobby; Poliakov, Alexander; Porter, Aaron; Szajkowski, Lukasz; Werner, Gregory; Zhou, Kemin; Grigoriev, Igor V.; Rokhsar, Daniel S.; Grossman, Arthur R.
2010-01-01
Chlamydomonas reinhardtii is a unicellular green alga whose lineage diverged from land plants over 1 billion years ago. It is a model system for studying chloroplast-based photosynthesis, as well as the structure, assembly, and function of eukaryotic flagella (cilia), which were inherited from the common ancestor of plants and animals, but lost in land plants. We sequenced the ∼120-megabase nuclear genome of Chlamydomonas and performed comparative phylogenomic analyses, identifying genes encoding uncharacterized proteins that are likely associated with the function and biogenesis of chloroplasts or eukaryotic flagella. Analyses of the Chlamydomonas genome advance our understanding of the ancestral eukaryotic cell, reveal previously unknown genes associated with photosynthetic and flagellar functions, and establish links between ciliopathy and the composition and function of flagella. PMID:17932292
Parasitism and mutualism in Wolbachia: what the phylogenomic trees can and cannot say.
Bordenstein, Seth R; Paraskevopoulos, Charalampos; Dunning Hotopp, Julie C; Sapountzis, Panagiotis; Lo, Nathan; Bandi, Claudio; Tettelin, Hervé; Werren, John H; Bourtzis, Kostas
2009-01-01
Ecological and evolutionary theories predict that parasitism and mutualism are not fixed endpoints of the symbiotic spectrum. Rather, parasitism and mutualism may be host or environment dependent, induced by the same genetic machinery, and shifted due to selection. These models presume the existence of genetic or environmental variation that can spur incipient changes in symbiotic lifestyle. However, for obligate intracellular bacteria whose genomes are highly reduced, studies specify that discrete symbiotic associations can be evolutionarily stable for hundreds of millions of years. Wolbachia is an inherited obligate, intracellular infection of invertebrates containing taxa that act broadly as both parasites in arthropods and mutualists in certain roundworms. Here, we analyze the ancestry of mutualism and parasitism in Wolbachia and the evolutionary trajectory of this variation in symbiotic lifestyle with a comprehensive, phylogenomic analysis. Contrary to previous claims, we show unequivocally that the transition in lifestyle cannot be reconstructed with current methods due to long-branch attraction (LBA) artifacts of the distant Anaplasma and Ehrlichia outgroups. Despite the use of 1) site-heterogeneous phylogenomic methods that can overcome systematic error, 2) a taxonomically rich set of taxa, and 3) statistical assessments of the genes, tree topologies, and models of evolution, we conclude that the LBA artifact is serious enough to afflict past and recent claims, including the claim that the root lies in the middle of the Wolbachia mutualists and parasites. We show that different inference methods yield different results and high bootstrap support did not equal phylogenetic accuracy. Recombination was rare among this taxonomically diverse data set, indicating that elevated levels of recombination in Wolbachia are restricted to specific coinfecting groups. In conclusion, we attribute the inability to root the tree to rate heterogeneity between the ingroup and outgroup. Site-heterogeneous models of evolution did improve the placement of aberrant taxa in the ingroup phylogeny. Finally, in the unrooted topology, the distribution of parasitism and mutualism across the tree suggests that at least two interphylum transfers shaped the origins of nematode mutualism and arthropod parasitism. We suggest that the ancestry of mutualism and parasitism is not resolvable without more suitable outgroups or complete genome sequences from all Wolbachia supergroups.
New Insights into the Diversity of the Genus Faecalibacterium.
Benevides, Leandro; Burman, Sriti; Martin, Rebeca; Robert, Véronique; Thomas, Muriel; Miquel, Sylvie; Chain, Florian; Sokol, Harry; Bermudez-Humaran, Luis G; Morrison, Mark; Langella, Philippe; Azevedo, Vasco A; Chatel, Jean-Marc; Soares, Siomar
2017-01-01
Faecalibacterium prausnitzii is a commensal bacterium, ubiquitous in the gastrointestinal tracts of animals and humans. This species is a functionally important member of the microbiota and studies suggest it has an impact on the physiology and health of the host. F. prausnitzii is the only identified species in the genus Faecalibacterium, but a recent study clustered strains of this species in two different phylogroups. Here, we propose the existence of distinct species in this genus through the use of comparative genomics. Briefly, we performed analyses of 16S rRNA gene phylogeny, phylogenomics, whole genome Multi-Locus Sequence Typing (wgMLST), Average Nucleotide Identity (ANI), gene synteny, and pangenome to better elucidate the phylogenetic relationships among strains of Faecalibacterium. For this, we used 12 newly sequenced, assembled, and curated genomes of F. prausnitzii, which were isolated from feces of healthy volunteers from France and Australia, and combined these with published data from 5 strains downloaded from public databases. The phylogenetic analysis of the 16S rRNA sequences, together with the wgMLST profiles and a phylogenomic tree based on comparisons of genome similarity, all supported the clustering of Faecalibacterium strains in different genospecies. Additionally, the global analysis of gene synteny among all strains showed a highly fragmented profile, whereas the intra-cluster analyses revealed larger and more conserved collinear blocks. Finally, ANI analysis substantiated the presence of three distinct clusters (A, B, and C) composed of five, four, and four strains, respectively. The pangenome analysis of each cluster corroborated the classification of these clusters into three distinct species, each containing less variability than that found within the global pangenome of all strains. Here, we propose that comparison of pangenome subsets and their associated α values may be used as an alternative approach, together with ANI, in the in silico classification of new species. Altogether, our results provide evidence not only for the reconsideration of the phylogenetic and genomic relatedness among strains currently assigned to F. prausnitzii, but also the need for lineage (strain-based) differentiation of this taxon to better define how specific members might be associated with positive or negative host interactions.
Sudianto, Edi; Wu, Chung-Shien; Lin, Ching-Ping; Chaw, Shu-Miaw
2016-06-27
Phylogeny of the ten Pinaceous genera has long been contentious. Plastid genomes (plastomes) provide an opportunity to resolve this problem because they contain rich evolutionary information. To comprehend the plastid phylogenomics of all ten Pinaceous genera, we sequenced the plastomes of two previously unavailable genera, Pseudolarix amabilis (122,234 bp) and Tsuga chinensis (120,859 bp). Both plastomes share a similar gene repertoire and order. Here for the first time we report a unique insertion of tandem repeats in accD of T. chinensis. From the 65 plastid protein-coding genes common to all Pinaceous genera, we re-examined the phylogenetic relationship among all Pinaceous genera. Our two phylogenetic trees are congruent in an identical tree topology, with the five genera of the Abietoideae subfamily constituting a monophyletic clade separate from the other three subfamilies: Pinoideae, Piceoideae, and Laricoideae. The five genera of Abietoideae were grouped into two sister clades consisting of (1) Cedrus alone and (2) two sister subclades of Pseudolarix-Tsuga and Abies-Keteleeria, with the former uniquely losing the gene psaM and the latter specifically excluding the 3′ psbA from the residual inverted repeat. © The Author(s) 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Phylogenomics of Rhodobacteraceae reveals evolutionary adaptation to marine and non-marine habitats
Simon, Meinhard; Scheuner, Carmen; Meier-Kolthoff, Jan P; Brinkhoff, Thorsten; Wagner-Döbler, Irene; Ulbrich, Marcus; Klenk, Hans-Peter; Schomburg, Dietmar; Petersen, Jörn; Göker, Markus
2017-01-01
Marine Rhodobacteraceae (Alphaproteobacteria) are key players of biogeochemical cycling, comprise up to 30% of bacterial communities in pelagic environments and are often mutualists of eukaryotes. As 'Roseobacter clade', these 'roseobacters' are assumed to be monophyletic, but non-marine Rhodobacteraceae have not yet been included in phylogenomic analyses. Therefore, we analysed 106 genome sequences, particularly emphasizing gene sampling and its effect on phylogenetic stability, and investigated relationships between marine versus non-marine habitat, evolutionary origin and genomic adaptations. Our analyses, providing no unequivocal evidence for the monophyly of roseobacters, indicate several shifts between marine and non-marine habitats that occurred independently and were accompanied by characteristic changes in genomic content of orthologs, enzymes and metabolic pathways. Non-marine Rhodobacteraceae gained high-affinity transporters to cope with much lower sulphate concentrations and lost genes related to the reduced sodium chloride and organohalogen concentrations in their habitats. Marine Rhodobacteraceae gained genes required for fucoidan desulphonation and synthesis of the plant hormone indole 3-acetic acid and the compatible solutes ectoin and carnitin. However, neither plasmid composition, even though typical for the family, nor the degree of oligotrophy shows a systematic difference between marine and non-marine Rhodobacteraceae. We suggest the operational term 'Roseobacter group' for the marine Rhodobacteraceae strains. PMID:28106881
Jameson Kiesling, Natalie M; Yi, Soojin V; Xu, Ke; Gianluca Sperone, F; Wildman, Derek E
2015-01-01
The development and evolution of organisms is heavily influenced by their environment. Thus, understanding the historical biogeography of taxa can provide insights into their evolutionary history, adaptations and trade-offs realized throughout time. In the present study we have taken a phylogenomic approach to infer New World monkey phylogeny, upon which we have reconstructed the biogeographic history of extant platyrrhines. In order to generate sufficient phylogenetic signal within the New World monkey clade, we carried out a large-scale phylogenetic analysis of approximately 40 kb of non-genic genomic DNA sequence in a 36 species subset of extant New World monkeys. Maximum parsimony, maximum likelihood and Bayesian inference analysis all converged on a single optimal tree topology. Divergence dating and biogeographic analysis reconstruct the timing and geographic location of divergence events. The ancestral area reconstruction describes the geographic locations of the last common ancestor of extant platyrrhines and provides insight into key biogeographic events occurring during platyrrhine diversification. Through these analyses we conclude that the diversification of the platyrrhines took place concurrently with the establishment and diversification of the Amazon rainforest. This suggests that an expanding rainforest environment rather than geographic isolation drove platyrrhine diversification. Copyright © 2014 Elsevier Inc. All rights reserved.
Phylogenomics of Rhodobacteraceae reveals evolutionary adaptation to marine and non-marine habitats.
Simon, Meinhard; Scheuner, Carmen; Meier-Kolthoff, Jan P; Brinkhoff, Thorsten; Wagner-Döbler, Irene; Ulbrich, Marcus; Klenk, Hans-Peter; Schomburg, Dietmar; Petersen, Jörn; Göker, Markus
2017-06-01
Marine Rhodobacteraceae (Alphaproteobacteria) are key players of biogeochemical cycling, comprise up to 30% of bacterial communities in pelagic environments and are often mutualists of eukaryotes. As 'Roseobacter clade', these 'roseobacters' are assumed to be monophyletic, but non-marine Rhodobacteraceae have not yet been included in phylogenomic analyses. Therefore, we analysed 106 genome sequences, particularly emphasizing gene sampling and its effect on phylogenetic stability, and investigated relationships between marine versus non-marine habitat, evolutionary origin and genomic adaptations. Our analyses, providing no unequivocal evidence for the monophyly of roseobacters, indicate several shifts between marine and non-marine habitats that occurred independently and were accompanied by characteristic changes in genomic content of orthologs, enzymes and metabolic pathways. Non-marine Rhodobacteraceae gained high-affinity transporters to cope with much lower sulphate concentrations and lost genes related to the reduced sodium chloride and organohalogen concentrations in their habitats. Marine Rhodobacteraceae gained genes required for fucoidan desulphonation and synthesis of the plant hormone indole 3-acetic acid and the compatible solutes ectoin and carnitin. However, neither plasmid composition, even though typical for the family, nor the degree of oligotrophy shows a systematic difference between marine and non-marine Rhodobacteraceae. We suggest the operational term 'Roseobacter group' for the marine Rhodobacteraceae strains.
Romiguier, Jonathan; Rolland, Jonathan; Morandin, Claire; Keller, Laurent
2018-03-28
The ants of the Formica genus are classical model species in evolutionary biology. In particular, Darwin used Formica as a model species to better understand the evolution of slave-making, a parasitic behaviour where workers of another species are stolen to exploit their workforce. In his book "On the Origin of Species" (1859), Darwin first hypothesized that slave-making behaviour in Formica evolved in incremental steps from a free-living ancestor. The absence of a well-resolved phylogenetic tree of the genus has prevented an assessment of whether relationships among Formica subgenera are compatible with this scenario. In this study, we resolve the relationships among the 4 palearctic Formica subgenera (Formica str. s., Coptoformica, Raptiformica and Serviformica) using a phylogenomic dataset of 945 genes for 16 species. We provide a reference tree resolving the relationships among the main Formica subgenera with high bootstrap supports. The branching order of our tree suggests that the free-living lifestyle is ancestral in the Formica genus and that parasitic colony founding could have evolved a single time, probably acting as a pre-adaptation to slave-making behaviour. This phylogenetic tree provides a solid backbone for future evolutionary studies in the Formica genus and slave-making behaviour.
Phylogenomics databases for facilitating functional genomics in rice.
Jung, Ki-Hong; Cao, Peijian; Sharma, Rita; Jain, Rashmi; Ronald, Pamela C
2015-12-01
The completion of the whole-genome sequence of rice (Oryza sativa) has significantly accelerated functional genomics studies. Prior to the release of the sequence, only a few genes were assigned a function each year. Since sequencing was completed in 2005, the rate has exponentially increased. As of 2014, 1,021 genes have been described and added to the collection at The Overview of functionally characterized Genes in Rice online database (OGRO). Despite this progress, that number is still very low compared with the total number of genes estimated in the rice genome. One limitation to progress is the presence of functional redundancy among members of the same rice gene family, which covers 51.6 % of all non-transposable element-encoding genes. There remains a significant portion of rice genes that are not functionally redundant, as reflected in the recovery of loss-of-function mutants. To more accurately analyze functional redundancy in the rice genome, we have developed phylogenomics databases for six large gene families in rice, including those for glycosyltransferases, glycoside hydrolases, kinases, transcription factors, transporters, and cytochrome P450 monooxygenases. In this review, we introduce key features and applications of these databases. We expect that they will serve as a very useful guide in the post-genomics era of research.
Archaea: The First Domain of Diversified Life
Caetano-Anollés, Gustavo; Nasir, Arshan; Zhou, Kaiyue; Caetano-Anollés, Derek; Mittenthal, Jay E.; Sun, Feng-Jie; Kim, Kyung Mo
2014-01-01
The study of the origin of diversified life has been plagued by technical and conceptual difficulties, controversy, and apriorism. It is now popularly accepted that the universal tree of life is rooted in the akaryotes and that Archaea and Eukarya are sister groups to each other. However, evolutionary studies have overwhelmingly focused on nucleic acid and protein sequences, which partially fulfill only two of the three main steps of phylogenetic analysis, formulation of realistic evolutionary models, and optimization of tree reconstruction. In the absence of character polarization, that is, the ability to identify ancestral and derived character states, any statement about the rooting of the tree of life should be considered suspect. Here we show that macromolecular structure and a new phylogenetic framework of analysis that focuses on the parts of biological systems instead of the whole provide both deep and reliable phylogenetic signal and enable us to put forth hypotheses of origin. We review over a decade of phylogenomic studies, which mine information in a genomic census of millions of encoded proteins and RNAs. We show how the use of process models of molecular accumulation that comply with Weston's generality criterion supports a consistent phylogenomic scenario in which the origin of diversified life can be traced back to the early history of Archaea. PMID:24987307
Integral Phylogenomic Approach over Ilex L. Species from Southern South America
Cascales, Jimena; Bracco, Mariana; Garberoglio, Mariana J.; Poggio, Lidia; Gottlieb, Alexandra M.
2017-01-01
The use of molecular markers with inadequate variation levels has resulted in poorly resolved phylogenetic relationships within Ilex. Focusing on southern South American and Asian species, we aimed at contributing informative plastid markers. Also, we intended to gain insights into the nature of morphological and physiological characters used to identify species. We obtained the chloroplast genomes of I. paraguariensis and I. dumosa, and combined these with all the congeneric plastomes currently available to accomplish interspecific comparisons and multilocus analyses. We selected seven introns and nine IGSs as variable non-coding markers that were used in phylogenomic analyses. Eight extra IGSs were proposed as candidate markers. Southern South American species formed one lineage, except for I. paraguariensis, I. dumosa and I. argentina, which occupied intermediate positions among sampled taxa; Euroasiatic species formed two lineages. Some concordant relationships were retrieved from nuclear sequence data. We also conducted integral analyses, involving a supernetwork of molecular data, and a simultaneous analysis of quantitative and qualitative morphological and phytochemical characters, together with molecular data. The total evidence tree was used to study the evolution of non-molecular data, evidencing fifteen non-ambiguous synapomorphic character states and consolidating the relationships among southern South American species. More South American representatives should be incorporated to elucidate their origin. PMID:29165335
Broad phylogenomic sampling and the sister lineage of land plants.
Timme, Ruth E; Bachvaroff, Tsvetan R; Delwiche, Charles F
2012-01-01
The tremendous diversity of land plants all descended from a single charophyte green alga that colonized the land somewhere between 430 and 470 million years ago. Six orders of charophyte green algae, in addition to embryophytes, comprise the Streptophyta s.l. Previous studies have focused on reconstructing the phylogeny of organisms tied to this key colonization event, but wildly conflicting results have sparked a contentious debate over which lineage gave rise to land plants. The dominant view has been that 'stoneworts,' or Charales, are the sister lineage, but an alternative hypothesis supports the Zygnematales (often referred to as "pond scum") as the sister lineage. In this paper, we provide a well-supported, 160-nuclear-gene phylogenomic analysis supporting the Zygnematales as the closest living relative to land plants. Our study makes two key contributions to the field: 1) the use of an unbiased method to collect a large set of orthologs from deeply diverging species and 2) the use of these data in determining the sister lineage to land plants. We anticipate this updated phylogeny not only will hugely impact lesson plans in introductory biology courses, but also will provide a solid phylogenetic tree for future green-lineage research, whether it be related to plants or green algae.
Chloroplast Phylogenomics Indicates that Ginkgo biloba Is Sister to Cycads
Wu, Chung-Shien; Chaw, Shu-Miaw; Huang, Ya-Yi
2013-01-01
Molecular phylogenetic studies have not yet reached a consensus on the placement of Ginkgoales, which is represented by the only living species, Ginkgo biloba (common name: ginkgo). At least six discrepant placements of ginkgo have been proposed. This study aimed to use the chloroplast phylogenomic approach to examine possible factors that lead to such disagreeing placements. We found the sequence types used in the analyses as the most critical factor in the conflicting placements of ginkgo. In addition, the placement of ginkgo varied in the trees inferred from nucleotide (NU) sequences, which notably depended on breadth of taxon sampling, tree-building methods, codon positions, positions of Gnetopsida (common name: gnetophytes), and including or excluding gnetophytes in data sets. In contrast, the trees inferred from amino acid (AA) sequences congruently supported the monophyly of a ginkgo and Cycadales (common name: cycads) clade, regardless of which factors were examined. Our site-stripping analysis further revealed that the high substitution saturation of NU sequences mainly derived from the third codon positions and contributed to the variable placements of ginkgo. In summary, the factors we surveyed did not affect results inferred from analyses of AA sequences. Congruent topologies in our AA trees give more confidence in supporting the ginkgo–cycad sister-group hypothesis. PMID:23315384
Qu, Xiao-Jian; Jin, Jian-Jun; Chaw, Shu-Miaw; Li, De-Zhu; Yi, Ting-Shuang
2017-01-01
Long-branch attraction (LBA) is a major obstacle in phylogenetic reconstruction. The phylogenetic relationships among Juniperus (J), Cupressus (C) and the Hesperocyparis-Callitropsis-Xanthocyparis (HCX) subclades of Cupressoideae are controversial. Our initial analyses of plastid protein-coding gene matrix revealed both J and C with much longer stem branches than those of HCX, so their sister relationships may be attributed to LBA. We used multiple measures including data filtering and modifying, evolutionary model selection and coalescent phylogenetic reconstruction to alleviate the LBA artifact. Data filtering by strictly removing unreliable aligned regions and removing substitution saturation genes and rapidly evolving sites could significantly reduce branch lengths of subclades J and C and recovered a relationship of J (C, HCX). In addition, using coalescent phylogenetic reconstruction could elucidate the LBA artifact and recovered J (C, HCX). However, some valid methods for other taxa were inefficient in alleviating the LBA artifact in J-C-HCX. Different strategies should be carefully considered and justified to reduce LBA in phylogenetic reconstruction of different groups. Three subclades of J-C-HCX were estimated to have experienced ancient rapid divergence within a short period, which could be another major obstacle in resolving relationships. Furthermore, our plastid phylogenomic analyses fully resolved the intergeneric relationships of Cupressoideae. PMID:28120880
Complete Genome Sequence and Comparative Genomics of a Novel Myxobacterium Myxococcus hansupus
Sharma, Gaurav; Narwani, Tarun; Subramanian, Srikrishna
2016-01-01
Myxobacteria, a group of Gram-negative aerobes, belong to the class δ-proteobacteria and order Myxococcales. Unlike anaerobic δ-proteobacteria, they exhibit several unusual physiogenomic properties like gliding motility, desiccation-resistant myxospores and large genomes with high coding density. Here we report a 9.5 Mbp complete genome of Myxococcus hansupus that encodes 7,753 proteins. Phylogenomic and genome-genome distance based analysis suggest that Myxococcus hansupus is a novel member of the genus Myxococcus. Comparative genome analysis with other members of the genus Myxococcus was performed to explore their genome diversity. The variation in number of unique proteins observed across different species is suggestive of diversity at the genus level while the overrepresentation of several Pfam families indicates the extent and mode of genome expansion as compared to non-Myxococcales δ-proteobacteria. PMID:26900859
Puche, Rafael; Ferrés, Ignacio; Caraballo, Lizeth; Rangel, Yaritza; Picardeau, Mathieu; Takiff, Howard; Iraola, Gregorio
2018-02-01
Three strains, CLM-U50 T , CLM-R50 and IVIC-Bov1, belonging to the genus Leptospira, were isolated in Venezuela from a patient with leptospirosis, a domestic rat (Rattus norvegicus) and a cow (Bos taurus), respectively. The initial characterisation of these strains based on the rrs gene (16S rRNA) suggested their designation as a novel species within the 'intermediates' group of the genus Leptospira. Further phylogenomic characterisation based on single copy core genes was consistent with their separation into a novel species. The average nucleotide identity between these three strains was >99 %, but below 89 % with respect to any previously described leptospiral species, also supporting their designation as a novel species. Given this evidence, these three isolates were considered to represent a novel species, for which the name Leptospira venezuelensis sp. nov. is proposed, with CLM-U50 T (=CIP 111407 T =DSM 105752 T ) as the type strain.
Ragsdale, Erik J.; Baldwin, James G.
2010-01-01
Modern morphology-based systematics, including questions of incongruence with molecular data, emphasizes analysis over similarity criteria to assess homology. Yet detailed examination of a few key characters, using new tools and processes such as computerized, three-dimensional ultrastructural reconstruction of cell complexes, can resolve apparent incongruence by re-examining primary homologies. In nematodes of Tylenchomorpha, a parasitic feeding phenotype is thus reconciled with immediate free-living outgroups. Closer inspection of morphology reveals phenotypes congruent with molecular-based phylogeny and points to a new locus of homology in mouthparts. In nematode models, the study of individually homologous cells reveals a conserved modality of evolution among dissimilar feeding apparati adapted to divergent lifestyles. Conservatism of cellular components, consistent with that of other body systems, allows meaningful comparative morphology in difficult groups of microscopic organisms. The advent of phylogenomics is synergistic with morphology in systematics, providing an honest test of homology in the evolution of phenotype. PMID:20106846
TreSpEx—Detection of Misleading Signal in Phylogenetic Reconstructions Based on Tree Information
Struck, Torsten H
2014-01-01
Phylogenies of species or genes are commonplace nowadays in many areas of comparative biological studies. However, for phylogenetic reconstructions one must refer to artificial signals such as paralogy, long-branch attraction, saturation, or conflict between different datasets. These signals might eventually mislead the reconstruction even in phylogenomic studies employing hundreds of genes. Unfortunately, there has been no program allowing the detection of such effects in combination with an implementation into automatic process pipelines. TreSpEx (Tree Space Explorer) now combines different approaches (including statistical tests), which utilize tree-based information like nodal support or patristic distances (PDs) to identify misleading signals. The program enables the parallel analysis of hundreds of trees and/or predefined gene partitions, and being command-line driven, it can be integrated into automatic process pipelines. TreSpEx is implemented in Perl and supported on Linux, Mac OS X, and MS Windows. Source code, binaries, and additional material are freely available at http://www.annelida.de/research/bioinformatics/software.html. PMID:24701118
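A patristic distance is the sum of branch lengths along the path connecting two tips of a tree. The Python sketch below computes it on a small rooted tree stored as a child-to-parent map. It is only an illustration of the quantity, not TreSpEx's Perl implementation; the tree, its node names, and its branch lengths are invented for this example.

```python
# Toy rooted tree as child -> (parent, branch_length). All values are made up.
TREE = {
    "A": ("n1", 0.10),
    "B": ("n1", 0.20),
    "n1": ("root", 0.05),
    "C": ("root", 0.30),
}


def path_to_root(node):
    """Return {ancestor: cumulative distance from node}, including node itself."""
    dist = 0.0
    out = {node: 0.0}
    while node in TREE:
        parent, length = TREE[node]
        dist += length
        out[parent] = dist
        node = parent
    return out


def patristic_distance(a, b):
    """Sum of branch lengths on the path between tips a and b."""
    up_a, up_b = path_to_root(a), path_to_root(b)
    # With non-negative branch lengths, the most recent common ancestor
    # is the shared ancestor that minimizes the summed distance.
    return min(up_a[n] + up_b[n] for n in up_a if n in up_b)


print(patristic_distance("A", "B"))
print(patristic_distance("A", "C"))
```

Unusually large patristic distances for a taxon across many gene trees are one of the tree-based cues that can flag long-branch effects.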
The Natural History of Biocatalytic Mechanisms
Nath, Neetika; Mitchell, John B. O.; Caetano-Anollés, Gustavo
2014-01-01
Phylogenomic analysis of the occurrence and abundance of protein domains in proteomes has recently shown that the α/β architecture is probably the oldest fold design. This holds important implications for the origins of biochemistry. Here we explore structure-function relationships addressing the use of chemical mechanisms by ancestral enzymes. We test the hypothesis that the oldest folds used the most mechanisms. We start by tracing biocatalytic mechanisms operating in metabolic enzymes along a phylogenetic timeline of the first appearance of homologous superfamilies of protein domain structures from CATH. A total of 335 enzyme reactions were retrieved from MACiE and were mapped over fold age. We define a mechanistic step type as one of the 51 mechanistic annotations given in MACiE, and each step of each of the 335 mechanisms was described using one or more of these annotations. We find that the first two folds, the P-loop containing nucleotide triphosphate hydrolase and the NAD(P)-binding Rossmann-like homologous superfamilies, were α/β architectures responsible for introducing 35% (18/51) of the known mechanistic step types. We find that these two oldest structures in the phylogenomic analysis of protein domains introduced many mechanistic step types that were later combinatorially spread in catalytic history. The most common mechanistic step types included fundamental building blocks of enzyme chemistry: “Proton transfer,” “Bimolecular nucleophilic addition,” “Bimolecular nucleophilic substitution,” and “Unimolecular elimination by the conjugate base.” They were associated with the most ancestral fold structure typical of P-loop containing nucleotide triphosphate hydrolases. Over half of the mechanistic step types were introduced in the evolutionary timeline before the appearance of structures specific to diversified organisms, during a period of architectural diversification.
The other half unfolded gradually after organismal diversification and during a period that spanned ∼2 billion years of evolutionary history. PMID:24874434
Novais, Carla; Tedim, Ana P.; Lanza, Val F.; Freitas, Ana R.; Silveira, Eduarda; Escada, Ricardo; Roberts, Adam P.; Al-Haroni, Mohammed; Baquero, Fernando; Peixe, Luísa; Coque, Teresa M.
2016-01-01
Ampicillin resistance has greatly contributed to the recent dramatic increase of a cluster of human adapted Enterococcus faecium lineages (ST17, ST18, and ST78) in hospital-based infections. Changes in the chromosomal pbp5 gene have been associated with different levels of ampicillin susceptibility, leading to protein variants (designated as PBP5 C-types to keep the nomenclature used in previous works) with diverse degrees of reduction in penicillin affinity. Our goal was to use a comparative genomics approach to evaluate the relationship between the diversity of PBP5 among E. faecium isolates of different phylogenomic groups as well as to assess the pbp5 transferability among isolates of disparate clonal lineages. The analyses of 78 selected E. faecium strains, as well as published E. faecium genomes, suggested that the diversity of pbp5 mirrors the phylogenomic diversification of E. faecium. The presence of identical PBP5 C-types as well as similar pbp5 genetic environments in different E. faecium lineages and clones from quite different geographical and environmental origins was also documented and would indicate their horizontal gene transfer among E. faecium populations. This was supported by experimental assays showing transfer of large (≈180–280 kb) chromosomal genetic platforms containing pbp5 alleles, ponA (transglycosylase) and other metabolic and adaptive features, from E. faecium donor isolates to suitable E. faecium recipient strains. Mutation profile analysis of PBP5 from available genomes and strains from this study suggests that the spread of PBP5 C-types might have occurred even in the absence of a significant ampicillin resistance phenotype. In summary, genetic platforms containing pbp5 sequences were stably maintained in particular E. faecium lineages, but were also able to be transferred among E. faecium clones of different origins, emphasizing the growing risk of further spread of ampicillin resistance in this nosocomial pathogen. PMID:27766095
2012-01-01
Background: The morphological peculiarities of turtles have, for a long time, impeded their accurate placement in the phylogeny of amniotes. Molecular data used to address this major evolutionary question have so far been limited to a handful of markers and/or taxa. These studies have supported conflicting topologies, positioning turtles as either the sister group to all other reptiles, to lepidosaurs (tuatara, lizards and snakes), to archosaurs (birds and crocodiles), or to crocodilians. Genome-scale data have been shown to be useful in resolving other debated phylogenies, but no such adequate dataset is yet available for amniotes. Results: In this study, we used next-generation sequencing to obtain seven new transcriptomes from the blood, liver, or jaws of four turtles, a caiman, a lizard, and a lungfish. We used a phylogenomic dataset based on 248 nuclear genes (187,026 nucleotide sites) for 16 vertebrate taxa to resolve the origins of turtles. Maximum likelihood and Bayesian concatenation analyses and species tree approaches performed under the most realistic models of the nucleotide and amino acid substitution processes unambiguously support turtles as a sister group to birds and crocodiles. The use of more simplistic models of nucleotide substitution for both concatenation and species tree reconstruction methods leads to the artefactual grouping of turtles and crocodiles, most likely because of substitution saturation at third codon positions. Relaxed molecular clock methods estimate the divergence between turtles and archosaurs around 255 million years ago. The most recent common ancestor of living turtles, corresponding to the split between Pleurodira and Cryptodira, is estimated to have occurred around 157 million years ago, in the Upper Jurassic period. This is a more recent estimate than previously reported, and questions the interpretation of controversial Lower Jurassic fossils as being part of the extant turtle radiation.
Conclusions: These results provide a phylogenetic framework and timescale with which to interpret the evolution of the peculiar morphological, developmental, and molecular features of turtles within the amniotes. PMID:22839781
Negrisolo, Enrico; Kuhl, Heiner; Forcato, Claudio; Vitulo, Nicola; Reinhardt, Richard; Patarnello, Tomaso; Bargelloni, Luca
2010-12-01
Comparative genomics holds the promise to magnify the information obtained from individual genome sequencing projects, revealing common features conserved across genomes and identifying lineage-specific characteristics. To implement such a comparative approach, a robust phylogenetic framework is required to accurately reconstruct evolution at the genome level. Among vertebrate taxa, teleosts represent the second best characterized group, with high-quality draft genome sequences for five model species (Danio rerio, Gasterosteus aculeatus, Oryzias latipes, Takifugu rubripes, and Tetraodon nigroviridis), and several others are in the finishing lane. However, the relationships among the acanthomorph teleost model fishes remain an unresolved taxonomic issue. Here, a genomic region spanning over 1.2 million base pairs was sequenced in the teleost fish Dicentrarchus labrax. Together with genomic data available for the above fish models, the new sequence was used to identify unique orthologous genomic regions shared across all target taxa. Different strategies were applied to produce robust multiple gene and genomic alignments spanning from 11,802 to 186,474 amino acid/nucleotide positions. Ten data sets were analyzed according to Bayesian inference, maximum likelihood, maximum parsimony, and neighbor joining methods. Extensive analyses were performed to explore the influence of several factors (e.g., alignment methodology, substitution model, data set partitions, and long-branch attraction) on the tree topology. Although a general consensus was observed for a closer relationship between G. aculeatus (Gasterosteidae) and Di. labrax (Moronidae) with the atherinomorph O. latipes (Beloniformes) sister taxon of this clade, with the tetraodontiform group Ta. rubripes and Te. nigroviridis (Tetraodontiformes) representing a more distantly related taxon among acanthomorph model fish species, conflicting results were obtained between data sets and methods, especially with respect to the choice of alignment methodology applied to noncoding parts of the genomic region under study. This may limit the use of intergenic/noncoding sequences in phylogenomics until more robust alignment algorithms are developed.
ESAP plus: a web-based server for EST-SSR marker development.
Ponyared, Piyarat; Ponsawat, Jiradej; Tongsima, Sissades; Seresangtakul, Pusadee; Akkasaeng, Chutipong; Tantisuwichwong, Nathpapat
2016-12-22
Simple sequence repeats (SSRs) have become widely used as molecular markers in plant genetic studies due to their abundance, high allelic variation at each locus and simplicity to analyze using conventional PCR amplification. To study plants with unknown genome sequence, SSR markers from Expressed Sequence Tags (ESTs), which can be obtained from the plant mRNA (converted to cDNA), must be utilized. With the advent of high-throughput sequencing technology, huge EST sequence data have been generated and are now accessible from many public databases. However, SSR marker identification from a large in-house or public EST collection requires a computational pipeline that makes use of several standard bioinformatic tools to design high quality EST-SSR primers. Some of these computational tools are not user-friendly and must be tightly integrated with reference genomic databases. A web-based bioinformatic pipeline, called EST Analysis Pipeline Plus (ESAP Plus), was constructed for assisting researchers to develop SSR markers from a large EST collection. ESAP Plus incorporates several bioinformatic scripts and some useful standard software tools necessary for the four main procedures of EST-SSR marker development, namely 1) pre-processing, 2) clustering and assembly, 3) SSR mining and 4) SSR primer design. The proposed pipeline also provides two alternative steps for reducing EST redundancy and identifying SSR loci. Using public sugarcane ESTs, ESAP Plus automatically executed the aforementioned computational pipeline via a simple web user interface, which was implemented using standard PHP, HTML, CSS and JavaScript. With ESAP Plus, users can upload raw EST data and choose various filtering options and parameters to analyze each of the four main procedures through this web interface. All input EST data and their predicted SSR results will be stored in the ESAP Plus MySQL database.
Users will be notified via e-mail when the automatic process is completed and they can download all the results through the web interface. ESAP Plus is a comprehensive and convenient web-based bioinformatic tool for SSR marker development. ESAP Plus offers all necessary EST-SSR development processes with various adjustable options that users can easily use to identify SSR markers from a large EST collection. With a familiar web interface, users can upload the raw EST using the data submission page and visualize/download the corresponding EST-SSR information from within ESAP Plus. ESAP Plus can handle considerably large EST datasets. This EST-SSR discovery tool can be accessed directly from: http://gbp.kku.ac.th/esap_plus/ .
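The SSR mining step named above can be pictured with a small regular-expression scan for short motifs repeated in tandem. This Python sketch is a generic illustration and does not reproduce the tools bundled in ESAP Plus; the function name, the default thresholds, and the example sequence are assumptions. As a simplification, motifs shorter than 2 bp are not searched, so a mononucleotide run would be reported as a dinucleotide repeat.

```python
import re


def find_ssrs(seq, min_motif=2, max_motif=6, min_repeats=4):
    """Return (start, motif, repeat_count) for tandem repeats found in seq.

    A hit is a motif of min_motif..max_motif bases repeated at least
    min_repeats times in a row. The shortest matching motif is preferred.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


# Hypothetical EST fragment containing an (AG)5 repeat.
print(find_ssrs("GGTTAGAGAGAGAGCCTA"))
```

Each reported start coordinate is zero-based, so flanking sequence for primer design would be taken on either side of `start` and `start + len(motif) * repeat_count`.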
ESTimating plant phylogeny: lessons from partitioning
de la Torre, Jose EB; Egan, Mary G; Katari, Manpreet S; Brenner, Eric D; Stevenson, Dennis W; Coruzzi, Gloria M; DeSalle, Rob
2006-01-01
Background: While Expressed Sequence Tags (ESTs) have proven a viable and efficient way to sample genomes, particularly those for which whole-genome sequencing is impractical, phylogenetic analysis using ESTs remains difficult. Sequencing errors and orthology determination are the major problems when using ESTs as a source of characters for systematics. Here we develop methods to incorporate EST sequence information in a simultaneous analysis framework to address controversial phylogenetic questions regarding the relationships among the major groups of seed plants. We use an automated, phylogenetically derived approach to orthology determination called OrthologID to generate a phylogeny based on 43 process partitions, many of which are derived from ESTs, and examine several measures of support to assess the utility of EST data for phylogenies. Results: A maximum parsimony (MP) analysis resulted in a single tree with relatively high support at all nodes in the tree despite rampant conflict among trees generated from the separate analysis of individual partitions. In a comparison of broader-scale groupings based on cellular compartment (i.e., chloroplast, mitochondrial or nuclear) or function, only the nuclear partition tree (based largely on EST data) was found to be topologically identical to the tree based on the simultaneous analysis of all data. Despite topological conflict among the broader-scale groupings examined, only the tree based on morphological data showed statistically significant differences. Conclusion: Based on the amount of character support contributed by EST data which make up a majority of the nuclear data set, and the lack of conflict of the nuclear data set with the simultaneous analysis tree, we conclude that the inclusion of EST data does provide a viable and efficient approach to address phylogenetic questions within a parsimony framework on a genomic scale, if problems of orthology determination and potential sequencing errors can be overcome.
In addition, approaches that examine conflict and support in a simultaneous analysis framework allow for a more precise understanding of the evolutionary history of individual process partitions and may be a novel way to understand functional aspects of different kinds of cellular classes of gene products. PMID:16776834
Streicher, Jeffrey W; Schulte, James A; Wiens, John J
2016-01-01
Targeted sequence capture is becoming a widespread tool for generating large phylogenomic data sets to address difficult phylogenetic problems. However, this methodology often generates data sets in which increasing the number of taxa and loci increases amounts of missing data. Thus, a fundamental (but still unresolved) question is whether sampling should be designed to maximize sampling of taxa or genes, or to minimize the inclusion of missing data cells. Here, we explore this question for an ancient, rapid radiation of lizards, the pleurodont iguanians. Pleurodonts include many well-known clades (e.g., anoles, basilisks, iguanas, and spiny lizards) but relationships among families have proven difficult to resolve strongly and consistently using traditional sequencing approaches. We generated up to 4921 ultraconserved elements with sampling strategies including 16, 29, and 44 taxa, from 1179 to approximately 2.4 million characters per matrix and approximately 30% to 60% total missing data. We then compared mean branch support for interfamilial relationships under these 15 different sampling strategies for both concatenated (maximum likelihood) and species tree (NJst) approaches (after showing that mean branch support appears to be related to accuracy). We found that both approaches had the highest support when including loci with up to 50% missing taxa (matrices with ~40-55% missing data overall). Thus, our results show that simply excluding all missing data may be highly problematic as the primary guiding principle for the inclusion or exclusion of taxa and genes. The optimal strategy was somewhat different for each approach, a pattern that has not been shown previously. For concatenated analyses, branch support was maximized when including many taxa (44) but fewer characters (1.1 million). For species-tree analyses, branch support was maximized with minimal taxon sampling (16) but many loci (4789 of 4921). 
We also show that the choice of these sampling strategies can be critically important for phylogenomic analyses, since some strategies lead to demonstrably incorrect inferences (using the same method) that have strong statistical support. Our preferred estimate provides strong support for most interfamilial relationships in this important but phylogenetically challenging group.
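The idea of including loci with up to a given fraction of missing taxa can be sketched as a simple filter over a locus-to-taxa map. This Python example is an illustration under stated assumptions, not the authors' pipeline: the function name, the locus and taxon labels, and the data layout are hypothetical.

```python
def filter_loci(loci, all_taxa, max_missing=0.5):
    """Keep loci whose fraction of missing taxa is at most max_missing.

    loci maps a locus name to the list of taxa sequenced for that locus.
    """
    taxa = set(all_taxa)
    kept = {}
    for name, present in loci.items():
        missing = 1 - len(set(present) & taxa) / len(taxa)
        if missing <= max_missing:
            kept[name] = present
    return kept


# Hypothetical data: four taxa and three ultraconserved-element loci.
taxa = ["t1", "t2", "t3", "t4"]
loci = {
    "uce-1": ["t1", "t2", "t3", "t4"],  # 0% missing taxa
    "uce-2": ["t1", "t2"],              # 50% missing taxa
    "uce-3": ["t1"],                    # 75% missing taxa
}
print(sorted(filter_loci(loci, taxa, max_missing=0.5)))
```

Setting `max_missing=0.0` keeps only fully sampled loci, which corresponds to the strict exclusion strategy the abstract cautions against.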
Bewick, Adam J; Chain, Frédéric J J; Heled, Joseph; Evans, Ben J
2012-12-01
The estimation of phylogenetic relationships is an essential component of understanding evolution. Accurate phylogenetic estimation is difficult, however, when internodes are short and old, when genealogical discordance is common due to large ancestral effective population sizes or ancestral population structure, and when homoplasy is prevalent. Inference of divergence times is also hampered by unknown and uneven rates of evolution, the incomplete fossil record, uncertainty in relationships between fossil and extant lineages, and uncertainty in the age of fossils. Ideally, these challenges can be overcome by developing large "phylogenomic" data sets and by analyzing them with methods that accommodate features of the evolutionary process, such as genealogical discordance, recurrent substitution, recombination, ancestral population structure, gene flow after speciation among sampled and unsampled taxa, and variation in evolutionary rates. In some phylogenetic problems, it is possible to use information that is independent of fossils, such as the geological record, to identify putative triggers for diversification whose associated estimated divergence times can then be compared a posteriori with estimated relationships and ages of fossils. The history of diversification of pipid frog genera Pipa, Hymenochirus, Silurana, and Xenopus, for instance, is characterized by many of these evolutionary and analytical challenges. These frogs diversified dozens of millions of years ago, they have a relatively rich fossil record, their distributions span continental plates with a well characterized geological record of ancient connectivity, and there is considerable disagreement across studies in estimated evolutionary relationships. We used high throughput sequencing and public databases to generate a large phylogenomic data set with which we estimated evolutionary relationships using multilocus coalescence methods. 
We collected sequence data from Pipa, Hymenochirus, Silurana, and Xenopus and the outgroup taxon Rhinophrynus dorsalis from coding sequence of 113 autosomal regions, averaging ∼300 bp in length (range: 102-1695 bp) and also a portion of the mitochondrial genome. Analysis of these data using multiple approaches recovers strong support for the ((Xenopus, Silurana)(Pipa, Hymenochirus)) topology, and geologically calibrated divergence time estimates that are consistent with estimated ages and phylogenetic affinities of many fossils. These results provide new insights into the biogeography and chronology of pipid diversification during the breakup of Gondwanaland and illustrate how phylogenomic data may be necessary to tackle tough problems in molecular systematics. [Coalescence; gene tree; high-throughput sequencing; lineage sorting; pipid; species tree; Xenopus.].
Merelli, Ivan; Caprera, Andrea; Stella, Alessandra; Del Corvo, Marcello; Milanesi, Luciano; Lazzari, Barbara
2009-10-15
The NCBI dbEST currently contains more than eight million human Expressed Sequence Tags (ESTs). This wide collection represents an important source of information for gene expression studies, provided it can be inspected according to biologically relevant criteria. EST data can be browsed using different dedicated web resources, which allow users to investigate library-specific gene expression levels and to make comparisons among libraries, highlighting significant differences in gene expression. Nonetheless, no tool is available to examine distributions of quantitative EST collections in Gene Ontology (GO) categories, nor to retrieve information concerning library-dependent EST involvement in metabolic pathways. In this work we present the Human EST Ontology Explorer (HEOE) http://www.itb.cnr.it/ptp/human_est_explorer, a web facility for comparison of expression levels among libraries from several healthy and diseased tissues. The HEOE provides library-dependent statistics on the distribution of sequences in the GO Directed Acyclic Graph (DAG) that can be browsed at each GO hierarchical level. The tool is based on large-scale BLAST annotation of EST sequences. Due to the huge number of input sequences, this BLAST analysis was performed with the aid of grid computing technology, which is particularly suitable for data-parallel tasks. Relying on the achieved annotation, library-specific distributions of ESTs in the GO Graph were inferred. A pathway-based search interface was also implemented, for a quick evaluation of the representation of libraries in metabolic pathways. EST processing steps were integrated in a semi-automatic procedure that relies on Perl scripts and stores results in a MySQL database. A PHP-based web interface offers the possibility to simultaneously visualize, retrieve and compare data from the different libraries. Statistically significant differences in GO categories among user selected libraries can also be computed.
The HEOE provides an alternative and complementary way to inspect EST expression levels with respect to approaches currently offered by other resources. Furthermore, BLAST computation on the whole human EST dataset was a suitable test of grid scalability in the context of large-scale bioinformatics analysis. The HEOE currently comprises sequence analysis from 70 non-normalized libraries, representing a comprehensive overview on healthy and unhealthy tissues. As the analysis procedure can be easily applied to other libraries, the number of represented tissues is intended to increase.
Lu, Ashley; Armstrong, Karen F.
2015-01-01
Pectobacterium species are economically important bacteria that cause soft rotting of potato tubers in the field and in storage. Here, we report the draft genome sequence of the type strain for P. carotovorum subsp. carotovorum, ICMP 5702 (ATCC 15713). The genome sequence of ICMP 5702 will provide an important reference for future phylogenomic and taxonomic studies of the phytopathogenic Enterobacteriaceae. PMID:26251498
Chen, Xin; Lemmon, Alan R; Lemmon, Emily Moriarty; Pyron, R Alexander; Burbrink, Frank T
2017-06-01
Globally distributed groups may show regionally distinct rates of diversification, where speciation is elevated given timing and sources of ecological opportunity. However, for most organisms, nearly complete sampling at genomic-data scales to reduce topological error in all regions is unattainable, thus hampering conclusions related to biogeographic origins and rates of diversification. We explore processes leading to the diversity of global ratsnakes and test several important hypotheses related to areas of origin and enhanced diversification upon colonizing new continents. We estimate species trees inferred from phylogenomic scale data (304 loci) while exploring several strategies that consider topological error from each individual gene tree. With a dated species tree, we examine taxonomy and test previous hypotheses that suggest the ratsnakes originated in the Old World (OW) and dispersed to New World (NW). Furthermore, we determine if dispersal to the NW represented a source of ecological opportunity, which should show elevated rates of species diversification. We show that ratsnakes originated in the OW during the mid-Oligocene and subsequently dispersed to the NW by the mid-Miocene; diversification was also elevated in a subclade of NW taxa. Finally, the optimal biogeographic region-dependent speciation model shows that the uptick in ratsnake diversification was associated with colonization of the NW. We consider several alternative explanations that account for regionally distinct diversification rates.
Broad-scale phylogenomics provides insights into retrovirus–host evolution
Hayward, Alexander; Grabherr, Manfred; Jern, Patric
2013-01-01
Genomic data provide an excellent resource to improve understanding of retrovirus evolution and the complex relationships among viruses and their hosts. In conjunction with broad-scale in silico screening of vertebrate genomes, this resource offers an opportunity to complement data on the evolution and frequency of past retroviral spread and so evaluate future risks and limitations for horizontal transmission between different host species. Here, we develop a methodology for extracting phylogenetic signal from large endogenous retrovirus (ERV) datasets by collapsing information to facilitate broad-scale phylogenomics across a wide sample of hosts. Starting with nearly 90,000 ERVs from 60 vertebrate host genomes, we construct phylogenetic hypotheses and draw inferences regarding the designation, host distribution, origin, and transmission of the Gammaretrovirus genus and associated class I ERVs. Our results uncover remarkable depths in retroviral sequence diversity, supported within a phylogenetic context. This finding suggests that current infectious exogenous retrovirus diversity may be underestimated, adding credence to the possibility that many additional exogenous retroviruses may remain to be discovered in vertebrate taxa. We demonstrate a history of frequent horizontal interorder transmissions from a rodent reservoir and suggest that rats may have acted as important overlooked facilitators of gammaretrovirus spread across diverse mammalian hosts. Together, these results demonstrate the promise of the methodology used here to analyze large ERV datasets and improve understanding of retroviral evolution and diversity for utilization in wider applications. PMID:24277832
Broad-scale phylogenomics provides insights into retrovirus-host evolution.
Hayward, Alexander; Grabherr, Manfred; Jern, Patric
2013-12-10
Genomic data provide an excellent resource to improve understanding of retrovirus evolution and the complex relationships among viruses and their hosts. In conjunction with broad-scale in silico screening of vertebrate genomes, this resource offers an opportunity to complement data on the evolution and frequency of past retroviral spread and so evaluate future risks and limitations for horizontal transmission between different host species. Here, we develop a methodology for extracting phylogenetic signal from large endogenous retrovirus (ERV) datasets by collapsing information to facilitate broad-scale phylogenomics across a wide sample of hosts. Starting with nearly 90,000 ERVs from 60 vertebrate host genomes, we construct phylogenetic hypotheses and draw inferences regarding the designation, host distribution, origin, and transmission of the Gammaretrovirus genus and associated class I ERVs. Our results uncover remarkable depths in retroviral sequence diversity, supported within a phylogenetic context. This finding suggests that current infectious exogenous retrovirus diversity may be underestimated, adding credence to the possibility that many additional exogenous retroviruses may remain to be discovered in vertebrate taxa. We demonstrate a history of frequent horizontal interorder transmissions from a rodent reservoir and suggest that rats may have acted as important overlooked facilitators of gammaretrovirus spread across diverse mammalian hosts. Together, these results demonstrate the promise of the methodology used here to analyze large ERV datasets and improve understanding of retroviral evolution and diversity for utilization in wider applications.
Lin, Mei Fang; Chou, Wen Hwa; Kitahara, Marcelo V.; Chen, Chao Lun Allen
2016-01-01
Calcification is one of the most distinctive traits of scleractinian corals. Their hard skeletons form the substratum of reef ecosystems and confer on corals their remarkable diversity of shapes. Corallimorpharians are non-calcifying, close relatives of scleractinian corals, and the evolutionary relationship between these two groups is key to understanding the evolution of calcification in the coral lineage. One pivotal question is whether scleractinians are a monophyletic group, paraphyly being an alternative possibility if corallimorpharians are corals that have lost their ability to calcify, as is implied by the “naked-coral” hypothesis. Despite major efforts, relationships between scleractinians and corallimorpharians remain equivocal and controversial. Although the complete mitochondrial genomes of a range of scleractinians and corallimorpharians have been obtained, heterogeneity in composition and evolutionary rates means that mitochondrial sequences are insufficient to understand the relationship between these two groups. To overcome these limitations, transcriptome data were generated for three representative corallimorpharians. These were used in combination with sequences available for a representative range of scleractinians to identify 291 orthologous single copy protein-coding nuclear markers. Unlike the mitochondrial sequences, these nuclear markers do not display any distinct compositional bias in their nucleotide or amino-acid sequences. A range of phylogenomic approaches congruently reveal a topology consistent with scleractinian monophyly and corallimorpharians as the sister clade of scleractinians. PMID:27761308
Haponski, Amanda E; Lee, Taehwan; Ó Foighil, Diarmaid
2017-01-01
Natural history museum collections provide a biodiversity window into the past and are of particular importance to the study of extinction-impacted clades such as the Pacific Island tree snail family Partulidae. Deliberate introduction of the predatory rosy wolf snail Euglandina rosea in the late 20th century led to the extinction/extirpation of 55/61 Society Island Partulidae species. In this study, we phylogenomically investigated the inter-relationships of the three surviving Society Island valley Partula species: P. taeniata (Moorea), P. clara and P. hyalina (Tahiti). All three formed a distinct clade in earlier mitochondrial phylogenies. Using Next Generation Sequencing (NGS) double digested Restriction Associated DNA sequencing (ddRADseq), we found that 46-year-old lyophilized museum specimens produced similar numbers of reads, sequencing depth, and loci as 10-year-old ethanol-preserved collections. Phylogenomic trees indicated that Tahitian P. clara and P. hyalina are the result of a single founding lineage from Moorea, contrasting with previous mitochondrial results and clarifying the enigmatic taxonomic status of P. c. incrassa. Our study highlights the utility and viability of NGS techniques for museum specimens and their increased resolution of evolutionary patterns. Sampling will be expanded to include the remaining Society Island partulid taxa to further explore the evolutionary history of this radiation. Copyright © 2016 Elsevier Inc. All rights reserved.
Zhang, Yan-Cong; Lin, Kui
2015-01-01
Overlapping genes (OGs) represent one type of widespread genomic feature in bacterial genomes and have been used as rare genomic markers in phylogeny inference of closely related bacterial species. However, the inference may experience a decrease in performance for phylogenomic analysis of too closely or too distantly related genomes. Another drawback of OGs as phylogenetic markers is that they usually take little account of the effects of genomic rearrangement on the similarity estimation, such as intra-chromosome/genome translocations, horizontal gene transfer, and gene losses. To explore such effects on the accuracy of phylogeny reconstruction, we combine phylogenetic signals of OGs with collinear genomic regions, here called locally collinear blocks (LCBs). By putting these together, we refine our previous metric of pairwise similarity between two closely related bacterial genomes. As a case study, we used this new method to reconstruct the phylogenies of 88 Enterobacteriale genomes of the class Gammaproteobacteria. Our results demonstrated that the topological accuracy of the inferred phylogeny was improved when both OGs and LCBs were simultaneously considered, suggesting that combining these two phylogenetic markers may reduce, to some extent, the influence of gene loss on phylogeny inference. Such phylogenomic studies, we believe, will help us to explore a more effective approach to increasing the robustness of phylogeny reconstruction of closely related bacterial organisms. PMID:26715828
Gavelis, Gregory S; Wakeman, Kevin C; Tillmann, Urban; Ripken, Christina; Mitarai, Satoshi; Herranz, Maria; Özbek, Suat; Holstein, Thomas; Keeling, Patrick J; Leander, Brian S
2017-03-01
We examine the origin of harpoon-like secretory organelles (nematocysts) in dinoflagellate protists. These ballistic organelles have been hypothesized to be homologous to similarly complex structures in animals (cnidarians); but we show, using structural, functional, and phylogenomic data, that nematocysts evolved independently in both lineages. We also recorded the first high-resolution videos of nematocyst discharge in dinoflagellates. Unexpectedly, our data suggest that different types of dinoflagellate nematocysts use two fundamentally different types of ballistic mechanisms: one type relies on a single pressurized capsule for propulsion, whereas the other type launches 11 to 15 projectiles from an arrangement similar to a Gatling gun. Despite their radical structural differences, these nematocysts share a single origin within dinoflagellates and both potentially use a contraction-based mechanism to generate ballistic force. The diversity of traits in dinoflagellate nematocysts demonstrates a stepwise route by which simple secretory structures diversified to yield elaborate subcellular weaponry.
PhySortR: a fast, flexible tool for sorting phylogenetic trees in R.
Stephens, Timothy G; Bhattacharya, Debashish; Ragan, Mark A; Chan, Cheong Xin
2016-01-01
A frequent bottleneck in interpreting phylogenomic output is the need to screen often thousands of trees for features of interest, particularly robust clades of specific taxa, as evidence of monophyletic relationship and/or reticulated evolution. Here we present PhySortR, a fast, flexible R package for classifying phylogenetic trees. Unlike existing utilities, PhySortR allows for identification of both exclusive and non-exclusive clades uniting the target taxa based on tip labels (i.e., leaves) on a tree, with customisable options to assess clades within the context of the whole tree. Using simulated and empirical datasets, we demonstrate the potential and scalability of PhySortR in analysis of thousands of phylogenetic trees without a priori assumption of tree-rooting, and in yielding readily interpretable trees that unambiguously satisfy the query. PhySortR is a command-line tool that is freely available and easily automatable.
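As a rough illustration of the distinction between exclusive and non-exclusive clades described above, the following Python sketch classifies a single tree. It is not PhySortR's code or interface (PhySortR is an R package): the nested-tuple tree encoding, the `classify` function, the `min_purity` threshold, and the `A_`/`B_` tip-label prefixes are all assumptions made for this example. To avoid assuming a root, the sketch also considers the complement of every subtree.

```python
from typing import Callable, FrozenSet, List, Tuple, Union

# Hypothetical tree encoding: a tip is a string, an internal node is a tuple of subtrees.
Tree = Union[str, Tuple["Tree", ...]]


def leaf_sets(tree: Tree, out: List[FrozenSet[str]]) -> FrozenSet[str]:
    """Append the set of tip labels below every node to `out` (post-order)."""
    if isinstance(tree, str):
        leaves = frozenset([tree])
    else:
        leaves = frozenset().union(*(leaf_sets(child, out) for child in tree))
    out.append(leaves)
    return leaves


def classify(tree: Tree, is_target: Callable[[str], bool], min_purity: float = 0.9) -> str:
    """Return 'exclusive', 'non-exclusive' or 'none' for the target taxa in one tree."""
    sides: List[FrozenSet[str]] = []
    all_tips = leaf_sets(tree, sides)
    # Rooting is not assumed: the complement of each subtree is also a candidate group.
    sides += [all_tips - s for s in sides if s != all_tips]
    targets = frozenset(t for t in all_tips if is_target(t))
    if not targets:
        return "none"
    result = "none"
    for side in sides:
        if not targets <= side:
            continue  # this group does not unite all target tips
        if side == targets:
            return "exclusive"  # only target tips in the group
        if len(targets) / len(side) >= min_purity:
            result = "non-exclusive"  # a tolerated share of non-target tips
    return result


def is_a(label: str) -> bool:
    return label.startswith("A_")


tree_exclusive = ((("A_1", "A_2"), "A_3"), ("B_1", "B_2"))
tree_mixed = ((("A_1", "B_1"), "A_2"), ("B_2", "B_3"))
print(classify(tree_exclusive, is_a))
print(classify(tree_mixed, is_a, min_purity=0.6))
```

Sorting a large collection of trees would then amount to calling `classify` on each tree and grouping the file names by the returned label.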
Mitochondrial genomes of two Australian fishflies with an evolutionary timescale of Chauliodinae.
Yang, Fan; Jiang, Yunlan; Yang, Ding; Liu, Xingyue
2017-06-30
Fishflies (Corydalidae: Chauliodinae) with a total of ca. 130 extant species are one of the major groups of the holometabolous insect order Megaloptera. As a group which originated during the Mesozoic, the phylogeny and historical biogeography of fishflies are of high interest. The previous hypothesis on the evolutionary history of fishflies was based primarily on morphological data. To further test the existing phylogenetic relationships and to understand the divergence pattern of fishflies, we conducted a molecule-based study. We determined the complete mitochondrial (mt) genomes of two Australian fishfly species, Archichauliodes deceptor Kimmins, 1954 and Protochauliodes biconicus Kimmins, 1954, both members of a major subgroup of Chauliodinae with high phylogenetic significance. A phylogenomic analysis was carried out based on 13 mt protein coding genes (PCGs) and two rRNA genes from the megalopteran species with determined mt genomes. Both maximum likelihood and Bayesian inference analyses recovered the Dysmicohermes clade as the sister group of the Archichauliodes clade + the Protochauliodes clade, which is consistent with the previous morphology-based hypothesis. The divergence time estimation suggested that the divergence among the three major subgroups of fishflies occurred during the Late Jurassic and Early Cretaceous when the supercontinent Pangaea was undergoing sequential breakup.
Identification of true EST alignments for recognising transcribed regions.
Ma, Chuang; Wang, Jia; Li, Lun; Duan, Mo-Jie; Zhou, Yan-Hong
2011-01-01
Transcribed regions can be determined by aligning Expressed Sequence Tags (ESTs) with genome sequences. The kernel of this strategy is to effectively distinguish true EST alignments from spurious ones. In this study, three measures, Direction Check, Identity Check and Terminal Check, were introduced to eliminate spurious EST alignments more effectively. On the basis of these measures and other widely used measures, a computational tool named ESTCleanser has been developed to identify true EST alignments and so obtain reliable transcribed regions. The performance of ESTCleanser has been evaluated on the well-annotated human ENCyclopedia of DNA Elements (ENCODE) regions using human ESTs in the dbEST database. The evaluation results show that the accuracy of ESTCleanser at the exon and intron levels is markedly higher than that of UCSC-spliced EST alignments. This work should be helpful to EST-based research on finding new genes, complementing genome annotation, and recognising alternative splicing events and Single Nucleotide Polymorphisms (SNPs).
Empirically Supported Treatment’s Impact on Organizational Culture and Climate
Patterson-Silver Wolf, David A.; Dulmus, Catherine N.; Maguin, Eugene
2012-01-01
Objectives: With the continued push to implement empirically supported treatments (ESTs) into community-based organizations, it is important to investigate whether working condition disruptions occur during this process. While there are many studies investigating best practices and how to adopt them, the literature lacks studies investigating the working conditions in programs that currently use ESTs. Method: This study compared the culture and climate scores of a large organization’s programs that use ESTs and those programs indicating no EST usage. Results: Of the total 55 different programs (1,273 frontline workers), 27 programs used ESTs. Results indicate that the programs offering an EST had significantly more rigid and resistant cultures, compared to those without any ESTs. In regard to climate, programs offering an EST were significantly less engaged, less functional, and more stressed. Conclusion: Outcomes indicate a significant disruption in organizational culture and climate for programs offering ESTs. PMID:23243379
Empirically Supported Treatment's Impact on Organizational Culture and Climate.
Patterson-Silver Wolf, David A; Dulmus, Catherine N; Maguin, Eugene
2012-11-01
OBJECTIVES: With the continued push to implement empirically supported treatments (ESTs) into community-based organizations, it is important to investigate whether working condition disruptions occur during this process. While there are many studies investigating best practices and how to adopt them, the literature lacks studies investigating the working conditions in programs that currently use ESTs. METHOD: This study compared the culture and climate scores of a large organization's programs that use ESTs and those programs indicating no EST usage. RESULTS: Of the total 55 different programs (1,273 frontline workers), 27 programs used ESTs. Results indicate that the programs offering an EST had significantly more rigid and resistant cultures, compared to those without any ESTs. In regard to climate, programs offering an EST were significantly less engaged, less functional, and more stressed. CONCLUSION: Outcomes indicate a significant disruption in organizational culture and climate for programs offering ESTs.
USDA-ARS's Scientific Manuscript database
Simple sequence repeat technology based on expressed sequence tag (EST-SSR) is a useful genomic tool for genome mapping, characterizing plant species relationships, elucidating genome evolution, and tracing genes on alien chromosome segments. EST-SSR primers developed from three perennial diploid T...
Weighill, Deborah A.; Jacobson, Daniel A.
2015-03-27
Herein we present and develop the theory of 3-way networks, a type of hypergraph in which each edge models relationships between triplets of objects as opposed to pairs of objects as done by standard network models. We explore approaches of how to prune these 3-way networks, illustrate their utility in comparative genomics and demonstrate how they find relationships which would be missed by standard 2-way network models using a phylogenomic dataset of 211 bacterial genomes.
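To make the idea of a 3-way edge concrete, here is a minimal Python sketch. The abstract does not state which similarity measure or pruning rule the authors use, so the three-set Jaccard index, the `threshold` parameter, and the toy gene-family profiles below are assumptions chosen purely for illustration.

```python
from itertools import combinations
from typing import Dict, Set, Tuple


def jaccard(*sets: Set[int]) -> float:
    """Size of the intersection of all sets divided by the size of their union."""
    union = set().union(*sets)
    if not union:
        return 0.0
    return len(set.intersection(*map(set, sets))) / len(union)


def three_way_edges(profiles: Dict[str, Set[int]], threshold: float) -> Dict[Tuple[str, str, str], float]:
    """Score every triplet of objects and keep (prune to) edges at or above `threshold`."""
    edges = {}
    for trio in combinations(sorted(profiles), 3):
        weight = jaccard(*(profiles[name] for name in trio))
        if weight >= threshold:
            edges[trio] = weight
    return edges


# Toy profiles: each hypothetical genome is a set of gene-family identifiers.
profiles = {
    "g1": {1, 2, 3},
    "g2": {1, 2, 4},
    "g3": {1, 2, 5},
    "g4": {7, 8},
}
print(three_way_edges(profiles, threshold=0.3))

# Pairwise scores alone cannot recover triplet structure: these three sets
# overlap in every pair, yet no element is shared by all three.
a, b, c = {1, 2}, {2, 3}, {1, 3}
print(jaccard(a, b), jaccard(a, c), jaccard(b, c), jaccard(a, b, c))
```

The second example shows why a triplet-level edge carries information that a standard 2-way network does not: three objects can be pairwise similar while sharing nothing in common as a group.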
Weighill, Deborah A; Jacobson, Daniel A
2015-01-01
We present and develop the theory of 3-way networks, a type of hypergraph in which each edge models relationships between triplets of objects as opposed to pairs of objects as done by standard network models. We explore approaches of how to prune these 3-way networks, illustrate their utility in comparative genomics and demonstrate how they find relationships which would be missed by standard 2-way network models using a phylogenomic dataset of 211 bacterial genomes. PMID:25815802
Lartillot, Nicolas; Brinkmann, Henner; Philippe, Hervé
2007-01-01
Background Thanks to the large amount of signal contained in genome-wide sequence alignments, phylogenomic analyses are converging towards highly supported trees. However, high statistical support does not imply that the tree is accurate. Systematic errors, such as the Long Branch Attraction (LBA) artefact, can be misleading, in particular when the taxon sampling is poor, or the outgroup is distant. In an otherwise consistent probabilistic framework, systematic errors in genome-wide analyses can be traced back to model mis-specification problems, which suggests that better models of sequence evolution should be devised, that would be more robust to tree reconstruction artefacts, even under the most challenging conditions. Methods We focus on a well characterized LBA artefact analyzed in a previous phylogenomic study of the metazoan tree, in which two fast-evolving animal phyla, nematodes and platyhelminths, emerge either at the base of all other Bilateria, or within protostomes, depending on the outgroup. We use this artefactual result as a case study for comparing the robustness of two alternative models: a standard, site-homogeneous model, based on an empirical matrix of amino-acid replacement (WAG), and a site-heterogeneous mixture model (CAT). In parallel, we propose a posterior predictive test, allowing one to measure how well a model acknowledges sequence saturation. Results Adopting a Bayesian framework, we show that the LBA artefact observed under WAG disappears when the site-heterogeneous model CAT is used. Using cross-validation, we further demonstrate that CAT has a better statistical fit than WAG on this data set. Finally, using our statistical goodness-of-fit test, we show that CAT, but not WAG, correctly accounts for the overall level of saturation, and that this is due to a better estimation of site-specific amino-acid preferences. 
Conclusion The CAT model appears to be more robust than WAG against LBA artefacts, essentially because it correctly anticipates the high probability of convergences and reversions implied by the small effective size of the amino-acid alphabet at each site of the alignment. More generally, our results provide strong evidence that site-specificities in the substitution process need be accounted for in order to obtain more reliable phylogenetic trees. PMID:17288577
The proteomic complexity and rise of the primordial ancestor of diversified life
2011-01-01
Background The last universal common ancestor represents the primordial cellular organism from which diversified life was derived. This urancestor accumulated genetic information before the rise of organismal lineages and is considered to be either a simple 'progenote' organism with a rudimentary translational apparatus or a more complex 'cenancestor' with almost all essential biological processes. Recent comparative genomic studies support the latter model and propose that the urancestor was similar to modern organisms in terms of gene content. However, most of these studies were based on molecular sequences, which are fast evolving and of limited value for deep evolutionary explorations. Results Here we engage in a phylogenomic study of protein domain structure in the proteomes of 420 free-living fully sequenced organisms. Domains were defined at the highly conserved fold superfamily (FSF) level of structural classification and an iterative phylogenomic approach was used to reconstruct max_set and min_set FSF repertoires as upper and lower bounds of the urancestral proteome. While the functional make up of the urancestral sets was complex, they represent only 5-11% of the 1,420 FSFs of extant proteomes and their make up and reuse was at least 5 and 3 times smaller than proteomes of free-living organisms, respectively. Trees of proteomes reconstructed directly from FSFs or from molecular functions, which included the max_set and min_set as artificial taxa, showed that urancestors were always placed at their base and rooted the tree of life in Archaea. Finally, a molecular clock of FSFs suggests the min_set reflects urancestral genetic make up more reliably and confirms diversified life emerged about 2.9 billion years ago during the start of planet oxygenation.
Conclusions The minimum urancestral FSF set reveals the urancestor had advanced metabolic capabilities, was especially rich in nucleotide metabolism enzymes, had pathways for the biosynthesis of membrane sn1,2 glycerol ester and ether lipids, and had crucial elements of translation, including a primordial ribosome with protein synthesis capabilities. It lacked however fundamental functions, including transcription, processes for extracellular communication, and enzymes for deoxyribonucleotide synthesis. Proteomic history reveals the urancestor is closer to a simple progenote organism but harbors a rather complex set of modern molecular functions. PMID:21612591
Field-scale prediction of enhanced DNAPL dissolution based on partitioning tracers.
Wang, Fang; Annable, Michael D; Jawitz, James W
2013-09-01
The equilibrium streamtube model (EST) has demonstrated the ability to accurately predict dense nonaqueous phase liquid (DNAPL) dissolution in laboratory experiments and numerical simulations. Here the model is applied to predict DNAPL dissolution at a tetrachloroethylene (PCE)-contaminated dry cleaner site, located in Jacksonville, Florida. The EST model is an analytical solution with field-measurable input parameters. Measured data from a field-scale partitioning tracer test were used to parameterize the EST model and the predicted PCE dissolution was compared to measured data from an in-situ ethanol flood. In addition, a simulated partitioning tracer test from a calibrated, three-dimensional, spatially explicit multiphase flow model (UTCHEM) was also used to parameterize the EST analytical solution. The EST ethanol prediction based on both the field partitioning tracer test and the simulation closely matched the total recovery well field ethanol data with Nash-Sutcliffe efficiency E=0.96 and 0.90, respectively. The EST PCE predictions showed a peak shift to earlier arrival times for models based on either field-measured or simulated partitioning tracer tests, resulting in poorer matches to the field PCE data in both cases. The peak shifts were concluded to be caused by well screen interval differences between the field tracer test and ethanol flood. Both the EST model and UTCHEM were also used to predict PCE aqueous dissolution under natural gradient conditions, which has a much less complex flow pattern than the forced-gradient double five spot used for the ethanol flood. The natural gradient EST predictions based on parameters determined from tracer tests conducted with a complex flow pattern underestimated the UTCHEM-simulated natural gradient total mass removal by 12% after 170 pore volumes of water flushing indicating that some mass was not detected by the tracers likely due to stagnation zones in the flow field. 
These findings highlight the important influence of well configuration and the associated flow patterns on dissolution. © 2013.
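The Nash-Sutcliffe efficiency quoted above is conventionally defined as E = 1 - Σ(obs - sim)² / Σ(obs - mean(obs))², so E = 1 indicates a perfect match and E = 0 means the prediction is no better than the mean of the observations. A minimal Python sketch follows; the function name and the numbers in the usage lines are made up for illustration and are not data from the study.

```python
from typing import Sequence


def nash_sutcliffe(observed: Sequence[float], simulated: Sequence[float]) -> float:
    """Nash-Sutcliffe efficiency: 1 - residual sum of squares / variance sum of squares."""
    if len(observed) != len(simulated) or len(observed) == 0:
        raise ValueError("observed and simulated must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; efficiency is undefined")
    return 1.0 - ss_res / ss_tot


# Illustrative values only (e.g. concentrations at successive sampling times).
observed = [1.0, 2.0, 3.0, 4.0]
print(nash_sutcliffe(observed, observed))        # identical series
print(nash_sutcliffe(observed, [2.5] * 4))       # prediction equal to the observed mean
```

Because the denominator is the spread of the observations themselves, the statistic can become negative when a prediction is worse than simply using the observed mean.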
Field-scale prediction of enhanced DNAPL dissolution based on partitioning tracers
NASA Astrophysics Data System (ADS)
Wang, Fang; Annable, Michael D.; Jawitz, James W.
2013-09-01
The equilibrium streamtube model (EST) has demonstrated the ability to accurately predict dense nonaqueous phase liquid (DNAPL) dissolution in laboratory experiments and numerical simulations. Here the model is applied to predict DNAPL dissolution at a tetrachloroethylene (PCE)-contaminated dry cleaner site, located in Jacksonville, Florida. The EST model is an analytical solution with field-measurable input parameters. Measured data from a field-scale partitioning tracer test were used to parameterize the EST model and the predicted PCE dissolution was compared to measured data from an in-situ ethanol flood. In addition, a simulated partitioning tracer test from a calibrated, three-dimensional, spatially explicit multiphase flow model (UTCHEM) was also used to parameterize the EST analytical solution. The EST ethanol prediction based on both the field partitioning tracer test and the simulation closely matched the total recovery well field ethanol data with Nash-Sutcliffe efficiency E = 0.96 and 0.90, respectively. The EST PCE predictions showed a peak shift to earlier arrival times for models based on either field-measured or simulated partitioning tracer tests, resulting in poorer matches to the field PCE data in both cases. The peak shifts were concluded to be caused by well screen interval differences between the field tracer test and ethanol flood. Both the EST model and UTCHEM were also used to predict PCE aqueous dissolution under natural gradient conditions, which has a much less complex flow pattern than the forced-gradient double five spot used for the ethanol flood. The natural gradient EST predictions based on parameters determined from tracer tests conducted with a complex flow pattern underestimated the UTCHEM-simulated natural gradient total mass removal by 12% after 170 pore volumes of water flushing indicating that some mass was not detected by the tracers likely due to stagnation zones in the flow field. 
These findings highlight the important influence of well configuration and the associated flow patterns on dissolution.
Hughes, Lily C; Ortí, Guillermo; Huang, Yu; Sun, Ying; Baldwin, Carole C; Thompson, Andrew W; Arcila, Dahiana; Betancur-R, Ricardo; Li, Chenhong; Becker, Leandro; Bellora, Nicolás; Zhao, Xiaomeng; Li, Xiaofeng; Wang, Min; Fang, Chao; Xie, Bing; Zhou, Zhuocheng; Huang, Hai; Chen, Songlin; Venkatesh, Byrappa; Shi, Qiong
2018-05-14
Our understanding of phylogenetic relationships among bony fishes has been transformed by analysis of a small number of genes, but uncertainty remains around critical nodes. Genome-scale inferences so far have sampled a limited number of taxa and genes. Here we leveraged 144 genomes and 159 transcriptomes to investigate fish evolution with an unparalleled scale of data: >0.5 Mb from 1,105 orthologous exon sequences from 303 species, representing 66 out of 72 ray-finned fish orders. We apply phylogenetic tests designed to trace the effect of whole-genome duplication events on gene trees and find paralogy-free loci using a bioinformatics approach. Genome-wide data support the structure of the fish phylogeny, and hypothesis-testing procedures appropriate for phylogenomic datasets using explicit gene genealogy interrogation settle some long-standing uncertainties, such as the branching order at the base of the teleosts and among early euteleosts, and the sister lineage to the acanthomorph and percomorph radiations. Comprehensive fossil calibrations date the origin of all major fish lineages before the end of the Cretaceous.
Between a Pod and a Hard Test: The Deep Evolution of Amoebae
Kang, Seungho; Tice, Alexander K.; Spiegel, Frederick W.; Silberman, Jeffrey D.; Pánek, Tomáš; Čepička, Ivan; Kostka, Martin; Kosakyan, Anush; Alcântara, Daniel M.C.; Roger, Andrew J.; Shadwick, Lora L.; Smirnov, Alexey; Kudryavtsev, Alexander; Lahr, Daniel J.G.; Brown, Matthew W.
2017-01-01
Amoebozoa is the eukaryotic supergroup sister to Obazoa, the lineage that contains the animals and Fungi, as well as their protistan relatives, and the breviate and apusomonad flagellates. Amoebozoa is extraordinarily diverse, encompassing important model organisms and significant pathogens. Although amoebozoans are integral to global nutrient cycles and present in nearly all environments, they remain vastly understudied. We present a robust phylogeny of Amoebozoa based on a broad representative set of taxa in a phylogenomic framework (325 genes). By sampling 61 taxa using culture-based and single-cell transcriptomics, our analyses show two major clades of Amoebozoa: Discosea and Tevosa. This phylogeny refutes previous studies in major respects. Our results support the hypothesis that the last common ancestor of Amoebozoa was sexual and flagellated; it also may have had the ability to disperse propagules from a sporocarp-type fruiting body. Overall, the main macroevolutionary patterns in Amoebozoa appear to result from the parallel losses of homologous characters of a multiphase life cycle that included flagella, sex, and sporocarps rather than independent acquisition of convergent features. PMID:28505375
Blom, Mozes P K
2015-08-05
Recently developed molecular methods enable geneticists to target and sequence thousands of orthologous loci and infer evolutionary relationships across the tree of life. Large numbers of genetic markers benefit species tree inference but visual inspection of alignment quality, as traditionally conducted, is challenging with thousands of loci. Furthermore, due to the impracticality of repeated visual inspection with alternative filtering criteria, the potential consequences of using datasets with different degrees of missing data remain nominally explored in most empirical phylogenomic studies. In this short communication, I describe a flexible high-throughput pipeline designed to assess alignment quality and filter exonic sequence data for subsequent inference. The stringency criteria for alignment quality and missing data can be adapted based on the expected level of sequence divergence. Each alignment is automatically evaluated based on the stringency criteria specified, significantly reducing the number of alignments that require visual inspection. By developing a rapid method for alignment filtering and quality assessment, the consistency of phylogenetic estimation based on exonic sequence alignments can be further explored across distinct inference methods, while accounting for different degrees of missing data.
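The kind of automated screening described above can be sketched in a few lines of Python. This is not the pipeline from the paper: the missing-data symbols, the `max_missing` and `min_taxa` thresholds, and the dictionary-based alignment representation are all assumptions made for this example.

```python
from typing import Dict, Optional

# Characters treated as missing data in this sketch (gaps and ambiguous bases).
MISSING = set("-?Nn")


def missing_fraction(sequence: str) -> float:
    """Proportion of alignment columns that are missing for one sequence."""
    if not sequence:
        return 1.0
    return sum(char in MISSING for char in sequence) / len(sequence)


def filter_alignment(
    alignment: Dict[str, str], max_missing: float = 0.5, min_taxa: int = 4
) -> Optional[Dict[str, str]]:
    """Drop sequences with too much missing data; reject the locus if too few taxa remain."""
    kept = {
        taxon: seq
        for taxon, seq in alignment.items()
        if missing_fraction(seq) <= max_missing
    }
    return kept if len(kept) >= min_taxa else None


# Toy exon alignment: taxon name -> aligned sequence (all the same length).
locus = {
    "t1": "ACGTACGT",
    "t2": "ACGTAC--",
    "t3": "------GT",
    "t4": "ACNNACGT",
}
print(filter_alignment(locus, max_missing=0.5, min_taxa=3))
print(filter_alignment(locus, max_missing=0.5, min_taxa=4))
```

Re-running the same function with different `max_missing` and `min_taxa` values is a cheap way to produce datasets with different degrees of missing data for comparison across inference methods.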
Braking the bandwagon: scrutinizing the science and politics of empirically supported therapies.
Hagemoser, Steven D
2009-12-01
Proponents of empirically supported therapies (ESTs) argue that because manualized ESTs have demonstrated efficacy in treating a range of psychological disorders, they should be the treatments of choice. In this article, the author uses a hypothetical treatment for obesity to highlight numerous flaws in EST logic and argues for common factors as a more clinically relevant but empirically challenging approach. The author then explores how political variables may be contributing to the expansion of EST and the resulting restriction of practitioner autonomy. Last, the author argues that EST is best viewed as 1 component of a more comprehensive evidence-based practice framework. The author concludes with some cautionary statements about the perils of equating the EST paradigm with the scientist-practitioner ideal.
PHYLUCE is a software package for the analysis of conserved genomic loci.
Faircloth, Brant C
2016-03-01
Targeted enrichment of conserved and ultraconserved genomic elements allows universal collection of phylogenomic data from hundreds of species at multiple time scales (<5 Ma to >300 Ma). Prior to downstream inference, data from these types of targeted enrichment studies must undergo preprocessing to assemble contigs from sequence data; identify targeted, enriched loci from the off-target background data; align enriched contigs representing conserved loci to one another; and prepare and manipulate these alignments for subsequent phylogenomic inference. PHYLUCE is an efficient and easy-to-install software package that accomplishes these tasks across hundreds of taxa and thousands of enriched loci. PHYLUCE is written for Python 2.7. PHYLUCE is supported on OSX and Linux (RedHat/CentOS) operating systems. PHYLUCE source code is distributed under a BSD-style license from https://www.github.com/faircloth-lab/phyluce/. PHYLUCE is also available as a package (https://binstar.org/faircloth-lab/phyluce) for the Anaconda Python distribution that installs all dependencies, and users can request a PHYLUCE instance on iPlant Atmosphere (tag: phyluce). The software manual and a tutorial are available from http://phyluce.readthedocs.org/en/latest/ and test data are available from doi: 10.6084/m9.figshare.1284521. Contact: brant@faircloth-lab.org. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
When outgroups fail; phylogenomics of rooting the emerging pathogen, Coxiella burnetii.
Pearson, Talima; Hornstra, Heidie M; Sahl, Jason W; Schaack, Sarah; Schupp, James M; Beckstrom-Sternberg, Stephen M; O'Neill, Matthew W; Priestley, Rachael A; Champion, Mia D; Beckstrom-Sternberg, James S; Kersh, Gilbert J; Samuel, James E; Massung, Robert F; Keim, Paul
2013-09-01
Rooting phylogenies is critical for understanding evolution, yet the importance, intricacies and difficulties of rooting are often overlooked. For rooting, polymorphic characters among the group of interest (ingroup) must be compared to those of a relative (outgroup) that diverged before the last common ancestor (LCA) of the ingroup. Problems arise if an outgroup does not exist, is unknown, or is so distant that few characters are shared, in which case duplicated genes originating before the LCA can be used as proxy outgroups to root diverse phylogenies. Here, we describe a genome-wide expansion of this technique that can be used to solve problems at the other end of the evolutionary scale: where ingroup individuals are all very closely related to each other, but the next closest relative is very distant. We used shared orthologous single nucleotide polymorphisms (SNPs) from 10 whole genome sequences of Coxiella burnetii, the causative agent of Q fever in humans, to create a robust, but unrooted phylogeny. To maximize the number of characters informative about the rooting, we searched entire genomes for polymorphic duplicated regions where orthologs of each paralog could be identified so that the paralogs could be used to root the tree. Recent radiations, such as those of emerging pathogens, often pose rooting challenges due to a lack of ingroup variation and large genomic differences with known outgroups. Using a phylogenomic approach, we created a robust, rooted phylogeny for C. burnetii. [Coxiella burnetii; paralog SNPs; pathogen evolution; phylogeny; recent radiation; root; rooting using duplicated genes.].
Horvath, Julie E.; Weisrock, David W.; Embry, Stephanie L.; Fiorentino, Isabella; Balhoff, James P.; Kappeler, Peter; Wray, Gregory A.; Willard, Huntington F.; Yoder, Anne D.
2008-01-01
Lemurs and the other strepsirrhine primates are of great interest to the primate genomics community due to their phylogenetic placement as the sister lineage to all other primates. Previous attempts to resolve the phylogeny of lemurs employed limited mitochondrial or small nuclear data sets, with many relationships poorly supported or entirely unresolved. We used genomic resources to develop 11 novel markers from nine chromosomes, representing ∼9 kb of nuclear sequence data. In combination with previously published nuclear and mitochondrial loci, this yields a data set of more than 16 kb and adds ∼275 kb of DNA sequence to current databases. Our phylogenetic analyses confirm hypotheses of lemuriform monophyly and provide robust resolution of the phylogenetic relationships among the five lemuriform families. We verify that the genus Daubentonia is the sister lineage to all other lemurs. The Cheirogaleidae and Lepilemuridae are sister taxa and together form the sister lineage to the Indriidae; this clade is the sister lineage to the Lemuridae. Divergence time estimates indicate that lemurs are an ancient group, with their initial diversification occurring around the Cretaceous-Tertiary boundary. Given the power of this data set to resolve branches in a notoriously problematic area of primate phylogeny, we anticipate that our phylogenomic toolkit will be of value to other studies of primate phylogeny and diversification. Moreover, the methods applied will be broadly applicable to other taxonomic groups where phylogenetic relationships have been notoriously difficult to resolve. PMID:18245770
Rodríguez, Ariel; Burgon, James D; Lyra, Mariana; Irisarri, Iker; Baurain, Denis; Blaustein, Leon; Göçmen, Bayram; Künzel, Sven; Mable, Barbara K; Nolte, Arne W; Veith, Michael; Steinfartz, Sebastian; Elmer, Kathryn R; Philippe, Hervé; Vences, Miguel
2017-10-01
The rise of high-throughput sequencing techniques provides the unprecedented opportunity to analyse controversial phylogenetic relationships in great depth, but also introduces a risk of being misinterpreted by high node support values influenced by unevenly distributed missing data or unrealistic model assumptions. Here, we use three largely independent phylogenomic data sets to reconstruct the controversial phylogeny of true salamanders of the genus Salamandra, a group of amphibians providing an intriguing model to study the evolution of aposematism and viviparity. For all six species of the genus Salamandra, and two outgroup species from its sister genus Lyciasalamandra, we used RNA sequencing (RNAseq) and restriction site associated DNA sequencing (RADseq) to obtain data for: (1) 3070 nuclear protein-coding genes from RNAseq; (2) 7440 loci obtained by RADseq; and (3) full mitochondrial genomes. The RNAseq and RADseq data sets retrieved fully congruent topologies when each of them was analyzed in a concatenation approach, with high support for: (1) S. infraimmaculata being sister group to all other Salamandra species; (2) S. algira being sister to S. salamandra; (3) these two species being the sister group to a clade containing S. atra, S. corsica and S. lanzai; and (4) the alpine species S. atra and S. lanzai being sister taxa. The phylogeny inferred from the mitochondrial genome sequences differed from these results, most notably by strongly supporting a clade containing S. atra and S. corsica as sister taxa. A different placement of S. corsica was also retrieved when analysing the RNAseq and RADseq data under species tree approaches. Closer examination of gene trees derived from RNAseq revealed that only a low number of them supported each of the alternative placements of S. atra. Furthermore, gene jackknife support for the S. atra - S. lanzai node stabilized only with very large concatenated data sets. 
The phylogeny of true salamanders thus provides a compelling example of how classical node support metrics such as bootstrap and Bayesian posterior probability can provide high confidence values in a phylogenomic topology even if the phylogenetic signal for some nodes is spurious, highlighting the importance of complementary approaches such as gene jackknifing. Yet, the general congruence among the topologies recovered from the RNAseq and RADseq data sets increases our confidence in the results, and validates the use of phylotranscriptomic approaches for reconstructing shallow relationships among closely related taxa. We hypothesize that the evolution of Salamandra has been characterized by episodes of introgressive hybridization, which would explain the difficulties of fully reconstructing their evolutionary relationships.
Comparative Genomics and Phylogenomics of East Asian Tulips (Amana, Liliaceae)
Li, Pan; Lu, Rui-Sen; Xu, Wu-Qin; Ohi-Toma, Tetsuo; Cai, Min-Qi; Qiu, Ying-Xiong; Cameron, Kenneth M.; Fu, Cheng-Xin
2017-01-01
The genus Amana Honda (Liliaceae), when it is treated as separate from Tulipa, comprises six perennial herbaceous species that are restricted to China, Japan and the Korean Peninsula. Although all six Amana species have important medicinal and horticultural uses, studies focused on species identification and molecular phylogenetics are few. Here we report the nucleotide sequences of six complete Amana chloroplast (cp) genomes. The cp genomes of Amana range from 150,613 bp to 151,136 bp in length, all including a pair of inverted repeats (25,629–25,859 bp) separated by the large single-copy (81,482–82,218 bp) and small single-copy (17,366–17,465 bp) regions. Each cp genome equivalently contains 112 unique genes consisting of 30 transfer RNA genes, four ribosomal RNA genes, and 78 protein coding genes. Gene content, gene order, AT content, and IR/SC boundary structure are nearly identical among all Amana cp genomes. However, the relative contraction and expansion of the IR/SC borders among the six Amana cp genomes results in length variation among them. Simple sequence repeat (SSR) analyses of these Amana cp genomes indicate that the richest SSRs are A/T mononucleotides. The number of repeats among the six Amana species varies from 54 (A. anhuiensis) to 69 (Amana kuocangshanica) with palindromic (28–35) and forward repeats (23–30) as the most common types. Phylogenomic analyses based on these complete cp genomes and 74 common protein-coding genes strongly support the monophyly of the genus, and a sister relationship between Amana and Erythronium, rather than a shared common ancestor with Tulipa. Nine DNA markers (rps15–ycf1, accD–psaI, petA–psbJ, rpl32–trnL, atpH–atpI, petD–rpoA, trnS–trnG, psbM–trnD, and ycf4–cemA) with number of variable sites greater than 0.9% were identified, and these may be useful for future population genetic and phylogeographic studies of Amana species. PMID:28421090
Xu, Lin; Wu, Yue-Hong; Zhou, Peng; Cheng, Hong; Liu, Qian; Xu, Xue-Wei
2018-05-23
Type strains of the genus Porphyrobacter belonging to the family Erythrobacteraceae and the class Alphaproteobacteria have been isolated from various environments, such as swimming pools, lake water and hot springs. P. cryptus DSM 12079 T and P. tepidarius DSM 10594 T out of all Erythrobacteraceae type strains, are two type strains that have been isolated from geothermal environments. Next-generation sequencing (NGS) technology offers a convenient approach for detecting situational types based on protein sequence differences between thermophiles and mesophiles; amino acid substitutions can lead to protein structural changes, improving the thermal stabilities of proteins. Comparative genomic studies have revealed that different thermal types exist in different taxa, and few studies have been focused on the class Alphaproteobacteria, especially the family Erythrobacteraceae. In this study, eight genomes of Porphyrobacter strains were compared to elucidate how Porphyrobacter thermophiles developed mechanisms to adapt to thermal environments. P. cryptus DSM 12079 T grew optimally at 50 °C, which was higher than the optimal growth temperature of other Porphyrobacter type strains. Phylogenomic analysis of the genus Porphyrobacter revealed that P. cryptus DSM 12079 T formed a distinct and independent clade. Comparative genomic studies uncovered that 1405 single-copy genes were shared by Porphyrobacter type strains. Alignments of single-copy proteins showed that various types of amino acid substitutions existed between P. cryptus DSM 12079 T and the other Porphyrobacter strains. The primary substitution types were changes from glycine/serine to alanine. P. cryptus DSM 12079 T was the sole thermophile within the genus Porphyrobacter. Phylogenomic analysis and amino acid frequencies indicated that amino acid substitutions might play an important role in the thermophily of P. cryptus DSM 12079 T . 
Bioinformatic analysis revealed that major amino acid substitutional types, such as changes from glycine/serine to alanine, increase the frequency of α-helices in proteins, promoting protein thermostability in P. cryptus DSM 12079 T . Hence, comparative genomic analysis broadens our understanding of thermophilic mechanisms in the genus Porphyrobacter and may provide a useful insight in the design of thermophilic enzymes for agricultural, industrial and medical applications.
Phylogenomic analysis of Apoidea sheds new light on the sister group of bees.
Sann, Manuela; Niehuis, Oliver; Peters, Ralph S; Mayer, Christoph; Kozlov, Alexey; Podsiadlowski, Lars; Bank, Sarah; Meusemann, Karen; Misof, Bernhard; Bleidorn, Christoph; Ohl, Michael
2018-05-18
Apoid wasps and bees (Apoidea) are an ecologically and morphologically diverse group of Hymenoptera, with some species of bees having evolved eusocial societies. Major problems for our understanding of the evolutionary history of Apoidea have been the difficulty of tracing the phylogenetic origin of bees and of reliably estimating their geological age. To address these issues, we compiled a comprehensive phylogenomic dataset by simultaneously analyzing target DNA enrichment and transcriptomic sequence data, comprising 195 single-copy protein-coding genes and covering all major lineages of apoid wasps and bee families. Our compiled data matrix comprised 284,607 nucleotide sites that we phylogenetically analyzed by applying a combination of domain- and codon-based partitioning schemes. The inferred results confirm the polyphyletic status of the former family "Crabronidae", which comprises nine major monophyletic lineages. We found the former subfamily Pemphredoninae to be polyphyletic, comprising three distantly related clades. One of them, Ammoplanina, constituted the sister group of bees in all our analyses. We estimate the origin of bees to be in the Early Cretaceous (ca. 128 million years ago), a time period during which angiosperms rapidly radiated. Finally, our phylogenetic analyses revealed that within the Apoidea, (eu)social societies evolved exclusively in a single clade that comprises pemphredonine and philanthine wasps as well as bees. By combining transcriptomic sequences with those obtained via target DNA enrichment, we were able to include an unprecedentedly large number of apoid wasps in a phylogenetic study for tracing the phylogenetic origin of bees. Our results confirm the polyphyletic nature of the former wasp family Crabronidae, which we here suggest splitting into eight families. Of these, the family Ammoplanidae possibly represents the extant sister lineage of bees.
Species of Ammoplanidae are known to hunt thrips, of which some aggregate on flowers and feed on pollen. The specific biology of Ammoplanidae as predators indicates how the transition from a predatory to pollen-collecting life style could have taken place in the evolution of bees. This insight plus the finding that (eu)social societies evolved exclusively in a single subordinated lineage of apoid wasps provides new perspectives for future comparative studies.
Noda, Hiroaki; Kawai, Sawako; Koizumi, Yoko; Matsui, Kageaki; Zhang, Qiang; Furukawa, Shigetoyo; Shimomura, Michihiko; Mita, Kazuei
2008-03-03
The brown planthopper (BPH), Nilaparvata lugens (Hemiptera, Delphacidae), is a serious insect pest of rice plants. The major means of BPH control are the application of agricultural chemicals and the cultivation of BPH-resistant rice varieties. Nevertheless, BPH strains that are resistant to agricultural chemicals have developed, and BPH strains have appeared that are virulent against the resistant rice varieties. Expressed sequence tag (EST) analysis and related applications are useful to elucidate the mechanisms of resistance and virulence and to reveal physiological aspects of this non-model insect, with its poorly understood genetic background. More than 37,000 high-quality ESTs, excluding sequences of the mitochondrial genome, microbial genomes, and rDNA, have been produced from 18 libraries of various BPH tissues and stages. About 10,200 clusters have been made from whole EST sequences, with an average EST size of 627 bp. Among the top ten most abundantly expressed genes, three are unique and show no homology in BLAST searches. The actin gene was highly expressed in BPH, especially in the thorax. Tissue-specifically expressed genes were extracted based on the expression frequency among the libraries. An EST database is available at our web site. The EST library will provide useful information for transcriptional analyses, proteomic analyses, and gene functional analyses of BPH. Moreover, specific genes for hemimetabolous insects will be identified. The microarray fabricated based on the EST information will be useful for finding genes related to agricultural and biological problems related to this pest.
Lubin, Johnathan W; Tucey, Timothy M; Lundblad, Victoria
2012-09-01
In the budding yeast Saccharomyces cerevisiae, the telomerase enzyme is composed of a 1.3-kb TLC1 RNA that forms a complex with Est2 (the catalytic subunit) and two regulatory proteins, Est1 and Est3. Previous work has identified a conserved 5-nt bulge, present in a long helical arm of TLC1, which mediates binding of Est1 to TLC1. However, increased expression of Est1 can bypass the consequences of removal of this RNA bulge, indicating that there are additional binding site(s) for Est1 on TLC1. We report here that a conserved single-stranded internal loop immediately adjacent to the bulge is also required for the Est1-RNA interaction; furthermore, a TLC1 variant that lacks this internal loop but retains the bulge cannot be suppressed by Est1 overexpression, arguing that the internal loop may be a more critical element for Est1 binding. An additional structural feature consisting of a single-stranded region at the base of the helix containing the bulge and internal loop also contributes to recognition of TLC1 by Est1, potentially by providing flexibility to this helical arm. Association of Est1 with each of these TLC1 motifs was assessed using a highly sensitive biochemical assay that simultaneously monitors the relative levels of the Est1 and Est2 proteins in the telomerase complex. The identification of three elements of TLC1 that are required for Est1 association provides a detailed view of this particular protein-RNA interaction.
Dictionary-learning-based reconstruction method for electron tomography.
Liu, Baodong; Yu, Hengyong; Verbridge, Scott S; Sun, Lizhi; Wang, Ge
2014-01-01
Electron tomography usually suffers from so-called “missing wedge” artifacts caused by limited tilt angle range. An equally sloped tomography (EST) acquisition scheme (which should be called the linogram sampling scheme) was recently applied to achieve 2.4-angstrom resolution. On the other hand, a compressive sensing inspired reconstruction algorithm, known as adaptive dictionary based statistical iterative reconstruction (ADSIR), has been reported for X-ray computed tomography. In this paper, we evaluate the EST, ADSIR, and an ordered-subset simultaneous algebraic reconstruction technique (OS-SART), and compare the ES and equally angled (EA) data acquisition modes. Our results show that OS-SART is comparable to EST, and the ADSIR outperforms EST and OS-SART. Furthermore, the equally sloped projection data acquisition mode has no advantage over the conventional equally angled mode in this context.
A study of alternative splicing in the pig
2010-01-01
Background: Since at least half of the genes in mammalian genomes are subjected to alternative splicing, alternative pre-mRNA splicing makes an important contribution to the complexity of the mammalian proteome. Expressed sequence tags (ESTs) provide evidence of a great number of possible alternative isoforms. With the EST resource for the domestic pig now containing more than one million porcine ESTs, it is possible to identify alternative splice forms of the individual transcripts in this species from the EST data with some confidence. Results: The pig EST data generated by the Sino-Danish Pig Genome project have been assembled with publicly available ESTs and made available in the PigEST database. Using the Distiller package, 2,515 EST clusters with candidate alternative isoforms were identified in the EST data with high confidence. In agreement with general observations in human and mouse, we find putative splice variants in about 30% of the contigs with more than 50 ESTs. Based on the criterion that a minimum of two EST sequences confirmed each splice event, a list of 100 genes with the most distinct tissue-specific alternative splice events was generated from the list of candidates. To confirm the tissue specificity of the splice events, 10 genes with functional annotation were randomly selected, from which 16 individual splice events were chosen for experimental verification by quantitative PCR (qPCR). Six genes were shown to have tissue-specific alternatively spliced transcripts with expression patterns matching those of the EST data. The remaining four genes had tissue-restricted expression of alternatively spliced transcripts. Five out of the 16 splice events that were experimentally verified were found to be putatively pig specific. Conclusions: In accordance with human and rodent studies, we estimate that approximately 30% of the porcine genes undergo alternative splicing.
We found a good correlation between EST-predicted tissue specificity and experimentally validated splice events in different porcine tissues. This study indicates that a cluster size of around 50 ESTs is optimal for in silico detection of alternative splicing. Although based on a limited number of splice events, the study supports the notion that alternative splicing could have an important impact on species differentiation, since 31% of the splice events studied appear to be species specific. PMID:20444244
Development of New Candidate Gene and EST-Based Molecular Markers for Gossypium Species
Buyyarapu, Ramesh; Kantety, Ramesh V.; Yu, John Z.; Saha, Sukumar; Sharma, Govind C.
2011-01-01
New sources of molecular markers accelerate efforts to improve cotton fiber traits and aid in developing high-density integrated genetic maps. We developed new markers based on candidate genes and G. arboreum EST sequences that were used for polymorphism detection followed by genetic and physical mapping. Nineteen gene-based markers were surveyed for polymorphism detection in 26 Gossypium species. Cluster analysis generated a phylogenetic tree with four major sub-clusters for 23 species, while three species branched out individually. The CAP method enhanced the rate of polymorphism of candidate gene-based markers between G. hirsutum and G. barbadense. Two hundred A-genome based SSR markers were designed after data mining of G. arboreum EST sequences (Mississippi Gossypium arboreum EST-SSR: MGAES). Over 70% of MGAES markers successfully produced amplicons, while 65 of them demonstrated polymorphism between the parents of the G. hirsutum and G. barbadense RIL population and formed 14 linkage groups. Chromosomal localization of both candidate gene-based and MGAES markers was assisted by euploid and hypoaneuploid CS-B analysis. Gene-based and MGAES markers were highly informative, as they were designed from candidate genes and the fiber transcriptome, with the potential to be integrated into the existing cotton genetic and physical maps. PMID:22315588
On April 22, 2008, EPA issued the final Lead; Renovation, Repair, and Painting (RRP) Program Rule. The rule addresses lead-based paint hazards created by renovation, repair, and painting activities that disturb lead-based paint in target housing and child-occupied facilities. Und...
MEGA X: Molecular Evolutionary Genetics Analysis across Computing Platforms.
Kumar, Sudhir; Stecher, Glen; Li, Michael; Knyaz, Christina; Tamura, Koichiro
2018-06-01
The Molecular Evolutionary Genetics Analysis (Mega) software implements many analytical methods and tools for phylogenomics and phylomedicine. Here, we report a transformation of Mega to enable cross-platform use on Microsoft Windows and Linux operating systems. Mega X does not require virtualization or emulation software and provides a uniform user experience across platforms. Mega X has additionally been upgraded to use multiple computing cores for many molecular evolutionary analyses. Mega X is available in two interfaces (graphical and command line) and can be downloaded from www.megasoftware.net free of charge.
Molecular Epidemiology and Genomics of Group A Streptococcus
Bessen, Debra E.; McShan, W. Michael; Nguyen, Scott V.; Shetty, Amol; Agrawal, Sonia; Tettelin, Hervé
2014-01-01
Streptococcus pyogenes (group A streptococcus; GAS) is a strict human pathogen with a very high prevalence worldwide. This review highlights the genetic organization of the species and the important ecological considerations that impact its evolution. Recent advances are presented on the topics of molecular epidemiology, population biology, molecular basis for genetic change, genome structure and genetic flux, phylogenomics and closely related streptococcal species, and the long- and short-term evolution of GAS. The application of whole genome sequence data to addressing key biological questions is discussed. PMID:25460818
None
2017-12-09
Dr. Muriel James is an engineer, an adviser/consultant to several commissions (e.g., criminal justice in Japan) and universities, and the author of numerous books (11) translated into several languages. Her best-known work is "Born to win". In her talk she refers to her book "O.K.Boss" and speaks about a specific model of human behavior and its essential qualities, which is the basis of her work.
Comparative mapping in the Fagaceae and beyond with EST-SSRs
2012-01-01
Background Genetic markers and linkage mapping are basic prerequisites for comparative genetic analyses, QTL detection and map-based cloning. A large number of mapping populations have been developed for oak, but few gene-based markers are available for constructing integrated genetic linkage maps and comparing gene order and QTL location across related species. Results We developed a set of 573 expressed sequence tag-derived simple sequence repeats (EST-SSRs) and located 397 markers (EST-SSRs and genomic SSRs) on the 12 oak chromosomes (2n = 2x = 24) on the basis of Mendelian segregation patterns in 5 full-sib mapping pedigrees of two species: Quercus robur (pedunculate oak) and Quercus petraea (sessile oak). Consensus maps for the two species were constructed and aligned. They showed a high degree of macrosynteny between these two sympatric European oaks. We assessed the transferability of EST-SSRs to other Fagaceae genera and a subset of these markers was mapped in Castanea sativa, the European chestnut. Reasonably high levels of macrosynteny were observed between oak and chestnut. We also obtained diversity statistics for a subset of EST-SSRs, to support further population genetic analyses with gene-based markers. Finally, based on the orthologous relationships between the oak, Arabidopsis, grape, poplar, Medicago, and soybean genomes and the paralogous relationships between the 12 oak chromosomes, we propose an evolutionary scenario of the 12 oak chromosomes from the eudicot ancestral karyotype. Conclusions This study provides map locations for a large set of EST-SSRs in two oak species of recognized biological importance in natural ecosystems. This first step toward the construction of a gene-based linkage map will facilitate the assignment of future genome scaffolds to pseudo-chromosomes. This study also provides an indication of the potential utility of new gene-based markers for population genetics and comparative mapping within and beyond the Fagaceae. 
PMID:22931513
Identification and characterization of gene-based SSR markers in date palm (Phoenix dactylifera L.).
Zhao, Yongli; Williams, Roxanne; Prakash, C S; He, Guohao
2012-12-15
Date palm (Phoenix dactylifera L.) is an important tree in the Middle East and North Africa due to the nutritional value of its fruit. Molecular breeding would accelerate genetic improvement of fruit trees through marker-assisted selection. However, the lack of molecular markers in date palm restricts the application of molecular breeding. In this study, we analyzed 28,889 EST sequences from the date palm genome database to identify simple-sequence repeats (SSRs) and to develop gene-based markers, i.e. expressed sequence tag-SSRs (EST-SSRs). We identified 4,609 ESTs as containing SSRs, among which trinucleotide motifs (69.7%) were the most common, followed by tetranucleotide (10.4%) and dinucleotide motifs (9.6%). The motif AG (85.7%) was most abundant among dinucleotides, while motifs AGG (26.8%), AAG (19.3%), and AGC (16.1%) were most common among trinucleotides. A total of 4,967 primer pairs were designed for EST-SSR markers from the computational data. In a follow-up laboratory study, we tested a sample of 20 randomly selected primer pairs for amplification and polymorphism detection using genomic DNA from date palm cultivars. Nearly one-third of these primer pairs detected DNA polymorphism that differentiated the twelve date palm cultivars used. Functional categorization of EST sequences containing SSRs revealed that 3,108 (67.4%) of such ESTs had homology with known proteins. Date palm EST sequences are a good resource for developing gene-based markers. The genic markers identified in our study may provide a valuable genetic and genomic tool for further genetic research and varietal development in date palm, such as diversity studies, QTL mapping, and molecular breeding.
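The SSR screening step described in this abstract can be illustrated with a minimal sketch. The abstract does not name the software or the repeat thresholds used, so everything below is an assumption made for illustration: the function names `find_ssrs` and `motif_length_counts`, the minimum repeat counts in `MIN_REPEATS`, and the restriction to perfect (uninterrupted) di-, tri-, and tetranucleotide repeats are all hypothetical choices, not the authors' method.

```python
import re
from collections import Counter

# Hypothetical thresholds (not from the abstract): minimum number of
# tandem copies required for each motif length.
MIN_REPEATS = {2: 6, 3: 5, 4: 4}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, copies) for each perfect SSR found in seq."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # ([ACGT]{k}) captures one motif; \1{n,} demands n or more extra copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are just a shorter unit repeated (AA, ATAT, ...).
            if any(
                motif == motif[:d] * (unit_len // d)
                for d in range(1, unit_len)
                if unit_len % d == 0
            ):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return hits


def motif_length_counts(seqs):
    """Tally SSRs by motif length (2, 3, or 4) across many sequences."""
    counts = Counter()
    for seq in seqs:
        for _, motif, _ in find_ssrs(seq):
            counts[len(motif)] += 1
    return counts


if __name__ == "__main__":
    example = "TTT" + "AG" * 7 + "CCC" + "AGG" * 5 + "T"
    print(find_ssrs(example))
    print(motif_length_counts([example]))
```

Running a tally like `motif_length_counts` over a full EST set is one way percentages such as the tri-, tetra-, and dinucleotide shares reported above could be derived, although the exact figures depend on the thresholds chosen and on whether imperfect or compound repeats are counted.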
ApiEST-DB: analyzing clustered EST data of the apicomplexan parasites.
Li, Li; Crabtree, Jonathan; Fischer, Steve; Pinney, Deborah; Stoeckert, Christian J; Sibley, L David; Roos, David S
2004-01-01
ApiEST-DB (http://www.cbil.upenn.edu/paradbs-servlet/) provides integrated access to publicly available EST data from protozoan parasites in the phylum Apicomplexa. The database currently incorporates a total of nearly 100,000 ESTs from several parasite species of clinical and/or veterinary interest, including Eimeria tenella, Neospora caninum, Plasmodium falciparum, Sarcocystis neurona and Toxoplasma gondii. To facilitate analysis of these data, EST sequences were clustered and assembled to form consensus sequences for each organism, and these assemblies were then subjected to automated annotation via similarity searches against protein and domain databases. The underlying relational database infrastructure, Genomics Unified Schema (GUS), enables complex biologically based queries, facilitating validation of gene models, identification of alternative splicing, detection of single nucleotide polymorphisms, identification of stage-specific genes and recognition of phylogenetically conserved and phylogenetically restricted sequences.
Noda, Hiroaki; Kawai, Sawako; Koizumi, Yoko; Matsui, Kageaki; Zhang, Qiang; Furukawa, Shigetoyo; Shimomura, Michihiko; Mita, Kazuei
2008-01-01
Background: The brown planthopper (BPH), Nilaparvata lugens (Hemiptera, Delphacidae), is a serious insect pest of rice plants. The major means of BPH control are the application of agricultural chemicals and the cultivation of BPH-resistant rice varieties. Nevertheless, BPH strains that are resistant to agricultural chemicals have developed, and BPH strains have appeared that are virulent against the resistant rice varieties. Expressed sequence tag (EST) analysis and related applications are useful to elucidate the mechanisms of resistance and virulence and to reveal physiological aspects of this non-model insect, with its poorly understood genetic background. Results: More than 37,000 high-quality ESTs, excluding sequences of the mitochondrial genome, microbial genomes, and rDNA, have been produced from 18 libraries of various BPH tissues and stages. About 10,200 clusters have been made from whole EST sequences, with an average EST size of 627 bp. Among the top ten most abundantly expressed genes, three are unique and show no homology in BLAST searches. The actin gene was highly expressed in BPH, especially in the thorax. Tissue-specifically expressed genes were extracted based on the expression frequency among the libraries. An EST database is available at our web site. Conclusion: The EST library will provide useful information for transcriptional analyses, proteomic analyses, and gene functional analyses of BPH. Moreover, specific genes for hemimetabolous insects will be identified. The microarray fabricated based on the EST information will be useful for finding genes related to agricultural and biological problems related to this pest. PMID:18315884
Kim, Kyunghee; Lee, Sang-Choon; Lee, Junki; Yu, Yeisoo; Yang, Kiwoung; Choi, Beom-Soon; Koh, Hee-Jong; Waminal, Nomar Espinosa; Choi, Hong-Il; Kim, Nam-Hoon; Jang, Woojong; Park, Hyun-Seung; Lee, Jonghoon; Lee, Hyun Oh; Joh, Ho Jun; Lee, Hyeon Ju; Park, Jee Young; Perumal, Sampath; Jayakodi, Murukarthick; Lee, Yun Sun; Kim, Backki; Copetti, Dario; Kim, Soonok; Kim, Sunggil; Lim, Ki-Byung; Kim, Young-Dong; Lee, Jungho; Cho, Kwang-Su; Park, Beom-Seok; Wing, Rod A.; Yang, Tae-Jin
2015-01-01
Cytoplasmic chloroplast (cp) genomes and nuclear ribosomal DNA (nR) are the primary sequences used to understand plant diversity and evolution. We introduce a high-throughput method to simultaneously obtain complete cp and nR sequences using Illumina platform whole-genome sequence data. We applied the method to 30 rice specimens belonging to nine Oryza species. Concurrent phylogenomic analysis using cp and nR of several specimens of the same Oryza AA genome species provides insight into the evolution and domestication of cultivated rice, clarifying three ambiguous but important issues in the evolution of wild Oryza species. First, cp-based trees clearly classify each lineage but can be biased by inter-subspecies cross-hybridization events during speciation. Second, O. glumaepatula, a South American wild rice, includes two cytoplasm types, one of which is derived from a recent interspecies hybridization with O. longistaminata. Third, the Australian O. rufipogon-type rice is a perennial form of O. meridionalis. PMID:26506948
Zhou, Xuming; Xu, Shixia; Xu, Junxiao; Chen, Bingyao; Zhou, Kaiya; Yang, Guang
2012-01-01
Although great progress has been made in resolving the relationships of placental mammals, the position of several clades in Laurasiatheria remains controversial. In this study, we performed a phylogenetic analysis of 97 orthologs (46,152 bp) for 15 taxa, representing all laurasiatherian orders. Additionally, phylogenetic trees of laurasiatherian mammals with draft genome sequences were reconstructed based on 1608 exons (2,175,102 bp). Our reconstructions resolve the interordinal relationships within Laurasiatheria and corroborate the clades Scrotifera, Fereuungulata, and Cetartiodactyla. Furthermore, we tested alternative topologies within Laurasiatheria, and among alternatives for the phylogenetic position of Perissodactyla, a sister-group relationship with Cetartiodactyla receives the highest support. Thus, Pegasoferae (Perissodactyla + Carnivora + Pholidota + Chiroptera) does not appear to be a natural group. Divergence time estimates from these genes were compared with published estimates for splits within Laurasiatheria. Our estimates were similar to those of several studies and suggest that the divergences among these orders occurred within just a few million years. PMID:21900649
Dodsworth, Jeremy A.; Blainey, Paul C.; Murugapiran, Senthil K.; Swingley, Wesley D.; Ross, Christian A.; Tringe, Susannah G.; Chain, Patrick S. G.; Scholz, Matthew B.; Lo, Chien-Chi; Raymond, Jason; Quake, Stephen R.; Hedlund, Brian P.
2013-01-01
OP9 is a yet-uncultivated bacterial lineage found in geothermal systems, petroleum reservoirs, anaerobic digesters, and wastewater treatment facilities. Here we use single-cell and metagenome sequencing to obtain two distinct, nearly-complete OP9 genomes, one constructed from single cells sorted from hot spring sediments and the other derived from binned metagenomic contigs from an in situ-enriched cellulolytic, thermophilic community. Phylogenomic analyses support the designation of OP9 as a candidate phylum for which we propose the name ‘Atribacteria’. Although a plurality of predicted proteins is most similar to those from Firmicutes, the presence of key genes suggests a diderm cell envelope. Metabolic reconstruction from the core genome suggests an anaerobic lifestyle based on sugar fermentation by Embden-Meyerhof glycolysis with production of hydrogen, acetate, and ethanol. Putative glycohydrolases and an endoglucanase may enable catabolism of (hemi)cellulose in thermal environments. This study lays a foundation for understanding the physiology and ecological role of the ‘Atribacteria’. PMID:23673639
Rider, Stanley Dean
2016-07-01
The complete mitochondrial genome of the desert darkling beetle Asbolus verrucosus (LeConte, 1851) was sequenced using paired-end technology to an average depth of 42,111× and assembled using De Bruijn graph-based methods. The genome is 15,828 bp in length and conforms to the basal arthropod mitochondrial gene composition with the same gene orders and orientations as other darkling beetle mitochondria. This arrangement includes a control region, 22 tRNA genes, 2 rRNA genes and 13 protein-coding genes. The main coding strand is probably replicated as the lagging strand (GC skew of -0.36 and AT skew of +0.19). Phylogenomics analyses are consistent with taxonomic classifications and indicate that Tenebrio molitor is the closest relative that has a completely sequenced mitochondrial genome available for analysis. This is the first fully assembled mitogenome sequence for a darkling beetle in the subfamily Pimeliinae and will be useful for population studies on members of this ecologically important group of beetles.
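The skew values quoted in the record above can be reproduced for any strand with the standard definitions, GC skew = (G − C)/(G + C) and AT skew = (A − T)/(A + T). The following is a minimal sketch under that assumption; the function name `strand_skews` and the toy sequence are hypothetical and are not taken from the study.

```python
from collections import Counter

def strand_skews(seq):
    """Return (gc_skew, at_skew) for one strand of a DNA sequence.

    Uses the standard definitions (G - C)/(G + C) and (A - T)/(A + T).
    Characters other than A, C, G, T (e.g. N) are ignored.
    """
    counts = Counter(seq.upper())
    g, c, a, t = (counts[base] for base in "GCAT")
    if g + c == 0 or a + t == 0:
        raise ValueError("sequence needs at least one G/C and one A/T base")
    return (g - c) / (g + c), (a - t) / (a + t)

# Toy strand for illustration only (not the A. verrucosus mitogenome).
toy = "AAAAAATTTTCCCG"
gc_skew, at_skew = strand_skews(toy)
print(f"GC skew = {gc_skew:+.2f}, AT skew = {at_skew:+.2f}")
```

A negative GC skew together with a positive AT skew, as reported for this genome, means the analysed strand carries more C than G and more A than T.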
Manno, Mariano Torres; Zuljan, Federico; Alarcón, Sergio; Esteban, Luis; Blancato, Victor; Espariz, Martín; Magni, Christian
2018-06-23
Lactococcus lactis strains constitute one of the most important starter cultures for cheese production. In this study, a genome-wide analysis of 68 available genomes of L. lactis group strains was performed, showing the existence of two species (L. lactis and L. cremoris) and two biovars (L. lactis biovar. diacetylactis and L. cremoris biovar. lactis). The proposed classification scheme revealed coherency among phenotypic (through in silico and in vivo bacterial function profiling), phylogenomic (through maximum likelihood trees) and genomic (using overall genome sequence-based parameters) approaches. Strain biodiversity for the industrial biovar. diacetylactis was also analyzed, revealing that it comprises at least three variants, with the CC1 clonal complex as the only one distributed worldwide. These findings and methodologies will help improve the selection of L. lactis group strains for industrial use as well as facilitate the interpretation of previous or future research studies on this diverse group of bacteria. Copyright © 2018. Published by Elsevier B.V.
Congur, Gulsah; Senay, Hilal; Turkcan, Ceren; Canavar, Ece; Erdem, Arzum; Akgol, Sinan
2013-06-28
The aim of this study is (i) to prepare estrone-imprinted nanospheres (nano-EST-MIPs) and (ii) to integrate them into an electrochemical sensor as a recognition layer. N-methacryloyl-(l)-phenylalanine (MAPA) was chosen as the complexing monomer. Firstly, estrone (EST) was complexed with MAPA, and the EST-imprinted poly(2-hydroxyethylmethacrylate-co-N-methacryloyl-(l)-phenylalanine) [EST-imprinted poly(HEMA-MAPA)] nanospheres were synthesized by a surfactant-free emulsion polymerization method. The specific surface area of the EST-imprinted poly(HEMA-MAPA) nanospheres was found to be 1275 m²/g, with a diameter of 163.2 nm. According to the elemental analysis results, the nanospheres contained 95.3 mmol MAPA/g nanosphere. The application of EST-specific MIP nanospheres for the development of an electrochemical biosensor was introduced for the first time in our study by using the electrochemical impedance spectroscopy (EIS) technique. This nano-MIP-based sensor showed high specificity and selectivity for EST.
Radiation dose reduction in medical x-ray CT via Fourier-based iterative reconstruction
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fahimian, Benjamin P.; Zhao Yunzhe; Huang Zhifeng
Purpose: A Fourier-based iterative reconstruction technique, termed Equally Sloped Tomography (EST), is developed in conjunction with advanced mathematical regularization to investigate radiation dose reduction in x-ray CT. The method is experimentally implemented on fan-beam CT and evaluated as a function of imaging dose on a series of image quality phantoms and anonymous pediatric patient data sets. Numerical simulation experiments are also performed to explore the extension of EST to helical cone-beam geometry. Methods: EST is a Fourier based iterative algorithm, which iterates back and forth between real and Fourier space utilizing the algebraically exact pseudopolar fast Fourier transform (PPFFT). In each iteration, physical constraints and mathematical regularization are applied in real space, while the measured data are enforced in Fourier space. The algorithm is automatically terminated when a proposed termination criterion is met. Experimentally, fan-beam projections were acquired by the Siemens z-flying focal spot technology, and subsequently interleaved and rebinned to a pseudopolar grid. Image quality phantoms were scanned at systematically varied mAs settings, reconstructed by EST and conventional reconstruction methods such as filtered back projection (FBP), and quantified using metrics including resolution, signal-to-noise ratios (SNRs), and contrast-to-noise ratios (CNRs). Pediatric data sets were reconstructed at their original acquisition settings and additionally simulated to lower dose settings for comparison and evaluation of the potential for radiation dose reduction. Numerical experiments were conducted to quantify EST and other iterative methods in terms of image quality and computation time. The extension of EST to helical cone-beam CT was implemented by using the advanced single-slice rebinning (ASSR) method.
Results: Based on the phantom and pediatric patient fan-beam CT data, it is demonstrated that EST reconstructions with the lowest scanner flux setting of 39 mAs produce comparable image quality, resolution, and contrast relative to FBP with the 140 mAs flux setting. Compared to the algebraic reconstruction technique and the expectation maximization statistical reconstruction algorithm, a significant reduction in computation time is achieved with EST. Finally, numerical experiments on helical cone-beam CT data suggest that the combination of EST and ASSR produces reconstructions with higher image quality and lower noise than the Feldkamp Davis and Kress (FDK) method and the conventional ASSR approach. Conclusions: A Fourier-based iterative method has been applied to the reconstruction of fan-beam CT data with reduced x-ray fluence. This method incorporates advantageous features in both real and Fourier space iterative schemes: using a fast and algebraically exact method to calculate forward projection, enforcing the measured data in Fourier space, and applying physical constraints and flexible regularization in real space. Our results suggest that EST can be utilized for radiation dose reduction in x-ray CT via the readily implementable technique of lowering mAs settings. Numerical experiments further indicate that EST requires less computation time than several other iterative algorithms and can, in principle, be extended to helical cone-beam geometry in combination with the ASSR method.
Radiation dose reduction in medical x-ray CT via Fourier-based iterative reconstruction.
Fahimian, Benjamin P; Zhao, Yunzhe; Huang, Zhifeng; Fung, Russell; Mao, Yu; Zhu, Chun; Khatonabadi, Maryam; DeMarco, John J; Osher, Stanley J; McNitt-Gray, Michael F; Miao, Jianwei
2013-03-01
A Fourier-based iterative reconstruction technique, termed Equally Sloped Tomography (EST), is developed in conjunction with advanced mathematical regularization to investigate radiation dose reduction in x-ray CT. The method is experimentally implemented on fan-beam CT and evaluated as a function of imaging dose on a series of image quality phantoms and anonymous pediatric patient data sets. Numerical simulation experiments are also performed to explore the extension of EST to helical cone-beam geometry. EST is a Fourier based iterative algorithm, which iterates back and forth between real and Fourier space utilizing the algebraically exact pseudopolar fast Fourier transform (PPFFT). In each iteration, physical constraints and mathematical regularization are applied in real space, while the measured data are enforced in Fourier space. The algorithm is automatically terminated when a proposed termination criterion is met. Experimentally, fan-beam projections were acquired by the Siemens z-flying focal spot technology, and subsequently interleaved and rebinned to a pseudopolar grid. Image quality phantoms were scanned at systematically varied mAs settings, reconstructed by EST and conventional reconstruction methods such as filtered back projection (FBP), and quantified using metrics including resolution, signal-to-noise ratios (SNRs), and contrast-to-noise ratios (CNRs). Pediatric data sets were reconstructed at their original acquisition settings and additionally simulated to lower dose settings for comparison and evaluation of the potential for radiation dose reduction. Numerical experiments were conducted to quantify EST and other iterative methods in terms of image quality and computation time. The extension of EST to helical cone-beam CT was implemented by using the advanced single-slice rebinning (ASSR) method. 
Based on the phantom and pediatric patient fan-beam CT data, it is demonstrated that EST reconstructions with the lowest scanner flux setting of 39 mAs produce comparable image quality, resolution, and contrast relative to FBP with the 140 mAs flux setting. Compared to the algebraic reconstruction technique and the expectation maximization statistical reconstruction algorithm, a significant reduction in computation time is achieved with EST. Finally, numerical experiments on helical cone-beam CT data suggest that the combination of EST and ASSR produces reconstructions with higher image quality and lower noise than the Feldkamp Davis and Kress (FDK) method and the conventional ASSR approach. A Fourier-based iterative method has been applied to the reconstruction of fan-beam CT data with reduced x-ray fluence. This method incorporates advantageous features in both real and Fourier space iterative schemes: using a fast and algebraically exact method to calculate forward projection, enforcing the measured data in Fourier space, and applying physical constraints and flexible regularization in real space. Our results suggest that EST can be utilized for radiation dose reduction in x-ray CT via the readily implementable technique of lowering mAs settings. Numerical experiments further indicate that EST requires less computation time than several other iterative algorithms and can, in principle, be extended to helical cone-beam geometry in combination with the ASSR method.
Radiation dose reduction in medical x-ray CT via Fourier-based iterative reconstruction
Fahimian, Benjamin P.; Zhao, Yunzhe; Huang, Zhifeng; Fung, Russell; Mao, Yu; Zhu, Chun; Khatonabadi, Maryam; DeMarco, John J.; Osher, Stanley J.; McNitt-Gray, Michael F.; Miao, Jianwei
2013-01-01
Purpose: A Fourier-based iterative reconstruction technique, termed Equally Sloped Tomography (EST), is developed in conjunction with advanced mathematical regularization to investigate radiation dose reduction in x-ray CT. The method is experimentally implemented on fan-beam CT and evaluated as a function of imaging dose on a series of image quality phantoms and anonymous pediatric patient data sets. Numerical simulation experiments are also performed to explore the extension of EST to helical cone-beam geometry. Methods: EST is a Fourier based iterative algorithm, which iterates back and forth between real and Fourier space utilizing the algebraically exact pseudopolar fast Fourier transform (PPFFT). In each iteration, physical constraints and mathematical regularization are applied in real space, while the measured data are enforced in Fourier space. The algorithm is automatically terminated when a proposed termination criterion is met. Experimentally, fan-beam projections were acquired by the Siemens z-flying focal spot technology, and subsequently interleaved and rebinned to a pseudopolar grid. Image quality phantoms were scanned at systematically varied mAs settings, reconstructed by EST and conventional reconstruction methods such as filtered back projection (FBP), and quantified using metrics including resolution, signal-to-noise ratios (SNRs), and contrast-to-noise ratios (CNRs). Pediatric data sets were reconstructed at their original acquisition settings and additionally simulated to lower dose settings for comparison and evaluation of the potential for radiation dose reduction. Numerical experiments were conducted to quantify EST and other iterative methods in terms of image quality and computation time. The extension of EST to helical cone-beam CT was implemented by using the advanced single-slice rebinning (ASSR) method. 
Results: Based on the phantom and pediatric patient fan-beam CT data, it is demonstrated that EST reconstructions with the lowest scanner flux setting of 39 mAs produce comparable image quality, resolution, and contrast relative to FBP with the 140 mAs flux setting. Compared to the algebraic reconstruction technique and the expectation maximization statistical reconstruction algorithm, a significant reduction in computation time is achieved with EST. Finally, numerical experiments on helical cone-beam CT data suggest that the combination of EST and ASSR produces reconstructions with higher image quality and lower noise than the Feldkamp Davis and Kress (FDK) method and the conventional ASSR approach. Conclusions: A Fourier-based iterative method has been applied to the reconstruction of fan-beam CT data with reduced x-ray fluence. This method incorporates advantageous features in both real and Fourier space iterative schemes: using a fast and algebraically exact method to calculate forward projection, enforcing the measured data in Fourier space, and applying physical constraints and flexible regularization in real space. Our results suggest that EST can be utilized for radiation dose reduction in x-ray CT via the readily implementable technique of lowering mAs settings. Numerical experiments further indicate that EST requires less computation time than several other iterative algorithms and can, in principle, be extended to helical cone-beam geometry in combination with the ASSR method. PMID:23464329
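The general shape of the iteration described in this record (measured data enforced in Fourier space, physical constraints applied in real space, automatic termination) can be illustrated with a deliberately simplified sketch. It is not the authors' EST implementation: it assumes a 1-D signal, a plain discrete Fourier transform in place of the PPFFT on a pseudopolar grid, non-negativity as the only real-space constraint, and a simple change-based stopping rule. All function and variable names are hypothetical.

```python
import cmath

def dft(x):
    """Naive forward discrete Fourier transform (stand-in for the PPFFT)."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
            for j in range(n)]

def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[j] * cmath.exp(2j * cmath.pi * j * k / n) for j in range(n)) / n
            for k in range(n)]

def alternating_reconstruction(measured, known, n_iter=100, tol=1e-9):
    """Alternate between Fourier-space data enforcement and a real-space constraint.

    measured: Fourier coefficients (only entries where known[j] is True are used)
    known:    booleans marking which coefficients were actually measured
    """
    n = len(measured)
    signal = [0.0] * n
    for _ in range(n_iter):
        spectrum = dft(signal)
        # Fourier space: overwrite the estimate with the measured coefficients.
        spectrum = [measured[j] if known[j] else spectrum[j] for j in range(n)]
        # Real space: keep the real part and enforce non-negativity.
        updated = [max(value.real, 0.0) for value in idft(spectrum)]
        change = max(abs(u - s) for u, s in zip(updated, signal))
        signal = updated
        if change < tol:  # assumed termination criterion for this sketch
            break
    return signal

# Toy example: every coefficient is measured, so the signal is recovered exactly.
truth = [0.0, 0.0, 1.0, 3.0, 2.0, 0.0, 0.0, 0.0]
recon = alternating_reconstruction(dft(truth), [True] * len(truth))

# With some coefficients missing, the constraint step has to fill the gap.
partial = alternating_reconstruction(dft(truth), [True] * 5 + [False] * 3)
```

When coefficients are missing, the result depends on how much the real-space constraints restrict the solution, which is where the regularization mentioned in the abstract would enter.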
Jing, S; Liu, B; Peng, L; Peng, X; Zhu, L; Fu, Q; He, G
2012-02-01
To assess genetic diversity in populations of the brown planthopper (Nilaparvata lugens Stål) (Homoptera: Delphacidae), we have developed and applied microsatellite, or simple sequence repeat (SSR), markers from expressed sequence tags (ESTs). We found that the brown planthopper clusters of ESTs were rich in SSRs with unique frequencies and distributions of SSR motifs. Three hundred and fifty-one EST-SSR markers were developed and yielded clear bands from samples of four brown planthopper populations. High cross-species transferability of these markers was detected in the closely related planthopper N. muiri. The newly developed EST-SSR markers provided sufficient resolution to distinguish within and among biotypes. Analyses based on SSR data revealed host resistance-based genetic differentiation among different brown planthopper populations; the genetic diversity of populations feeding on susceptible rice varieties was lower than that of populations feeding on resistant rice varieties. This is the first large-scale development of brown planthopper SSR markers, which will be useful for future molecular genetics and genomics studies of this serious agricultural pest.
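Mining ESTs for SSRs, as described in this record, amounts to scanning each sequence for short motifs repeated in tandem. The sketch below shows one simple way to do this with a regular expression. The minimum repeat counts are assumed thresholds chosen for illustration, not the criteria used in the study, and the function name `find_ssrs` is hypothetical.

```python
import re

# Assumed minimum repeat counts per motif length (illustrative only).
DEFAULT_MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}

def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    thresholds = DEFAULT_MIN_REPEATS if min_repeats is None else min_repeats
    seq = seq.upper()
    hits = []
    for size, reps in sorted(thresholds.items()):
        # A motif of `size` bases followed by at least (reps - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, reps - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "ATAT").
            if any(motif == motif[:k] * (size // k)
                   for k in range(1, size) if size % k == 0):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // size))
    return sorted(hits)

# Toy EST-like sequence with one dinucleotide and one trinucleotide repeat.
toy_est = "GGC" + "AT" * 7 + "CCT" + "AAG" * 5 + "T"
ssrs = find_ssrs(toy_est)
print(ssrs)
```

Start positions are 0-based. Primer design around each hit, which a marker-development workflow would also need, is outside the scope of this sketch.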
Primary sinonasal tuberculosis: a case report
Bouchentouf, Rachid; Bouaity, Brahim; Touati, Mohamed; Benjelloun, Amine; Aitbenasser, Moulay Ali
2013-01-01
Sinonasal localization of tuberculosis is rare. It is characterized by a polymorphic, nonspecific clinical presentation that often poses a problem of differential diagnosis. Diagnosis relies on histopathological and bacteriological examination, with direct smear and culture. Treatment is essentially medical, based on antituberculous drugs. PMID:23503658
A Phylogenomic Assessment of Ancient Polyploidy and Genome Evolution across the Poales
McKain, Michael R.; Tang, Haibao; McNeal, Joel R.; Ayyampalayam, Saravanaraj; Davis, Jerrold I.; dePamphilis, Claude W.; Givnish, Thomas J.; Pires, J. Chris; Stevenson, Dennis Wm.; Leebens-Mack, James H.
2016-01-01
Comparisons of flowering plant genomes reveal multiple rounds of ancient polyploidy characterized by large intragenomic syntenic blocks. Three such whole-genome duplication (WGD) events, designated as rho (ρ), sigma (σ), and tau (τ), have been identified in the genomes of cereal grasses. Precise dating of these WGD events is necessary to investigate how they have influenced diversification rates, evolutionary innovations, and genomic characteristics such as the GC profile of protein-coding sequences. The timing of these events has remained uncertain due to the paucity of monocot genome sequence data outside the grass family (Poaceae). Phylogenomic analysis of protein-coding genes from sequenced genomes and transcriptome assemblies from 35 species, including representatives of all families within the Poales, has resolved the timing of rho and sigma relative to speciation events and placed tau prior to divergence of Asparagales and the commelinids but after divergence with eudicots. Examination of gene family phylogenies indicates that rho occurred just prior to the diversification of Poaceae and sigma occurred before early diversification of Poales lineages but after the Poales-commelinid split. Additional lineage-specific WGD events were identified on the basis of the transcriptome data. Gene families exhibiting high GC content are underrepresented among those with duplicate genes that persisted following these genome duplications. However, genome duplications had little overall influence on lineage-specific changes in the GC content of coding genes. Improved resolution of the timing of WGD events in monocot history provides evidence for the influence of polyploidization on functional evolution and species diversification. PMID:26988252
When Outgroups Fail; Phylogenomics of Rooting the Emerging Pathogen, Coxiella burnetii
Pearson, Talima; Hornstra, Heidie M.; Sahl, Jason W.; Schaack, Sarah; Schupp, James M.; Beckstrom-Sternberg, Stephen M.; O'Neill, Matthew W.; Priestley, Rachael A.; Champion, Mia D.; Beckstrom-Sternberg, James S.; Kersh, Gilbert J.; Samuel, James E.; Massung, Robert F.; Keim, Paul
2013-01-01
Rooting phylogenies is critical for understanding evolution, yet the importance, intricacies and difficulties of rooting are often overlooked. For rooting, polymorphic characters among the group of interest (ingroup) must be compared to those of a relative (outgroup) that diverged before the last common ancestor (LCA) of the ingroup. Problems arise if an outgroup does not exist, is unknown, or is so distant that few characters are shared, in which case duplicated genes originating before the LCA can be used as proxy outgroups to root diverse phylogenies. Here, we describe a genome-wide expansion of this technique that can be used to solve problems at the other end of the evolutionary scale: where ingroup individuals are all very closely related to each other, but the next closest relative is very distant. We used shared orthologous single nucleotide polymorphisms (SNPs) from 10 whole genome sequences of Coxiella burnetii, the causative agent of Q fever in humans, to create a robust, but unrooted phylogeny. To maximize the number of characters informative about the rooting, we searched entire genomes for polymorphic duplicated regions where orthologs of each paralog could be identified so that the paralogs could be used to root the tree. Recent radiations, such as those of emerging pathogens, often pose rooting challenges due to a lack of ingroup variation and large genomic differences with known outgroups. Using a phylogenomic approach, we created a robust, rooted phylogeny for C. burnetii. [Coxiella burnetii; paralog SNPs; pathogen evolution; phylogeny; recent radiation; root; rooting using duplicated genes.] PMID:23736103
Trimpert, Jakob; Groenke, Nicole; Jenckel, Maria; He, Shulin; Kunec, Dusan; Szpara, Moriah L; Spatz, Stephen J; Osterrieder, Nikolaus; McMahon, Dino P
2017-12-01
Virulence determines the impact a pathogen has on the fitness of its host, yet current understanding of the evolutionary origins and causes of virulence of many pathogens is surprisingly incomplete. Here, we explore the evolution of Marek's disease virus (MDV), a herpesvirus commonly afflicting chickens and rarely other avian species. The history of MDV in the 20th century represents an important case study in the evolution of virulence. The severity of MDV infection in chickens has been rising steadily since the adoption of intensive farming techniques and vaccination programs in the 1950s and 1970s, respectively. It has remained uncertain, however, which of these factors is causally more responsible for the observed increase in virulence of circulating viruses. We conducted a phylogenomic study to understand the evolution of MDV in the context of dramatic changes to poultry farming and disease control. Our analysis reveals evidence of geographical structuring of MDV strains, with reconstructions supporting the emergence of virulent viruses independently in North America and Eurasia. Of note, the emergence of virulent viruses appears to coincide approximately with the introduction of comprehensive vaccination on both continents. The time-dated phylogeny also indicated that MDV has a mean evolutionary rate of ~1.6 × 10⁻⁵ substitutions per site per year. An examination of gene-linked mutations did not identify a strong association between mutational variation and virulence phenotypes, indicating that MDV may evolve readily and rapidly under strong selective pressures and that multiple genotypic pathways may underlie virulence adaptation in MDV.
Open Questions on the Origin of Life at Anoxic Geothermal Fields
Mulkidjanian, Armen Y.; Bychkov, Andrew Yu.; Dibrova, Daria V.; Galperin, Michael Y.; Koonin, Eugene V.
2014-01-01
We have recently reconstructed the ‘hatcheries’ of the first cells by combining geochemical analysis with phylogenomic scrutiny of the inorganic ion requirements of universal components of modern cells (Mulkidjanian et al.: Origin of first cells at terrestrial, anoxic geothermal fields. Proc Natl Acad Sci USA 2012, 109:E821–830). These ubiquitous, and by inference primordial, proteins and functional systems show affinity to and functional requirement for K+, Zn2+, Mn2+, and phosphate. Thus, protocells must have evolved in habitats with a high K+/Na+ ratio and relatively high concentrations of Zn, Mn and phosphorus compounds. Geochemical reconstruction shows that the ionic composition conducive to the origin of cells could not have existed in marine settings but is compatible with emissions of vapor-dominated zones of inland geothermal systems. Under an anoxic, CO2-dominated atmosphere, the ionic composition of pools of cool, condensed vapor at anoxic geothermal fields would resemble the internal milieu of modern cells. Such pools would be lined with porous silicate minerals mixed with metal sulfides and enriched in K+ ions and phosphorus compounds. Here we address some questions that have appeared in print after the publication of our anoxic geothermal field scenario. We argue that anoxic geothermal fields, which were identified as likely cradles of life by using a top-down approach and phylogenomic analysis as a tool, could provide geochemical conditions similar to those which were suggested as most conducive for the emergence of life by the chemists who pursue the complementary bottom-up strategy. PMID:23132762
Awai, Koichiro
2017-01-01
Lipid biosynthesis within the chloroplast, or more generally plastids, has conventionally been called the “prokaryotic pathway,” which produces glycerolipids bearing C18 acids at the sn-1 position and C16 acids at the sn-2 position, as in cyanobacteria such as Anabaena and Synechocystis. This positional specificity is determined during the synthesis of phosphatidate, which is a precursor to diacylglycerol, the acceptor of galactose for the synthesis of galactolipids. The first acylation at sn-1 is catalyzed by glycerol-3-phosphate acyltransferase (GPAT or GPT), whereas the second acylation at sn-2 is performed by lysophosphatidate acyltransferase (LPAAT, AGPAT, or PlsC). Here we present a comprehensive phylogenomic analysis of the origins of various acyltransferases involved in the synthesis of phosphatidate, as well as phosphatidate phosphatases in the chloroplasts. The results showed that the enzymes involved in the two steps of acylation in cyanobacteria and chloroplasts are entirely phylogenetically unrelated despite a previous report stating that the chloroplast LPAAT (ATS2) and cyanobacterial PlsC were sister groups. Phosphatidate phosphatases were separated into eukaryotic and prokaryotic clades, and the chloroplast enzymes were not of cyanobacterial origin, in contrast with another previous report. These results indicate that the lipid biosynthetic pathway in the chloroplasts or plastids did not originate from the cyanobacterial endosymbiont and is not “prokaryotic” in the context of the endosymbiotic theory of plastid origin. This is another line of evidence for the discontinuity of plastids and cyanobacteria, which has also been suggested for glycolipid biosynthesis. PMID:29145606
Chætognath transcriptome reveals ancestral and unique features among bilaterians
Marlétaz, Ferdinand; Gilles, André; Caubit, Xavier; Perez, Yvan; Dossat, Carole; Samain, Sylvie; Gyapay, Gabor; Wincker, Patrick; Le Parco, Yannick
2008-01-01
Background The chætognaths (arrow worms) have puzzled zoologists for years because of their astonishing morphological and developmental characteristics. Despite their deuterostome-like development, phylogenomic studies recently positioned the chætognath phylum in protostomes, most likely in an early branching. This key phylogenetic position and the peculiar characteristics of chætognaths prompted further investigation of their genomic features. Results Transcriptomic and genomic data were collected from the chætognath Spadella cephaloptera through the sequencing of expressed sequence tags and genomic bacterial artificial chromosome clones. Transcript comparisons at various taxonomic scales emphasized the conservation of a core gene set and phylogenomic analysis confirmed the basal position of chætognaths among protostomes. A detailed survey of transcript diversity and individual genotyping revealed a past genome duplication event in the chætognath lineage, which was, surprisingly, followed by a high retention rate of duplicated genes. Moreover, striking genetic heterogeneity was detected within the sampled population at the nuclear and mitochondrial levels but cannot be explained by cryptic speciation. Finally, we found evidence for trans-splicing maturation of transcripts through splice-leader addition in the chætognath phylum and we further report that this processing is associated with operonic transcription. Conclusion These findings reveal both shared ancestral and unique derived characteristics of the chætognath genome, which suggests that this genome is likely the product of a very original evolutionary history. These features promote chætognaths as a pivotal model for comparative genomics, which could provide new clues for the investigation of the evolution of animal genomes. PMID:18533022
Luo, Yang; Ma, Peng-Fei; Li, Hong-Tao; Yang, Jun-Bo; Wang, Hong; Li, De-Zhu
2016-01-01
The predominantly aquatic order Alismatales, which includes approximately 4,500 species within Araceae, Tofieldiaceae, and the core alismatid families, is a key group in investigating the origin and early diversification of monocots. Despite their importance, phylogenetic ambiguity regarding the root of the Alismatales tree precludes answering questions about the early evolution of the order. Here, we sequenced the first complete plastid genomes from three key families in this order: Potamogeton perfoliatus (Potamogetonaceae), Sagittaria lichuanensis (Alismataceae), and Tofieldia thibetica (Tofieldiaceae). Each family possesses the typical quadripartite structure, with plastid genome sizes of 156,226, 179,007, and 155,512 bp, respectively. Among them, the plastid genome of S. lichuanensis is the largest in monocots and the second largest in angiosperms. Like other sequenced Alismatales plastid genomes, all three families generally encode the same 113 genes with similar structure and arrangement. However, we detected 2.4 and 6 kb inversions in the plastid genomes of Sagittaria and Potamogeton, respectively. Further, we assembled a 79 plastid protein-coding gene sequence data matrix of 22 taxa that included the three newly generated plastid genomes plus 19 previously reported ones, which together represent all primary lineages of monocots and outgroups. In plastid phylogenomic analyses using maximum likelihood and Bayesian inference, we show both strong support for Acorales as sister to the remaining monocots and monophyly of Alismatales. More importantly, Tofieldiaceae was resolved as the most basal lineage within Alismatales. These results provide new insights into the evolution of Alismatales as well as the early-diverging monocots as a whole. PMID:26957030
The evolutionary history of ferns inferred from 25 low-copy nuclear genes.
Rothfels, Carl J; Li, Fay-Wei; Sigel, Erin M; Huiet, Layne; Larsson, Anders; Burge, Dylan O; Ruhsam, Markus; Deyholos, Michael; Soltis, Douglas E; Stewart, C Neal; Shaw, Shane W; Pokorny, Lisa; Chen, Tao; dePamphilis, Claude; DeGironimo, Lisa; Chen, Li; Wei, Xiaofeng; Sun, Xiao; Korall, Petra; Stevenson, Dennis W; Graham, Sean W; Wong, Gane K-S; Pryer, Kathleen M
2015-07-01
• Understanding fern (monilophyte) phylogeny and its evolutionary timescale is critical for broad investigations of the evolution of land plants, and for providing the point of comparison necessary for studying the evolution of the fern sister group, seed plants. Molecular phylogenetic investigations have revolutionized our understanding of fern phylogeny, however, to date, these studies have relied almost exclusively on plastid data.• Here we take a curated phylogenomics approach to infer the first broad fern phylogeny from multiple nuclear loci, by combining broad taxon sampling (73 ferns and 12 outgroup species) with focused character sampling (25 loci comprising 35877 bp), along with rigorous alignment, orthology inference and model selection.• Our phylogeny corroborates some earlier inferences and provides novel insights; in particular, we find strong support for Equisetales as sister to the rest of ferns, Marattiales as sister to leptosporangiate ferns, and Dennstaedtiaceae as sister to the eupolypods. Our divergence-time analyses reveal that divergences among the extant fern orders all occurred prior to ∼200 MYA. Finally, our species-tree inferences are congruent with analyses of concatenated data, but generally with lower support. Those cases where species-tree support values are higher than expected involve relationships that have been supported by smaller plastid datasets, suggesting that deep coalescence may be reducing support from the concatenated nuclear data.• Our study demonstrates the utility of a curated phylogenomics approach to inferring fern phylogeny, and highlights the need to consider underlying data characteristics, along with data quantity, in phylogenetic studies. © 2015 Botanical Society of America, Inc.
NASA Astrophysics Data System (ADS)
Bergeron, Alain
This research concerns the optical implementation of neural networks. Two different architectures are proposed. The first is an associative memory that associates an arbitrary output with any given object while preserving information about its position. The second architecture, a neural classifier for robotic control, identifies an input and assigns it to one of several categories. Its output is compatible with standard digital systems. A modular approach is adopted to realize these architectures. The correlator is the basic module of the implementations. Further modules are introduced to carry out the neural operations properly. The first of these is the optoelectronic threshold, which implements a nonlinear function, an essential element of neural networks. The second is the opto-digital encoder, which is used to classify objects. The problem of recording the memory is addressed using global iterative coding.
2011-01-01
Background Abiotic stresses, such as water deficit and soil salinity, result in changes in physiology, nutrient use, and vegetative growth in vines, and ultimately, yield and flavor in berries of wine grape, Vitis vinifera L. Large-scale expressed sequence tags (ESTs) were generated, curated, and analyzed to identify major genetic determinants responsible for stress-adaptive responses. Although roots serve as the first site of perception and/or injury for many types of abiotic stress, EST sequencing in root tissues of wine grape exposed to abiotic stresses has been extremely limited to date. To overcome this limitation, large-scale EST sequencing was conducted from root tissues exposed to multiple abiotic stresses. Results A total of 62,236 expressed sequence tags (ESTs) were generated from leaf, berry, and root tissues from vines subjected to abiotic stresses and compared with 32,286 ESTs sequenced from 20 public cDNA libraries. Curation to correct annotation errors, clustering and assembly of the berry and leaf ESTs with currently available V. vinifera full-length transcripts and ESTs yielded a total of 13,278 unique sequences, with 2302 singletons and 10,976 mapped to V. vinifera gene models. Of these, 739 transcripts were found to have significant differential expression in stressed leaves and berries including 250 genes not described previously as being abiotic stress responsive. In a second analysis of 16,452 ESTs from a normalized root cDNA library derived from roots exposed to multiple, short-term, abiotic stresses, 135 genes with root-enriched expression patterns were identified on the basis of their relative EST abundance in roots relative to other tissues. 
Conclusions The large-scale analysis of relative EST frequency counts among a diverse collection of 23 different cDNA libraries from leaf, berry, and root tissues of wine grape exposed to a variety of abiotic stress conditions revealed distinct, tissue-specific expression patterns, previously unrecognized stress-induced genes, and many novel genes with root-enriched mRNA expression for improving our understanding of root biology and manipulation of rootstock traits in wine grape. mRNA abundance estimates based on EST library-enriched expression patterns showed only modest correlations between microarray and quantitative, real-time reverse transcription-polymerase chain reaction (qRT-PCR) methods highlighting the need for deep-sequencing expression profiling methods. PMID:21592389
Pirooznia, Mehdi; Gong, Ping; Guan, Xin; Inouye, Laura S; Yang, Kuan; Perkins, Edward J; Deng, Youping
2007-01-01
Background Eisenia fetida, commonly known as red wiggler or compost worm, belongs to the Lumbricidae family of the Annelida phylum. Little is known about its genome sequence although it has been extensively used as a test organism in terrestrial ecotoxicology. In order to understand its gene expression response to environmental contaminants, we cloned 4032 cDNAs or expressed sequence tags (ESTs) from two E. fetida libraries enriched with genes responsive to ten ordnance related compounds using suppressive subtractive hybridization-PCR. Results A total of 3144 good quality ESTs (GenBank dbEST accession number EH669363–EH672369 and EL515444–EL515580) were obtained from the raw clone sequences after cleaning. Clustering analysis yielded 2231 unique sequences including 448 contigs (from 1361 ESTs) and 1783 singletons. Comparative genomic analysis showed that 743 or 33% of the unique sequences shared high similarity with existing genes in the GenBank nr database. Provisional function annotation assigned 830 Gene Ontology terms to 517 unique sequences based on their homology with the annotated genomes of four model organisms Drosophila melanogaster, Mus musculus, Saccharomyces cerevisiae, and Caenorhabditis elegans. Seven percent of the unique sequences were further mapped to 99 Kyoto Encyclopedia of Genes and Genomes pathways based on their matching Enzyme Commission numbers. All the information is stored and retrievable at a highly performed, web-based and user-friendly relational database called EST model database or ESTMD version 2. Conclusion The ESTMD containing the sequence and annotation information of 4032 E. fetida ESTs is publicly accessible at . PMID:18047730
Carro, Lorena; Nouioui, Imen; Sangal, Vartul; Meier-Kolthoff, Jan P; Trujillo, Martha E; Montero-Calasanz, Maria Del Carmen; Sahin, Nevzat; Smith, Darren Lee; Kim, Kristi E; Peluso, Paul; Deshpande, Shweta; Woyke, Tanja; Shapiro, Nicole; Kyrpides, Nikos C; Klenk, Hans-Peter; Göker, Markus; Goodfellow, Michael
2018-01-11
There is a need to clarify relationships within the actinobacterial genus Micromonospora, the type genus of the family Micromonosporaceae, given its biotechnological and ecological importance. Here, draft genomes of 40 Micromonospora type strains and two non-type strains are made available through the Genomic Encyclopedia of Bacteria and Archaea project and used to generate a phylogenomic tree which showed they could be assigned to well supported phyletic lines that were not evident in corresponding trees based on single and concatenated sequences of conserved genes. DNA G+C ratios derived from genome sequences showed that corresponding data from species descriptions were imprecise. Emended descriptions include precise base composition data and approximate genome sizes of the type strains. antiSMASH analyses of the draft genomes show that micromonosporae have a previously unrealised potential to synthesize novel specialized metabolites. Close to one thousand biosynthetic gene clusters were detected, including NRPS, PKS, terpene and siderophore clusters that were discontinuously distributed, thereby opening up the prospect of prioritising gifted strains for natural product discovery. The distribution of key stress-related genes provides an insight into how micromonosporae adapt to key environmental variables. Genes associated with plant interactions highlight the potential use of micromonosporae in agriculture and biotechnology.
Classification and Taxonomy of Vegetable Macergens
Aremu, Bukola R.; Babalola, Olubukola O.
2015-01-01
Macergens are bacteria capable of releasing pectic enzymes (pectolytic bacteria). These enzymatic actions result in the separation of plant tissues, leading to total plant destruction, and are responsible for soft rot diseases in vegetables. Macergens primarily belong to the genus Erwinia and to a range of opportunistic pathogens, namely Xanthomonas spp., Pseudomonas spp., Clostridium spp., Cytophaga spp., and Bacillus spp. They consist of taxa that display considerable heterogeneity and are intermingled with members of other genera belonging to the Enterobacteriaceae. They have been classified on the basis of phenotypic, chemotaxonomic, and genotypic characteristics, not all of which are necessary in the taxonomy of every bacterial genus for defining bacterial species and describing new ones. These taxonomic markers have been used traditionally as a simple technique for identification of bacterial isolates. The most important fields of taxonomy are supposed to be based on clear, reliable, and worldwide applicable criteria. Hence, this review clarifies the taxonomy of the macergens to the species level and reveals that their taxonomy is far from complete. For the discovery of additional species, further research using modern molecular methods such as phylogenomics is needed. This can precisely define the classification of macergens, resulting in occasional but significant changes to previous taxonomic schemes of these macergens. PMID:26640465
Between a Pod and a Hard Test: The Deep Evolution of Amoebae.
Kang, Seungho; Tice, Alexander K; Spiegel, Frederick W; Silberman, Jeffrey D; Pánek, Tomáš; Cepicka, Ivan; Kostka, Martin; Kosakyan, Anush; Alcântara, Daniel M C; Roger, Andrew J; Shadwick, Lora L; Smirnov, Alexey; Kudryavtsev, Alexander; Lahr, Daniel J G; Brown, Matthew W
2017-09-01
Amoebozoa is the eukaryotic supergroup sister to Obazoa, the lineage that contains the animals and Fungi, as well as their protistan relatives, and the breviate and apusomonad flagellates. Amoebozoa is extraordinarily diverse, encompassing important model organisms and significant pathogens. Although amoebozoans are integral to global nutrient cycles and present in nearly all environments, they remain vastly understudied. We present a robust phylogeny of Amoebozoa based on broad representative set of taxa in a phylogenomic framework (325 genes). By sampling 61 taxa using culture-based and single-cell transcriptomics, our analyses show two major clades of Amoebozoa, Discosea, and Tevosa. This phylogeny refutes previous studies in major respects. Our results support the hypothesis that the last common ancestor of Amoebozoa was sexual and flagellated, it also may have had the ability to disperse propagules from a sporocarp-type fruiting body. Overall, the main macroevolutionary patterns in Amoebozoa appear to result from the parallel losses of homologous characters of a multiphase life cycle that included flagella, sex, and sporocarps rather than independent acquisition of convergent features. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Genomics and Evolution in Traditional Medicinal Plants: Road to a Healthier Life
Hao, Da-Cheng; Xiao, Pei-Gen
2015-01-01
Medicinal plants have long been utilized in traditional medicine and ethnomedicine worldwide. This review presents a glimpse of the current status of and future trends in medicinal plant genomics, evolution, and phylogeny. These dynamic fields are at the intersection of phytochemistry and plant biology and are concerned with the evolution mechanisms and systematics of medicinal plant genomes, origin and evolution of the plant genotype and metabolic phenotype, interaction between medicinal plant genomes and their environment, the correlation between genomic diversity and metabolite diversity, and so on. Use of the emerging high-end genomic technologies can be expanded from crop plants to traditional medicinal plants, in order to expedite medicinal plant breeding and transform them into living factories of medicinal compounds. The utility of molecular phylogeny and phylogenomics in predicting chemodiversity and bioprospecting is also highlighted within the context of natural-product-based drug discovery and development. Representative case studies of medicinal plant genome, phylogeny, and evolution are summarized to exemplify the expansion of knowledge pedigree and the paradigm shift to the omics-based approaches, which update our awareness about plant genome evolution and enable the molecular breeding of medicinal plants and the sustainable utilization of plant pharmaceutical resources. PMID:26461812
Post-traumatic fat embolism syndrome
Berdai, Adnane Mohamed; Shimi, Abdelkarim; Khatouf, Mohammed
2014-01-01
Fat embolism syndrome is a serious complication of long-bone fractures; it results from the dissemination of fat particles in the microcirculation. The aim of this work is to determine the epidemiological profile, the clinical and paraclinical presentation of this syndrome, and its therapeutic management. Our study covers 11 cases of fat embolism syndrome collected in intensive care unit A1 of the Hassan II University Hospital in Fez, from January 2009 to June 2012. Positive diagnosis was based on Gurd's criteria. The collected cases are characterized by a predominance of male patients under 40 years of age presenting with a femoral fracture. The syndrome often occurs within 72 hours of the trauma. The clinical presentation is dominated by hypoxemia and impaired consciousness. Biologically, anemia and thrombocytopenia are the most frequent manifestations. Management is symptomatic; 63% of patients required intubation and ventilation. The outcome is not always benign. Our results confirm the polymorphism of the clinical and paraclinical presentation of fat embolism syndrome. Diagnosis of this syndrome is based on clinical criteria but remains essentially a diagnosis of exclusion. Management is symptomatic. Prevention of this syndrome is essential and relies on early fixation of long-bone fractures. PMID:25452829
NASA Astrophysics Data System (ADS)
Rebaine, Ali
1997-08-01
This work consists of the numerical simulation of two-dimensional, compressible, laminar and turbulent internal flows, with particular attention to flows in supersonic ejectors. The Navier-Stokes equations are formulated in conservative form and use, as independent variables, the so-called enthalpic variables, namely the static pressure, the momentum, and the specific total enthalpy. A stable variational formulation of the Navier-Stokes equations is used. It is based on the SUPG (Streamline Upwinding Petrov Galerkin) method and uses an operator for capturing strong gradients. A turbulence model for simulating flows in ejectors is developed. It separates two distinct regions: a region close to the solid wall, where the Baldwin and Lomax model is used, and a region far from the wall, where a new formulation based on Schlichting's model for jets is proposed. A technique for computing the turbulent viscosity on an unstructured mesh is implemented. The spatial discretization of the variational form is carried out with the finite element method using a mixed approximation: quadratic for the components of momentum and velocity, and linear for the remaining variables. The temporal discretization is performed with a finite difference method using the implicit Euler scheme. The matrix system resulting from the space-time discretization is solved with the GMRES algorithm using a diagonal preconditioner. Numerical validations were carried out on several types of nozzles and ejectors. The main validation consists of simulating the flow in the ejector tested at the NASA Lewis research center. The results obtained compare closely with those of earlier work and are markedly better for turbulent flows in ejectors.
O'Brien, Stephen J; Haussler, David; Ryder, Oliver
2014-01-01
Everyone loves the birds of the world. From their haunting songs and majesty of flight to dazzling plumage and mating rituals, bird watchers - both amateurs and professionals - have marveled for centuries at their considerable adaptations. Now, we are offered a special treat with the publication of a series of papers in dedicated issues of Science, Genome Biology and GigaScience (which also included pre-publication data release). These present the successful beginnings of an international interdisciplinary venture, the Avian Phylogenomics Project that lets us view, through a genomics lens, modern bird species and the evolutionary events that produced them.
Dreyer, Christine; Hoffmann, Margarete; Lanz, Christa; Willing, Eva-Maria; Riester, Markus; Warthmann, Norman; Sprecher, Andrea; Tripathi, Namita; Henz, Stefan R; Weigel, Detlef
2007-01-01
Background The guppy, Poecilia reticulata, is a well-known model organism for studying inheritance and variation of male ornamental traits as well as adaptation to different river habitats. However, genomic resources for studying this important model were not previously widely available. Results With the aim of generating molecular markers for genetic mapping of the guppy, cDNA libraries were constructed from embryos and different adult organs to generate expressed sequence tags (ESTs). About 18,000 ESTs were annotated according to BLASTN and BLASTX results and the sequence information from the 3' UTRs was exploited to generate PCR primers for re-sequencing of genomic DNA from different wild type strains. By comparison of EST-linked genomic sequences from at least four different ecotypes, about 1,700 polymorphisms were identified, representing about 400 distinct genes. Two interconnected MySQL databases were built to organize the ESTs and markers, respectively. A robust phylogeny of the guppy was reconstructed, based on 10 different nuclear genes. Conclusion Our EST and marker databases provide useful tools for genetic mapping and phylogenetic studies of the guppy. PMID:17686157
Riebeling, Christian; Pirow, Ralph; Becker, Klaus; Buesen, Roland; Eikel, Daniel; Kaltenhäuser, Johanna; Meyer, Frauke; Nau, Heinz; Slawik, Birgitta; Visan, Anke; Volland, Jutta; Spielmann, Horst; Luch, Andreas; Seiler, Andrea
2011-01-01
Teratogenicity can be predicted in vitro using the embryonic stem cell test (EST). The EST, which is based on the morphometric measurement of cardiomyocyte differentiation and cytotoxicity parameters, represents a scientifically validated method for the detection and classification of chemicals according to their teratogenic potency. Furthermore, an abbreviated protocol applying flow cytometry of intracellular marker proteins to determine differentiation into the cardiomyocyte lineage is available. Although valproic acid (VPA) is in worldwide clinical use as an antiepileptic drug, it exhibits two severe side effects, i.e., teratogenicity and hepatotoxicity. These limitations have led to extensive research into derivatives of VPA. Here we chose VPA as a model compound to test the applicability domain and to further evaluate the reliability of the EST. To this end, we study six closely related congeners of VPA and demonstrate that both the standard and the molecular flow cytometry-based EST are well suited to indicate differences in the teratogenic potency among VPA analogs that differ only in chirality or side chain length. Our data show that identical results can be obtained by using the standard EST or a shortened protocol based on flow cytometry of intracellular marker proteins. Both in vitro protocols enable reliable determination of the differentiation of murine stem cells toward the cardiomyocyte lineage and assessment of its chemical-mediated inhibition. PMID:21227905
Hendre, Prasad S.; Aggarwal, Ramesh K.
2014-01-01
Coffee breeding and improvement efforts can be greatly facilitated by availability of a large repository of simple sequence repeats (SSRs) based microsatellite markers, which provides efficiency and high-resolution in genetic analyses. This study was aimed to improve SSR availability in coffee by developing new genic−/genomic-SSR markers using in-silico bioinformatics and streptavidin-biotin based enrichment approach, respectively. The expressed sequence tag (EST) based genic microsatellite markers (EST-SSRs) were developed using the publicly available dataset of 13,175 unigene ESTs, which showed a distribution of 1 SSR/3.4 kb of coffee transcriptome. Genomic SSRs, on the other hand, were developed from an SSR-enriched small-insert partial genomic library of robusta coffee. In total, 69 new SSRs (44 EST-SSRs and 25 genomic SSRs) were developed and validated as suitable genetic markers. Diversity analysis of selected coffee genotypes revealed these to be highly informative in terms of allelic diversity and PIC values, and eighteen of these markers (∼27%) could be mapped on a robusta linkage map. Notably, the markers described here also revealed a very high cross-species transferability. In addition to the validated markers, we have also designed primer pairs for 270 putative EST-SSRs, which are expected to provide another ca. 200 useful genetic markers considering the high success rate (88%) of marker conversion of similar pairs tested/validated in this study. PMID:25461752
Generation, annotation and analysis of ESTs from Trichoderma harzianum CECT 2413
Vizcaíno, Juan Antonio; González, Francisco Javier; Suárez, M Belén; Redondo, José; Heinrich, Julian; Delgado-Jarana, Jesús; Hermosa, Rosa; Gutiérrez, Santiago; Monte, Enrique; Llobell, Antonio; Rey, Manuel
2006-01-01
Background The filamentous fungus Trichoderma harzianum is used as biological control agent of several plant-pathogenic fungi. In order to study the genome of this fungus, a functional genomics project called "TrichoEST" was developed to give insights into genes involved in biological control activities using an approach based on the generation of expressed sequence tags (ESTs). Results Eight different cDNA libraries from T. harzianum strain CECT 2413 were constructed. Different growth conditions involving mainly different nutrient conditions and/or stresses were used. We here present the analysis of the 8,710 ESTs generated. A total of 3,478 unique sequences were identified of which 81.4% had sequence similarity with GenBank entries, using the BLASTX algorithm. Using the Gene Ontology hierarchy, we performed the annotation of 51.1% of the unique sequences and compared its distribution among the gene libraries. Additionally, the InterProScan algorithm was used in order to further characterize the sequences. The identification of the putatively secreted proteins was also carried out. Later, based on the EST abundance, we examined the highly expressed genes and a hydrophobin was identified as the gene expressed at the highest level. We compared our collection of ESTs with the previous collections obtained from Trichoderma species and we also compared our sequence set with different complete eukaryotic genomes from several animals, plants and fungi. Accordingly, the presence of similar sequences in different kingdoms was also studied. Conclusion This EST collection and its annotation provide a significant resource for basic and applied research on T. harzianum, a fungus with a high biotechnological interest. PMID:16872539
Vibroacoustic study of a shell-floor-cavity system with application to a simplified fuselage
NASA Astrophysics Data System (ADS)
Missaoui, Jemai
The objective of this work is to develop semi-analytical models for studying the structural, acoustic, and vibroacoustic behavior of a shell-floor-cavity system. The connection between the shell and the floor is modeled using the concept of artificial stiffness. This flexible modeling concept makes it easier to choose the functions used to decompose the motion of each substructure. The results of this study will help in understanding the basic physical phenomena encountered in an aircraft structure. An integro-modal approach is developed to compute the acoustic modal characteristics. It discretizes the irregular cavity into acoustic subcavities whose expansion bases are known a priori. This physically based approach has the advantage of being efficient and accurate. Its validity was demonstrated using results available in the literature. A vibroacoustic model is developed in order to analyze and understand the structural and acoustic effects of the floor in the configuration. The validity of the results, in terms of resonance and transfer function, is verified using experimental measurements carried out in the laboratory.
Vontas, J G; Small, G J; Hemingway, J
2000-12-01
Organophosphorus and carbamate insecticide resistance in Nilaparvata lugens is based on amplification of a carboxylesterase gene, Nl-EST1. An identical gene occurs in susceptible insects. Quantitative real-time PCR was used to demonstrate that Nl-EST1 is amplified 3-7-fold in the genome of resistant compared to susceptible planthoppers. Expression levels were similar to amplification levels, with 1-15-fold more Nl-EST1 mRNA in individual insects and 5-11-fold more Nl-EST1 mRNA in mass whole body homogenates of resistant females compared to susceptibles. These values corresponded to an 8-10-fold increase in esterase activity in the head and thorax of individual resistant insects. Although amplification, expression and activity levels of Nl-EST1 in resistant N. lugens were similar, the correlation between esterase activity and Nl-EST1 mRNA levels in resistant individuals was not linear.
Sahu, Jagajjit; Das Talukdar, Anupam; Devi, Kamalakshi; Choudhury, Manabendra Dutta; Barooah, Madhumita; Modi, Mahendra Kumar; Sen, Priyabrata
2015-01-01
Centella asiatica (Gotu Kola) is a plant that grows in tropical swampy regions of the world and has important medicinal and culinary use. It is often considered as part of Ayurvedic medicine, traditional African medicine, and traditional Chinese medicine. The unavailability of genomics resources is significantly impeding its genetic improvement. To date, no attempt has been made to develop Expressed Sequence Tags (ESTs) derived Simple Sequence Repeat (SSR) markers (eSSRs) from the Centella genome. Hence, the present study aimed to develop eSSRs, validate them experimentally, and test their cross-transferability in different genera of the Apiaceae family, to which Centella belongs. An in-house pipeline was developed for the entire analysis by combining bioinformatics tools and Perl scripts. A total of 4443 C. asiatica EST sequences from dbEST were processed, which generated 2617 nonredundant high quality EST sequences consisting of 441 contigs and 2176 singletons. Out of 1776.5 kb of examined sequences, 417 (15.9%) ESTs containing 686 SSRs were detected with a density of one SSR per 2.59 kb. The gene ontology study revealed 282 functional domains involved in various processes, components, and functions, out of which 64 ESTs were found to have both SSRs and functional domains. Out of 603 designed EST-SSR primers, 18 pairs of primers were selected for validation based on the optimum parameter value. Reproducible amplification was obtained for six primer pairs in C. asiatica that were further tested for cross-transferability in nine other important genera/species of the Apiaceae family. Cross-transferability of the EST-SSR markers among the species was examined and Centella javanica showed the highest transferability (83.3%). The study revealed six highly polymorphic EST-SSR primers with an average PIC value of 0.95.
In conclusion, these EST-SSR markers hold great promise for the genomic analysis of Centella asiatica, for comparative map-based analyses across related species within the Apiaceae family, and for future marker-assisted breeding programs. To the best of our knowledge, this is the first report of development of EST-SSRs in Centella asiatica by in silico approaches, which offer real potential for further use in plant omics research and development.
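The SSR mining step and the reported density can be illustrated with a minimal Python sketch. The abstract does not describe the pipeline's tools or thresholds, so the minimum repeat counts in `MIN_REPEATS` and the function names are assumptions made only for illustration; the sketch finds perfect repeats with a regular expression and reproduces the density figure from the numbers quoted above.

```python
import re

# Hypothetical minimum repeat counts per motif length (assumed thresholds,
# not taken from the study).
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'ATAT')."""
    n = len(motif)
    return not any(n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n))


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in MIN_REPEATS.items():
        # One motif copy followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):
                hits.append((m.start(), motif, len(m.group(0)) // motif_len))
    return sorted(hits)


def ssr_density_kb(total_kb, n_ssrs):
    """Average sequence length (kb) per detected SSR."""
    return total_kb / n_ssrs


example = "GGC" + "AT" * 7 + "CCG" + "AGC" * 5 + "TT"
print(find_ssrs(example))
print(round(ssr_density_kb(1776.5, 686), 2))  # kb per SSR, from the abstract's totals
```

Dividing the 1776.5 kb of examined sequence by the 686 detected SSRs gives the quoted density of roughly one SSR per 2.59 kb.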
Wang, Zan; Yan, Hongwei; Fu, Xinnian; Li, Xuehui; Gao, Hongwen
2013-04-01
Efficient and robust molecular markers are essential for molecular breeding in plants. Compared to dominant and bi-allelic markers, multi-allelic simple sequence repeat (SSR) markers are particularly informative and are superior for genetic linkage and QTL mapping in autotetraploid species like alfalfa. The objective of this study was to develop SSR markers directly from alfalfa expressed sequence tags (ESTs). A total of 12,371 alfalfa ESTs were retrieved from the National Center for Biotechnology Information. A total of 774 SSRs were identified in 716 ESTs. On average, one SSR was found per 7.7 kb of EST sequences. Trinucleotide repeats (48.8 %) were the most abundant motif type, followed by di- (26.1 %), tetra- (11.5 %), penta- (9.7 %), and hexanucleotide (3.9 %) repeats. One hundred EST-SSR primer pairs were successfully designed and 29 exhibited polymorphism among 28 alfalfa accessions. The allele number per marker ranged from two to 21 with an average of 6.8. The PIC values ranged from 0.195 to 0.896 with an average of 0.608, indicating a high level of polymorphism of the EST-SSR markers. Genetic diversity assessed with the 29 EST-SSR markers showed that Medicago sativa ssp. sativa was clearly different from the other subspecies. The EST-SSR markers also showed high transferability to related species.
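The PIC values quoted above can be related to allele frequencies with a short Python sketch. It assumes the widely used definition of polymorphism information content from Botstein et al. (1980), PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j²; the abstract does not state which formula was applied, and the function name and example counts are illustrative only.

```python
def pic(allele_counts):
    """Polymorphism information content from allele counts or frequencies.

    Assumes the Botstein et al. (1980) definition; inputs are normalised
    so that either raw counts or frequencies can be passed.
    """
    total = sum(allele_counts)
    p = [c / total for c in allele_counts]
    homozygosity = sum(x * x for x in p)
    # Correction term for matings that are uninformative despite heterozygosity.
    cross = sum(
        2 * p[i] ** 2 * p[j] ** 2
        for i in range(len(p))
        for j in range(i + 1, len(p))
    )
    return 1 - homozygosity - cross


print(pic([14, 14]))          # two equally frequent alleles
print(pic([10, 6, 5, 4, 3]))  # hypothetical five-allele marker
```

For two equally frequent alleles the value is 0.375, and a marker with a single allele scores 0, so higher values require several alleles at balanced frequencies.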
NASA Astrophysics Data System (ADS)
Wang, F.; Annable, M. D.; Jawitz, J. W.
2012-12-01
The equilibrium streamtube model (EST) has demonstrated the ability to accurately predict dense nonaqueous phase liquid (DNAPL) dissolution in laboratory experiments and numerical simulations. Here the model is applied to predict DNAPL dissolution at a PCE-contaminated dry cleaner site, located in Jacksonville, Florida. The EST is an analytical solution with field-measurable input parameters. Here, measured data from a field-scale partitioning tracer test were used to parameterize the EST model and the predicted PCE dissolution was compared to measured data from an in-situ alcohol (ethanol) flood. In addition, a simulated partitioning tracer test from a calibrated spatially explicit multiphase flow model (UTCHEM) was also used to parameterize the EST analytical solution. The ethanol prediction based on both the field partitioning tracer test and the UTCHEM tracer test simulation closely matched the field data. The PCE EST prediction showed a peak shift to an earlier arrival time that was concluded to be caused by well screen interval differences between the field tracer test and alcohol flood. This observation was based on a modeling assessment of potential factors that may influence predictions by using UTCHEM simulations. The imposed injection and pumping flow pattern at this site for both the partitioning tracer test and alcohol flood was more complex than the natural gradient flow pattern (NGFP). Both the EST model and UTCHEM were also used to predict PCE dissolution under natural gradient conditions, with much simpler flow patterns than the forced-gradient double five spot of the alcohol flood. The NGFP predictions based on parameters determined from tracer tests conducted with complex flow patterns underestimated PCE concentrations and total mass removal. This suggests that the flow patterns influence aqueous dissolution and that the aqueous dissolution under the NGFP is more efficient than dissolution under complex flow patterns.
Luo, Xiangwen; Zhang, Deyong; Zhou, Xuguo; Du, Jiao; Zhang, Songbai; Liu, Yong
2018-05-09
The full-length open reading frame of the pyrethroid detoxification gene Est3385 contains 963 nucleotides. This gene was identified and cloned based on the genome sequence of Rhodopseudomonas palustris PSB-S available in GenBank. The predicted amino acid sequence of Est3385 shared moderate identities (30-46%) with known homologous esterases. Phylogenetic analysis revealed that Est3385 is a member of esterase family I. Recombinant Est3385 was heterologously expressed in E. coli, purified, and characterized for its substrate specificity, kinetics, and stability under various conditions. The optimal temperature and pH for Est3385 were 35 °C and 6.0, respectively. This enzyme could detoxify various pyrethroid pesticides and degraded the optimal substrate, fenpropathrin, with Km and Vmax values of 0.734 ± 0.013 mmol·L⁻¹ and 0.918 ± 0.025 U·µg⁻¹, respectively. No cofactor was found to affect Est3385 activity, but a substantial reduction of enzymatic activity was observed when metal ions were applied. Taken together, a new pyrethroid-degrading esterase was identified and characterized. Modification of Est3385 with protein engineering toolsets should enhance its potential for field application to reduce pesticide residues in agroecosystems.
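As a reading aid for the kinetic constants, the following Python sketch evaluates the standard Michaelis-Menten rate law, v = Vmax·[S] / (Km + [S]), with the Km and Vmax reported for fenpropathrin. It assumes the enzyme follows simple Michaelis-Menten kinetics, as the reporting of Km and Vmax suggests; the function name and the substrate concentrations are illustrative and not from the study.

```python
def michaelis_menten_rate(substrate_mM, km_mM=0.734, vmax=0.918):
    """Reaction rate (same units as vmax, here U per microgram) at a substrate
    concentration given in mmol/L, assuming simple Michaelis-Menten kinetics."""
    return vmax * substrate_mM / (km_mM + substrate_mM)


# Illustrative substrate concentrations (mmol/L), not taken from the study.
for s in (0.1, 0.734, 5.0):
    print(s, round(michaelis_menten_rate(s), 3))
```

At a substrate concentration equal to Km (0.734 mmol/L) the rate is half of Vmax, which is the defining property of Km.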
A simple respirogram-based approach for the management of effluent from an activated sludge system.
Li, Zhi-Hua; Zhu, Yuan-Mo; Yang, Cheng-Jian; Zhang, Tian-Yu; Yu, Han-Qing
2018-08-01
Managing wastewater treatment plants (WWTPs) based on respirometric analysis is a new and promising field. In this study, a multi-dimensional respirogram space was constructed, and an important index, R_es/t (the ratio of in-situ respiration rate to maximum respiration rate), was derived as an alarm signal for effluent quality control. A smaller R_es/t value suggests better effluent. The critical value R'_es/t used for determining whether the effluent meets the regulation depends on operational conditions, which were characterized by temperature and the biomass ratio of heterotrophs to autotrophs. With given operational conditions, the critical R'_es/t value can be calculated from the respirogram space and the effluent conditions required by the discharge regulation, with no requirement for calibration of parameters or any additional measurements. Since it is simple, easy to use, and can be readily implemented online, this approach holds great promise for applications. Copyright © 2018 Elsevier Ltd. All rights reserved.
De Maayer, Pieter; Aliyu, Habibu; Vikram, Surendra; Blom, Jochen; Duffy, Brion; Cowan, Don A.; Smits, Theo H. M.; Venter, Stephanus N.; Coutinho, Teresa A.
2017-01-01
Pantoea ananatis is ubiquitously found in the environment and causes disease on a wide range of plant hosts. By contrast, its sister species, Pantoea stewartii subsp. stewartii, is the host-specific causative agent of the devastating maize disease Stewart’s wilt. This pathogen has a restricted lifecycle, overwintering in an insect vector before being introduced into susceptible maize cultivars, causing disease and returning to overwinter in its vector. The other subspecies, P. stewartii subsp. indologenes, has been isolated from different plant hosts and is predicted to proliferate in different environmental niches. Here we have, by the use of comparative genomics and a comprehensive suite of bioinformatic tools, analyzed the genomes of ten P. stewartii and nineteen P. ananatis strains. Our phylogenomic analyses have revealed that there are two distinct clades within P. ananatis while far less phylogenetic diversity was observed among the P. stewartii subspecies. Pan-genome analyses revealed that a large core genome comprising 3,571 protein coding sequences is shared among the twenty-nine compared strains. Furthermore, we showed that an extensive accessory genome made up largely of a mobilome of plasmids, integrated prophages, integrative and conjugative elements and insertion elements has resulted in extensive diversification of P. stewartii and P. ananatis. While these organisms share many pathogenicity determinants, our comparative genomic analyses show that they differ in terms of the secretion systems they encode. The genomic differences identified in this study have allowed us to postulate on the divergent evolutionary histories of the analyzed P. ananatis and P. stewartii strains and on the molecular basis underlying their ecological success and host range. PMID:28959245
Romiguier, Jonathan; Ranwez, Vincent; Delsuc, Frédéric; Galtier, Nicolas; Douzery, Emmanuel J P
2013-09-01
Despite the rapid increase of size in phylogenomic data sets, a number of important nodes on animal phylogeny are still unresolved. Among these, the rooting of the placental mammal tree is still a controversial issue. One difficulty lies in the pervasive phylogenetic conflicts among genes, with each one telling its own story, which may be reliable or not. Here, we identified a simple criterion, that is, the GC content, which substantially helps in determining which gene trees best reflect the species tree. We assessed the ability of 13,111 coding sequence alignments to correctly reconstruct the placental phylogeny. We found that GC-rich genes induced a higher amount of conflict among gene trees and performed worse than AT-rich genes in retrieving well-supported, consensual nodes on the placental tree. We interpret this GC effect mainly as a consequence of genome-wide variations in recombination rate. Indeed, recombination is known to drive GC-content evolution through GC-biased gene conversion and might be problematic for phylogenetic reconstruction, for instance, in an incomplete lineage sorting context. When we focused on the AT-richest fraction of the data set, the resolution level of the placental phylogeny was greatly increased, and a strong support was obtained in favor of an Afrotheria rooting, that is, Afrotheria as the sister group of all other placentals. We show that in mammals most conflicts among gene trees, which have so far hampered the resolution of the placental tree, are concentrated in the GC-rich regions of the genome. We argue that the GC content-because it is a reliable indicator of the long-term recombination rate-is an informative criterion that could help in identifying the most reliable molecular markers for species tree inference.
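The GC-content criterion described above can be sketched in Python as a simple filter that ranks alignments by mean GC content and keeps the AT-richest fraction. This is only an illustration of the idea: the abstract does not specify how GC content was measured (for example, over all sites or only third codon positions) or which cutoff was used, so the function names, the input layout, and the `fraction` parameter are assumptions.

```python
def gc_content(seq):
    """Fraction of G and C among unambiguous bases; gaps and N are ignored."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return gc / (gc + at) if gc + at else 0.0


def mean_gc(sequences):
    """Average GC content over the sequences of one alignment."""
    return sum(gc_content(s) for s in sequences) / len(sequences)


def at_richest(alignments, fraction):
    """Names of the AT-richest alignments.

    alignments: dict mapping a gene name to its list of aligned sequences.
    fraction:   share of alignments to keep (hypothetical cutoff).
    """
    ranked = sorted(alignments, key=lambda name: mean_gc(alignments[name]))
    keep = max(1, int(len(ranked) * fraction))
    return ranked[:keep]


# Toy alignments with made-up gene names and sequences.
genes = {
    "geneA": ["ATATTA-GCA", "ATATTAAGCA"],
    "geneB": ["GGCCGCGCAT", "GGCCGCGC-T"],
    "geneC": ["ATGCATGCAT", "ATGCATGCNT"],
}
print(at_richest(genes, 0.34))
```

Gene trees would then be inferred only from the retained alignments, following the observation that AT-rich genes conflict less with the species tree.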
Luo, Yang; Ma, Peng-Fei; Li, Hong-Tao; Yang, Jun-Bo; Wang, Hong; Li, De-Zhu
2016-04-06
The predominantly aquatic order Alismatales, which includes approximately 4,500 species within Araceae, Tofieldiaceae, and the core alismatid families, is a key group in investigating the origin and early diversification of monocots. Despite their importance, phylogenetic ambiguity regarding the root of the Alismatales tree precludes answering questions about the early evolution of the order. Here, we sequenced the first complete plastid genomes from three key families in this order: Potamogeton perfoliatus (Potamogetonaceae), Sagittaria lichuanensis (Alismataceae), and Tofieldia thibetica (Tofieldiaceae). Each family possesses the typical quadripartite structure, with plastid genome sizes of 156,226, 179,007, and 155,512 bp, respectively. Among them, the plastid genome of S. lichuanensis is the largest in monocots and the second largest in angiosperms. Like other sequenced Alismatales plastid genomes, all three families generally encode the same 113 genes with similar structure and arrangement. However, we detected 2.4 and 6 kb inversions in the plastid genomes of Sagittaria and Potamogeton, respectively. Further, we assembled a 79 plastid protein-coding gene sequence data matrix of 22 taxa that included the three newly generated plastid genomes plus 19 previously reported ones, which together represent all primary lineages of monocots and outgroups. In plastid phylogenomic analyses using maximum likelihood and Bayesian inference, we show both strong support for Acorales as sister to the remaining monocots and monophyly of Alismatales. More importantly, Tofieldiaceae was resolved as the most basal lineage within Alismatales. These results provide new insights into the evolution of Alismatales as well as the early-diverging monocots as a whole. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
2012-01-01
Background Through next-generation sequencing, the amount of sequence data potentially available for phylogenetic analyses has increased exponentially in recent years. Simultaneously, the risk of incorporating ‘noisy’ data with misleading phylogenetic signal has also increased, and may disproportionately influence the topology of weakly supported nodes and lineages featuring rapid radiations and/or elevated rates of evolution. Results We investigated the influence of phylogenetic noise in large data sets by applying two fundamental strategies, variable site removal and long-branch exclusion, to the phylogenetic analysis of a full plastome alignment of 107 species of Pinus and six Pinaceae outgroups. While high overall phylogenetic resolution resulted from inclusion of all data, three historically recalcitrant nodes remained conflicted with previous analyses. Close investigation of these nodes revealed dramatically different responses to data removal. Whereas topological resolution and bootstrap support for two clades peaked with removal of highly variable sites, the third clade resolved most strongly when all sites were included. Similar trends were observed using long-branch exclusion, but patterns were neither as strong nor as clear. When compared to previous phylogenetic analyses of nuclear loci and morphological data, the most highly supported topologies seen in Pinus plastome analysis are congruent for the two clades gaining support from variable site removal and long-branch exclusion, but in conflict for the clade with highest support from the full data set. Conclusions These results suggest that removal of misleading signal in phylogenomic datasets can result not only in increased resolution for poorly supported nodes, but may serve as a tool for identifying erroneous yet highly supported topologies. For Pinus chloroplast genomes, removal of variable sites appears to be more effective than long-branch exclusion for clarifying phylogenetic hypotheses. 
PMID:22731878
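The variable site removal strategy described in the Pinus plastome study can be sketched in Python as follows. The abstract does not say how site variability was scored, so this sketch assumes a deliberately simple score (the number of distinct nucleotides in a column minus one) and a hypothetical `max_score` threshold; published analyses commonly use model-based site rate estimates instead.

```python
def site_variability(alignment):
    """Per-column score: number of distinct nucleotides minus one.

    Gaps and ambiguity codes are ignored; invariant or empty columns score 0.
    """
    scores = []
    for column in zip(*alignment):
        states = {c.upper() for c in column if c.upper() in "ACGT"}
        scores.append(max(len(states) - 1, 0))
    return scores


def remove_variable_sites(alignment, max_score):
    """Drop every column whose variability score exceeds max_score."""
    scores = site_variability(alignment)
    keep = [i for i, score in enumerate(scores) if score <= max_score]
    return ["".join(seq[i] for i in keep) for seq in alignment]


# Toy alignment of three equal-length sequences (made-up data).
aln = ["ACGTA", "ACCTA", "AGTT-"]
print(site_variability(aln))
print(remove_variable_sites(aln, max_score=1))
```

Re-running the tree inference on progressively filtered alignments, and tracking support at the nodes of interest, mirrors the comparison the abstract describes between the full and reduced data sets.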
Phylogeny of zebrafish, a "model species," within Danio, a "model genus".
McCluskey, Braedan M; Postlethwait, John H
2015-03-01
Zebrafish (Danio rerio) is an important model for vertebrate development, genomics, physiology, behavior, toxicology, and disease. Additionally, work on numerous Danio species is elucidating evolutionary mechanisms for morphological development. Yet, the relationships of zebrafish and its closest relatives remain unclear possibly due to incomplete lineage sorting, speciation with gene flow, and interspecies hybridization. To clarify these relationships, we first constructed phylogenomic data sets from 30,801 restriction-associated DNA (RAD)-tag loci (483,026 variable positions) with clear orthology to a single location in the sequenced zebrafish genome. We then inferred a well-supported species tree for Danio and tested for gene flow during the diversification of the genus. An approach independent of the sequenced zebrafish genome verified all inferred relationships. Although identification of the sister taxon to zebrafish has been contentious, multiple RAD-tag data sets and several analytical methods provided strong evidence for Danio aesculapii as the most closely related extant zebrafish relative studied to date. Data also displayed patterns consistent with gene flow during speciation and postspeciation introgression in the lineage leading to zebrafish. The incorporation of biogeographic data with phylogenomic analyses put these relationships in a phylogeographic context and supplied additional support for D. aesculapii as the sister species to D. rerio. The clear resolution of this study establishes a framework for investigating the evolutionary biology of Danio and the heterogeneity of genome evolution in the recent history of a model organism within an emerging model genus for genetics, development, and evolution. © The Author 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Anchored phylogenomics illuminates the skipper butterfly tree of life.
Toussaint, Emmanuel F A; Breinholt, Jesse W; Earl, Chandra; Warren, Andrew D; Brower, Andrew V Z; Yago, Masaya; Dexter, Kelly M; Espeland, Marianne; Pierce, Naomi E; Lohman, David J; Kawahara, Akito Y
2018-06-19
Butterflies (Papilionoidea) are perhaps the most charismatic insect lineage, yet phylogenetic relationships among them remain incompletely studied and controversial. This is especially true for skippers (Hesperiidae), one of the most species-rich and poorly studied butterfly families. To infer a robust phylogenomic hypothesis for Hesperiidae, we sequenced nearly 400 loci using Anchored Hybrid Enrichment and sampled all tribes and more than 120 genera of skippers. Molecular datasets were analyzed using maximum-likelihood, parsimony and coalescent multi-species phylogenetic methods. All analyses converged on a novel, robust phylogenetic hypothesis for skippers. Different optimality criteria and methodologies recovered almost identical phylogenetic trees with strong nodal support at nearly all nodes and all taxonomic levels. Our results support Coeliadinae as the sister group to the remaining skippers, the monotypic Euschemoninae as the sister group to all other subfamilies but Coeliadinae, and the monophyly of Eudaminae plus Pyrginae. Within Pyrginae, Celaenorrhinini and Tagiadini are sister groups, the Neotropical firetips, Pyrrhopygini, are sister to all other tribes but Celaenorrhinini and Tagiadini. Achlyodini is recovered as the sister group to Carcharodini, and Erynnini as sister group to Pyrgini. Within the grass skippers (Hesperiinae), there is strong support for the monophyly of Aeromachini plus remaining Hesperiinae. The giant skippers (Agathymus and Megathymus) once classified as a subfamily, are recovered as monophyletic with strong support, but are deeply nested within Hesperiinae. Anchored Hybrid Enrichment sequencing resulted in a large amount of data that built the foundation for a new, robust evolutionary tree of skippers. The newly inferred phylogenetic tree resolves long-standing systematic issues and changes our understanding of the skipper tree of life. 
These results enhance understanding of the evolution of one of the most species-rich butterfly families.
Turmel, Monique; de Cambiaire, Jean-Charles; Otis, Christian; Lemieux, Claude
2016-01-01
The Chlorodendrophyceae is a small class of green algae belonging to the core Chlorophyta, an assemblage that also comprises the Pedinophyceae, Trebouxiophyceae, Ulvophyceae and Chlorophyceae. Here we describe for the first time the chloroplast genomes of chlorodendrophycean algae (Scherffelia dubia, 137,161 bp; Tetraselmis sp. CCMP 881, 100,264 bp). Characterized by a very small single-copy (SSC) region devoid of any gene and an unusually large inverted repeat (IR), the quadripartite structures of the Scherffelia and Tetraselmis genomes are unique among all core chlorophytes examined thus far. The lack of genes in the SSC region is offset by the rich and atypical gene complement of the IR, which includes genes from the SSC and large single-copy regions of prasinophyte and streptophyte chloroplast genomes having retained an ancestral quadripartite structure. Remarkably, seven of the atypical IR-encoded genes have also been observed in the IRs of pedinophycean and trebouxiophycean chloroplast genomes, suggesting that they were already present in the IR of the common ancestor of all core chlorophytes. Considering that the relationships among the main lineages of the core Chlorophyta are still unresolved, we evaluated the impact of including the Chlorodendrophyceae in chloroplast phylogenomic analyses. The trees we inferred using data sets of 79 and 108 genes from 71 chlorophytes indicate that the Chlorodendrophyceae is a deep-diverging lineage of the core Chlorophyta, although the placement of this class relative to the Pedinophyceae remains ambiguous. Interestingly, some of our phylogenomic trees together with our comparative analysis of gene order data support the monophyly of the Trebouxiophyceae, thus offering further evidence that the previously observed affiliation between the Chlorellales and Pedinophyceae is the result of systematic errors in phylogenetic reconstruction.
Grant, Jessica R.; Katz, Laura A.
2014-01-01
Lateral gene transfer (LGT) has impacted the evolutionary history of eukaryotes, though to a lesser extent than in bacteria and archaea. Detecting LGT and distinguishing it from single gene tree artifacts is difficult, particularly when considering very ancient events (i.e., over hundreds of millions of years). Here, we use two independent lines of evidence—a taxon-rich phylogenetic approach and an assessment of the patterns of gene presence/absence—to evaluate the extent of LGT in the parasitic amoebozoan genus Entamoeba. Previous work has suggested that a number of genes in the genome of Entamoeba spp. were acquired by LGT. Our approach, using an automated phylogenomic pipeline to build taxon-rich gene trees, suggests that LGT is more extensive than previously thought. Our analyses reveal that genes have frequently entered the Entamoeba genome via nonvertical events, including at least 116 genes acquired directly from bacteria or archaea, plus an additional 22 genes in which Entamoeba plus one other eukaryote are nested among bacteria and/or archaea. These genes may make good candidates for novel therapeutics, as drugs targeting these genes are less likely to impact the human host. Although we recognize the challenges of inferring intradomain transfers given systematic errors in gene trees, we find 109 genes supporting LGT from a eukaryote to Entamoeba spp., and 178 genes unique to Entamoeba spp. and one other eukaryotic taxon (i.e., presence/absence data). Inspection of these intradomain LGTs provides evidence of a common sister relationship between genes of Entamoeba (Amoebozoa) and parabasalids (Excavata). We speculate that this indicates a past close relationship (e.g., symbiosis) between ancestors of these extant lineages. PMID:25146649
Sato, Naoki; Awai, Koichiro
2017-11-01
Lipid biosynthesis within the chloroplast, or more generally plastids, was conventionally called "prokaryotic pathway," which produces glycerolipids bearing C18 acids at the sn-1 position and C16 acids at the sn-2 position, as in cyanobacteria such as Anabaena and Synechocystis. This positional specificity is determined during the synthesis of phosphatidate, which is a precursor to diacylglycerol, the acceptor of galactose for the synthesis of galactolipids. The first acylation at sn-1 is catalyzed by glycerol-3-phosphate acyltransferase (GPAT or GPT), whereas the second acylation at sn-2 is performed by lysophosphatidate acyltransferase (LPAAT, AGPAT, or PlsC). Here we present comprehensive phylogenomic analysis of the origins of various acyltransferases involved in the synthesis of phosphatidate, as well as phosphatidate phosphatases in the chloroplasts. The results showed that the enzymes involved in the two steps of acylation in cyanobacteria and chloroplasts are entirely phylogenetically unrelated despite a previous report stating that the chloroplast LPAAT (ATS2) and cyanobacterial PlsC were sister groups. Phosphatidate phosphatases were separated into eukaryotic and prokaryotic clades, and the chloroplast enzymes were not of cyanobacterial origin, in contrast with another previous report. These results indicate that the lipid biosynthetic pathway in the chloroplasts or plastids did not originate from the cyanobacterial endosymbiont and is not "prokaryotic" in the context of endosymbiotic theory of plastid origin. This is another line of evidence for the discontinuity of plastids and cyanobacteria, which has been suggested in the glycolipid biosynthesis. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Reconstructing the backbone of the Saccharomycotina yeast phylogeny using genome-scale data
Shen, Xing-Xing; Zhou, Xiaofan; Kominek, Jacek; ...
2016-09-26
Understanding the phylogenetic relationships among the yeasts of the subphylum Saccharomycotina is a prerequisite for understanding the evolution of their metabolisms and ecological lifestyles. In the last two decades, the use of rDNA and multilocus data sets has greatly advanced our understanding of the yeast phylogeny, but many deep relationships remain unsupported. In contrast, phylogenomic analyses have involved relatively few taxa and lineages that were often selected with limited considerations for covering the breadth of yeast biodiversity. Here we used genome sequence data from 86 publicly available yeast genomes representing nine of the 11 known major lineages and 10 nonyeast fungal outgroups to generate a 1233-gene, 96-taxon data matrix. Species phylogenies reconstructed using two different methods (concatenation and coalescence) and two data matrices (amino acids or the first two codon positions) yielded identical and highly supported relationships between the nine major lineages. Aside from the lineage comprised by the family Pichiaceae, all other lineages were monophyletic. Most interrelationships among yeast species were robust across the two methods and data matrices. Furthermore, eight of the 93 internodes conflicted between analyses or data sets, including the placements of: the clade defined by species that have reassigned the CUG codon to encode serine, instead of leucine; the clade defined by a whole genome duplication; and the species Ascoidea rubescens. These phylogenomic analyses provide a robust roadmap for future comparative work across the yeast subphylum in the disciplines of taxonomy, molecular genetics, evolutionary biology, ecology, and biotechnology. To further this end, we have also provided a BLAST server to query the 86 Saccharomycotina genomes, which can be found at http://y1000plus.org/blast.
Finnerty, John R; Mazza, Maureen E; Jezewski, Peter A
2009-01-01
Background Msx originated early in animal evolution and is implicated in human genetic disorders. To reconstruct the functional evolution of Msx and inform the study of human mutations, we analyzed the phylogeny and synteny of 46 metazoan Msx proteins and tracked the duplication, diversification and loss of conserved motifs. Results Vertebrate Msx sequences sort into distinct Msx1, Msx2 and Msx3 clades. The sister-group relationship between MSX1 and MSX2 reflects their derivation from the 4p/5q chromosomal paralogon, a derivative of the original "MetaHox" cluster. We demonstrate physical linkage between Msx and other MetaHox genes (Hmx, NK1, Emx) in a cnidarian. Seven conserved domains, including two Groucho repression domains (N- and C-terminal), were present in the ancestral Msx. In cnidarians, the Groucho domains are highly similar. In vertebrate Msx1, the N-terminal Groucho domain is conserved, while the C-terminal domain diverged substantially, implying a novel function. In vertebrate Msx2 and Msx3, the C-terminal domain was lost. MSX1 mutations associated with ectodermal dysplasia or orofacial clefting disorders map to conserved domains in a non-random fashion. Conclusion Msx originated from a MetaHox ancestor that also gave rise to Tlx, Demox, NK, and possibly EHGbox, Hox and ParaHox genes. Duplication, divergence or loss of domains played a central role in the functional evolution of Msx. Duplicated domains allow pleiotropically expressed proteins to evolve new functions without disrupting existing interaction networks. Human missense sequence variants reside within evolutionarily conserved domains, likely disrupting protein function. This phylogenomic evaluation of candidate disease markers will inform clinical and functional studies. PMID:19154605
Finnerty, John R; Mazza, Maureen E; Jezewski, Peter A
2009-01-20
Msx originated early in animal evolution and is implicated in human genetic disorders. To reconstruct the functional evolution of Msx and inform the study of human mutations, we analyzed the phylogeny and synteny of 46 metazoan Msx proteins and tracked the duplication, diversification and loss of conserved motifs. Vertebrate Msx sequences sort into distinct Msx1, Msx2 and Msx3 clades. The sister-group relationship between MSX1 and MSX2 reflects their derivation from the 4p/5q chromosomal paralogon, a derivative of the original "MetaHox" cluster. We demonstrate physical linkage between Msx and other MetaHox genes (Hmx, NK1, Emx) in a cnidarian. Seven conserved domains, including two Groucho repression domains (N- and C-terminal), were present in the ancestral Msx. In cnidarians, the Groucho domains are highly similar. In vertebrate Msx1, the N-terminal Groucho domain is conserved, while the C-terminal domain diverged substantially, implying a novel function. In vertebrate Msx2 and Msx3, the C-terminal domain was lost. MSX1 mutations associated with ectodermal dysplasia or orofacial clefting disorders map to conserved domains in a non-random fashion. Msx originated from a MetaHox ancestor that also gave rise to Tlx, Demox, NK, and possibly EHGbox, Hox and ParaHox genes. Duplication, divergence or loss of domains played a central role in the functional evolution of Msx. Duplicated domains allow pleiotropically expressed proteins to evolve new functions without disrupting existing interaction networks. Human missense sequence variants reside within evolutionarily conserved domains, likely disrupting protein function. This phylogenomic evaluation of candidate disease markers will inform clinical and functional studies.
A Phylogenomic Solution to the Origin of Insects by Resolving Crustacean-Hexapod Relationships.
Schwentner, Martin; Combosch, David J; Pakes Nelson, Joey; Giribet, Gonzalo
2017-06-19
Insects, the most diverse group of organisms, are nested within crustaceans, arguably the most abundant group of marine animals. However, to date, no consensus has been reached as to which crustacean taxon is the closest relative of hexapods. A majority of studies have proposed that Branchiopoda (e.g., fairy shrimps) is the sister group of Hexapoda [1-7]. However, these investigations largely excluded two equally important taxa, Remipedia and Cephalocarida. Other studies suggested Remipedia [8-11] or Remipedia + Cephalocarida [12, 13] as potential sister groups of hexapods, but they either did not include Cephalocarida or used only Sanger sequence data and morphology [9, 12]. Here we present the first phylogenomic study specifically addressing the origins of hexapods, including transcriptomes for two species each of Cephalocarida and Remipedia. Phylogenetic analyses of selected matrices, ranging from 81 to 1,675 orthogroups and up to 510,982 amino acid positions, clearly reject a sister-group relationship between Hexapoda and Branchiopoda [1-7]. Nonetheless, support for a hexapod sister-group relationship to Remipedia or to Cephalocarida-Remipedia was highly dependent on the employed analytical methodology. Further analyses assessing the effects of gene evolutionary rate and targeted taxon exclusion support Remipedia as the sole sister taxon of Hexapoda and suggest that the prior grouping of Remipedia + Cephalocarida is an artifact, possibly due to long branch attraction and compositional heterogeneity. We further conclude that terrestrialization of Hexapoda probably occurred in the late Cambrian to early Ordovician, an estimate that is independent of their proposed sister group [4, 8, 12, 14]. Copyright © 2017 Elsevier Ltd. All rights reserved.
Evolution of microbes and viruses: a paradigm shift in evolutionary biology?
Koonin, Eugene V.; Wolf, Yuri I.
2012-01-01
When Charles Darwin formulated the central principles of evolutionary biology in the Origin of Species in 1859 and the architects of the Modern Synthesis integrated these principles with population genetics almost a century later, the principal if not the sole objects of evolutionary biology were multicellular eukaryotes, primarily animals and plants. Before the advent of efficient gene sequencing, all attempts to extend evolutionary studies to bacteria had been futile. Sequencing of the rRNA genes in thousands of microbes allowed the construction of the three-domain “ribosomal Tree of Life” that was widely thought to have resolved the evolutionary relationships between the cellular life forms. However, subsequent massive sequencing of numerous, complete microbial genomes revealed novel evolutionary phenomena, the most fundamental of these being: (1) pervasive horizontal gene transfer (HGT), in large part mediated by viruses and plasmids, that shapes the genomes of archaea and bacteria and calls for a radical revision (if not abandonment) of the Tree of Life concept, (2) Lamarckian-type inheritance that appears to be critical for antivirus defense and other forms of adaptation in prokaryotes, and (3) evolution of evolvability, i.e., dedicated mechanisms for evolution such as vehicles for HGT and stress-induced mutagenesis systems. In the non-cellular part of the microbial world, phylogenomics and metagenomics of viruses and related selfish genetic elements revealed enormous genetic and molecular diversity and extremely high abundance of viruses that come across as the dominant biological entities on earth. Furthermore, the perennial arms race between viruses and their hosts is one of the defining factors of evolution. Thus, microbial phylogenomics adds new dimensions to the fundamental picture of evolution even as the principle of descent with modification discovered by Darwin and the laws of population genetics remain at the core of evolutionary biology.
PMID:22993722
Reconstructing the Backbone of the Saccharomycotina Yeast Phylogeny Using Genome-Scale Data
Shen, Xing-Xing; Zhou, Xiaofan; Kominek, Jacek; Kurtzman, Cletus P.; Hittinger, Chris Todd; Rokas, Antonis
2016-01-01
Understanding the phylogenetic relationships among the yeasts of the subphylum Saccharomycotina is a prerequisite for understanding the evolution of their metabolisms and ecological lifestyles. In the last two decades, the use of rDNA and multilocus data sets has greatly advanced our understanding of the yeast phylogeny, but many deep relationships remain unsupported. In contrast, phylogenomic analyses have involved relatively few taxa and lineages that were often selected with limited considerations for covering the breadth of yeast biodiversity. Here we used genome sequence data from 86 publicly available yeast genomes representing nine of the 11 known major lineages and 10 nonyeast fungal outgroups to generate a 1233-gene, 96-taxon data matrix. Species phylogenies reconstructed using two different methods (concatenation and coalescence) and two data matrices (amino acids or the first two codon positions) yielded identical and highly supported relationships between the nine major lineages. Aside from the lineage comprised by the family Pichiaceae, all other lineages were monophyletic. Most interrelationships among yeast species were robust across the two methods and data matrices. However, eight of the 93 internodes conflicted between analyses or data sets, including the placements of: the clade defined by species that have reassigned the CUG codon to encode serine, instead of leucine; the clade defined by a whole genome duplication; and the species Ascoidea rubescens. These phylogenomic analyses provide a robust roadmap for future comparative work across the yeast subphylum in the disciplines of taxonomy, molecular genetics, evolutionary biology, ecology, and biotechnology. To further this end, we have also provided a BLAST server to query the 86 Saccharomycotina genomes, which can be found at http://y1000plus.org/blast. PMID:27672114
De Maayer, Pieter; Aliyu, Habibu; Vikram, Surendra; Blom, Jochen; Duffy, Brion; Cowan, Don A; Smits, Theo H M; Venter, Stephanus N; Coutinho, Teresa A
2017-01-01
Pantoea ananatis is ubiquitously found in the environment and causes disease on a wide range of plant hosts. By contrast, its sister species, Pantoea stewartii subsp. stewartii, is the host-specific causative agent of the devastating maize disease Stewart's wilt. This pathogen has a restricted lifecycle, overwintering in an insect vector before being introduced into susceptible maize cultivars, causing disease and returning to overwinter in its vector. The other subspecies, P. stewartii subsp. indologenes, has been isolated from different plant hosts and is predicted to proliferate in different environmental niches. Here we have, by the use of comparative genomics and a comprehensive suite of bioinformatic tools, analyzed the genomes of ten P. stewartii and nineteen P. ananatis strains. Our phylogenomic analyses have revealed that there are two distinct clades within P. ananatis, while far less phylogenetic diversity was observed among the P. stewartii subspecies. Pan-genome analyses revealed that a large core genome comprising 3,571 protein coding sequences is shared among the twenty-nine compared strains. Furthermore, we showed that an extensive accessory genome made up largely of a mobilome of plasmids, integrated prophages, integrative and conjugative elements and insertion elements has resulted in extensive diversification of P. stewartii and P. ananatis. While these organisms share many pathogenicity determinants, our comparative genomic analyses show that they differ in terms of the secretion systems they encode. The genomic differences identified in this study have allowed us to postulate on the divergent evolutionary histories of the analyzed P. ananatis and P. stewartii strains and on the molecular basis underlying their ecological success and host range.
Reconstructing the backbone of the Saccharomycotina yeast phylogeny using genome-scale data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shen, Xing-Xing; Zhou, Xiaofan; Kominek, Jacek
Understanding the phylogenetic relationships among the yeasts of the subphylum Saccharomycotina is a prerequisite for understanding the evolution of their metabolisms and ecological lifestyles. In the last two decades, the use of rDNA and multilocus data sets has greatly advanced our understanding of the yeast phylogeny, but many deep relationships remain unsupported. In contrast, phylogenomic analyses have involved relatively few taxa and lineages that were often selected with limited considerations for covering the breadth of yeast biodiversity. Here we used genome sequence data from 86 publicly available yeast genomes representing nine of the 11 known major lineages and 10 nonyeast fungal outgroups to generate a 1233-gene, 96-taxon data matrix. Species phylogenies reconstructed using two different methods (concatenation and coalescence) and two data matrices (amino acids or the first two codon positions) yielded identical and highly supported relationships between the nine major lineages. Aside from the lineage comprised by the family Pichiaceae, all other lineages were monophyletic. Most interrelationships among yeast species were robust across the two methods and data matrices. However, eight of the 93 internodes conflicted between analyses or data sets, including the placements of: the clade defined by species that have reassigned the CUG codon to encode serine, instead of leucine; the clade defined by a whole genome duplication; and the species Ascoidea rubescens. These phylogenomic analyses provide a robust roadmap for future comparative work across the yeast subphylum in the disciplines of taxonomy, molecular genetics, evolutionary biology, ecology, and biotechnology. To further this end, we have also provided a BLAST server to query the 86 Saccharomycotina genomes, which can be found at http://y1000plus.org/blast.
Tse, Herman; Chen, Jonathan H.K.; Tang, Ying; Lau, Susanna K.P.; Woo, Patrick C.Y.
2014-01-01
Streptococcus sinensis is a recently discovered human pathogen isolated from blood cultures of patients with infective endocarditis. Its phylogenetic position, as well as those of its closely related species, remained inconclusive when single genes were used for phylogenetic analysis. For example, S. sinensis branched out from members of the anginosus, mitis, and sanguinis groups in the 16S ribosomal RNA gene phylogenetic tree, but it was clustered with members of the anginosus and sanguinis groups when groEL gene sequences were used for analysis. In this study, we sequenced the draft genome of S. sinensis and used a polyphasic approach, including concatenated genes, whole genomes, and matrix-assisted laser desorption ionization-time of flight mass spectrometry to analyze the phylogeny of S. sinensis. The size of the S. sinensis draft genome is 2.06 Mb, with GC content of 42.2%. Phylogenetic analysis using 50 concatenated genes or whole genomes revealed that S. sinensis formed a distinct cluster with Streptococcus oligofermentans and Streptococcus cristatus, and these three streptococci were clustered with the “sanguinis group.” As for phylogenetic analysis using hierarchical cluster analysis of the mass spectra of streptococci, S. sinensis also formed a distinct cluster with S. oligofermentans and S. cristatus, but these three streptococci were clustered with the “mitis group.” On the basis of the findings, we propose a novel group, named “sinensis group,” to include S. sinensis, S. oligofermentans, and S. cristatus, in the Streptococcus genus. Our study also illustrates the power of phylogenomic analyses for resolving ambiguities in bacterial taxonomy. PMID:25331233
Teng, Jade L L; Huang, Yi; Tse, Herman; Chen, Jonathan H K; Tang, Ying; Lau, Susanna K P; Woo, Patrick C Y
2014-10-20
Streptococcus sinensis is a recently discovered human pathogen isolated from blood cultures of patients with infective endocarditis. Its phylogenetic position, as well as those of its closely related species, remained inconclusive when single genes were used for phylogenetic analysis. For example, S. sinensis branched out from members of the anginosus, mitis, and sanguinis groups in the 16S ribosomal RNA gene phylogenetic tree, but it was clustered with members of the anginosus and sanguinis groups when groEL gene sequences were used for analysis. In this study, we sequenced the draft genome of S. sinensis and used a polyphasic approach, including concatenated genes, whole genomes, and matrix-assisted laser desorption ionization-time of flight mass spectrometry to analyze the phylogeny of S. sinensis. The size of the S. sinensis draft genome is 2.06 Mb, with GC content of 42.2%. Phylogenetic analysis using 50 concatenated genes or whole genomes revealed that S. sinensis formed a distinct cluster with Streptococcus oligofermentans and Streptococcus cristatus, and these three streptococci were clustered with the "sanguinis group." As for phylogenetic analysis using hierarchical cluster analysis of the mass spectra of streptococci, S. sinensis also formed a distinct cluster with S. oligofermentans and S. cristatus, but these three streptococci were clustered with the "mitis group." On the basis of the findings, we propose a novel group, named "sinensis group," to include S. sinensis, S. oligofermentans, and S. cristatus, in the Streptococcus genus. Our study also illustrates the power of phylogenomic analyses for resolving ambiguities in bacterial taxonomy. © The Author(s) 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Iandolino, Alberto; Nobuta, Kan; da Silva, Francisco Goes; Cook, Douglas R; Meyers, Blake C
2008-05-12
Vitis vinifera (V. vinifera) is the primary grape species cultivated for wine production, with an industry valued annually in the billions of dollars worldwide. In order to sustain and increase grape production, it is necessary to understand the genetic makeup of grape species. Here we performed mRNA profiling using Massively Parallel Signature Sequencing (MPSS) and combined it with available Expressed Sequence Tag (EST) data. These tag-based technologies, which do not require a priori knowledge of genomic sequence, are well-suited for transcriptional profiling. The sequence depth of MPSS allowed us to capture and quantify almost all the transcripts at a specific stage in the development of the grape berry. The number and relative abundance of transcripts from stage II grape berries was defined using Massively Parallel Signature Sequencing (MPSS). A total of 2,635,293 17-base and 2,259,286 20-base signatures were obtained, representing at least 30,737 and 26,878 distinct sequences. The average normalized abundance per signature was approximately 49 TPM (Transcripts Per Million). Comparisons of the MPSS signatures with available Vitis species' ESTs and a unigene set demonstrated that 6,430 distinct contigs and 2,190 singletons have a perfect match to at least one MPSS signature. Among the matched sequences, ESTs were identified from tissues other than berries or from berries at different developmental stages. Additional MPSS signatures not matching to known grape ESTs can extend our knowledge of the V. vinifera transcriptome, particularly when these data are used to assist in annotation of whole genome sequences from Vitis vinifera. The MPSS data presented here not only achieved a higher level of saturation than previous EST based analyses, but in doing so, expand the known set of transcripts of grape berries during the unique stage in development that immediately precedes the onset of ripening. 
The MPSS dataset also revealed evidence of antisense expression not previously reported in grapes but comparable to that reported in other plant species. Finally, we developed a novel web-based, public resource for utilization of the grape MPSS data [1].
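The TPM normalization mentioned in the abstract above can be illustrated with a short sketch. It assumes the usual definition (a signature's raw count scaled to a library of one million signatures); the function name `to_tpm` and the toy signature labels are hypothetical and are not taken from the study.

```python
def to_tpm(raw_counts):
    """Scale raw signature counts to transcripts per million (TPM).

    raw_counts maps a signature (hypothetical labels here) to its raw count.
    """
    total = sum(raw_counts.values())
    if total == 0:
        raise ValueError("library contains no signatures")
    # Each count is expressed relative to a library of 1,000,000 signatures.
    return {sig: count * 1_000_000 / total for sig, count in raw_counts.items()}


# Toy library of 1,000 signatures (illustrative numbers only).
library = {"sigA": 50, "sigB": 150, "sigC": 800}
tpm = to_tpm(library)
print(tpm["sigA"])
```

Because every library is rescaled to the same total, abundances from runs of different depth become directly comparable, which is presumably why the abstract reports abundance in TPM rather than raw counts.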
[Essentials of pharmacophylogeny: knowledge pedigree, epistemology and paradigm shift].
Hao, Da-cheng; Xiao, Pei-gen; Liu, Li-wei; Peng, Yong; He, Chun-nian
2015-09-01
Chinese materia medica resource (CMM resource) is the foundation of the development of traditional Chinese medicine. In the study of the sustainable utilization of CMM resources, adopting innovative theories and methods to find new CMM resources is a persistent research hotspot. Pharmacophylogeny interrogates the phylogenetic relationship of medicinal organisms (especially medicinal plants), as well as the intrinsic correlation of morphological taxonomy, molecular phylogeny, chemical constituents, and therapeutic efficacy (ethnopharmacology and pharmacological activity). This new discipline may have the power to change the way we utilize medicinal plant resources and develop plant-based drugs. Phylogenomics is the crossing of evolutionary biology and genomics, in which genome data are utilized for evolutionary reconstructions. Phylogenomics can be integrated into the flow chart of drug discovery and development, and extends the field of pharmacophylogeny at the omic level; thus the concept of pharmacophylogenomics could be redefined in the context of plant pharmaceutical resources. This contribution gives a brief discourse on the knowledge pedigree of pharmacophylogeny, its epistemology and paradigm shift, highlighting the theoretical and practical values of pharmacophylogenomics. Many medicinally important tribes and genera, such as Clematis, Pulsatilla, Anemone, Cimicifugeae, Nigella, Delphinieae, Adonideae, Aquilegia, Thalictrum, and Coptis, belong to the Ranunculaceae family. Compared to other plant families, Ranunculaceae has the most species that are recorded in China Pharmacopoeia (CP) 2010. However, many Ranunculaceae species, e.g., those that are closely related to CP species, as well as those endemic to China, have not been investigated in depth, and their phylogenetic relationship and potential in medicinal use remain elusive.
As such, it is proposed to select Ranunculaceae to exemplify the utility of pharmacophylogenomics and to elaborate the new concept empirically. It is argued that the phylogenetic and evolutionary relationships of medicinally important tribes and genera within Ranunculaceae could be elucidated at the genomic, transcriptomic, and metabolomic levels, from which the intrinsic correlation between medicinal plant genotype and metabolic phenotype, and between genetic diversity and chemodiversity of closely related taxa, could be revealed. This proof-of-concept study regards pharmacophylogenomics as the updated version of pharmacophylogeny and would enrich the intension and spread the extension of pharmacophylogeny. Interdisciplinary knowledge and techniques will be integrated in the proposed study to promote development of the CMM resource discipline and to boost sustainable development of Chinese medicinal plant resources.
Prediction of EST functional relationships via literature mining with user-specified parameters.
Wang, Hei-Chia; Huang, Tian-Hsiang
2009-04-01
The massive amount of expressed sequence tags (ESTs) gathered over recent years has triggered great interest in efficient applications for genomic research. In particular, EST functional relationships can be used to determine a possible gene network for biological processes of interest. In recent years, many researchers have tried to determine EST functional relationships by analyzing the biological literature. However, it has been challenging to find efficient prediction methods. Moreover, an annotated EST is usually associated with many functions, so successful methods must be able to distinguish between relevant and irrelevant functions based on user specifications. This paper proposes a method to discover functional relationships between ESTs of interest by analyzing literature from the Medical Literature Analysis and Retrieval System Online, with user-specified parameters for selecting keywords. This method performs better than the multiple kernel documents method in setting up a specific threshold for gathering materials. The method is also able to uncover known functional relationships, as shown by a comparison with the Kyoto Encyclopedia of Genes and Genomes database. The reliable EST relationships predicted by the proposed method can help to construct gene networks for specific biological functions of interest.
Characterization and Amplification of Gene-Based Simple Sequence Repeat (SSR) Markers in Date Palm.
Zhao, Yongli; Keremane, Manjunath; Prakash, Channapatna S; He, Guohao
2017-01-01
The paucity of molecular markers limits the application of genetic and genomic research in date palm (Phoenix dactylifera L.). Availability of expressed sequence tag (EST) sequences in date palm may provide a good resource for developing gene-based markers. This study characterizes a substantial fraction of transcriptome sequences containing simple sequence repeats (SSRs) from the EST sequences in date palm. The EST sequences studied are mainly homologous to those of Elaeis guineensis and Musa acuminata. A total of 911 gene-based SSR markers, characterized with functional annotations, have provided a useful basis not only for discovering candidate genes and understanding genetic basis of traits of interest but also for developing genetic and genomic tools for molecular research in date palm, such as diversity study, quantitative trait locus (QTL) mapping, and molecular breeding. The procedures of DNA extraction, polymerase chain reaction (PCR) amplification of these gene-based SSR markers, and gel electrophoresis of PCR products are described in this chapter.
9. VIEW FROM MANY PARKS CURVE (ON TRAIL RIDGE ROAD) ...
9. VIEW FROM MANY PARKS CURVE (ON TRAIL RIDGE ROAD) OF HORSESHOE PARK, SHOWING FALL RIVER ROAD FAINTLY AT LEFT AT BASE OF SHEEP MOUNTAIN AND CROSSING ALLUVIAL FAN FROM LAWN LAKE FLOOD. - Fall River Road, Between Estes Park & Fall River Pass, Estes Park, Larimer County, CO
EST Express: PHP/MySQL based automated annotation of ESTs from expression libraries
Smith, Robin P; Buchser, William J; Lemmon, Marcus B; Pardinas, Jose R; Bixby, John L; Lemmon, Vance P
2008-01-01
Background Several biological techniques result in the acquisition of functional sets of cDNAs that must be sequenced and analyzed. The emergence of redundant databases such as UniGene and centralized annotation engines such as Entrez Gene has allowed the development of software that can analyze a great number of sequences in a matter of seconds. Results We have developed "EST Express", a suite of analytical tools that identify and annotate ESTs originating from specific mRNA populations. The software consists of a user-friendly GUI powered by PHP and MySQL that allows for online collaboration between researchers and continuity with UniGene, Entrez Gene and RefSeq. Two key features of the software include a novel, simplified Entrez Gene parser and tools to manage cDNA library sequencing projects. We have tested the software on a large data set (2,016 samples) produced by subtractive hybridization. Conclusion EST Express is an open-source, cross-platform web server application that imports sequences from cDNA libraries, such as those generated through subtractive hybridization or yeast two-hybrid screens. It then provides several layers of annotation based on Entrez Gene and RefSeq to allow the user to highlight useful genes and manage cDNA library projects. PMID:18402700
EST Express: PHP/MySQL based automated annotation of ESTs from expression libraries.
Smith, Robin P; Buchser, William J; Lemmon, Marcus B; Pardinas, Jose R; Bixby, John L; Lemmon, Vance P
2008-04-10
Several biological techniques result in the acquisition of functional sets of cDNAs that must be sequenced and analyzed. The emergence of redundant databases such as UniGene and centralized annotation engines such as Entrez Gene has allowed the development of software that can analyze a great number of sequences in a matter of seconds. We have developed "EST Express", a suite of analytical tools that identify and annotate ESTs originating from specific mRNA populations. The software consists of a user-friendly GUI powered by PHP and MySQL that allows for online collaboration between researchers and continuity with UniGene, Entrez Gene and RefSeq. Two key features of the software include a novel, simplified Entrez Gene parser and tools to manage cDNA library sequencing projects. We have tested the software on a large data set (2,016 samples) produced by subtractive hybridization. EST Express is an open-source, cross-platform web server application that imports sequences from cDNA libraries, such as those generated through subtractive hybridization or yeast two-hybrid screens. It then provides several layers of annotation based on Entrez Gene and RefSeq to allow the user to highlight useful genes and manage cDNA library projects.
Pandin, Caroline; Le Coq, Dominique; Deschamps, Julien; Védie, Régis; Rousseau, Thierry; Aymerich, Stéphane; Briandet, Romain
2018-04-24
Bacillus subtilis QST713 is extensively used as a biological control agent in agricultural fields including in the button mushroom culture, Agaricus bisporus. This last use exploits its inhibitory activity against microbial pathogens such as Trichoderma aggressivum f. europaeum, the main button mushroom green mould competitor. Here, we report the complete genome sequence of this bacterium with a genome size of 4 233 757 bp, 4263 predicted genes and an average GC content of 45.9%. Based on phylogenomic analyses, strain QST713 is finally designated as Bacillus velezensis. Genomic analyses revealed two clusters encoding potential new antimicrobials with NRPS and TransATPKS synthetase. B. velezensis QST713 genome also harbours several genes previously described as being involved in surface colonization and biofilm formation. This strain shows a strong ability to form in vitro spatially organized biofilm and to antagonize T. aggressivum. The availability of this genome sequence could bring new elements to understand the interactions with micro or/and macroorganisms in crops. Copyright © 2018 Elsevier B.V. All rights reserved.
Chen, Ting; Zheng, Lei; Yuan, Jie; An, Zhongfu; Chen, Runfeng; Tao, Ye; Li, Huanhuan; Xie, Xiaoji; Huang, Wei
2015-01-01
Developing organic optoelectronic materials with desired photophysical properties has always been at the forefront of organic electronics. The variation of singlet-triplet splitting (ΔEST) can provide a useful means of modulating organic excitons for diversified photophysical phenomena, but controlling ΔEST in a desired manner within a large tuning scope remains a daunting challenge. Here, we demonstrate a convenient and quantitative approach to relate ΔEST to the frontier orbital overlap and separation distance via a set of newly developed parameters using natural transition orbital analysis to consider whole pictures of electron transitions for both the lowest singlet (S1) and triplet (T1) excited states. These critical parameters revealed that separated S1 and T1 states lead to ultralow ΔEST; separated S1 and overlapped T1 states result in small ΔEST; and overlapped S1 and T1 states induce large ΔEST. Importantly, we realized a widely-tuned ΔEST in a range from ultralow (0.0003 eV) to extra-large (1.47 eV) via a subtle symmetric control of triazine molecules, based on time-dependent density functional theory calculations combined with experimental explorations. These findings provide keen insights into ΔEST control for feasible excited state tuning, offering valuable guidelines for the construction of molecules with desired optoelectronic properties. PMID:26161684
Nagel, Jana; Culley, Lana K.; Lu, Yuping; Liu, Enwu; Matthews, Paul D.; Stevens, Jan F.; Page, Jonathan E.
2008-01-01
The glandular trichomes (lupulin glands) of hop (Humulus lupulus) synthesize essential oils and terpenophenolic resins, including the bioactive prenylflavonoid xanthohumol. To dissect the biosynthetic processes occurring in lupulin glands, we sequenced 10,581 ESTs from four trichome-derived cDNA libraries. ESTs representing enzymes of terpenoid biosynthesis, including all of the steps of the methyl 4-erythritol phosphate pathway, were abundant in the EST data set, as were ESTs for the known type III polyketide synthases of bitter acid and xanthohumol biosynthesis. The xanthohumol biosynthetic pathway involves a key O-methylation step. Four S-adenosyl-l-methionine–dependent O-methyltransferases (OMTs) with similarity to known flavonoid-methylating enzymes were present in the EST data set. OMT1, which was the most highly expressed OMT based on EST abundance and RT-PCR analysis, performs the final reaction in xanthohumol biosynthesis by methylating desmethylxanthohumol to form xanthohumol. OMT2 accepted a broad range of substrates, including desmethylxanthohumol, but did not form xanthohumol. Mass spectrometry and proton nuclear magnetic resonance analysis showed it methylated xanthohumol to 4-O-methylxanthohumol, which is not known from hop. OMT3 was inactive with all substrates tested. The lupulin gland-specific EST data set expands the genomic resources for H. lupulus and provides further insight into the metabolic specialization of glandular trichomes. PMID:18223037
Utility of EST-derived SSR in cultivated peanut (Arachis hypogaea L.) and Arachis wild species
Liang, Xuanqiang; Chen, Xiaoping; Hong, Yanbin; Liu, Haiyan; Zhou, Guiyuan; Li, Shaoxiong; Guo, Baozhu
2009-01-01
Background Lack of sufficient molecular markers hinders current genetic research in peanuts (Arachis hypogaea L.). It is necessary to develop more molecular markers for potential use in peanut genetic research. With the development of peanut EST projects, a vast amount of available EST sequence data has been generated. These data offered an opportunity to identify SSR in ESTs by data mining. Results In this study, we investigated 24,238 ESTs for the identification and development of SSR markers. In total, 881 SSRs were identified from 780 SSR-containing unique ESTs. On an average, one SSR was found per 7.3 kb of EST sequence with tri-nucleotide motifs (63.9%) being the most abundant followed by di- (32.7%), tetra- (1.7%), hexa- (1.0%) and penta-nucleotide (0.7%) repeat types. The top six motifs included AG/TC (27.7%), AAG/TTC (17.4%), AAT/TTA (11.9%), ACC/TGG (7.72%), ACT/TGA (7.26%) and AT/TA (6.3%). Based on the 780 SSR-containing ESTs, a total of 290 primer pairs were successfully designed and used for validation of the amplification and assessment of the polymorphism among 22 genotypes of cultivated peanuts and 16 accessions of wild species. The results showed that 251 primer pairs yielded amplification products, of which 26 and 221 primer pairs exhibited polymorphism among the cultivated and wild species examined, respectively. Two to four alleles were found in cultivated peanuts, while 3–8 alleles presented in wild species. The apparent broad polymorphism was further confirmed by cloning and sequencing of amplified alleles. Sequence analysis of selected amplified alleles revealed that allelic diversity could be attributed mainly to differences in repeat type and length in the microsatellite regions. In addition, a few single base mutations were observed in the microsatellite flanking regions. 
Conclusion This study gives an insight into the frequency, type and distribution of peanut EST-SSRs and demonstrates successful development of EST-SSR markers in cultivated peanut. These EST-SSR markers could enrich the current resource of molecular markers for the peanut community and would be useful for qualitative and quantitative trait mapping, marker-assisted selection, and genetic diversity studies in cultivated peanut as well as related Arachis species. All of the 251 working primer pairs with names, motifs, repeat types, primer sequences, and alleles tested in cultivated and wild species are listed in Additional File 1. PMID:19309524
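The data-mining step described in the abstract above (scanning EST sequences for di- to hexa-nucleotide repeats) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `find_ssrs`, the minimum repeat counts in `MIN_REPEATS`, and the toy sequence are all assumptions, since the abstract does not state the thresholds or the software used.

```python
import re

# Minimum number of tandem copies per motif length.
# These thresholds are illustrative assumptions, not the study's settings.
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for size, min_rep in MIN_REPEATS.items():
        # A motif of `size` bases followed by at least (min_rep - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AAA", "AGAG"),
            # so a dinucleotide run is not also reported as a tetranucleotide SSR.
            if any(motif == motif[:k] * (size // k)
                   for k in range(1, size) if size % k == 0):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // size))
    return sorted(hits)


# Toy EST fragment: eight AG copies and six AAG copies (made-up sequence).
toy_est = "GGC" + "AG" * 8 + "TTCA" + "AAG" * 6 + "CCGT"
for start, motif, copies in find_ssrs(toy_est):
    print(start, motif, copies)
```

A density such as the reported one SSR per 7.3 kb would then follow from dividing the total length of screened sequence by the number of hits; the regular-expression approach only detects perfect repeats, so compound or interrupted SSRs would need extra handling.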
2012-01-01
Background White mold, caused by Sclerotinia sclerotiorum, is one of the most important diseases of pea (Pisum sativum L.), however, little is known about the genetics and biochemistry of this interaction. Identification of genes underlying resistance in the host or pathogenicity and virulence factors in the pathogen will increase our knowledge of the pea-S. sclerotiorum interaction and facilitate the introgression of new resistance genes into commercial pea varieties. Although the S. sclerotiorum genome sequence is available, no pea genome is available, due in part to its large genome size (~3500 Mb) and extensive repeated motifs. Here we present an EST data set specific to the interaction between S. sclerotiorum and pea, and a method to distinguish pathogen and host sequences without a species-specific reference genome. Results 10,158 contigs were obtained by de novo assembly of 128,720 high-quality reads generated by 454 pyrosequencing of the pea-S. sclerotiorum interactome. A method based on the tBLASTx program was modified to distinguish pea and S. sclerotiorum ESTs. To test this strategy, a mixture of known ESTs (18,490 pea and 17,198 S. sclerotiorum ESTs) from public databases were pooled and parsed; the tBLASTx method successfully separated 90.1% of the artificial EST mix with 99.9% accuracy. The tBLASTx method successfully parsed 89.4% of the 454-derived EST contigs, as validated by PCR, into pea (6,299 contigs) and S. sclerotiorum (2,780 contigs) categories. Two thousand eight hundred and forty pea ESTs and 996 S. sclerotiorum ESTs were predicted to be expressed specifically during the pea-S. sclerotiorum interaction as determined by homology search against 81,449 pea ESTs (from flowers, leaves, cotyledons, epi- and hypocotyl, and etiolated and light treated etiolated seedlings) and 57,751 S. sclerotiorum ESTs (from mycelia at neutral pH, developing apothecia and developing sclerotia). 
Among those ESTs specifically expressed, 277 (9.8%) pea ESTs were predicted to be involved in plant defense and response to biotic or abiotic stress, and 93 (9.3%) S. sclerotiorum ESTs were predicted to be involved in pathogenicity/virulence. Additionally, 142 S. sclerotiorum ESTs were identified as secretory/signal peptides of which only 21 were previously reported. Conclusions We present and characterize an EST resource specific to the pea-S. sclerotiorum interaction. Additionally, the tBLASTx method used to parse S. sclerotiorum and pea ESTs was demonstrated to be a reliable and accurate method to distinguish ESTs without a reference genome. PMID:23181755
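The abstract above does not spell out how the modified tBLASTx method decides between host and pathogen, so the following is only a minimal sketch of one plausible decision rule: compare a contig's best bit score against a host database with its best bit score against a pathogen database, and call it only when one clearly dominates. The function name classify_contig, the min_ratio threshold, and the use of bit scores are all assumptions made for illustration, not the authors' procedure.

```python
from typing import Optional


def classify_contig(host_bits: Optional[float],
                    pathogen_bits: Optional[float],
                    min_ratio: float = 1.5) -> str:
    """Label a contig 'host', 'pathogen', or 'ambiguous'.

    host_bits / pathogen_bits are the best bit scores of the contig
    against each database (None when there was no hit). min_ratio is a
    hypothetical margin: one score must exceed the other by this factor.
    """
    h = host_bits or 0.0
    p = pathogen_bits or 0.0
    if h == 0.0 and p == 0.0:
        return "ambiguous"          # no hit in either database
    if h > 0 and h >= p * min_ratio:
        return "host"
    if p > 0 and p >= h * min_ratio:
        return "pathogen"
    return "ambiguous"              # scores too close to call


if __name__ == "__main__":
    hits = {"contig1": (200.0, 50.0), "contig2": (None, 80.0), "contig3": (100.0, 90.0)}
    for name, (h, p) in hits.items():
        print(name, classify_contig(h, p))
```

Contigs left as "ambiguous" would correspond to the fraction that such a method cannot parse; the threshold would need tuning on a known mixture, as the authors did with public ESTs.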
A dating success story: genomes and fossils converge on placental mammal origins
2012-01-01
The timing of the placental mammal radiation has been a source of contention for decades. The fossil record of mammals extends over 200 million years, but no confirmed placental mammal fossils are known prior to 64 million years ago, which is approximately 1.5 million years after the Cretaceous-Paleogene (K-Pg) mass extinction that saw the end of non-avian dinosaurs. Thus, it came as a great surprise when the first published molecular clock studies suggested that placental mammals originated instead far back in the Cretaceous, in some cases doubling divergence estimates based on fossils. In the last few decades, more than a hundred new genera of Mesozoic mammals have been discovered, and molecular divergence studies have grown from simple clock-like models applied to a few genes to sophisticated analyses of entire genomes. Yet, molecular and fossil-based divergence estimates for placental mammal origins have remained remote, with knock-on effects for macro-scale reconstructions of mammal evolution. A few recent molecular studies have begun to converge with fossil-based estimates, and a new phylogenomic study in particular shows that the palaeontological record was mostly correct; most placental mammal orders diversified after the K-Pg mass extinction. While a small gap still remains for Late Cretaceous supraordinal divergences, this study has significantly improved the congruence between molecular and palaeontological data and heralds a broader integration of these fields of evolutionary science. PMID:22883371
Zhang, Ning; Wen, Jun; Zimmer, Elizabeth A.
2015-01-01
Vitaceae is well-known for having one of the most economically important fruits, i.e., the grape (Vitis vinifera). The deep phylogeny of the grape family was not resolved until a recent phylogenomic analysis of 417 nuclear genes from transcriptome data. However, it has been reported extensively that topologies based on nuclear and organellar genes may be incongruent due to differences in their evolutionary histories. Therefore, it is important to reconstruct a backbone phylogeny of the grape family using plastomes and mitochondrial genes. In this study, next-generation sequencing data sets of 27 species were obtained using genome skimming with total DNAs from silica-gel preserved tissue samples on an Illumina HiSeq 2500 instrument. Plastomes were assembled using the combination of de novo and reference genome (of V. vinifera) methods. Sixteen mitochondrial genes were also obtained via genome skimming using the reference genome of V. vinifera. Extensive phylogenetic analyses were performed using maximum likelihood and Bayesian methods. The topology based on either plastome data or mitochondrial genes is congruent with the one using hundreds of nuclear genes, indicating that the grape family did not exhibit significant reticulation at the deep level. The results showcase the power of genome skimming in capturing extensive phylogenetic data: especially from chloroplast and mitochondrial DNAs. PMID:26656830
Zhang, Ning; Wen, Jun; Zimmer, Elizabeth A
2015-01-01
Vitaceae is well-known for having one of the most economically important fruits, i.e., the grape (Vitis vinifera). The deep phylogeny of the grape family was not resolved until a recent phylogenomic analysis of 417 nuclear genes from transcriptome data. However, it has been reported extensively that topologies based on nuclear and organellar genes may be incongruent due to differences in their evolutionary histories. Therefore, it is important to reconstruct a backbone phylogeny of the grape family using plastomes and mitochondrial genes. In this study, next-generation sequencing data sets of 27 species were obtained using genome skimming with total DNAs from silica-gel preserved tissue samples on an Illumina NextSeq 500 instrument [corrected]. Plastomes were assembled using the combination of de novo and reference genome (of V. vinifera) methods. Sixteen mitochondrial genes were also obtained via genome skimming using the reference genome of V. vinifera. Extensive phylogenetic analyses were performed using maximum likelihood and Bayesian methods. The topology based on either plastome data or mitochondrial genes is congruent with the one using hundreds of nuclear genes, indicating that the grape family did not exhibit significant reticulation at the deep level. The results showcase the power of genome skimming in capturing extensive phylogenetic data, especially from chloroplast and mitochondrial DNAs.
Burress, E D; Alda, F; Duarte, A; Loureiro, M; Armbruster, J W; Chakrabarty, P
2018-01-01
The rapid rise of phenotypic and ecological diversity in independent lake-dwelling groups of cichlids is emblematic of the East African Great Lakes. In this study, we show that similar ecologically based diversification has occurred in pike cichlids (Crenicichla) throughout the Uruguay River drainage of South America. We collected genomic data from nearly 500 ultraconserved element (UCE) loci and >260 000 base pairs across 33 species, to obtain a phylogenetic hypothesis for the major species groups and to evaluate the relationships and genetic structure among five closely related, endemic, co-occurring species (the Uruguay River species flock; URSF). Additionally, we evaluated ecological divergence of the URSF based on body and lower pharyngeal jaw (LPJ) shape and gut contents. Across the genus, we recovered novel relationships among the species groups. We found strong support for the monophyly of the URSF; however, relationships among these species remain problematic, likely because of the rapid and recent evolution of this clade. Clustered co-ancestry analysis recovered most species as well-delimited genetic groups. The URSF species exhibit species-specific body and LPJ shapes associated with specialized trophic roles. Collectively, our results suggest that the URSF consists of incipient species that arose via ecological speciation associated with the exploration of novel trophic roles. Journal of Evolutionary Biology © 2017 European Society For Evolutionary Biology.
English for Science and Technology: Profiles and Perspectives.
ERIC Educational Resources Information Center
Orr, Thomas, Ed.
This report contains a collection of articles that document the evolution of English for Science and Technology (EST) instruction and research--one of the earliest and most active branches of English for Specific Purposes (ESP). Articles included are: (1) "WWW-Based Instruction for EST" (Roy Bowers); (2) "English for Medical Purposes in Mexico: A…
ERIC Educational Resources Information Center
Wilkins, Victoria; Chambliss, Catherine
When training counseling students, it is important to familiarize them with the clinical research literature exploring the efficacy of particular treatments. The bulk of the document is comprised of a review of empirically supported treatments (ESTs). ESTs or evidence-based treatments are grounded in studies recommended by the American…
Li, X Y; Xu, H X; Chen, J W
2014-04-29
Manual cultivar identification diagram is a new strategy for plant cultivar identification based on DNA markers, providing information to efficiently separate cultivars. We tested 25 pairs of apple EST-SSR primers for amplification of PCR products from loquat cultivars. These EST-SSR primers provided clear amplification products from the loquat cultivars, with a relatively high transferability rate of 84% to loquat; 11 pairs of primers amplified polymorphic products. After analysis of 24 red-fleshed loquat accessions, we found that only 7 pairs of primers could clearly separate all of them. A cultivar identification diagram of the 24 cultivars was constructed using polymorphic bands from the DNA fingerprints and EST-SSR primers. Any two of the 24 cultivars could be rapidly separated from each other, according to the polymorphic bands from the cultivars; the corresponding primers were marked in the correct position on the cultivar identification diagram. This red-flesh loquat cultivar identification diagram can separate the 24 red-flesh loquat cultivars, which is of benefit for loquat cultivar identification for germplasm management and breeding programs.
First genetic linkage map of Taraxacum koksaghyz Rodin based on AFLP, SSR, COS and EST-SSR markers.
Arias, Marina; Hernandez, Monica; Remondegui, Naroa; Huvenaars, Koen; van Dijk, Peter; Ritter, Enrique
2016-08-04
Taraxacum koksaghyz Rodin (TKS) has been studied on many occasions as a possible alternative source of good-quality natural rubber and of inulin. Some tire companies are already testing TKS tire prototypes. There are also many investigations on the production of bio-fuels from inulin and on inulin applications for health improvement and in the food industry. Few genomic resources exist for TKS, and in particular no genetic linkage map is available for this species. We have constructed the first TKS genetic linkage map based on AFLP, COS, SSR and EST-SSR markers. The integrated linkage map with eight linkage groups (LG), representing the eight chromosomes of Russian dandelion, has 185 individual AFLP markers from parent 1, 188 individual AFLP markers from parent 2, 75 common AFLP markers and 6 COS, 1 SSR and 63 EST-SSR loci. BLAST searches of the EST-SSR sequences against known sequences from lettuce allowed a partial alignment of our TKS map with a lettuce map. BLAST searches against plant gene databases revealed some homologies with genes that may be useful for future downstream applications.
Development of a controller for a river hydrokinetic turbine and optimization
NASA Astrophysics Data System (ADS)
Tetrault, Philippe
Following the development of renewable energy, this study aims to provide a theoretical basis for the fundamental principles needed for the proper operation and implementation of a river hydrokinetic turbine. The problem behind this new type of device is presented first. The electrical machine used in the application, a permanent-magnet synchronous machine, is studied: its mechanical and electrical dynamic equations are developed, introducing at the same time the concept of the rotating reference frame. The operation of the inverter used, a two-level full-bridge semiconductor topology, is explained and expressed in equations to clarify the available modulation strategies. A brief history of these strategies is given before the emphasis is placed on space vector modulation, which is the one used for the present application. The various modules are assembled in a Matlab simulation to confirm that they work properly and to compare the simulation results with the theoretical calculations. Several algorithms for tracking and maintaining an optimal operating point are presented. The behavior of the river is studied to assess the magnitude of the disturbances that the system will have to handle. Finally, a new approach is presented and compared with a more conservative strategy using another Matlab simulation model.
Two EST-derived marker systems for cultivar identification in tree peony.
Zhang, J J; Shu, Q Y; Liu, Z A; Ren, H X; Wang, L S; De Keyser, E
2012-02-01
Tree peony (Paeonia suffruticosa Andrews), a woody deciduous shrub, belongs to the section Moutan DC. in the genus of Paeonia of the Paeoniaceae family. To increase the efficiency of breeding, two EST-derived marker systems were developed based on a tree peony expressed sequence tag (EST) database. Using target region amplification polymorphism (TRAP), 19 of 39 primer pairs showed good amplification for 56 accessions with amplicons ranging from 120 to 3,000 bp long, among which 99.3% were polymorphic. In contrast, 7 of 21 primer pairs demonstrated adequate amplification with clear bands for simple sequence repeats (SSRs) developed from ESTs, and a total of 33 alleles were found in 56 accessions. The similarity matrices generated by TRAP and EST-SSR markers were compared, and the Mantel test (r = 0.57778, P = 0.0020) showed a moderate correlation between the two types of molecular markers. TRAP markers were suitable for DNA fingerprinting and EST-SSR markers were more appropriate for discriminating synonyms (the same cultivars with different names due to limited information exchanged among different geographic areas). The two sets of EST-derived markers will be used further for genetic linkage map construction and quantitative trait locus detection in tree peony.
Generation and Analysis of Expressed Sequence Tags from Olea europaea L.
Ozdemir Ozgenturk, Nehir; Oruç, Fatma; Sezerman, Ugur; Kuçukural, Alper; Vural Korkut, Senay; Toksoz, Feriha; Un, Cemal
2010-01-01
Olive (Olea europaea L.) is an important source of edible oil that originated in the Near East region. In this study, two cDNA libraries were constructed from young olive leaves and immature olive fruits to generate ESTs, discover novel genes, and investigate the function of unknown olive genes. A total of 3840 randomly selected colonies from both libraries were sequenced for EST collection. The 2228 readable sequences from olive leaf and 1506 from olive fruit were assembled into 205 and 69 contigs, respectively, whereas 2478 were singletons. Putative functions of all 2752 differentially expressed unique sequences were assigned by gene homology based on BLAST and annotated using BLAST2GO. While 1339 ESTs showed no homology to database entries, 2024 ESTs had homology (under 80%) with hypothetical proteins, putative proteins, expressed proteins, and unknown proteins in NCBI-GenBank. A total of 635 unique EST sequences were identified by over 80% homology to genes of known function in other species that had not previously been described in the Olea family. Only 3.1% of the total ESTs showed similarity to the olive sequences existing in NCBI. The EST data and consensus sequences generated here were submitted to NCBI as a valuable resource for functional genomic studies of olive. PMID:21197085
Aerodynamic profiles of women with muscle tension dysphonia/aphonia.
Gillespie, Amanda I; Gartner-Schmidt, Jackie; Rubinstein, Elaine N; Abbott, Katherine Verdolini
2013-04-01
In this study, the authors aimed to (a) determine whether phonatory airflows and estimated subglottal pressures (est-Psub) for women with primary muscle tension dysphonia/aphonia (MTD/A) differ from those for healthy speakers; (b) identify different aerodynamic profile patterns within the MTD/A subject group; and (c) determine whether results suggest new understanding of pathogenesis in MTD/A. Retrospective review of aerodynamic data collected from 90 women at the time of primary MTD/A diagnosis. Aerodynamic profiles were significantly different for women with MTD/A as compared with healthy speakers. Five distinct profiles were identified: (a) normal flow, normal est-Psub; (b) high flow, high est-Psub; (c) low flow, normal est-Psub; (d) normal flow, high est-Psub; and (e) high flow, normal est-Psub. This study is the first to identify distinct subgroups of aerodynamic profiles in women with MTD/A and to quantitatively identify a clinical phenomenon sometimes described in association with it ("breath holding"), which is shown by low airflow with normal est-Psub. Results were consistent with clinical claims that diverse respiratory and laryngeal functions may underlie phonatory patterns associated with MTD/A. One potential mechanism, based in psychobiological theory, is introduced to explain some of the variability in aerodynamic profiles of women with MTD/A.
Phenetic Comparison of Prokaryotic Genomes Using k-mers
Déraspe, Maxime; Raymond, Frédéric; Boisvert, Sébastien; Culley, Alexander; Roy, Paul H.; Laviolette, François; Corbeil, Jacques
2017-01-01
Abstract Bacterial genomics studies are getting more extensive and complex, requiring new ways to envision analyses. Using the Ray Surveyor software, we demonstrate that comparison of genomes based on their k-mer content allows reconstruction of phenetic trees without the need of prior data curation, such as core genome alignment of a species. We validated the methodology using simulated genomes and previously published phylogenomic studies of Streptococcus pneumoniae and Pseudomonas aeruginosa. We also investigated the relationship of specific genetic determinants with bacterial population structures. By comparing clusters from the complete genomic content of a genome population with clusters from specific functional categories of genes, we can determine how the population structures are correlated. Indeed, the strain clustering based on a subset of k-mers allows determination of its similarity with the whole genome clusters. We also applied this methodology on 42 species of bacteria to determine the correlational significance of five important bacterial genomic characteristics. For example, intrinsic resistance is more important in P. aeruginosa than in S. pneumoniae, and the former has increased correlation of its population structure with antibiotic resistance genes. The global view of the pangenome of bacteria also demonstrated the taxa-dependent interaction of population structure with antibiotic resistance, bacteriophage, plasmid, and mobile element k-mer data sets. PMID:28957508
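To make the idea of comparing genomes by k-mer content concrete, here is a small sketch that computes a Jaccard distance between the k-mer sets of two sequences. It is not the Ray Surveyor algorithm, whose internals are not described in the abstract; the function names and the choice of Jaccard distance are assumptions for illustration only. A matrix of such pairwise distances could then be passed to any standard clustering or tree-building routine to obtain a phenetic tree.

```python
def kmers(seq: str, k: int) -> set:
    """Return the set of all overlapping k-mers in seq (case-insensitive)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard_distance(a: str, b: str, k: int = 4) -> float:
    """1 - |shared k-mers| / |all k-mers| for two sequences.

    Returns 0.0 for identical k-mer content and 1.0 when nothing is shared.
    """
    ka, kb = kmers(a, k), kmers(b, k)
    union = ka | kb
    if not union:                      # both sequences shorter than k
        return 0.0
    return 1.0 - len(ka & kb) / len(union)


if __name__ == "__main__":
    genomes = {"g1": "ACGTACGTGGA", "g2": "ACGTACGTGGC", "g3": "TTTTCCCCAAAA"}
    names = sorted(genomes)
    for i, x in enumerate(names):
        for y in names[i + 1:]:
            print(x, y, round(jaccard_distance(genomes[x], genomes[y]), 3))
```

Restricting the k-mer sets to those drawn from a functional category (for example resistance genes) and comparing the resulting distances with the whole-genome distances mirrors, in spirit, the correlation analysis the abstract describes.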
Naushad, Sohail; Adeolu, Mobolaji; Goel, Nisha; Khadka, Bijendra; Al-Dahwi, Aqeel; Gupta, Radhey S.
2015-01-01
The genera Actinobacillus, Haemophilus, and Pasteurella exhibit extensive polyphyletic branching in phylogenetic trees and do not represent coherent clusters of species. In this study, we have utilized molecular signatures identified through comparative genomic analyses in conjunction with genome based and multilocus sequence based phylogenetic analyses to clarify the phylogenetic and taxonomic boundary of these genera. We have identified large clusters of Actinobacillus, Haemophilus, and Pasteurella species which represent the “sensu stricto” members of these genera. We have identified 3, 7, and 6 conserved signature indels (CSIs), which are specifically shared by sensu stricto members of Actinobacillus, Haemophilus, and Pasteurella, respectively. We have also identified two different sets of CSIs that are unique characteristics of the pathogen containing genera Aggregatibacter and Mannheimia, respectively. It is now possible to demarcate the genera Actinobacillus sensu stricto, Haemophilus sensu stricto, and Pasteurella sensu stricto on the basis of discrete molecular signatures. The other members of the genera Actinobacillus, Haemophilus, and Pasteurella that do not fall within the “sensu stricto” clades and do not contain these molecular signatures should be reclassified as other genera. The CSIs identified here also provide useful diagnostic targets for the identification of current and novel members of the indicated genera. PMID:25821780
Descamps, Elodie C T; Monteil, Caroline L; Menguy, Nicolas; Ginet, Nicolas; Pignol, David; Bazylinski, Dennis A; Lefèvre, Christopher T
2017-07-01
A magnetotactic bacterium, designated strain BW-1T, was isolated from a brackish spring in Death Valley National Park (California, USA) and cultivated in axenic culture. The Gram-negative cells of strain BW-1T are relatively large and rod-shaped and possess a single polar flagellum (monotrichous). This strain is the first magnetotactic bacterium isolated in axenic culture capable of producing greigite and/or magnetite nanocrystals aligned in one or more chains per cell. Strain BW-1T is an obligate anaerobe that grows chemoorganoheterotrophically while reducing sulfate as a terminal electron acceptor. Optimal growth occurred at pH 7.0 and 28°C with fumarate as electron donor and carbon source. Based on its genome sequence, the G+C content is 40.72 mol%. Phylogenomic and phylogenetic analyses indicate that strain BW-1T belongs to the Desulfobacteraceae family within the Deltaproteobacteria class. Based on average amino acid identity, strain BW-1T can be considered as a novel species of a new genus, for which the name Desulfamplus magnetovallimortis is proposed. The type strain of D. magnetovallimortis is BW-1T (JCM 18010T-DSM 103535T). Copyright © 2017 Elsevier GmbH. All rights reserved.
Attigala, Lakshmi; Wysocki, William P; Duvall, Melvin R; Clark, Lynn G
2016-08-01
We explored phylogenetic relationships among the twelve lineages of the temperate woody bamboo clade (tribe Arundinarieae) based on plastid genome (plastome) sequence data. A representative sample of 28 taxa was used and maximum parsimony, maximum likelihood and Bayesian inference analyses were conducted to estimate the Arundinarieae phylogeny. All the previously recognized clades of Arundinarieae were supported, with Ampelocalamus calcareus (Clade XI) as sister to the rest of the temperate woody bamboos. Well-supported sister relationships between Bergbambos tessellata (Clade I) and Thamnocalamus spathiflorus (Clade VII) and between Kuruna (Clade XII) and Chimonocalamus (Clade III) were revealed by the current study. The plastome topology was tested by taxon removal experiments and alternative hypothesis testing, and the results supported the current plastome phylogeny as robust. Neighbor-net analyses showed few phylogenetic signal conflicts, but suggested some potentially complex relationships among these taxa. Analyses of morphological character evolution of rhizomes and reproductive structures revealed that pachymorph rhizomes were most likely the ancestral state in Arundinarieae. In contrast, leptomorph rhizomes either evolved once with reversions to the pachymorph condition or multiple times in Arundinarieae. Further, pseudospikelets evolved independently at least twice in the Arundinarieae, but the ancestral state is ambiguous. Copyright © 2016 Elsevier Inc. All rights reserved.
Hazen, Tracy H; Michalski, Jane; Luo, Qingwei; Shetty, Amol C; Daugherty, Sean C; Fleckenstein, James M; Rasko, David A
2017-06-14
Escherichia coli that are capable of causing human disease are often classified into pathogenic variants (pathovars) based on their virulence gene content. However, disease-associated hybrid E. coli, containing unique combinations of multiple canonical virulence factors have also been described. Such was the case of the E. coli O104:H4 outbreak in 2011, which caused significant morbidity and mortality. Among the pathovars of diarrheagenic E. coli that cause significant human disease are the enteropathogenic E. coli (EPEC) and enterotoxigenic E. coli (ETEC). In the current study we use comparative genomics, transcriptomics, and functional studies to characterize isolates that contain virulence factors of both EPEC and ETEC. Based on phylogenomic analysis, these hybrid isolates are more genomically-related to EPEC, but appear to have acquired ETEC virulence genes. Global transcriptional analysis using RNA sequencing, demonstrated that the EPEC and ETEC virulence genes of these hybrid isolates were differentially-expressed under virulence-inducing laboratory conditions, similar to reference isolates. Immunoblot assays further verified that the virulence gene products were produced and that the T3SS effector EspB of EPEC, and heat-labile toxin of ETEC were secreted. These findings document the existence and virulence potential of an E. coli pathovar hybrid that blurs the distinction between E. coli pathovars.
Seligmann, Hervé
2013-05-07
GenBank's EST database includes RNAs matching exactly human mitochondrial sequences assuming systematic asymmetric nucleotide exchange-transcription along exchange rules: A→G→C→U/T→A (12 ESTs), A→U/T→C→G→A (4 ESTs), C→G→U/T→C (3 ESTs), and A→C→G→U/T→A (1 EST), no RNAs correspond to other potential asymmetric exchange rules. Hypothetical polypeptides translated from nucleotide-exchanged human mitochondrial protein coding genes align with numerous GenBank proteins, predicted secondary structures resemble their putative GenBank homologue's. Two independent methods designed to detect overlapping genes (one based on nucleotide contents analyses in relation to replicative deamination gradients at third codon positions, and circular code analyses of codon contents based on frame redundancy), confirm nucleotide-exchange-encrypted overlapping genes. Methods converge on which genes are most probably active, and which not, and this for the various exchange rules. Mean EST lengths produced by different nucleotide exchanges are proportional to (a) extents that various bioinformatics analyses confirm the protein coding status of putative overlapping genes; (b) known kinetic chemistry parameters of the corresponding nucleotide substitutions by the human mitochondrial DNA polymerase gamma (nucleotide DNA misinsertion rates); (c) stop codon densities in predicted overlapping genes (stop codon readthrough and exchanging polymerization regulate gene expression by counterbalancing each other). Numerous rarely expressed proteins seem encoded within regular mitochondrial genes through asymmetric nucleotide exchange, avoiding lengthening genomes. Intersecting evidence between several independent approaches confirms the working hypothesis status of gene encryption by systematic nucleotide exchanges. Copyright © 2013 Elsevier Ltd. All rights reserved.
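The exchange rules quoted in the abstract can be read as cyclic substitution chains: under A→G→C→U/T→A, every A is written as G, every G as C, every C as T, and every T as A. The sketch below applies such a rule to a DNA string so that the transformed sequence could then be compared against ESTs. The rule syntax with ">" separators and the function names are assumptions chosen for this illustration; bases not named in a rule are left unchanged.

```python
def exchange_table(rule: str) -> dict:
    """Build a str.translate table from a chain such as 'A>G>C>T>A'.

    Each base is replaced by the base that follows it in the chain.
    """
    bases = rule.split(">")
    mapping = {src: dst for src, dst in zip(bases, bases[1:])}
    return str.maketrans(mapping)


def apply_exchange(seq: str, rule: str) -> str:
    """Return seq with the systematic nucleotide exchange applied."""
    return seq.upper().translate(exchange_table(rule))


if __name__ == "__main__":
    # The four asymmetric rules listed in the abstract, written with T for U/T.
    for rule in ("A>G>C>T>A", "A>T>C>G>A", "C>G>T>C", "A>C>G>T>A"):
        print(rule, apply_exchange("ACGT", rule))
```

Because each rule is a permutation of the bases it names, applying the inverse chain recovers the original sequence, which is a convenient sanity check when scanning a database with transformed queries.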
NASA Astrophysics Data System (ADS)
Zhan, Aibin; Bao, Zhenmin; Wang, Mingling; Chang, Dan; Yuan, Jian; Wang, Xiaolong; Hu, Xiaoli; Liang, Chengzhu; Hu, Jingjie
2008-05-01
The EST database of the Pacific abalone (Haliotis discus) was mined for developing microsatellite markers. A total of 1476 EST sequences were registered in GenBank when data mining was performed. Fifty sequences (approximately 3.4%) were found to contain one or more microsatellites. Based on the length and GC content of the flanking regions, cluster analysis and BLASTN, 13 microsatellite-containing ESTs were selected for PCR primer design. The results showed that 10 out of 13 primer pairs could amplify scorable PCR products and showed polymorphism. The number of alleles ranged from 2 to 13 and the values of Ho and He varied from 0.1222 to 0.8611 and 0.2449 to 0.9311, respectively. No significant linkage disequilibrium (LD) between any pairs of these loci was found, and 6 of 10 loci conformed to the Hardy-Weinberg equilibrium (HWE). These EST-SSRs are therefore potential tools for studies of intraspecies variation and hybrid identification.
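As a rough illustration of the two computations this abstract relies on, the sketch below scans a sequence for perfect microsatellites with a regular expression and computes observed and expected heterozygosity for one locus. The motif length range (2 to 6 bp), the minimum repeat count, and all function names are assumptions; the authors' actual mining criteria are not given in the abstract. Expected heterozygosity uses the standard form He = 1 - sum(p_i^2) without small-sample correction.

```python
import re
from collections import Counter


def find_ssrs(seq: str, min_repeats: int = 5):
    """Return (motif, repeat_count, start) for perfect SSRs with 2-6 bp motifs."""
    pattern = re.compile(r"([ACGT]{2,6}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for m in pattern.finditer(seq.upper()):
        motif = m.group(1)
        if len(set(motif)) == 1:       # skip homopolymer runs such as AAAA
            continue
        hits.append((motif, len(m.group(0)) // len(motif), m.start()))
    return hits


def observed_heterozygosity(genotypes) -> float:
    """Fraction of individuals carrying two different alleles at the locus."""
    return sum(a != b for a, b in genotypes) / len(genotypes)


def expected_heterozygosity(genotypes) -> float:
    """He = 1 - sum of squared allele frequencies (no sample-size correction)."""
    counts = Counter(allele for g in genotypes for allele in g)
    total = sum(counts.values())
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


if __name__ == "__main__":
    print(find_ssrs("GGTT" + "AC" * 6 + "TTGA"))
    locus = [(1, 2), (1, 1), (2, 3), (2, 2)]
    print(observed_heterozygosity(locus), expected_heterozygosity(locus))
```

A large gap between Ho and He at a locus is the kind of signal that leads to a departure from Hardy-Weinberg equilibrium, which the abstract reports for 4 of the 10 loci.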
Modeling of micrometric and nanometric particle emissions in machining
NASA Astrophysics Data System (ADS)
Khettabi, Riad
Shaping parts by machining emits particles of microscopic and nanometric size that can be hazardous to health. The aim of this work is to study the emission of these particles for the purposes of prevention and reduction at the source. The approach adopted is both experimental and theoretical, at the microscopic and macroscopic scales. The work begins with tests to determine the influence of the material, the tool, and the machining parameters on particle emissions. Next, a new parameter characterizing emissions, called the Dust unit, is developed and a predictive model is proposed. This model is based on a new hybrid theory that integrates energetic, tribological, and plastic-deformation approaches and includes the tool geometry, material properties, cutting conditions, and chip segmentation. It was validated in turning on four materials: A16061-T6, AISI1018, AISI4140, and gray cast iron.
The EST Model for Predicting Progressive Damage and Failure of Open Hole Bending Specimens
NASA Technical Reports Server (NTRS)
Joseph, Ashith P. K.; Waas, Anthony M.; Pineda, Evan J.
2016-01-01
Progressive damage and failure in open hole composite laminate coupons subjected to flexural loading is modeled using Enhanced Schapery Theory (EST). Previous studies have demonstrated that EST can accurately predict the strength of open hole coupons under remote tensile and compressive loading states. This homogenized modeling approach uses single composite shell elements to represent the entire laminate in the thickness direction and significantly reduces computational cost. Therefore, when delaminations are not of concern or are active in the post-peak regime, the version of EST presented here is a good engineering tool for predicting deformation response. Standard coupon-level tests provide all the input data needed for the model, and they are interpreted in conjunction with finite element (FE) based simulations. Open hole bending test results of three different IM7/8552 carbon fiber composite layups agree well with EST predictions. The model is able to accurately capture the curvature change and deformation localization in the specimen at and during the post-catastrophic load drop event.
NASA Astrophysics Data System (ADS)
Hamada, Y.; Yamada, Y.; Sanada, Y.; Nakamura, Y.; Kido, Y. N.; Moe, K.
2017-12-01
Gas hydrate-bearing layers can normally be identified by a bottom simulating reflector (BSR) or by well logging because of their high acoustic and electric impedance compared to the surrounding formation. These characteristics of gas hydrate can also represent a contrast in in-situ formation strength. Here we attempt to describe gas hydrate-bearing layers based on the equivalent strength (EST). The Indian National Gas Hydrate Program (NGHP) Expedition 02 was carried out in 2015 off the eastern margin of the Indian Peninsula to investigate the distribution and occurrence of gas hydrates. From 25 drill sites, downhole logging data, cored samples, and drilling performance data were collected. Recorded drilling performance data were converted to the EST, a derived mechanical strength calculated only from drilling parameters (top drive torque, rotations per minute, rate of penetration, and drill bit diameter). At a representative site, site 23, the EST shows a roughly constant trend of 5 to 10 MPa, with some positive peaks, in the 0 - 270 mbsf interval, and a sudden increase up to 50 MPa above the BSR depth (270 - 290 mbsf). Below the BSR, the EST stays at 5-10 MPa down to the bottom of the hole (378 mbsf). Comparison of the EST with logging data and core sample descriptions suggests that the depth profiles of the EST reflect formation lithology and gas hydrate content: the EST increases in the sand-rich layer and the gas hydrate-bearing zone. Especially in the gas hydrate zone, the EST curve shows approximately the same trend as the P-wave velocity and resistivity measured by downhole logging. A cross plot of the increment of the EST against resistivity revealed that the relation between them is roughly logarithmic, indicating that the increase and decrease of the EST depend strongly on the saturation factor of gas hydrate. These results suggest that the EST, a proxy for in-situ formation strength, can be an indicator of the existence and amount of gas hydrate in a layer. 
Although the EST was calculated after drilling from recorded surface drilling parameters in this study, the EST can be acquired during drilling by using real-time drilling parameters. In addition, the EST requires only drilling performance parameters without any additional tools or measurements, making it a simple and economical tool for the exploration of gas hydrates.
Brandon Schlautman; Vera Pfeiffer; Juan Zalapa; Johanne Brunet
2014-01-01
Numerous microsatellite markers were developed for Aquilegia formosa from sequences deposited within the Expressed Sequence Tag (EST), Genomic Survey Sequence (GSS), and Nucleotide databases in NCBI. Microsatellites (SSRs) were identified and primers were designed for 9 SSR containing sequences in the Nucleotide database, 3803 sequences in the EST...
The European general thoracic surgery database project.
Falcoz, Pierre Emmanuel; Brunelli, Alessandro
2014-05-01
The European Society of Thoracic Surgeons (ESTS) Database is a free registry created by ESTS in 2001. The current online version was launched in 2007. It runs currently on a Dendrite platform with extensive data security and frequent backups. The main features are a specialty-specific, procedure-specific, prospectively maintained, periodically audited and web-based electronic database, designed for quality control and performance monitoring, which allows for the collection of all general thoracic procedures. Data collection is the "backbone" of the ESTS database. It includes many risk factors, processes of care and outcomes, which are specially designed for quality control and performance audit. The user can download and export their own data and use them for internal analyses and quality control audits. The ESTS database represents the gold standard of clinical data collection for European General Thoracic Surgery. Over the past years, the ESTS database has achieved many accomplishments. In particular, the database hit two major milestones: it now includes more than 235 participating centers and 70,000 surgical procedures. The ESTS database is a snapshot of surgical practice that aims at improving patient care. In other words, data capture should become integral to routine patient care, with the final objective of improving quality of care within Europe.
Ferraz Dos Santos, Lucas; Moreira Fregapani, Roberta; Falcão, Loeni Ludke; Togawa, Roberto Coiti; Costa, Marcos Mota do Carmo; Lopes, Uilson Vanderlei; Peres Gramacho, Karina; Alves, Rafael Moyses; Micheli, Fabienne; Marcellino, Lucilia Helena
2016-01-01
The cupuassu tree (Theobroma grandiflorum) (Willd. ex Spreng.) Schum. is a fruitful species from the Amazon with great economical potential, due to the multiple uses of its fruit's pulp and seeds in the food and cosmetic industries, including the production of cupulate, an alternative to chocolate. In order to support the cupuassu breeding program and to select plants presenting both pulp/seed quality and fungal disease resistance, SSRs from Next Generation Sequencing ESTs were obtained and used in diversity analysis. From 8,330 ESTs, 1,517 contained one or more SSRs (1,899 SSRs identified). The most abundant motifs identified in the EST-SSRs were hepta- and trinucleotides, and they were found with a minimum and maximum of 2 and 19 repeats, respectively. From the 1,517 ESTs containing SSRs, 70 ESTs were selected based on their functional annotation, focusing on pulp and seed quality, as well as resistance to pathogens. The 70 ESTs selected contained 77 SSRs, and among which, 11 were polymorphic in cupuassu genotypes. These EST-SSRs were able to discriminate the cupuassu genotype in relation to resistance/susceptibility to witches' broom disease, as well as to pulp quality (SST/ATT values). Finally, we showed that these markers were transferable to cacao genotypes, and that genome availability might be used as a predictive tool for polymorphism detection and primer design useful for both Theobroma species. To our knowledge, this is the first report involving EST-SSRs from cupuassu and is also a pioneer in the analysis of marker transferability from cupuassu to cacao. Moreover, these markers might contribute to develop or saturate the cupuassu and cacao genetic maps, respectively. PMID:26949967
Crowhurst, Ross N; Gleave, Andrew P; MacRae, Elspeth A; Ampomah-Dwamena, Charles; Atkinson, Ross G; Beuning, Lesley L; Bulley, Sean M; Chagne, David; Marsh, Ken B; Matich, Adam J; Montefiori, Mirco; Newcomb, Richard D; Schaffer, Robert J; Usadel, Björn; Allan, Andrew C; Boldingh, Helen L; Bowen, Judith H; Davy, Marcus W; Eckloff, Rheinhart; Ferguson, A Ross; Fraser, Lena G; Gera, Emma; Hellens, Roger P; Janssen, Bart J; Klages, Karin; Lo, Kim R; MacDiarmid, Robin M; Nain, Bhawana; McNeilage, Mark A; Rassam, Maysoon; Richardson, Annette C; Rikkerink, Erik HA; Ross, Gavin S; Schröder, Roswitha; Snowden, Kimberley C; Souleyre, Edwige JF; Templeton, Matt D; Walton, Eric F; Wang, Daisy; Wang, Mindy Y; Wang, Yanming Y; Wood, Marion; Wu, Rongmei; Yauk, Yar-Khing; Laing, William A
2008-01-01
Background Kiwifruit (Actinidia spp.) are a relatively new, but economically important crop grown in many different parts of the world. Commercial success is driven by the development of new cultivars with novel consumer traits including flavor, appearance, healthful components and convenience. To increase our understanding of the genetic diversity and gene-based control of these key traits in Actinidia, we have produced a collection of 132,577 expressed sequence tags (ESTs). Results The ESTs were derived mainly from four Actinidia species (A. chinensis, A. deliciosa, A. arguta and A. eriantha) and fell into 41,858 non redundant clusters (18,070 tentative consensus sequences and 23,788 EST singletons). Analysis of flavor and fragrance-related gene families (acyltransferases and carboxylesterases) and pathways (terpenoid biosynthesis) is presented in comparison with a chemical analysis of the compounds present in Actinidia including esters, acids, alcohols and terpenes. ESTs are identified for most genes in color pathways controlling chlorophyll degradation and carotenoid biosynthesis. In the health area, data are presented on the ESTs involved in ascorbic acid and quinic acid biosynthesis showing not only that genes for many of the steps in these pathways are represented in the database, but that genes encoding some critical steps are absent. In the convenience area, genes related to different stages of fruit softening are identified. Conclusion This large EST resource will allow researchers to undertake the tremendous challenge of understanding the molecular basis of genetic diversity in the Actinidia genus as well as provide an EST resource for comparative fruit genomics. The various bioinformatics analyses we have undertaken demonstrates the extent of coverage of ESTs for genes encoding different biochemical pathways in Actinidia. PMID:18655731
Candidate gene database and transcript map for peach, a model species for fruit trees.
Horn, Renate; Lecouls, Anne-Claire; Callahan, Ann; Dandekar, Abhaya; Garay, Lilibeth; McCord, Per; Howad, Werner; Chan, Helen; Verde, Ignazio; Main, Doreen; Jung, Sook; Georgi, Laura; Forrest, Sam; Mook, Jennifer; Zhebentyayeva, Tatyana; Yu, Yeisoo; Kim, Hye Ran; Jesudurai, Christopher; Sosinski, Bryon; Arús, Pere; Baird, Vance; Parfitt, Dan; Reighard, Gregory; Scorza, Ralph; Tomkins, Jeffrey; Wing, Rod; Abbott, Albert Glenn
2005-05-01
Peach (Prunus persica) is a model species for the Rosaceae, which includes a number of economically important fruit tree species. To develop an extensive Prunus expressed sequence tag (EST) database for identifying and cloning the genes important to fruit and tree development, we generated 9,984 high-quality ESTs from a peach cDNA library of developing fruit mesocarp. After assembly and annotation, a putative peach unigene set consisting of 3,842 ESTs was defined. Gene ontology (GO) classification was assigned based on the annotation of the single "best hit" match against the Swiss-Prot database. No significant homology could be found in the GenBank nr databases for 24.3% of the sequences. Using core markers from the general Prunus genetic map, we anchored bacterial artificial chromosome (BAC) clones on the genetic map, thereby providing a framework for the construction of a physical and transcript map. A transcript map was developed by hybridizing 1,236 ESTs from the putative peach unigene set and an additional 68 peach cDNA clones against the peach BAC library. Hybridizing ESTs to genetically anchored BACs immediately localized 11.2% of the ESTs on the genetic map. ESTs showed a clustering of expressed genes in defined regions of the linkage groups. [The data were built into a regularly updated Genome Database for Rosaceae (GDR), available at (http://www.genome.clemson.edu/gdr/).].
Ramu, P; Kassahun, B; Senthilvel, S; Ashok Kumar, C; Jayashree, B; Folkertsma, R T; Reddy, L Ananda; Kuruvinashetti, M S; Haussmann, B I G; Hash, C T
2009-11-01
The sequencing and detailed comparative functional analysis of genomes of a number of select botanical models open new doors into comparative genomics among the angiosperms, with potential benefits for improvement of many orphan crops that feed large populations. In this study, a set of simple sequence repeat (SSR) markers was developed by mining the expressed sequence tag (EST) database of sorghum. Among the SSR-containing sequences, only those sharing considerable homology with rice genomic sequences across the lengths of the 12 rice chromosomes were selected. Thus, 600 SSR-containing sorghum EST sequences (50 homologous sequences on each of the 12 rice chromosomes) were selected, with the intention of providing coverage for corresponding homologous regions of the sorghum genome. Primer pairs were designed and polymorphism detection ability was assessed using parental pairs of two existing sorghum mapping populations. About 28% of these new markers detected polymorphism in this 4-entry panel. A subset of 55 polymorphic EST-derived SSR markers were mapped onto the existing skeleton map of a recombinant inbred population derived from cross N13 x E 36-1, which is segregating for Striga resistance and the stay-green component of terminal drought tolerance. These new EST-derived SSR markers mapped across all 10 sorghum linkage groups, mostly to regions expected based on prior knowledge of rice-sorghum synteny. The ESTs from which these markers were derived were then mapped in silico onto the aligned sorghum genome sequence, and 88% of the best hits corresponded to linkage-based positions. This study demonstrates the utility of comparative genomic information in targeted development of markers to fill gaps in linkage maps of related crop species for which sufficient genomic tools are not available.
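Mining SSRs from EST sequences, as described in the abstract above, is commonly done by scanning each sequence for short motifs repeated in tandem. The following is a minimal sketch using a regular expression; the function name, the thresholds (`min_repeats`, `max_motif`), and the example sequence are assumptions for illustration and do not reproduce the pipeline used in the study.

```python
import re


def find_ssrs(seq, min_repeats=3, max_motif=6):
    """Return (start, motif, repeat_count) for tandem repeats in a DNA string.

    The motif is matched lazily so the shortest repeating unit is reported.
    Real pipelines usually use motif-length-specific repeat thresholds.
    """
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        repeats = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, repeats))
    return hits


# Hypothetical EST fragment containing an (AC)5 and a (GA)3 repeat.
example = "GGCT" + "AC" * 5 + "TT" + "GA" * 3 + "GGT"
print(find_ssrs(example))
```

Sequences with at least one hit would then be passed to primer design, with the flanking regions on both sides of the repeat used as primer targets.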
Evolutionary Patterns and Processes: Lessons from Ancient DNA.
Leonardi, Michela; Librado, Pablo; Der Sarkissian, Clio; Schubert, Mikkel; Alfarhan, Ahmed H; Alquraishi, Saleh A; Al-Rasheid, Khaled A S; Gamba, Cristina; Willerslev, Eske; Orlando, Ludovic
2017-01-01
Ever since its emergence in 1984, the field of ancient DNA has struggled to overcome the challenges related to the decay of DNA molecules in the fossil record. With the recent development of high-throughput DNA sequencing technologies and molecular techniques tailored to ultra-damaged templates, it has now come of age, merging together approaches in phylogenomics, population genomics, epigenomics, and metagenomics. Leveraging on complete temporal sample series, ancient DNA provides direct access to the most important dimension in evolution, time, allowing a wealth of fundamental evolutionary processes to be addressed at unprecedented resolution. This review taps into the most recent findings in ancient DNA research to present analyses of ancient genomic and metagenomic data. PMID:28173586
2009-01-01
Background Chickpea (Cicer arietinum L.), an important grain legume crop of the world is seriously challenged by terminal drought and salinity stresses. However, very limited number of molecular markers and candidate genes are available for undertaking molecular breeding in chickpea to tackle these stresses. This study reports generation and analysis of comprehensive resource of drought- and salinity-responsive expressed sequence tags (ESTs) and gene-based markers. Results A total of 20,162 (18,435 high quality) drought- and salinity- responsive ESTs were generated from ten different root tissue cDNA libraries of chickpea. Sequence editing, clustering and assembly analysis resulted in 6,404 unigenes (1,590 contigs and 4,814 singletons). Functional annotation of unigenes based on BLASTX analysis showed that 46.3% (2,965) had significant similarity (≤1E-05) to sequences in the non-redundant UniProt database. BLASTN analysis of unique sequences with ESTs of four legume species (Medicago, Lotus, soybean and groundnut) and three model plant species (rice, Arabidopsis and poplar) provided insights on conserved genes across legumes as well as novel transcripts for chickpea. Of 2,965 (46.3%) significant unigenes, only 2,071 (32.3%) unigenes could be functionally categorised according to Gene Ontology (GO) descriptions. A total of 2,029 sequences containing 3,728 simple sequence repeats (SSRs) were identified and 177 new EST-SSR markers were developed. Experimental validation of a set of 77 SSR markers on 24 genotypes revealed 230 alleles with an average of 4.6 alleles per marker and average polymorphism information content (PIC) value of 0.43. Besides SSR markers, 21,405 high confidence single nucleotide polymorphisms (SNPs) in 742 contigs (with ≥ 5 ESTs) were also identified. Recognition sites for restriction enzymes were identified for 7,884 SNPs in 240 contigs. 
Hierarchical clustering of 105 selected contigs provided clues about stress- responsive candidate genes and their expression profile showed predominance in specific stress-challenged libraries. Conclusion Generated set of chickpea ESTs serves as a resource of high quality transcripts for gene discovery and development of functional markers associated with abiotic stress tolerance that will be helpful to facilitate chickpea breeding. Mapping of gene-based markers in chickpea will also add more anchoring points to align genomes of chickpea and other legume species. PMID:19912666
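The polymorphism information content (PIC) reported above for the SSR markers is usually computed from allele frequencies with the formula of Botstein et al. (1980), PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j². The sketch below implements that formula; whether the study used exactly this definition is an assumption, and the allele frequencies shown are hypothetical.

```python
from itertools import combinations


def pic(allele_freqs):
    """Polymorphism information content for one marker.

    allele_freqs: frequencies of the alleles at the marker, summing to 1.
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in allele_freqs)
    cross_term = sum(
        2.0 * (p * p) * (q * q) for p, q in combinations(allele_freqs, 2)
    )
    return 1.0 - homozygosity - cross_term


# Hypothetical marker with two equally frequent alleles.
print(pic([0.5, 0.5]))
```

A marker with a single allele has a PIC of 0, and the value rises as more alleles occur at more even frequencies, which is why the average PIC is reported alongside the average number of alleles per marker.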
The Use of EST Expression Matrixes for the Quality Control of Gene Expression Data
Milnthorpe, Andrew T.; Soloviev, Mikhail
2012-01-01
EST expression profiling provides an attractive tool for studying differential gene expression, but the origins of cDNA libraries and the quality of EST data are not always known or reported. Libraries may originate from pooled or mixed tissues; EST clustering, EST counts, library annotations and analysis algorithms may contain errors. Traditional data analysis methods, including research into tissue-specific gene expression, assume EST counts to be correct and libraries to be correctly annotated, which is not always the case. Therefore, a method capable of assessing the quality of expression data based on that data alone would be invaluable for assessing the quality of EST data and determining their suitability for mRNA expression analysis. Here we report an approach to the selection of a small generic subset of 244 UniGene clusters suitable for identification of the tissue of origin for EST libraries and quality control of the expression data using EST expression information alone. We created a small expression matrix of UniGene IDs using two rounds of selection followed by two rounds of optimisation. Our selection procedures differ from traditional approaches to finding “tissue-specific” genes, and our matrix yields consistently high positive correlation values for libraries with confirmed tissues of origin and can be applied for tissue typing and quality control of libraries as small as just a few hundred total ESTs. Furthermore, we can pick up tissue correlations between related tissues, e.g. brain and peripheral nervous tissue, or heart and muscle tissues, and identify tissue origins for a few libraries of uncharacterised tissue identity. It was possible to confirm tissue identity for some libraries which have been derived from cancer tissues or have been normalised.
Tissue matching is affected strongly by cancer progression or library normalisation and our approach may potentially be applied for elucidating the stage of normalisation in normalised libraries or for cancer staging. PMID:22412959
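The tissue-typing idea described above, correlating a library's EST counts over a fixed set of marker clusters with reference tissue profiles, can be sketched as follows. This is an illustrative assumption of how such a comparison might look, using Pearson correlation; the function names, the four-cluster profiles, and the counts are hypothetical and far smaller than the 244-cluster matrix reported in the abstract.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant vector carries no correlation signal
    return cov / (sd_x * sd_y)


def best_tissue(library_counts, reference_profiles):
    """Return the best-matching tissue and all correlation scores."""
    scores = {
        tissue: pearson(library_counts, profile)
        for tissue, profile in reference_profiles.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical EST counts over four marker clusters.
reference = {"brain": [9, 1, 0, 2], "liver": [0, 8, 7, 1]}
tissue, scores = best_tissue([5, 0, 1, 1], reference)
print(tissue, scores)
```

A library whose best score is low or whose top scores are close together would be a candidate for a mixed, mislabelled, or normalised library.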
First genetic linkage map of Taraxacum koksaghyz Rodin based on AFLP, SSR, COS and EST-SSR markers
Arias, Marina; Hernandez, Monica; Remondegui, Naroa; Huvenaars, Koen; van Dijk, Peter; Ritter, Enrique
2016-01-01
Taraxacum koksaghyz Rodin (TKS) has been studied on many occasions as a possible alternative source of good-quality natural rubber and of inulin. Some tire companies are already testing TKS tire prototypes. There are also many investigations on the production of bio-fuels from inulin and on inulin applications for health improvement and in the food industry. Only limited genomic resources exist for TKS and, in particular, no genetic linkage map is available for this species. We have constructed the first TKS genetic linkage map based on AFLP, COS, SSR and EST-SSR markers. The integrated linkage map with eight linkage groups (LG), representing the eight chromosomes of Russian dandelion, has 185 individual AFLP markers from parent 1, 188 individual AFLP markers from parent 2, 75 common AFLP markers and 6 COS, 1 SSR and 63 EST-SSR loci. BLAST searches of the EST-SSR sequences against known sequences from lettuce allowed a partial alignment of our TKS map with a lettuce map. BLAST searches against plant gene databases revealed some homologies with genes that may be useful for downstream applications. PMID:27488242
Kikuchi, Taisei; Aikawa, Takuya; Kosaka, Hajime; Pritchard, Leighton; Ogura, Nobuo; Jones, John T
2007-09-01
Most Bursaphelenchus species feed on fungi that colonise dead or dying trees. However, Bursaphelenchus xylophilus is unique in that, in addition to feeding on fungi, it has the capacity to be a parasite of live pine trees. We present an analysis of over 13,000 expressed sequence tags (ESTs) from B. xylophilus and, by way of contrast, over 3000 ESTs from B. mucronatus, a closely related species that does not parasitise plants as readily. Four libraries from B. xylophilus, from a variety of life stages including fungal feeding nematodes, nematodes extracted from plants and dauer-like stage nematodes, and one library from B. mucronatus were constructed and used to generate ESTs. Contig analysis showed that the 13,327 B. xylophilus ESTs could be grouped into 2110 contigs and 4377 singletons, giving a total of 6487 identified genes. Similarly, the 3193 B. mucronatus ESTs yielded a total of 2219 identified genes from 425 contigs and 1794 singletons. A variety of proteins potentially important in the parasitic process of B. xylophilus and B. mucronatus were identified in the libraries, including plant and fungal cell wall degrading enzymes and a novel gene potentially encoding an expansin-like protein that may disrupt non-covalent bonds in the plant cell wall. Several gene candidates potentially involved in dauer entry or maintenance were also identified in the EST dataset. The EST sequences from this study will provide a solid base for future research on the biology, pathogenicity and evolutionary history of this nematode group.
Castelnuovo, Gianluca
2010-01-01
The field of research and practice in psychotherapy has been deeply influenced by two different approaches: the empirically supported treatments (ESTs) movement, linked with the evidence-based medicine (EBM) perspective, and the “Common Factors” approach, typically connected with the “Dodo Bird Verdict”. Regarding the first perspective, a list of ESTs has been established in the mental health field since 1998. Criteria for “well-established” and “probably efficacious” treatments have arisen. The development of these kinds of paradigms was motivated by the emergence of a “managerial” approach and related remuneration systems, also for mental health providers and insurance companies. In this article ESTs are presented, along with some possible criticisms. Finally, complementary approaches that could add different evidence to psychotherapy research, compared with the traditional EBM approach, are presented. PMID:21833197
Geology and hydrogeology of the Dammam Formation in Kuwait
NASA Astrophysics Data System (ADS)
Al-Awadi, E.; Mukhopadhyay, A.; Al-Senafy, M. N.
The Dammam Formation of Middle Eocene age is one of the major aquifers containing useable brackish water in Kuwait. Apart from the paleokarst zone at the top, the Dammam Formation in Kuwait consists of 150-200 m of dolomitized limestone that is subdivided into three members, on the basis of lithology and biofacies. The upper member consists of friable chalky dolomicrite and dolomite. The middle member is mainly laminated biomicrite and biodolomicrite. The lower member is nummulitic limestone with interlayered shale toward the base. Geophysical markers conform to these subdivisions. Core analyses indicate that the upper member is the most porous and permeable of the three units, as confirmed by the distribution of lost-circulation zones. The quality of water in the aquifer deteriorates toward the north and east. A potentiometric-head difference exists between the Dammam Formation and the unconformably overlying Kuwait Group; this difference is maintained by the presence of an intervening aquitard.
Rational design of a carboxylic esterase RhEst1 based on computational analysis of substrate binding
Chen, Qi; Luan, Zheng-Jiao; Yu, Hui-Lei; ...
2015-10-31
A new carboxylic esterase, RhEst1, which catalyzes the hydrolysis of (S)-(+)-2,2-dimethylcyclopropanecarboxylate (S-DmCpCe), the key chiral building block of cilastatin, was identified and subsequently crystallized in our previous work. The mutant RhEst1 A147I/V148F/G254A was found to show a 5-fold increase in catalytic activity. In this work, molecular dynamics simulations were performed to elucidate the molecular determinants of the enzyme activity. Our simulations show that the substrate binds much more strongly in the A147I/V148F/G254A mutant than in the wild type, with more hydrogen bonds formed between the substrate and the catalytic triad and the oxyanion hole. The OH group of the catalytic residue Ser101 in the mutant is better positioned to initiate the nucleophilic attack on S-DmCpCe. Interestingly, the "170-179" loop, which is involved in shaping the catalytic sites and facilitating product release, shows remarkable dynamic differences in the two systems. Based on the simulation results, six residues were identified as potential "hot-spots" for further experimental testing. Consequently, the G126S and R133L mutants show higher catalytic efficiency than the wild type. In conclusion, this work provides molecular-level insights into the substrate binding mechanism of carboxylic esterase RhEst1, facilitating future experimental efforts toward developing more efficient RhEst1 variants for industrial applications.
Asamizu, Erika; Nakamura, Yasukazu; Sato, Shusei; Tabata, Satoshi
2004-02-01
To perform a comprehensive analysis of genes expressed in a model legume, Lotus japonicus, a total of 74472 3'-end expressed sequence tags (EST) were generated from cDNA libraries produced from six different organs. Clustering of sequences was performed with an identity criterion of 95% for 50 bases, and a total of 20457 non-redundant sequences, 8503 contigs and 11954 singletons were generated. EST sequence coverage was analyzed by using the annotated L. japonicus genomic sequence and 1093 of the 1889 predicted protein-encoding genes (57.9%) were hit by the EST sequence(s). Gene content was compared to several plant species. Among the 8503 contigs, 471 were identified as sequences conserved only in leguminous species and these included several disease resistance-related genes. This suggested that in legumes, these genes may have evolved specifically to resist pathogen attack. The rate of gene sequence divergence was assessed by comparing similarity level and functional category based on the Gene Ontology (GO) annotation of Arabidopsis genes. This revealed that genes encoding ribosomal proteins, as well as those related to translation, photosynthesis, and cellular structure were more abundantly represented in the highly conserved class, and that genes encoding transcription factors and receptor protein kinases were abundantly represented in the less conserved class. To make the sequence information and the cDNA clones available to the research community, a Web database with useful services was created at http://www.kazusa.or.jp/en/plant/lotus/EST/.
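The clustering totals and the coverage figure quoted in this abstract can be checked with a few lines of arithmetic. The sketch below uses only the numbers stated in the abstract; the variable names are illustrative.

```python
# Figures taken from the abstract above.
contigs = 8503
singletons = 11954
non_redundant = contigs + singletons   # non-redundant sequences after clustering

genes_hit = 1093          # predicted genes matched by at least one EST
genes_predicted = 1889    # predicted protein-encoding genes in the annotated sequence
coverage_pct = round(100.0 * genes_hit / genes_predicted, 1)

print(non_redundant, coverage_pct)
```

Both values agree with the abstract: the contigs and singletons add up to the reported 20457 non-redundant sequences, and the EST coverage of predicted genes rounds to 57.9%.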
Electric microgenerators based on thermal oscillators (Microgenerateurs electriques a base d'oscillateurs thermiques)
NASA Astrophysics Data System (ADS)
Leveille, Etienne
In a context of sustainable development and automation of our environment, the use of distributed wireless sensors is growing. However, using and replacing batteries is costly. The ever lower energy consumption of electronics makes harvesting ambient energy feasible. Waste heat is an attractive energy source because it is the final form of most of the energy used by humans. At small scale, however, only thermoelectric elements are available. This work therefore explores and compares alternative generation mechanisms. Because most alternative transduction mechanisms are dynamic, using them requires converting continuous thermal energy into oscillations. The mechanisms studied thus all share a thermal oscillator in addition to a mechanism for transduction to electrical energy. Among the various mechanisms identified, two are studied in detail to understand their behavior and to determine their efficiency and potential power. The first generator, studied theoretically, is based on the change in ferromagnetism of a mass suspended by springs above a magnet. The behavior of the model developed matches the behavior reported in the literature. Two main design parameters were identified, allowing control of the frequency and of the operating temperature range. In addition, the mechanism can operate with small temperature differences and at temperatures close to ambient, opening the door to applications that use human body heat. Using a pyroelectric material as the transduction mechanism could offer feasible electrical power densities on the order of 1 mW/cm3.
The second generator, studied experimentally, is based on the explosive evaporation of a superheated liquid in the absence of nucleation sites. A first prototype demonstrated, for the first time, the operation of such a cycle. A study of the effect of the heat source temperature and of the liquid flow rate shows that an ideal operating zone exists. The maximum measured output power is on the order of 1.6 µW. Improvements are proposed to increase this power by two orders of magnitude. The use of capillary pumping to make the system autonomous is also demonstrated, but it remains sensitive to variations in conditions. Finally, the study of the devices shows that microgenerators based on thermal oscillators can be of interest, compared with thermoelectric elements, in applications where temperatures are low or uncertain. However, these mechanisms suffer from very low efficiency caused by the multiple weakly coupled energy conversions. Keywords: microgenerator, thermal oscillator, thermodynamic cycle, thermoelectricity, microelectromechanical systems, MEMS
Bonizzoni, Paola; Rizzi, Raffaella; Pesole, Graziano
2005-10-05
Currently available methods to predict splice sites are mainly based on the independent and progressive alignment of transcript data (mostly ESTs) to the genomic sequence. Apart from often being computationally expensive, this approach is vulnerable to several problems, hence the need to develop novel strategies. We propose a method, based on a novel multiple genome-EST alignment algorithm, for the detection of splice sites. To avoid the limitations of splice site prediction (mainly over-prediction) that arise from independent single EST alignments to the genomic sequence, our approach performs a multiple alignment of transcript data to the genomic sequence based on the combined analysis of all available data. We recast the problem of predicting constitutive and alternative splicing as an optimization problem, where the optimal multiple transcript alignment minimizes the number of exons and hence of splice site observations. We have implemented a splice site predictor based on this algorithm in the software tool ASPIC (Alternative Splicing PredICtion). It is distinguished from other methods based on BLAST-like tools by the incorporation of entirely new ad hoc procedures for accurate and computationally efficient transcript alignment, and it adopts dynamic programming for the refinement of intron boundaries. ASPIC also provides the minimal set of non-mergeable transcript isoforms compatible with the detected splicing events. The ASPIC web resource is dynamically interconnected with the Ensembl and Unigene databases and also implements an upload facility. Extensive benchmarking shows that ASPIC outperforms other existing methods in the detection of novel splicing isoforms and in the minimization of over-predictions. ASPIC also requires a lower computation time for processing a single gene and an EST cluster. The ASPIC web resource is available at http://aspic.algo.disco.unimib.it/aspic-devel/.
Argout, Xavier; Fouet, Olivier; Wincker, Patrick; Gramacho, Karina; Legavre, Thierry; Sabau, Xavier; Risterucci, Ange Marie; Da Silva, Corinne; Cascardo, Julio; Allegre, Mathilde; Kuhn, David; Verica, Joseph; Courtois, Brigitte; Loor, Gaston; Babin, Regis; Sounigo, Olivier; Ducamp, Michel; Guiltinan, Mark J; Ruiz, Manuel; Alemanno, Laurence; Machado, Regina; Phillips, Wilberth; Schnell, Ray; Gilmour, Martin; Rosenquist, Eric; Butler, David; Maximova, Siela; Lanaud, Claire
2008-01-01
Background Theobroma cacao L. is a tree that originated in the tropical rainforest of South America. It is one of the major cash crops for many tropical countries. T. cacao is mainly produced on smallholdings, providing resources for 14 million farmers. Disease resistance and T. cacao quality improvement are two important challenges for all actors of cocoa and chocolate production. T. cacao is seriously affected by pests and fungal diseases, which are responsible for yield losses of more than 40%; quality improvement, both nutritional and organoleptic, is also important for consumers. An international collaboration was formed to develop an EST genomic resource database for cacao. Results Fifty-six cDNA libraries were constructed from different organs, different genotypes and different environmental conditions. A total of 149,650 valid EST sequences were generated, corresponding to 48,594 unigenes: 12,692 contigs and 35,902 singletons. A total of 29,849 unigenes shared significant homology with public sequences from other species. Gene Ontology (GO) annotation was applied to distribute the ESTs among the main GO categories. A specific information system (ESTtik) was constructed to process, store and manage this EST collection, allowing the user to query a database. To check the representativeness of our EST collection, we looked for the genes known to be involved in two different metabolic pathways extensively studied in other plant species and important for T. cacao qualities: the flavonoid and the terpene pathways. Most of the enzymes described in other crops for these two metabolic pathways were found in our EST collection. A large collection of new genetic markers was provided by this EST collection. Conclusion This EST collection displays a good representation of the T. cacao transcriptome, suitable for analysis of biochemical pathways based on oligonucleotide microarrays derived from these ESTs. 
It will provide numerous genetic markers that will allow the construction of a high density gene map of T. cacao. This EST collection represents a unique and important molecular resource for T. cacao study and improvement, facilitating the discovery of candidate genes for important T. cacao trait variation. PMID:18973681
Argout, Xavier; Fouet, Olivier; Wincker, Patrick; Gramacho, Karina; Legavre, Thierry; Sabau, Xavier; Risterucci, Ange Marie; Da Silva, Corinne; Cascardo, Julio; Allegre, Mathilde; Kuhn, David; Verica, Joseph; Courtois, Brigitte; Loor, Gaston; Babin, Regis; Sounigo, Olivier; Ducamp, Michel; Guiltinan, Mark J; Ruiz, Manuel; Alemanno, Laurence; Machado, Regina; Phillips, Wilberth; Schnell, Ray; Gilmour, Martin; Rosenquist, Eric; Butler, David; Maximova, Siela; Lanaud, Claire
2008-10-30
Theobroma cacao L. is a tree that originated in the tropical rainforest of South America. It is one of the major cash crops for many tropical countries. T. cacao is mainly produced on smallholdings, providing resources for 14 million farmers. Disease resistance and T. cacao quality improvement are two important challenges for all actors of cocoa and chocolate production. T. cacao is seriously affected by pests and fungal diseases, which are responsible for yield losses of more than 40%; quality improvement, both nutritional and organoleptic, is also important for consumers. An international collaboration was formed to develop an EST genomic resource database for cacao. Fifty-six cDNA libraries were constructed from different organs, different genotypes and different environmental conditions. A total of 149,650 valid EST sequences were generated, corresponding to 48,594 unigenes: 12,692 contigs and 35,902 singletons. A total of 29,849 unigenes shared significant homology with public sequences from other species. Gene Ontology (GO) annotation was applied to distribute the ESTs among the main GO categories. A specific information system (ESTtik) was constructed to process, store and manage this EST collection, allowing the user to query a database. To check the representativeness of our EST collection, we looked for the genes known to be involved in two different metabolic pathways extensively studied in other plant species and important for T. cacao qualities: the flavonoid and the terpene pathways. Most of the enzymes described in other crops for these two metabolic pathways were found in our EST collection. A large collection of new genetic markers was provided by this EST collection. This EST collection displays a good representation of the T. cacao transcriptome, suitable for analysis of biochemical pathways based on oligonucleotide microarrays derived from these ESTs. It will provide numerous genetic markers that will allow the construction of a high density gene map of T. cacao. 
This EST collection represents a unique and important molecular resource for T. cacao study and improvement, facilitating the discovery of candidate genes for important T. cacao trait variation.
Sequence evaluation of four specific cDNA libraries for developmental genomics of sunflower.
Tamborindeguy, C; Ben, C; Liboz, T; Gentzbittel, L
2004-04-01
Four different cDNA libraries were constructed from sunflower protoplasts growing under embryogenic and non-embryogenic conditions: one standard library from each condition and two subtractive libraries in opposite sense. A total of 22,876 cDNA clones were obtained and 4800 ESTs were sequenced, giving rise to 2479 high-quality ESTs representing a unigene set of 1502 sequences. This set was compared with ESTs represented in public databases using the programs BLASTN and BLASTX, and its members were classified according to putative function using the catalog in the Kyoto Encyclopedia of Genes and Genomes (KEGG). Some 33% of sequences failed to align with existing plant ESTs and therefore represent putative novel genes. The libraries show a low level of redundancy and, on average, 50% of the present ESTs have not been previously reported for sunflower. Several potentially interesting genes were identified, based on their homology with genes involved in animal zygotic division or plant embryogenesis. We also identified two ESTs that show significantly different levels of expression under embryogenic and non-embryogenic conditions. The libraries described here represent an original and valuable resource for discovering as yet unknown genes putatively involved in dicot embryogenesis and for improving our knowledge of the mechanisms involved in polarity acquisition by plant embryos.
Construction, database integration, and application of an Oenothera EST library.
Mrácek, Jaroslav; Greiner, Stephan; Cho, Won Kyong; Rauwolf, Uwe; Braun, Martha; Umate, Pavan; Altstätter, Johannes; Stoppel, Rhea; Mlcochová, Lada; Silber, Martina V; Volz, Stefanie M; White, Sarah; Selmeier, Renate; Rudd, Stephen; Herrmann, Reinhold G; Meurer, Jörg
2006-09-01
Coevolution of cellular genetic compartments is a fundamental aspect in eukaryotic genome evolution that becomes apparent in serious developmental disturbances after interspecific organelle exchanges. The genus Oenothera represents a unique, at present the only available, resource to study the role of the compartmentalized plant genome in diversification of populations and speciation processes. An integrated approach involving cDNA cloning, EST sequencing, and bioinformatic data mining was chosen using Oenothera elata with the genetic constitution nuclear genome AA with plastome type I. The Gene Ontology system grouped 1621 unique gene products into 17 different functional categories. Application of arrays generated from a selected fraction of ESTs revealed significantly differing expression profiles among closely related Oenothera species possessing the potential to generate fertile and incompatible plastid/nuclear hybrids (hybrid bleaching). Furthermore, the EST library provides a valuable source of PCR-based polymorphic molecular markers that are instrumental for genotyping and molecular mapping approaches.
Gao, Beile; Gupta, Radhey S
2007-01-01
Background The Archaea are highly diverse in terms of their physiology, metabolism and ecology. Presently, very few molecular characteristics are known that are uniquely shared by either all archaea or the different main groups within archaea. The evolutionary relationships among different groups within the Euryarchaeota branch are also not clearly understood. Results We have carried out comprehensive analyses of all open reading frames (ORFs) in the genomes of 11 archaea (3 Crenarchaeota – Aeropyrum pernix, Pyrobaculum aerophilum and Sulfolobus acidocaldarius; 8 Euryarchaeota – Pyrococcus abyssi, Methanococcus maripaludis, Methanopyrus kandleri, Methanococcoides burtonii, Halobacterium sp. NRC-1, Haloquadratum walsbyi, Thermoplasma acidophilum and Picrophilus torridus) to search for proteins that are unique to either all Archaea or its main subgroups. These studies have identified 1448 proteins or ORFs that are distinctive characteristics of Archaea and its various subgroups and whose homologues are not found in other organisms. Six of these proteins are unique to all Archaea, 10 others are only missing in Nanoarchaeum equitans and a large number of other proteins are specific for various main groups within the Archaea (e.g. Crenarchaeota, Euryarchaeota, Sulfolobales and Desulfurococcales, Halobacteriales, Thermococci, Thermoplasmata, all methanogenic archaea or particular groups of methanogens). Of particular importance is the observation that 31 proteins are uniquely present in virtually all methanogens (including M. kandleri) and 10 additional proteins are only found in different methanogens as well as A. fulgidus. In contrast, no protein was exclusively shared by various methanogens and any of the Halobacteriales or Thermoplasmatales. These results strongly indicate that all methanogenic archaea form a monophyletic group exclusive of other archaea and that this lineage likely evolved from Archaeoglobus. 
In addition, 15 proteins that are uniquely shared by M. kandleri and Methanobacteriales suggest a close evolutionary relationship between them. In contrast to the phylogenomics studies, a monophyletic grouping of archaea is not supported by phylogenetic analyses based on protein sequences. Conclusion The identified archaea-specific proteins provide novel molecular markers or signature proteins that are distinctive characteristics of Archaea and all of its major subgroups. The species distributions of these proteins provide novel insights into the evolutionary relationships among different groups within Archaea, particularly regarding the origin of methanogenesis. Most of these proteins are of unknown function and further studies should lead to discovery of novel biochemical and physiological characteristics that are unique to either all archaea or its different subgroups. PMID:17394648
Barvkar, Vitthal T; Pardeshi, Varsha C; Kale, Sandip M; Kadoo, Narendra Y; Gupta, Vidya S
2012-05-08
The glycosylation process, catalyzed by ubiquitous glycosyltransferase (GT) family enzymes, is a prevalent modification of plant secondary metabolites that regulates various functions such as hormone homeostasis, detoxification of xenobiotics and biosynthesis and storage of secondary metabolites. Flax (Linum usitatissimum L.) is a commercially grown oilseed crop, important because of its essential fatty acids and health-promoting lignans. Identification and characterization of UDP glycosyltransferase (UGT) genes from flax could provide valuable basic information about this important gene family and help to explain the seed-specific glycosylated metabolite accumulation and other processes in plants. Plant genome sequencing projects are useful to discover complexity within this gene family and also pave the way for the development of functional genomics approaches. Taking advantage of the newly assembled draft genome sequence of flax, we identified 137 UDP glycosyltransferase (UGT) genes from flax using a conserved signature motif. Phylogenetic analysis of these protein sequences clustered them into 14 major groups (A-N). Expression patterns of these genes were investigated using publicly available expressed sequence tag (EST) and microarray data and reverse transcription quantitative real time PCR (RT-qPCR). Seventy-three per cent of these genes (100 out of 137) showed expression evidence in the 15 tissues examined and indicated varied expression profiles. The RT-qPCR results of 10 selected genes were also coherent with the digital expression analysis. Interestingly, five duplicated UGT genes were identified, which showed differential expression in various tissues. Of the seven intron loss/gain positions detected, two intron positions were conserved among most of the UGTs, although a clear relationship about the evolution of these genes could not be established. 
Comparison of the flax UGTs with orthologs from four other sequenced dicot genomes indicated that seven UGTs were flax diverged. Flax has a large number of UGT genes, including a few flax-diverged ones. Phylogenetic analysis and expression profiles of these genes identified the tissue- and condition-specific repertoire of UGT genes from this crop. This study would facilitate precise selection of candidate genes and further characterization of their substrate specificities and in planta functions.
2012-01-01
Background The glycosylation process, catalyzed by ubiquitous glycosyltransferase (GT) family enzymes, is a prevalent modification of plant secondary metabolites that regulates various functions such as hormone homeostasis, detoxification of xenobiotics and biosynthesis and storage of secondary metabolites. Flax (Linum usitatissimum L.) is a commercially grown oilseed crop, important because of its essential fatty acids and health-promoting lignans. Identification and characterization of UDP glycosyltransferase (UGT) genes from flax could provide valuable basic information about this important gene family and help to explain the seed-specific glycosylated metabolite accumulation and other processes in plants. Plant genome sequencing projects are useful to discover complexity within this gene family and also pave the way for the development of functional genomics approaches. Results Taking advantage of the newly assembled draft genome sequence of flax, we identified 137 UDP glycosyltransferase (UGT) genes from flax using a conserved signature motif. Phylogenetic analysis of these protein sequences clustered them into 14 major groups (A-N). Expression patterns of these genes were investigated using publicly available expressed sequence tag (EST) and microarray data and reverse transcription quantitative real time PCR (RT-qPCR). Seventy-three per cent of these genes (100 out of 137) showed expression evidence in the 15 tissues examined and indicated varied expression profiles. The RT-qPCR results of 10 selected genes were also coherent with the digital expression analysis. Interestingly, five duplicated UGT genes were identified, which showed differential expression in various tissues. Of the seven intron loss/gain positions detected, two intron positions were conserved among most of the UGTs, although a clear relationship about the evolution of these genes could not be established. 
Comparison of the flax UGTs with orthologs from four other sequenced dicot genomes indicated that seven UGTs were flax diverged. Conclusions Flax has a large number of UGT genes, including a few flax-diverged ones. Phylogenetic analysis and expression profiles of these genes identified the tissue- and condition-specific repertoire of UGT genes from this crop. This study would facilitate precise selection of candidate genes and further characterization of their substrate specificities and in planta functions. PMID:22568875
Feasibility study of a wind-diesel system with compressed air storage
NASA Astrophysics Data System (ADS)
Benchaabane, Youssef
The Wind-Diesel Hybrid System with Compressed Air Storage (SHEDAC, from the French "Systeme Hybride Eolien-Diesel avec Stockage d'Air Comprime") uses pneumatic hybridization to replace fossil fuel consumption with renewable energy, specifically wind energy. Surplus wind energy is used to compress and store air, which is then used to supercharge the diesel engine. The master's thesis consists of two scientific articles. The first article presents the development of software dedicated to the feasibility study of a wind-diesel system with compressed air storage. This study is based on the analysis of costs and revenues and of equipment costs (wind turbine, diesel engine, air storage system). It is complemented by a sensitivity analysis with respect to the various parameters, a risk analysis and an analysis of greenhouse gas (GHG) emissions. The second article applies this software to the installation of a SHEDAC system at the Esker mining camp in Quebec, replacing the current energy production sources. Compressed air storage using a SHEDAC system is the most cost-effective option compared with wind energy alone, a diesel thermal power plant alone, or the two combined. With a higher net present value and a higher internal rate of return, this solution yields the lowest cost of energy for this remote region.
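The abstract above ranks options by net present value (NPV) and internal rate of return (IRR). As a minimal sketch of how those two indicators are commonly computed, and not a reproduction of the thesis software, the following uses purely hypothetical cash flows; the function names and the bisection bounds are assumptions made for illustration.

```python
def net_present_value(rate, cash_flows):
    """Discount a list of yearly cash flows; cash_flows[0] occurs at year 0."""
    return sum(cf / (1.0 + rate) ** year for year, cf in enumerate(cash_flows))


def internal_rate_of_return(cash_flows, low=-0.99, high=1.0, tol=1e-9):
    """Find the rate where NPV is zero by bisection.

    Assumes NPV is positive at `low` and negative at `high`, which holds for
    a conventional project (one initial outlay followed by inflows).
    """
    if net_present_value(low, cash_flows) <= 0 or net_present_value(high, cash_flows) >= 0:
        raise ValueError("NPV does not change sign on the search interval")
    while high - low > tol:
        mid = (low + high) / 2.0
        if net_present_value(mid, cash_flows) > 0:
            low = mid   # rate too small, NPV still positive
        else:
            high = mid  # rate too large, NPV negative or zero
    return (low + high) / 2.0


# Hypothetical project: invest 1000 now, receive 1100 after one year.
example_flows = [-1000.0, 1100.0]
example_npv = net_present_value(0.10, example_flows)
example_irr = internal_rate_of_return(example_flows)
```

With these illustrative flows the IRR is 10%, so the NPV at a 10% discount rate is zero; an option with a higher NPV at the chosen discount rate, or a higher IRR, would be preferred.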
Yutin, Natalya; Raoult, Didier; Koonin, Eugene V
2013-05-23
Recent advances in genomics and metagenomics reveal remarkable diversity of viruses and other selfish genetic elements. In particular, giant viruses have been shown to possess their own mobilomes that include virophages, small viruses that parasitize giant viruses of the Mimiviridae family, and transpovirons, distinct linear plasmids. One of the virophages, known as the Mavirus, a parasite of the giant Cafeteria roenbergensis virus, shares several genes with large eukaryotic self-replicating transposons of the Polinton (Maverick) family, and it has been proposed that the polintons evolved from a Mavirus-like ancestor. We performed a comprehensive phylogenomic analysis of the available genomes of virophages and traced the evolutionary connections between the virophages and other selfish genetic elements. The comparison of the gene composition and genome organization of the virophages reveals 6 conserved, core genes that are organized in partially conserved arrays. Phylogenetic analysis of those core virophage genes, for which a sufficient diversity of homologs outside the virophages was detected, including the maturation protease and the packaging ATPase, supports the monophyly of the virophages. The results of this analysis appear incompatible with the origin of polintons from a Mavirus-like agent but rather suggest that Mavirus evolved through recombination between a polinton and an unknown virus. Altogether, virophages, polintons, a distinct Tetrahymena transposable element Tlr1, transpovirons, adenoviruses, and some bacteriophages form a network of evolutionary relationships that is held together by overlapping sets of shared genes and appears to represent a distinct module in the vast total network of viruses and mobile elements. 
The results of the phylogenomic analysis of the virophages and related genetic elements are compatible with the concept of network-like evolution of the virus world and emphasize multiple evolutionary connections between bona fide viruses and other classes of capsid-less mobile elements.
2013-01-01
Background Recent advances in genomics and metagenomics reveal remarkable diversity of viruses and other selfish genetic elements. In particular, giant viruses have been shown to possess their own mobilomes that include virophages, small viruses that parasitize giant viruses of the Mimiviridae family, and transpovirons, distinct linear plasmids. One of the virophages, known as the Mavirus, a parasite of the giant Cafeteria roenbergensis virus, shares several genes with large eukaryotic self-replicating transposons of the Polinton (Maverick) family, and it has been proposed that the polintons evolved from a Mavirus-like ancestor. Results We performed a comprehensive phylogenomic analysis of the available genomes of virophages and traced the evolutionary connections between the virophages and other selfish genetic elements. The comparison of the gene composition and genome organization of the virophages reveals 6 conserved, core genes that are organized in partially conserved arrays. Phylogenetic analysis of those core virophage genes, for which a sufficient diversity of homologs outside the virophages was detected, including the maturation protease and the packaging ATPase, supports the monophyly of the virophages. The results of this analysis appear incompatible with the origin of polintons from a Mavirus-like agent but rather suggest that Mavirus evolved through recombination between a polinton and an unknown virus. Altogether, virophages, polintons, a distinct Tetrahymena transposable element Tlr1, transpovirons, adenoviruses, and some bacteriophages form a network of evolutionary relationships that is held together by overlapping sets of shared genes and appears to represent a distinct module in the vast total network of viruses and mobile elements. 
Conclusions The results of the phylogenomic analysis of the virophages and related genetic elements are compatible with the concept of network-like evolution of the virus world and emphasize multiple evolutionary connections between bona fide viruses and other classes of capsid-less mobile elements. PMID:23701946
A phylogenomic profile of hemerythrins, the nonheme diiron binding respiratory proteins
2008-01-01
Background Hemerythrins (Hrs) are the non-heme, diiron binding respiratory proteins of brachiopods, priapulids and sipunculans; they are also found in annelids and bacteria, where their functions have not been fully elucidated. Results A search for putative Hrs in the genomes of 43 archaea, 444 bacteria and 135 eukaryotes revealed their presence in 3 archaea, 118 bacteria, several fungi, one apicomplexan, a heterolobosan, a cnidarian and several annelids. About a fourth of the Hr sequences were identified as N- or C-terminal domains of chimeric, chemotactic gene regulators. The function of the remaining single domain bacterial Hrs remains to be determined. In addition to oxygen transport, the possible functions in annelids have been proposed to include cadmium-binding, antibacterial action and immunoprotection. A Bayesian phylogenetic tree revealed a split into two clades, one encompassing archaea, bacteria and fungi, and the other comprising the remaining eukaryotes. The annelid and sipunculan Hrs share the same intron-exon structure, different from that of the cnidarian Hr. Conclusion The phylogenomic profile of Hrs demonstrated a limited occurrence in bacteria and archaea and a marked absence in the vast majority of multicellular organisms. Among the metazoa, Hrs have survived in a cnidarian and in a few protostome groups; hence, it appears that in metazoans the Hr gene was lost in the deuterostome ancestor(s) after the radiata/bilateria split. Signal peptide sequences in several Hirudinea Hrs suggest, for the first time, the possibility of extracellular localization. Since the α-helical bundle is likely to have been among the earliest protein folds, Hrs represent an ancient family of iron-binding proteins, whose primary function in bacteria may have been that of an oxygen sensor, enabling aerophilic or aerophobic responses. Although Hrs evolved to function as O2 transporters in brachiopods, priapulids and sipunculans, their function in annelids remains to be elucidated. 
Overall Hrs exhibit a considerable lack of evolutionary success in metazoans. PMID:18764950
Ješovnik, Ana; González, Vanessa L; Schultz, Ted R
2016-01-01
Fungus-farming ("attine") ants are model systems for studies of symbiosis, coevolution, and advanced eusociality. A New World clade of nearly 300 species in 15 genera, all attine ants cultivate fungal symbionts for food. In order to better understand the evolution of ant agriculture, we sequenced, assembled, and analyzed transcriptomes of four different attine ant species in two genera: three species in the higher-attine genus Sericomyrmex and a single lower-attine ant species, Apterostigma megacephala, representing the first genomic data for either genus. These data were combined with published genomes of nine other ant species and the honey bee Apis mellifera for phylogenomic and divergence-dating analyses. The resulting phylogeny confirms relationships inferred in previous studies of fungus-farming ants. Divergence-dating analyses recovered slightly older dates than most prior analyses, estimating that attine ants originated 53.6-66.7 million years ago, and recovered a very long branch subtending a very recent, rapid radiation of the genus Sericomyrmex. This result is further confirmed by a separate analysis of the three Sericomyrmex species, which reveals that 92.71% of orthologs have 99%-100% pairwise-identical nucleotide sequences. We searched the transcriptomes for genes of interest, most importantly argininosuccinate synthase and argininosuccinate lyase, which are functional in other ants but which are known to have been lost in seven previously studied attine ant species. Loss of the ability to produce the amino acid arginine has been hypothesized to contribute to the obligate dependence of attine ants upon their cultivated fungi, but the point in fungus-farming ant evolution at which these losses occurred has remained unknown. We did not find these genes in any of the sequenced transcriptomes. 
Although expected for Sericomyrmex species, the absence of arginine anabolic genes in the lower-attine ant Apterostigma megacephala strongly suggests that the loss coincided with the origin of attine ants.
Genetic Diversity in Lens Species Revealed by EST and Genomic Simple Sequence Repeat Analysis
Dikshit, Harsh Kumar; Singh, Akanksha; Singh, Dharmendra; Aski, Muraleedhar Sidaram; Prakash, Prapti; Jain, Neelu; Meena, Suresh; Kumar, Shiv; Sarker, Ashutosh
2015-01-01
Low productivity of pilosae type lentils grown in South Asia is attributed to the narrow genetic base of the released cultivars, which results in susceptibility to biotic and abiotic stresses. To enhance productivity and production, broadening of the genetic base is essential. The genetic base of released cultivars can be broadened by using diverse types, including bold-seeded and early-maturing lentils from the Mediterranean region and related wild species. Genetic diversity in eighty-six accessions of three species of the genus Lens was assessed based on twelve genomic and thirty-one EST-SSR markers. The evaluated set of genotypes included diverse lentil varieties and advanced breeding lines from the Indian programme, two early-maturing ICARDA lines and five related wild subspecies/species endemic to the Mediterranean region. Genomic SSRs exhibited higher polymorphism than EST-SSRs. GLLC 598 produced 5 alleles with the highest gene diversity value of 0.80. Among the studied subspecies/species, the 43 SSRs detected the maximum number of alleles in L. orientalis. Based on Nei’s genetic distance, cultivated lentil L. culinaris subsp. culinaris was found to be close to its wild progenitor L. culinaris subsp. orientalis. Pritchard’s STRUCTURE analysis of the 86 genotypes distinguished the different subspecies/species. Higher variability was recorded among individuals within populations than among populations. PMID:26381889
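The gene diversity value quoted above for GLLC 598 (5 alleles, 0.80) can be related to allele frequencies through the standard expected-heterozygosity formula, H = 1 - sum of squared allele frequencies. The sketch below is illustrative only: the abstract does not report allele frequencies, so the equal frequencies used here are an assumption, and the function name is hypothetical. It omits any small-sample correction.

```python
def gene_diversity(allele_freqs):
    """Expected heterozygosity: 1 minus the sum of squared allele frequencies."""
    total = sum(allele_freqs)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in allele_freqs)


# Assumption for illustration: five alleles at equal frequency (0.2 each).
equal_five = [0.2] * 5
diversity_equal = gene_diversity(equal_five)

# A skewed (hypothetical) distribution over the same five alleles gives a lower value.
skewed_five = [0.6, 0.1, 0.1, 0.1, 0.1]
diversity_skewed = gene_diversity(skewed_five)
```

Under this formula, 0.80 is the ceiling for a five-allele locus and is reached only when the alleles are equally frequent, which is consistent with the reported value being the highest in the marker set.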
NASA Astrophysics Data System (ADS)
De Montigny, Etienne
This thesis deals with the development of instrumentation for optical medical imaging. The work focuses on one particular application: facilitating tissue identification during thyroid and parathyroid surgery. The thyroid is a gland located in the neck, attached to the larynx at the level of the Adam's apple. It is surrounded by several important structures: muscles, nerves and parathyroid glands. The latter control blood calcium levels and therefore play an essential role in the body. However, they are small and very difficult to distinguish from the surrounding fat and lymph nodes. The main objective of this thesis is to develop instrumentation based on optical microscopy that allows tissues (thyroid, parathyroid, fat and lymph nodes) to be identified during surgery. Design choices are therefore made according to this application and the specific context of intraoperative measurements on human patients. Several optical imaging modalities are identified to reach this objective: reflectance confocal microscopy, optical coherence tomography, and measurement of parathyroid gland autofluorescence. To improve their compatibility with the clinical environment, which requires stability over time and resistance to vibrations and environmental conditions, this project concentrates on implementations that can be miniaturized and are based on optical fibers. To implement a fast laser-scanning fluorescence imaging system, a spectrally encoded fluorescence imaging system is proposed. Although spectral encoding seems a priori incompatible with fluorescence contrast, an implementation that is easy to build is proposed. A second version of the setup, compatible with clinical use and facilitating the development of an endoscope, is presented. 
The proof of principle of this method is carried out at 1300 nm, a wavelength that is not suited to the intrinsic fluorescence of the parathyroids. To address this shortcoming, a new high-power (100 mW) swept laser source centered at 780 nm is demonstrated. These developments are compatible with the implementation of the reflectance confocal microscopy identified for tissue identification during thyroid surgery. This makes it possible to develop a setup combining reflectance and fluorescence contrast in the same instrument. Reflectance confocal microscopy has very high resolution, allowing tissues to be examined at the cellular level. However, this technique suffers from a low signal-to-noise ratio and substantial speckle noise, which reduce the interpretability of the images.
Soini, E. J. O.; García San Andrés, B.; Joensuu, T.
2011-01-01
Background: To assess the cost-effectiveness of trabectedin compared with end-stage treatment (EST) after failure with anthracycline and/or ifosfamide in metastatic soft tissue sarcoma (mSTS). Design: Analysis was carried out using a probabilistic Markov model with trabectedin → EST and EST arms, three health states (stable disease, progressive disease and death) and a lifetime perspective (3% annual discount rate). Finnish resources (drugs, mSTS, adverse events and travelling) and costs (year 2008) were used. Efficacy was based on an indirect comparison of the STS-201 and European Organisation for Research and Treatment of Cancer trials. QLQ-C30 scale scores were mapped to 15D, Short Form 6D and EuroQol 5D utilities. The outcome measures were the cost-effectiveness acceptability frontier, incremental cost per life year gained (LYG) and quality-adjusted life year (QALY) gained and the expected value of perfect information (EVPI). Results: Trabectedin → EST was associated with 14.0 (95% confidence interval 9.1–19.2) months longer survival, €36 778 higher costs (€32 816 using hospital price for trabectedin) and €31 590 (€28 192) incremental cost per LYG with an EVPI of €3008 (€3188) compared with EST. With a threshold of €50 000 per LYG, trabectedin → EST had 98.5% (98.2%) probability of being cost-effective. The incremental cost per QALY gained with trabectedin → EST was €42 633–47 735 (€37 992–42 819) compared with EST. The results were relatively insensitive to changes. Conclusion: Trabectedin is a potentially cost-effective treatment of mSTS patients. PMID:20627875
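The cost-effectiveness arithmetic described above can be illustrated with a minimal Python sketch. It is not the authors' probabilistic model: the function names and the monthly transition probabilities are hypothetical, and only the point estimates (€36 778, 14.0 months, the 3% discount rate and the €50 000 threshold) come from the abstract. The crude ratio it prints only approximates the reported €31 590 per LYG, which comes from the full model.

```python
def icer(delta_cost, delta_effect):
    """Incremental cost-effectiveness ratio: extra cost per extra unit of effect."""
    if delta_effect <= 0:
        raise ValueError("incremental effect must be positive for a meaningful ratio")
    return delta_cost / delta_effect


def discounted_life_years(p_progress, p_die_stable, p_die_progressed,
                          months=600, annual_discount=0.03):
    """Three-state cohort model (stable, progressed, dead) with monthly cycles.

    Counts the alive fraction at the start of each cycle (no half-cycle
    correction) and discounts it at the given annual rate.
    """
    stable, progressed = 1.0, 0.0
    monthly_factor = (1.0 + annual_discount) ** (-1.0 / 12.0)
    total_months = 0.0
    for month in range(months):
        total_months += (stable + progressed) * monthly_factor ** month
        # Both new values are computed from the previous cycle's state.
        stable, progressed = (
            stable * (1.0 - p_progress - p_die_stable),
            stable * p_progress + progressed * (1.0 - p_die_progressed),
        )
    return total_months / 12.0


# Crude check using the abstract's point estimates (EUR, months -> years).
crude_cost_per_lyg = icer(36778, 14.0 / 12.0)

# Hypothetical monthly transition probabilities for the two arms.
ly_treated = discounted_life_years(0.05, 0.01, 0.06)
ly_comparator = discounted_life_years(0.10, 0.01, 0.08)

print(round(crude_cost_per_lyg), crude_cost_per_lyg < 50000)
print(ly_treated > ly_comparator)
```

In a probabilistic analysis, inputs like these would be drawn repeatedly from distributions and the ratio recomputed for each draw; the sketch evaluates a single deterministic case.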
2012-01-01
Background MicroRNAs (miRNAs) are small RNAs (21-24 bp) providing an RNA-based system of gene regulation highly conserved in plants and animals. In plants, miRNAs control mRNA degradation or restrain translation, affecting development and responses to stresses. Plant miRNAs show imperfect but extensive complementarity to mRNA targets, making their computational prediction possible, useful when data mining is applied on different species. In this study we used a comparative approach to identify both miRNAs and their targets, in artichoke and safflower. Results Two complete expressed sequence tags (ESTs) datasets from artichoke (3.6 × 10^4 entries) and safflower (4.2 × 10^4), were analysed with a bioinformatic pipeline and in vitro experiments, identifying 17 potential miRNAs. For each EST, using RNAhybrid program and 953 non redundant miRNA mature sequences, available in miRBase as reference, we searched matching putative targets. 8730 out of 42011 ESTs from safflower and 7145 of 36323 ESTs from artichoke showed at least one predicted miRNA target. BLAST analysis showed that 75% of all ESTs shared at least a common homologous region (E-value < 10^-4) and about 50% of these displayed 400 bp or longer aligned sequences as conserved homologous/orthologous (COS) regions. 960 and 890 ESTs of safflower and artichoke organized in COS shared 79 different miRNA targets, considered functionally conserved, and statistically significant when compared with random sequences (signal to noise ratio > 2 and specificity ≥ 0.85). Four highly significant miRNAs selected from in silico data were experimentally validated in globe artichoke leaves. Conclusions Mature miRNAs and targets were predicted within EST sequences of safflower and artichoke. Most of the miRNA targets appeared highly/moderately conserved, highlighting an important and conserved function. 
In this study we introduce a stringent parameter for the comparative sequence analysis, represented by the identification of the same target in the COS region. After statistical analysis 79 targets, found on the COS regions and belonging to 60 miRNA families, have a signal to noise ratio > 2, with ≥ 0.85 specificity. The putative miRNAs identified belong to 55 dicotyledon plants and to 24 families only in monocotyledon. PMID:22536958
Zhu, Qiyun; Kosoy, Michael; Olival, Kevin J.; Dittmar, Katharina
2014-01-01
Bartonellae are mammalian pathogens vectored by blood-feeding arthropods. Although of increasing medical importance, little is known about their ecological past, and host associations are underexplored. Previous studies suggest an influence of horizontal gene transfers in ecological niche colonization by acquisition of host pathogenicity genes. We here expand these analyses to metabolic pathways of 28 Bartonella genomes, and experimentally explore the distribution of bartonellae in 21 species of blood-feeding arthropods. Across genomes, repeated gene losses and horizontal gains in the phospholipid pathway were found. The evolutionary timing of these patterns suggests functional consequences likely leading to an early intracellular lifestyle for stem bartonellae. Comparative phylogenomic analyses discover three independent lineage-specific reacquisitions of a core metabolic gene—NAD(P)H-dependent glycerol-3-phosphate dehydrogenase (gpsA)—from Gammaproteobacteria and Epsilonproteobacteria. Transferred genes are significantly closely related to invertebrate Arsenophonus-, and Serratia-like endosymbionts, and mammalian Helicobacter-like pathogens, supporting a cellular association with arthropods and mammals at the base of extant Bartonella spp. Our studies suggest that the horizontal reacquisitions had a key impact on bartonellae lineage specific ecological and functional evolution. PMID:25106622
The future of genomics in polar and alpine cyanobacteria
Anesio, Alexandre M; Sánchez-Baracaldo, Patricia
2018-01-01
In recent years, genomic analyses have arisen as an exciting way of investigating the functional capacity and environmental adaptations of numerous micro-organisms of global relevance, including cyanobacteria. In the extreme cold of Arctic, Antarctic and alpine environments, cyanobacteria are of fundamental ecological importance as primary producers and ecosystem engineers. While their role in biogeochemical cycles is well appreciated, little is known about the genomic makeup of polar and alpine cyanobacteria. In this article, we present ways that genomic techniques might be used to further our understanding of cyanobacteria in cold environments in terms of their evolution and ecology. Existing examples from other environments (e.g. marine/hot springs) are used to discuss how methods developed there might be used to investigate specific questions in the cryosphere. Phylogenomics, comparative genomics and population genomics are identified as methods for understanding the evolution and biogeography of polar and alpine cyanobacteria. Transcriptomics will allow us to investigate gene expression under extreme environmental conditions, and metagenomics can be used to complement traditional amplicon-based methods of community profiling. Finally, new techniques such as single cell genomics and metagenome assembled genomes will also help to expand our understanding of polar and alpine cyanobacteria that cannot readily be cultured. PMID:29506259
A congruent phylogenomic signal places eukaryotes within the Archaea.
Williams, Tom A; Foster, Peter G; Nye, Tom M W; Cox, Cymon J; Embley, T Martin
2012-12-22
Determining the relationships among the major groups of cellular life is important for understanding the evolution of biological diversity, but is difficult given the enormous time spans involved. In the textbook 'three domains' tree based on informational genes, eukaryotes and Archaea share a common ancestor to the exclusion of Bacteria. However, some phylogenetic analyses of the same data have placed eukaryotes within the Archaea, as the nearest relatives of different archaeal lineages. We compared the support for these competing hypotheses using sophisticated phylogenetic methods and an improved sampling of archaeal biodiversity. We also employed both new and existing tests of phylogenetic congruence to explore the level of uncertainty and conflict in the data. Our analyses suggested that much of the observed incongruence is weakly supported or associated with poorly fitting evolutionary models. All of our phylogenetic analyses, whether on small subunit and large subunit ribosomal RNA or concatenated protein-coding genes, recovered a monophyletic group containing eukaryotes and the TACK archaeal superphylum comprising the Thaumarchaeota, Aigarchaeota, Crenarchaeota and Korarchaeota. Hence, while our results provide no support for the iconic three-domain tree of life, they are consistent with an extended eocyte hypothesis whereby vital components of the eukaryotic nuclear lineage originated from within the archaeal radiation.
Chipman, Ariel D; Erwin, Douglas H
2017-09-01
The last few years have seen a significant increase in the amount of data we have about the evolution of the arthropod body plan. This has come mainly from three separate sources: a new consensus and improved resolution of arthropod phylogeny, based largely on new phylogenomic analyses; a wealth of new early arthropod fossils from a number of Cambrian localities with excellent preservation, as well as a renewed analysis of some older fossils; and developmental data from a range of model and non-model pan-arthropod species that shed light on the developmental origins and homologies of key arthropod traits. However, there has been relatively little synthesis among these different data sources, and the three communities studying them have little overlap. The symposium "The Evolution of Arthropod Body Plans-Integrating Phylogeny, Fossils and Development" brought together leading researchers in these three disciplines and made a significant contribution to the emerging synthesis of arthropod evolution, which will help advance the field and will be useful for years to come. © The Author 2017. Published by Oxford University Press on behalf of the Society for Integrative and Comparative Biology. All rights reserved. For permissions please email: journals.permissions@oup.com.
Zhou, Zhijun; Shi, Fuming; Zhao, Ling
2014-01-01
Hagloidea Handlirsch, 1906 is an ancient group of Ensifera that was much more diverse in the past, extending at least into the Triassic, apparently diminishing in diversity through the Cretaceous, and now represented by only a few extant species. In this paper, we report the complete mitochondrial genome (mitogenome) of Tarragoilus diuturnus Gorochov, 2001, representing the first mitogenome of the superfamily Hagloidea. The size of the entire mitogenome of T. diuturnus is 16144 bp, containing 13 protein-coding genes (PCGs), 2 ribosomal RNA (rRNA) genes, 22 transfer RNA (tRNA) genes and one control region. The order and orientation of the gene arrangement pattern is identical to that of D. yakuba and most ensiferan species. A phylogenomic analysis was carried out based on the concatenated dataset of 13 PCGs and 2 rRNA genes from mitogenome sequences of 15 ensiferan species, comprising four superfamilies: Grylloidea, Tettigonioidea, Rhaphidophoroidea and Hagloidea. Both maximum likelihood and Bayesian inference analyses strongly support Hagloidea T. diuturnus and Rhaphidophoroidea Troglophilus neglectus as forming a monophyletic group, sister to the Tettigonioidea. The relationships among four superfamilies of Ensifera were (Grylloidea, (Tettigonioidea, (Hagloidea, Rhaphidophoroidea))). PMID:24465850
Symbiosis in eukaryotic evolution.
López-García, Purificación; Eme, Laura; Moreira, David
2017-12-07
Fifty years ago, Lynn Margulis, drawing inspiration from early twentieth-century ideas that put forward a symbiotic origin for some eukaryotic organelles, proposed a unified theory for the origin of the eukaryotic cell based on symbiosis as an evolutionary mechanism. Margulis was profoundly aware of the importance of symbiosis in the natural microbial world and anticipated the evolutionary significance that integrated cooperative interactions might have as a mechanism to increase cellular complexity. Today, we have started to fully appreciate the vast extent of microbial diversity and the importance of syntrophic metabolic cooperation in natural ecosystems, especially in sediments and microbial mats. Moreover, not only has the symbiogenetic origin of mitochondria and chloroplasts been clearly demonstrated, but improvements in phylogenomic methods combined with recent discoveries of archaeal lineages more closely related to eukaryotes further support the symbiogenetic origin of the eukaryotic cell. Margulis left us the legacy of the idea of 'eukaryogenesis by symbiogenesis'. Although this has been largely verified, when, where, and specifically how eukaryotic cells evolved remain unclear. Here, we briefly review current knowledge about symbiotic interactions in the microbial world and their evolutionary impact, the status of eukaryogenetic models and the current challenges and perspectives ahead to reconstruct the evolutionary path to eukaryotes. Copyright © 2017 Elsevier Ltd. All rights reserved.
Streicher, Jeffrey W; Wiens, John J
2017-09-01
Squamate reptiles (lizards and snakes) are the most diverse group of terrestrial vertebrates, with more than 10 000 species. Despite considerable effort to resolve relationships among major squamate clades, some branches have remained difficult. Among the most vexing has been the placement of snakes among lizard families, with most studies yielding only weak support for the position of snakes. Furthermore, the placement of iguanian lizards has remained controversial. Here we used targeted sequence capture to obtain data from 4178 nuclear loci from ultraconserved elements from 32 squamate taxa (and five outgroups) including representatives of all major squamate groups. Using both concatenated and species-tree methods, we recover strong support for a sister relationship between iguanian and anguimorph lizards, with snakes strongly supported as the sister group of these two clades. These analyses strongly resolve the difficult placement of snakes within squamates and show overwhelming support for the contentious position of iguanians. More generally, we provide a strongly supported hypothesis of higher-level relationships in the most species-rich tetrapod clade using coalescent-based species-tree methods and approximately 100 times more loci than previous estimates. © 2017 The Author(s).
[Multiplexing mapping of human cDNAs]. Final report, September 1, 1991--February 28, 1994
DOE Office of Scientific and Technical Information (OSTI.GOV)
Not Available
Using PCR with automated product analysis, 329 human brain cDNA sequences have been assigned to individual human chromosomes. Primers were designed from single-pass cDNA sequences, expressed sequence tags (ESTs). Primers were used in PCR reactions with DNA from somatic cell hybrid mapping panels as templates, often with multiplexing. Many of the mapped ESTs match sequence database records. To evaluate these matches, the position of the primers relative to the matching region (In), the BLAST scores and the Poisson probability values of the EST/sequence record match were determined. In cases where the gene product stringently identified by the sequence match had already been mapped, the gene locus determined by EST was consistent with the previous position, which strongly supports the validity of assigning unknown genes to human chromosomes based on the EST sequence matches. In the present cases, mapping the ESTs to a chromosome can also be considered to have mapped the known gene product: rolipram-sensitive cAMP phosphodiesterase, chromosome 1; protein phosphatase 2Aβ, chromosome 4; alpha-catenin, chromosome 5; the ELE1 oncogene, chromosome 10q11.2 or q2.1-q23; MXII protein, chromosome 10q24-qter; ribosomal protein L18a homologue, chromosome 14; ribosomal protein L3, chromosome 17; and moesin, Xp11-cen. There were also ESTs mapped that were closely related to non-human sequence records. These matches therefore can be considered to identify human counterparts of known gene products, or members of known gene families. Examples of these include membrane proteins, translation-associated proteins, structural proteins, and enzymes. These data then demonstrate that single-pass sequence information is sufficient to design PCR primers useful for assigning cDNA sequences to human chromosomes. 
When the EST sequence matches previous sequence database records, the chromosome assignments of the EST can be used to make preliminary assignments of the human gene to a chromosome.
Yuan, Zhaohe; Fang, Yanming; Zhang, Taikui; Fei, Zhangjun; Han, Fengming; Liu, Cuiyu; Liu, Min; Xiao, Wei; Zhang, Wenjing; Wu, Shan; Zhang, Mengwei; Ju, Youhui; Xu, Huili; Dai, He; Liu, Yujun; Chen, Yanhui; Wang, Lili; Zhou, Jianqing; Guan, Dian; Yan, Ming; Xia, Yanhua; Huang, Xianbin; Liu, Dongyuan; Wei, Hongmin; Zheng, Hongkun
2017-12-22
Pomegranate (Punica granatum L.) has an ancient cultivation history and has become an emerging profitable fruit crop due to its attractive features such as the bright red appearance and the high abundance of medicinally valuable ellagitannin-based compounds in its peel and aril. However, the limited genomic resources have restricted further elucidation of genetics and evolution of these interesting traits. Here, we report a 274-Mb high-quality draft pomegranate genome sequence, which covers approximately 81.5% of the estimated 336-Mb genome, consists of 2177 scaffolds with an N50 size of 1.7 Mb and contains 30 903 genes. Phylogenomic analysis supported that pomegranate belongs to the Lythraceae family rather than the monogeneric Punicaceae family, and comparative analyses showed that pomegranate and Eucalyptus grandis share the paleotetraploidy event. Integrated genomic and transcriptomic analyses provided insights into the molecular mechanisms underlying the biosynthesis of ellagitannin-based compounds, the colour formation in both peels and arils during pomegranate fruit development, and the unique ovule development processes that are characteristic of pomegranate. This genome sequence provides an important resource to expand our understanding of some unique biological processes and to facilitate both comparative biology studies and crop breeding. © 2017 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.
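The assembly statistics quoted above (an N50 of 1.7 Mb and 81.5% coverage of the estimated genome) follow standard definitions, sketched here in Python under stated assumptions. The scaffold lengths are a hypothetical toy list, not the pomegranate scaffolds, and the function name is illustrative; only the 274-Mb and 336-Mb figures come from the abstract.

```python
def n50(lengths):
    """Return the N50: the length L such that scaffolds of length >= L
    together contain at least half of the total assembled bases."""
    if not lengths:
        raise ValueError("need at least one scaffold length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assembly (arbitrary units), used only to show the mechanics.
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = n50(toy_scaffolds)

# Coverage of the estimated genome size, using the abstract's figures in Mb.
coverage_percent = round(274 / 336 * 100, 1)

print(toy_n50, coverage_percent)
```

The coverage line reproduces the "approximately 81.5%" stated in the abstract; the N50 itself cannot be recomputed without the real scaffold lengths.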
QuEST: Qualifying Environmentally Sustainable Technologies. Volume 6
NASA Technical Reports Server (NTRS)
Lewis, Pattie
2011-01-01
QuEST is a publication of the NASA Technology Evaluation for Environmental Risk Mitigation Principal Center (TEERM). This issue contains brief articles on: Risk Identification and Mitigation, Material Management and Substitution Efforts--Hexavalent Chrome-free Coatings and Low volatile organic compounds (VOCs) Coatings, Lead-Free Electronics, Corn-Based Depainting Media; Alternative Energy Efforts Hydrogen Sensors and Solar Air Conditioning. Other TEERM Efforts include: Energy and Water Management and Remediation Technology Collaboration.
Zhang, Jinpeng; Liu, Weihua; Lu, Yuqing; Liu, Qunxing; Yang, Xinming; Li, Xiuquan; Li, Lihui
2017-09-20
Agropyron cristatum is a wild grass of the tribe Triticeae and serves as a gene donor for wheat improvement. However, very few markers can be used to monitor A. cristatum chromatin introgressions in wheat. Here, we report a resource of large-scale molecular markers for tracking alien introgressions in wheat based on transcriptome sequences. By aligning A. cristatum unigenes with the Chinese Spring reference genome sequences, we designed 9602 A. cristatum expressed sequence tag-sequence-tagged site (EST-STS) markers for PCR amplification and experimental screening. As a result, 6063 polymorphic EST-STS markers were specific for the A. cristatum P genome in the single-recipient wheat background. A total of 4956 randomly selected polymorphic EST-STS markers were further tested in eight wheat variety backgrounds, and 3070 markers displaying stable and polymorphic amplification were validated. These markers covered more than 98% of the A. cristatum genome, and the marker distribution density was approximately 1.28 cM. An application case of all EST-STS markers was validated on the A. cristatum 6P chromosome. These markers were successfully applied in the tracking of alien A. cristatum chromatin. Altogether, this study provides a universal method of large-scale molecular marker development to monitor wild relative chromatin in wheat.
Obstructive apnea hypopnea index estimation by analysis of nocturnal snoring signals in adults.
Ben-Israel, Nir; Tarasiuk, Ariel; Zigel, Yaniv
2012-09-01
To develop a whole-night snore sounds analysis algorithm enabling estimation of obstructive apnea hypopnea index (AHI(EST)) among adult subjects. Snore sounds were recorded using a directional condenser microphone placed 1 m above the bed. Acoustic features exploring intra-(mel-cepstability, pitch density) and inter-(running variance, apnea phase ratio, inter-event silence) snore properties were extracted and integrated to assess AHI(EST). University-affiliated sleep-wake disorder center and biomedical signal processing laboratory. Ninety subjects (age 53 ± 13 years, BMI 31 ± 5 kg/m²) referred for polysomnography (PSG) diagnosis of OSA were prospectively and consecutively recruited. The system was trained and tested on 60 subjects. Validation was blindly performed on the additional 30 consecutive subjects. AHI(EST) correlated with AHI (AHI(PSG); r² = 0.81, P < 0.001). Area under the receiver operating characteristic curve of 85% and 92% for thresholds of 10 and 20 events/h, respectively, were obtained for OSA detection. Both Altman-Bland analysis and diagnostic agreement criteria revealed 80% and 83% agreements of AHI(EST) with AHI(PSG), respectively. Acoustic analysis based on intra- and inter-snore properties can differentiate subjects according to AHI. An acoustic-based screening system may address the growing needs for reliable OSA screening tool. Further studies are needed to support these findings.
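The agreement analysis mentioned above can be sketched with the standard Bland-Altman computation: the mean difference between paired measurements (bias) and its 95% limits of agreement. The snippet below is a minimal illustration in Python, not the authors' code. The paired values are hypothetical, and the abstract does not say how its agreement percentages were derived, so the sketch stops at the bias and limits.

```python
from statistics import mean, stdev


def bland_altman(estimates, references):
    """Return (bias, lower limit, upper limit) for paired measurements.

    Bias is the mean difference; the limits are bias +/- 1.96 sample
    standard deviations of the differences.
    """
    if len(estimates) != len(references) or len(estimates) < 2:
        raise ValueError("need two equally long series with at least two pairs")
    diffs = [e - r for e, r in zip(estimates, references)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical AHI values in events/h (acoustic estimate vs. PSG reference).
ahi_est = [12.0, 25.0, 8.0, 40.0]
ahi_psg = [10.0, 27.0, 9.0, 38.0]

bias, lower, upper = bland_altman(ahi_est, ahi_psg)
print(bias, lower, upper)
```

A pair of measurements falls inside the agreement band when its difference lies between the lower and upper limits.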
Zhang, Gu-wen; Xu, Sheng-chun; Mao, Wei-hua; Hu, Qi-zan; Gong, Ya-ming
2013-01-01
The development of expressed sequence tag-derived simple sequence repeats (EST-SSRs) provided a useful tool for investigating plant genetic diversity. In the present study, 22 polymorphic EST-SSRs from grain soybean were identified and used to assess the genetic diversity in 48 vegetable soybean accessions. Among the 22 EST-SSR loci, tri-nucleotides were the most abundant repeats, accounting for 50.00% of the total motifs. GAA was the most common motif among tri-nucleotide repeats, with a frequency of 18.18%. Polymorphic analysis identified a total of 71 alleles, with an average of 3.23 per locus. The polymorphism information content (PIC) values ranged from 0.144 to 0.630, with a mean of 0.386. Observed heterozygosity (Ho) values varied from 0.0196 to 1.0000, with an average of 0.6092, while the expected heterozygosity (He) values ranged from 0.1502 to 0.6840, with a mean value of 0.4616. Principal coordinate analysis and phylogenetic tree analysis indicated that the accessions could be assigned to different groups based to a large extent on their geographic distribution, and most accessions from China were clustered into the same groups. These results suggest that Chinese vegetable soybean accessions have a narrow genetic base. The results of this study indicate that EST-SSRs from grain soybean have high transferability to vegetable soybean, and that these new markers would be helpful in taxonomy, molecular breeding, and comparative mapping studies of vegetable soybean in the future. PMID:23549845
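The per-locus statistics reported above can be computed from allele frequencies. The Python sketch below uses the common textbook definitions: expected heterozygosity as one minus the sum of squared allele frequencies, and PIC following Botstein et al. (1980). It is an illustration only. The abstract does not say whether a sample-size correction was applied, the function names are illustrative, and the allele frequencies are hypothetical.

```python
from itertools import combinations


def _check(freqs):
    # Allele frequencies at one locus must be non-negative and sum to 1.
    if any(p < 0 for p in freqs) or abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must be non-negative and sum to 1")


def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2), without small-sample correction."""
    _check(freqs)
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    _check(freqs)
    homozygous = sum(p * p for p in freqs)
    cross = sum(2.0 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygous - cross


# Hypothetical locus with three alleles.
locus = [0.5, 0.3, 0.2]
print(round(expected_heterozygosity(locus), 4), round(pic(locus), 4))
```

For any locus, PIC is at most He because the cross term is non-negative, which is consistent with the mean PIC (0.386) being lower than the mean He (0.4616) in the abstract.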
Chen, Qi; Luan, Zheng-Jiao; Yu, Hui-Lei; Cheng, Xiaolin; Xu, Jian-He
2015-11-01
A new carboxylic esterase RhEst1 which catalyzes the hydrolysis of (S)-(+)-2,2-dimethylcyclopropanecarboxylate (S-DmCpCe), the key chiral building block of cilastatin, was identified and subsequently crystallized in our previous work. Mutant RhEst1A147I/V148F/G254A was found to show a 5-fold increase in the catalytic activity. In this work, molecular dynamic simulations were performed to elucidate the molecular determinant of the enzyme activity. Our simulations show that the substrate binds much more strongly in the A147I/V148F/G254A mutant than in wild type, with more hydrogen bonds formed between the substrate and the catalytic triad and the oxyanion hole. The OH group of the catalytic residue Ser101 in the mutant is better positioned to initiate the nucleophilic attack on S-DmCpCe. Interestingly, the "170-179" loop which is involved in shaping the catalytic sites and facilitating the product release shows remarkable dynamic differences in the two systems. Based on the simulation results, six residues were identified as potential "hot-spots" for further experimental testing. Consequently, the G126S and R133L mutants show higher catalytic efficiency as compared with the wild type. This work provides molecular-level insights into the substrate binding mechanism of carboxylic esterase RhEst1, facilitating future experimental efforts toward developing more efficient RhEst1 variants for industrial applications. Copyright © 2015 Elsevier Inc. All rights reserved.
Introgression of the Kinetoplast DNA: An Unusual Evolutionary Journey in Trypanosoma cruzi.
Tomasini, Nicolás
2018-02-01
Phylogenetic relationships between different lineages of Trypanosoma cruzi, the agent of Chagas disease, have been controversial for several years. Recent phylogenetic and phylogenomic analyses have clarified the nuclear relationships among these lineages, but incongruence between nuclear and kinetoplast DNA phylogenies has emerged as a new challenge. This incongruence implies several events of mitochondrial introgression at the evolutionary level. However, the mechanism that gave rise to the introgressed lineages is unknown. Here, I will review and discuss how maxicircles of the kinetoplast were horizontally and vertically transferred between different lineages of T. cruzi. Finally, I will discuss what we know - and what we don't - about kDNA transfer and inheritance in the context of sexual reproduction in this parasite.
Ramchiary, Nirala; Nguyen, Van Dan; Li, Xiaonan; Hong, Chang Pyo; Dhandapani, Vignesh; Choi, Su Ryun; Yu, Ge; Piao, Zhong Yun; Lim, Yong Pyo
2011-01-01
Genic microsatellite markers, also known as functional markers, are preferred over anonymous markers as they reveal the variation in transcribed genes among individuals. In this study, we developed a total of 707 expressed sequence tag-derived simple sequence repeat markers (EST-SSRs) and used them to develop a high-density integrated map from four individual mapping populations of B. rapa. This map contains a total of 1426 markers, consisting of 306 EST-SSRs, 153 intron polymorphic markers, 395 bacterial artificial chromosome-derived SSRs (BAC-SSRs), and 572 public SSRs and other markers, covering a total distance of 1245.9 cM of the B. rapa genome. Analysis of allelic diversity in 24 B. rapa germplasm using 234 mapped EST-SSR markers showed that the majority of EST-SSRs amplified 2 alleles, although the number of alleles amplified ranged from 2 to 8. Transferability analysis of 167 EST-SSRs in 35 species belonging to cultivated and wild brassica relatives showed 42.51% (Sysimprium leteum) to 100% (B. carinata, B. juncea, and B. napus) amplification. Our newly developed EST-SSRs and high-density linkage map based on highly transferable genic markers would facilitate the molecular mapping of quantitative trait loci and the positional cloning of specific genes, in addition to marker-assisted selection and comparative genomic studies of B. rapa with other related species. PMID:21768136
Multiple thoracic actinomycosis in an immunocompetent patient
Msougar, Yassine; Fenane, Hicham; Maidi, Mehdi; Benosman, Abdellatif
2013-01-01
Actinomycosis is a granulomatous, suppurative, extensive and chronic bacterial infection caused by the Gram-positive anaerobic bacterium Actinomyces israelii. Thoracic localization is rare and can mimic a neoplastic disease or tuberculosis. We report the case of a 54-year-old patient with no prior medical history who presented with two right lower-thoracic chest wall swellings, one anterior and one posterior, accompanied by a deterioration in general condition. Clinical examination and radiological work-up showed two chest wall masses and right basal parenchymal involvement. Histopathological examination of the biopsy of the anterior mass showed foci of actinomycosis, establishing the diagnosis of thoracopulmonary actinomycosis. An immunological work-up was normal. The patient was then started on antibiotic treatment with protected amoxicillin (amoxicillin with a beta-lactamase inhibitor), with good clinical and radiological outcome. The aim of this case report is to review the radiological, clinical, histological, therapeutic and evolutionary aspects of this condition, as well as its diagnostic difficulties. PMID:24672630
PipeOnline 2.0: automated EST processing and functional data sorting.
Ayoubi, Patricia; Jin, Xiaojing; Leite, Saul; Liu, Xianghui; Martajaja, Jeson; Abduraham, Abdurashid; Wan, Qiaolan; Yan, Wei; Misawa, Eduardo; Prade, Rolf A
2002-11-01
Expressed sequence tags (ESTs) are generated and deposited in the public domain, as redundant, unannotated, single-pass reactions, with virtually no biological content. PipeOnline automatically analyses and transforms large collections of raw DNA-sequence data from chromatograms or FASTA files by calling the quality of bases, screening and removing vector sequences, assembling and rewriting consensus sequences of redundant input files into a unigene EST data set and finally through translation, amino acid sequence similarity searches, annotation of public databases and functional data. PipeOnline generates an annotated database, retaining the processed unigene sequence, clone/file history, alignments with similar sequences, and proposed functional classification, if available. Functional annotation is automatic and based on a novel method that relies on homology of amino acid sequence multiplicity within GenBank records. Records are examined through a function ordered browser or keyword queries with automated export of results. PipeOnline offers customization for individual projects (MyPipeOnline), automated updating and alert service. PipeOnline is available at http://stress-genomics.org.
Theoretical Analysis of the Electron Spiral Toroid Concept
NASA Technical Reports Server (NTRS)
Cambier, Jean-Luc; Micheletti, David A.; Bushnell, Dennis M. (Technical Monitor)
2000-01-01
This report describes the analysis of the Electron Spiral Toroid (EST) concept being promoted by Electron Power Systems Inc. (EPS). The EST is described as a toroidal plasma structure composed of ion and electron shells. It is claimed that the EST requires little or no external confinement, despite the extraordinarily large energy densities resulting from the self-generating magnetic fields. The present analysis is based upon documentation made available by EPS, a previous description of the model by the Massachusetts Institute of Technology (MIT), and direct discussions with EPS and MIT. It is found that claims of absolute stability and large energy storage capacities of the EST concept have not been substantiated. Notably, it can be demonstrated that the ion fluid is fundamentally unstable. Although various scenarios for ion confinement were subsequently suggested by EPS and MIT, none were found to be plausible. Although the experimental data do not prove the existence of EST configurations, there is undeniable experimental evidence that some type of plasma structures, whose characteristics remain to be determined, are observed. However, more realistic theoretical models must first be developed to explain their existence and properties before applications of interest to NASA can be assessed and developed.
Rudd, Stephen
2005-01-01
The public expressed sequence tag collections are continually being enriched with high-quality sequences that represent an ever-expanding range of taxonomically diverse plant species. While these sequence collections provide biased insight into the populations of expressed genes available within individual species and their associated tissues, the information is conceivably of wider relevance in a comparative context. When we consider the available expressed sequence tag (EST) collections of summer 2004, most of the major plant taxonomic clades are at least superficially represented. Investigation of the five million available plant ESTs provides a wealth of information that has applications in modelling the routes of plant genome evolution and the identification of lineage-specific genes and gene families. Over four million ESTs from over 50 distinct plant species have been collated within an EST analysis pipeline called openSputnik. The ESTs were resolved down into approximately one million unigene sequences. These have been annotated using orthology-based annotation transfer from reference plant genomes and using a variety of contemporary bioinformatics methods to assign peptide, structural and functional attributes. The openSputnik database is available at http://sputnik.btk.fi.
2010-01-01
Background Genetic markers and linkage mapping are basic prerequisites for marker-assisted selection and map-based cloning. In the case of the key grassland species Lolium spp., numerous mapping populations have been developed and characterised for various traits. Although some genetic linkage maps of these populations have been aligned with each other using publicly available DNA markers, the number of common markers among genetic maps is still low, limiting the ability to compare candidate gene and QTL locations across germplasm. Results A set of 204 expressed sequence tag (EST)-derived simple sequence repeat (SSR) markers has been assigned to map positions using eight different ryegrass mapping populations. Marker properties of a subset of 64 EST-SSRs were assessed in six to eight individuals of each mapping population and revealed 83% of the markers to be polymorphic in at least one population and an average number of alleles of 4.88. EST-SSR markers polymorphic in multiple populations served as anchor markers and allowed the construction of the first comprehensive consensus map for ryegrass. The integrated map was complemented with 97 SSRs from previously published linkage maps and finally contained 284 EST-derived and genomic SSR markers. The total map length was 742 centiMorgan (cM), ranging for individual chromosomes from 70 cM of linkage group (LG) 6 to 171 cM of LG 2. Conclusions The consensus linkage map for ryegrass based on eight mapping populations and constructed using a large set of publicly available Lolium EST-SSRs mapped for the first time together with previously mapped SSR markers will allow for consolidating existing mapping and QTL information in ryegrass. Map and markers presented here will prove to be an asset in the development for both molecular breeding of ryegrass as well as comparative genetics and genomics within grass species. PMID:20712870
2009-01-01
Background Expressed sequence tags (ESTs) are an important source of gene-based markers such as those based on insertion-deletions (Indels) or single-nucleotide polymorphisms (SNPs). Several gel-based methods have been reported for the detection of sequence variants; however, they have not been widely exploited in common bean, an important legume crop of the developing world. The objectives of this project were to develop and map EST-based markers using analysis of single strand conformation polymorphisms (SSCPs), to create a transcript map for common bean and to compare synteny of the common bean map with sequenced chromosomes of other legumes. Results A set of 418 EST-based amplicons were evaluated for parental polymorphisms using the SSCP technique and 26% of these presented a clear conformational or size polymorphism between Andean and Mesoamerican genotypes. The amplicon-based markers were then used for genetic mapping with segregation analysis performed in the DOR364 × G19833 recombinant inbred line (RIL) population. A total of 118 new marker loci were placed into an integrated molecular map for common bean consisting of 288 markers. Of these, 218 were used for synteny analysis and 186 presented homology with segments of the soybean genome with an e-value lower than 7 × 10⁻¹². The synteny analysis with soybean showed a mosaic pattern of syntenic blocks with most segments of any one common bean linkage group associated with two soybean chromosomes. The analysis with Medicago truncatula and Lotus japonicus presented fewer syntenic regions, consistent with the more distant phylogenetic relationship between the galegoid and phaseoloid legumes. Conclusion The SSCP technique is a useful and inexpensive alternative to other SNP or Indel detection techniques for saturating the common bean genetic map with functional markers that may be useful in marker assisted selection. 
In addition, the genetic markers based on ESTs allowed the construction of a transcript map and given their high conservation between species allowed synteny comparisons to be made to sequenced genomes. This synteny analysis may support positional cloning of target genes in common bean through the use of genomic information from these other legumes. PMID:20030833
Xi, Zhenxiang; Liu, Liang; Davis, Charles C
2015-11-01
The development and application of coalescent methods are undergoing rapid changes. One little explored area that bears on the application of gene-tree-based coalescent methods to species tree estimation is gene informativeness. Here, we investigate the accuracy of these coalescent methods when genes have minimal phylogenetic information, including the implementation of the multilocus bootstrap approach. Using simulated DNA sequences, we demonstrate that genes with minimal phylogenetic information can produce unreliable gene trees (i.e., high error in gene tree estimation), which may in turn reduce the accuracy of species tree estimation using gene-tree-based coalescent methods. We demonstrate that this problem can be alleviated by sampling more genes, as is commonly done in large-scale phylogenomic analyses. This applies even when these genes are minimally informative. If gene tree estimation is biased, however, gene-tree-based coalescent analyses will produce inconsistent results, which cannot be remedied by increasing the number of genes. In this case, it is not the gene-tree-based coalescent methods that are flawed, but rather the input data (i.e., estimated gene trees). Along these lines, the commonly used program PhyML has a tendency to infer one particular bifurcating topology even though it is best represented as a polytomy. We additionally corroborate these findings by analyzing the 183-locus mammal data set assembled by McCormack et al. (2012) using ultra-conserved elements (UCEs) and flanking DNA. Lastly, we demonstrate that when employing the multilocus bootstrap approach on this 183-locus data set, there is no strong conflict between species trees estimated from concatenation and gene-tree-based coalescent analyses, as has been previously suggested by Gatesy and Springer (2014). Copyright © 2015 Elsevier Inc. All rights reserved.
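The multilocus bootstrap mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes the usual two-stage scheme (resample loci with replacement, then resample alignment columns within each drawn locus); the function name `multilocus_bootstrap` and the dict-based alignment layout are hypothetical and are not taken from the paper.

```python
import random


def multilocus_bootstrap(alignments, rng):
    """Return one two-stage bootstrap replicate (illustrative sketch).

    alignments: list of loci; each locus is a dict {taxon: sequence}
                whose sequences all have the same length.
    rng:        a random.Random instance, passed in for reproducibility.
    """
    replicate = []
    for _ in range(len(alignments)):
        # Stage 1: draw a locus with replacement.
        locus = rng.choice(alignments)
        n_sites = len(next(iter(locus.values())))
        # Stage 2: draw alignment columns with replacement within that locus.
        cols = [rng.randrange(n_sites) for _ in range(n_sites)]
        replicate.append(
            {taxon: "".join(seq[c] for c in cols) for taxon, seq in locus.items()}
        )
    return replicate


# Toy input (made-up sequences, two loci, two taxa).
toy_loci = [{"A": "ACGT", "B": "ACGA"}, {"A": "GG", "B": "GC"}]
toy_replicate = multilocus_bootstrap(toy_loci, random.Random(0))
```

Each replicate would then be passed to gene tree estimation and a gene-tree-based coalescent method; that step is outside this sketch.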
New mercury-based superconductors with high critical temperature
NASA Astrophysics Data System (ADS)
Michel, C.; Hervieu, M.; Martin, C.; Maignan, A.; Pelloquin, D.; Goutenoire, F.; Huvé, M.; Raveau, B.
1994-11-01
Structural and superconducting properties of new cation-substituted mercury-based oxides are described. They are mainly characterized by $[\mathrm{Hg}_{1-x}\mathrm{M}_x\mathrm{O}_\delta]_\infty$ monolayers; however, a compound with a doubled $[\mathrm{Hg}_{1-x}\mathrm{M}_x\mathrm{O}_\delta]_\infty$ layer is described for the first time. Critical temperatures vary over a wide range, $0 \leq T_c \leq 130$ K, but they are lower than those of the related pure mercury oxides. The influence of annealing in various atmospheres on $T_c$ is discussed.
Mining of haplotype-based expressed sequence tag single nucleotide polymorphisms in citrus
2013-01-01
Background Single nucleotide polymorphisms (SNPs), the most abundant variations in a genome, have been widely used in various studies. Detection and characterization of citrus haplotype-based expressed sequence tag (EST) SNPs will greatly facilitate further utilization of these gene-based resources. Results In this paper, haplotype-based SNPs were mined from publicly available citrus expressed sequence tags (ESTs) from different citrus cultivars (genotypes), individually and collectively, for comparison. There were a total of 567,297 ESTs belonging to 27 cultivars in varying numbers, consequently yielding different numbers of haplotype-based quality SNPs. Sweet orange (SO) had the most ESTs (213,830), generating 11,182 quality SNPs in 3,327 out of 4,228 usable contigs. Summed over all the individual mining results, a total of 25,417 quality SNPs were discovered: 15,010 (59.1%) were transitions (AG and CT), 9,114 (35.9%) were transversions (AC, GT, CG, and AT), and 1,293 (5.0%) were insertion/deletions (indels). A vast majority of SNP-containing contigs consisted of only 2 haplotypes, as expected, but the percentages of 2-haplotype contigs varied widely among these citrus cultivars. BLAST of the 25,417 25-mer SNP oligos against the Clementine reference genome scaffolds revealed that 2,947 SNPs had “no hits found”, 19,943 had 1 unique hit/alignment, 1,571 had one hit and 2+ alignments per hit, and 956 had 2+ hits and 1+ alignment per hit. Of the total 24,293 scaffold hits, 23,955 (98.6%) were on the main scaffolds 1 to 9, and only 338 were on 87 minor scaffolds. Most alignments had 100% (25/25) or 96% (24/25) nucleotide identities, accounting for 93% of all the alignments. Considering that almost all the nucleotide discrepancies in the 24/25 alignments were at the SNP sites, this served well as in silico validation of these SNPs, in addition to and consistent with the rate (81%) validated by sequencing and SNaPshot assay. 
Conclusions High-quality EST-SNPs from different citrus genotypes were detected, and compared to estimate the heterozygosity of each genome. All the SNP oligo sequences were aligned with the Clementine citrus genome to determine their distribution and uniqueness and for in silico validation, in addition to SNaPshot and sequencing validation of selected SNPs. PMID:24175923
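The transition/transversion/indel breakdown reported in this abstract follows the standard purine/pyrimidine rule, which a short Python sketch can make concrete. The function name `classify_variant` and the use of "-" to mark a gap are assumptions made for illustration; only the three counts are taken from the abstract.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_variant(ref, alt):
    """Classify a two-allele variant as transition, transversion, or indel.

    A gap character "-" in either allele is treated as an indel
    (an assumed convention, not one stated in the abstract).
    """
    ref, alt = ref.upper(), alt.upper()
    if ref == "-" or alt == "-":
        return "indel"
    if ref == alt:
        raise ValueError("alleles are identical; not a variant")
    pair = {ref, alt}
    # Purine<->purine (A/G) or pyrimidine<->pyrimidine (C/T) is a transition.
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    # Any purine<->pyrimidine change (A/C, G/T, C/G, A/T) is a transversion.
    return "transversion"


# Counts as reported in the abstract; their sum matches the stated total.
reported = {"transition": 15010, "transversion": 9114, "indel": 1293}
total_snps = sum(reported.values())  # 25417
```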
Optimization of vertical trajectories using the harmony search method
NASA Astrophysics Data System (ADS)
Ruby, Margaux
In the face of global warming, solutions for reducing CO2 emissions are urgently needed. Trajectory optimization is one way to reduce fuel consumption during a flight. Various algorithms have been developed to determine an aircraft's optimal trajectory. Their goal is to minimize the total cost of a flight, which is directly tied to fuel consumption and flight time. Another parameter, called the cost index, is also included in the definition of flight cost. Fuel consumption is supplied through performance data for each flight phase. This thesis studies the phases of a complete flight: a climb phase, a cruise phase, and a descent phase. "Step climbs", defined as climbs of 2,000 ft during the cruise phase, are also studied. The algorithm developed in this thesis is a metaheuristic called harmony search, which combines two types of search: local search and population-based search. It is based on the observation of musicians during a concert, or more precisely on the ability of the music to find its best harmony, which in optimization terms is the lowest cost. Various inputs, including the aircraft weight, the destination, the initial aircraft speed, and the number of iterations, must be supplied to the algorithm so that it can determine the optimal solution, defined as [climb speed, altitude, cruise speed, descent speed]. The algorithm was developed in MATLAB and tested for several destinations and several weights for a single aircraft type. 
For validation, the results obtained with this algorithm were first compared with those of an exhaustive search that used every possible combination. The exhaustive search provided the global optimum; the algorithm's solution must therefore come as close as possible to that of the exhaustive search to show that it gives results near the global optimum. A second comparison was made between the results of the algorithm and those of the Flight Management System (FMS), an avionics system located in the cockpit that provides the route to follow in order to optimize the trajectory. The aim is to show that the harmony search algorithm gives better results than the algorithm implemented in the FMS.
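The harmony search metaheuristic named in this abstract can be sketched generically in Python (the thesis itself used MATLAB). This is a textbook-style discrete harmony search, not the author's implementation: the parameter names (`hmcr`, `par`, `memory_size`), the value grids, and the `toy_cost` function are all placeholders, and the cost is not an aircraft performance model.

```python
import random


def harmony_search(cost, domains, iterations=2000, memory_size=10,
                   hmcr=0.9, par=0.3, rng=None):
    """Minimize cost(h) over discrete domains with a basic harmony search."""
    rng = rng or random.Random()
    # Harmony memory: a small population of random candidate solutions.
    memory = [[rng.choice(d) for d in domains] for _ in range(memory_size)]
    costs = [cost(h) for h in memory]
    for _ in range(iterations):
        new = []
        for j, domain in enumerate(domains):
            if rng.random() < hmcr:
                # Memory consideration: reuse a value from a stored harmony.
                value = rng.choice(memory)[j]
                if rng.random() < par:
                    # Pitch adjustment: step to a neighbouring grid value.
                    idx = domain.index(value) + rng.choice((-1, 1))
                    value = domain[min(max(idx, 0), len(domain) - 1)]
            else:
                # Random selection from the whole domain.
                value = rng.choice(domain)
            new.append(value)
        new_cost = cost(new)
        worst = max(range(memory_size), key=costs.__getitem__)
        # Replace the worst stored harmony if the new one is better.
        if new_cost < costs[worst]:
            memory[worst], costs[worst] = new, new_cost
    best = min(range(memory_size), key=costs.__getitem__)
    return memory[best], costs[best]


# Hypothetical grids for [climb speed (kt), altitude (ft), cruise Mach, descent speed (kt)].
domains = [
    list(range(250, 331, 10)),
    list(range(28000, 40001, 2000)),
    [0.70, 0.72, 0.74, 0.76, 0.78, 0.80, 0.82],
    list(range(250, 331, 10)),
]


def toy_cost(h):
    """Placeholder cost: squared distance to an arbitrary target point."""
    target = (290, 36000, 0.78, 280)
    scale = (10, 2000, 0.02, 10)
    return sum(((v - t) / s) ** 2 for v, t, s in zip(h, target, scale))


best, best_cost = harmony_search(toy_cost, domains, rng=random.Random(1))
```

Because the grids are small, the result of such a sketch can be checked against an exhaustive search over all combinations, which mirrors the first validation step described in the abstract.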
Adaptive sound field synthesis
NASA Astrophysics Data System (ADS)
Gauthier, Philippe-Aubert
Sound field reproduction is a physical approach to the technological problem of sound spatialization. This thesis concerns the physical aspect of sound field reproduction. The main objective is to improve sound field reproduction by Wave Field Synthesis (WFS), a known approach based on free-field assumptions, by means of active control, adding reproduction-error sensors and a closed loop. A first technical chapter (chapter 4) presents the results of an objective evaluation of WFS through simulations and experimental measurements. The undesirable effect of the reproduction room on the objective qualities of WFS was illustrated. A first research question was then addressed (chapter 5), namely whether progressive fields can be reproduced in a room within a physical active-control paradigm; this possibility was demonstrated. The preferred technical approach, Adaptive Wave Field Synthesis (AWFS), was defined and then simulated (chapter 6). The AWFS approach is original in both active control and sound field reproduction: the quadratic cost function representing the minimization of the reproduction errors includes a Tikhonov regularization with an a priori solution that comes from WFS. Studying AWFS with the singular value decomposition (chapter 7) made it possible to understand the mechanisms specific to AWFS; this is the second main original contribution of the thesis. The FXLMS algorithm (filtered-reference LMS) is modified for AWFS (chapter 8). The decoupling of the system by singular value decomposition is illustrated in the signal-processing domain, and AWFS based on independent control of the radiation modes is simulated (chapter 8); this constitutes the third main original contribution of the thesis. 
These signal-processing simulations show the effectiveness of the algorithms and the ability of AWFS to attenuate errors attributable to acoustic reflections. The ninth chapter presents experimental AWFS results. The objective was to validate the method and evaluate the performance of AWFS. Another promising algorithm is also tested. The results demonstrate that AWFS and the tested algorithms work properly. For the reproduction of both harmonic and broadband fields, AWFS reduces the reproduction error of WFS and the undesirable effects caused by the reproduction venue.
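The cost function described in this abstract (quadratic reproduction error plus Tikhonov regularization toward an a priori WFS solution) can be written out as a sketch. The notation is assumed here and is not taken from the thesis: $\mathbf{p}_d$ is the target pressure at the error sensors, $\mathbf{G}$ the transfer matrix from sources to sensors, $\mathbf{q}$ the source strengths, $\mathbf{q}_{\mathrm{WFS}}$ the WFS solution, and $\lambda > 0$ the regularization parameter.

```latex
\[
J(\mathbf{q}) = \lVert \mathbf{p}_d - \mathbf{G}\mathbf{q} \rVert^2
              + \lambda \lVert \mathbf{q} - \mathbf{q}_{\mathrm{WFS}} \rVert^2
\]
% Setting the gradient with respect to q to zero gives
\[
(\mathbf{G}^{H}\mathbf{G} + \lambda\mathbf{I})\,\mathbf{q}_{\mathrm{opt}}
  = \mathbf{G}^{H}\mathbf{p}_d + \lambda\,\mathbf{q}_{\mathrm{WFS}}
\]
% which can be rearranged as a correction added to the WFS solution:
\[
\mathbf{q}_{\mathrm{opt}} = \mathbf{q}_{\mathrm{WFS}}
  + (\mathbf{G}^{H}\mathbf{G} + \lambda\mathbf{I})^{-1}
    \mathbf{G}^{H}\bigl(\mathbf{p}_d - \mathbf{G}\mathbf{q}_{\mathrm{WFS}}\bigr)
\]
```

Under these assumptions, a large $\lambda$ returns the plain WFS solution, while a small $\lambda$ approaches the unregularized least-squares solution driven only by the sensor errors.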
Work productivity in a population-based cohort of patients with spondyloarthritis.
Haglund, Emma; Bremander, Ann; Bergman, Stefan; Jacobsson, Lennart T H; Petersson, Ingemar F
2013-09-01
To assess work productivity and associated factors in patients with SpA. This cross-sectional postal survey included 1773 patients with SpA identified in a regional health care register. Items on presenteeism (reduced productivity at work, 0-100%, 0 = no reduction) were answered by 1447 individuals. Absenteeism was defined as register-based sick leave using data from a national register. Disease duration, disease activity (BASDAI), physical function (BASFI), health-related quality of life (EQ-5D), anxiety (HAD-a), depression (HAD-d), self-efficacy [Arthritis Self-efficacy Scale (ASES) pain and symptom], physical activity and education were also measured. Forty-five per cent reported reduced productivity at work with a mean reduction of 20% (95% CI 18, 21) and women reported a higher mean reduction than men (mean 23% vs 17%, P < 0.001). Worse quality of life, disease activity, physical function and anxiety all correlated with reduced productivity (r = 0.52-0.66, P < 0.001), while sick leave did not. Worse outcomes on the EQ-5D (β-est -9.6, P < 0.001), BASDAI (β-est 7.8, P < 0.001), BASFI (β-est 7.3, P < 0.001), ASES pain (β-est -0.5, P < 0.001) and HAD-d (β-est 3.4, P < 0.001) were associated with reduced productivity at work in patients with SpA regardless of age, gender and disease subgroup. ASES symptoms, HAD-a and education level <12 years were associated with reduced productivity but were not significant in all strata for age, gender and disease subgroup. Work productivity was reduced in patients with SpA and more so in women. Worse quality of life, disease activity, physical function, self-efficacy and depression were all associated with reduced productivity at work in patients with SpA.
NASA Astrophysics Data System (ADS)
Keller, D.; Gervais, A.; Chambonnet, D.; Belouet, C.; Audry, C.
1995-02-01
In the field of superconducting devices devoted to microwave applications, the crystalline texture of high-quality thin films based on YBa$_2$Cu$_3$O$_{7-\delta}$ is of primary importance. This study presents the formation of this texture on MgO substrates, from the nucleation and growth steps up to a film thickness of 300 nm, as observed by means of AFM, HRTEM and XRD. The influence of deposition temperature on the growth mode is shown and a nucleation/growth model is discussed. The minimum roughness of $c_{\perp 0}$-textured films, 300 nm thick and 20 × 20 mm$^2$ in size, is as low as 2 nm.
Zhou, Rongqiong; Xia, Qingyou; Huang, Hancheng; Lai, Min; Wang, Zhenxin
2011-10-01
Toxocara canis is a widespread intestinal nematode parasite of dogs, which can also cause disease in humans. We employed an expressed sequence tag (EST) strategy in order to study gene expression, including development, digestion and reproduction, in T. canis. ESTs provide a rapid way to identify genes, particularly in organisms for which we have very little molecular information. In this study, a cDNA library was constructed from a female adult of T. canis and 215 high-quality ESTs from the 5'-ends of the cDNA clones, representing 79 unigenes, were obtained. The titer of the primary cDNA library was 1.83 × 10⁶ pfu/mL with a recombination rate of 99.33%. Most of the sequences ranged from 300 to 900 bp with an average length of 656 bp. Cluster analysis of these ESTs allowed identification of 79 unique sequences containing 28 contigs and 51 singletons. BLASTX searches revealed that 18 unigenes (22.78% of the total) or 70 ESTs (32.56% of the total) were novel genes that had no significant matches to any protein sequences in the public databases. The remaining 61 unigenes (77.22% of the total) or 145 ESTs (67.44% of the total) were closely matched to known genes or sequences deposited in the public databases. These genes were classified into seven groups based on their known or putative biological functions. We also confirmed the gene expression patterns of several immune-related genes using RT-PCR examination. This work will provide a valuable resource for further investigation of stage-, sex- and tissue-specific gene transcription or expression. Copyright © 2011. Published by Elsevier Inc.
Construction of new EST-SSRs for Fusarium resistant wheat breeding.
Yumurtaci, Aysen; Sipahi, Hulya; Al-Abdallat, Ayed; Jighly, Abdulqader; Baum, Michael
2017-06-01
Surveying Fusarium resistance in wheat with easily applicable molecular markers such as simple sequence repeats (SSRs) is a prerequisite for molecular breeding. Expressed sequence tags (ESTs) are one of the main sources for development of new SSR candidates. Therefore, 18,292 publicly available wheat ESTs were mined, and genotyping with 55 newly developed EST-SSR-derived primer pairs produced clear fragments in ten wheat cultivars carrying different levels of Fusarium resistance. Among the validated markers, 23 polymorphic EST-SSRs were obtained, and the related alleles were mostly found on the B and D genomes. Based on fragment profiling and similarity analysis, a 327 bp amplicon, which was a product of Contig 1207 (chromosome 5BL), was detected only in Fusarium head blight (FHB) resistant cultivars (CM82036 and Sumai), and its amino acid sequence showed similarity to pathogen-related proteins. Another FHB resistance-related EST-SSR, Contig 556 (chromosome 1BL), produced a 151 bp fragment in Sumai and was associated with a wax2-like protein. A polymorphic 204 bp fragment, derived from Contig 578 (chromosome 1DL), was generated from root rot (FRR) resistant cultivars (2-49, Altay2000 and Sunco). A total of 98 alleles were detected, with an average of 1.8 alleles per locus, and the polymorphism information content (PIC) ranged from 0.11 to 0.78. A dendrogram with two main groups and five sub-groups displayed the highest genetic relationship between FRR resistant cultivars (2-49 and Altay2000), FRR sensitive cultivars (Seri82 and Scout66) and FHB resistant cultivars (CM82036 and Sumai). Thus, exploitation of these candidate EST-SSRs may help to genotype other wheat sources for Fusarium resistance. Copyright © 2017 Elsevier Ltd. All rights reserved.
Validation of models of incident irradiance at the water surface in the Arctic
NASA Astrophysics Data System (ADS)
Julien, Laliberte
In this thesis, two methods for estimating incident irradiance at the surface of the Arctic are evaluated. An in situ database was assembled from 16 oceanographic campaigns in the Arctic. For the dates on which irradiance was measured, estimates of daily incident surface irradiance were produced from ocean-colour satellites (Frouin et al. 2003) and from meteorological satellites (Belanger et al. 2013). A comparison between the satellite estimates was also carried out for the year 2004 over the entire Arctic. The comparison between the observed data and the estimates from meteorological satellites gives a bias of 6% and a root-mean-square difference of 33%. The comparison between the observations and the ocean-colour satellites gives a bias of 2% and a root-mean-square difference of 20%. Finally, the mean difference between the estimates of the two satellite methods for the whole Arctic in 2004 is 0.29 Einstein/m²/day with a standard deviation of 6.78 Einstein/m²/day. The results show, among other things, that the method using ocean-colour satellites is more accurate for estimating irradiance over a small area because it better captures local variations in irradiance. The method using meteorological satellites is more accurate for estimating irradiance over a large area because it is less restricted in the conditions under which it can provide an estimate. Thus, the method using meteorological satellites shows that 38% of the annual irradiance of the Arctic is not accounted for by the ocean-colour satellites.
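The bias and root-mean-square statistics quoted in this abstract can be computed as in the following Python sketch. The abstract does not say how the percentages were normalized; dividing by the mean of the observations is an assumption made here, and the function name `bias_and_rms_percent` is hypothetical.

```python
import math


def bias_and_rms_percent(observed, estimated):
    """Return (bias %, RMS difference %) of estimates against observations.

    Both statistics are expressed relative to the mean observed value,
    which is an assumed normalization.
    """
    if not observed or len(observed) != len(estimated):
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    mean_obs = sum(observed) / n
    diffs = [e - o for o, e in zip(observed, estimated)]
    bias = sum(diffs) / n                          # mean signed difference
    rms = math.sqrt(sum(d * d for d in diffs) / n)  # root-mean-square difference
    return 100.0 * bias / mean_obs, 100.0 * rms / mean_obs


# Made-up daily irradiance values, for illustration only.
bias_pct, rms_pct = bias_and_rms_percent([10.0, 20.0, 30.0], [11.0, 22.0, 33.0])
```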
Interview with Dr. Charley Zeanah
2013-01-01
Dr. Charles Zeanah holds the Mary K. Sellars-Polchow Chair in Psychiatry and is Professor of Clinical Pediatrics and Vice Chair of Child Psychiatry in the Department of Psychiatry and Behavioral Sciences at the Tulane University School of Medicine in New Orleans. He is also Executive Director of the Tulane Institute of Infant and Early Childhood Mental Health. He has received numerous awards, including the Irving Phillips Award for Prevention (AACAP), the Presidential Citation for Distinguished Research and Leadership in Infant Mental Health (American Orthopsychiatric Association), the Sarah Haley Memorial Award for Clinical Excellence (International Society for Traumatic Stress Studies), the Blanche F. Ittelson Award for Research in Child Psychiatry (APA), and the Serge Lebovici Award for International Contributions in Infant Mental Health (World Association for Infant Mental Health). Dr. Zeanah is a Distinguished Fellow of the AACAP, a Distinguished Fellow of the APA, and a member of the Board of Directors of Zero to Three. He is the editor of the Handbook of Infant Mental Health (3rd edition), which is regarded as the state-of-the-art textbook and basic reference in the field of infant mental health.
Uneven distribution of expressed sequence tag loci on maize pachytene chromosomes
Anderson, Lorinda K.; Lai, Ann; Stack, Stephen M.; Rizzon, Carene; Gaut, Brandon S.
2006-01-01
Examining the relationships among DNA sequence, meiotic recombination, and chromosome structure at a genome-wide scale has been difficult because only a few markers connect genetic linkage maps with physical maps. Here, we have positioned 1195 genetically mapped expressed sequence tag (EST) markers onto the 10 pachytene chromosomes of maize by using a newly developed resource, the RN-cM map. The RN-cM map charts the distribution of crossing over in the form of recombination nodules (RNs) along synaptonemal complexes (SCs, pachytene chromosomes) and allows genetic cM distances to be converted into physical micrometer distances on chromosomes. When this conversion is made, most of the EST markers used in the study are located distally on the chromosomes in euchromatin. ESTs are significantly clustered on chromosomes, even when only euchromatic chromosomal segments are considered. Gene density and recombination rate (as measured by EST and RN frequencies, respectively) are strongly correlated. However, crossover frequencies for telomeric intervals are much higher than was expected from their EST frequencies. For pachytene chromosomes, EST density is about fourfold higher in euchromatin compared with heterochromatin, while DNA density is 1.4 times higher in heterochromatin than in euchromatin. Based on DNA density values and the fraction of pachytene chromosome length that is euchromatic, we estimate that ∼1500 Mbp of the maize genome is in euchromatin. This overview of the organization of the maize genome will be useful in examining genome and chromosome evolution in plants. PMID:16339046
Mornkham, T; Wangsomnuk, P P; Mo, X C; Francisco, F O; Gao, L Z; Kurzweil, H
2016-10-24
Jerusalem artichoke (Helianthus tuberosus L.) is a perennial tuberous plant and a traditional inulin-rich crop in Thailand. It has become the most important source of inulin and has great potential for use in chemical and food industries. In this study, expressed sequence tag (EST)-based simple sequence repeat (SSR) markers were developed from 40,362 Jerusalem artichoke ESTs retrieved from the NCBI database. Among 23,691 non-redundant identified ESTs, 1949 SSR motifs harboring 2 to 6 nucleotides with varied repeat motifs were discovered from 1676 assembled sequences. Seventy-nine primer pairs were generated from EST sequences harboring SSR motifs. Our results show that 43 primer pairs were polymorphic for the six studied populations, while the remaining 36 were either monomorphic or failed to amplify. These 43 SSR loci exhibited a high level of genetic diversity among populations, with allele numbers varying from 2 to 7, with an average of 3.95 alleles per locus. Heterozygosity ranged from 0.096 to 0.774, with an average of 0.536; polymorphism information content ranged from 0.096 to 0.854, with an average of 0.568. Principal component analysis and neighbor-joining analysis revealed that the six populations could be divided into six clusters. Our results indicate that these newly characterized EST-SSR markers may be useful in the exploration of genetic diversity and range expansion of the Jerusalem artichoke, and in cross-species application for the genus Helianthus.
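The diversity statistics reported in this abstract can be illustrated with a small Python sketch that computes expected heterozygosity and polymorphism information content (PIC) from allele frequencies at one locus. It assumes the commonly used formulas (He = 1 − Σp², and the Botstein et al. form of PIC); the abstract does not state which formulas or software the authors used, and the function names are hypothetical.

```python
from itertools import combinations


def _check(freqs):
    """Validate that allele frequencies are non-negative and sum to 1."""
    if any(p < 0 for p in freqs) or abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must be non-negative and sum to 1")


def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2)."""
    _check(freqs)
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    _check(freqs)
    homozygous = sum(p * p for p in freqs)
    cross = sum(2.0 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygous - cross


# Illustrative locus with two equally frequent alleles (made-up frequencies).
he_example = expected_heterozygosity([0.5, 0.5])
pic_example = pic([0.5, 0.5])
```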
Brenner, Eric D; Katari, Manpreet S; Stevenson, Dennis W; Rudd, Stephen A; Douglas, Andrew W; Moss, Walter N; Twigg, Richard W; Runko, Suzan J; Stellari, Giulia M; McCombie, WR; Coruzzi, Gloria M
2005-01-01
Background Ginkgo biloba L. is the only surviving member of one of the oldest living seed plant groups with medicinal, spiritual and horticultural importance worldwide. As an evolutionary relic, it displays many characters found in the early, extinct seed plants and extant cycads. To establish a molecular base to understand the evolution of seeds and pollen, we created a cDNA library and EST dataset from the reproductive structures of male (microsporangiate), female (megasporangiate), and vegetative organs (leaves) of Ginkgo biloba. Results RNA from newly emerged male and female reproductive organs and immature leaves was used to create three distinct cDNA libraries from which 6,434 ESTs were generated. These 6,434 ESTs from Ginkgo biloba were clustered into 3,830 unigenes. A comparison of our Ginkgo unigene set against the fully annotated genomes of rice and Arabidopsis, and all available ESTs in GenBank revealed that 256 Ginkgo unigenes match only genes among the gymnosperms and non-seed plants – many with multiple matches to genes in non-angiosperm plants. Conversely, another group of unigenes in Ginkgo had highly significant homology to transcription factors in angiosperms involved in development, including MADS box genes as well as post-transcriptional regulators. Several of the conserved developmental genes found in Ginkgo had top BLAST homology to cycad genes. We also note here the presence of ESTs in G. biloba similar to genes that to date have only been found in gymnosperms and an additional 22 Ginkgo genes common only to genes from cycads. Conclusion Our analysis of an EST dataset from G. biloba revealed genes potentially unique to gymnosperms. Many of these genes showed homology to fully sequenced clones from our cycad EST dataset found in common only with gymnosperms. Other Ginkgo ESTs are similar to developmental regulators in higher plants. 
This work sets the stage for future studies on Ginkgo to better understand seed and pollen evolution, and to resolve the ambiguous phylogenetic relationship of G. biloba among the gymnosperms. PMID:16225698
NASA Astrophysics Data System (ADS)
Aboutajeddine, Ahmed
Micromechanical scale-transition models, which determine the effective properties of heterogeneous materials from their microstructure, are considered in this work. The objective is to account for the presence of an interphase between the matrix and the reinforcement in classical micromechanical models, and to reconsider the basic approximations of these models, in order to treat multiphase materials. A new micromechanical model is proposed to account for the presence of a thin elastic interphase when determining effective properties. This model was built using the integral equation, Hill's interfacial operators, and the Mori-Tanaka method. The expressions obtained for the overall moduli and the fields in the coating are analytical. The basic approximation of this model is then improved in a new model that addresses coated inclusions with a thin or thick coating. The solution relies on a double homogenization carried out at the level of the coated inclusion and of the material. This new approach makes it possible to fully grasp the implications of the modeling approximations. The results obtained are then exploited in the solution of the Hashin assemblage. In this way, several classical micromechanical models of different origins are unified and tied, in this work, to Hashin's geometric representation. Besides allowing the relevance of each model's approximation to be fully assessed within this single view, the correct extension of these models to multiphase materials becomes possible. Several analytical and explicit models are then proposed, following solutions of different orders of the Hashin assemblage.
One of the explicit models appears as a direct correction of the Mori-Tanaka model in the cases where the latter fails to give good results. Finally, this corrected Mori-Tanaka model is used with Hill's operators to build a scale-transition model for materials with an elastoplastic interphase. The effective constitutive law obtained is incremental in nature and is coupled with the plasticity relation of the interphase. Simulations of mechanical tests for several properties of the plastic interphase made it possible to draw up coating profiles that give the material better behavior.
Imbir, Kamil K.; Spustek, Tomasz; Duda, Joanna; Bernatowicz, Gabriela; Żygierewicz, Jarosław
2017-01-01
Affective meaning of verbal stimuli was found to influence cognitive control as expressed in the Emotional Stroop Task (EST). Behavioral studies have shown that factors such as valence, arousal, and emotional origin of reaction to stimuli associated with words can lead to lengthening of reaction latencies in EST. Moreover, electrophysiological studies have revealed that affective meaning altered the amplitude of some components of evoked potentials recorded during EST, and that this alteration correlated with the performance in EST. The emotional origin was defined as processing based on automatic vs. reflective mechanisms that underlie the formation of emotional reactions to words. The aim of the current study was to investigate, within the framework of EST, correlates of processing of words differing in valence and origin levels, but matched in arousal, concreteness, frequency of appearance and length. We found no behavioral differences in response latencies. When controlling for origin, we found no effects of valence. We found an effect of origin on the ERP in two time windows: 290–570 and 570–800 ms. The earlier effect can be attributed to cognitive control, while the latter is rather the manifestation of explicit processing of words. In each case, reflective originated stimuli evoked more positive amplitudes compared to automatic originated words. PMID:28611717
NASA Astrophysics Data System (ADS)
Lachaine, Remi
Surgeons have been generating bubbles in the human body with laser irradiation for several decades. They use these bubbles as small scalpels, allowing precise and localized incisions. One application of this surgical tool is cell perforation. Instead of using a needle to perforate the cell membrane, laser pulses can be focused at the surface of a cell, forming a plasma at the laser focal point and generating a bubble that perforates the cell membrane. However, this process is rather slow, and massive perforation of cells in vivo is not feasible. To speed up the process, plasmonic nanoparticles can be used. These act as nano-antennas that concentrate light on the nanometre scale. The possibility of irradiating a large number of nanoparticles simultaneously has given new momentum to bubble generation as a cell perforation tool. The use of nanoparticles in a biomedical context nevertheless carries certain risks. In particular, nanoparticle fragmentation can increase the toxicity of the treatment. Ideally, it is preferable to use nanoparticles that are not damaged by the laser irradiation. The aim of this thesis is to develop an engineering method for robust nanoparticles that enable efficient bubble generation for biomedical purposes. It is first demonstrated experimentally that plasma formation is indeed the main physical mechanism leading to bubble generation during infrared (800 nm wavelength), ultrafast (pulse duration between 45 fs and 1 ps) irradiation of 100 nm gold nanoparticles. To carry out this demonstration, a pump-probe method for detecting bubbles of about 1 μm was developed.
This method revealed an 18% difference in size between bubbles generated with linearly polarized irradiation and those generated with circularly polarized irradiation when the pulse duration was shorter than one picosecond. For longer pulses, bubble sizes are shown to be independent of the polarization of the incident pulses. This particular behavior agrees with theoretical predictions that include nonlinear plasma formation and cannot be explained by considering particle absorption alone. Next, a method for designing robust nanoparticles for bubble generation is developed. This method is based on the optical properties of the nanostructures.
MEPD: a Medaka gene expression pattern database
Henrich, Thorsten; Ramialison, Mirana; Quiring, Rebecca; Wittbrodt, Beate; Furutani-Seiki, Makoto; Wittbrodt, Joachim; Kondoh, Hisato
2003-01-01
The Medaka Expression Pattern Database (MEPD) stores and integrates information on gene expression during embryonic development of the small freshwater fish Medaka (Oryzias latipes). Expression patterns of genes identified by ESTs are documented by images and by descriptions through parameters such as staining intensity, category and comments and through a comprehensive, hierarchically organized dictionary of anatomical terms. Sequences of the ESTs are available and searchable through BLAST. ESTs in the database are clustered upon entry and have been blasted against public databases. The BLAST results are updated regularly, stored within the database and searchable. The MEPD is a project within the Medaka Genome Initiative (MGI) and entries will be interconnected to integrated genomic map databases. MEPD is accessible through the WWW at http://medaka.dsp.jst.go.jp/MEPD. PMID:12519950
Pseudotumoral form of chronic eosinophilic pneumonia with a fatal outcome
Hammoune, Nabil; El Guendouz, Faycal; Elhaddad, Siham; Janah, Hicham; Hommadi, Abdelaziz
2015-01-01
Idiopathic chronic eosinophilic pneumonia is a rare disease of unknown cause, characterized by peripheral pulmonary opacities, peripheral eosinophilia >1000/mm3 and/or alveolar eosinophilia >25%. Diagnosis is difficult because the clinical and radiological signs are nonspecific. Treatment relies mainly on corticosteroid therapy. The outcome is generally favorable. We report a case of this rare entity in its pseudotumoral form without hypereosinophilia, diagnosed late following histological examination of the surgical lobectomy specimen, and with a fatal outcome. PMID:26113897
Kelly, S; Wickstead, B; Gull, K
2011-04-07
We have developed a machine-learning approach to identify 3537 discrete orthologue protein sequence groups distributed across all available archaeal genomes. We show that treating these orthologue groups as binary detection/non-detection data is sufficient to capture the majority of archaeal phylogeny. We subsequently use the sequence data from these groups to infer a method and substitution-model-independent phylogeny. By holding this phylogeny constrained and interrogating the intersection of this large dataset with both the Eukarya and the Bacteria using Bayesian and maximum-likelihood approaches, we propose and provide evidence for a methanogenic origin of the Archaea. By the same criteria, we also provide evidence in support of an origin for Eukarya either within or as sisters to the Thaumarchaea.
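The idea of treating orthologue groups as binary detection/non-detection data can be illustrated with a small distance computation. The following is a minimal sketch only: the abstract does not say which distance or tree-building method was used, so the Jaccard distance, the function names, and the toy genome labels are all assumptions made for illustration.

```python
def jaccard_distance(a, b):
    """Jaccard distance between two sets of detected orthologue groups.

    Assumption: Jaccard is used here purely as an example of a distance
    on presence/absence data; the source does not name a metric.
    """
    union = a | b
    if not union:
        return 0.0  # two empty profiles are treated as identical
    return 1.0 - len(a & b) / len(union)


def distance_matrix(profiles):
    """Pairwise distances for a dict {genome_name: set_of_group_ids}."""
    names = sorted(profiles)
    matrix = [[jaccard_distance(profiles[p], profiles[q]) for q in names]
              for p in names]
    return names, matrix


# Hypothetical toy profiles: each genome lists the groups detected in it.
toy_profiles = {
    "genome_A": {"og1", "og2", "og3"},
    "genome_B": {"og2", "og3", "og4"},
    "genome_C": {"og5"},
}
toy_names, toy_matrix = distance_matrix(toy_profiles)
```

A matrix of this kind could then be passed to any distance-based tree method; that step is left out because the source gives no detail on it.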
New genes often acquire male-specific functions but rarely become essential in Drosophila.
Kondo, Shu; Vedanayagam, Jeffrey; Mohammed, Jaaved; Eizadshenass, Sogol; Kan, Lijuan; Pang, Nan; Aradhya, Rajaguru; Siepel, Adam; Steinhauer, Josefa; Lai, Eric C
2017-09-15
Relatively little is known about the in vivo functions of newly emerging genes, especially in metazoans. Although prior RNAi studies reported prevalent lethality among young gene knockdowns, our phylogenomic analyses reveal that young Drosophila genes are frequently restricted to the nonessential male reproductive system. We performed large-scale CRISPR/Cas9 mutagenesis of "conserved, essential" and "young, RNAi-lethal" genes and broadly confirmed the lethality of the former but the viability of the latter. Nevertheless, certain young gene mutants exhibit defective spermatogenesis and/or male sterility. Moreover, we detected widespread signatures of positive selection on young male-biased genes. Thus, young genes have a preferential impact on male reproductive system function. © 2017 Kondo et al.; Published by Cold Spring Harbor Laboratory Press.
On celestial reference systems
NASA Astrophysics Data System (ADS)
Poppe, P. C. R.; Martin, V. A. F.
2003-02-01
In this work we present some discussions of the reference systems used in astronomy. Clearly, the whole subject cannot be exhausted in a single text; we hope, however, that the present material can be appreciated in Introduction to Astronomy courses, which are increasingly present in current undergraduate physics curricula. The discussions concerning the "Celestial Reference Frames" will be presented in a separate paper.
DNA sequence chromatogram browsing using JAVA and CORBA.
Parsons, J D; Buehler, E; Hillier, L
1999-03-01
DNA sequence chromatograms (traces) are the primary data source for all large-scale genomic and expressed sequence tag (EST) sequencing projects. Access to the sequencing trace assists many later analyses, for example contig assembly and polymorphism detection, but obtaining and using traces is problematic. Traces are not collected and published centrally, they are much larger than the base calls derived from them, and viewing them requires the interactivity of a local graphical client with local data. To provide efficient global access to DNA traces, we developed a client/server system based on flexible Java components integrated into other applications including an applet for use in a WWW browser and a stand-alone trace viewer. Client/server interaction is facilitated by CORBA middleware which provides a well-defined interface, a naming service, and location independence. [The software is packaged as a Jar file available from the following URL: http://www.ebi.ac.uk/jparsons. Links to working examples of the trace viewers can be found at http://corba.ebi.ac.uk/EST. All the Washington University mouse EST traces are available for browsing at the same URL.]
Lee, Imchang; Chalita, Mauricio; Ha, Sung-Min; Na, Seong-In; Yoon, Seok-Hwan; Chun, Jongsik
2017-06-01
Thanks to recent advances in DNA sequencing technology, the cost and time of prokaryotic genome sequencing have decreased dramatically. It has repeatedly been reported that genome sequencing using high-throughput next-generation sequencing is prone to contamination due to its high depth of sequencing coverage. Although a few bioinformatics tools are available to detect potential contamination, these have inherent limitations as they only use protein-coding genes. Here we introduce a new algorithm, called ContEst16S, to detect potential contamination using 16S rRNA genes from genome assemblies. We screened 69 745 prokaryotic genomes from the NCBI Assembly Database using ContEst16S and found that 594 were contaminated by bacteria, human and plants. Of the predicted contaminated genomes, 8% were not predicted by the existing protein-coding gene-based tool, implying that both methods can be complementary in the detection of contamination. A web-based service of the algorithm is available at www.ezbiocloud.net/tools/contest16s.
Genome-Based Taxonomic Classification of Bacteroidetes
Hahnke, Richard L.; Meier-Kolthoff, Jan P.; García-López, Marina; ...
2016-12-20
The bacterial phylum Bacteroidetes, characterized by a distinct gliding motility, occurs in a broad variety of ecosystems, habitats, life styles, and physiologies. Accordingly, taxonomic classification of the phylum, based on a limited number of features, proved difficult and controversial in the past, for example, when decisions were based on unresolved phylogenetic trees of the 16S rRNA gene sequence. Here we use a large collection of type-strain genomes from Bacteroidetes and closely related phyla for assessing their taxonomy based on the principles of phylogenetic classification and trees inferred from genome-scale data. No significant conflict between 16S rRNA gene and whole-genome phylogenetic analysis is found, whereas many but not all of the involved taxa are supported as monophyletic groups, particularly in the genome-scale trees. Phenotypic and phylogenomic features support the separation of Balneolaceae as new phylum Balneolaeota from Rhodothermaeota and of Saprospiraceae as new class Saprospiria from Chitinophagia. Epilithonimonas is nested within the older genus Chryseobacterium and without significant phenotypic differences; thus merging the two genera is proposed. Similarly, Vitellibacter is proposed to be included in Aequorivita. Flexibacter is confirmed as being heterogeneous and dissected, yielding six distinct genera. Hallella seregens is a later heterotypic synonym of Prevotella dentalis. Compared to values directly calculated from genome sequences, the G+C content mentioned in many species descriptions is too imprecise; moreover, corrected G+C content values have a significantly better fit to the phylogeny. Corresponding emendations of species descriptions are provided where necessary. Whereas most observed conflict with the current classification of Bacteroidetes is already visible in 16S rRNA gene trees, as expected whole-genome phylogenies are much better resolved.
Cuadrat, Rafael R C; da Serra Cruz, Sérgio Manuel; Tschoeke, Diogo Antônio; Silva, Edno; Tosta, Frederico; Jucá, Henrique; Jardim, Rodrigo; Campos, Maria Luiza M; Mattoso, Marta; Dávila, Alberto M R
2014-08-01
A key focus in 21st century integrative biology and drug discovery for neglected tropical and other diseases has been the use of BLAST-based computational methods for identification of orthologous groups in pathogenic organisms to discern orthologs, with a view to evaluate similarities and differences among species, and thus allow the transfer of annotation from known/curated proteins to new/non-annotated ones. We used here a profile-based sensitive methodology to identify distant homologs, coupled to the NCBI's COG (Unicellular orthologs) and KOG (Eukaryote orthologs), permitting us to perform comparative genomics analyses on five protozoan genomes. OrthoSearch was used in five protozoan proteomes showing that 3901 and 7473 orthologs can be identified by comparison with COG and KOG proteomes, respectively. The core protozoa proteome inferred was 418 Protozoa-COG orthologous groups and 704 Protozoa-KOG orthologous groups: (i) 31.58% (132/418) belong to category J (translation, ribosomal structure, and biogenesis), and 9.81% (41/418) to category O (post-translational modification, protein turnover, chaperones) using COG; (ii) 21.45% (151/704) belong to category J, and 13.92% (98/704) to category O using KOG. The phylogenomic analysis showed four well-supported clades for Eukarya, discriminating Multicellular [(i) human, fly, plant and worm] and Unicellular [(ii) yeast, (iii) fungi, and (iv) protozoa] species. These encouraging results attest to the usefulness of the profile-based methodology for comparative genomics to accelerate semi-automatic re-annotation, especially of the protozoan proteomes. This approach may also lend itself for applications in global health, for example, in the case of novel drug target discovery against pathogenic organisms previously considered difficult to research with traditional drug discovery tools.
Cuadrat, Rafael R. C.; da Serra Cruz, Sérgio Manuel; Tschoeke, Diogo Antônio; Silva, Edno; Tosta, Frederico; Jucá, Henrique; Jardim, Rodrigo; Campos, Maria Luiza M.; Mattoso, Marta
2014-01-01
A key focus in 21st century integrative biology and drug discovery for neglected tropical and other diseases has been the use of BLAST-based computational methods for identification of orthologous groups in pathogenic organisms to discern orthologs, with a view to evaluate similarities and differences among species, and thus allow the transfer of annotation from known/curated proteins to new/non-annotated ones. We used here a profile-based sensitive methodology to identify distant homologs, coupled to the NCBI's COG (Unicellular orthologs) and KOG (Eukaryote orthologs), permitting us to perform comparative genomics analyses on five protozoan genomes. OrthoSearch was used in five protozoan proteomes showing that 3901 and 7473 orthologs can be identified by comparison with COG and KOG proteomes, respectively. The core protozoa proteome inferred was 418 Protozoa-COG orthologous groups and 704 Protozoa-KOG orthologous groups: (i) 31.58% (132/418) belong to category J (translation, ribosomal structure, and biogenesis), and 9.81% (41/418) to category O (post-translational modification, protein turnover, chaperones) using COG; (ii) 21.45% (151/704) belong to category J, and 13.92% (98/704) to category O using KOG. The phylogenomic analysis showed four well-supported clades for Eukarya, discriminating Multicellular [(i) human, fly, plant and worm] and Unicellular [(ii) yeast, (iii) fungi, and (iv) protozoa] species. These encouraging results attest to the usefulness of the profile-based methodology for comparative genomics to accelerate semi-automatic re-annotation, especially of the protozoan proteomes. This approach may also lend itself for applications in global health, for example, in the case of novel drug target discovery against pathogenic organisms previously considered difficult to research with traditional drug discovery tools. PMID:24960463
Genome-Based Taxonomic Classification of Bacteroidetes
Hahnke, Richard L.; Meier-Kolthoff, Jan P.; García-López, Marina; Mukherjee, Supratim; Huntemann, Marcel; Ivanova, Natalia N.; Woyke, Tanja; Kyrpides, Nikos C.; Klenk, Hans-Peter; Göker, Markus
2016-01-01
The bacterial phylum Bacteroidetes, characterized by a distinct gliding motility, occurs in a broad variety of ecosystems, habitats, life styles, and physiologies. Accordingly, taxonomic classification of the phylum, based on a limited number of features, proved difficult and controversial in the past, for example, when decisions were based on unresolved phylogenetic trees of the 16S rRNA gene sequence. Here we use a large collection of type-strain genomes from Bacteroidetes and closely related phyla for assessing their taxonomy based on the principles of phylogenetic classification and trees inferred from genome-scale data. No significant conflict between 16S rRNA gene and whole-genome phylogenetic analysis is found, whereas many but not all of the involved taxa are supported as monophyletic groups, particularly in the genome-scale trees. Phenotypic and phylogenomic features support the separation of Balneolaceae as new phylum Balneolaeota from Rhodothermaeota and of Saprospiraceae as new class Saprospiria from Chitinophagia. Epilithonimonas is nested within the older genus Chryseobacterium and without significant phenotypic differences; thus merging the two genera is proposed. Similarly, Vitellibacter is proposed to be included in Aequorivita. Flexibacter is confirmed as being heterogeneous and dissected, yielding six distinct genera. Hallella seregens is a later heterotypic synonym of Prevotella dentalis. Compared to values directly calculated from genome sequences, the G+C content mentioned in many species descriptions is too imprecise; moreover, corrected G+C content values have a significantly better fit to the phylogeny. Corresponding emendations of species descriptions are provided where necessary. Whereas most observed conflict with the current classification of Bacteroidetes is already visible in 16S rRNA gene trees, as expected whole-genome phylogenies are much better resolved. PMID:28066339
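The remark that G+C content can be calculated directly from genome sequences can be made concrete with a short sketch. The function name and the decision to ignore ambiguous bases are assumptions for illustration; the abstract does not describe how the authors computed their values.

```python
def gc_content(sequences):
    """G+C percentage over the unambiguous bases (A, C, G, T) of all contigs.

    Assumption: ambiguous symbols such as N are excluded from both the
    numerator and the denominator.
    """
    gc = at = 0
    for seq in sequences:
        s = seq.upper()
        gc += s.count("G") + s.count("C")
        at += s.count("A") + s.count("T")
    if gc + at == 0:
        raise ValueError("no unambiguous bases found")
    return 100.0 * gc / (gc + at)


# Toy contigs, not real data: 6 G/C and 2 A/T bases, two N ignored.
toy_gc = gc_content(["GGCC", "ATGCNN"])
```

Computing the value over all contigs of an assembly, rather than averaging per-contig percentages, weights each base equally.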
Genomic diversity within the haloalkaliphilic genus Thioalkalivibrio
Ahn, Anne-Catherine; Meier-Kolthoff, Jan P.; Overmars, Lex; ...
2017-03-10
Thioalkalivibrio is a genus of obligate chemolithoautotrophic haloalkaliphilic sulfur-oxidizing bacteria. Their habitat is soda lakes, which are dual extreme environments with a pH range from 9.5 to 11 and salt concentrations up to saturation. More than 100 strains of this genus have been isolated from various soda lakes all over the world, but only ten species have been effectively described so far. Therefore, the assignment of the remaining strains to either existing or novel species is important and will further elucidate their genomic diversity as well as give a better general understanding of this genus. Recently, the genomes of 76 Thioalkalivibrio strains were sequenced. On these, we applied different methods including (i) 16S rRNA gene sequence analysis, (ii) Multilocus Sequence Analysis (MLSA) based on eight housekeeping genes, (iii) Average Nucleotide Identity based on BLAST (ANIb) and MUMmer (ANIm), (iv) Tetranucleotide frequency correlation coefficients (TETRA), (v) digital DNA:DNA hybridization (dDDH) as well as (vi) nucleotide- and amino acid-based Genome BLAST Distance Phylogeny (GBDP) analyses. We detected a high genomic diversity by revealing 15 new "genomic" species and 16 new "genomic" subspecies in addition to the ten already described species. Phylogenetic and phylogenomic analyses showed that the genus is not monophyletic, because four strains were clearly separated from the other Thioalkalivibrio by type strains from other genera. Therefore, it is recommended to classify the latter group as a novel genus. The biogeographic distribution of Thioalkalivibrio suggested that the different "genomic" species can be classified as candidate disjunct or candidate endemic species. This study is a detailed genome-based classification and identification of members within the genus Thioalkalivibrio. However, future phenotypical and chemotaxonomical studies will be needed for a full species description of this genus.
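Of the methods listed, the tetranucleotide frequency correlation is the simplest to illustrate. The sketch below is a deliberately simplified stand-in: it correlates raw tetranucleotide frequencies from one strand, whereas the published TETRA statistic is computed on z-scores of observed versus expected counts. All function names are hypothetical, and the toy sequences are not real genomes.

```python
from itertools import product
from math import sqrt

# All 256 tetranucleotides in a fixed order, so vectors are comparable.
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetra_freqs(seq):
    """Relative frequency of each tetranucleotide (forward strand only)."""
    seq = seq.upper()
    counts = dict.fromkeys(KMERS, 0)
    for i in range(len(seq) - 3):
        kmer = seq[i:i + 4]
        if kmer in counts:  # skips windows containing N or other symbols
            counts[kmer] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence has no unambiguous tetranucleotides")
    return [counts[k] / total for k in KMERS]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def tetra_correlation(seq_a, seq_b):
    """Simplified TETRA-like similarity on raw frequencies (an assumption)."""
    return pearson(tetra_freqs(seq_a), tetra_freqs(seq_b))


toy_a = "ACGTTGCAAGGCTTAACCGGATCGATTACG" * 2
toy_b = "TTTTAAAACCCCGGGGATATCGCGTTAACC" * 2
toy_r = tetra_correlation(toy_a, toy_b)
```

Values close to 1 indicate similar oligonucleotide composition; on sequences this short the number carries no biological meaning and only shows the mechanics.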
Identification, validation and high-throughput genotyping of transcribed gene SNPs in cassava.
Ferguson, Morag E; Hearne, Sarah J; Close, Timothy J; Wanamaker, Steve; Moskal, William A; Town, Christopher D; de Young, Joe; Marri, Pradeep Reddy; Rabbi, Ismail Yusuf; de Villiers, Etienne P
2012-03-01
The availability of genomic resources can facilitate progress in plant breeding through the application of advanced molecular technologies for crop improvement. This is particularly important in the case of less researched crops such as cassava, a staple and food security crop for more than 800 million people. Here, expressed sequence tags (ESTs) were generated from five drought stressed and well-watered cassava varieties. Two cDNA libraries were developed: one from root tissue (CASR), the other from leaf, stem and stem meristem tissue (CASL). Sequencing generated 706 contigs and 3,430 singletons. These sequences were combined with those from two other EST sequencing initiatives and filtered based on the sequence quality. Quality sequences were aligned using CAP3 and embedded in a Windows browser called HarvEST:Cassava which is made available. HarvEST:Cassava consists of a Unigene set of 22,903 quality sequences. A total of 2,954 putative SNPs were identified. Of these 1,536 SNPs from 1,170 contigs and 53 cassava genotypes were selected for SNP validation using Illumina's GoldenGate assay. As a result 1,190 SNPs were validated technically and biologically. The location of validated SNPs on scaffolds of the cassava genome sequence (v.4.1) is provided. A diversity assessment of 53 cassava varieties reveals some sub-structure based on the geographical origin, greater diversity in the Americas as opposed to Africa, and similar levels of diversity in West Africa and southern, eastern and central Africa. The resources presented allow for improved genetic dissection of economically important traits and the application of modern genomics-based approaches to cassava breeding and conservation.
Development of Pineapple Microsatellite Markers and Germplasm Genetic Diversity Analysis
Tong, Helin; Chen, You; Wang, Jingyi; Chen, Yeyuan; Sun, Guangming; He, Junhu; Wu, Yaoting
2013-01-01
Two methods were used to develop pineapple microsatellite markers. Genomic library-based SSR development: using a selectively amplified microsatellite assay, 86 sequences were generated from a pineapple genomic library. 91 (96.8%) of the 94 Simple Sequence Repeat (SSR) loci were dinucleotide repeats (39 AC/GT repeats and 52 GA/TC repeats, accounting for 42.9% and 57.1%, respectively), and the other three were mononucleotide repeats. Thirty-six pairs of SSR primers were designed; 24 of them generated clear bands of expected sizes, and 13 of them showed polymorphism. EST-based SSR development: 5659 pineapple EST sequences obtained from NCBI were analyzed; among 1397 nonredundant EST sequences, 843 were found to contain 1110 SSR loci (217 of them contained more than one SSR locus). The frequency of SSRs in pineapple EST sequences is 1 SSR/3.73 kb, and 44 types were found. Mononucleotide, dinucleotide, and trinucleotide repeats dominate, accounting for 95.6% in total. AG/CT and AGC/GCT were the dominant types of dinucleotide and trinucleotide repeats, accounting for 83.5% and 24.1%, respectively. Thirty pairs of primers were designed, one for each of 30 randomly selected sequences; 26 of them generated clear and reproducible bands, and 22 of them showed polymorphism. Eighteen polymorphic primer pairs obtained by one or the other of the two methods above were selected for germplasm genetic diversity analysis of 48 breeds of pineapple; similarity coefficients of these breeds were between 0.59 and 1.00, and the breeds can be divided into four groups accordingly. Amplification products of five SSR markers were extracted and sequenced; the corresponding repeat loci were found, and locus mutations were mainly in the copy number of repeats and base mutations in the flanking region. PMID:24024187
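The EST-based SSR search described above can be illustrated with a minimal Python sketch. The abstract does not state the search criteria, so the minimum copy numbers in MIN_REPEATS and the function name find_ssrs are assumptions made for illustration only; the sketch finds perfect mono-, di-, and trinucleotide repeats with a regular expression and is not the authors' pipeline.

```python
import re

# Assumed thresholds (not given in the abstract): minimum number of
# tandem copies required for a motif of each length to count as an SSR.
MIN_REPEATS = {1: 10, 2: 6, 3: 5}

def find_ssrs(seq, min_repeats=None):
    """Return a sorted list of (start, motif, copies) for perfect SSRs in seq."""
    if min_repeats is None:
        min_repeats = MIN_REPEATS
    seq = seq.upper()
    hits = []
    for unit_len, min_copies in min_repeats.items():
        # One motif of unit_len bases followed by at least
        # (min_copies - 1) further identical copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_copies - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are just a single base repeated (e.g. "AA"),
            # since those are already reported as mononucleotide repeats.
            if unit_len > 1 and len(set(motif)) == 1:
                continue
            hits.append((m.start(), motif, len(m.group(0)) // unit_len))
    return sorted(hits)

# Hypothetical toy sequence: an (AG)8 repeat and an (A)12 repeat.
example = "TTC" + "AG" * 8 + "CCT" + "A" * 12 + "GC"
ssrs = find_ssrs(example)
```

An SSR frequency such as the reported 1 SSR/3.73 kb would then be the total length of the searched sequences in kb divided by the number of loci found.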
Agha, Riaz; Agha, Maliha
2011-01-01
Much has been achieved in the scientific and surgical fields over the last 360 years. Some institutions have contributed disproportionately to these advances. The medical schools and hospitals of Guy's (est. 1721), King's (est. 1840) and St. Thomas' (est. 1173) seem to provide a focus and a catalyst for much innovation and creativity dating back to 1608. This review sets out to provide an overview of the major contributors to surgical advances at these institutions over the last 360 years and of the factors unique to these institutions that contributed to the climate of discovery. It is based on a lecture given to the Osler Club of London (est. 1928) at the Royal College of Physicians in London on 4 November 2010. It is the author's premise that the people and the discoveries they made within these institutions within three square miles of London changed the practice and understanding of science and healthcare as we know it today. Copyright © 2011 Surgical Associates Ltd. Published by Elsevier Ltd. All rights reserved.
Fabrication of passive all-fiber components based on dual-core optical fibers
NASA Astrophysics Data System (ADS)
Jacob Poulin, Anne C.
2002-01-01
Passive all-optical components are key elements in optical communication systems. This thesis presents the experimental use of fibers with two dissimilar cores to build band-pass filters. Dual-core fibers have the property of promoting coupling from one core to the other at regular intervals when the cores are exactly identical. When a slight difference appears, this coupling is quickly reduced to zero. The first part of the thesis shows how, by using an appropriate fiber geometry, it is possible to compensate for this detuning and fabricate 100% couplers. Because the resulting filters have a bandwidth too large for the needs of the optical communications market, the second part of the thesis shows how, by combining Bragg grating technology with coupler technology, it is possible to build filters that operate in transmission and have excellent spectral characteristics, still using these same dual-core fibers.
2010-01-01
Background The vast sequence divergence among different virus groups has presented a great challenge to alignment-based analysis of virus phylogeny. Due to the problems caused by the uncertainty in alignment, existing tools for phylogenetic analysis based on multiple alignment could not be directly applied to the whole-genome comparison and phylogenomic studies of viruses. There has been a growing interest in alignment-free methods for phylogenetic analysis using complete genome data. Among the alignment-free methods, a dynamical language (DL) method proposed by our group has successfully been applied to the phylogenetic analysis of bacteria and chloroplast genomes. Results In this paper, the DL method is used to analyze the whole-proteome phylogeny of 124 large dsDNA viruses and 30 parvoviruses, two data sets with a large difference in genome size. The trees from our analyses are in good agreement with the latest classification of large dsDNA viruses and parvoviruses by the International Committee on Taxonomy of Viruses (ICTV). Conclusions The present method provides a new way of recovering the phylogeny of large dsDNA viruses and parvoviruses, and also some insights into the affiliation of a number of unclassified viruses. In comparison, some alignment-free methods such as the CV Tree method can be used for recovering the phylogeny of large dsDNA viruses, but they are not suitable for resolving the phylogeny of parvoviruses with a much smaller genome size. PMID:20565983
Orchid phylogenomics and multiple drivers of their extraordinary diversification
Givnish, Thomas J.; Spalink, Daniel; Ames, Mercedes; Lyon, Stephanie P.; Hunter, Steven J.; Zuluaga, Alejandro; Iles, William J. D.; Clements, Mark A.; Arroyo, Mary T. K.; Leebens-Mack, James; Endara, Lorena; Kriebel, Ricardo; Neubig, Kurt M.; Whitten, W. Mark; Williams, Norris H.; Cameron, Kenneth M.
2015-01-01
Orchids are the most diverse family of angiosperms, with over 25 000 species, more than mammals, birds and reptiles combined. Tests of hypotheses to account for such diversity have been stymied by the lack of a fully resolved broad-scale phylogeny. Here, we provide such a phylogeny, based on 75 chloroplast genes for 39 species representing all orchid subfamilies and 16 of 17 tribes, time-calibrated against 17 angiosperm fossils. A supermatrix analysis places an additional 144 species based on three plastid genes. Orchids appear to have arisen roughly 112 million years ago (Mya); the subfamilies Orchidoideae and Epidendroideae diverged from each other at the end of the Cretaceous; and the eight tribes and three previously unplaced subtribes of the upper epidendroids diverged rapidly from each other between 37.9 and 30.8 Mya. Orchids appear to have undergone one significant acceleration of net species diversification in the orchidoids, and two accelerations and one deceleration in the upper epidendroids. Consistent with theory, such accelerations were correlated with the evolution of pollinia, the epiphytic habit, CAM photosynthesis, tropical distribution (especially in extensive cordilleras), and pollination via Lepidoptera or euglossine bees. Deceit pollination appears to have elevated the number of orchid species by one-half but not via acceleration of the rate of net diversification. The highest rate of net species diversification within the orchids (0.382 sp sp−1 My−1) is 6.8 times that at the Asparagales crown. PMID:26311671
Phylogenomic Insights into Animal Evolution.
Telford, Maximilian J; Budd, Graham E; Philippe, Hervé
2015-10-05
Animals make up only a small fraction of the eukaryotic tree of life, yet, from our vantage point as members of the animal kingdom, the evolution of the bewildering diversity of animal forms is endlessly fascinating. In the century following the publication of Darwin's Origin of Species, hypotheses regarding the evolution of the major branches of the animal kingdom - their relationships to each other and the evolution of their body plans - were based on a consideration of the morphological and developmental characteristics of the different animal groups. This morphology-based approach had many successes but important aspects of the evolutionary tree remained disputed. In the past three decades, molecular data, most obviously primary sequences of DNA and proteins, have provided an estimate of animal phylogeny largely independent of the morphological evolution we would ultimately like to understand. The molecular tree that has evolved over the past three decades has drastically altered our view of animal phylogeny and many aspects of the tree are no longer contentious. The focus of molecular studies on relationships between animal groups means, however, that the discipline has become somewhat divorced from the underlying biology and from the morphological characteristics whose evolution we aim to understand. Here, we consider what we currently know of animal phylogeny; what aspects we are still uncertain about and what our improved understanding of animal phylogeny can tell us about the evolution of the great diversity of animal life. Copyright © 2015 Elsevier Ltd. All rights reserved.
Rajaram, Vengaldas; Nepolean, Thirunavukkarasu; Senthilvel, Senapathy; Varshney, Rajeev K; Vadez, Vincent; Srivastava, Rakesh K; Shah, Trushar M; Supriya, Ambawat; Kumar, Sushil; Ramana Kumari, Basava; Bhanuprakash, Amindala; Narasu, Mangamoori Lakshmi; Riera-Lizarazu, Oscar; Hash, Charles Thomas
2013-03-09
Pearl millet [Pennisetum glaucum (L.) R. Br.] is a widely cultivated drought- and high-temperature tolerant C4 cereal grown under dryland, rainfed and irrigated conditions in drought-prone regions of the tropics and sub-tropics of Africa, South Asia and the Americas. It is considered an orphan crop with relatively few genomic and genetic resources. This study was undertaken to increase the EST-based microsatellite marker and genetic resources for this crop to facilitate marker-assisted breeding. Newly developed EST-SSR markers (99), along with previously mapped EST-SSR (17), genomic SSR (53) and STS (2) markers, were used to construct linkage maps of four F7 recombinant inbred populations (RIP) based on crosses ICMB 841-P3 × 863B-P2 (RIP A), H 77/833-2 × PRLT 2/89-33 (RIP B), 81B-P6 × ICMP 451-P8 (RIP C) and PT 732B-P2 × P1449-2-P1 (RIP D). Mapped loci numbers were greatest for RIP A (104), followed by RIP B (78), RIP C (64) and RIP D (59). Total map lengths (Haldane) were 615 cM, 690 cM, 428 cM and 276 cM, respectively. A total of 176 loci detected by 171 primer pairs were mapped among the four crosses. A consensus map of 174 loci (899 cM) detected by 169 primer pairs was constructed using MergeMap to integrate the individual linkage maps. Locus order in the consensus map was well conserved for nearly all linkage groups. Eighty-nine EST-SSR marker loci from this consensus map had significant BLAST hits (top hits with e-value ≤ 1E-10) on the genome sequences of rice, foxtail millet, sorghum, maize and Brachypodium with 35, 88, 58, 48 and 38 loci, respectively. The consensus map developed in the present study contains the largest set of mapped SSRs reported to date for pearl millet, and represents a major consolidation of existing pearl millet genetic mapping information. 
This study increased numbers of mapped pearl millet SSR markers by >50%, filling important gaps in previously published SSR-based linkage maps for this species and will greatly facilitate SSR-based QTL mapping and applied marker-assisted selection programs.
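The map lengths above are reported in Haldane centimorgans. As a minimal sketch of what that unit implies, the standard Haldane mapping function converts a recombination fraction r into a map distance d = -50 ln(1 - 2r) cM, assuming no crossover interference. The function names below are illustrative and are not taken from the study or from MergeMap.

```python
import math

def haldane_cm(r):
    """Haldane map distance in cM for a recombination fraction r (0 <= r < 0.5)."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must satisfy 0 <= r < 0.5")
    return -50.0 * math.log(1.0 - 2.0 * r)

def haldane_r(d_cm):
    """Inverse mapping: recombination fraction for a Haldane distance in cM."""
    if d_cm < 0:
        raise ValueError("map distance must be non-negative")
    return 0.5 * (1.0 - math.exp(-d_cm / 50.0))

# A recombination fraction of 10% corresponds to slightly more than 10 cM,
# because the function corrects for undetected double crossovers.
d = haldane_cm(0.10)
```

Because Haldane distances are additive along a linkage group, a total map length is the sum of the distances between adjacent loci.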
Choi, Hong-Kyu; Kim, Dongjin; Uhm, Taesik; Limpens, Eric; Lim, Hyunju; Mun, Jeong-Hwan; Kalo, Peter; Penmetsa, R Varma; Seres, Andrea; Kulikova, Olga; Roe, Bruce A; Bisseling, Ton; Kiss, Gyorgy B; Cook, Douglas R
2004-01-01
A core genetic map of the legume Medicago truncatula has been established by analyzing the segregation of 288 sequence-characterized genetic markers in an F(2) population composed of 93 individuals. These molecular markers correspond to 141 ESTs, 80 BAC end sequence tags, and 67 resistance gene analogs, covering 513 cM. In the case of EST-based markers we used an intron-targeted marker strategy with primers designed to anneal in conserved exon regions and to amplify across intron regions. Polymorphisms were significantly more frequent in intron vs. exon regions, thus providing an efficient mechanism to map transcribed genes. Genetic and cytogenetic analysis produced eight well-resolved linkage groups, which have been previously correlated with eight chromosomes by means of FISH with mapped BAC clones. We anticipated that mapping of conserved coding regions would have utility for comparative mapping among legumes; thus 60 of the EST-based primer pairs were designed to amplify orthologous sequences across a range of legume species. As an initial test of this strategy, we used primers designed against M. truncatula exon sequences to rapidly map genes in M. sativa. The resulting comparative map, which includes 68 bridging markers, indicates that the two Medicago genomes are highly similar and establishes the basis for a Medicago composite map. PMID:15082563
Phylogenomic Relationships between Amylolytic Enzymes from 85 Strains of Fungi
Chen, Wanping; Xie, Ting; Shao, Yanchun; Chen, Fusheng
2012-01-01
Fungal amylolytic enzymes, including α-amylase, glucoamylase and α-glucosidase, have been extensively exploited in diverse industrial applications such as high fructose syrup production, paper making, food processing and ethanol production. In this paper, amylolytic genes of 85 strains of fungi from the phyla Ascomycota, Basidiomycota, Chytridiomycota and Zygomycota were annotated on the genomic scale according to the classification of glycoside hydrolase (GH) from the Carbohydrate-Active enZymes (CAZy) Database. Comparisons of gene abundance in the fungi suggested that the repertoire of amylolytic genes adapted to their respective lifestyles. Amylolytic enzymes in family GH13 were divided into four distinct clades identified as heterologous α-amylases, eukaryotic α-amylases, bacterial and fungal α-amylases and GH13 α-glucosidases. Family GH15 had two branches, one for glucoamylases, and the other with currently unknown function. GH31 α-glucosidases showed diverse branches consisting of neutral α-glucosidases, lysosomal acid α-glucosidases and a new clade phylogenetically related to the bacterial counterparts. Distribution of starch-binding domains in the above fungal amylolytic enzymes was related to the enzyme source and phylogeny. Finally, likely scenarios for the evolution of amylolytic enzymes in fungi based on phylogenetic analyses were proposed. Our results provide new insights into evolutionary relationships among subgroups of fungal amylolytic enzymes and fungal evolutionary adaptation to ecological conditions. PMID:23166747
Zhang, Qian; Jun, Se -Ran; Leuze, Michael; ...
2017-01-19
The development of rapid, economical genome sequencing has shed new light on the classification of viruses. As of October 2016, the National Center for Biotechnology Information (NCBI) database contained >2 million viral genome sequences and a reference set of ~4000 viral genome sequences that cover a wide range of known viral families. Whole-genome sequences can be used to improve viral classification and provide insight into the viral tree of life. However, due to the lack of evolutionary conservation amongst diverse viruses, it is not feasible to build a viral tree of life using traditional phylogenetic methods based on conserved proteins. In this study, we used an alignment-free method that uses k-mers as genomic features for a large-scale comparison of complete viral genomes available in RefSeq. To determine the optimal feature length, k (an essential step in constructing a meaningful dendrogram), we designed a comprehensive strategy that combines three approaches: (1) cumulative relative entropy, (2) average number of common features among genomes, and (3) the Shannon diversity index. This strategy was used to determine k for all 3,905 complete viral genomes in RefSeq. Lastly, the resulting dendrogram shows consistency with the viral taxonomy of the ICTV and the Baltimore classification of viruses.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhang, Qian; Jun, Se -Ran; Leuze, Michael
The development of rapid, economical genome sequencing has shed new light on the classification of viruses. As of October 2016, the National Center for Biotechnology Information (NCBI) database contained >2 million viral genome sequences and a reference set of ~4000 viral genome sequences that cover a wide range of known viral families. Whole-genome sequences can be used to improve viral classification and provide insight into the viral tree of life. However, due to the lack of evolutionary conservation amongst diverse viruses, it is not feasible to build a viral tree of life using traditional phylogenetic methods based on conserved proteins. In this study, we used an alignment-free method that uses k-mers as genomic features for a large-scale comparison of complete viral genomes available in RefSeq. To determine the optimal feature length, k (an essential step in constructing a meaningful dendrogram), we designed a comprehensive strategy that combines three approaches: (1) cumulative relative entropy, (2) average number of common features among genomes, and (3) the Shannon diversity index. This strategy was used to determine k for all 3,905 complete viral genomes in RefSeq. Lastly, the resulting dendrogram shows consistency with the viral taxonomy of the ICTV and the Baltimore classification of viruses.
The evolutionary history of bears is characterized by gene flow across species
Kumar, Vikas; Lammers, Fritjof; Bidon, Tobias; Pfenninger, Markus; Kolter, Lydia; Nilsson, Maria A.; Janke, Axel
2017-01-01
Bears are iconic mammals with a complex evolutionary history. Natural bear hybrids and studies of few nuclear genes indicate that gene flow among bears may be more common than expected and not limited to polar and brown bears. Here we present a genome analysis of the bear family with representatives of all living species. Phylogenomic analyses of 869 mega base pairs divided into 18,621 genome fragments yielded a well-resolved coalescent species tree despite signals for extensive gene flow across species. However, genome analyses using different statistical methods show that gene flow is not limited to closely related species pairs. Strong ancestral gene flow between the Asiatic black bear and the ancestor to polar, brown and American black bear explains uncertainties in reconstructing the bear phylogeny. Gene flow across the bear clade may be mediated by intermediate species such as the geographically wide-spread brown bears leading to large amounts of phylogenetic conflict. Genome-scale analyses lead to a more complete understanding of complex evolutionary processes. Evidence for extensive inter-specific gene flow, found also in other animal species, necessitates shifting the attention from speciation processes achieving genome-wide reproductive isolation to the selective processes that maintain species divergence in the face of gene flow. PMID:28422140
Comparative phylogenomic analysis provides insights into TCP gene functions in Sorghum
Francis, Aleena; Dhaka, Namrata; Bakshi, Mohit; Jung, Ki-Hong; Sharma, Manoj K.; Sharma, Rita
2016-01-01
Sorghum is a highly efficient C4 crop with potential to mitigate challenges associated with food, feed and fuel. TCP proteins are of particular interest for crop improvement programs due to their well-demonstrated roles in crop domestication and in shaping plant architecture, thereby affecting agronomic traits. We identified 20 TCP genes from Sorghum. Except SbTCP8, all are either intronless or contain introns in the untranslated regions. Comparative phylogenetic analysis of Arabidopsis, rice, Brachypodium and Sorghum TCP proteins revealed two distinct classes categorized into ten sub-clades. Sub-clade F is dicot-specific, whereas A2, G1 and I1 groups only contained genes from grasses. Sub-clade B was missing in Sorghum, whereas group A1 was missing in rice, indicating species-specific divergence of TCP proteins. TCP proteins of Sorghum are enriched in disorder-promoting residues, with class I containing higher percent disorder than class II proteins. Seven pairs of paralogous TCP genes were identified from Sorghum, five of which seem to predate Rice-Sorghum divergence. All of them have diverged in their expression. Based on the expression and orthology analysis, five Sorghum genes have been shortlisted for further investigation for their roles in regulating plant morphology. In addition, three genes have been identified as candidates for engineering abiotic stress tolerance. PMID:27917941
Midha, Samriti
2014-01-01
Xanthomonas axonopodis pv. citri (Xac) is the causal agent of citrus bacterial canker (CBC) and is a serious problem worldwide. Like CBC, several important diseases in other fruits, such as mango, pomegranate, and grape, are also caused by Xanthomonas pathovars that display remarkable specificity toward their hosts. While citrus and mango diseases were documented more than 100 years ago, the pomegranate and grape diseases have been known only since the 1950s and 1970s, respectively. Interestingly, diseases caused by all these pathovars were noted first in India. Our genome-based phylogenetic studies suggest that these diverse pathogens belong to a single species and these pathovars may be just a group of rapidly evolving strains. Furthermore, the recently reported pathovars, such as those infecting grape and pomegranate, form independent clonal lineages, while the citrus and mango pathovars that have been known for a long time form one clonal lineage. Such an understanding of their phylogenomic relationship has further allowed us to understand major and unique variations in the lineages that give rise to these pathovars. Whole-genome sequencing studies including ecological relatives from their putative country of origin have allowed us to understand the evolutionary history of Xac and other pathovars that infect fruits. PMID:25085494
Comparative transcriptomics of early dipteran development
2013-01-01
Background Modern sequencing technologies have massively increased the amount of data available for comparative genomics. Whole-transcriptome shotgun sequencing (RNA-seq) provides a powerful basis for comparative studies. In particular, this approach holds great promise for emerging model species in fields such as evolutionary developmental biology (evo-devo). Results We have sequenced early embryonic transcriptomes of two non-drosophilid dipteran species: the moth midge Clogmia albipunctata, and the scuttle fly Megaselia abdita. Our analysis includes a third, published, transcriptome for the hoverfly Episyrphus balteatus. These emerging models for comparative developmental studies close an important phylogenetic gap between Drosophila melanogaster and other insect model systems. In this paper, we provide a comparative analysis of early embryonic transcriptomes across species, and use our data for a phylogenomic re-evaluation of dipteran phylogenetic relationships. Conclusions We show how comparative transcriptomics can be used to create useful resources for evo-devo, and to investigate phylogenetic relationships. Our results demonstrate that de novo assembly of short (Illumina) reads yields high-quality, high-coverage transcriptomic data sets. We use these data to investigate deep dipteran phylogenetic relationships. Our results, based on a concatenation of 160 orthologous genes, provide support for the traditional view of Clogmia being the sister group of Brachycera (Megaselia, Episyrphus, Drosophila), rather than that of Culicomorpha (which includes mosquitoes and blackflies). PMID:23432914
Li, Lei; Wong, Hin-chung; Nong, Wenyan; Cheung, Man Kit; Law, Patrick Tik Wan; Kam, Kai Man; Kwan, Hoi Shan
2014-12-18
Vibrio parahaemolyticus is a Gram-negative halophilic bacterium. Infections with the bacterium could become systemic and can be life-threatening to immunocompromised individuals. Genome sequences of a few clinical isolates of V. parahaemolyticus are currently available, but the genome dynamics across the species and virulence potential of environmental strains on a genome-scale have not been described before. Here we present genome sequences of four V. parahaemolyticus clinical strains from stool samples of patients and five environmental strains in Hong Kong. Phylogenomics analysis based on single nucleotide polymorphisms revealed a clear distinction between the clinical and environmental isolates. A new gene cluster belonging to the biofilm associated proteins of V. parahaemolyticus was found in clinical strains. In addition, a novel small genomic island frequently found among clinical isolates was reported. A few environmental strains were found harboring virulence genes and prophage elements, indicating their virulence potential. A unique biphenyl degradation pathway was also reported. A database for V. parahaemolyticus (http://kwanlab.bio.cuhk.edu.hk/vp) was constructed here as a platform to access and analyze genome sequences and annotations of the bacterium. We have performed a comparative genomics analysis of clinical and environmental strains of V. parahaemolyticus. Our analyses could facilitate understanding of the phylogenetic diversity and niche adaptation of this bacterium.
Zhang, Qian; Jun, Se-Ran; Leuze, Michael; Ussery, David; Nookaew, Intawat
2017-01-01
The development of rapid, economical genome sequencing has shed new light on the classification of viruses. As of October 2016, the National Center for Biotechnology Information (NCBI) database contained >2 million viral genome sequences and a reference set of ~4000 viral genome sequences that cover a wide range of known viral families. Whole-genome sequences can be used to improve viral classification and provide insight into the viral “tree of life”. However, due to the lack of evolutionary conservation amongst diverse viruses, it is not feasible to build a viral tree of life using traditional phylogenetic methods based on conserved proteins. In this study, we used an alignment-free method that uses k-mers as genomic features for a large-scale comparison of complete viral genomes available in RefSeq. To determine the optimal feature length, k (an essential step in constructing a meaningful dendrogram), we designed a comprehensive strategy that combines three approaches: (1) cumulative relative entropy, (2) average number of common features among genomes, and (3) the Shannon diversity index. This strategy was used to determine k for all 3,905 complete viral genomes in RefSeq. The resulting dendrogram shows consistency with the viral taxonomy of the ICTV and the Baltimore classification of viruses. PMID:28102365
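Two of the quantities named in the abstract above, k-mer features and the Shannon diversity index, can be illustrated with a minimal Python sketch. The abstract does not give the exact definitions or normalization used in the study, so the standard index H = -Σ p_i ln p_i over k-mer frequencies is assumed here, and the function names (kmer_counts, shannon_index, shared_features) are hypothetical.

```python
from collections import Counter
import math

def kmer_counts(seq, k):
    """Count all overlapping k-mers in a sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def shannon_index(counts):
    """Shannon diversity H = -sum(p * ln p) over the k-mer frequencies."""
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())

def shared_features(counts_a, counts_b):
    """Number of distinct k-mers present in both genomes."""
    return len(set(counts_a) & set(counts_b))

# Toy sequences standing in for genomes (hypothetical data).
a = kmer_counts("ACGTACGT", 2)
b = kmer_counts("ACGGACGA", 2)
common = shared_features(a, b)
```

Scanning a range of k values and comparing such statistics across genomes is one simple way to see how feature length affects how informative the k-mer profile is.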
Svartström, Olov; Alneberg, Johannes; Terrapon, Nicolas; Lombard, Vincent; de Bruijn, Ino; Malmsten, Jonas; Dalin, Ann-Marie; El Muller, Emilie; Shah, Pranjul; Wilmes, Paul; Henrissat, Bernard; Aspeborg, Henrik; Andersson, Anders F
2017-11-01
The moose (Alces alces) is a ruminant that harvests energy from fiber-rich lignocellulose material through carbohydrate-active enzymes (CAZymes) produced by its rumen microbes. We applied shotgun metagenomics to rumen contents from six moose to obtain insights into this microbiome. Following binning, 99 metagenome-assembled genomes (MAGs) belonging to 11 prokaryotic phyla were reconstructed and characterized based on phylogeny and CAZyme profile. The taxonomy of these MAGs reflected the overall composition of the metagenome, with dominance of the phyla Bacteroidetes and Firmicutes. Unlike in other ruminants, Spirochaetes constituted a significant proportion of the community and our analyses indicate that the corresponding strains are primarily pectin digesters. Pectin-degrading genes were also common in MAGs of Ruminococcus, Fibrobacteres and Bacteroidetes and were overall overrepresented in the moose microbiome compared with other ruminants. Phylogenomic analyses revealed several clades within the Bacteroidetes without previously characterized genomes. Several of these MAGs encoded a large number of dockerins, a module usually associated with cellulosomes. The Bacteroidetes dockerins were often linked to CAZymes and sometimes encoded inside polysaccharide utilization loci, which has never been reported before. The almost 100 CAZyme-annotated genomes reconstructed in this study provide an in-depth view of an efficient lignocellulose-degrading microbiome and prospects for developing enzyme technology for biorefineries.
Comparative phylogenomic analysis provides insights into TCP gene functions in Sorghum.
Francis, Aleena; Dhaka, Namrata; Bakshi, Mohit; Jung, Ki-Hong; Sharma, Manoj K; Sharma, Rita
2016-12-05
Sorghum is a highly efficient C4 crop with potential to mitigate challenges associated with food, feed and fuel. TCP proteins are of particular interest for crop improvement programs due to their well-demonstrated roles in crop domestication and in shaping plant architecture, thereby affecting agronomic traits. We identified 20 TCP genes from Sorghum. Except SbTCP8, all are either intronless or contain introns in the untranslated regions. Comparative phylogenetic analysis of Arabidopsis, rice, Brachypodium and Sorghum TCP proteins revealed two distinct classes categorized into ten sub-clades. Sub-clade F is dicot-specific, whereas A2, G1 and I1 groups only contained genes from grasses. Sub-clade B was missing in Sorghum, whereas group A1 was missing in rice, indicating species-specific divergence of TCP proteins. TCP proteins of Sorghum are enriched in disorder-promoting residues, with class I containing higher percent disorder than class II proteins. Seven pairs of paralogous TCP genes were identified from Sorghum, five of which seem to predate Rice-Sorghum divergence. All of them have diverged in their expression. Based on the expression and orthology analysis, five Sorghum genes have been shortlisted for further investigation for their roles in regulating plant morphology. In addition, three genes have been identified as candidates for engineering abiotic stress tolerance.
Phylogenomics of African guenons.
Moulin, Sibyle; Gerbault-Seureau, Michèle; Dutrillaux, Bernard; Richard, Florence Anne
2008-01-01
The karyotypes of 28 specimens belonging to 26 species of Cercopithecinae have been compared with each other and with human karyotype by chromosome banding and, for some of them, by Zoo-FISH (human painting probes) techniques. The study includes the first description of the karyotypes of four species and a synonym of Cercopithecus nictitans. The chromosomal homologies obtained provide us with new data on a large number of rearrangements. This allows us to code chromosomal characters to draw Cercopithecini phylogenetic trees, which are compared to phylogenetic data based on DNA sequences. Our findings show that some of the superspecies proposed by Kingdon (1997 The Kingdon Field Guide to African Mammals, Academic Press.) and Groves (2001 Primates Taxonomy, Smithsonian Institution Press) do not form homogeneous groups and that the genus Cercopithecus is paraphyletic, in agreement with previous molecular analyses. The evolution of Cercopithecini karyotypes is mainly due to non-centromeric chromosome fissions and centromeric shifts or inversions. Non-Robertsonian translocations occurred in C. hamlyni and C. neglectus. The position of chromosomal rearrangements in the phylogenetic tree leads us to propose that the Cercopithecini evolution proceeded by either repeated fission events facilitated by peculiar genomic structures or successive reticulate phases, in which heterozygous populations for few rearranged chromosomes were present, allowing the spreading of chromosomal forms in various combinations, before the speciation process.
Wu, Chung-Shien; Wang, Ting-Jen; Wu, Chia-Wen; Wang, Ya-Nan
2017-01-01
To date, little is known about the evolution of plastid genomes (plastomes) in Lauraceae. As one of the top five largest families in tropical forests, the Lauraceae contain many species that are important ecologically and economically. Lauraceous species also provide wonderful materials to study the evolutionary trajectory in response to parasitism because they contain both nonparasitic and parasitic species. This study compared the plastomes of nine Lauraceous species, including the sole hemiparasitic and herbaceous genus Cassytha (laurel dodder; here represented by Cassytha filiformis). We found differential contractions of the canonical inverted repeat (IR), resulting in two IR types present in Lauraceae. These two IR types reinforce Cryptocaryeae and Neocinnamomum–Perseeae–Laureae as two separate clades. Our data reveal several traits unique to Cas. filiformis, including loss of IRs, loss or pseudogenization of 11 ndh and rpl23 genes, richness of repeats, and accelerated rates of nucleotide substitutions in protein-coding genes. Although Cas. filiformis is low in chlorophyll content, our analysis based on dN/dS ratios suggests that both its plastid house-keeping and photosynthetic genes are under strong selective constraints. Hence, we propose that short generation time and herbaceous lifestyle rather than reduced photosynthetic ability drive the accelerated rates of nucleotide substitutions in Cas. filiformis. PMID:28985306
The evolutionary history of bears is characterized by gene flow across species.
Kumar, Vikas; Lammers, Fritjof; Bidon, Tobias; Pfenninger, Markus; Kolter, Lydia; Nilsson, Maria A; Janke, Axel
2017-04-19
Bears are iconic mammals with a complex evolutionary history. Natural bear hybrids and studies of few nuclear genes indicate that gene flow among bears may be more common than expected and not limited to polar and brown bears. Here we present a genome analysis of the bear family with representatives of all living species. Phylogenomic analyses of 869 mega base pairs divided into 18,621 genome fragments yielded a well-resolved coalescent species tree despite signals for extensive gene flow across species. However, genome analyses using different statistical methods show that gene flow is not limited to closely related species pairs. Strong ancestral gene flow between the Asiatic black bear and the ancestor to polar, brown and American black bear explains uncertainties in reconstructing the bear phylogeny. Gene flow across the bear clade may be mediated by intermediate species such as the geographically wide-spread brown bears leading to large amounts of phylogenetic conflict. Genome-scale analyses lead to a more complete understanding of complex evolutionary processes. Evidence for extensive inter-specific gene flow, found also in other animal species, necessitates shifting the attention from speciation processes achieving genome-wide reproductive isolation to the selective processes that maintain species divergence in the face of gene flow.
The performance of the Congruence Among Distance Matrices (CADM) test in phylogenetic analysis
2011-01-01
Background: CADM is a statistical test used to estimate the level of Congruence Among Distance Matrices. It has been shown in previous studies to have a correct rate of type I error and good power when applied to dissimilarity matrices and to ultrametric distance matrices. Contrary to most other tests of incongruence used in phylogenetic analysis, the null hypothesis of the CADM test assumes complete incongruence of the phylogenetic trees instead of congruence. In this study, we performed computer simulations to assess the type I error rate and power of the test. It was applied to additive distance matrices representing phylogenies and to genetic distance matrices obtained from nucleotide sequences of different lengths that were simulated on randomly generated trees of varying sizes, and under different evolutionary conditions. Results: Our results showed that the test has an accurate type I error rate and good power. As expected, power increased with the number of objects (i.e., taxa), the number of partially or completely congruent matrices and the level of congruence among distance matrices. Conclusions: Based on our results, we suggest that CADM is an excellent candidate to test for congruence and, when present, to estimate its level in phylogenomic studies where numerous genes are analysed simultaneously. PMID:21388552
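The idea behind a congruence test of this kind can be sketched in a few lines. The example below is a simplified illustration, not the published implementation: it assumes the statistic is Kendall's coefficient of concordance W computed on the ranked upper-triangle distances of each matrix, and that the global null hypothesis (complete incongruence) is tested by permuting the objects of every matrix independently. It applies no tie correction to W, and all function names are hypothetical.

```python
import random


def upper_triangle(d):
    """Return the pairwise distances above the diagonal as a flat list."""
    n = len(d)
    return [d[i][j] for i in range(n) for j in range(i + 1, n)]


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in order[start:end + 1]:
            ranks[k] = mean_rank
        start = end + 1
    return ranks


def kendall_w(matrices):
    """Kendall's W across matrices (no tie correction; a simplification)."""
    ranked = [average_ranks(upper_triangle(d)) for d in matrices]
    m, n = len(ranked), len(ranked[0])
    col_sums = [sum(r[k] for r in ranked) for k in range(n)]
    mean = sum(col_sums) / n
    s = sum((c - mean) ** 2 for c in col_sums)
    return 12.0 * s / (m ** 2 * (n ** 3 - n))


def permute_objects(d, perm):
    """Reorder rows and columns of a distance matrix with the same permutation."""
    return [[d[a][b] for b in perm] for a in perm]


def cadm_global_test(matrices, n_perm=199, seed=0):
    """Return (W, permutation p-value) under a null of complete incongruence."""
    rng = random.Random(seed)
    n_obj = len(matrices[0])
    w_obs = kendall_w(matrices)
    hits = 1  # the observed value counts as one realisation of the null
    for _ in range(n_perm):
        shuffled = []
        for d in matrices:
            perm = list(range(n_obj))
            rng.shuffle(perm)
            shuffled.append(permute_objects(d, perm))
        if kendall_w(shuffled) >= w_obs - 1e-12:
            hits += 1
    return w_obs, hits / (n_perm + 1)


if __name__ == "__main__":
    # Toy data: three identical matrices built from points on a line.
    xs = [0, 1, 3, 7, 15, 31]
    d = [[abs(a - b) for b in xs] for a in xs]
    print(cadm_global_test([d, d, d], n_perm=99, seed=1))
```

With identical matrices W equals 1, and a small p-value rejects incongruence. In a phylogenomic setting each matrix would come from one gene.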
Rodríguez, María Cecilia; Loaces, Inés; Amarelle, Vanesa; Senatore, Daniella; Iriarte, Andrés; Fabiano, Elena; Noya, Francisco
2015-01-01
A metagenomic fosmid library from bovine rumen was used to identify clones with lipolytic activity. One positive clone was isolated. The gene responsible for the observed phenotype was identified by in vitro transposon mutagenesis and sequencing and was named est10. The 367 amino acids sequence harbors a signal peptide, the conserved secondary structure arrangement of alpha/beta hydrolases, and a GHSQG pentapeptide which is characteristic of esterases and lipases. Homology based 3D-modelling confirmed the conserved spatial orientation of the serine in a nucleophilic elbow. By sequence comparison, Est10 is related to hydrolases that are grouped into the non-specific Pfam family DUF3089 and to other characterized esterases that were recently classified into the new family XV of lipolytic enzymes. Est10 was heterologously expressed in Escherichia coli as a His-tagged fusion protein, purified and biochemically characterized. Est10 showed maximum activity towards C4 aliphatic chains and undetectable activity towards C10 and longer chains which prompted its classification as an esterase. However, it was able to efficiently catalyze the hydrolysis of aryl esters such as methyl phenylacetate and phenyl acetate. The optimum pH of this enzyme is 9.0, which is uncommon for esterases, and it exhibits an optimal temperature at 40°C. The activity of Est10 was inhibited by metal ions, detergents, chelating agents and additives. We have characterized an alkaline esterase produced by a still unidentified bacterium belonging to a recently proposed new family of esterases. PMID:25973851
Li, Yu-Ping; Xia, Run-Xi; Wang, Huan; Li, Xi-Sheng; Liu, Yan-Qun; Wei, Zhao-Jun; Lu, Cheng; Xiang, Zhong-Huai
2009-06-24
In this study we successfully constructed a full-length cDNA library from the Chinese oak silkworm, Antheraea pernyi, the best-known wild silkworm used for silk production and insect food. Total RNA was extracted from a single fresh female pupa at the diapause stage. The titer of the library was 5 × 10^5 cfu/ml and the proportion of recombinant clones was approximately 95%. Expressed sequence tag (EST) analysis was used to characterize the library. A total of 175 clustered ESTs consisting of 24 contigs and 151 singlets were generated from 250 effective sequences. Of the 175 unigenes, 97 (55.4%) were known genes but only five were from A. pernyi, 37 (21.2%) were known ESTs without function annotation, and 41 (23.4%) were novel ESTs. By EST sequencing, a gene coding for a KK-42-binding protein in A. pernyi (named ApKK42-BP; GenBank accession no. FJ744151) was identified and characterized. Protein sequence analysis showed that ApKK42-BP was not a membrane protein but an extracellular protein with a signal peptide at positions 1-18, and contained two putative conserved domains, abhydro_lipase and abhydrolase_1, suggesting it may be a member of the lipase superfamily. Expression analysis based on the number of ESTs showed that ApKK42-BP was abundantly expressed during the diapause stage, suggesting it may also be involved in pupa-diapause termination.
Li, Yu-Ping; Xia, Run-Xi; Wang, Huan; Li, Xi-Sheng; Liu, Yan-Qun; Wei, Zhao-Jun; Lu, Cheng; Xiang, Zhong-Huai
2009-01-01
In this study we successfully constructed a full-length cDNA library from the Chinese oak silkworm, Antheraea pernyi, the best-known wild silkworm used for silk production and insect food. Total RNA was extracted from a single fresh female pupa at the diapause stage. The titer of the library was 5 × 10^5 cfu/ml and the proportion of recombinant clones was approximately 95%. Expressed sequence tag (EST) analysis was used to characterize the library. A total of 175 clustered ESTs consisting of 24 contigs and 151 singlets were generated from 250 effective sequences. Of the 175 unigenes, 97 (55.4%) were known genes but only five were from A. pernyi, 37 (21.2%) were known ESTs without function annotation, and 41 (23.4%) were novel ESTs. By EST sequencing, a gene coding for a KK-42-binding protein in A. pernyi (named ApKK42-BP; GenBank accession no. FJ744151) was identified and characterized. Protein sequence analysis showed that ApKK42-BP was not a membrane protein but an extracellular protein with a signal peptide at positions 1-18, and contained two putative conserved domains, abhydro_lipase and abhydrolase_1, suggesting it may be a member of the lipase superfamily. Expression analysis based on the number of ESTs showed that ApKK42-BP was abundantly expressed during the diapause stage, suggesting it may also be involved in pupa-diapause termination. PMID:19564928
Abdelkafi, Slim; Ogata, Hiroyuki; Barouh, Nathalie; Fouquet, Benjamin; Lebrun, Régine; Pina, Michel; Scheirlinckx, Frantz; Villeneuve, Pierre; Carrière, Frédéric
2009-11-01
An esterase (CpEst) showing high specific activities on tributyrin and short chain vinyl esters was obtained from Carica papaya latex after an extraction step with zwitterionic detergent and sonication, followed by gel filtration chromatography. Although the protein could not be purified to complete homogeneity due to its presence in high molecular mass aggregates, a major protein band with an apparent molecular mass of 41 kDa was obtained by SDS-PAGE. This material was digested with trypsin and the amino acid sequences of the tryptic peptides were determined by LC/ESI/MS/MS. These sequences were used to identify a partial cDNA (679 bp) from expressed sequence tags (ESTs) of C. papaya. Based upon EST sequences, a full-length gene was identified in the genome of C. papaya, with an open reading frame of 1029 bp encoding a protein of 343 amino acid residues, with a theoretical molecular mass of 38 kDa. From sequence analysis, CpEst was identified as a GDSL-motif carboxylester hydrolase belonging to the SGNH protein family and four potential N-glycosylation sites were identified. The putative catalytic triad was localised (Ser(35)-Asp(307)-His(310)) with the nucleophile serine being part of the GDSL-motif. A 3D-model of CpEst was built from known X-ray structures and sequence alignments and the catalytic triad was found to be exposed at the surface of the molecule, thus confirming the results of CpEst inhibition by tetrahydrolipstatin suggesting a direct accessibility of the inhibitor to the active site.
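The reported sizes can be checked with simple arithmetic. The sketch below assumes a rule-of-thumb average residue mass of about 110 Da, which is an approximation and not a figure from the abstract; the function names are hypothetical. Note that 1029 bp corresponds to 343 codons, and the abstract does not say whether that count includes the stop codon.

```python
ORF_LENGTH_BP = 1029          # open reading frame length reported in the abstract
AVG_RESIDUE_MASS_DA = 110.0   # assumption: rule-of-thumb average residue mass


def codons_in_orf(length_bp):
    """Number of codons in an ORF whose length is a multiple of three."""
    if length_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return length_bp // 3


def approx_mass_kda(n_residues, avg_residue_mass_da=AVG_RESIDUE_MASS_DA):
    """Rough protein mass in kDa from the residue count."""
    return n_residues * avg_residue_mass_da / 1000.0


if __name__ == "__main__":
    n = codons_in_orf(ORF_LENGTH_BP)
    print(n, "codons, roughly", round(approx_mass_kda(n), 1), "kDa")
```

The estimate is consistent with the theoretical mass of 38 kDa; the 41 kDa band seen by SDS-PAGE is an apparent mass and need not match exactly.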
2011-01-01
Background: Nocturnal insects such as moths are ideal models to study the molecular bases of olfaction, which they use, among other things, for the detection of mating partners and host plants. Knowing how an odour generates a neuronal signal in insect antennae is crucial for understanding the physiological bases of olfaction, and could also lead to the identification of original targets for the development of olfactory-based control strategies against herbivorous moth pests. Here, we describe an Expressed Sequence Tag (EST) project to characterize the antennal transcriptome of the noctuid pest model, Spodoptera littoralis, and to identify candidate genes involved in odour/pheromone detection. Results: By targeting cDNAs from male antennae, we biased gene discovery towards genes potentially involved in male olfaction, including pheromone reception. A total of 20760 ESTs were obtained from a normalized library and were assembled into 9033 unigenes. Of these, 6530 were annotated based on BLAST analyses, and gene prediction software identified 6738 ORFs. The unigenes were compared to the Bombyx mori proteome and to ESTs derived from Lepidoptera transcriptome projects. We identified a large number of candidate genes involved in odour and pheromone detection and turnover, including 31 candidate chemosensory receptor genes, but also genes potentially involved in olfactory modulation. Conclusions: Our project has generated a large collection of antennal transcripts from a lepidopteran. The normalization process, which allows enrichment in low-abundance genes, proved to be particularly relevant for identifying chemosensory receptors in a species for which no genomic data are available. Our results also suggest that olfactory modulation can take place at the level of the antennae themselves. These EST resources will be invaluable for exploring the mechanisms of olfaction and pheromone detection in S. littoralis, and for ultimately identifying original targets to fight against herbivorous moth pests. PMID:21276261
Uyeda, Josef C; Harmon, Luke J; Blank, Carrine E
2016-01-01
Cyanobacteria have exerted a profound influence on the progressive oxygenation of Earth. As a complementary approach to examining the geologic record, phylogenomic and trait evolutionary analyses of extant species can lead to new insights. We constructed new phylogenomic trees and analyzed phenotypic trait data using novel phylogenetic comparative methods. We elucidated the dynamics of trait evolution in Cyanobacteria over billion-year timescales, and provide evidence that major geologic events in early Earth's history have shaped, and been shaped by, evolution in Cyanobacteria. We identify a robust core cyanobacterial phylogeny and a smaller set of taxa that exhibit long-branch attraction artifacts. We estimated the age of nodes and reconstructed the ancestral character states of 43 phenotypic characters. We find high levels of phylogenetic signal for nearly all traits, indicating the phylogeny carries substantial predictive power. The earliest cyanobacterial lineages likely lived in freshwater habitats, had small cell diameters, were benthic or sessile, and possibly epilithic/endolithic with a sheath. We jointly analyzed a subset of 25 binary traits to determine whether rates of trait evolution have shifted over time in conjunction with major geologic events. Phylogenetic comparative analyses reveal an overriding signal of decreasing rates of trait evolution through time. Furthermore, the data suggest two major rate shifts in trait evolution associated with bursts of evolutionary innovation. The first rate shift occurs in the aftermath of the Great Oxidation Event and "Snowball Earth" glaciations and is associated with a decrease in the evolutionary rates around 1.8-1.6 Ga. This rate shift seems to indicate the end of a major diversification of cyanobacterial phenotypes, particularly related to traits associated with filamentous morphology, heterocysts and motility in freshwater ecosystems. Another burst appears around the time of the Neoproterozoic Oxidation Event and is associated with the acquisition of traits involved in planktonic growth in marine habitats. Our results demonstrate how uniting genomic and phenotypic datasets in extant bacterial species can shed light on billion-year-old events in Earth's history.