phylogenetic comparative method: Topics by Science.gov

Sample records for phylogenetic comparative method

GENOME-WIDE COMPARATIVE ANALYSIS OF PHYLOGENETIC TREES: THE PROKARYOTIC FOREST OF LIFE

PubMed Central

Puigbò, Pere; Wolf, Yuri I.; Koonin, Eugene V.

2013-01-01

Genome-wide comparison of phylogenetic trees is becoming an increasingly common approach in evolutionary genomics, and a variety of approaches for such comparison have been developed. In this article we present several methods for comparative analysis of large numbers of phylogenetic trees. To compare phylogenetic trees taking into account the bootstrap support for each internal branch, the Boot-Split Distance (BSD) method is introduced as an extension of the previously developed Split Distance (SD) method for tree comparison. The BSD method implements the straightforward idea that comparison of phylogenetic trees can be made more robust by treating tree splits differentially depending on the bootstrap support. Approaches are also introduced for detecting tree-like and net-like evolutionary trends in the phylogenetic Forest of Life (FOL), i.e., the entirety of the phylogenetic trees for conserved genes of prokaryotes. The principal method employed for this purpose includes mapping quartets of species onto trees to calculate the support of each quartet topology and so to quantify the tree and net contributions to the distances between species. We describe the application of these methods to analyze the FOL and the results obtained with these methods. These results support the concept of the Tree of Life (TOL) as a central evolutionary trend in the FOL as opposed to the traditional view of the TOL as a ‘species tree’. PMID:22399455
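Illustrative note (not part of the record): the idea of comparing two trees by their splits can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' SD or BSD implementation: trees are given as nested tuples of leaf names, the distance is taken as the fraction of non-trivial splits found in only one of the two trees, and bootstrap weighting is left out because the abstract does not give its formula. The function names `leaves`, `splits`, and `split_distance` are hypothetical.

```python
from itertools import chain


def leaves(tree):
    """Return the set of leaf names under a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset(chain.from_iterable(leaves(child) for child in tree))


def splits(tree):
    """Collect the non-trivial bipartitions (splits) induced by internal branches."""
    all_leaves = leaves(tree)
    ref = min(all_leaves)  # fixed reference leaf used to normalise each split
    found = set()

    def visit(node):
        if isinstance(node, str):
            return
        clade = leaves(node)
        # Skip trivial splits: one leaf on either side, or the whole leaf set.
        if 1 < len(clade) < len(all_leaves) - 1:
            # Always store the side that does not contain the reference leaf.
            found.add(clade if ref not in clade else all_leaves - clade)
        for child in node:
            visit(child)

    visit(tree)
    return found


def split_distance(tree_a, tree_b):
    """Fraction of splits present in only one of the two trees (assumed definition)."""
    sa, sb = splits(tree_a), splits(tree_b)
    total = len(sa) + len(sb)
    return len(sa ^ sb) / total if total else 0.0


# Hypothetical five-taxon trees that differ in the placement of B and C.
t1 = ((("A", "B"), "C"), ("D", "E"))
t2 = ((("A", "C"), "B"), ("D", "E"))
print(split_distance(t1, t1))  # 0.0
print(split_distance(t1, t2))  # 0.5
```

A support-aware variant in the spirit of BSD would attach a weight to each split before comparing; the exact weighting should be taken from the article itself.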
Genome-wide comparative analysis of phylogenetic trees: the prokaryotic forest of life.

PubMed

Puigbò, Pere; Wolf, Yuri I; Koonin, Eugene V

2012-01-01

Genome-wide comparison of phylogenetic trees is becoming an increasingly common approach in evolutionary genomics, and a variety of approaches for such comparison have been developed. In this article, we present several methods for comparative analysis of large numbers of phylogenetic trees. To compare phylogenetic trees taking into account the bootstrap support for each internal branch, the Boot-Split Distance (BSD) method is introduced as an extension of the previously developed Split Distance method for tree comparison. The BSD method implements the straightforward idea that comparison of phylogenetic trees can be made more robust by treating tree splits differentially depending on the bootstrap support. Approaches are also introduced for detecting tree-like and net-like evolutionary trends in the phylogenetic Forest of Life (FOL), i.e., the entirety of the phylogenetic trees for conserved genes of prokaryotes. The principal method employed for this purpose includes mapping quartets of species onto trees to calculate the support of each quartet topology and so to quantify the tree and net contributions to the distances between species. We describe the application of these methods to analyze the FOL and the results obtained with these methods. These results support the concept of the Tree of Life (TOL) as a central evolutionary trend in the FOL as opposed to the traditional view of the TOL as a "species tree."
Phylogenic inference using alignment-free methods for applications in microbial community surveys using 16s rRNA gene

PubMed Central

2017-01-01

The diversity of microbiota is best explored by understanding the phylogenetic structure of the microbial communities. Traditionally, sequence alignment has been used for phylogenetic inference. However, alignment-based approaches come with significant challenges and limitations when massive amounts of data are analyzed. In the past decade, alignment-free approaches have enabled genome-scale phylogenetic inference. Here we evaluate three alignment-free methods (ACS, CVTree, and Kr) for phylogenetic inference with 16S rRNA gene data. We use a taxonomic gold standard to compare the accuracy of alignment-free phylogenetic inference with that of common microbiome-wide phylogenetic inference pipelines based on PyNAST and MUSCLE alignments with FastTree and RAxML. We re-simulate fecal communities from Human Microbiome Project data to evaluate the performance of the methods on datasets with properties of real data. Our comparisons show that alignment-free methods are not inferior to alignment-based methods in giving accurate and robust phylogenetic trees. Moreover, consensus ensembles of alignment-free phylogenies are superior to those built from alignment-based methods in their ability to highlight community differences in low power settings. In addition, the overall running times of alignment-based and alignment-free phylogenetic inference are comparable. Taken together, our empirical results suggest that alignment-free methods provide a viable approach for microbiome-wide phylogenetic inference. PMID:29136663
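Illustrative note (not part of the record): the general idea behind alignment-free comparison is to summarize each sequence by features that need no alignment and then compare the summaries. The Python sketch below uses k-mer frequency profiles and a cosine distance. It is a generic stand-in chosen for brevity, not an implementation of ACS, CVTree, or Kr, and the toy sequences and function names are made up for the example.

```python
from collections import Counter
from math import sqrt


def kmer_profile(seq, k=3):
    """Relative frequency of every overlapping k-mer in a sequence."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


def cosine_distance(p, q):
    """1 minus the cosine similarity of two sparse k-mer profiles."""
    dot = sum(value * q.get(kmer, 0.0) for kmer, value in p.items())
    norm_p = sqrt(sum(v * v for v in p.values()))
    norm_q = sqrt(sum(v * v for v in q.values()))
    return 1.0 - dot / (norm_p * norm_q)


def distance_matrix(seqs, k=3):
    """Pairwise distances between named sequences; no alignment is computed."""
    names = sorted(seqs)
    profiles = {name: kmer_profile(seqs[name], k) for name in names}
    matrix = [[cosine_distance(profiles[a], profiles[b]) for b in names]
              for a in names]
    return names, matrix


# Hypothetical toy sequences; real 16S rRNA reads would be far longer.
seqs = {
    "s1": "ACGTACGTACGT",
    "s2": "ACGTACGTACGA",
    "s3": "GGGGCCCCGGGG",
}
names, matrix = distance_matrix(seqs)
for name, row in zip(names, matrix):
    print(name, [round(d, 3) for d in row])
```

The resulting matrix can be passed to any distance-based tree builder; the choice of k and of the distance function are tuning decisions, not values taken from the study.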
Multivariate Phylogenetic Comparative Methods: Evaluations, Comparisons, and Recommendations.

PubMed

Adams, Dean C; Collyer, Michael L

2018-01-01

Recent years have seen increased interest in phylogenetic comparative analyses of multivariate data sets, but to date the varied proposed approaches have not been extensively examined. Here we review the mathematical properties required of any multivariate method, and specifically evaluate existing multivariate phylogenetic comparative methods in this context. Phylogenetic comparative methods based on the full multivariate likelihood are robust to levels of covariation among trait dimensions and are insensitive to the orientation of the data set, but display increasing model misspecification as the number of trait dimensions increases. This is because the expected evolutionary covariance matrix (V) used in the likelihood calculations becomes more ill-conditioned as trait dimensionality increases, and as evolutionary models become more complex. Thus, these approaches are only appropriate for data sets with few traits and many species. Methods that summarize patterns across trait dimensions treated separately (e.g., SURFACE) incorrectly assume independence among trait dimensions, resulting in nearly a 100% model misspecification rate. Methods using pairwise composite likelihood are highly sensitive to levels of trait covariation, the orientation of the data set, and the number of trait dimensions. The consequences of these debilitating deficiencies are that a user can arrive at differing statistical conclusions, and therefore biological inferences, simply from a dataspace rotation, like principal component analysis. By contrast, algebraic generalizations of the standard phylogenetic comparative toolkit that use the trace of covariance matrices are insensitive to levels of trait covariation, the number of trait dimensions, and the orientation of the data set. Further, when appropriate permutation tests are used, these approaches display acceptable Type I error and statistical power. 
We conclude that methods summarizing information across trait dimensions, as well as pairwise composite likelihood methods, should be avoided, whereas algebraic generalizations of the phylogenetic comparative toolkit provide a useful means of assessing macroevolutionary patterns in multivariate data. Finally, we discuss areas in which multivariate phylogenetic comparative methods are still in need of future development; namely highly multivariate Ornstein-Uhlenbeck models and approaches for multivariate evolutionary model comparisons. © The Author(s) 2017. Published by Oxford University Press on behalf of the Society of Systematic Biologists.
Construction of phylogenetic trees by kernel-based comparative analysis of metabolic networks.

PubMed

Oh, S June; Joung, Je-Gun; Chang, Jeong-Ho; Zhang, Byoung-Tak

2006-06-06

To infer the tree of life requires knowledge of the common characteristics of each species descended from a common ancestor as the measuring criteria and a method to calculate the distance between the resulting values of each measure. Conventional phylogenetic analysis based on genomic sequences provides information about the genetic relationships between different organisms. In contrast, comparative analysis of metabolic pathways in different organisms can yield insights into their functional relationships under different physiological conditions. However, evaluating the similarities or differences between metabolic networks is a computationally challenging problem, and systematic methods of doing this are desirable. Here we introduce a graph-kernel method for computing the similarity between metabolic networks in polynomial time, and use it to profile metabolic pathways and to construct phylogenetic trees. To compare the structures of metabolic networks in organisms, we adopted the exponential graph kernel, which is a kernel-based approach with a labeled graph that includes a label matrix and an adjacency matrix. To construct the phylogenetic trees, we used an unweighted pair-group method with arithmetic mean, i.e., a hierarchical clustering algorithm. We applied the kernel-based network profiling method in a comparative analysis of nine carbohydrate metabolic networks from 81 biological species encompassing Archaea, Eukaryota, and Eubacteria. The resulting phylogenetic hierarchies generally support the tripartite scheme of three domains rather than the two domains of prokaryotes and eukaryotes. By combining the kernel machines with metabolic information, the method infers the context of biosphere development that covers physiological events required for adaptation by genetic reconstruction. 
The results show that one may obtain a global view of the tree of life by comparing the metabolic pathway structures using meta-level information rather than sequence information. This method may yield further information about biological evolution, such as the history of horizontal transfer of each gene, by studying the detailed structure of the phylogenetic tree constructed by the kernel-based method.
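Illustrative note (not part of the record): the tree-building step named in the abstract, the unweighted pair-group method with arithmetic mean (UPGMA), can be sketched in plain Python. The sketch assumes a symmetric distance matrix is already available; the graph-kernel similarity that the authors compute is not reproduced here, and the four-taxon matrix is invented for the example. Branch lengths are omitted and only the topology is returned.

```python
def upgma(names, matrix):
    """Build a UPGMA topology (nested tuples) from a symmetric distance matrix."""
    # Active clusters: id -> (subtree, number of leaves in the subtree).
    clusters = {i: (name, 1) for i, name in enumerate(names)}
    # Pairwise distances keyed by (smaller id, larger id).
    dist = {(i, j): matrix[i][j]
            for i in range(len(names)) for j in range(i + 1, len(names))}
    next_id = len(names)
    while len(clusters) > 1:
        i, j = min(dist, key=dist.get)  # closest pair of active clusters
        (tree_i, n_i), (tree_j, n_j) = clusters.pop(i), clusters.pop(j)
        new_dist = {}
        for k in clusters:
            d_ik = dist[(min(i, k), max(i, k))]
            d_jk = dist[(min(j, k), max(j, k))]
            # Size-weighted mean equals the average over all leaf pairs.
            new_dist[(k, next_id)] = (n_i * d_ik + n_j * d_jk) / (n_i + n_j)
        # Drop distances that involve the two merged clusters.
        dist = {pair: d for pair, d in dist.items()
                if i not in pair and j not in pair}
        dist.update(new_dist)
        clusters[next_id] = ((tree_i, tree_j), n_i + n_j)
        next_id += 1
    (tree, _), = clusters.values()
    return tree


# Hypothetical distances between four organisms.
names = ["A", "B", "C", "D"]
matrix = [
    [0, 2, 6, 10],
    [2, 0, 6, 10],
    [6, 6, 0, 10],
    [10, 10, 10, 0],
]
print(upgma(names, matrix))  # ('D', ('C', ('A', 'B')))
```

UPGMA assumes roughly clock-like divergence, which is one reason it is usually described as a hierarchical clustering algorithm rather than a general-purpose phylogenetic method.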
Phylogenetic inference under varying proportions of indel-induced alignment gaps

PubMed Central

Dwivedi, Bhakti; Gadagkar, Sudhindra R

2009-01-01

Background: The effect of alignment gaps on phylogenetic accuracy has been the subject of numerous studies. In this study, we investigated the relationship between the total number of gapped sites and phylogenetic accuracy, when the gaps were introduced (by means of computer simulation) to reflect indel (insertion/deletion) events during the evolution of DNA sequences. The resulting (true) alignments were subjected to commonly used gap treatment and phylogenetic inference methods. Results: (1) In general, there was a strong – almost deterministic – relationship between the amount of gap in the data and the level of phylogenetic accuracy when the alignments were very "gappy", (2) gaps resulting from deletions (as opposed to insertions) contributed more to the inaccuracy of phylogenetic inference, (3) the probabilistic methods (Bayesian, PhyML & "MLε," a method implemented in DNAML in PHYLIP) performed better at most levels of gap percentage when compared to parsimony (MP) and distance (NJ) methods, with Bayesian analysis being clearly the best, (4) methods that treat gapped sites as missing data yielded less accurate trees when compared to those that attribute phylogenetic signal to the gapped sites (by coding them as binary character data – presence/absence, or as in the MLε method), and (5) in general, the accuracy of phylogenetic inference depended upon the amount of available data when the gaps resulted from mainly deletion events, and the amount of missing data when insertion events were equally likely to have caused the alignment gaps.
Conclusion: When gaps in an alignment are a consequence of indel events in the evolution of the sequences, the accuracy of phylogenetic analysis is likely to improve if: (1) alignment gaps are categorized as arising from insertion events or deletion events and then treated separately in the analysis, (2) the evolutionary signal provided by indels is harnessed in the phylogenetic analysis, and (3) methods that utilize the phylogenetic signal in indels are developed for distance methods too. When the true homology is known and the amount of gaps is 20 percent of the alignment length or less, the methods used in this study are likely to yield trees with 90–100 percent accuracy. PMID:19698168
How does cognition evolve? Phylogenetic comparative psychology

PubMed Central

MacLean, Evan L.; Matthews, Luke J.; Hare, Brian A.; Nunn, Charles L.; Anderson, Rindy C.; Aureli, Filippo; Brannon, Elizabeth M.; Call, Josep; Drea, Christine M.; Emery, Nathan J.; Haun, Daniel B. M.; Herrmann, Esther; Jacobs, Lucia F.; Platt, Michael L.; Rosati, Alexandra G.; Sandel, Aaron A.; Schroepfer, Kara K.; Seed, Amanda M.; Tan, Jingzhi; van Schaik, Carel P.; Wobber, Victoria

2014-01-01

Now more than ever animal studies have the potential to test hypotheses regarding how cognition evolves. Comparative psychologists have developed new techniques to probe the cognitive mechanisms underlying animal behavior, and they have become increasingly skillful at adapting methodologies to test multiple species. Meanwhile, evolutionary biologists have generated quantitative approaches to investigate the phylogenetic distribution and function of phenotypic traits, including cognition. In particular, phylogenetic methods can quantitatively (1) test whether specific cognitive abilities are correlated with life history (e.g., lifespan), morphology (e.g., brain size), or socio-ecological variables (e.g., social system), (2) measure how strongly phylogenetic relatedness predicts the distribution of cognitive skills across species, and (3) estimate the ancestral state of a given cognitive trait using measures of cognitive performance from extant species. Phylogenetic methods can also be used to guide the selection of species comparisons that offer the strongest tests of a priori predictions of cognitive evolutionary hypotheses (i.e., phylogenetic targeting). Here, we explain how an integration of comparative psychology and evolutionary biology will answer a host of questions regarding the phylogenetic distribution and history of cognitive traits, as well as the evolutionary processes that drove their evolution. PMID:21927850
How does cognition evolve? Phylogenetic comparative psychology.

PubMed

MacLean, Evan L; Matthews, Luke J; Hare, Brian A; Nunn, Charles L; Anderson, Rindy C; Aureli, Filippo; Brannon, Elizabeth M; Call, Josep; Drea, Christine M; Emery, Nathan J; Haun, Daniel B M; Herrmann, Esther; Jacobs, Lucia F; Platt, Michael L; Rosati, Alexandra G; Sandel, Aaron A; Schroepfer, Kara K; Seed, Amanda M; Tan, Jingzhi; van Schaik, Carel P; Wobber, Victoria

2012-03-01

Now more than ever animal studies have the potential to test hypotheses regarding how cognition evolves. Comparative psychologists have developed new techniques to probe the cognitive mechanisms underlying animal behavior, and they have become increasingly skillful at adapting methodologies to test multiple species. Meanwhile, evolutionary biologists have generated quantitative approaches to investigate the phylogenetic distribution and function of phenotypic traits, including cognition. In particular, phylogenetic methods can quantitatively (1) test whether specific cognitive abilities are correlated with life history (e.g., lifespan), morphology (e.g., brain size), or socio-ecological variables (e.g., social system), (2) measure how strongly phylogenetic relatedness predicts the distribution of cognitive skills across species, and (3) estimate the ancestral state of a given cognitive trait using measures of cognitive performance from extant species. Phylogenetic methods can also be used to guide the selection of species comparisons that offer the strongest tests of a priori predictions of cognitive evolutionary hypotheses (i.e., phylogenetic targeting). Here, we explain how an integration of comparative psychology and evolutionary biology will answer a host of questions regarding the phylogenetic distribution and history of cognitive traits, as well as the evolutionary processes that drove their evolution.
Mapping Phylogenetic Trees to Reveal Distinct Patterns of Evolution

PubMed Central

Kendall, Michelle; Colijn, Caroline

2016-01-01

Evolutionary relationships are frequently described by phylogenetic trees, but a central barrier in many fields is the difficulty of interpreting data containing conflicting phylogenetic signals. We present a metric-based method for comparing trees which extracts distinct alternative evolutionary relationships embedded in data. We demonstrate detection and resolution of phylogenetic uncertainty in a recent study of anole lizards, leading to alternate hypotheses about their evolutionary relationships. We use our approach to compare trees derived from different genes of Ebolavirus and find that the VP30 gene has a distinct phylogenetic signature composed of three alternatives that differ in the deep branching structure. Key words: phylogenetics, evolution, tree metrics, genetics, sequencing. PMID:27343287
Genomic Repeat Abundances Contain Phylogenetic Signal

PubMed Central

Dodsworth, Steven; Chase, Mark W.; Kelly, Laura J.; Leitch, Ilia J.; Macas, Jiří; Novák, Petr; Piednoël, Mathieu; Weiss-Schneeweiss, Hanna; Leitch, Andrew R.

2015-01-01

A large proportion of genomic information, particularly repetitive elements, is usually ignored when researchers are using next-generation sequencing. Here we demonstrate the usefulness of this repetitive fraction in phylogenetic analyses, utilizing comparative graph-based clustering of next-generation sequence reads, which results in abundance estimates of different classes of genomic repeats. Phylogenetic trees are then inferred based on the genome-wide abundance of different repeat types treated as continuously varying characters; such repeats are scattered across chromosomes and in angiosperms can constitute a majority of nuclear genomic DNA. In six diverse examples, five angiosperms and one insect, this method provides generally well-supported relationships at interspecific and intergeneric levels that agree with results from more standard phylogenetic analyses of commonly used markers. We propose that this methodology may prove especially useful in groups where there is little genetic differentiation in standard phylogenetic markers. At the same time as providing data for phylogenetic inference, this method additionally yields a wealth of data for comparative studies of genome evolution. PMID:25261464
A fully resolved consensus between fully resolved phylogenetic trees.

PubMed

Quitzau, José Augusto Amgarten; Meidanis, João

2006-03-31

Nowadays, there are many phylogeny reconstruction methods, each with advantages and disadvantages. We explored the advantages of each method, putting together the common parts of trees constructed by several methods, by means of a consensus computation. A number of phylogenetic consensus methods are already known. Unfortunately, there is also a taboo concerning consensus methods, because most biologists see them mainly as comparators and not as phylogenetic tree constructors. We challenged this taboo by defining a consensus method that builds a fully resolved phylogenetic tree based on the most common parts of fully resolved trees in a given collection. We also generated results showing that this consensus is in a way a kind of "median" of the input trees; as such it can be closer to the correct tree in many situations.
Bayesian models for comparative analysis integrating phylogenetic uncertainty

PubMed Central

2012-01-01

Background: Uncertainty in comparative analyses can come from at least two sources: a) phylogenetic uncertainty in the tree topology or branch lengths, and b) uncertainty due to intraspecific variation in trait values, either due to measurement error or natural individual variation. Most phylogenetic comparative methods do not account for such uncertainties. Not accounting for these sources of uncertainty leads to false perceptions of precision (confidence intervals will be too narrow) and inflated significance in hypothesis testing (e.g. p-values will be too small). Although there is some application-specific software for fitting Bayesian models accounting for phylogenetic error, more general and flexible software is desirable. Methods: We developed models to directly incorporate phylogenetic uncertainty into a range of analyses that biologists commonly perform, using a Bayesian framework and Markov Chain Monte Carlo analyses. Results: We demonstrate applications in linear regression, quantification of phylogenetic signal, and measurement error models. Phylogenetic uncertainty was incorporated by applying a prior distribution for the phylogeny, where this distribution consisted of the posterior tree sets from Bayesian phylogenetic tree estimation programs. The models were analysed using simulated data sets, and applied to a real data set on plant traits, from rainforest plant species in Northern Australia. Analyses were performed using the free and open source software OpenBUGS and JAGS. Conclusions: Incorporating phylogenetic uncertainty through an empirical prior distribution of trees leads to more precise estimation of regression model parameters than using a single consensus tree and enables a more realistic estimation of confidence intervals. In addition, models incorporating measurement errors and/or individual variation, in one or both variables, are easily formulated in the Bayesian framework. 
We show that BUGS is a useful, flexible general purpose tool for phylogenetic comparative analyses, particularly for modelling in the face of phylogenetic uncertainty and accounting for measurement error or individual variation in explanatory variables. Code for all models is provided in the BUGS model description language. PMID:22741602
Autoregressive models for estimating phylogenetic and environmental effects: accounting for within-species variations.

PubMed

Cornillon, P A; Pontier, D; Rochet, M J

2000-02-21

Comparative methods are used to investigate the attributes of present species or higher taxa. Difficulties arise from the phylogenetic heritage: taxa are not independent and neglecting phylogenetic inertia can lead to inaccurate results. Within-species variations in life-history traits are also not negligible, but most comparative methods are not designed to take them into account. Taxa are generally described by a single value for each trait. We have developed a new model which permits the incorporation of both the phylogenetic relationships among populations and within-species variations. This is an extension of classical autoregressive models. This family of models was used to study the effect of fishing on six demographic traits measured on 77 populations of teleost fishes. Copyright 2000 Academic Press.
Mapping Phylogenetic Trees to Reveal Distinct Patterns of Evolution.

PubMed

Kendall, Michelle; Colijn, Caroline

2016-10-01

Evolutionary relationships are frequently described by phylogenetic trees, but a central barrier in many fields is the difficulty of interpreting data containing conflicting phylogenetic signals. We present a metric-based method for comparing trees which extracts distinct alternative evolutionary relationships embedded in data. We demonstrate detection and resolution of phylogenetic uncertainty in a recent study of anole lizards, leading to alternate hypotheses about their evolutionary relationships. We use our approach to compare trees derived from different genes of Ebolavirus and find that the VP30 gene has a distinct phylogenetic signature composed of three alternatives that differ in the deep branching structure. Key words: phylogenetics, evolution, tree metrics, genetics, sequencing. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Comparative Analysis of Begonia Plastid Genomes and Their Utility for Species-Level Phylogenetics

PubMed Central

Harrison, Nicola; Harrison, Richard J.

2016-01-01

Recent, rapid radiations make species-level phylogenetics difficult to resolve. We used a multiplexed, high-throughput sequencing approach to identify informative genomic regions to resolve phylogenetic relationships at low taxonomic levels in Begonia from a survey of sixteen species. A long-range PCR method was used to generate draft plastid genomes to provide a strong phylogenetic backbone, identify fast evolving regions and provide informative molecular markers for species-level phylogenetic studies in Begonia. PMID:27058864
Bridging meta-analysis and the comparative method: a test of seed size effect on germination after frugivores' gut passage.

PubMed

Verdú, Miguel; Traveset, Anna

2004-02-01

Most studies using meta-analysis try to establish relationships between traits across taxa from interspecific databases and, thus, the phylogenetic relatedness among these taxa should be taken into account to avoid pseudoreplication derived from common ancestry. This paper illustrates, with a representative example of the relationship between seed size and the effect of frugivores' gut passage on seed germination, that meta-analytic procedures can also be phylogenetically corrected by means of the comparative method. The conclusions obtained from the meta-analytic and phylogenetic approaches are very different. The meta-analysis revealed that the positive effects of gut passage on seed germination increased with seed size in the case of gut passage through birds, whereas they decreased in the case of gut passage through non-flying mammals. However, once the phylogenetic relatedness among plant species was taken into account, the effects of gut passage on seed germination did not depend on seed size and were similar between birds and non-flying mammals. Some methodological considerations are given to improve the bridge between meta-analysis and the comparative method.
Phylogenetic comparative methods on phylogenetic networks with reticulations.

PubMed

Bastide, Paul; Solís-Lemus, Claudia; Kriebel, Ricardo; Sparks, K William; Ané, Cécile

2018-04-25

The goal of Phylogenetic Comparative Methods (PCMs) is to study the distribution of quantitative traits among related species. The observed traits are often seen as the result of a Brownian Motion (BM) along the branches of a phylogenetic tree. Reticulation events such as hybridization, gene flow or horizontal gene transfer, can substantially affect a species' traits, but are not modeled by a tree. Phylogenetic networks have been designed to represent reticulate evolution. As they become available for downstream analyses, new models of trait evolution are needed, applicable to networks. One natural extension of the BM is to use a weighted average model for the trait of a hybrid, at a reticulation point. We develop here an efficient recursive algorithm to compute the phylogenetic variance matrix of a trait on a network, in only one preorder traversal of the network. We then extend the standard PCM tools to this new framework, including phylogenetic regression with covariates (or phylogenetic ANOVA), ancestral trait reconstruction, and Pagel's λ test of phylogenetic signal. The trait of a hybrid is sometimes outside of the range of its two parents, for instance because of hybrid vigor or hybrid depression. These two phenomena are rather commonly observed in present-day hybrids. Transgressive evolution can be modeled as a shift in the trait value following a reticulation point. We develop a general framework to handle such shifts, and take advantage of the phylogenetic regression view of the problem to design statistical tests for ancestral transgressive evolution in the evolutionary history of a group of species. We study the power of these tests in several scenarios, and show that recent events have indeed the strongest impact on the trait distribution of present-day taxa. We apply those methods to a dataset of Xiphophorus fishes, to confirm and complete previous analysis in this group. All the methods developed here are available in the Julia package PhyloNetworks.
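Illustrative note (not part of the record): the weighted-average Brownian motion model described above implies a simple recursion for the trait covariance matrix when nodes are visited in preorder. The Python sketch below is a reading of that description under stated assumptions, not the PhyloNetworks implementation or its API: the evolutionary rate is fixed at 1, the root has variance 0, each node lists its parents as hypothetical `(parent, branch_length, gamma)` triples, and the inheritance weights gamma of a hybrid sum to 1.

```python
def network_variance(nodes):
    """Trait covariances under unit-rate Brownian motion on a network.

    `nodes` must be in preorder (every parent appears before its children).
    Each entry is (name, parents) with parents = [(parent, length, gamma), ...];
    the root has an empty parent list and a tree node has one parent with gamma 1.
    """
    V = {}
    visited = []
    for name, parents in nodes:
        # Covariance with each earlier node: gamma-weighted average over parents.
        for other in visited:
            cov = sum(g * V[(p, other)] for p, _, g in parents)
            V[(name, other)] = V[(other, name)] = cov
        # Variance: weighted parent variances plus the branch lengths ...
        var = sum(g * g * (V[(p, p)] + length) for p, length, g in parents)
        # ... plus the covariance between distinct parents (hybrid nodes only).
        var += sum(ga * gb * V[(pa, pb)]
                   for pa, _, ga in parents
                   for pb, _, gb in parents if pa != pb)
        V[(name, name)] = var
        visited.append(name)
    return V


# Hypothetical network: H is a hybrid of A and B with equal inheritance.
nodes = [
    ("R", []),
    ("A", [("R", 1.0, 1.0)]),
    ("B", [("R", 1.0, 1.0)]),
    ("H", [("A", 0.5, 0.5), ("B", 0.5, 0.5)]),
]
V = network_variance(nodes)
print(V[("H", "H")])  # 0.75
print(V[("H", "A")])  # 0.5
```

With a single parent and gamma equal to 1 the recursion reduces to the usual tree case, where the covariance of two tips is the length of their shared path from the root.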
A method of alignment masking for refining the phylogenetic signal of multiple sequence alignments.

PubMed

Rajan, Vaibhav

2013-03-01

Inaccurate inference of positional homologies in multiple sequence alignments and systematic errors introduced by alignment heuristics obfuscate phylogenetic inference. Alignment masking, the elimination of phylogenetically uninformative or misleading sites from an alignment before phylogenetic analysis, is a common practice in phylogenetic analysis. Although masking is often done manually, automated methods are necessary to handle the much larger data sets being prepared today. In this study, we introduce the concept of subsplits and demonstrate their use in extracting phylogenetic signal from alignments. We design a clustering approach for alignment masking where each cluster contains similar columns (similarity being defined on the basis of compatible subsplits); our approach then identifies noisy clusters and eliminates them. Trees inferred from the columns in the retained clusters are found to be topologically closer to the reference trees. We test our method on numerous standard benchmarks (both synthetic and biological data sets) and compare its performance with other methods of alignment masking. We find that our method can eliminate sites more accurately than other methods, particularly on divergent data, and can improve the topologies of the inferred trees in likelihood-based analyses. Software available upon request from the author.
The Independent Evolution Method Is Not a Viable Phylogenetic Comparative Method

PubMed Central

2015-01-01

Phylogenetic comparative methods (PCMs) use data on species traits and phylogenetic relationships to shed light on evolutionary questions. Recently, Smaers and Vinicius suggested a new PCM, Independent Evolution (IE), which purportedly employs a novel model of evolution based on Felsenstein’s Adaptive Peak Model. The authors found that IE improves upon previous PCMs by producing more accurate estimates of ancestral states, as well as separate estimates of evolutionary rates for each branch of a phylogenetic tree. Here, we document substantial theoretical and computational issues with IE. When data are simulated under a simple Brownian motion model of evolution, IE produces severely biased estimates of ancestral states and changes along individual branches. We show that these branch-specific changes are essentially ancestor-descendant or “directional” contrasts, and draw parallels between IE and previous PCMs such as “minimum evolution”. Additionally, while comparisons of branch-specific changes between variables have been interpreted as reflecting the relative strength of selection on those traits, we demonstrate through simulations that regressing IE estimated branch-specific changes against one another gives a biased estimate of the scaling relationship between these variables, and provides no advantages or insights beyond established PCMs such as phylogenetically independent contrasts. In light of our findings, we discuss the results of previous papers that employed IE. We conclude that Independent Evolution is not a viable PCM, and should not be used in comparative analyses. PMID:26683838
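Illustrative note (not part of the record): phylogenetically independent contrasts, named above as an established PCM, can be computed with a short recursion over a binary tree. The Python sketch below follows the standard contrast algorithm under a Brownian motion assumption. It is not the IE method and not code from the article; the tuple encoding of the tree and the trait values are invented for the example.

```python
from math import sqrt


def contrasts(node):
    """Standardized independent contrasts on a binary tree.

    A tip is (trait_value, branch_length); an internal node is
    (left_subtree, right_subtree, branch_length).
    Returns (node estimate, adjusted branch length, list of contrasts).
    """
    if len(node) == 2:  # tip: nothing to contrast yet
        value, length = node
        return value, length, []
    left, right, length = node
    x1, v1, c1 = contrasts(left)
    x2, v2, c2 = contrasts(right)
    # Difference between daughters, scaled by its expected standard deviation.
    contrast = (x1 - x2) / sqrt(v1 + v2)
    # Node estimate: daughter values weighted by inverse branch length.
    value = (x1 / v1 + x2 / v2) / (1 / v1 + 1 / v2)
    # Lengthen the branch below to reflect uncertainty in the node estimate.
    adjusted = length + v1 * v2 / (v1 + v2)
    return value, adjusted, c1 + c2 + [contrast]


# Hypothetical tree ((A:1, B:1):1, C:2) with trait values A=1, B=3, C=6.
tree = (((1.0, 1.0), (3.0, 1.0), 1.0), (6.0, 2.0), 0.0)
root_value, _, pics = contrasts(tree)
print(root_value)
print(pics)
```

A tree with n tips yields n - 1 contrasts; regressing the contrasts of one trait on those of another through the origin is the usual way they are used to estimate a scaling relationship.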
An integrative view of phylogenetic comparative methods: connections to population genetics, community ecology, and paleobiology.

PubMed

Pennell, Matthew W; Harmon, Luke J

2013-06-01

Recent innovations in phylogenetic comparative methods (PCMs) have spurred a renaissance of research into the causes and consequences of large-scale patterns of biodiversity. In this paper, we review these advances. We also highlight the potential of comparative methods to integrate across fields and focus on three examples where such integration might be particularly valuable: quantitative genetics, community ecology, and paleobiology. We argue that PCMs will continue to be a key set of tools in evolutionary biology, shedding new light on how evolutionary processes have shaped patterns of biodiversity through deep time. © 2013 New York Academy of Sciences.

The comparative ecology and biogeography of parasites

PubMed Central

Poulin, Robert; Krasnov, Boris R.; Mouillot, David; Thieltges, David W.

2011-01-01

Comparative ecology uses interspecific relationships among traits, while accounting for the phylogenetic non-independence of species, to uncover general evolutionary processes. Applied to biogeographic questions, it can be a powerful tool to explain the spatial distribution of organisms. Here, we review how comparative methods can elucidate biogeographic patterns and processes, using analyses of distributional data on parasites (fleas and helminths) as case studies. Methods exist to detect phylogenetic signals, i.e. the degree of phylogenetic dependence of a given character, and either to control for these signals in statistical analyses of interspecific data, or to measure their contribution to variance. Parasite–host interactions present a special case, as a given trait may be a parasite trait, a host trait or a property of the coevolved association rather than of one participant only. For some analyses, it is therefore necessary to correct simultaneously for both parasite phylogeny and host phylogeny, or to evaluate which has the greatest influence on trait expression. Using comparative approaches, we show that two fundamental properties of parasites, their niche breadth, i.e. host specificity, and the nature of their life cycle, can explain interspecific and latitudinal variation in the sizes of their geographical ranges, or rates of distance decay in the similarity of parasite communities. These findings illustrate the ways in which phylogenetically based comparative methods can contribute to biogeographic research. PMID:21768153
Undergraduate Students’ Difficulties in Reading and Constructing Phylogenetic Tree

NASA Astrophysics Data System (ADS)

Sa'adah, S.; Tapilouw, F. S.; Hidayat, T.

2017-02-01

Representation is an important tool for communicating scientific concepts. Biologists produce phylogenetic representations to express their understanding of evolutionary relationships. A phylogenetic tree is a visual representation that depicts a hypothesis about evolutionary relationships, and it is widely used in the biological sciences. The use of phylogenetic trees is currently growing in many disciplines of biology. Consequently, learning about phylogenetic trees has become an important part of biological education and an interesting area for biology education research. However, research has shown that many students struggle to interpret the information that phylogenetic trees depict. The purpose of this study was to investigate undergraduate students’ difficulties in reading and constructing a phylogenetic tree. The study used a descriptive method, with data collected through questionnaires, interviews, multiple-choice and open-ended questions, reflective journals, and observations. The findings showed that students experienced difficulties, especially in constructing a phylogenetic tree. The students’ responses indicated that the main difficulty in constructing a phylogenetic tree was placing taxa in the tree based on the data provided, so that the constructed tree did not describe the actual evolutionary relationships (incorrect relatedness). Students also had difficulty determining the sister group, synapomorphies, and autapomorphies from the data provided (a character table), and comparing phylogenetic trees. According to the students, building a phylogenetic tree is more difficult than reading one. The findings of this study provide information to help undergraduate instructors and students overcome learning difficulties in reading and constructing phylogenetic trees.
A Penalized Likelihood Framework For High-Dimensional Phylogenetic Comparative Methods And An Application To New-World Monkeys Brain Evolution.

PubMed

Clavel, Julien; Aristide, Leandro; Morlon, Hélène

2018-06-19

Working with high-dimensional phylogenetic comparative datasets is challenging because likelihood-based multivariate methods suffer from low statistical performances as the number of traits p approaches the number of species n and because some computational complications occur when p exceeds n. Alternative phylogenetic comparative methods have recently been proposed to deal with the large p small n scenario but their use and performances are limited. Here we develop a penalized likelihood framework to deal with high-dimensional comparative datasets. We propose various penalizations and methods for selecting the intensity of the penalties. We apply this general framework to the estimation of parameters (the evolutionary trait covariance matrix and parameters of the evolutionary model) and model comparison for the high-dimensional multivariate Brownian (BM), Early-burst (EB), Ornstein-Uhlenbeck (OU) and Pagel's lambda models. We show using simulations that our penalized likelihood approach dramatically improves the estimation of evolutionary trait covariance matrices and model parameters when p approaches n, and allows for their accurate estimation when p equals or exceeds n. In addition, we show that penalized likelihood models can be efficiently compared using Generalized Information Criterion (GIC). We implement these methods, as well as the related estimation of ancestral states and the computation of phylogenetic PCA in the R package RPANDA and mvMORPH. Finally, we illustrate the utility of the new proposed framework by evaluating evolutionary models fit, analyzing integration patterns, and reconstructing evolutionary trajectories for a high-dimensional 3-D dataset of brain shape in the New World monkeys. We find a clear support for an Early-burst model suggesting an early diversification of brain morphology during the ecological radiation of the clade. Penalized likelihood offers an efficient way to deal with high-dimensional multivariate comparative data.
Macroevolutionary developmental biology: Embryos, fossils, and phylogenies.

PubMed

Organ, Chris L; Cooper, Lisa Noelle; Hieronymus, Tobin L

2015-10-01

The field of evolutionary developmental biology is broadly focused on identifying the genetic and developmental mechanisms underlying morphological diversity. Connecting the genotype with the phenotype means that evo-devo research often considers a wide range of evidence, from genetics and morphology to fossils. In this commentary, we provide an overview and framework for integrating fossil ontogenetic data with developmental data using phylogenetic comparative methods to test macroevolutionary hypotheses. We survey the vertebrate fossil record of preserved embryos and discuss how phylogenetic comparative methods can integrate data from developmental genetics and paleontology. Fossil embryos provide limited, yet critical, developmental data from deep time. They help constrain when developmental innovations first appeared during the history of life and also reveal the order in which related morphologies evolved. Phylogenetic comparative methods provide a powerful statistical approach that allows evo-devo researchers to infer the presence of nonpreserved developmental traits in fossil species and to detect discordant evolutionary patterns and processes across levels of biological organization. © 2015 Wiley Periodicals, Inc.
Comparing Mycobacterium tuberculosis genomes using genome topology networks.

PubMed

Jiang, Jianping; Gu, Jianlei; Zhang, Liang; Zhang, Chenyi; Deng, Xiao; Dou, Tonghai; Zhao, Guoping; Zhou, Yan

2015-02-14

Over the last decade, emerging research methods, such as comparative genomic analysis and phylogenetic study, have yielded new insights into genotypes and phenotypes of closely related bacterial strains. Several findings have revealed that genomic structural variations (SVs), including gene gain/loss, gene duplication and genome rearrangement, can lead to different phenotypes among strains, and an investigation of genes affected by SVs may extend our knowledge of the relationships between SVs and phenotypes in microbes, especially in pathogenic bacteria. In this work, we introduce a 'Genome Topology Network' (GTN) method based on gene homology and gene locations to analyze genomic SVs and perform phylogenetic analysis. Furthermore, the concept of 'unfixed ortholog' has been proposed, whose members are affected by SVs in genome topology among close species. To improve the precision of 'unfixed ortholog' recognition, a strategy to detect annotation differences and complete gene annotation was applied. To assess the GTN method, a set of thirteen complete M. tuberculosis genomes was analyzed as a case study. GTNs with two different gene homology-assigning methods were built, the Clusters of Orthologous Groups (COG) method and the orthoMCL clustering method, and two phylogenetic trees were constructed accordingly, which may provide additional insights into whole genome-based phylogenetic analysis. We obtained 24 unfixable COG groups, of which most members were related to immunogenicity and drug resistance, such as PPE-repeat proteins (COG5651) and transcriptional regulator TetR gene family members (COG1309). The GTN method has been implemented in PERL and released on our website. The tool can be downloaded from http://homepage.fudan.edu.cn/zhouyan/gtn/, and allows re-annotating the 'lost' genes among closely related genomes, analyzing genes affected by SVs, and performing phylogenetic analysis.
With this tool, many immunogenic-related and drug resistance-related genes were found to be affected by SVs in M. tuberculosis genomes. We believe that the GTN method will be suitable for the exploration of genomic SVs in connection with biological features of bacterial strains, and that GTN-based phylogenetic analysis will provide additional insights into whole genome-based phylogenetic analysis.
Tetrapods on the EDGE: Overcoming data limitations to identify phylogenetic conservation priorities

PubMed Central

Gray, Claudia L.; Wearn, Oliver R.; Owen, Nisha R.

2018-01-01

The scale of the ongoing biodiversity crisis requires both effective conservation prioritisation and urgent action. As extinction is non-random across the tree of life, it is important to prioritise threatened species which represent large amounts of evolutionary history. The EDGE metric prioritises species based on their Evolutionary Distinctiveness (ED), which measures the relative contribution of a species to the total evolutionary history of their taxonomic group, and Global Endangerment (GE), or extinction risk. EDGE prioritisations rely on adequate phylogenetic and extinction risk data to generate meaningful priorities for conservation. However, comprehensive phylogenetic trees of large taxonomic groups are extremely rare and, even when available, become quickly out-of-date due to the rapid rate of species descriptions and taxonomic revisions. Thus, it is important that conservationists can use the available data to incorporate evolutionary history into conservation prioritisation. We compared published and new methods to estimate missing ED scores for species absent from a phylogenetic tree whilst simultaneously correcting the ED scores of their close taxonomic relatives. We found that following artificial removal of species from a phylogenetic tree, the new method provided the closest estimates of their “true” ED score, differing from the true ED score by an average of less than 1%, compared to the 31% and 38% difference of the previous methods. The previous methods also substantially under- and over-estimated scores as more species were artificially removed from a phylogenetic tree. We therefore used the new method to estimate ED scores for all tetrapods. From these scores we updated EDGE prioritisation rankings for all tetrapod species with IUCN Red List assessments, including the first EDGE prioritisation for reptiles. 
Further, we identified criteria to identify robust priority species in an effort to further inform conservation action whilst limiting uncertainty and anticipating future phylogenetic advances. PMID:29641585
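The abstract describes how ED and GE combine into an EDGE priority but does not give the formula. The sketch below assumes the commonly used formulation, EDGE = ln(1 + ED) + GE × ln(2), with ED computed by the "fair proportion" rule (each branch length is shared equally among the tips descending from it) and GE coded from 0 (Least Concern) to 4 (Critically Endangered). The toy tree and the Red List statuses are hypothetical. The sketch does not implement the paper's method for estimating missing ED scores.

```python
import math

# Hypothetical ultrametric tree (((A:10,B:10):5,C:15):2,D:17). Each tip maps to
# the branches on its path to the root, stored as
# (branch length in Myr, number of tips descending from that branch).
PATHS = {
    "A": [(10.0, 1), (5.0, 2), (2.0, 3)],
    "B": [(10.0, 1), (5.0, 2), (2.0, 3)],
    "C": [(15.0, 1), (2.0, 3)],
    "D": [(17.0, 1)],
}

# Assumed coding of Red List categories as GE weights.
GE_WEIGHT = {"LC": 0, "NT": 1, "VU": 2, "EN": 3, "CR": 4}
STATUS = {"A": "LC", "B": "EN", "C": "VU", "D": "CR"}  # made-up statuses

def evolutionary_distinctiveness(path):
    """Fair-proportion ED: share each branch equally among its descendant tips."""
    return sum(length / n_tips for length, n_tips in path)

def edge_score(ed, ge):
    """Assumed EDGE formulation: ln(1 + ED) + GE * ln(2)."""
    return math.log(1.0 + ed) + ge * math.log(2.0)

ed = {tip: evolutionary_distinctiveness(path) for tip, path in PATHS.items()}
edge = {tip: edge_score(ed[tip], GE_WEIGHT[STATUS[tip]]) for tip in ed}
ranking = sorted(edge, key=edge.get, reverse=True)
print(ranking)
```

Under the fair-proportion rule the ED scores of all tips sum to the total branch length of the tree, which gives a quick sanity check on any ED calculation.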
DendroBLAST: approximate phylogenetic trees in the absence of multiple sequence alignments.

PubMed

Kelly, Steven; Maini, Philip K

2013-01-01

The rapidly growing availability of genome information has created considerable demand for both fast and accurate phylogenetic inference algorithms. We present a novel method called DendroBLAST for reconstructing phylogenetic dendrograms/trees from protein sequences using BLAST. This method differs from other methods by incorporating a simple model of sequence evolution to test the effect of introducing sequence changes on the reliability of the bipartitions in the inferred tree. Using realistic simulated sequence data we demonstrate that this method produces phylogenetic trees that are more accurate than other commonly-used distance based methods though not as accurate as maximum likelihood methods from good quality multiple sequence alignments. In addition to tests on simulated data, we use DendroBLAST to generate input trees for a supertree reconstruction of the phylogeny of the Archaea. This independent analysis produces an approximate phylogeny of the Archaea that has both high precision and recall when compared to previously published analysis of the same dataset using conventional methods. Taken together these results demonstrate that approximate phylogenetic trees can be produced in the absence of multiple sequence alignments, and we propose that these trees will provide a platform for improving and informing downstream bioinformatic analysis. A web implementation of the DendroBLAST method is freely available for use at http://www.dendroblast.com/.
The space of ultrametric phylogenetic trees.

PubMed

Gavryushkin, Alex; Drummond, Alexei J

2016-08-21

The reliability of a phylogenetic inference method from genomic sequence data is ensured by its statistical consistency. Bayesian inference methods produce a sample of phylogenetic trees from the posterior distribution given sequence data. Hence the question of statistical consistency of such methods is equivalent to the consistency of the summary of the sample. More generally, statistical consistency is ensured by the tree space used to analyse the sample. In this paper, we consider two standard parameterisations of phylogenetic time-trees used in evolutionary models: inter-coalescent interval lengths and absolute times of divergence events. For each of these parameterisations we introduce a natural metric space on ultrametric phylogenetic trees. We compare the introduced spaces with existing models of tree space and formulate several formal requirements that a metric space on phylogenetic trees must possess in order to be a satisfactory space for statistical analysis, and justify them. We show that only a few known constructions of the space of phylogenetic trees satisfy these requirements. However, our results suggest that these basic requirements are not enough to distinguish between the two metric spaces we introduce and that the choice between metric spaces requires additional properties to be considered. In particular, the summary tree minimising the square distance to the trees from the sample might be different for different parameterisations. This suggests that further fundamental insight is needed into the problem of statistical consistency of phylogenetic inference methods. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.
Distance-Based Phylogenetic Methods Around a Polytomy.

PubMed

Davidson, Ruth; Sullivant, Seth

2014-01-01

Distance-based phylogenetic algorithms attempt to solve the NP-hard least-squares phylogeny problem by mapping an arbitrary dissimilarity map representing biological data to a tree metric. The set of all dissimilarity maps is a Euclidean space properly containing the space of all tree metrics as a polyhedral fan. Outputs of distance-based tree reconstruction algorithms such as UPGMA and neighbor-joining are points in the maximal cones in the fan. Tree metrics with polytomies lie at the intersections of maximal cones. A phylogenetic algorithm divides the space of all dissimilarity maps into regions based upon which combinatorial tree is reconstructed by the algorithm. Comparison of phylogenetic methods can be done by comparing the geometry of these regions. We use polyhedral geometry to compare the local nature of the subdivisions induced by least-squares phylogeny, UPGMA, and neighbor-joining when the true tree has a single polytomy with exactly four neighbors. Our results suggest that in some circumstances, UPGMA and neighbor-joining poorly match least-squares phylogeny.
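The distinction between an arbitrary dissimilarity map and a tree metric can be made concrete with the classical four-point condition: for every four taxa, the two largest of the three pairwise sums d(i,j)+d(k,l), d(i,k)+d(j,l), d(i,l)+d(j,k) must be equal. The sketch below is a brute-force check on toy matrices, assuming symmetric input with a zero diagonal. It is an illustration of the setting, not the polyhedral analysis used in the paper, and the matrices are hypothetical.

```python
from itertools import product

def quartet_sums(d, i, j, k, l):
    """The three pairwise sums of the four-point condition, sorted ascending."""
    return sorted([d[i][j] + d[k][l], d[i][k] + d[j][l], d[i][l] + d[j][k]])

def is_tree_metric(d, tol=1e-9):
    """Four-point condition over all index quadruples (repeats allowed, which
    also enforces the triangle inequality). O(n^4): toy inputs only."""
    n = len(d)
    for i, j, k, l in product(range(n), repeat=4):
        s = quartet_sums(d, i, j, k, l)
        if s[2] - s[1] > tol:
            return False
    return True

def is_unresolved_quartet(d, i, j, k, l, tol=1e-9):
    """All three sums tie: the quartet sits on a polytomy (star) in the tree."""
    s = quartet_sums(d, i, j, k, l)
    return s[2] - s[0] <= tol

# Star tree: four leaves with pendant edges 1, 2, 3, 4 joined at one node.
STAR = [[0, 3, 4, 5], [3, 0, 5, 6], [4, 5, 0, 7], [5, 6, 7, 0]]
# Same pendant edges, resolved as ((A,B),(C,D)) with an internal edge of length 1.
RESOLVED = [[0, 3, 5, 6], [3, 0, 6, 7], [5, 6, 0, 7], [6, 7, 7, 0]]
# RESOLVED with d(A,C) perturbed by noise: no longer a tree metric.
PERTURBED = [[0, 3, 5.5, 6], [3, 0, 6, 7], [5.5, 6, 0, 7], [6, 7, 7, 0]]

for name, d in [("star", STAR), ("resolved", RESOLVED), ("perturbed", PERTURBED)]:
    print(name, is_tree_metric(d), is_unresolved_quartet(d, 0, 1, 2, 3))
```

The perturbed matrix is the kind of input that distance-based algorithms must map back to some tree metric, and near a polytomy small perturbations can tip the result toward different resolved topologies.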
Visualizing phylogenetic tree landscapes.

PubMed

Wilgenbusch, James C; Huang, Wen; Gallivan, Kyle A

2017-02-02

Genomic-scale sequence alignments are increasingly used to infer phylogenies in order to better understand the processes and patterns of evolution. Different partitions within these new alignments (e.g., genes, codon positions, and structural features) often favor hundreds if not thousands of competing phylogenies. Summarizing and comparing phylogenies obtained from multi-source data sets using current consensus tree methods discards valuable information and can disguise potential methodological problems. Discovery of efficient and accurate dimensionality reduction methods used to display at once in 2 or 3 dimensions the relationship among these competing phylogenies will help practitioners diagnose the limits of current evolutionary models and potential problems with phylogenetic reconstruction methods when analyzing large multi-source data sets. We introduce several dimensionality reduction methods to visualize in 2 and 3 dimensions the relationship among competing phylogenies obtained from gene partitions found in three mid- to large-size mitochondrial genome alignments. We test the performance of these dimensionality reduction methods by applying several goodness-of-fit measures. The intrinsic dimensionality of each data set is also estimated to determine whether projections in 2 and 3 dimensions can be expected to reveal meaningful relationships among trees from different data partitions. Several new approaches to aid in the comparison of different phylogenetic landscapes are presented. Curvilinear Components Analysis (CCA) and a stochastic gradient descent (SGD) optimization method give the best representation of the original tree-to-tree distance matrix for each of the three mitochondrial genome alignments and greatly outperformed the method currently used to visualize tree landscapes. The CCA + SGD method converged at least as fast as previously applied methods for visualizing tree landscapes.
We demonstrate for all three mtDNA alignments that 3D projections significantly increase the fit between the tree-to-tree distances and can facilitate the interpretation of the relationship among phylogenetic trees. We demonstrate that the choice of dimensionality reduction method can significantly influence the spatial relationship among a large set of competing phylogenetic trees. We highlight the importance of selecting a dimensionality reduction method to visualize large multi-locus phylogenetic landscapes and demonstrate that 3D projections of mitochondrial tree landscapes better capture the relationship among the trees being compared.
Phylo_dCor: distance correlation as a novel metric for phylogenetic profiling.

PubMed

Sferra, Gabriella; Fratini, Federica; Ponzi, Marta; Pizzi, Elisabetta

2017-09-05

Elaboration of powerful methods to predict functional and/or physical protein-protein interactions from genome sequence is one of the main tasks in the post-genomic era. Phylogenetic profiling allows the prediction of protein-protein interactions at a whole genome level in both Prokaryotes and Eukaryotes. For this reason it is considered one of the most promising methods. Here, we propose an improvement of phylogenetic profiling that enables the handling of large genomic datasets and the inference of global protein-protein interactions. This method uses the distance correlation as a new measure of phylogenetic profile similarity. We constructed robust reference sets and developed Phylo-dCor, a parallelized version of the algorithm for calculating the distance correlation that makes it applicable to large genomic data. Using Saccharomyces cerevisiae and Escherichia coli genome datasets, we showed that Phylo-dCor outperforms previously described phylogenetic profiling methods based on mutual information and Pearson's correlation as measures of profile similarity. In this work, we constructed and assessed robust reference sets and propose the distance correlation as a measure for comparing phylogenetic profiles. To make it applicable to large genomic data, we developed Phylo-dCor, a parallelized version of the algorithm for calculating the distance correlation. Two R scripts that can be run on a wide range of machines are available upon request.
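To make the similarity measure concrete, here is a minimal, unparallelized sketch of the sample distance correlation between two phylogenetic profiles, following the standard definition based on double-centered distance matrices. It is not the Phylo-dCor implementation (which the abstract says is distributed as R scripts), and the binary toy profiles are hypothetical.

```python
import math

def _centered_distances(x):
    """Pairwise |x_i - x_j| matrix, double-centered (row, column, grand mean)."""
    n = len(x)
    d = [[abs(a - b) for b in x] for a in x]
    row_mean = [sum(row) / n for row in d]   # equals column means (symmetric)
    grand_mean = sum(row_mean) / n
    return [[d[i][j] - row_mean[i] - row_mean[j] + grand_mean
             for j in range(n)] for i in range(n)]

def distance_correlation(x, y):
    """Sample distance correlation in [0, 1]; returns 0.0 for a constant profile."""
    if len(x) != len(y):
        raise ValueError("profiles must cover the same genomes")
    n = len(x)
    a, b = _centered_distances(x), _centered_distances(y)
    dcov2 = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n ** 2
    dvar_x = sum(v * v for row in a for v in row) / n ** 2
    dvar_y = sum(v * v for row in b for v in row) / n ** 2
    denom = math.sqrt(dvar_x * dvar_y)
    if denom == 0.0:
        return 0.0
    return math.sqrt(max(dcov2, 0.0) / denom)

# Hypothetical presence (1) / absence (0) profiles across six genomes.
PROFILE_A = [1, 1, 0, 0, 1, 0]
PROFILE_B = [1, 1, 0, 1, 1, 0]   # differs from A in one genome
PROFILE_C = [1, 1, 1, 1, 1, 1]   # present everywhere: carries no signal

print(distance_correlation(PROFILE_A, PROFILE_A))
print(distance_correlation(PROFILE_A, PROFILE_B))
print(distance_correlation(PROFILE_A, PROFILE_C))
```

The double loop costs O(n²) time and memory per pair of profiles, which is why a parallelized version matters once every pair of proteins in a genome has to be scored.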
Tanglegrams for rooted phylogenetic trees and networks

PubMed Central

Scornavacca, Celine; Zickmann, Franziska; Huson, Daniel H.

2011-01-01

Motivation: In systematic biology, one is often faced with the task of comparing different phylogenetic trees, in particular in multi-gene analysis or cospeciation studies. One approach is to use a tanglegram in which two rooted phylogenetic trees are drawn opposite each other, using auxiliary lines to connect matching taxa. There is an increasing interest in using rooted phylogenetic networks to represent evolutionary history, so as to explicitly represent reticulate events, such as horizontal gene transfer, hybridization or reassortment. Thus, the question arises how to define and compute a tanglegram for such networks. Results: In this article, we present the first formal definition of a tanglegram for rooted phylogenetic networks and present a heuristic approach for computing one, called the NN-tanglegram method. We compare the performance of our method with existing tree tanglegram algorithms and also show a typical application to real biological datasets. For maximum usability, the algorithm does not require that the trees or networks are bifurcating or bicombining, or that they are on identical taxon sets. Availability: The algorithm is implemented in our program Dendroscope 3, which is freely available from www.dendroscope.org. Contact: scornava@informatik.uni-tuebingen.de; huson@informatik.uni-tuebingen.de PMID:21685078
Cross-validation to select Bayesian hierarchical models in phylogenetics.

PubMed

Duchêne, Sebastián; Duchêne, David A; Di Giallonardo, Francesca; Eden, John-Sebastian; Geoghegan, Jemma L; Holt, Kathryn E; Ho, Simon Y W; Holmes, Edward C

2016-05-26

Recent developments in Bayesian phylogenetic models have increased the range of inferences that can be drawn from molecular sequence data. Accordingly, model selection has become an important component of phylogenetic analysis. Methods of model selection generally consider the likelihood of the data under the model in question. In the context of Bayesian phylogenetics, the most common approach involves estimating the marginal likelihood, which is typically done by integrating the likelihood across model parameters, weighted by the prior. Although this method is accurate, it is sensitive to the presence of improper priors. We explored an alternative approach based on cross-validation that is widely used in evolutionary analysis. This involves comparing models according to their predictive performance. We analysed simulated data and a range of viral and bacterial data sets using a cross-validation approach to compare a variety of molecular clock and demographic models. Our results show that cross-validation can be effective in distinguishing between strict- and relaxed-clock models and in identifying demographic models that allow growth in population size over time. In most of our empirical data analyses, the model selected using cross-validation was able to match that selected using marginal-likelihood estimation. The accuracy of cross-validation appears to improve with longer sequence data, particularly when distinguishing between relaxed-clock models. Cross-validation is a useful method for Bayesian phylogenetic model selection. This method can be readily implemented even when considering complex models where selecting an appropriate prior for all parameters may be difficult.
Bayesian models for comparative analysis integrating phylogenetic uncertainty.

PubMed

de Villemereuil, Pierre; Wells, Jessie A; Edwards, Robert D; Blomberg, Simon P

2012-06-28

Uncertainty in comparative analyses can come from at least two sources: a) phylogenetic uncertainty in the tree topology or branch lengths, and b) uncertainty due to intraspecific variation in trait values, either due to measurement error or natural individual variation. Most phylogenetic comparative methods do not account for such uncertainties. Not accounting for these sources of uncertainty leads to false perceptions of precision (confidence intervals will be too narrow) and inflated significance in hypothesis testing (e.g. p-values will be too small). Although there is some application-specific software for fitting Bayesian models accounting for phylogenetic error, more general and flexible software is desirable. We developed models to directly incorporate phylogenetic uncertainty into a range of analyses that biologists commonly perform, using a Bayesian framework and Markov Chain Monte Carlo analyses. We demonstrate applications in linear regression, quantification of phylogenetic signal, and measurement error models. Phylogenetic uncertainty was incorporated by applying a prior distribution for the phylogeny, where this distribution consisted of the posterior tree sets from Bayesian phylogenetic tree estimation programs. The models were analysed using simulated data sets, and applied to a real data set on plant traits, from rainforest plant species in Northern Australia. Analyses were performed using the free and open source software OpenBUGS and JAGS. Incorporating phylogenetic uncertainty through an empirical prior distribution of trees leads to more precise estimation of regression model parameters than using a single consensus tree and enables a more realistic estimation of confidence intervals. In addition, models incorporating measurement errors and/or individual variation, in one or both variables, are easily formulated in the Bayesian framework. 
We show that BUGS is a useful, flexible general purpose tool for phylogenetic comparative analyses, particularly for modelling in the face of phylogenetic uncertainty and accounting for measurement error or individual variation in explanatory variables. Code for all models is provided in the BUGS model description language.
Cophenetic correlation analysis as a strategy to select phylogenetically informative proteins: an example from the fungal kingdom

PubMed Central

Kuramae, Eiko E; Robert, Vincent; Echavarri-Erasun, Carlos; Boekhout, Teun

2007-01-01

Background: The construction of robust and well resolved phylogenetic trees is important for our understanding of many, if not all, biological processes, including speciation and origin of higher taxa, genome evolution, metabolic diversification, multicellularity, origin of life styles, pathogenicity and so on. Many older phylogenies were not well supported due to insufficient phylogenetic signal present in the single or few genes used in phylogenetic reconstructions. Importantly, single gene phylogenies were not always found to be congruent. The phylogenetic signal may, therefore, be increased by enlarging the number of genes included in phylogenetic studies. Unfortunately, concatenation of many genes does not take into consideration the evolutionary history of each individual gene. Here, we describe an approach to select informative phylogenetic proteins to be used in the Tree of Life (TOL) and barcoding projects by comparing the cophenetic correlation coefficients (CCC) among the distance matrices of individual proteins, using the fungi as an example. The method demonstrated that the quality and number of concatenated proteins are important for a reliable estimation of TOL. Approximately 40–45 concatenated proteins seem to be needed to resolve fungal TOL. Results: In total 4852 orthologous proteins (KOGs) were assigned among 33 fungal genomes from the Asco- and Basidiomycota and 70 of these represented single copy proteins. The individual protein distance matrices based on 531 concatenated proteins that have been used for phylogeny reconstruction before [14] were compared one with another in order to select those with the highest CCC, which then was used as a reference. This reference distance matrix was compared with those of the 70 single copy proteins selected and their CCC values were calculated. Sixty-four KOGs showed a CCC above 0.50 and these were further considered for their phylogenetic potential.
Proteins belonging to the cellular processes and signaling KOG category seem more informative than those belonging to the other three categories: information storage and processing; metabolism; and the poorly characterized category. After concatenation of 40 proteins the topology of the phylogenetic tree remained stable, but after concatenation of 60 or more proteins the bootstrap support values of some branches decreased, most likely due to the inclusion of proteins with lower CCC values. The selection of protein sequences to be used in various TOL projects remains a critical and important process. The method described in this paper will contribute to a more objective selection of phylogenetically informative protein sequences. Conclusion: This study provides candidate protein sequences to be considered as phylogenetic markers in different branches of fungal TOL. The selection procedure described here will be useful to select informative protein sequences to resolve branches of TOL that contain few or no species with completely sequenced genomes. The robust phylogenetic trees resulting from this method may contribute to our understanding of organismal diversification processes. The method proposed can be extended easily to other branches of TOL. PMID:17688684
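The selection step described above (compare each protein's distance matrix with a reference matrix and keep those whose coefficient exceeds 0.50) can be sketched as follows. The sketch assumes the coefficient is computed as a Pearson correlation over the off-diagonal entries of two pairwise distance matrices, which may differ in detail from the authors' procedure. The matrices and protein names are hypothetical, and only the 0.50 threshold comes from the abstract.

```python
import math

def upper_triangle(m):
    """Off-diagonal entries above the diagonal, in row-major order."""
    n = len(m)
    return [m[i][j] for i in range(n) for j in range(i + 1, n)]

def matrix_correlation(m1, m2):
    """Pearson correlation between matching entries of two distance matrices."""
    x, y = upper_triangle(m1), upper_triangle(m2)
    if len(x) != len(y):
        raise ValueError("matrices must cover the same taxa")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation is undefined for constant distances")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical reference matrix for four taxa (e.g. from concatenated proteins).
REFERENCE = [[0, 2, 6, 8], [2, 0, 6, 8], [6, 6, 0, 8], [8, 8, 8, 0]]
CANDIDATES = {
    # Same relative distances, evolving twice as fast: fully congruent.
    "protein_x": [[0, 4, 12, 16], [4, 0, 12, 16], [12, 12, 0, 16], [16, 16, 16, 0]],
    # Discordant distances, e.g. a protein with a different history.
    "protein_y": [[0, 8, 6, 2], [8, 0, 6, 4], [6, 6, 0, 8], [2, 4, 8, 0]],
}

THRESHOLD = 0.50  # cutoff quoted in the abstract
scores = {name: matrix_correlation(REFERENCE, m) for name, m in CANDIDATES.items()}
selected = sorted(name for name, score in scores.items() if score > THRESHOLD)
print(scores, selected)
```

Because the correlation ignores overall scale, a fast-evolving but congruent protein still scores highly, which suits a screen aimed at topological signal rather than rate.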
Assigning protein functions by comparative genome analysis protein phylogenetic profiles

DOEpatents

Pellegrini, Matteo; Marcotte, Edward M.; Thompson, Michael J.; Eisenberg, David; Grothe, Robert; Yeates, Todd O.

2003-05-13

A computational method, system, and computer program are provided for inferring functional links from genome sequences. One method is based on the observation that some pairs of proteins A' and B' have homologs in another organism fused into a single protein chain AB. A trans-genome comparison of sequences can reveal these AB sequences, which are Rosetta Stone sequences because they decipher an interaction between A' and B'. Another method compares the genomic sequence of two or more organisms to create a phylogenetic profile for each protein indicating its presence or absence across all the genomes. The profile provides information regarding functional links between different families of proteins. In yet another method a combination of the above two methods is used to predict functional links.
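The phylogenetic profile idea can be illustrated with a few lines of code: encode each protein as a presence/absence vector across genomes, then group proteins whose vectors match. This is a minimal sketch on hypothetical genome contents and protein names. It uses exact profile matching as a simplifying assumption and is not the patented procedure.

```python
from collections import defaultdict

# Hypothetical genome contents: genome name -> set of protein families with a homolog.
GENOMES = {
    "genome_1": {"P1", "P2", "P3"},
    "genome_2": {"P3", "P4"},
    "genome_3": {"P1", "P2", "P4"},
}

def build_profiles(genome_contents):
    """Map each protein to a tuple of 1/0 flags, one per genome (sorted by name)."""
    genome_names = sorted(genome_contents)
    proteins = sorted(set().union(*genome_contents.values()))
    return {
        protein: tuple(int(protein in genome_contents[g]) for g in genome_names)
        for protein in proteins
    }

def group_by_profile(profiles):
    """Return groups of two or more proteins that share an identical profile."""
    groups = defaultdict(list)
    for protein, profile in sorted(profiles.items()):
        groups[profile].append(protein)
    return [members for members in groups.values() if len(members) > 1]

profiles = build_profiles(GENOMES)
linked = group_by_profile(profiles)
print(profiles)
print(linked)
```

Here P1 and P2 are present and absent in the same genomes, so they are flagged as candidates for a functional link. With real data one would typically tolerate a few mismatches rather than require identical profiles.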
Short Tree, Long Tree, Right Tree, Wrong Tree: New Acquisition Bias Corrections for Inferring SNP Phylogenies

PubMed Central

Leaché, Adam D.; Banbury, Barbara L.; Felsenstein, Joseph; Nieto-Montes de Oca, Adrián; Stamatakis, Alexandros

2015-01-01

Single nucleotide polymorphisms (SNPs) are useful markers for phylogenetic studies owing in part to their ubiquity throughout the genome and ease of collection. Restriction site associated DNA sequencing (RADseq) methods are becoming increasingly popular for SNP data collection, but an assessment of the best practices for using these data in phylogenetics is lacking. We use computer simulations, and new double digest RADseq (ddRADseq) data for the lizard family Phrynosomatidae, to investigate the accuracy of RAD loci for phylogenetic inference. We compare the two primary ways RAD loci are used during phylogenetic analysis: the analysis of full sequences (i.e., SNPs together with invariant sites), and the analysis of SNPs on their own after excluding invariant sites. We find that using full sequences rather than just SNPs is preferable from the perspectives of branch length and topological accuracy, but not of computational time. We introduce two new acquisition bias corrections for dealing with alignments composed exclusively of SNPs, a conditional likelihood method and a reconstituted DNA approach. The conditional likelihood method conditions on the presence of variable characters only (the number of invariant sites that are unsampled but known to exist is not considered), while the reconstituted DNA approach requires the user to specify the exact number of unsampled invariant sites prior to the analysis. Under simulation, branch length biases increase with the amount of missing data for both acquisition bias correction methods, but branch length accuracy is much improved in the reconstituted DNA approach compared to the conditional likelihood approach.
Phylogenetic analyses of the empirical data using concatenation or a coalescent-based species tree approach provide strong support for many of the accepted relationships among phrynosomatid lizards, suggesting that RAD loci contain useful phylogenetic signal across a range of divergence times despite the presence of missing data. Phylogenetic analysis of RAD loci requires careful attention to model assumptions, especially if downstream analyses depend on branch lengths. PMID:26227865
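A toy calculation can show why branch lengths are sensitive to how invariant sites are handled. The sketch below is not the authors' implementation: it assumes only two sequences and the Jukes-Cantor (JC69) model, and the site counts are invented. It compares the maximum-likelihood distance estimated from SNPs alone with the estimate obtained once a user-specified number of invariant sites is added back, in the spirit of the reconstituted DNA approach.

```python
import math

def jc69_distance(n_variable, n_invariant):
    """ML JC69 distance (substitutions per site) between two sequences,
    given counts of differing and identical sites."""
    p = n_variable / (n_variable + n_invariant)  # proportion of differing sites
    if p >= 0.75:
        return math.inf  # JC69 saturates at 75% differences
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

# Hypothetical data: 50 SNPs, with 950 invariant sites left unsampled.
snps_only = jc69_distance(50, 0)          # every site differs, so the estimate diverges
with_invariants = jc69_distance(50, 950)  # invariant sites restored

print(snps_only, round(with_invariants, 4))  # inf 0.0517
```

With more than two taxa the same effect appears as inflated branch lengths rather than an infinite estimate, which is what the two corrections described in the abstract are meant to address.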
Inferring Phylogenetic Networks Using PhyloNet.

PubMed

Wen, Dingqiao; Yu, Yun; Zhu, Jiafan; Nakhleh, Luay

2018-07-01

PhyloNet was released in 2008 as a software package for representing and analyzing phylogenetic networks. At the time of its release, the main functionalities in PhyloNet consisted of measures for comparing network topologies and a single heuristic for reconciling gene trees with a species tree. Since then, PhyloNet has grown significantly. The software package now includes a wide array of methods for inferring phylogenetic networks from data sets of unlinked loci while accounting for both reticulation (e.g., hybridization) and incomplete lineage sorting. In particular, PhyloNet now allows for maximum parsimony, maximum likelihood, and Bayesian inference of phylogenetic networks from gene tree estimates. Furthermore, Bayesian inference directly from sequence data (sequence alignments or biallelic markers) is implemented. Maximum parsimony is based on an extension of the "minimizing deep coalescences" criterion to phylogenetic networks, whereas maximum likelihood and Bayesian inference are based on the multispecies network coalescent. All methods allow for multiple individuals per species. As computing the likelihood of a phylogenetic network is computationally hard, PhyloNet allows for evaluation and inference of networks using a pseudolikelihood measure. PhyloNet summarizes the results of the various analyses and generates phylogenetic networks in the extended Newick format that is readily viewable by existing visualization software.
Undergraduate Students’ Initial Ability in Understanding Phylogenetic Tree

NASA Astrophysics Data System (ADS)

Sa'adah, S.; Hidayat, T.; Sudargo, Fransisca

2017-04-01

A phylogenetic tree is a visual representation that depicts a hypothesis about the evolutionary relationships among taxa. Evolutionary experts use this representation to evaluate the evidence for evolution. Use of the phylogenetic tree is growing across many disciplines in biology. Consequently, learning about the phylogenetic tree has become an important part of biological education and an interesting area of biology education research. The skill of understanding and reasoning about phylogenetic trees (called tree thinking) is important for biology students. However, research has shown that many students have difficulty interpreting, constructing, and comparing phylogenetic trees, and hold misconceptions about them. Students are often not taught how to reason about the evolutionary relationships depicted in the diagram, nor are they given information about the underlying theory and process of phylogenetics. This study aims to investigate the initial ability of undergraduate students to understand and reason about phylogenetic trees. The research method is descriptive. Students were given multiple-choice questions and an essay representing tree-thinking elements. Correct answers were converted to percentages. Each student was also given a questionnaire. The results showed that the undergraduate students' initial ability to understand and reason about phylogenetic trees is low. Many students were not able to answer questions about the phylogenetic tree. Only 19% of the undergraduate students answered correctly on the indicator of evaluating the evolutionary relationships among taxa, 25% on the indicator of applying the concept of the clade, and 17% on the indicator of determining character evolution, and only a few could construct a phylogenetic tree.
Phylogenetic patterns of climatic, habitat and trophic niches in a European avian assemblage

PubMed Central

Pearman, Peter B; Lavergne, Sébastien; Roquet, Cristina; Wüest, Rafael; Zimmermann, Niklaus E; Thuiller, Wilfried

2014-01-01

Aim The origins of ecological diversity in continental species assemblages have long intrigued biogeographers. We apply phylogenetic comparative analyses to disentangle the evolutionary patterns of ecological niches in an assemblage of European birds. We compare phylogenetic patterns in trophic, habitat and climatic niche components. Location Europe. Methods From polygon range maps and handbook data we inferred the realized climatic, habitat and trophic niches of 405 species of breeding birds in Europe. We fitted Pagel's lambda and kappa statistics, and conducted analyses of disparity through time to compare temporal patterns of ecological diversification on all niche axes together. All observed patterns were compared with expectations based on neutral (Brownian) models of niche divergence. Results In this assemblage, patterns of phylogenetic signal (lambda) suggest that related species resemble each other less in regard to their climatic and habitat niches than they do in their trophic niche. Kappa estimates show that ecological divergence does not gradually increase with divergence time, and that this punctualism is stronger in climatic niches than in habitat and trophic niches. Observed niche disparity markedly exceeds levels expected from a Brownian model of ecological diversification, thus providing no evidence for past phylogenetic niche conservatism in these multivariate niches. Levels of multivariate disparity are greatest for the climatic niche, followed by disparity of the habitat and the trophic niches. Main conclusions Phylogenetic patterns in the three niche components differ within this avian assemblage. Variation in evolutionary rates (degree of gradualism, constancy through the tree) and/or non-random macroecological sampling probably lead here to differences in the phylogenetic structure of niche components. 
Testing hypotheses on the origin of these patterns requires more complete phylogenetic trees of the birds, and extended ecological data on different niche components for all bird species. PMID:24790525
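The two statistics named in this abstract are usually described as tree transformations, which a short sketch can make concrete. Pagel's lambda scales the off-diagonal entries (shared history) of the phylogenetic variance-covariance matrix, and kappa raises each branch length to a power. The three-taxon tree and its branch lengths below are hypothetical, and the sketch only applies the transformations; it does not fit them to data as the study did.

```python
def lambda_transform(vcv, lam):
    """Scale off-diagonal (shared-history) covariances by lambda; keep variances."""
    n = len(vcv)
    return [
        [vcv[i][j] if i == j else lam * vcv[i][j] for j in range(n)]
        for i in range(n)
    ]

def kappa_transform(branch_lengths, kappa):
    """Raise every (positive) branch length to the power kappa."""
    return [b ** kappa for b in branch_lengths]

# Hypothetical tree ((A:1,B:1):1,C:2): entries are root-to-ancestor shared path lengths.
vcv = [
    [2.0, 1.0, 0.0],
    [1.0, 2.0, 0.0],
    [0.0, 0.0, 2.0],
]

star = lambda_transform(vcv, 0.0)      # lambda = 0: no phylogenetic signal
brownian = lambda_transform(vcv, 1.0)  # lambda = 1: unchanged Brownian expectation
punctuated = kappa_transform([1.0, 1.0, 1.0, 2.0], 0.0)  # kappa = 0: all branches equal
```

Lambda near 0 means related species resemble each other no more than unrelated ones, and kappa near 0 means divergence is independent of branch length, which is the "punctualism" the abstract refers to.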

Random sampling of constrained phylogenies: conducting phylogenetic analyses when the phylogeny is partially known.

PubMed

Housworth, E A; Martins, E P

2001-01-01

Statistical randomization tests in evolutionary biology often require a set of random, computer-generated trees. For example, earlier studies have shown how large numbers of computer-generated trees can be used to conduct phylogenetic comparative analyses even when the phylogeny is uncertain or unknown. These methods were limited, however, in that (in the absence of molecular sequence or other data) they allowed users to assume that no phylogenetic information was available or that all possible trees were known. Intermediate situations where only a taxonomy or other limited phylogenetic information (e.g., polytomies) are available are technically more difficult. The current study describes a procedure for generating random samples of phylogenies while incorporating limited phylogenetic information (e.g., four taxa belong together in a subclade). The procedure can be used to conduct comparative analyses when the phylogeny is only partially resolved or can be used in other randomization tests in which large numbers of possible phylogenies are needed.
Comparing Phylogenetic Trees by Matching Nodes Using the Transfer Distance Between Partitions

PubMed Central

Bogdanowicz, Damian; Giaro, Krzysztof

2017-01-01

Ability to quantify dissimilarity of different phylogenetic trees describing the relationship between the same group of taxa is required in various types of phylogenetic studies. For example, such metrics are used to assess the quality of phylogeny construction methods, to define optimization criteria in supertree building algorithms, or to find horizontal gene transfer (HGT) events. Among the set of metrics described so far in the literature, the most commonly used seems to be the Robinson–Foulds distance. In this article, we define a new metric for rooted trees: the Matching Pair (MP) distance. The MP metric uses the concept of the minimum-weight perfect matching in a complete bipartite graph constructed from partitions of all pairs of leaves of the compared phylogenetic trees. We analyze the properties of the MP metric and present computational experiments showing its potential applicability in tasks related to finding the HGT events. PMID:28177699
Comparing Phylogenetic Trees by Matching Nodes Using the Transfer Distance Between Partitions.

PubMed

Bogdanowicz, Damian; Giaro, Krzysztof

2017-05-01

Ability to quantify dissimilarity of different phylogenetic trees describing the relationship between the same group of taxa is required in various types of phylogenetic studies. For example, such metrics are used to assess the quality of phylogeny construction methods, to define optimization criteria in supertree building algorithms, or to find horizontal gene transfer (HGT) events. Among the set of metrics described so far in the literature, the most commonly used seems to be the Robinson-Foulds distance. In this article, we define a new metric for rooted trees: the Matching Pair (MP) distance. The MP metric uses the concept of the minimum-weight perfect matching in a complete bipartite graph constructed from partitions of all pairs of leaves of the compared phylogenetic trees. We analyze the properties of the MP metric and present computational experiments showing its potential applicability in tasks related to finding the HGT events.
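For orientation, the Robinson-Foulds distance that this abstract uses as its baseline can be sketched in a few lines for rooted trees: collect the leaf set (clade) under every internal node of each tree and count the clades found in only one of them. The nested-tuple tree encoding and the example trees are assumptions of this sketch. It does not implement the MP distance itself, which additionally requires a minimum-weight perfect matching.

```python
def clusters(tree):
    """Set of leaf sets (clades) under the internal nodes of a rooted tree.
    A tree is a nested tuple; leaves are strings."""
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        leafset = frozenset().union(*(leaves(child) for child in node))
        found.add(leafset)
        return leafset

    leaves(tree)
    return found

def rf_distance(t1, t2):
    """Rooted Robinson-Foulds distance: number of clades present in exactly one tree."""
    return len(clusters(t1) ^ clusters(t2))

t1 = ((("A", "B"), "C"), ("D", "E"))
t2 = ((("A", "C"), "B"), ("D", "E"))
print(rf_distance(t1, t2))  # 2: clades {A,B} and {A,C} are each unique to one tree
```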
Enumerating all maximal frequent subtrees in collections of phylogenetic trees

PubMed Central

2014-01-01

Background A common problem in phylogenetic analysis is to identify frequent patterns in a collection of phylogenetic trees. The goal is, roughly, to find a subset of the species (taxa) on which all or some significant subset of the trees agree. One popular method to do so is through maximum agreement subtrees (MASTs). MASTs are also used, among other things, as a metric for comparing phylogenetic trees, computing congruence indices and to identify horizontal gene transfer events. Results We give algorithms and experimental results for two approaches to identify common patterns in a collection of phylogenetic trees, one based on agreement subtrees, called maximal agreement subtrees, the other on frequent subtrees, called maximal frequent subtrees. These approaches can return subtrees on larger sets of taxa than MASTs, and can reveal new common phylogenetic relationships not present in either MASTs or the majority rule tree (a popular consensus method). Our current implementation is available on the web at https://code.google.com/p/mfst-miner/. Conclusions Our computational results confirm that maximal agreement subtrees and all maximal frequent subtrees can reveal a more complete phylogenetic picture of the common patterns in collections of phylogenetic trees than maximum agreement subtrees; they are also often more resolved than the majority rule tree. Further, our experiments show that enumerating maximal frequent subtrees is considerably more practical than enumerating ordinary (not necessarily maximal) frequent subtrees. PMID:25061474
A methodological investigation of hominoid craniodental morphology and phylogenetics.

PubMed

Bjarnason, Alexander; Chamberlain, Andrew T; Lockwood, Charles A

2011-01-01

The evolutionary relationships of extant great apes and humans have been largely resolved by molecular studies, yet morphology-based phylogenetic analyses continue to provide conflicting results. In order to further investigate this discrepancy we present bootstrap clade support of morphological data based on two quantitative datasets, one dataset consisting of linear measurements of the whole skull from 5 hominoid genera and the second dataset consisting of 3D landmark data from the temporal bone of 5 hominoid genera, including 11 sub-species. Using similar protocols for both datasets, we were able to 1) compare distance-based phylogenetic methods to cladistic parsimony of quantitative data converted into discrete character states, 2) vary outgroup choice to observe its effect on phylogenetic inference, and 3) analyse male and female data separately to observe the effect of sexual dimorphism on phylogenies. Phylogenetic analysis was sensitive to methodological decisions, particularly outgroup selection, where designation of Pongo as an outgroup and removal of Hylobates resulted in greater congruence with the proposed molecular phylogeny. The performance of distance-based methods also justifies their use in phylogenetic analysis of morphological data. It is clear from our analyses that hominoid phylogenetics ought not to be used as an example of conflict between the morphological and molecular, but as an example of how outgroup and methodological choices can affect the outcome of phylogenetic analysis. Copyright © 2010 Elsevier Ltd. All rights reserved.
Enumerating all maximal frequent subtrees in collections of phylogenetic trees.

PubMed

Deepak, Akshay; Fernández-Baca, David

2014-01-01

A common problem in phylogenetic analysis is to identify frequent patterns in a collection of phylogenetic trees. The goal is, roughly, to find a subset of the species (taxa) on which all or some significant subset of the trees agree. One popular method to do so is through maximum agreement subtrees (MASTs). MASTs are also used, among other things, as a metric for comparing phylogenetic trees, computing congruence indices and to identify horizontal gene transfer events. We give algorithms and experimental results for two approaches to identify common patterns in a collection of phylogenetic trees, one based on agreement subtrees, called maximal agreement subtrees, the other on frequent subtrees, called maximal frequent subtrees. These approaches can return subtrees on larger sets of taxa than MASTs, and can reveal new common phylogenetic relationships not present in either MASTs or the majority rule tree (a popular consensus method). Our current implementation is available on the web at https://code.google.com/p/mfst-miner/. Our computational results confirm that maximal agreement subtrees and all maximal frequent subtrees can reveal a more complete phylogenetic picture of the common patterns in collections of phylogenetic trees than maximum agreement subtrees; they are also often more resolved than the majority rule tree. Further, our experiments show that enumerating maximal frequent subtrees is considerably more practical than enumerating ordinary (not necessarily maximal) frequent subtrees.
A short note on the paper of Liu et al. (2012). A relative Lempel-Ziv complexity: Application to comparing biological sequences. Chemical Physics Letters, volume 530, 19 March 2012, pages 107-112

NASA Astrophysics Data System (ADS)

Arit, Turkan; Keskin, Burak; Firuzan, Esin; Cavas, Cagin Kandemir; Liu, Liwei; Cavas, Levent

2018-04-01

The report entitled "L. Liu, D. Li, F. Bai, A relative Lempel-Ziv complexity: Application to comparing biological sequences, Chem. Phys. Lett. 530 (2012) 107-112" describes a powerful method for constructing phylogenetic trees based on the Lempel-Ziv algorithm. However, the method explained in that paper does not give promising results on a data set of invasive Caulerpa taxifolia in the Mediterranean Sea. This short note presents the phylogenetic trees obtained with the proposed method of the aforementioned paper.
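As background, Lempel-Ziv complexity counts how many new phrases are needed to build a sequence from left to right, where each phrase is the shortest substring not already seen in the preceding text. The sketch below implements the classic 1976 parsing in a simple quadratic-time form. It is a generic illustration and an assumption about the underlying building block, not the relative complexity measure defined by Liu et al.

```python
def lz76_complexity(s):
    """Number of phrases in the Lempel-Ziv (1976) exhaustive parsing of s."""
    i, count, n = 0, 0, len(s)
    while i < n:
        length = 1
        # Extend the phrase while it can still be copied from earlier text
        # (the copy source may overlap the phrase, except for its last symbol).
        while i + length <= n and s[i:i + length] in s[:i + length - 1]:
            length += 1
        count += 1
        i += length
    return count

# Parsing: 0 | 001 | 10 | 100 | 1000 | 101
print(lz76_complexity("0001101001000101"))  # 6
```

Sequence-comparison methods of this family typically compare the complexity of two sequences with that of their concatenation; the exact formula used in the cited paper is not reproduced here.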
The phylogenetic position of the Critically Endangered Saint Croix ground lizard Ameiva polops: revisiting molecular systematics of West Indian Ameiva.

PubMed

Hurtado, Luis A; Santamaria, Carlos A; Fitzgerald, Lee A

2014-05-06

The phylogenetic position of the critically endangered Saint Croix ground lizard Ameiva polops is presently unknown and several hypotheses have been proposed. We investigated the phylogenetic position of this species using molecular phylogenetic methods. We obtained sequences of DNA fragments of the mitochondrial ribosomal genes 12S rDNA and 16S rDNA for this species. We aligned these sequences with published sequences of other Ameiva species, which include most of the Ameiva species from the West Indies, three Ameiva species from Central America and South America, and one from the teiid lizard Tupinambis teguixin, which was used as outgroup. We conducted Maximum Likelihood and Bayesian phylogenetic analyses. The phylogenetic reconstructions among the different methods were very similar, supporting the monophyly of West Indian Ameiva and showing within this lineage, a basal polytomy of four clades that are separated geographically. Ameiva polops grouped in a cluster that included the other two Ameiva species found in the Puerto Rican Bank: A. wetmorei and A. exsul. A sister relationship between A. polops and A. wetmorei is suggested by our analyses. We compare our results with a previous study on molecular systematics of West Indian Ameiva.
Your place or mine? A phylogenetic comparative analysis of marital residence in Indo-European and Austronesian societies

PubMed Central

Fortunato, Laura; Jordan, Fiona

2010-01-01

Accurate reconstruction of prehistoric social organization is important if we are to put together satisfactory multidisciplinary scenarios about, for example, the dispersal of human groups. Such considerations apply in the case of Indo-European and Austronesian, two large-scale language families that are thought to represent Neolithic expansions. Ancestral kinship patterns have mostly been inferred through reconstruction of kin terminologies in ancestral proto-languages using the linguistic comparative method, and through geographical or distributional arguments based on the comparative patterns of kin terms and ethnographic kinship ‘facts’. While these approaches are detailed and valuable, the processes through which conclusions have been drawn from the data fail to provide explicit criteria for systematic testing of alternative hypotheses. Here, we use language trees derived using phylogenetic tree-building techniques on Indo-European and Austronesian vocabulary data. With these trees, ethnographic data and Bayesian phylogenetic comparative methods, we statistically reconstruct past marital residence and infer rates of cultural change between different residence forms, showing Proto-Indo-European to be virilocal and Proto-Malayo-Polynesian uxorilocal. The instability of uxorilocality and the rare loss of virilocality once gained emerge as common features of both families. PMID:21041215
Phylogenetic search through partial tree mixing

PubMed Central

2012-01-01

Background Recent advances in sequencing technology have created large data sets upon which phylogenetic inference can be performed. Current research is limited by the prohibitive time necessary to perform tree search on a reasonable number of individuals. This research develops new phylogenetic algorithms that can operate on tens of thousands of species in a reasonable amount of time through several innovative search techniques. Results When compared to popular phylogenetic search algorithms, better trees are found much more quickly for large data sets. These algorithms are incorporated in the PSODA application available at http://dna.cs.byu.edu/psoda Conclusions The use of Partial Tree Mixing in a partition based tree space allows the algorithm to quickly converge on near optimal tree regions. These regions can then be searched in a methodical way to determine the overall optimal phylogenetic solution. PMID:23320449
Automatic detection of key innovations, rate shifts, and diversity-dependence on phylogenetic trees.

PubMed

Rabosky, Daniel L

2014-01-01

A number of methods have been developed to infer differential rates of species diversification through time and among clades using time-calibrated phylogenetic trees. However, we lack a general framework that can delineate and quantify heterogeneous mixtures of dynamic processes within single phylogenies. I developed a method that can identify arbitrary numbers of time-varying diversification processes on phylogenies without specifying their locations in advance. The method uses reversible-jump Markov Chain Monte Carlo to move between model subspaces that vary in the number of distinct diversification regimes. The model assumes that changes in evolutionary regimes occur across the branches of phylogenetic trees under a compound Poisson process and explicitly accounts for rate variation through time and among lineages. Using simulated datasets, I demonstrate that the method can be used to quantify complex mixtures of time-dependent, diversity-dependent, and constant-rate diversification processes. I compared the performance of the method to the MEDUSA model of rate variation among lineages. As an empirical example, I analyzed the history of speciation and extinction during the radiation of modern whales. The method described here will greatly facilitate the exploration of macroevolutionary dynamics across large phylogenetic trees, which may have been shaped by heterogeneous mixtures of distinct evolutionary processes.
Automatic Detection of Key Innovations, Rate Shifts, and Diversity-Dependence on Phylogenetic Trees

PubMed Central

Rabosky, Daniel L.

2014-01-01

A number of methods have been developed to infer differential rates of species diversification through time and among clades using time-calibrated phylogenetic trees. However, we lack a general framework that can delineate and quantify heterogeneous mixtures of dynamic processes within single phylogenies. I developed a method that can identify arbitrary numbers of time-varying diversification processes on phylogenies without specifying their locations in advance. The method uses reversible-jump Markov Chain Monte Carlo to move between model subspaces that vary in the number of distinct diversification regimes. The model assumes that changes in evolutionary regimes occur across the branches of phylogenetic trees under a compound Poisson process and explicitly accounts for rate variation through time and among lineages. Using simulated datasets, I demonstrate that the method can be used to quantify complex mixtures of time-dependent, diversity-dependent, and constant-rate diversification processes. I compared the performance of the method to the MEDUSA model of rate variation among lineages. As an empirical example, I analyzed the history of speciation and extinction during the radiation of modern whales. The method described here will greatly facilitate the exploration of macroevolutionary dynamics across large phylogenetic trees, which may have been shaped by heterogeneous mixtures of distinct evolutionary processes. PMID:24586858
A phylogenetic comparative study of flowering phenology along an elevational gradient in the Canadian subarctic.

PubMed

Lessard-Therrien, Malie; Davies, T Jonathan; Bolmgren, Kjell

2014-05-01

Climate change is affecting high-altitude and high-latitude communities in significant ways. In the short growing season of subarctic habitats, it is essential that the timing and duration of phenological phases match favorable environmental conditions. We explored the time of the first appearance of flowers (first flowering day, FFD) and flowering duration across subarctic species composing different communities, from boreal forest to tundra, along an elevational gradient (600-800 m). The study was conducted on Mount Irony (856 m), North-East Canada (54°90'N, 67°16'W) during summer 2012. First, we quantified phylogenetic signal in FFD at different spatial scales. Second, we used phylogenetic comparative methods to explore the relationship between FFD, flowering duration, and elevation. We found that the phylogenetic signal for FFD was stronger at finer spatial scales and at lower elevations, indicating that closely related species tend to flower at similar times when the local environment is less harsh. The comparatively weaker phylogenetic signal at higher elevation may be indicative of convergent evolution for FFD. Flowering duration was correlated significantly with mean FFD, with later-flowering species having a longer flowering duration, but only at the lowest elevation. Our results indicate significant evolutionary conservatism in responses to phenological cues, but high phenotypic plasticity in flowering times. We suggest that phylogenetic relationships should be considered in the search for predictions and drivers of flowering time in comparative analyses, because species cannot be considered as statistically independent. Further, phenological drivers should be measured at spatial scales such that variation in flowering matches variation in environment.
Model selection and model averaging in phylogenetics: advantages of Akaike information criterion and Bayesian approaches over likelihood ratio tests.

PubMed

Posada, David; Buckley, Thomas R

2004-10-01

Model selection is a topic of special relevance in molecular phylogenetics that affects many, if not all, stages of phylogenetic inference. Here we discuss some fundamental concepts and techniques of model selection in the context of phylogenetics. We start by reviewing different aspects of the selection of substitution models in phylogenetics from a theoretical, philosophical and practical point of view, and summarize this comparison in table format. We argue that the most commonly implemented model selection approach, the hierarchical likelihood ratio test, is not the optimal strategy for model selection in phylogenetics, and that approaches like the Akaike Information Criterion (AIC) and Bayesian methods offer important advantages. In particular, the latter two methods are able to simultaneously compare multiple nested or nonnested models, assess model selection uncertainty, and allow for the estimation of phylogenies and model parameters using all available models (model-averaged inference or multimodel inference). We also describe how the relative importance of the different parameters included in substitution models can be depicted. To illustrate some of these points, we have applied AIC-based model averaging to 37 mitochondrial DNA sequences from the subgenus Ohomopterus (genus Carabus) ground beetles described by Sota and Vogler (2001).
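The AIC comparison described here reduces to a small calculation: AIC = 2k - 2 ln L for each model, followed by Akaike weights that express relative support across all candidates at once, whether nested or not. In the sketch below the log-likelihoods are invented, and the parameter counts cover substitution-model parameters only, on the assumption that branch lengths are shared by all models and so cancel in the differences.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood

def akaike_weights(aic_scores):
    """Normalized relative likelihoods exp(-delta/2) of each model."""
    best = min(aic_scores.values())
    rel = {m: math.exp(-0.5 * (s - best)) for m, s in aic_scores.items()}
    total = sum(rel.values())
    return {m: r / total for m, r in rel.items()}

# Hypothetical fits: model -> (log-likelihood, substitution-model parameters).
fits = {"JC69": (-2050.0, 0), "HKY85": (-2010.0, 4), "GTR": (-2008.5, 8)}

scores = {m: aic(lnl, k) for m, (lnl, k) in fits.items()}
weights = akaike_weights(scores)
print(scores)  # {'JC69': 4100.0, 'HKY85': 4028.0, 'GTR': 4033.0}
```

In model-averaged inference, a parameter estimate from each model would then be weighted by these values instead of being taken from the single best model.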
The power and pitfalls of HIV phylogenetics in public health.

PubMed

Brooks, James I; Sandstrom, Paul A

2013-07-25

Phylogenetics is the application of comparative studies of genetic sequences in order to infer evolutionary relationships among organisms. This tool can be used as a form of molecular epidemiology to enhance traditional population-level communicable disease surveillance. Phylogenetic study has resulted in new paradigms being created in the field of communicable diseases and this commentary aims to provide the reader with an explanation of how phylogenetics can be used in tracking infectious diseases. Special emphasis will be placed upon the application of phylogenetics as a tool to help elucidate HIV transmission patterns and the limitations to these methods when applied to forensic analysis. Understanding infectious disease epidemiology in order to prevent new transmissions is the sine qua non of public health. However, with increasing epidemiological resolution, there may be an associated potential loss of privacy to the individual. It is within this context that we aim to promote the discussion on how to use phylogenetics to achieve important public health goals, while at the same time protecting the rights of the individual.
Using phylogeny and functional traits for assessing community assembly along environmental gradients: A deterministic process driven by elevation.

PubMed

Xu, Jinshi; Chen, Yu; Zhang, Lixia; Chai, Yongfu; Wang, Mao; Guo, Yaoxin; Li, Ting; Yue, Ming

2017-07-01

Community assembly processes are a primary focus of community ecology. Using phylogenetic and functional trait-based methods jointly to explore these processes along environmental gradients is a useful way to explain how assembly mechanisms change in a changing world. Our study combined these methods to test assembly processes across wide gradients of elevation and other habitat environmental factors. We collected our data at 40 plots on Taibai Mountain, China, spanning more than 2,300 m of altitude, and then measured traits and environmental factors. Variance partitioning was used to distinguish the main environmental factors leading to changes in phylogeny and traits among the 40 plots. Principal component analysis (PCA) was applied to combine the other environmental factors. Community assembly patterns along environmental gradients, based on phylogenetic and functional methods, were studied to explore assembly mechanisms. Phylogenetic signal was calculated for each community along environmental gradients in order to detect variation in trait performance on the phylogeny. Elevation showed better explanatory power than other environmental factors for the variance in phylogeny and in most traits. Phylogenetic structure and several functional structures were clustered at high elevation, while some conserved traits were overdispersed. A convergent tendency, which might be caused by filtering or competition along elevation, was detected based on functional traits. Leaf dry matter content (LDMC) and leaf nitrogen content along the PCA 1 axis showed patterns that conflicted with those shown along elevation. LDMC exhibited the strongest phylogenetic signal. Only the phylogenetic signal of maximum plant height showed explicable change along environmental gradients. Synthesis. Elevation is the best environmental factor for predicting changes in phylogeny and traits.
Plant phylogenetic structure and some functional structures show environmental filtering in the alpine region, while different traits and phylogeny indicate different assembly processes in the middle- and low-altitude regions. The results highlight that deterministic processes dominate community assembly along large-scale environmental gradients. The performance of phylogeny and traits along gradients may be independent of each other. The novel method for calculating functional structure used in this study, and the focus on change in phylogenetic signal along gradients, may provide more useful ways to detect community assembly mechanisms.
Advances in the use of DNA barcodes to build a community phylogeny for tropical trees in a Puerto Rican forest dynamics plot.

PubMed

Kress, W John; Erickson, David L; Swenson, Nathan G; Thompson, Jill; Uriarte, Maria; Zimmerman, Jess K

2010-11-09

Species number, functional traits, and phylogenetic history all contribute to characterizing the biological diversity in plant communities. The phylogenetic component of diversity has been particularly difficult to quantify in species-rich tropical tree assemblages. The compilation of previously published (and often incomplete) data on evolutionary relationships of species into a composite phylogeny of the taxa in a forest, through such programs as Phylomatic, has proven useful in building community phylogenies although often of limited resolution. Recently, DNA barcodes have been used to construct a robust community phylogeny for nearly 300 tree species in a forest dynamics plot in Panama using a supermatrix method. In that study sequence data from three barcode loci were used to generate a well-resolved species-level phylogeny. Here we expand upon this earlier investigation and present results on the use of a phylogenetic constraint tree to generate a community phylogeny for a diverse, tropical forest dynamics plot in Puerto Rico. This enhanced method of phylogenetic reconstruction ensures the congruence of the barcode phylogeny with broadly accepted hypotheses on the phylogeny of flowering plants (i.e., APG III) regardless of the number and taxonomic breadth of the taxa sampled. We also compare maximum parsimony versus maximum likelihood estimates of community phylogenetic relationships as well as evaluate the effectiveness of one- versus two- versus three-gene barcodes in resolving community evolutionary history. As first demonstrated in the Panamanian forest dynamics plot, the results for the Puerto Rican plot illustrate that highly resolved phylogenies derived from DNA barcode sequence data combined with a constraint tree based on APG III are particularly useful in comparative analysis of phylogenetic diversity and will enhance research on the interface between community ecology and evolution.
Plunging hands into the mushroom jar: a phylogenetic framework for Lyophyllaceae (Agaricales, Basidiomycota).

PubMed

Bellanger, J-M; Moreau, P-A; Corriol, G; Bidaud, A; Chalange, R; Dudova, Z; Richard, F

2015-04-01

During the last two decades, the unprecedented development of molecular phylogenetic tools has created an opportunity to revisit the fungal kingdom from an evolutionary perspective. Mycology has been profoundly changed, but a sustained effort to elucidate large sections of the astonishing fungal diversity is still needed. Here we fill this gap in the case of Lyophyllaceae, a species-rich and ecologically diversified family of mushrooms. Assembly and genealogical concordance multigene phylogenetic analysis of a large dataset that includes original, vouchered material from expert field mycologists reveal the phylogenetic topology of the family, from higher (generic) to lower (species) levels. A comparative analysis of the most widely used phylogenetic markers in Fungi indicates that the nuc rDNA region encompassing the internal transcribed spacers 1 and 2, along with the 5.8S rDNA (ITS), combined with portions of the gene for the RNA polymerase II second largest subunit (RPB2), is the best-performing combination for resolving the broadest range of taxa within Lyophyllaceae. Eleven distinct evolutionary lineages are identified, which display partial overlap with traditional genera as well as with the phylogenetic framework previously proposed for the family. Eighty phylogenetic species are delineated, which shed light on a large number of morphological concepts, including rare and poorly documented ones. Testing these novel phylogenetic species against the barcoding method of species delimitation indicates that the latter method fully resolves Lyophyllaceae species, except in one clade. This case study provides the first comprehensive phylogenetic overview of Lyophyllaceae, a necessary step towards a taxonomical, ecological and nomenclatural revision of this family of mushrooms. It also proposes a set of methodological guidelines that may be of relevance for future taxonomic works in other groups of Fungi.
Estimating the Effective Sample Size of Tree Topologies from Bayesian Phylogenetic Analyses

PubMed Central

Lanfear, Robert; Hua, Xia; Warren, Dan L.

2016-01-01

Bayesian phylogenetic analyses estimate posterior distributions of phylogenetic tree topologies and other parameters using Markov chain Monte Carlo (MCMC) methods. Before making inferences from these distributions, it is important to assess their adequacy. To this end, the effective sample size (ESS) estimates how many truly independent samples of a given parameter the output of the MCMC represents. The ESS of a parameter is frequently much lower than the number of samples taken from the MCMC because sequential samples from the chain can be non-independent due to autocorrelation. Typically, phylogeneticists use a rule of thumb that the ESS of all parameters should be greater than 200. However, we have no method to calculate an ESS of tree topology samples, despite the fact that the tree topology is often the parameter of primary interest and is almost always central to the estimation of other parameters. That is, we lack a method to determine whether we have adequately sampled one of the most important parameters in our analyses. In this study, we address this problem by developing methods to estimate the ESS for tree topologies. We combine these methods with two new diagnostic plots for assessing posterior samples of tree topologies, and compare their performance on simulated and empirical data sets. Combined, the methods we present provide new ways to assess the mixing and convergence of phylogenetic tree topologies in Bayesian MCMC analyses. PMID:27435794
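The ESS idea described above can be illustrated for an ordinary scalar parameter, where it is commonly approximated as N / (1 + 2 Σ ρ_k), with ρ_k the lag-k autocorrelation of the trace. The Python sketch below is only that scalar approximation, not the topology-specific estimators developed in the study. The function name and the rule of truncating the sum at the first non-positive autocorrelation are assumptions chosen for illustration.

```python
def effective_sample_size(samples):
    """Approximate the ESS of a scalar MCMC trace as N / (1 + 2 * sum(rho_k)).

    Assumption: the autocorrelation sum is truncated at the first lag whose
    estimated autocorrelation is not positive (one of several conventions).
    """
    n = len(samples)
    if n < 2:
        return float(n)
    mean = sum(samples) / n
    centered = [x - mean for x in samples]
    variance = sum(c * c for c in centered) / n
    if variance == 0.0:
        # Constant trace: autocorrelation is undefined; this sketch returns N.
        return float(n)
    rho_sum = 0.0
    for lag in range(1, n):
        covariance = sum(
            centered[i] * centered[i + lag] for i in range(n - lag)
        ) / n
        rho = covariance / variance
        if rho <= 0.0:
            break
        rho_sum += rho
    return n / (1.0 + 2.0 * rho_sum)
```

A trace that stays in one state for a long time and then switches yields an ESS far below the number of samples, which is the situation the rule of thumb (ESS greater than 200) is meant to flag.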
BIMLR: a method for constructing rooted phylogenetic networks from rooted phylogenetic trees.

PubMed

Wang, Juan; Guo, Maozu; Xing, Linlin; Che, Kai; Liu, Xiaoyan; Wang, Chunyu

2013-09-15

Rooted phylogenetic trees constructed from different datasets (e.g. from different genes) are often conflicting with one another, i.e. they cannot be integrated into a single phylogenetic tree. Phylogenetic networks have become an important tool in molecular evolution, and rooted phylogenetic networks are able to represent conflicting rooted phylogenetic trees. Hence, the development of appropriate methods to compute rooted phylogenetic networks from rooted phylogenetic trees has attracted considerable research interest of late. The CASS algorithm proposed by van Iersel et al. is able to construct much simpler networks than other available methods, but it is extremely slow, and the networks it constructs are dependent on the order of the input data. Here, we introduce an improved CASS algorithm, BIMLR. We show that BIMLR is faster than CASS and less dependent on the input data order. Moreover, BIMLR is able to construct much simpler networks than almost all other methods. BIMLR is available at http://nclab.hit.edu.cn/wangjuan/BIMLR/. © 2013 Elsevier B.V. All rights reserved.

Phylogenetic relationships within the cyst-forming nematodes (Nematoda, Heteroderidae) based on analysis of sequences from the ITS regions of ribosomal DNA.

PubMed

Subbotin, S A; Vierstraete, A; De Ley, P; Rowe, J; Waeyenberge, L; Moens, M; Vanfleteren, J R

2001-10-01

The ITS1, ITS2, and 5.8S gene sequences of nuclear ribosomal DNA from 40 taxa of the family Heteroderidae (including the genera Afenestrata, Cactodera, Heterodera, Globodera, Punctodera, Meloidodera, Cryphodera, and Thecavermiculatus) were sequenced and analyzed. The ITS regions displayed high levels of sequence divergence within Heteroderinae and compared to outgroup taxa. Unlike recent findings in root knot nematodes, ITS sequence polymorphism does not appear to complicate phylogenetic analysis of cyst nematodes. Phylogenetic analyses with maximum-parsimony, minimum-evolution, and maximum-likelihood methods were performed with a range of computer alignments, including elision and culled alignments. All multiple alignments and phylogenetic methods yielded similar basic structure for phylogenetic relationships of Heteroderidae. The cyst-forming nematodes are represented by six main clades corresponding to morphological characters and host specialization, with certain clades assuming different positions depending on alignment procedure and/or method of phylogenetic inference. Hypotheses of monophyly of Punctoderinae and Heteroderinae are, respectively, strongly and moderately supported by the ITS data across most alignments. Close relationships were revealed between the Avenae and the Sacchari groups and between the Humuli group and the species H. salixophila within Heteroderinae. The Goettingiana group occupies a basal position within this subfamily. The validity of the genera Afenestrata and Bidera was tested and is discussed based on molecular data. We conclude that ITS sequence data are appropriate for studies of relationships within the different species groups and less so for recovery of more ancient speciations within Heteroderidae. Copyright 2001 Academic Press.
Including Fossils in Phylogenetic Climate Reconstructions: A Deep Time Perspective on the Climatic Niche Evolution and Diversification of Spiny Lizards (Sceloporus).

PubMed

Lawing, A Michelle; Polly, P David; Hews, Diana K; Martins, Emília P

2016-08-01

Fossils and other paleontological information can improve phylogenetic comparative method estimates of phenotypic evolution and generate hypotheses related to species diversification. Here, we use fossil information to calibrate ancestral reconstructions of suitable climate for Sceloporus lizards in North America. Integrating data from the fossil record, general circulation models of paleoclimate during the Miocene, climate envelope modeling, and phylogenetic comparative methods provides a geographically and temporally explicit species distribution model of Sceloporus-suitable habitat through time. We provide evidence to support the historic biogeographic hypothesis of Sceloporus diversification in warm North American deserts and suggest a relatively recent Sceloporus invasion into Mexico around 6 Ma. We use a physiological model to map extinction risk. We suggest that the number of hours of restriction to a thermal refuge limited Sceloporus from inhabiting Mexico until the climate cooled enough to provide suitable habitat at approximately 6 Ma. If the future climate returns to the hotter climates of the past, Mexico, the place of highest modern Sceloporus richness, will no longer provide suitable habitats for Sceloporus to survive and reproduce.
Phylobetadiversity among Forest Types in the Brazilian Atlantic Forest Complex

PubMed Central

Duarte, Leandro Da Silva; Bergamin, Rodrigo Scarton; Marcilio-Silva, Vinícius; Seger, Guilherme Dubal Dos Santos; Marques, Márcia Cristina Mendes

2014-01-01

Phylobetadiversity is defined as the phylogenetic resemblance between communities or biomes. Analyzing phylobetadiversity patterns among different vegetation physiognomies within a single biome is crucial to understand the historical affinities between them. Based on the widely accepted idea that different forest physiognomies within the Southern Brazilian Atlantic Forest constitute different facies of a single biome, we hypothesize that more recent phylogenetic nodes should drive phylobetadiversity gradients between the different forest types within the Atlantic Forest, as the phylogenetic divergence among those forest types is biogeographically recent. We compiled information from 206 checklists describing the occurrence of shrub/tree species across three different forest physiognomies within the Southern Brazilian Atlantic Forest (Dense, Mixed and Seasonal forests). We analyzed intra-site phylogenetic structure (phylogenetic diversity, net relatedness index and nearest taxon index) and phylobetadiversity between plots located at different forest types, using five different methods differing in sensitivity to either basal or terminal nodes (phylogenetic fuzzy weighting, COMDIST, COMDISTNT, UniFrac and Rao’s H). Mixed forests showed higher phylogenetic diversity and overdispersion than the other forest types. Furthermore, all forest types differed from each other in phylobetadiversity patterns, particularly when phylobetadiversity methods more sensitive to terminal nodes were employed. Mixed forests tended to show higher phylogenetic differentiation from Dense and Seasonal forests than the latter two showed from each other. The higher phylogenetic diversity and phylobetadiversity levels found in Mixed forests when compared to the others likely result from the biogeographical origin of several taxa occurring in these forests. On one hand, Mixed forests shelter several temperate taxa, like the conifers Araucaria and Podocarpus. On the other hand, tropical groups, like Myrtaceae, are also very representative of this forest type. We point out the need for more attention to Mixed forests as a conservation target within the Brazilian Atlantic Forest given their high phylogenetic uniqueness. PMID:25121495
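Of the five phylobetadiversity measures listed above, unweighted UniFrac is perhaps the simplest to state: the summed length of branches leading to taxa from only one community, divided by the summed length of branches leading to taxa from either community. The Python sketch below illustrates that definition on a made-up four-taxon tree. The data structure (a list of branch lengths paired with descendant tip sets), the function name and the toy tree are all assumptions for illustration and are not taken from the study.

```python
def unweighted_unifrac(branches, community_a, community_b):
    """Fraction of covered branch length that is unique to one community.

    branches: list of (length, frozenset of descendant tip names).
    """
    a, b = set(community_a), set(community_b)
    unique = 0.0
    total = 0.0
    for length, tips in branches:
        in_a = bool(tips & a)
        in_b = bool(tips & b)
        if in_a or in_b:
            total += length
            if in_a != in_b:
                unique += length
    if total == 0.0:
        raise ValueError("no branch is covered by either community")
    return unique / total


# Hypothetical rooted tree ((A:1,B:1):1,(C:1,D:1):1); names and lengths are made up.
TOY_BRANCHES = [
    (1.0, frozenset({"A"})),
    (1.0, frozenset({"B"})),
    (1.0, frozenset({"C"})),
    (1.0, frozenset({"D"})),
    (1.0, frozenset({"A", "B"})),
    (1.0, frozenset({"C", "D"})),
]
```

Two communities drawn from opposite clades share no branch and score 1, identical communities score 0, and partially overlapping communities fall in between.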
Life history and biogeographic diversification of an endemic western North American freshwater fish clade using a comparative species tree approach.

PubMed

Baumsteiger, Jason; Kinziger, Andrew P; Aguilar, Andres

2012-12-01

The west coast of North America contains a number of biogeographic freshwater provinces which reflect an ever-changing aquatic landscape. Clues to understanding this complex structure are often encapsulated genetically in the ichthyofauna, though frequently as unresolved evolutionary relationships and putative cryptic species. Advances in molecular phylogenetics through species tree analyses now allow for improved exploration of these relationships. Using a comprehensive approach, we analyzed two mitochondrial and nine nuclear loci for a group of endemic freshwater fish (sculpin-Cottus) known for a wide ranging distribution and complex species structure in this region. Species delimitation techniques identified three novel cryptic lineages, all well supported by phylogenetic analyses. Comparative phylogenetic analyses consistently found five distinct clades reflecting a number of unique biogeographic provinces. Some internal node relationships varied by species tree reconstruction method, and were associated with either Bayesian or maximum likelihood statistical approaches or between mitochondrial, nuclear, and combined datasets. Limited cases of mitochondrial capture were also evident, suggestive of putative ancestral hybridization between species. Biogeographic diversification was associated with four major regions and revealed historical faunal exchanges across regions. Mapping of an important life-history character (amphidromy) revealed two separate instances of trait evolution, a transition that has occurred repeatedly in Cottus. This study demonstrates the power of current phylogenetic methods, the need for a comprehensive phylogenetic approach, and the potential for sculpin to serve as an indicator of biogeographic history for native ichthyofauna in the region. Copyright © 2012 Elsevier Inc. All rights reserved.
A phylogenetic Kalman filter for ancestral trait reconstruction using molecular data.

PubMed

Lartillot, Nicolas

2014-02-15

Correlation between life history or ecological traits and genomic features such as nucleotide or amino acid composition can be used for reconstructing the evolutionary history of the traits of interest along phylogenies. Thus far, however, such ancestral reconstructions have been done using simple linear regression approaches that do not account for phylogenetic inertia. These reconstructions could instead be seen as a genuine comparative regression problem, such as formalized by classical generalized least-square comparative methods, in which the trait of interest and the molecular predictor are represented as correlated Brownian characters coevolving along the phylogeny. Here, a Bayesian sampler is introduced, representing an alternative and more efficient algorithmic solution to this comparative regression problem, compared with currently existing generalized least-square approaches. Technically, ancestral trait reconstruction based on a molecular predictor is shown to be formally equivalent to a phylogenetic Kalman filter problem, for which backward and forward recursions are developed and implemented in the context of a Markov chain Monte Carlo sampler. The comparative regression method results in more accurate reconstructions and a more faithful representation of uncertainty, compared with simple linear regression. Application to the reconstruction of the evolution of optimal growth temperature in Archaea, using GC composition in ribosomal RNA stems and amino acid composition of a sample of protein-coding genes, confirms previous findings, in particular, pointing to a hyperthermophilic ancestor for the kingdom. The program is freely available at www.phylobayes.org.
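For a single trait evolving under Brownian motion, the leaf-to-root half of such a recursion reduces to the classic pruning rule: each node's estimate is the average of its children's estimates weighted by the inverse of their branch lengths, and the uncertainty accumulated below the node is passed upward as extra branch length. The Python sketch below shows only that univariate upward pass. It is not the Bayesian sampler or the molecular-predictor regression described in the abstract, and the tree encoding and function name are assumptions for illustration.

```python
def prune(node, traits):
    """Return (estimate, effective_branch_length) for the subtree at node.

    node is (name, length) for a tip, or ([child, ...], length) for an
    internal node. Assumes every branch below the root has positive length;
    the root itself carries length 0.
    """
    label, length = node
    if isinstance(label, str):
        return traits[label], length
    weight_sum = 0.0
    weighted_values = 0.0
    for child in label:
        value, v = prune(child, traits)
        weight_sum += 1.0 / v          # precision contributed by this child
        weighted_values += value / v
    estimate = weighted_values / weight_sum
    # The estimation error of this node behaves like extra branch length.
    return estimate, length + 1.0 / weight_sum
```

Called on the root, the first returned value is the maximum-likelihood root state under Brownian motion, which matches the generalized least-squares estimate computed from the phylogenetic covariance matrix.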
Phylogeny, host-parasite relationship and zoogeography

PubMed Central

1999-01-01

Phylogeny is the evolutionary history of a group or lineage of organisms and is reconstructed based on morphological, molecular and other characteristics. The genealogical relationship of a group of taxa is often expressed as a phylogenetic tree. The difficulty in categorizing the phylogeny is mainly due to the existence of frequent homoplasies that deceive observers. At the present time, cladistic analysis is believed to be one of the most effective methods of reconstructing a phylogenetic tree. Excellent software for phylogenetic analysis is available. As an example, cladistic analysis was applied to nematode genera of the family Acuariidae, and the resulting phylogenetic tree was compared with the system used currently. Nematodes in the genera Nippostrongylus and Heligmonoides were also analyzed, and the validity of the reconstructed phylogenetic trees was examined from a zoogeographical point of view. Some of the theories of parasite evolution were briefly reviewed as well. Coevolution of parasites and humans was discussed with special reference to the evolutionary relationship between Enterobius and primates. PMID:10634036
Anchoring quartet-based phylogenetic distances and applications to species tree reconstruction.

PubMed

Sayyari, Erfan; Mirarab, Siavash

2016-11-11

Inferring species trees from gene trees using coalescent-based summary methods has been the subject of much attention, yet new scalable and accurate methods are needed. We introduce DISTIQUE, a new statistically consistent summary method for inferring species trees from gene trees under the coalescent model. We generalize our results to arbitrary phylogenetic inference problems; we show that two arbitrarily chosen leaves, called anchors, can be used to estimate relative distances between all other pairs of leaves by inferring relevant quartet trees. This results in a family of distance-based tree inference methods, with running times ranging from quadratic to quartic in the number of leaves. We show in simulated studies that DISTIQUE has comparable accuracy to leading coalescent-based summary methods and reduced running times.
Measures of phylogenetic differentiation provide robust and complementary insights into microbial communities.

PubMed

Parks, Donovan H; Beiko, Robert G

2013-01-01

High-throughput sequencing techniques have made large-scale spatial and temporal surveys of microbial communities routine. Gaining insight into microbial diversity requires methods for effectively analyzing and visualizing these extensive data sets. Phylogenetic β-diversity measures address this challenge by allowing the relationship between large numbers of environmental samples to be explored using standard multivariate analysis techniques. Despite the success and widespread use of phylogenetic β-diversity measures, an extensive comparative analysis of these measures has not been performed. Here, we compare 39 measures of phylogenetic β diversity in order to establish the relative similarity of these measures along with key properties and performance characteristics. While many measures are highly correlated, those commonly used within microbial ecology were found to be distinct from those popular within classical ecology, and from the recently recommended Gower and Canberra measures. Many of the measures are surprisingly robust to different rootings of the gene tree, the choice of similarity threshold used to define operational taxonomic units, and the presence of outlying basal lineages. Measures differ considerably in their sensitivity to rare organisms, and the effectiveness of measures can vary substantially under alternative models of differentiation. Consequently, the depth of sequencing required to reveal underlying patterns of relationships between environmental samples depends on the selected measure. Our results demonstrate that using complementary measures of phylogenetic β diversity can further our understanding of how communities are phylogenetically differentiated. Open-source software implementing the phylogenetic β-diversity measures evaluated in this manuscript is available at http://kiwi.cs.dal.ca/Software/ExpressBetaDiversity.
Phylogenetic signal, feeding behaviour and brain volume in Neotropical bats.

PubMed

Rojas, D; Mancina, C A; Flores-Martínez, J J; Navarro, L

2013-09-01

Comparative correlational studies of brain size and ecological traits (e.g. feeding habits and habitat complexity) have increased our knowledge about the selective pressures on brain evolution. Studies conducted in bats as a model system assume that shared evolutionary history has a maximum effect on the traits. However, this effect has not been quantified. In addition, the effect of levels of diet specialization on brain size remains unclear. We examined the role of diet on the evolution of brain size in Mormoopidae and Phyllostomidae using two comparative methods. Body mass explained 89% of the variance in brain volume. The effect of feeding behaviour (either characterized as feeding habits, as levels of specialization on a type of item or as handling behaviour) on brain volume was also significant albeit not consistent after controlling for body mass and the strength of the phylogenetic signal (λ). Although the strength of the phylogenetic signal of brain volume and body mass was high when tested individually, λ values in phylogenetic generalized least squares models were significantly different from 1. This suggests that phylogenetic independent contrasts models are not always the best approach for the study of ecological correlates of brain size in New World bats. © 2013 The Authors. Journal of Evolutionary Biology © 2013 European Society For Evolutionary Biology.
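The role of λ in the abstract above can be made concrete. In phylogenetic generalized least squares, Pagel's λ multiplies the off-diagonal entries of the phylogenetic variance-covariance matrix: λ = 1 leaves the Brownian-motion covariances intact (the case to which independent contrasts correspond), and λ = 0 removes all shared history, which reduces the model to ordinary least squares. The Python sketch below shows only this transformation. The function name and the restriction to 0 ≤ λ ≤ 1 are assumptions for illustration.

```python
def lambda_transform(vcv, lam):
    """Scale the off-diagonal covariances of a phylogenetic VCV matrix by lam.

    vcv is a square list of lists; the diagonal (tip variances) is unchanged.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("this sketch assumes 0 <= lambda <= 1")
    n = len(vcv)
    return [
        [vcv[i][j] if i == j else lam * vcv[i][j] for j in range(n)]
        for i in range(n)
    ]
```

An estimated λ significantly below 1, as reported above, means the transformed matrix differs from the Brownian-motion matrix that independent contrasts implicitly assume.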
Web-Based Phylogenetic Assignment Tool for Analysis of Terminal Restriction Fragment Length Polymorphism Profiles of Microbial Communities

PubMed Central

Kent, Angela D.; Smith, Dan J.; Benson, Barbara J.; Triplett, Eric W.

2003-01-01

Culture-independent DNA fingerprints are commonly used to assess the diversity of a microbial community. However, relating species composition to community profiles produced by community fingerprint methods is not straightforward. Terminal restriction fragment length polymorphism (T-RFLP) is a community fingerprint method in which phylogenetic assignments may be inferred from the terminal restriction fragment (T-RF) sizes through the use of web-based resources that predict T-RF sizes for known bacteria. The process quickly becomes computationally intensive due to the need to analyze profiles produced by multiple restriction digests and the complexity of profiles generated by natural microbial communities. A web-based tool is described here that rapidly generates phylogenetic assignments from submitted community T-RFLP profiles based on a database of fragments produced by known 16S rRNA gene sequences. Users have the option of submitting a customized database generated from unpublished sequences or from a gene other than the 16S rRNA gene. This phylogenetic assignment tool allows users to employ T-RFLP to simultaneously analyze microbial community diversity and species composition. An analysis of the variability of bacterial species composition throughout the water column in a humic lake was carried out to demonstrate the functionality of the phylogenetic assignment tool. This method was validated by comparing the results generated by this program with results from a 16S rRNA gene clone library. PMID:14602639
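The core of such an assignment can be sketched in a few lines: predict the terminal fragment size for each known sequence by locating the first restriction site downstream of the labeled end, then match observed fragment sizes against those predictions within a tolerance. The Python sketch below is an illustration under stated assumptions, not the web tool's implementation. It assumes the sequence starts at the labeled 5' primer end, and the function names, the tolerance default and the toy sequences are hypothetical.

```python
def terminal_fragment_length(sequence, site, cut_offset):
    """Length of the 5' terminal fragment after the first cut, or None if uncut.

    cut_offset is the number of bases of the recognition site left on the
    5' side of the cut (for example 1 for a site cut as C^CGG).
    """
    position = sequence.upper().find(site.upper())
    if position == -1:
        return None
    return position + cut_offset


def assign_taxa(observed_size, predicted_sizes, tolerance=1):
    """Names whose predicted fragment size is within tolerance of the observation."""
    return sorted(
        name
        for name, size in predicted_sizes.items()
        if size is not None and abs(size - observed_size) <= tolerance
    )
```

Because many taxa can share a fragment size for one enzyme, profiles from several restriction digests are usually intersected, which is where the computational load mentioned above comes from.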
Comparison of multilocus sequence typing and pulsed-field gel electrophoresis for Salmonella spp. identification in surface water

NASA Astrophysics Data System (ADS)

Kuo, Chun Wei; Hao Huang, Kuan; Hsu, Bing Mu; Tsai, Hsien Lung; Tseng, Shao Feng; Kao, Po Min; Shen, Shu Min; Chou Chiu, Yi; Chen, Jung Sheng

2013-04-01

Salmonella is one of the most important waterborne pathogens, with outbreaks from contaminated water reported worldwide. In addition, Salmonella spp. can survive for long periods in aquatic environments. To characterize the genotypes and serovars of Salmonella in aquatic environments, we isolated Salmonella strains on selective culture plates, identified their serovars by serological assay, and identified their genotypes by multilocus sequence typing (MLST) based on the sequence data from University College Cork (UCC). The presence of Salmonella was confirmed in 36 stream water samples (30.1%) and 18 drinking water samples (23.3%) using the culture method combined with invA-specific PCR amplification. In this study, 24 cultured isolates of Salmonella from water samples were classified into fifteen Salmonella enterica serovars. In addition, we performed phylogenetic analysis using a phylogenetic tree and the minimum spanning tree (MST) method to analyze the relationship among clinical, environmental, and geographical data. The phylogenetic tree showed four main clusters, and our strains were distributed across all of them. The genotypes of isolates from stream water were more diverse than those of Salmonella strains from drinking water sources. According to the MST data, there was a positive correlation between the serovars and genotypes of Salmonella. Previous studies revealed that the results of pulsed-field gel electrophoresis (PFGE) can predict the serovar of a Salmonella strain. Hence, we used the MLST data combined with phylogenetic analysis to identify the serovars of Salmonella strains, and the approach proved effective. When the geographical data were combined with phylogenetic analysis, the results showed that the dominant strains were present throughout the whole stream area in the rainy season. Keywords: Salmonella spp., MLST, phylogenetic analysis, PFGE
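A minimum spanning tree over MLST data is typically built from allelic profiles: the distance between two isolates is the number of loci at which their allele numbers differ, and the tree links all isolates with the smallest total distance. The Python sketch below illustrates this with Prim's algorithm. It is not the software used in the study, and the function names and tie-breaking by isolate name are assumptions for illustration.

```python
def allelic_distance(profile_a, profile_b):
    """Number of loci at which two allelic profiles differ."""
    return sum(1 for x, y in zip(profile_a, profile_b) if x != y)


def minimum_spanning_tree(profiles):
    """Prim's algorithm over allelic distances.

    profiles maps isolate name -> tuple of allele numbers.
    Returns a list of (distance, isolate_in_tree, isolate_added) edges.
    """
    names = sorted(profiles)
    if not names:
        return []
    in_tree = {names[0]}
    edges = []
    while len(in_tree) < len(names):
        best = None
        for u in sorted(in_tree):
            for v in names:
                if v in in_tree:
                    continue
                d = allelic_distance(profiles[u], profiles[v])
                # Strict comparison keeps the first minimum found (name order).
                if best is None or d < best[0]:
                    best = (d, u, v)
        edges.append(best)
        in_tree.add(best[2])
    return edges
```

Isolates joined by short edges differ at few loci, so clusters in the tree can then be compared against serovar labels or sampling sites.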
Specimen-level phylogenetics in paleontology using the Fossilized Birth-Death model with sampled ancestors.

PubMed

Cau, Andrea

2017-01-01

Bayesian phylogenetic methods that simultaneously integrate morphological and stratigraphic information have been increasingly applied by paleontologists. Most of these studies have used Bayesian methods as an alternative to the widely used parsimony analysis, to infer macroevolutionary patterns and relationships among species-level or higher taxa. Among recently introduced Bayesian methodologies, the Fossilized Birth-Death (FBD) model allows incorporation of hypotheses on ancestor-descendant relationships in phylogenetic analyses including fossil taxa. Here, the FBD model is used to infer the relationships among an ingroup formed exclusively by fossil individuals, i.e., dipnoan tooth plates from four localities in the Ain el Guettar Formation of Tunisia. Previous analyses of this sample compared the results of phylogenetic analysis using parsimony with stratigraphic methods, inferred a high diversity (five or more genera) in the Ain el Guettar Formation, and interpreted it as an artifact inflated by depositional factors. In the analysis performed here, the uncertainty on the chronostratigraphic relationships among the specimens was included among the prior settings. The results of the analysis confirm the referral of most of the specimens to the taxa Asiatoceratodus, Equinoxiodus, Lavocatodus and Neoceratodus, but reject those to Ceratodus and Ferganoceratodus. The resulting phylogeny constrained the evolution of the Tunisian sample exclusively in the Early Cretaceous, contrasting with the previous scenario inferred by the stratigraphically calibrated topology resulting from parsimony analysis. The phylogenetic framework also suggests that (1) the sampled localities are laterally equivalent, but (2) three localities are restricted to the youngest part of the section; both results are in agreement with previous stratigraphic analyses of these localities. The FBD model of specimen-level units provides a novel tool for phylogenetic inference among fossils but also for independent tests of stratigraphic scenarios.
Encoding phylogenetic trees in terms of weighted quartets.

PubMed

Grünewald, Stefan; Huber, Katharina T; Moulton, Vincent; Semple, Charles

2008-04-01

One of the main problems in phylogenetics is to develop systematic methods for constructing evolutionary or phylogenetic trees. For a set of species X, an edge-weighted phylogenetic X-tree or phylogenetic tree is a (graph theoretical) tree with leaf set X and no degree 2 vertices, together with a map assigning a non-negative length to each edge of the tree. Within phylogenetics, several methods have been proposed for constructing such trees that work by trying to piece together quartet trees on X, i.e. phylogenetic trees each having four leaves in X. Hence, it is of interest to characterise when a collection of quartet trees corresponds to a (unique) phylogenetic tree. Recently, Dress and Erdös provided such a characterisation for binary phylogenetic trees, that is, phylogenetic trees all of whose internal vertices have degree 3. Here we provide a new characterisation for arbitrary phylogenetic trees.
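The link between an edge-weighted tree and its weighted quartets can be illustrated with the four-point condition: for four leaves a, b, c, d in a tree metric, the smallest of the three pairwise sums d(a,b)+d(c,d), d(a,c)+d(b,d) and d(a,d)+d(b,c) identifies the quartet split, and half the difference between the largest and smallest sums is the length of the path separating the two pairs. The Python sketch below is a minimal illustration of that fact, not the characterisation proved in the paper. The function name and the dictionary-based distance encoding are assumptions.

```python
def quartet_from_distances(dist, a, b, c, d):
    """Return (split, weight) for four leaves under a tree metric.

    dist maps unordered leaf pairs, stored as (x, y) tuples in either order,
    to path lengths. A weight of 0 indicates an unresolved (star) quartet,
    in which case the returned split is arbitrary.
    """
    def get(x, y):
        return dist[(x, y)] if (x, y) in dist else dist[(y, x)]

    sums = [
        (get(a, b) + get(c, d), ((a, b), (c, d))),
        (get(a, c) + get(b, d), ((a, c), (b, d))),
        (get(a, d) + get(b, c), ((a, d), (b, c))),
    ]
    sums.sort(key=lambda item: item[0])
    smallest, split = sums[0]
    largest = sums[2][0]
    return split, (largest - smallest) / 2.0
```

Applying this to every four-leaf subset of X yields the collection of weighted quartet trees that the characterisation above is concerned with.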
Mitochondrial genomes reveal recombination in the presumed asexual Fusarium oxysporum species complex.

PubMed

Brankovics, Balázs; van Dam, Peter; Rep, Martijn; de Hoog, G Sybren; J van der Lee, Theo A; Waalwijk, Cees; van Diepeningen, Anne D

2017-09-18

The Fusarium oxysporum species complex (FOSC) contains several phylogenetic lineages. Phylogenetic studies identified two to three major clades within the FOSC. The mitochondrial sequences are highly informative phylogenetic markers, but have been mostly neglected due to technical difficulties. A total of 61 complete mitogenomes of FOSC strains were de novo assembled and annotated. Length variations and intron patterns support the separation of three phylogenetic species. The variable region of the mitogenome that is typical for the genus Fusarium shows two new variants in the FOSC. The variant typical for Fusarium is found in members of all three clades, while variant 2 is found in clades 2 and 3 and variant 3 only in clade 2. The extended set of loci, analyzed using a new implementation of the genealogical concordance species recognition method, supports the identification of three phylogenetic species within the FOSC. Comparative analysis of the mitogenomes in the FOSC revealed ongoing mitochondrial recombination within, but not between, phylogenetic species. The recombination indicates the presence of a parasexual cycle in F. oxysporum. The obstacles hindering the usage of the mitogenomes are resolved by using next generation sequencing and selective genome assemblers, such as GRAbB. Complete mitogenome sequences offer a stable basis and reference point for phylogenetic and population genetic studies.
Mitogenome Phylogenetics: The Impact of Using Single Regions and Partitioning Schemes on Topology, Substitution Rate and Divergence Time Estimation

PubMed Central

Duchêne, Sebastián; Archer, Frederick I.; Vilstrup, Julia; Caballero, Susana; Morin, Phillip A.

2011-01-01

The availability of mitochondrial genome sequences is growing as a result of recent technological advances in molecular biology. In phylogenetic analyses, the complete mitogenome is increasingly becoming the marker of choice, usually providing better phylogenetic resolution and precision relative to traditional markers such as cytochrome b (CYTB) and the control region (CR). In some cases, the differences in phylogenetic estimates between mitogenomic and single-gene markers have yielded incongruent conclusions. By comparing phylogenetic estimates made from different genes, we identified the most informative mitochondrial regions and evaluated the minimum amount of data necessary to reproduce the same results as the mitogenome. We compared results among individual genes and the mitogenome for recently published complete mitogenome datasets of selected delphinids (Delphinidae) and killer whales (genus Orcinus). Using Bayesian phylogenetic methods, we investigated differences in estimation of topologies, divergence dates, and clock-like behavior among genes for both datasets. Although the most informative regions were not the same for each taxonomic group (COX1, CYTB, ND3 and ATP6 for Orcinus, and ND1, COX1 and ND4 for Delphinidae), in both cases they were equivalent to less than a quarter of the complete mitogenome. This suggests that gene information content can vary among groups, but can be adequately represented by a portion of the complete sequence. Although our results indicate that complete mitogenomes provide the highest phylogenetic resolution and most precise date estimates, a minimum amount of data can be selected using our approach when the complete sequence is unavailable. Studies based on single genes can benefit from the addition of a few more mitochondrial markers, producing topologies and date estimates similar to those obtained using the entire mitogenome. PMID:22073275
Phylogenetic Analysis of Shewanella Strains by DNA Relatedness Derived from Whole Genome Microarray DNA-DNA Hybridization and Comparison with Other Methods

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wu, Liyou; Yi, T. Y.; Van Nostrand, Joy

Phylogenetic analyses were done for Shewanella strains isolated from the Baltic Sea (38 strains), the US DOE Hanford Uranium bioremediation site [Hanford Reach of the Columbia River (HRCR), 11 strains], Pacific Ocean and Hawaiian sediments (8 strains), and strains from other sources (16 strains), with three outgroup strains, Rhodopseudomonas palustris, Clostridium cellulolyticum, and Thermoanaerobacter ethanolicus X514, using DNA relatedness derived from WCGA-based DNA-DNA hybridizations, sequence similarities of the 16S rRNA gene and the gyrB gene, and sequence similarities of 6 loci of the Shewanella genome selected from a shared gene list of the whole-genome-sequenced Shewanella strains based on their average nucleotide identity (ANI). The phylogenetic trees based on 16S rRNA and gyrB gene sequences and on DNA relatedness derived from WCGA hybridizations of the tested Shewanella strains share exactly the same sub-clusters with very few exceptions, in which the strains were basically grouped by species. However, the phylogenetic analysis based on DNA relatedness derived from WCGA hybridizations dramatically increased the differentiation resolution at the species and strain levels within the genus Shewanella. When the tree based on DNA relatedness derived from WCGA hybridizations was compared to the tree based on the combined sequences of the selected functional genes (6 loci), we found that the resolutions of both methods are similar, but the clustering of the tree based on DNA relatedness derived from WCGA hybridizations was clearer. These results indicate that WCGA-based DNA-DNA hybridization is an ideal alternative to conventional DNA-DNA hybridization methods and that it is superior to phylogenetic methods based on sequence similarities of single genes. Detailed analysis is being performed for the re-classification of the strains examined.
An Assessment of Phylogenetic Tools for Analyzing the Interplay Between Interspecific Interactions and Phenotypic Evolution.

PubMed

Drury, J P; Grether, G F; Garland, T; Morlon, H

2018-05-01

Much ecological and evolutionary theory predicts that interspecific interactions often drive phenotypic diversification and that species phenotypes in turn influence species interactions. Several phylogenetic comparative methods have been developed to assess the importance of such processes in nature; however, the statistical properties of these methods have gone largely untested. Focusing mainly on scenarios of competition between closely-related species, we assess the performance of available comparative approaches for analyzing the interplay between interspecific interactions and species phenotypes. We find that many currently used statistical methods often fail to detect the impact of interspecific interactions on trait evolution, that sister-taxa analyses are particularly unreliable in general, and that recently developed process-based models have more satisfactory statistical properties. Methods for detecting predictors of species interactions are generally more reliable than methods for detecting character displacement. In weighing the strengths and weaknesses of different approaches, we hope to provide a clear guide for empiricists testing hypotheses about the reciprocal effect of interspecific interactions and species phenotypes and to inspire further development of process-based models.
New substitution models for rooting phylogenetic trees.

PubMed

Williams, Tom A; Heaps, Sarah E; Cherlin, Svetlana; Nye, Tom M W; Boys, Richard J; Embley, T Martin

2015-09-26

The root of a phylogenetic tree is fundamental to its biological interpretation, but standard substitution models do not provide any information on its position. Here, we describe two recently developed models that relax the usual assumptions of stationarity and reversibility, thereby facilitating root inference without the need for an outgroup. We compare the performance of these models on a classic test case for phylogenetic methods, before considering two highly topical questions in evolutionary biology: the deep structure of the tree of life and the root of the archaeal radiation. We show that all three alignments contain meaningful rooting information that can be harnessed by these new models, thus complementing and extending previous work based on outgroup rooting. In particular, our analyses exclude the root of the tree of life from the eukaryotes or Archaea, placing it on the bacterial stem or within the Bacteria. They also exclude the root of the archaeal radiation from several major clades, consistent with analyses using other rooting methods. Overall, our results demonstrate the utility of non-reversible and non-stationary models for rooting phylogenetic trees, and identify areas where further progress can be made. © 2015 The Authors.
The Development of Three Long Universal Nuclear Protein-Coding Locus Markers and Their Application to Osteichthyan Phylogenetics with Nested PCR

PubMed Central

Zhang, Peng

2012-01-01

Background: Universal nuclear protein-coding locus (NPCL) markers that are applicable across diverse taxa and show good phylogenetic discrimination have broad applications in molecular phylogenetic studies. For example, RAG1, a representative NPCL marker, has been successfully used to make phylogenetic inferences within all major osteichthyan groups. However, such markers with broad working range and high phylogenetic performance are still scarce. It is necessary to develop more universal NPCL markers comparable to RAG1 for osteichthyan phylogenetics. Methodology/Principal Findings: We developed three long universal NPCL markers (>1.6 kb each) based on single-copy nuclear genes (KIAA1239, SACS and TTN) that possess large exons and exhibit the appropriate evolutionary rates. We then compared their phylogenetic utilities with that of the reference marker RAG1 in 47 jawed vertebrate species. In comparison with RAG1, each of the three long universal markers yielded similar topologies and branch supports, all in congruence with the currently accepted osteichthyan phylogeny. To compare their phylogenetic performance visually, we also estimated the phylogenetic informativeness (PI) profile for each of the four long universal NPCL markers. The PI curves indicated that SACS performed best over the whole timescale, while RAG1, KIAA1239 and TTN exhibited similar phylogenetic performances. In addition, we compared the success of nested PCR and standard PCR when amplifying NPCL marker fragments. The amplification success rate and efficiency of the nested PCR were overwhelmingly higher than those of standard PCR. Conclusions/Significance: Our work clearly demonstrates the superiority of nested PCR over the conventional PCR in phylogenetic studies and develops three long universal NPCL markers (KIAA1239, SACS and TTN) with the nested PCR strategy. The three markers exhibit high phylogenetic utilities in osteichthyan phylogenetics and can be widely used as pilot genes for phylogenetic questions of osteichthyans at different taxonomic levels. PMID:22720083

Phylogenetic patterns and the adaptive evolution of osmoregulation in fiddler crabs (Brachyura, Uca)

PubMed Central

Faria, Samuel Coelho; Provete, Diogo Borges; Thurman, Carl Leo

2017-01-01

Salinity is the primary driver of osmoregulatory evolution in decapods, and may have influenced their diversification into different osmotic niches. In semi-terrestrial crabs, hyper-osmoregulatory ability favors sojourns into burrows and dilute media, and provides a safeguard against hemolymph dilution; hypo-osmoregulatory ability underlies emersion capability and a life more removed from water sources. However, most comparative studies have neglected the roles of the phylogenetic and environmental components of inter-specific physiological variation, hindering evaluation of phylogenetic patterns and the adaptive nature of osmoregulatory evolution. Semi-terrestrial fiddler crabs (Uca) inhabit fresh to hyper-saline waters, with species from the Americas occupying higher intertidal habitats than Indo-west Pacific species mainly found in the low intertidal zone. Here, we characterize numerous osmoregulatory traits in all ten fiddler crabs found along the Atlantic coast of Brazil, and we employ phylogenetic comparative methods using 24 species to test for: (i) similarities of osmoregulatory ability among closely related species; (ii) salinity as a driver of osmoregulatory evolution; (iii) correlation between salt uptake and secretion; and (iv) adaptive peaks in osmoregulatory ability in the high intertidal American lineages. Our findings reveal that osmoregulation in Uca exhibits strong phylogenetic patterns in salt uptake traits. Salinity does not correlate with hyper/hypo-regulatory abilities, but drives hemolymph osmolality at ambient salinities. Osmoregulatory traits have evolved towards three adaptive peaks, revealing a significant contribution of hyper/hypo-regulatory ability in the American clades. Thus, during the evolutionary history of fiddler crabs, salinity has driven some of the osmoregulatory transformations that underpin habitat diversification, although others are apparently constrained phylogenetically. PMID:28182764
Evolution of the mitochondrial genome in snakes: Gene rearrangements and phylogenetic relationships

PubMed Central

Yan, Jie; Li, Hongdan; Zhou, Kaiya

2008-01-01

Background: Snakes as a major reptile group display a variety of morphological characteristics pertaining to their diverse behaviours. Despite abundant analyses of morphological characters, molecular studies using mitochondrial and nuclear genes are limited. As a result, the phylogeny of snakes remains controversial. Previous studies on mitochondrial genomes of snakes have demonstrated duplication of the control region and translocation of trnL to be two notable features of the alethinophidian (all serpents except blindsnakes and threadsnakes) mtDNAs. Our purpose is to further investigate the gene organizations, evolution of the snake mitochondrial genome, and phylogenetic relationships among several major snake families. Results: The mitochondrial genomes were sequenced for four taxa representing four different families, and each had a different gene arrangement. Comparative analyses with other snake mitochondrial genomes allowed us to summarize six types of mitochondrial gene arrangement in snakes. Phylogenetic reconstruction with commonly used methods of phylogenetic inference (BI, ML, MP, NJ) arrived at a similar topology, which was used to reconstruct the evolution of mitochondrial gene arrangements in snakes. Conclusion: The phylogenetic relationships among the major families of snakes are in accordance with the mitochondrial genomes in terms of gene arrangements. The gene arrangement in Ramphotyphlops braminus mtDNA is inferred to be ancestral for snakes. After the divergence of the early Ramphotyphlops lineage, three types of rearrangements occurred. These changes involve translocations within the IQM tRNA gene cluster and the duplication of the CR. All phylogenetic methods support the placement of Enhydris plumbea outside of the (Colubridae + Elapidae) cluster, providing mitochondrial genomic evidence for the familial rank of Homalopsidae. PMID:19038056
Testing for Divergent Transmission Histories among Cultural Characters: A Study Using Bayesian Phylogenetic Methods and Iranian Tribal Textile Data

PubMed Central

Matthews, Luke J.; Tehrani, Jamie J.; Jordan, Fiona M.; Collard, Mark; Nunn, Charles L.

2011-01-01

Background: Archaeologists and anthropologists have long recognized that different cultural complexes may have distinct descent histories, but they have lacked analytical techniques capable of easily identifying such incongruence. Here, we show how Bayesian phylogenetic analysis can be used to identify incongruent cultural histories. We employ the approach to investigate Iranian tribal textile traditions. Methods: We used Bayes factor comparisons in a phylogenetic framework to test two models of cultural evolution: the hierarchically integrated system hypothesis and the multiple coherent units hypothesis. In the hierarchically integrated system hypothesis, a core tradition of characters evolves through descent with modification and characters peripheral to the core are exchanged among contemporaneous populations. In the multiple coherent units hypothesis, a core tradition does not exist. Rather, there are several cultural units consisting of sets of characters that have different histories of descent. Results: For the Iranian textiles, the Bayesian phylogenetic analyses supported the multiple coherent units hypothesis over the hierarchically integrated system hypothesis. Our analyses suggest that pile-weave designs represent a distinct cultural unit that has a different phylogenetic history compared to other textile characters. Conclusions: The results from the Iranian textiles are consistent with the available ethnographic evidence, which suggests that the commercial rug market has influenced pile-rug designs but not the techniques or designs incorporated in the other textiles produced by the tribes. We anticipate that Bayesian phylogenetic tests for inferring cultural units will be of great value for researchers interested in studying the evolution of cultural traits including language, behavior, and material culture. PMID:21559083
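A Bayes factor comparison of two models reduces to a difference of log marginal likelihoods, often reported as 2 ln BF and read against the Kass and Raftery (1995) scale. The sketch below is an illustration only: the abstract does not say how the marginal likelihoods were estimated, and the function names and numerical values are hypothetical, not results from the study.

```python
def two_ln_bayes_factor(log_ml_model1, log_ml_model0):
    """Return 2 ln BF in favour of model 1, given log marginal likelihoods."""
    return 2.0 * (log_ml_model1 - log_ml_model0)


def kass_raftery_label(two_ln_bf):
    """Verbal reading of 2 ln BF on the Kass and Raftery (1995) scale."""
    if two_ln_bf < 0:
        return "favours model 0 (swap the models to grade the evidence)"
    if two_ln_bf < 2:
        return "not worth more than a bare mention"
    if two_ln_bf < 6:
        return "positive"
    if two_ln_bf <= 10:
        return "strong"
    return "very strong"


# Hypothetical log marginal likelihoods for two competing models.
evidence = two_ln_bayes_factor(-1000.0, -1004.0)
label = kass_raftery_label(evidence)
```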
Classification of Phylogenetic Profiles for Protein Function Prediction: An SVM Approach

NASA Astrophysics Data System (ADS)

Kotaru, Appala Raju; Joshi, Ramesh C.

Predicting the function of an uncharacterized protein is a major challenge in the post-genomic era owing to the complexity and scale of the problem. Knowledge of protein function is a crucial link in the development of new drugs, better crops, and even biochemicals such as biofuels. Recently, numerous high-throughput experimental procedures have been invented to investigate the mechanisms by which a protein accomplishes its function, and the phylogenetic profile is one of them. A phylogenetic profile is a representation of a protein that encodes its evolutionary history. In this paper we propose a method for classifying phylogenetic profiles using a supervised machine learning method, support vector machine (SVM) classification with a radial basis function kernel, to identify functionally linked proteins. We experimentally evaluated the performance of the classifier with the linear and polynomial kernels and compared the results with the existing tree kernel. In our study we used proteins of the budding yeast Saccharomyces cerevisiae genome. We generated the phylogenetic profiles of 2465 yeast genes and used the functional annotations available in the MIPS database. Our experiments show that the radial basis kernel performs similarly to the polynomial kernel in some functional classes, that both perform better than the linear and tree kernels, and that overall the radial basis kernel outperforms the polynomial, linear and tree kernels. In analyzing these results we show that it is feasible to use an SVM classifier with a radial basis function kernel to predict gene function from phylogenetic profiles.
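For readers unfamiliar with the radial basis function kernel, the sketch below evaluates K(x, y) = exp(-gamma * ||x - y||^2) on two binary presence/absence profiles. It is a minimal illustration, assuming binary profiles and an arbitrary gamma of 0.1; the profiles, the gamma value and the function name are hypothetical and are not taken from the paper.

```python
import math


def rbf_kernel(x, y, gamma=0.1):
    """Radial basis function kernel between two equal-length numeric profiles."""
    if len(x) != len(y):
        raise ValueError("profiles must have the same length")
    # For 0/1 profiles the squared Euclidean distance equals the Hamming distance.
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Hypothetical profiles: 1 = homolog present in a reference genome, 0 = absent.
profile_a = [1, 0, 1, 1]
profile_b = [1, 1, 0, 1]
similarity = rbf_kernel(profile_a, profile_b)
```

Proteins with identical profiles receive a kernel value of 1, and the value decays toward 0 as the profiles disagree in more genomes.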
Pattern and Process in the Comparative Study of Convergent Evolution.

PubMed

Mahler, D Luke; Weber, Marjorie G; Wagner, Catherine E; Ingram, Travis

2017-08-01

Understanding processes that have shaped broad-scale biodiversity patterns is a fundamental goal in evolutionary biology. The development of phylogenetic comparative methods has yielded a tool kit for analyzing contemporary patterns by explicitly modeling processes of change in the past, providing neontologists tools for asking questions previously accessible only for select taxa via the fossil record or laboratory experimentation. The comparative approach, however, differs operationally from alternative approaches to studying convergence in that, for studies of only extant species, convergence must be inferred using evolutionary process models rather than being directly measured. As a result, investigation of evolutionary pattern and process cannot be decoupled in comparative studies of convergence, even though such a decoupling could in theory guard against adaptationist bias. Assumptions about evolutionary process underlying comparative tools can shape the inference of convergent pattern in sometimes profound ways and can color interpretation of such patterns. We discuss these issues and other limitations common to most phylogenetic comparative approaches and suggest ways that they can be avoided in practice. We conclude by promoting a multipronged approach to studying convergence that integrates comparative methods with complementary tests of evolutionary mechanisms and includes ecological and biogeographical perspectives. Carefully employed, the comparative method remains a powerful tool for enriching our understanding of convergence in macroevolution, especially for investigation of why convergence occurs in some settings but not others.
Phylogenetic comparative methods complement discriminant function analysis in ecomorphology.

PubMed

Barr, W Andrew; Scott, Robert S

2014-04-01

In ecomorphology, Discriminant Function Analysis (DFA) has been used as evidence for the presence of functional links between morphometric variables and ecological categories. Here we conduct simulations of characters containing phylogenetic signal to explore the performance of DFA under a variety of conditions. Characters were simulated using a phylogeny of extant antelope species from known habitats. Characters were modeled with no biomechanical relationship to the habitat category; the only sources of variation were body mass, phylogenetic signal, or random "noise." DFA on the discriminability of habitat categories was performed using subsets of the simulated characters, and Phylogenetic Generalized Least Squares (PGLS) was performed for each character. Analyses were repeated with randomized habitat assignments. When simulated characters lacked phylogenetic signal and/or habitat assignments were random, <5.6% of DFAs and <8.26% of PGLS analyses were significant. When characters contained phylogenetic signal and actual habitats were used, 33.27 to 45.07% of DFAs and <13.09% of PGLS analyses were significant. False Discovery Rate (FDR) corrections for multiple PGLS analyses reduced the rate of significance to <4.64%. In all cases using actual habitats and characters with phylogenetic signal, correct classification rates of DFAs exceeded random chance. In simulations involving phylogenetic signal in both predictor variables and predicted categories, PGLS with FDR was rarely significant, while DFA often was. In short, DFA offered no indication that differences between categories might be explained by phylogenetic signal, while PGLS did. As such, PGLS provides a valuable tool for testing the functional hypotheses at the heart of ecomorphology. Copyright © 2013 Wiley Periodicals, Inc.
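The False Discovery Rate step mentioned above can be sketched with the Benjamini-Hochberg procedure. This is an assumption: the abstract does not name the specific FDR correction used, and the p-values below are hypothetical rather than values from the study.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return a list of booleans marking which hypotheses are rejected at FDR alpha."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    cutoff = 0
    for rank, index in enumerate(order, start=1):
        if pvalues[index] <= rank / m * alpha:
            cutoff = rank
    # Reject every hypothesis whose rank is at or below that cutoff.
    rejected = [False] * m
    for rank, index in enumerate(order, start=1):
        if rank <= cutoff:
            rejected[index] = True
    return rejected


# Hypothetical p-values from five separate PGLS analyses.
pvals = [0.001, 0.045, 0.035, 0.2, 0.012]
decisions = benjamini_hochberg(pvals)
```

Three of the five hypothetical p-values fall below 0.05 on their own, but only two survive the correction, which mirrors the drop in significance rate reported for PGLS with FDR.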
[Phylogenetic analysis of closely related Leuconostoc citreum species based on partial housekeeping genes].

PubMed

Lv, Qiang; Chen, Ming; Xu, Haiyan; Song, Yuqin; Sun, Zhihong; Dan, Tong; Sun, Tiansong

2013-07-04

Using the 16S rRNA, dnaA, murC and pyrG gene sequences, we identified the phylogenetic relationships among closely related Leuconostoc citreum species. Seven Leu. citreum strains originally isolated from sourdough were characterized by PCR methods to amplify the dnaA, murC and pyrG gene sequences, which were determined to assess their suitability as phylogenetic markers. We then estimated the genetic distances and constructed phylogenetic trees for the 16S rRNA gene and the three above-mentioned housekeeping genes, combining our data with published corresponding sequences. Comparison of the phylogenetic trees showed that the topologies of the three housekeeping gene trees were consistent with that of the 16S rRNA gene tree. The homologies of closely related Leu. citreum species differed among the dnaA, murC, pyrG and 16S rRNA gene sequences, ranging from 75.5% to 97.2%, 50.2% to 99.7%, 65.0% to 99.8% and 98.5% to 100%, respectively. The phylogenetic relationships inferred from the three housekeeping gene sequences were highly consistent with those from the 16S rRNA gene sequence, while the genetic distances of these housekeeping genes were much higher than those of the 16S rRNA gene. Consequently, the dnaA, murC and pyrG genes are suitable for the classification and identification of closely related Leu. citreum species.
Phylogenetic analysis of conservation priorities for aquatic mammals and their terrestrial relatives, with a comparison of methods.

PubMed

May-Collado, Laura J; Agnarsson, Ingi

2011-01-01

Habitat loss and overexploitation are among the primary factors threatening populations of many mammal species. Recently, aquatic mammals have been highlighted as particularly vulnerable. Here we test (1) if aquatic mammals emerge as more phylogenetically urgent conservation priorities than their terrestrial relatives, and (2) if high priority species are receiving sufficient conservation effort. We also compare results among some phylogenetic conservation methods. We conducted a phylogenetic analysis of conservation priorities for all 620 species of Cetartiodactyla and Carnivora, including most aquatic mammals. Conservation priority ranking of aquatic versus terrestrial species is approximately proportional to their diversity. However, nearly all obligate freshwater cetartiodactylans are among the top conservation priority species. Further, ∼74% and 40% of fully aquatic cetartiodactylans and carnivores, respectively, are either threatened or data deficient, more so than their terrestrial relatives. Strikingly, only 3% of all 'high priority' species are thought to be stable. An overwhelming 97% of these species thus either show decreasing population trends (87%) or are insufficiently known (10%). Furthermore, a disproportionate number of highly evolutionarily distinct species are experiencing population decline; thus, such species should be closely monitored even if not currently threatened. Comparison among methods reveals that exact species ranking differs considerably among methods; nevertheless, most top priority species consistently rank high under any method. While we here favor one approach, we also suggest that a consensus approach may be useful when methods disagree. These results reinforce prior findings, suggesting there is an urgent need to gather basic conservation data for aquatic mammals, and special conservation focus is needed on those confined to freshwater. That evolutionarily distinct--and thus 'biodiverse'--species are faring relatively poorly is alarming and requires further study. Our results offer a detailed guide to phylogeny-based conservation prioritization for these two orders.
Improved Maximum Parsimony Models for Phylogenetic Networks.

PubMed

Van Iersel, Leo; Jones, Mark; Scornavacca, Celine

2018-05-01

Phylogenetic networks are well suited to represent evolutionary histories comprising reticulate evolution. Several methods aiming at reconstructing explicit phylogenetic networks have been developed in the last two decades. In this article, we propose a new definition of maximum parsimony for phylogenetic networks that makes it possible to model biological scenarios that cannot be modeled by the definitions currently present in the literature (namely, the "hardwired" and "softwired" parsimony). Building on this new definition, we provide several algorithmic results that lay the foundations for new parsimony-based methods for phylogenetic network reconstruction.
An improved model for whole genome phylogenetic analysis by Fourier transform.

PubMed

Yin, Changchuan; Yau, Stephen S-T

2015-10-07

DNA sequence similarity comparison is one of the major steps in computational phylogenetic studies. The sequence comparison of closely related DNA sequences and genomes is usually performed by multiple sequence alignments (MSA). While the MSA method is accurate for some types of sequences, it may produce incorrect results when DNA sequences have undergone rearrangements, as in many bacterial and viral genomes. It is also limited by its computational complexity for comparing large volumes of data. Previously, we proposed an alignment-free method that exploits the full information content of DNA sequences by Discrete Fourier Transform (DFT), but still with some limitations. Here, we present a significantly improved method for the similarity comparison of DNA sequences by DFT. In this method, we map DNA sequences into 2-dimensional (2D) numerical sequences and then apply DFT to transform the 2D numerical sequences into the frequency domain. In the 2D mapping, the nucleotide composition of a DNA sequence is a determinant factor, and the 2D mapping reduces the nucleotide composition bias in the distance measure, thus improving the similarity measure of DNA sequences. To compare the DFT power spectra of DNA sequences with different lengths, we propose an improved even scaling algorithm to extend shorter DFT power spectra to the longest length of the underlying sequences. After the DFT power spectra are evenly scaled, the spectra have the same dimensionality in the Fourier frequency space, and the Euclidean distances of the full Fourier power spectra of the DNA sequences are then used as the dissimilarity metrics. The improved DFT method, with increased computational performance by 2D numerical representation, is applicable to DNA sequences of any length range. We assess the accuracy of the improved DFT similarity measure in hierarchical clustering of different DNA sequences including simulated and real datasets. The method yields accurate and reliable phylogenetic trees and demonstrates that the improved DFT dissimilarity measure is an efficient and effective similarity measure of DNA sequences. Due to its high efficiency and accuracy, the proposed DFT similarity measure is successfully applied to phylogenetic analysis of individual genes and large whole bacterial genomes. Copyright © 2015 Elsevier Ltd. All rights reserved.
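The general pipeline described above (numerical mapping, DFT, power spectrum, Euclidean distance) can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the 2D nucleotide mapping below is hypothetical and encoded as complex numbers, the even scaling step is omitted so the two sequences must have equal length, and a direct O(n^2) DFT is used for clarity.

```python
import cmath
import math

# Hypothetical 2D mapping (an assumption, not the paper's mapping), stored as x + iy.
MAPPING = {
    "A": complex(0, -1),
    "T": complex(0, 1),
    "C": complex(-1, 0),
    "G": complex(1, 0),
}


def power_spectrum(seq):
    """Return the DFT power spectrum |X[k]|^2 of the mapped sequence."""
    x = [MAPPING[base] for base in seq.upper()]
    n = len(x)
    spectrum = []
    for k in range(n):
        coeff = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def spectral_distance(seq1, seq2):
    """Euclidean distance between the power spectra of two equal-length sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("this sketch omits even scaling; lengths must match")
    p1, p2 = power_spectrum(seq1), power_spectrum(seq2)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p1, p2)))


same = spectral_distance("ACGTACGT", "ACGTACGT")
different = spectral_distance("ACGTACGT", "AAAACCCC")
```

A matrix of such pairwise distances could then be passed to a hierarchical clustering routine to build a tree.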
GPSit: An automated method for evolutionary analysis of nonculturable ciliated microeukaryotes.

PubMed

Chen, Xiao; Wang, Yurui; Sheng, Yalan; Warren, Alan; Gao, Shan

2018-05-01

Microeukaryotes are among the most important components of the microbial food web in almost all aquatic and terrestrial ecosystems worldwide. In order to gain a better understanding of their roles and functions in ecosystems, sequencing coupled with phylogenomic analyses of entire genomes or transcriptomes is increasingly used to reconstruct the evolutionary history and classification of these microeukaryotes and thus provide a more robust framework for determining their systematics and diversity. More importantly, phylogenomic research usually requires high levels of hands-on bioinformatics experience. Here, we propose an efficient automated method, "Guided Phylogenomic Search in trees" (GPSit), which starts from predicted protein sequences of newly sequenced species and a well-defined customized orthologous database. Compared with previous protocols, our method streamlines the entire workflow by integrating all essential and other optional operations. In so doing, the manual operation time for reconstructing phylogenetic relationships is reduced from days to several hours. Furthermore, GPSit supports user-defined parameters in most steps and thus allows users to adapt it to their studies. The effectiveness of GPSit is demonstrated by incorporating available online data and new single-cell data of three nonculturable marine ciliates (Anteholosticha monilata, Deviata sp. and Diophrys scutum) under moderate sequencing coverage (~5×). Our results indicate that the former could reconstruct robust "deep" phylogenetic relationships while the latter reveals the presence of intermediate taxa in shallow relationships. Based on empirical phylogenomic data, we also used GPSit to evaluate the impact of different levels of missing data on two commonly used methods of phylogenetic analysis, maximum likelihood (ML) and Bayesian inference (BI). We found that BI is less sensitive to missing data when fast-evolving sites are removed. © 2018 John Wiley & Sons Ltd.
Phylogenetic analysis of anaerobic psychrophilic enrichment cultures obtained from a greenland glacier ice core

NASA Technical Reports Server (NTRS)

Sheridan, Peter P.; Miteva, Vanya I.; Brenchley, Jean E.

2003-01-01

The examination of microorganisms in glacial ice cores allows the phylogenetic relationships of organisms frozen for thousands of years to be compared with those of current isolates. We developed a method for aseptically sampling a sediment-containing portion of a Greenland ice core that had remained at -9 degrees C for over 100,000 years. Epifluorescence microscopy and flow cytometry results showed that the ice sample contained over 6 x 10^7 cells/ml. Anaerobic enrichment cultures inoculated with melted ice were grown and maintained at -2 degrees C. Genomic DNA extracted from these enrichments was used for the PCR amplification of 16S rRNA genes with bacterial and archaeal primers and the preparation of clone libraries. Approximately 60 bacterial inserts were screened by restriction endonuclease analysis and grouped into 27 unique restriction fragment length polymorphism types, and 24 representative sequences were compared phylogenetically. Diverse sequences representing major phylogenetic groups including alpha, beta, and gamma Proteobacteria as well as relatives of the Thermus, Bacteroides, Eubacterium, and Clostridium groups were found. Sixteen clone sequences were closely related to those from known organisms, with four possibly representing new species. Seven sequences may reflect new genera and were most closely related to sequences obtained only by PCR amplification. One sequence was over 12% distant from its closest relative and may represent a novel order or family. These results show that phylogenetically diverse microorganisms have remained viable within the Greenland ice core for at least 100,000 years.
Phylogenetic Analysis of Anaerobic Psychrophilic Enrichment Cultures Obtained from a Greenland Glacier Ice Core

PubMed Central

Sheridan, Peter P.; Miteva, Vanya I.; Brenchley, Jean E.

2003-01-01

The examination of microorganisms in glacial ice cores allows the phylogenetic relationships of organisms frozen for thousands of years to be compared with those of current isolates. We developed a method for aseptically sampling a sediment-containing portion of a Greenland ice core that had remained at −9°C for over 100,000 years. Epifluorescence microscopy and flow cytometry results showed that the ice sample contained over 6 × 10^7 cells/ml. Anaerobic enrichment cultures inoculated with melted ice were grown and maintained at −2°C. Genomic DNA extracted from these enrichments was used for the PCR amplification of 16S rRNA genes with bacterial and archaeal primers and the preparation of clone libraries. Approximately 60 bacterial inserts were screened by restriction endonuclease analysis and grouped into 27 unique restriction fragment length polymorphism types, and 24 representative sequences were compared phylogenetically. Diverse sequences representing major phylogenetic groups including alpha, beta, and gamma Proteobacteria as well as relatives of the Thermus, Bacteroides, Eubacterium, and Clostridium groups were found. Sixteen clone sequences were closely related to those from known organisms, with four possibly representing new species. Seven sequences may reflect new genera and were most closely related to sequences obtained only by PCR amplification. One sequence was over 12% distant from its closest relative and may represent a novel order or family. These results show that phylogenetically diverse microorganisms have remained viable within the Greenland ice core for at least 100,000 years. PMID:12676695
SNP mining in Crassostrea gigas EST data: transferability to four other Crassostrea species, phylogenetic inferences and outlier SNPs under selection.

PubMed

Zhong, Xiaoxiao; Li, Qi; Yu, Hong; Kong, Lingfeng

2014-01-01

Oysters, with high levels of phenotypic plasticity and wide geographic distribution, are a challenging group for taxonomists and phylogeneticists. Our study is intended to generate new EST-SNP markers and to evaluate their potential for cross-species utilization in phylogenetic study of the genus Crassostrea. In the study, 57 novel SNPs were developed from an EST database of C. gigas by the HRM (high-resolution melting) method. Transferability of 377 SNPs developed for C. gigas was examined on four other Crassostrea species: C. sikamea, C. angulata, C. hongkongensis and C. ariakensis. Among the 377 primer pairs tested, 311 (82.5%) primers showed amplification in C. sikamea, 353 (93.6%) in C. angulata, 254 (67.4%) in C. hongkongensis and 253 (67.1%) in C. ariakensis. A total of 214 SNPs were found to be transferable to all four species. Phylogenetic analyses showed that C. hongkongensis was a sister species of C. ariakensis and that this clade was sister to the clade containing C. sikamea, C. angulata and C. gigas. Within this clade, C. gigas and C. angulata had the closest relationship, with C. sikamea being the sister group. In addition, we detected eight SNPs as potentially being under selection by two outlier tests (fdist and hierarchical methods). The SNPs studied here should be useful for genetic diversity, comparative mapping and phylogenetic studies across species in Crassostrea and the candidate outlier SNPs are worth exploring in more detail regarding association genetics and functional studies.
Phylogenetic tree construction using trinucleotide usage profile (TUP).

PubMed

Chen, Si; Deng, Lih-Yuan; Bowman, Dale; Shiau, Jyh-Jen Horng; Wong, Tit-Yee; Madahian, Behrouz; Lu, Henry Horng-Shing

2016-10-06

It has been a challenging task to build a genome-wide phylogenetic tree for a large group of species containing a large number of genes with long nucleotide sequences. The most popular method, called feature frequency profile (FFP-k), finds the frequency distribution for all words of certain length k over the whole genome sequence using (overlapping) windows of the same length. For a satisfactory result, the recommended word length (k) ranges from 6 to 15 and it may not be a multiple of 3 (codon length). The total number of possible words needed for FFP-k can range from 4^6 = 4096 to 4^15. We propose a simple improvement over the popular FFP method using only a typical word length of 3. A new method, called Trinucleotide Usage Profile (TUP), is proposed based only on the (relative) frequency distribution using non-overlapping windows of length 3. The total number of possible words needed for TUP is 4^3 = 64, which is much less than the total count for the recommended optimal "resolution" for FFP. To build a phylogenetic tree, we propose first representing each of the species by a TUP vector and then using an appropriate distance measure between pairs of the TUP vectors for the tree construction. In particular, we propose summarizing a DNA sequence by a matrix of three rows corresponding to three reading frames, recording the frequency distribution of the non-overlapping words of length 3 in each of the reading frames. We also provide a numerical measure for comparing trees constructed with various methods. Compared to the FFP method, our empirical study showed that the proposed TUP method is more capable of building phylogenetic trees with a stronger biological support. We further provide some justifications on this from the information theory viewpoint. Unlike the FFP method, the TUP method takes advantage of the fact that the starting of the first reading frame is (usually) known.
Without this information, the FFP method could only rely on the frequency distribution of overlapping words, which is the average (or mixture) of the frequency distributions of three possible reading frames. Consequently, we show (from the entropy viewpoint) that the FFP procedure could dilute important gene information and therefore provides less accurate classification.
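The TUP summary described in the abstract above (a three-row matrix of relative frequencies of non-overlapping length-3 words, one row per reading frame) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `tup_matrix` and `euclidean` are hypothetical, and the Euclidean distance is an assumption, since the abstract only calls for "an appropriate distance measure".

```python
from collections import Counter
from itertools import product
import math

# All 4^3 = 64 possible trinucleotide words, in a fixed order.
CODONS = ["".join(p) for p in product("ACGT", repeat=3)]

def tup_matrix(seq):
    """Return a 3 x 64 matrix of relative trinucleotide frequencies.

    Row f holds the frequencies of non-overlapping words of length 3
    read from frame offset f (0, 1 or 2). Hypothetical helper.
    """
    seq = seq.upper()
    rows = []
    for frame in range(3):
        words = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
        # Skip words containing ambiguous bases such as N.
        counts = Counter(w for w in words if set(w) <= set("ACGT"))
        total = sum(counts.values())
        rows.append([counts[c] / total if total else 0.0 for c in CODONS])
    return rows

def euclidean(a, b):
    """Euclidean distance between two TUP matrices (an assumed choice)."""
    return math.sqrt(sum((x - y) ** 2
                         for row_a, row_b in zip(a, b)
                         for x, y in zip(row_a, row_b)))
```

Pairwise distances computed this way could then be passed to any distance-based tree builder, such as neighbor joining.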
Advances in the floral structural characterization of the major subclades of Malpighiales, one of the largest orders of flowering plants

PubMed Central

Endress, Peter K.; Davis, Charles C.; Matthews, Merran L.

2013-01-01

Background and Aims Malpighiales are one of the largest angiosperm orders and have undergone radical systematic restructuring based on molecular phylogenetic studies. The clade has been recalcitrant to molecular phylogenetic reconstruction, but has become much more resolved at the suprafamilial level. It now contains so many newly identified clades that there is an urgent need for comparative studies to understand their structure, biology and evolution. This is especially true because the order contains a disproportionally large diversity of rain forest species and includes numerous agriculturally important plants. This study is a first broad systematic step in this endeavour. It focuses on a comparative structural overview of the flowers across all recently identified suprafamilial clades of Malpighiales, and points towards areas that desperately need attention. Methods The phylogenetic comparative analysis of floral structure for the order is based on our previously published studies on four suprafamilial clades of Malpighiales, including also four related rosid orders (Celastrales, Crossosomatales, Cucurbitales, Oxalidales). In addition, the results are compiled from a survey of over 3000 publications on macrosystematics, floral structure and embryology across all orders of the core eudicots. Key Results Most new suprafamilial clades within Malpighiales are well supported by floral structural features. Inner morphological structures of the gynoecium (i.e. stigmatic lobes, inner shape of the locules, placentation, presence of obturators) and ovules (i.e. structure of the nucellus, thickness of the integuments, presence of vascular bundles in the integuments, presence of an endothelium in the inner integument) appear to be especially suitable for characterizing suprafamilial clades within Malpighiales. 
Conclusions Although the current phylogenetic reconstruction of Malpighiales is much improved compared with earlier versions, it is incomplete, and further focused phylogenetic and morphological studies are needed. Once all major subclades of Malpighiales are elucidated, more in-depth studies on promising structural features can be conducted. In addition, once the phylogenetic tree of Malpighiales, including closely related orders, is more fully resolved, character optimization studies will be possible to reconstruct evolution of structural and biological features within the order. PMID:23486341
PyRAD: assembly of de novo RADseq loci for phylogenetic analyses.

PubMed

Eaton, Deren A R

2014-07-01

Restriction-site-associated genomic markers are a powerful tool for investigating evolutionary questions at the population level, but are limited in their utility at deeper phylogenetic scales where fewer orthologous loci are typically recovered across disparate taxa. While this limitation stems in part from mutations to restriction recognition sites that disrupt data generation, an additional source of data loss comes from the failure to identify homology during bioinformatic analyses. Clustering methods that allow for lower similarity thresholds and the inclusion of indel variation will perform better at assembling RADseq loci at the phylogenetic scale. PyRAD is a pipeline to assemble de novo RADseq loci with the aim of optimizing coverage across phylogenetic datasets. It uses a wrapper around an alignment-clustering algorithm, which allows for indel variation within and between samples, as well as for incomplete overlap among reads (e.g. paired-end). Here I compare PyRAD with the program Stacks in their performance analyzing a simulated RADseq dataset that includes indel variation. Indels disrupt clustering of homologous loci in Stacks but not in PyRAD, such that the latter recovers more shared loci across disparate taxa. I show through reanalysis of an empirical RADseq dataset that indels are a common feature of such data, even at shallow phylogenetic scales. PyRAD uses parallel processing as well as an optional hierarchical clustering method, which allows it to rapidly assemble phylogenetic datasets with hundreds of sampled individuals. Software is written in Python and freely available at http://www.dereneaton.com/software/. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Using tree diversity to compare phylogenetic heuristics.

PubMed

Sul, Seung-Jin; Matthews, Suzanne; Williams, Tiffani L

2009-04-29

Evolutionary trees are family trees that represent the relationships between a group of organisms. Phylogenetic heuristics are used to search stochastically for the best-scoring trees in tree space. Given that better tree scores are believed to be better approximations of the true phylogeny, traditional evaluation techniques have used tree scores to determine the heuristics that find the best scores in the fastest time. We develop new techniques to evaluate phylogenetic heuristics based on both tree scores and topologies to compare Pauprat and Rec-I-DCM3, two popular Maximum Parsimony search algorithms. Our results show that although Pauprat and Rec-I-DCM3 find the trees with the same best scores, topologically these trees are quite different. Furthermore, the Rec-I-DCM3 trees cluster distinctly from the Pauprat trees. In addition to our heatmap visualizations of using parsimony scores and the Robinson-Foulds distance to compare best-scoring trees found by the two heuristics, we also develop entropy-based methods to show the diversity of the trees found. Pauprat identifies more diverse trees than Rec-I-DCM3. Overall, our work shows that there is value in comparing heuristics beyond the parsimony scores that they find. Pauprat is a slower heuristic than Rec-I-DCM3. However, our work shows that there is tremendous value in using Pauprat to reconstruct trees, especially since it finds identical scoring but topologically distinct trees. Hence, instead of discounting Pauprat, effort should go into improving its implementation. Ultimately, improved performance measures lead to better phylogenetic heuristics and will result in better approximations of the true evolutionary history of the organisms of interest.
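The Robinson-Foulds distance mentioned in the abstract above counts the bipartitions (splits) present in one tree but not the other. A minimal sketch follows, assuming unrooted trees on the same leaf set encoded as nested tuples of leaf names; the encoding and the function names are illustrative choices, not taken from the paper.

```python
def leaves(tree):
    """Set of leaf names in a tree given as nested tuples of strings."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))

def splits(tree):
    """Non-trivial bipartitions of the tree, each stored as the side
    that does not contain a fixed reference leaf."""
    all_leaves = leaves(tree)
    ref = min(all_leaves)
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(walk(child) for child in node))
        side = all_leaves - below if ref in below else below
        # Keep only splits with at least two leaves on each side.
        if 1 < len(side) < len(all_leaves) - 1:
            found.add(side)
        return below

    walk(tree)
    return found

def robinson_foulds(t1, t2):
    """Number of splits found in exactly one of the two trees."""
    return len(splits(t1) ^ splits(t2))
```

For example, `robinson_foulds((("A", "B"), ("C", "D")), (("A", "C"), ("B", "D")))` compares the two conflicting resolutions of a four-taxon tree, each of which contributes one split the other lacks.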
Using complementary approaches to identify trans-domain nuclear gene transfers in the extremophile Galdieria sulphuraria (Rhodophyta).

PubMed

Pandey, Ravi S; Saxena, Garima; Bhattacharya, Debashish; Qiu, Huan; Azad, Rajeev K

2017-02-01

Identification of horizontal gene transfers (HGTs) has primarily relied on phylogenetic tree based methods, which require a rich sampling of sequenced genomes to ensure a reliable inference. Because the success of phylogenetic approaches depends on the breadth and depth of the database, researchers usually apply stringent filters to detect only the most likely gene transfers in the genomes of interest. One such study focused on a highly conservative estimate of trans-domain gene transfers in the extremophile eukaryote, Galdieria sulphuraria (Galdieri) Merola (Rhodophyta), by applying multiple filters in their phylogenetic pipeline. This led to the identification of 75 inter-domain acquisitions from Bacteria or Archaea. Because of the evolutionary, ecological, and potential biotechnological significance of foreign genes in algae, alternative approaches and pipelines complementing phylogenetics are needed for a more comprehensive assessment of HGT. We present here a novel pipeline that uncovered 17 novel foreign genes of prokaryotic origin in G. sulphuraria, results that are supported by multiple lines of evidence including composition-based, comparative data, and phylogenetics. These genes encode a variety of potentially adaptive functions, from metabolite transport to DNA repair. © 2016 Phycological Society of America.
Developing a statistically powerful measure for quartet tree inference using phylogenetic identities and Markov invariants.

PubMed

Sumner, Jeremy G; Taylor, Amelia; Holland, Barbara R; Jarvis, Peter D

2017-12-01

Recently there has been renewed interest in phylogenetic inference methods based on phylogenetic invariants, alongside the related Markov invariants. Broadly speaking, both these approaches give rise to polynomial functions of sequence site patterns that, in expectation value, either vanish for particular evolutionary trees (in the case of phylogenetic invariants) or have well understood transformation properties (in the case of Markov invariants). While both approaches have been valued for their intrinsic mathematical interest, it is not clear how they relate to each other, and to what extent they can be used as practical tools for inference of phylogenetic trees. In this paper, by focusing on the special case of binary sequence data and quartets of taxa, we are able to view these two different polynomial-based approaches within a common framework. To motivate the discussion, we present three desirable statistical properties that we argue any invariant-based phylogenetic method should satisfy: (1) sensible behaviour under reordering of input sequences; (2) stability as the taxa evolve independently according to a Markov process; and (3) explicit dependence on the assumption of a continuous-time process. Motivated by these statistical properties, we develop and explore several new phylogenetic inference methods. In particular, we develop a statistically bias-corrected version of the Markov invariants approach which satisfies all three properties. We also extend previous work by showing that the phylogenetic invariants can be implemented in such a way as to satisfy property (3). A simulation study shows that, in comparison to other methods, our new proposed approach based on bias-corrected Markov invariants is extremely powerful for phylogenetic inference. The binary case is of particular theoretical interest as, in this case only, the Markov invariants can be expressed as linear combinations of the phylogenetic invariants.
A wider implication of this is that, for models with more than two states (for example, DNA sequence alignments with four-state models), we find that methods which rely on phylogenetic invariants are incapable of satisfying all three of the stated statistical properties. This is because in these cases the relevant Markov invariants belong to a class of polynomials independent from the phylogenetic invariants.

Maximum parsimony, substitution model, and probability phylogenetic trees.

PubMed

Weng, J F; Thomas, D A; Mareels, I

2011-01-01

The problem of inferring phylogenies (phylogenetic trees) is one of the main problems in computational biology. There are three main methods for inferring phylogenies: Maximum Parsimony (MP), Distance Matrix (DM) and Maximum Likelihood (ML), of which the MP method is the most well-studied and popular method. In the MP method the optimization criterion is the number of substitutions of the nucleotides computed by the differences in the investigated nucleotide sequences. However, the MP method is often criticized as it only counts the substitutions observable at the current time and all the unobservable substitutions that really occur in the evolutionary history are omitted. In order to take into account the unobservable substitutions, some substitution models have been established and they are now widely used in the DM and ML methods but these substitution models cannot be used within the classical MP method. Recently the authors proposed a probability representation model for phylogenetic trees and the reconstructed trees in this model are called probability phylogenetic trees. One of the advantages of the probability representation model is that it can include a substitution model to infer phylogenetic trees based on the MP principle. In this paper we explain how to use a substitution model in the reconstruction of probability phylogenetic trees and show the advantage of this approach with examples.
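The classical MP criterion described in the abstract above, the number of substitutions needed to explain the observed nucleotides on a given tree, can be illustrated with Fitch's small-parsimony algorithm. This sketch covers only the classical count for a single site on a rooted binary tree; it is not the authors' probability representation model, and the nested-tuple tree encoding and the name `fitch_score` are assumptions made for illustration.

```python
def fitch_score(tree, states):
    """Minimum number of substitutions for one site (Fitch's algorithm).

    tree   -- rooted binary tree as nested 2-tuples of leaf names
    states -- dict mapping each leaf name to its observed nucleotide
    """
    changes = 0

    def post(node):
        nonlocal changes
        if isinstance(node, str):
            return {states[node]}
        left, right = (post(child) for child in node)
        common = left & right
        if common:
            return common
        # Disjoint child sets imply at least one substitution here.
        changes += 1
        return left | right

    post(tree)
    return changes
```

With the site pattern `{"A": "G", "B": "G", "C": "T", "D": "T"}`, the tree `(("A", "B"), ("C", "D"))` needs one substitution while `(("A", "C"), ("B", "D"))` needs two, so MP prefers the first tree for this site.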
General quantitative genetic methods for comparative biology: phylogenies, taxonomies and multi-trait models for continuous and categorical characters.

PubMed

Hadfield, J D; Nakagawa, S

2010-03-01

Although many of the statistical techniques used in comparative biology were originally developed in quantitative genetics, subsequent development of comparative techniques has progressed in relative isolation. Consequently, many of the new and planned developments in comparative analysis already have well-tested solutions in quantitative genetics. In this paper, we take three recent publications that develop phylogenetic meta-analysis, either implicitly or explicitly, and show how they can be considered as quantitative genetic models. We highlight some of the difficulties with the proposed solutions, and demonstrate that standard quantitative genetic theory and software offer solutions. We also show how results from Bayesian quantitative genetics can be used to create efficient Markov chain Monte Carlo algorithms for phylogenetic mixed models, thereby extending their generality to non-Gaussian data. Of particular utility is the development of multinomial models for analysing the evolution of discrete traits, and the development of multi-trait models in which traits can follow different distributions. Meta-analyses often include a nonrandom collection of species for which the full phylogenetic tree has only been partly resolved. Using missing data theory, we show how the presented models can be used to correct for nonrandom sampling and show how taxonomies and phylogenies can be combined to give a flexible framework with which to model dependence.
Maximizing the phylogenetic diversity of seed banks.

PubMed

Griffiths, Kate E; Balding, Sharon T; Dickie, John B; Lewis, Gwilym P; Pearce, Tim R; Grenyer, Richard

2015-04-01

Ex situ conservation efforts such as those of zoos, botanical gardens, and seed banks will form a vital complement to in situ conservation actions over the coming decades. It is therefore necessary to pay the same attention to the biological diversity represented in ex situ conservation facilities as is often paid to protected-area networks. Building the phylogenetic diversity of ex situ collections will strengthen our capacity to respond to biodiversity loss. Since 2000, the Millennium Seed Bank Partnership has banked seed from 14% of the world's plant species. We assessed the taxonomic, geographic, and phylogenetic diversity of the Millennium Seed Bank collection of legumes (Leguminosae). We compared the collection with all known legume genera, their known geographic range (at country and regional levels), and a genus-level phylogeny of the legume family constructed for this study. Over half the phylogenetic diversity of legumes at the genus level was represented in the Millennium Seed Bank. However, pragmatic prioritization of species of economic importance and endangerment has led to the banking of a less-than-optimal phylogenetic diversity and prioritization of range-restricted species risks an underdispersed collection. The current state of the phylogenetic diversity of legumes in the Millennium Seed Bank could be substantially improved through the strategic banking of relatively few additional taxa. Our method draws on tools that are widely applied to in situ conservation planning, and it can be used to evaluate and improve the phylogenetic diversity of ex situ collections. © 2014 Society for Conservation Biology.
Contextualising primate origins--an ecomorphological framework.

PubMed

Soligo, Christophe; Smaers, Jeroen B

2016-04-01

Ecomorphology - the characterisation of the adaptive relationship between an organism's morphology and its ecological role - has long been central to theories of the origin and early evolution of the primate order. This is exemplified by two of the most influential theories of primate origins: Matt Cartmill's Visual Predation Hypothesis, and Bob Sussman's Angiosperm Co-Evolution Hypothesis. However, the study of primate origins is constrained by the absence of data directly documenting the events under investigation, and has to rely instead on a fragmentary fossil record and the methodological assumptions inherent in phylogenetic comparative analyses of extant species. These constraints introduce particular challenges for inferring the ecomorphology of primate origins, as morphology and environmental context must first be inferred before the relationship between the two can be considered. Fossils can be integrated in comparative analyses and observations of extant model species and laboratory experiments of form-function relationships are critical for the functional interpretation of the morphology of extinct species. Recent developments have led to important advancements, including phylogenetic comparative methods based on more realistic models of evolution, and improved methods for the inference of clade divergence times, as well as an improved fossil record. This contribution will review current perspectives on the origin and early evolution of primates, paying particular attention to their phylogenetic (including cladistic relationships and character evolution) and environmental (including chronology, geography, and physical environments) contextualisation, before attempting an up-to-date ecomorphological synthesis of primate origins. © 2016 Anatomical Society.
Degenerate primer MOB typing of multiresistant clinical isolates of E. coli uncovers new plasmid backbones.

PubMed

Garcillán-Barcia, M Pilar; Ruiz del Castillo, Belén; Alvarado, Andrés; de la Cruz, Fernando; Martínez-Martínez, Luis

2015-01-01

Degenerate Primer MOB Typing is a PCR-based protocol for the classification of γ-proteobacterial transmissible plasmids in five phylogenetic relaxase MOB families. It was applied to a multiresistant E. coli collection, previously characterized by PCR-based replicon-typing, in order to compare both methods. Plasmids from 32 clinical isolates of multiresistant E. coli (19 extended spectrum beta-lactamase producers and 13 non producers) and their transconjugants were analyzed. A total of 95 relaxases were detected, at least one per isolate, underscoring the high potential of these strains for antibiotic-resistance transmission. MOBP12 and MOBF12 plasmids were the most abundant. Most MOB subfamilies detected were present in both subsets of the collection, indicating a shared mobilome among multiresistant E. coli. The plasmid profile obtained by both methods was compared, which provided useful data upon which decisions related to the implementation of detection methods in the clinic could be based. The phylogenetic depth at which replicon and MOB-typing classify plasmids is different. While replicon-typing aims at plasmid replication regions with non-degenerate primers, MOB-typing classifies plasmids into relaxase subfamilies using degenerate primers. As a result, MOB-typing provides a deeper phylogenetic depth than replicon-typing and new plasmid groups are uncovered. Significantly, MOB typing identified 17 plasmids and an integrative and conjugative element, which were not detected by replicon-typing. Four of these backbones were different from previously reported elements. Copyright © 2014 Elsevier Inc. All rights reserved.
Minimum variance rooting of phylogenetic trees and implications for species tree reconstruction.

PubMed

Mai, Uyen; Sayyari, Erfan; Mirarab, Siavash

2017-01-01

Phylogenetic trees inferred using commonly-used models of sequence evolution are unrooted, but the root position matters both for interpretation and downstream applications. This issue has been long recognized; however, whether the potential for discordance between the species tree and gene trees impacts methods of rooting a phylogenetic tree has not been extensively studied. In this paper, we introduce a new method of rooting a tree based on its branch length distribution; our method, which minimizes the variance of root to tip distances, is inspired by the traditional midpoint rerooting and is justified when deviations from the strict molecular clock are random. Like midpoint rerooting, the method can be implemented in a linear time algorithm. In extensive simulations that consider discordance between gene trees and the species tree, we show that the new method is more accurate than midpoint rerooting, but its relative accuracy compared to using outgroups to root gene trees depends on the size of the dataset and levels of deviations from the strict clock. We show high levels of error for all methods of rooting estimated gene trees due to factors that include effects of gene tree discordance, deviations from the clock, and gene tree estimation error. Our simulations, however, did not reveal significant differences between two equivalent methods for species tree estimation that use rooted and unrooted input, namely, STAR and NJst. Nevertheless, our results point to limitations of existing scalable rooting methods.
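The quantity that minimum variance rooting minimizes, the variance of root-to-tip distances, can be evaluated for a single candidate rooting as sketched below. This is only the objective for one rooting, not the linear-time search over all root positions described in the abstract. The tree encoding (a leaf is a string, an internal node is a list of `(child, branch_length)` pairs) and the function names are assumptions made for illustration.

```python
from statistics import pvariance

def root_to_tip(node, depth=0.0):
    """List the distance from the root to every leaf below `node`."""
    if isinstance(node, str):
        return [depth]
    dists = []
    for child, length in node:
        dists.extend(root_to_tip(child, depth + length))
    return dists

def rooting_variance(tree):
    """Population variance of root-to-tip distances for this rooting."""
    return pvariance(root_to_tip(tree))

# Two rootings of the same four-taxon tree (hypothetical branch lengths).
clocklike = [([("A", 1.0), ("B", 1.0)], 1.0),
             ([("C", 1.0), ("D", 1.0)], 1.0)]
skewed = [("A", 0.5),
          ([("B", 1.0), ([("C", 1.0), ("D", 1.0)], 2.0)], 0.5)]
```

Here `rooting_variance(clocklike)` is zero because every tip is equally distant from the root, while `rooting_variance(skewed)` is positive, so the first rooting would be preferred under this criterion.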
Minimum variance rooting of phylogenetic trees and implications for species tree reconstruction

PubMed Central

Mai, Uyen; Sayyari, Erfan; Mirarab, Siavash

2017-01-01

Phylogenetic trees inferred using commonly-used models of sequence evolution are unrooted, but the root position matters both for interpretation and downstream applications. This issue has been long recognized; however, whether the potential for discordance between the species tree and gene trees impacts methods of rooting a phylogenetic tree has not been extensively studied. In this paper, we introduce a new method of rooting a tree based on its branch length distribution; our method, which minimizes the variance of root to tip distances, is inspired by the traditional midpoint rerooting and is justified when deviations from the strict molecular clock are random. Like midpoint rerooting, the method can be implemented in a linear time algorithm. In extensive simulations that consider discordance between gene trees and the species tree, we show that the new method is more accurate than midpoint rerooting, but its relative accuracy compared to using outgroups to root gene trees depends on the size of the dataset and levels of deviations from the strict clock. We show high levels of error for all methods of rooting estimated gene trees due to factors that include effects of gene tree discordance, deviations from the clock, and gene tree estimation error. Our simulations, however, did not reveal significant differences between two equivalent methods for species tree estimation that use rooted and unrooted input, namely, STAR and NJst. Nevertheless, our results point to limitations of existing scalable rooting methods. PMID:28800608
SUNPLIN: Simulation with Uncertainty for Phylogenetic Investigations

PubMed Central

2013-01-01

Background Phylogenetic comparative analyses usually rely on a single consensus phylogenetic tree in order to study evolutionary processes. However, most phylogenetic trees are incomplete with regard to species sampling, which may critically compromise analyses. Some approaches have been proposed to integrate non-molecular phylogenetic information into incomplete molecular phylogenies. An expanded tree approach consists of adding missing species to random locations within their clade. The information contained in the topology of the resulting expanded trees can be captured by the pairwise phylogenetic distance between species and stored in a matrix for further statistical analysis. Thus, the random expansion and processing of multiple phylogenetic trees can be used to estimate the phylogenetic uncertainty through a simulation procedure. Because of the computational burden required, unless this procedure is efficiently implemented, the analyses are of limited applicability. Results In this paper, we present efficient algorithms and implementations for randomly expanding and processing phylogenetic trees so that simulations involved in comparative phylogenetic analysis with uncertainty can be conducted in a reasonable time. We propose algorithms for both randomly expanding trees and calculating distance matrices. We made available the source code, which was written in the C++ language. The code may be used as a standalone program or as a shared object in the R system. The software can also be used as a web service through the link: http://purl.oclc.org/NET/sunplin/. Conclusion We compare our implementations to similar solutions and show that significant performance gains can be obtained. Our results open up the possibility of accounting for phylogenetic uncertainty in evolutionary and ecological analyses of large datasets. PMID:24229408
SUNPLIN: simulation with uncertainty for phylogenetic investigations.

PubMed

Martins, Wellington S; Carmo, Welton C; Longo, Humberto J; Rosa, Thierson C; Rangel, Thiago F

2013-11-15

Phylogenetic comparative analyses usually rely on a single consensus phylogenetic tree in order to study evolutionary processes. However, most phylogenetic trees are incomplete with regard to species sampling, which may critically compromise analyses. Some approaches have been proposed to integrate non-molecular phylogenetic information into incomplete molecular phylogenies. An expanded tree approach consists of adding missing species to random locations within their clade. The information contained in the topology of the resulting expanded trees can be captured by the pairwise phylogenetic distance between species and stored in a matrix for further statistical analysis. Thus, the random expansion and processing of multiple phylogenetic trees can be used to estimate the phylogenetic uncertainty through a simulation procedure. Because of the computational burden required, unless this procedure is efficiently implemented, the analyses are of limited applicability. In this paper, we present efficient algorithms and implementations for randomly expanding and processing phylogenetic trees so that simulations involved in comparative phylogenetic analysis with uncertainty can be conducted in a reasonable time. We propose algorithms for both randomly expanding trees and calculating distance matrices. We made available the source code, which was written in the C++ language. The code may be used as a standalone program or as a shared object in the R system. The software can also be used as a web service through the link: http://purl.oclc.org/NET/sunplin/. We compare our implementations to similar solutions and show that significant performance gains can be obtained. Our results open up the possibility of accounting for phylogenetic uncertainty in evolutionary and ecological analyses of large datasets.
Multilocus inference of species trees and DNA barcoding.

PubMed

Mallo, Diego; Posada, David

2016-09-05

The unprecedented amount of data resulting from next-generation sequencing has opened a new era in phylogenetic estimation. Although large datasets should, in theory, increase phylogenetic resolution, massive, multilocus datasets have uncovered a great deal of phylogenetic incongruence among different genomic regions, due both to stochastic error and to the action of different evolutionary processes such as incomplete lineage sorting, gene duplication and loss, and horizontal gene transfer. This incongruence violates one of the fundamental assumptions of the DNA barcoding approach, which assumes that gene history and species history are identical. In this review, we explain some of the most important challenges we will have to face to reconstruct the history of species, and the advantages and disadvantages of different strategies for the phylogenetic analysis of multilocus data. In particular, we describe the evolutionary events that can generate species tree-gene tree discordance, compare the most popular methods for species tree reconstruction, highlight the challenges we need to face when using them and discuss their potential utility in barcoding. Current barcoding methods sacrifice a great amount of statistical power by only considering one locus, and a transition to multilocus barcodes would not only improve current barcoding methods, but also facilitate an eventual transition to species-tree-based barcoding strategies, which could better accommodate scenarios where the barcode gap is too small or nonexistent. This article is part of the themed issue 'From DNA barcodes to biomes'. © 2016 The Authors.
Effect of reference genome selection on the performance of computational methods for genome-wide protein-protein interaction prediction.

PubMed

Muley, Vijaykumar Yogesh; Ranjan, Akash

2012-01-01

Recent progress in computational methods for predicting physical and functional protein-protein interactions has provided new insights into the complexity of biological processes. Most of these methods assume that functionally interacting proteins are likely to have a shared evolutionary history. This history can be traced out for the protein pairs of a query genome by correlating different evolutionary aspects of their homologs in multiple genomes known as the reference genomes. These methods include phylogenetic profiling, gene neighborhood and co-occurrence of the orthologous protein coding genes in the same cluster or operon. These are collectively known as genomic context methods. On the other hand, a method called mirrortree is based on the similarity of phylogenetic trees between two interacting proteins. Comprehensive performance analyses of these methods have been frequently reported in the literature. However, very few studies provide insight into the effect of reference genome selection on detection of meaningful protein interactions. We analyzed the performance of four methods and their variants to understand the effect of reference genome selection on prediction efficacy. We used six sets of reference genomes, sampled in accordance with phylogenetic diversity and relationship between organisms from 565 bacteria. We used Escherichia coli as a model organism and the gold standard datasets of interacting proteins reported in DIP, EcoCyc and KEGG databases to compare the performance of the prediction methods. Higher performance for predicting protein-protein interactions was achievable even with 100-150 bacterial genomes out of 565 genomes. Inclusion of archaeal genomes in the reference genome set improves performance. We find that in order to obtain a good performance, it is better to sample a few genomes of related genera of prokaryotes from the large number of available genomes. Moreover, such a sampling allows for selecting 50-100 genomes for comparable accuracy of predictions when computational resources are limited.
Effects of phylogenetic reconstruction method on the robustness of species delimitation using single-locus data

PubMed Central

Tang, Cuong Q; Humphreys, Aelys M; Fontaneto, Diego; Barraclough, Timothy G; Paradis, Emmanuel

2014-01-01

Coalescent-based species delimitation methods combine population genetic and phylogenetic theory to provide an objective means for delineating evolutionarily significant units of diversity. The generalised mixed Yule coalescent (GMYC) and the Poisson tree process (PTP) are methods that use ultrametric (GMYC or PTP) or non-ultrametric (PTP) gene trees as input, intended for use mostly with single-locus data such as DNA barcodes. Here, we assess how robust the GMYC and PTP are to different phylogenetic reconstruction and branch smoothing methods. We reconstruct over 400 ultrametric trees using up to 30 different combinations of phylogenetic and smoothing methods and perform over 2000 separate species delimitation analyses across 16 empirical data sets. We then assess how variable diversity estimates are, in terms of richness and identity, with respect to species delimitation, phylogenetic and smoothing methods. The PTP method generally generates diversity estimates that are more robust to different phylogenetic methods. The GMYC is more sensitive, but provides consistent estimates for BEAST trees. The lower consistency of GMYC estimates is likely a result of differences among gene trees introduced by the smoothing step. Unresolved nodes (real anomalies or methodological artefacts) affect both GMYC and PTP estimates, but have a greater effect on GMYC estimates. Branch smoothing is a difficult step and perhaps an underappreciated source of bias that may be widespread among studies of diversity and diversification. Nevertheless, careful choice of phylogenetic method does produce equivalent PTP and GMYC diversity estimates. We recommend simultaneous use of the PTP model with any model-based gene tree (e.g. RAxML) and GMYC approaches with BEAST trees for obtaining species hypotheses. PMID:25821577
Phylogenetic analysis of two Plectus mitochondrial genomes (Nematoda: Plectida) supports a sister group relationship between Plectida and Rhabditida within Chromadorea.

PubMed

Kim, Jiyeon; Kern, Elizabeth; Kim, Taeho; Sim, Mikang; Kim, Jaebum; Kim, Yuseob; Park, Chungoo; Nadler, Steven A; Park, Joong-Ki

2017-02-01

Plectida is an important nematode order with species that occupy many different biological niches. The order includes free-living aquatic and soil-dwelling species, but its phylogenetic position has remained uncertain. We sequenced the complete mitochondrial genomes of two members of this order, Plectus acuminatus and Plectus aquatilis, and compared them with those of other major nematode clades. The genome size and base composition of these species are similar to those of other nematodes: 14,831 and 14,372 bp, respectively, with AT contents of 71.0% and 70.1%. Gene content was also similar to that of other nematodes, but gene order and coding direction of Plectus mtDNAs differed from those of other chromadorean species. P. acuminatus and P. aquatilis are the first chromadorean species found to contain a gene inversion. We reconstructed mitochondrial genome phylogenetic trees using nucleotide and amino acid datasets from 87 nematodes that represent major nematode clades, including the Plectus sequences. Trees from phylogenetic analyses using maximum likelihood and Bayesian methods depicted Plectida as the sister group to other sequenced chromadorean nematodes. This finding is consistent with several phylogenetic results based on SSU rDNA, but disagrees with a classification based on morphology. Mitogenomes representing other basal chromadorean groups (Araeolaimida, Monhysterida, Desmodorida, Chromadorida) are needed to confirm their phylogenetic relationships. Copyright © 2016 Elsevier Inc. All rights reserved.
Phylogenetic position of the North American isolate of Pasteuria that parasitizes the soybean cyst nematode, Heterodera glycines, as inferred from 16S rDNA sequence analysis.

PubMed

Atibalentja, N; Noel, G R; Domier, L L

2000-03-01

A 1341 bp sequence of the 16S rDNA of an undescribed species of Pasteuria that parasitizes the soybean cyst nematode, Heterodera glycines, was determined and then compared with a homologous sequence of Pasteuria ramosa, a parasite of cladoceran water fleas of the family Daphnidae. The two Pasteuria sequences, which diverged from each other by a dissimilarity index of 7%, also were compared with the 16S rDNA sequences of 30 other bacterial species to determine the phylogenetic position of the genus Pasteuria among the Gram-positive eubacteria. Phylogenetic analyses using maximum-likelihood, maximum-parsimony and neighbour-joining methods showed that the Heterodera glycines-infecting Pasteuria and its sister species, P. ramosa, form a distinct line of descent within the Alicyclobacillus group of the Bacillaceae. These results are consistent with the view that the genus Pasteuria is a deeply rooted member of the Clostridium-Bacillus-Streptococcus branch of the Gram-positive eubacteria, neither related to the actinomycetes nor closely related to true endospore-forming bacteria.
Transmission clustering among newly diagnosed HIV patients in Chicago, 2008 to 2011: using phylogenetics to expand knowledge of regional HIV transmission patterns

PubMed Central

Lubelchek, Ronald J.; Hoehnen, Sarah C.; Hotton, Anna L.; Kincaid, Stacey L.; Barker, David E.; French, Audrey L.

2014-01-01

Introduction HIV transmission cluster analyses can inform HIV prevention efforts. We describe the first such assessment for transmission clustering among HIV patients in Chicago. Methods We performed transmission cluster analyses using HIV pol sequences from newly diagnosed patients presenting to Chicago’s largest HIV clinic between 2008 and 2011. We compared sequences via progressive pairwise alignment, using neighbor joining to construct an un-rooted phylogenetic tree. We defined clusters as >2 sequences among which each sequence had at least one partner within a genetic distance of ≤ 1.5%. We used multivariable regression to examine factors associated with clustering and used geospatial analysis to assess geographic proximity of phylogenetically clustered patients. Results We compared sequences from 920 patients; median age 35 years; 75% male; 67% Black, 23% Hispanic; 8% had a Rapid Plasma Reagin (RPR) titer ≥ 1:16 concurrent with their HIV diagnosis. We had HIV transmission risk data for 54%; 43% identified as men who have sex with men (MSM). Phylogenetic analysis demonstrated 123 patients (13%) grouped into 26 clusters, the largest having 20 members. In multivariable regression, age < 25, Black race, MSM status, male gender, higher HIV viral load, and RPR ≥ 1:16 associated with clustering. We did not observe geographic grouping of genetically clustered patients. Discussion Our results demonstrate high rates of HIV transmission clustering, without local geographic foci, among young Black MSM in Chicago. Applied prospectively, phylogenetic analyses could guide prevention efforts and help break the cycle of transmission. PMID:25321182
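The cluster definition in this abstract (each member has at least one partner within a genetic distance of ≤ 1.5%) can be read as a connected-components rule. The Python sketch below follows that reading, which is an assumption and not the authors' documented pipeline. The function names, the use of a simple proportion-of-differences distance and the toy sequences are all hypothetical, and ">2 sequences" is taken to mean a minimum cluster size of 3.

```python
from itertools import combinations

def p_distance(s1, s2):
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(s1) != len(s2):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(s1, s2)) / len(s1)

def transmission_clusters(seqs, threshold=0.015, min_size=3):
    """Group sequences linked by chains of pairwise distances <= threshold."""
    parent = {name: name for name in seqs}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if p_distance(seqs[a], seqs[b]) <= threshold:
            parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for name in seqs:
        groups.setdefault(find(name), set()).add(name)
    return [members for members in groups.values() if len(members) >= min_size]

def mutate(seq, positions):
    """Toy helper: replace the bases at the given positions with 'C'."""
    chars = list(seq)
    for i in positions:
        chars[i] = "C"
    return "".join(chars)

base = "A" * 200
toy = {
    "p1": base,
    "p2": mutate(base, [0, 1]),          # 1.0% from p1
    "p3": mutate(base, [0, 1, 2, 3]),    # 1.0% from p2, 2.0% from p1
    "p4": mutate(base, range(100, 120)), # at least 10% from everyone
}
found = transmission_clusters(toy)
```

In the toy data, `p3` joins the cluster through `p2` even though it is more than 1.5% from `p1`. That chaining behaviour is what the "at least one partner" wording implies under this reading.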
Whole genome sequence phylogenetic analysis of four Mexican rabies viruses isolated from cattle.

PubMed

Bárcenas-Reyes, I; Loza-Rubio, E; Cantó-Alarcón, G J; Luna-Cozar, J; Enríquez-Vázquez, A; Barrón-Rodríguez, R J; Milián-Suazo, F

2017-08-01

Phylogenetic analysis of the rabies virus in molecular epidemiology has been traditionally performed on partial sequences of the genome, such as the N, G, and P genes; however, that approach raises concerns about the discriminatory power compared to whole genome sequencing. In this study we characterized four strains of the rabies virus isolated from cattle in Querétaro, Mexico by comparing the whole genome sequence to that of strains from the American, European and Asian continents. Four cattle brain samples positive for rabies and characterized as AgV11, genotype 1, were used in the study. A cDNA sequence was generated by reverse transcription PCR (RT-PCR) using oligo dT. cDNA samples were sequenced on an Illumina NextSeq 500 platform. The phylogenetic analysis was performed with MEGA 6.0. Minimum evolution phylogenetic trees were constructed with the Neighbor-Joining method and bootstrapped with 1000 replicates. Three large and seven small clusters were formed with the 26 sequences used. The largest cluster grouped strains from different species in South America: Brazil and French Guiana. The second cluster grouped five strains from Mexico. A Mexican strain reported in a different study was highly related to our four strains, suggesting a common source of infection. The phylogenetic analysis shows that the type of host is different for the different regions in the American Continent; rabies is more related to bats. It was concluded that the rabies virus in central Mexico is genetically stable and that it is transmitted by the vampire bat Desmodus rotundus. Copyright © 2017 Elsevier Ltd. All rights reserved.
Detecting non-orthology in the COGs database and other approaches grouping orthologs using genome-specific best hits.

PubMed

Dessimoz, Christophe; Boeckmann, Brigitte; Roth, Alexander C J; Gonnet, Gaston H

2006-01-01

Correct orthology assignment is a critical prerequisite of numerous comparative genomics procedures, such as function prediction, construction of phylogenetic species trees and genome rearrangement analysis. We present an algorithm for the detection of non-orthologs that arise by mistake in current orthology classification methods based on genome-specific best hits, such as the COGs database. The algorithm works with pairwise distance estimates, rather than computationally expensive and error-prone tree-building methods. The accuracy of the algorithm is evaluated through verification of the distribution of predicted cases, case-by-case phylogenetic analysis and comparisons with predictions from other projects using independent methods. Our results show that a very significant fraction of the COG groups include non-orthologs: using conservative parameters, the algorithm detects non-orthology in a third of all COG groups. Consequently, sequence analysis sensitive to correct orthology assignments will greatly benefit from these findings.
Utility of COX1 phylogenetics to differentiate between locally acquired and imported Plasmodium knowlesi infections in Singapore

PubMed Central

Loh, Jin Phang; Gao, Qiu Han Christine; Lee, Vernon J; Tetteh, Kevin; Drakeley, Chris

2016-01-01

INTRODUCTION Although there have been several phylogenetic studies on Plasmodium knowlesi (P. knowlesi), only cytochrome c oxidase subunit 1 (COX1) gene analysis has shown some geographical differentiation between the isolates of different countries. METHODS Phylogenetic analysis of locally acquired P. knowlesi infections, based on circumsporozoite, small subunit ribosomal ribonucleic acid (SSU rRNA), merozoite surface protein 1 and COX1 gene targets, was performed. The results were compared with the published sequences of regional isolates from Malaysia and Thailand. RESULTS Phylogenetic analysis of the circumsporozoite, SSU rRNA and merozoite surface protein 1 gene sequences for regional P. knowlesi isolates showed no obvious differentiation that could be attributed to their geographical origin. However, COX1 gene analysis showed that it was possible to differentiate between Singapore-acquired P. knowlesi infections and P. knowlesi infections from Peninsular Malaysia and Sarawak, Borneo, Malaysia. CONCLUSION The ability to differentiate between locally acquired P. knowlesi infections and imported P. knowlesi infections has important utility for the monitoring of P. knowlesi malaria control programmes in Singapore. PMID:26805667
Genome-wide comparisons of phylogenetic similarities between partial genomic regions and the full-length genome in Hepatitis E virus genotyping.

PubMed

Wang, Shuai; Wei, Wei; Luo, Xuenong; Cai, Xuepeng

2014-01-01

Besides the complete genome, different partial genomic sequences of Hepatitis E virus (HEV) have been used in genotyping studies, making it difficult to compare the results based on them. No commonly agreed partial region for HEV genotyping has been determined. In this study, we used a statistical method to evaluate the phylogenetic performance of each partial genomic sequence genome-wide, comparing evolutionary distances between genomic regions and the full-length genomes of 101 HEV isolates to identify short genomic regions that can reproduce HEV genotype assignments based on full-length genomes. Several genomic regions, especially one genomic region at the 3' terminus of the papain-like cysteine protease domain, were found to have relatively high phylogenetic correlations with the full-length genome. Phylogenetic analyses confirmed that these regions and the full-length genome performed identically in genotyping, dividing the HEV isolates involved into reasonable genotypes. This analysis may be of value in developing a partial sequence-based consensus classification of HEV species.
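One way to picture the comparison of evolutionary distances between partial regions and the full-length genome is a sliding-window correlation, sketched below in Python. The abstract does not specify the statistic used, so the choices here are assumptions: uncorrected p-distances, a Pearson correlation, and the window scheme. The function names and the four-sequence toy alignment are hypothetical.

```python
import math
from itertools import combinations

def pairwise_p_distances(seqs):
    """Uncorrected p-distance for every pair of aligned sequences, in a fixed order."""
    out = []
    for s1, s2 in combinations(seqs, 2):
        out.append(sum(a != b for a, b in zip(s1, s2)) / len(s1))
    return out

def pearson(xs, ys):
    """Pearson correlation; returns NaN when either input has no variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)

def window_correlations(alignment, window, step):
    """Correlate each window's pairwise distances with the full-length distances."""
    full = pairwise_p_distances(alignment)
    length = len(alignment[0])
    results = []
    for start in range(0, length - window + 1, step):
        region = [seq[start:start + window] for seq in alignment]
        results.append((start, pearson(pairwise_p_distances(region), full)))
    return results

# Hypothetical toy alignment (4 sequences, 12 columns).
ALIGNMENT = [
    "AAAAAAAAAAAA",
    "AAACAAAAAAAA",
    "AACCAAAAACAA",
    "CCCCAAAAACCA",
]
halves = window_correlations(ALIGNMENT, window=6, step=6)
whole = window_correlations(ALIGNMENT, window=12, step=12)
```

Windows whose correlation with the full-length distances is close to 1 would be candidates for short genotyping regions under this simplified scheme.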
Estimating phylogenetic trees from genome-scale data.

PubMed

Liu, Liang; Xi, Zhenxiang; Wu, Shaoyuan; Davis, Charles C; Edwards, Scott V

2015-12-01

The heterogeneity of signals in the genomes of diverse organisms poses challenges for traditional phylogenetic analysis. Phylogenetic methods known as "species tree" methods have been proposed to directly address one important source of gene tree heterogeneity, namely the incomplete lineage sorting that occurs when evolving lineages radiate rapidly, resulting in a diversity of gene trees from a single underlying species tree. Here we review theory and empirical examples that help clarify conflicts between species tree and concatenation methods, and misconceptions in the literature about the performance of species tree methods. Considering concatenation as a special case of the multispecies coalescent model helps explain differences in the behavior of the two methods on phylogenomic data sets. Recent work suggests that species tree methods are more robust than concatenation approaches to some of the classic challenges of phylogenetic analysis, including rapidly evolving sites in DNA sequences and long-branch attraction. We show that approaches, such as binning, designed to augment the signal in species tree analyses can distort the distribution of gene trees and are inconsistent. Computationally efficient species tree methods incorporating biological realism are a key to phylogenetic analysis of whole-genome data. © 2015 New York Academy of Sciences.

Leaf Photosynthetic Rate of Tropical Ferns Is Evolutionarily Linked to Water Transport Capacity

PubMed Central

Cao, Kun-Fang; Hu, Hong; Zhang, Jiao-Lin

2014-01-01

Ferns usually have relatively lower photosynthetic potential than angiosperms. However, it is unclear whether low photosynthetic potential of ferns is linked to leaf water supply. We hypothesized that there is an evolutionary association of leaf water transport capacity with photosynthesis and stomatal density in ferns. In the present study, a series of functional traits relating to leaf anatomy, hydraulics and physiology were assessed in 19 terrestrial and 11 epiphytic ferns in a common garden, and analyzed by a comparative phylogenetics method. Compared with epiphytic ferns, terrestrial ferns had higher vein density (Dvein), stomatal density (SD), stomatal conductance (gs), and photosynthetic capacity (Amax), but lower values for lower epidermal thickness (LET) and leaf thickness (LT). Across species, all traits varied significantly, but only stomatal length (SL) showed strong phylogenetic conservatism. Amax was positively correlated with Dvein and gs with and without phylogenetic corrections. SD correlated positively with Amax, Dvein and gs, with the correlation between SD and Dvein being significant after phylogenetic correction. Leaf water content showed significant correlations with LET, LT, and mesophyll thickness. Our results provide evidence that Amax of the studied ferns is linked to leaf water transport capacity, and there was an evolutionary association between water supply and demand in ferns. These findings add new insights into the evolutionary correlations among traits involving carbon and water economy in ferns. PMID:24416265
What is the phylogenetic signal limit from mitogenomes? The reconciliation between mitochondrial and nuclear data in the Insecta class phylogeny

PubMed Central

2011-01-01

Background Efforts to solve higher-level evolutionary relationships within the class Insecta by using mitochondrial genomic data are hindered due to fast sequence evolution of several groups, most notably Hymenoptera, Strepsiptera, Phthiraptera, Hemiptera and Thysanoptera. Accelerated rates of substitution on their sequences have been shown to have negative consequences in phylogenetic inference. In this study, we tested several methodological approaches to recover phylogenetic signal from whole mitochondrial genomes. As a model, we used two classical problems in insect phylogenetics: The relationships within Paraneoptera and within Holometabola. Moreover, we assessed the mitochondrial phylogenetic signal limits in the deeper Eumetabola dataset, and we studied the contribution of individual genes. Results Long-branch attraction (LBA) artefacts were detected in all the datasets. Methods using Bayesian inference outperformed maximum likelihood approaches, and LBA was avoided in Paraneoptera and Holometabola when using protein sequences and the site-heterogeneous mixture model CAT. The better performance of this method was evidenced by resulting topologies matching generally accepted hypotheses based on nuclear and/or morphological data, and was confirmed by cross-validation and simulation analyses. Using the CAT model, the order Strepsiptera was recovered as sister to Coleoptera for the first time using mitochondrial sequences, in agreement with recent results based on large nuclear and morphological datasets. Also, the Hymenoptera-Mecopterida association was obtained, leaving Coleoptera and Strepsiptera as the basal groups of the holometabolan insects, which coincides with one of the two main competing hypotheses. For the Paraneoptera, the currently accepted non-monophyly of Homoptera was documented as a phylogenetic novelty for mitochondrial data. However, results were not satisfactory when exploring the entire Eumetabola, revealing the limits of the phylogenetic signal that can be extracted from Insecta mitogenomes. Based on the combined use of the five best topology-performing genes we obtained comparable results to whole mitogenomes, highlighting the important role of data quality. Conclusion We show for the first time that mitogenomic data agree with nuclear and morphological data for several of the most controversial insect evolutionary relationships, adding a new independent source of evidence to study relationships among insect orders. We propose that deeper divergences cannot be inferred with the currently available methods due to sequence saturation and compositional bias inconsistencies. Our exploratory analysis indicates that the CAT model is the best at dealing with LBA and that it could be useful for other groups and datasets with similar phylogenetic difficulties. PMID:22032248
Phylogenetic Analysis of Conservation Priorities for Aquatic Mammals and Their Terrestrial Relatives, with a Comparison of Methods

PubMed Central

May-Collado, Laura J.; Agnarsson, Ingi

2011-01-01

Background Habitat loss and overexploitation are among the primary factors threatening populations of many mammal species. Recently, aquatic mammals have been highlighted as particularly vulnerable. Here we test (1) if aquatic mammals emerge as more phylogenetically urgent conservation priorities than their terrestrial relatives, and (2) if high priority species are receiving sufficient conservation effort. We also compare results among some phylogenetic conservation methods. Methodology/Principal Findings We present a phylogenetic analysis of conservation priorities for all 620 species of Cetartiodactyla and Carnivora, including most aquatic mammals. Conservation priority ranking of aquatic versus terrestrial species is approximately proportional to their diversity. However, nearly all obligate freshwater cetartiodactylans are among the top conservation priority species. Further, ∼74% and 40% of fully aquatic cetartiodactylans and carnivores, respectively, are either threatened or data deficient, more so than their terrestrial relatives. Strikingly, only 3% of all ‘high priority’ species are thought to be stable. An overwhelming 97% of these species thus either show decreasing population trends (87%) or are insufficiently known (10%). Furthermore, a disproportionate number of highly evolutionarily distinct species are experiencing population decline; thus, such species should be closely monitored even if not currently threatened. Comparison among methods reveals that exact species ranking differs considerably among methods; nevertheless, most top priority species consistently rank high under any method. While we here favor one approach, we also suggest that a consensus approach may be useful when methods disagree. Conclusions/Significance These results reinforce prior findings, suggesting there is an urgent need to gather basic conservation data for aquatic mammals, and special conservation focus is needed on those confined to freshwater. That evolutionarily distinct (and thus ‘biodiverse’) species are faring relatively poorly is alarming and requires further study. Our results offer a detailed guide to phylogeny-based conservation prioritization for these two orders. PMID:21799899
A parametric method for assessing diversification-rate variation in phylogenetic trees.

PubMed

Shah, Premal; Fitzpatrick, Benjamin M; Fordyce, James A

2013-02-01

Phylogenetic hypotheses are frequently used to examine variation in rates of diversification across the history of a group. Patterns of diversification-rate variation can be used to infer underlying ecological and evolutionary processes responsible for patterns of cladogenesis. Most existing methods examine rate variation through time. Methods for examining differences in diversification among groups are more limited. Here, we present a new method, parametric rate comparison (PRC), that explicitly compares diversification rates among lineages in a tree using a variety of standard statistical distributions. PRC can identify subclades of the tree where diversification rates are at variance with the remainder of the tree. A randomization test can be used to evaluate how often such variance would appear by chance alone. The method also allows for comparison of diversification rate among a priori defined groups. Further, the application of the PRC method is not restricted to monophyletic groups. We examined the performance of PRC using simulated data, which showed that PRC has acceptable false-positive rates and statistical power to detect rate variation. We apply the PRC method to the well-studied radiation of North American Plethodon salamanders, and support the inference that the large-bodied Plethodon glutinosus clade has a higher historical rate of diversification compared to other Plethodon salamanders. © 2012 The Author(s). Evolution © 2012 The Society for the Study of Evolution.
Increased phylogenetic resolution within the ecologically important Rhizopogon subgenus Amylopogon using 10 anonymous nuclear loci.

PubMed

Dowie, Nicholas J; Grubisha, Lisa C; Burton, Brent A; Klooster, Matthew R; Miller, Steven L

2017-01-01

Rhizopogon species are ecologically significant ectomycorrhizal fungi in conifer ecosystems. The importance of this system merits the development and utilization of a more robust set of molecular markers specifically designed to evaluate their evolutionary ecology. Anonymous nuclear loci (ANL) were developed for R. subgenus Amylopogon. Members of this subgenus occur throughout the United States and are exclusive fungal symbionts associated with Pterospora andromedea, a threatened mycoheterotrophic plant endemic to disjunct eastern and western regions of North America. Candidate ANL were developed from 454 shotgun pyrosequencing and assessed for positive amplification across targeted species, sequencing success, and recovery of phylogenetically informative sites. Ten ANL were successfully developed and were subsequently used to sequence representative taxa, herbaria holotype and paratype specimens in R. subgenus Amylopogon. Phylogenetic reconstructions were performed on individual and concatenated data sets by Bayesian inference and maximum likelihood methods. Phylogenetic analyses of these 10 ANL were compared with a phylogeny traditionally constructed using the universal fungal barcode nuc rDNA ITS1-5.8S-ITS2 region (ITS). The resulting ANL phylogeny was consistent with most of the species designations delineated by ITS. However, the ANL phylogeny provided much greater phylogenetic resolution, yielding new evidence for cryptic species within previously defined species of R. subgenus Amylopogon. Additionally, the rooted ANL phylogeny provided an alternate topology to the ITS phylogeny, which inferred a novel set of evolutionary relationships not identified in prior phylogenetic studies.
Allopatric tuberculosis host–pathogen relationships are associated with greater pulmonary impairment

PubMed Central

Pasipanodya, Jotam G.; Moonan, Patrick K.; Vecino, Edgar; Miller, Thaddeus L.; Fernandez, Michel; Slocum, Philip; Drewyer, Gerry; Weis, Stephen E.

2015-01-01

Background Host–pathogen relationships can be classified as allopatric, when the pathogens originated from separate, non-overlapping geographic areas from the host; or sympatric, when host and pathogen shared a common ancestral geographic location. It remains unclear if host–pathogen relationships, as defined by phylogenetic lineage, influence clinical outcome. We sought to examine the association between allopatric and sympatric phylogenetic Mycobacterium tuberculosis lineages and pulmonary impairment after tuberculosis (PIAT). Methods Pulmonary function tests were performed on patients 16 years of age and older who had received ≥20 weeks of treatment for culture-confirmed M. tuberculosis complex. Forced Expiratory Volume in 1 s (FEV1) ≥80%, Forced Vital Capacity (FVC) ≥80% and FEV1/FVC >70% of predicted were considered normal. Other results defined pulmonary impairment. Spoligotype and 12-locus mycobacterial interspersed repetitive units-variable number of tandem repeats (MIRU-VNTR) were used to assign phylogenetic lineage. PIAT severity was compared between host–pathogen relationships which were defined by geography and ethnic population. We used multivariate logistic regression modeling to calculate adjusted odds ratios (aOR) between phylogenetic lineage and PIAT. Results Self-reported continental ancestry was correlated with Mycobacterium tuberculosis lineage (p < 0.001). In multivariate analyses adjusting for phylogenetic lineage, age and smoking, the overall aOR for subjects with allopatric host–pathogen relationships and PIAT was 1.8 (95% confidence interval [CI]: 1.1, 2.9) compared to sympatric relationships. Smoking >30 pack-years was also associated with PIAT (aOR: 3.2; 95% CI: 1.5, 7.2) relative to smoking <1 pack-years. Conclusions PIAT frequency and severity vary by host–pathogen relationship and heavy cigarette consumption, but not phylogenetic lineage alone. Patients who had disease resulting from allopatric host–pathogen relationships were more likely to have PIAT than patients with disease from sympatric host–pathogen relationships. Further study of this association may identify ways that treatment and preventive efforts can be tailored to specific lineages and racial/ethnic populations. PMID:23501297
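The impairment rule stated in the Methods of this abstract can be written as a short Python check. This is only a sketch of the thresholds as quoted (FEV1 ≥80%, FVC ≥80% and FEV1/FVC >70% count as normal); the function name and the assumption that all three inputs are already expressed as percentages are illustrative, and the study's severity grading is not reproduced.

```python
def pulmonary_status(fev1_pct, fvc_pct, fev1_fvc_pct):
    """Classify a pulmonary function test using the thresholds quoted in the abstract.

    All three arguments are assumed to be percentages, as reported by the test.
    """
    normal = fev1_pct >= 80 and fvc_pct >= 80 and fev1_fvc_pct > 70
    return "normal" if normal else "impaired"

examples = [
    pulmonary_status(85, 90, 75),  # all three criteria met
    pulmonary_status(85, 90, 70),  # ratio must be strictly above 70
    pulmonary_status(79, 90, 75),  # FEV1 below 80
]
```

Note that the ratio criterion is a strict inequality while the other two are not, which matters only for values exactly on the boundary.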
The 'temporal effect' in hominids: Reinvestigating the nature of support for a chimp-human clade in bone morphology.

PubMed

Pearson, Alannah; Groves, Colin; Cardini, Andrea

2015-11-01

In 2004, an analysis by Lockwood and colleagues of hard-tissue morphology, using geometric morphometrics on the temporal bone, succeeded in recovering the correct phylogeny of living hominids without resorting to potentially problematic methods for transforming continuous shape variables into meristic characters. That work has increased hope that by using modern analytical methods and phylogenetically informative anatomical data we might one day be able to accurately infer the relationships of hominins, including the closest extinct relatives of modern humans. In the present study, using 3D virtually generated models of the hominid temporal bone and a larger suite of geometric morphometric and comparative techniques, we have re-examined the evidence for a Pan-Homo clade. Despite differences in samples and in the type of raw data, the effect of measurement error (especially landmark digitization by a different operator), and the broader perspective brought in by our diverse set of approaches, our reanalysis largely supports Lockwood and colleagues' original results. However, by focusing not only on shape (the main focus of the original 2004 analysis) but also on size and 'size-corrected' (non-allometric) shape, we demonstrate that the strong phylogenetic signal in the temporal bone is largely related to similarities in size. Thus, with this study, we are not suggesting the use of a single 'character', such as size, for phylogenetic inference, but we do challenge the common views that shape, with its highly complex and multivariate nature, is necessarily more phylogenetically informative than size, and that size and size-related shape variation (i.e., allometry) confound phylogenetic inference based on morphology. This perspective may in fact be less generalizable than often believed. 
Thus, while we confirm the original findings by Lockwood et al., we provide a deep reinterpretation of their nature and potential implications for hominid phylogenetics and we show how crucial it is not to overlook size in geometric morphometric analyses. Copyright © 2015 Elsevier Ltd. All rights reserved.
Sequencing of whole plastid genomes and nuclear ribosomal DNA of Diospyros species (Ebenaceae) endemic to New Caledonia: many species, little divergence

PubMed Central

Turner, Barbara; Paun, Ovidiu; Munzinger, Jérôme; Chase, Mark W.; Samuel, Rosabelle

2016-01-01

Background and Aims Some plant groups, especially on islands, have been shaped by strong ancestral bottlenecks and rapid, recent radiation of phenotypic characters. Single molecular markers are often not informative enough for phylogenetic reconstruction in such plant groups. Whole plastid genomes and nuclear ribosomal DNA (nrDNA) are viewed by many researchers as sources of information for phylogenetic reconstruction of groups in which expected levels of divergence in standard markers are low. Here we evaluate the usefulness of these data types to resolve phylogenetic relationships among closely related Diospyros species. Methods Twenty-two closely related Diospyros species from New Caledonia were investigated using whole plastid genomes and nrDNA data from low-coverage next-generation sequencing (NGS). Phylogenetic trees were inferred using maximum parsimony, maximum likelihood and Bayesian inference on separate plastid and nrDNA and combined matrices. Key Results The plastid and nrDNA sequences were, singly and together, unable to provide well supported phylogenetic relationships among the closely related New Caledonian Diospyros species. In the nrDNA, a 6-fold greater percentage of parsimony-informative characters compared with plastid DNA was found, but the total number of informative sites was greater for the much larger plastid DNA genomes. Combining the plastid and nuclear data improved resolution. Plastid results showed a trend towards geographical clustering of accessions rather than following taxonomic species. Conclusions In plant groups in which multiple plastid markers are not sufficiently informative, an investigation at the level of the entire plastid genome may also not be sufficient for detailed phylogenetic reconstruction. 
Sequencing of complete plastid genomes and nrDNA repeats seems to clarify some relationships among the New Caledonian Diospyros species, but the higher percentage of parsimony-informative characters in nrDNA compared with plastid DNA did not help to resolve the phylogenetic tree because the total number of variable sites was much lower than in the entire plastid genome. The geographical clustering of the individuals against a background of overall low sequence divergence could indicate transfer of plastid genomes due to hybridization and introgression following secondary contact. PMID:27098088
The vestigial olfactory receptor subgenome of odontocete whales: phylogenetic congruence between gene-tree reconciliation and supermatrix methods.

PubMed

McGowen, Michael R; Clark, Clay; Gatesy, John

2008-08-01

The macroevolutionary transition of whales (cetaceans) from a terrestrial quadruped to an obligate aquatic form involved major changes in sensory abilities. Compared to terrestrial mammals, the olfactory system of baleen whales is dramatically reduced, and in toothed whales is completely absent. We sampled the olfactory receptor (OR) subgenomes of eight cetacean species from four families. A multigene tree of 115 newly characterized OR sequences from these eight species and published data for Bos taurus revealed a diverse array of class II OR paralogues in Cetacea. Evolution of the OR gene superfamily in toothed whales (Odontoceti) featured a multitude of independent pseudogenization events, supporting anatomical evidence that odontocetes have lost their olfactory sense. We explored the phylogenetic utility of OR pseudogenes in Cetacea, concentrating on delphinids (oceanic dolphins), the product of a rapid evolutionary radiation that has been difficult to resolve in previous studies of mitochondrial DNA sequences. Phylogenetic analyses of OR pseudogenes using both gene-tree reconciliation and supermatrix methods yielded fully resolved, consistently supported relationships among members of four delphinid subfamilies. Alternative minimizations of gene duplications, gene duplications plus gene losses, deep coalescence events, and nucleotide substitutions plus indels returned highly congruent phylogenetic hypotheses. Novel DNA sequence data for six single-copy nuclear loci and three mitochondrial genes (> 5000 aligned nucleotides) provided an independent test of the OR trees. Nucleotide substitutions and indels in OR pseudogenes showed a very low degree of homoplasy in comparison to mitochondrial DNA and, on average, provided more variation than single-copy nuclear DNA. 
Our results suggest that phylogenetic analysis of the large OR superfamily will be effective for resolving relationships within Cetacea whether supermatrix or gene-tree reconciliation procedures are used.
The problem and promise of scale dependency in community phylogenetics.

PubMed

Swenson, Nathan G; Enquist, Brian J; Pither, Jason; Thompson, Jill; Zimmerman, Jess K

2006-10-01

The problem of scale dependency is widespread in investigations of ecological communities. Null model investigations of community assembly exemplify the challenges involved because they typically include subjectively defined "regional species pools." The burgeoning field of community phylogenetics appears poised to face similar challenges. Our objective is to quantify the scope of the problem of scale dependency by comparing the phylogenetic structure of assemblages across contrasting geographic and taxonomic scales. We conduct phylogenetic analyses on communities within three tropical forests, and perform a sensitivity analysis with respect to two scaleable inputs: taxonomy and species pool size. We show that (1) estimates of phylogenetic overdispersion within local assemblages depend strongly on the taxonomic makeup of the local assemblage and (2) comparing the phylogenetic structure of a local assemblage to a species pool drawn from increasingly larger geographic scales results in an increased signal of phylogenetic clustering. We argue that, rather than posing a problem, "scale sensitivities" are likely to reveal general patterns of diversity that could help identify critical scales at which local or regional influences gain primacy for the structuring of communities. In this way, community phylogenetics promises to fill an important gap in community ecology and biogeography research.
Is xenodontine snake reproduction shaped by ancestry, more than by ecology?

PubMed

Bellini, Gisela P; Arzamendia, Vanesa; Giraudo, Alejandro R

2017-01-01

One of the current challenges of evolutionary ecology is to understand the effects of phylogenetic history (PH) and/or ecological factors (EF) on the life-history traits of the species. Here, the effects of environment and phylogeny are tested for the first time on the reproductive biology of South American xenodontine snakes. We studied 60% of the tribes of this endemic and most representative clade in a temperate region of South America. A comparative method (canonical phylogenetic ordination-CPO) was used to find the relative contributions of EF and PH upon life-history aspects of snakes, comparing the reproductive mode, mean fecundity, reproductive potential, and frequency of nearly 1,000 specimens. CPO analysis showed that PH or ancestry explained most of the variation in reproduction, whereas EF explained little of this variation. The reproductive traits under study are suggested to have a strong phylogenetic signal in this clade, the ancestry playing a big role in reproduction. The EF also influenced the reproduction of South American xenodontines, although to a lesser extent. Our finding provides new evidence of how the evolutionary history is embodied in the traits of living species.
Distribution of pathogenicity island markers and virulence factors in new phylogenetic groups of uropathogenic Escherichia coli isolates.

PubMed

Najafi, Akram; Hasanpour, Mojtaba; Askary, Azam; Aziemzadeh, Masoud; Hashemi, Najmeh

2018-05-01

The present study was aimed at investigating the relationship between the new Clermont's phylogenetic groups, virulence factors, and pathogenicity island markers (PAIs) among uropathogenic Escherichia coli (UPEC) in Iran. This cross-sectional study was carried out on 140 UPEC isolates collected from patients with urinary tract infections in Bushehr, Iran. All isolates were subjected to phylogenetic typing using a new quadruplex-PCR method. The presence of PAI markers and virulence factors in UPEC strains was evaluated by multiplex PCR. The most predominant virulence gene was fimH (85%), followed by iucC (61.4%), papC (38.6%), hlyA (22.1%), cnf-1 (18.6%), afa (10.7%), papG and neuC (each 9.3%), ibeA (3.6%), and sfa/foc (0.7%). The most common phylogenetic group was related to B2 (39.3%), and the least common to A (0.7%). The most prevalent PAI marker was PAI IV536 (77.14%), while markers for PAI III536 (13.57%), PAI IIJ96 (12.86%), and PAI II536 (12.14%) were the least frequent among the UPEC strains. Meanwhile, the PAI IJ96 marker was not detected. There was a significant association between the phylogenetic group B2 and all the studied virulence genes and PAI markers. To our knowledge, this is the first study to compare the relationship between new phylogenetic groups, virulence genes and PAI markers in UPEC strains in Iran. The phylogenetic group B2 was predominantly represented among the studied virulence genes and PAI markers, indicating the preference of particular strains to carry virulence genes.
Alignment methods: strategies, challenges, benchmarking, and comparative overview.

PubMed

Löytynoja, Ari

2012-01-01

Comparative evolutionary analyses of molecular sequences are solely based on the identities and differences detected between homologous characters. Errors in this homology statement, that is errors in the alignment of the sequences, are likely to lead to errors in the downstream analyses. Sequence alignment and phylogenetic inference are tightly connected and many popular alignment programs use the phylogeny to divide the alignment problem into smaller tasks. They then neglect the phylogenetic tree, however, and produce alignments that are not evolutionarily meaningful. The use of phylogeny-aware methods reduces the error but the resulting alignments, with evolutionarily correct representation of homology, can challenge the existing practices and methods for viewing and visualising the sequences. The inter-dependency of alignment and phylogeny can be resolved by joint estimation of the two; methods based on statistical models allow for inferring the alignment parameters from the data and correctly take into account the uncertainty of the solution but remain computationally challenging. Widely used alignment methods are based on heuristic algorithms and unlikely to find globally optimal solutions. The whole concept of one correct alignment for the sequences is questionable, however, as there typically exist vast numbers of alternative, roughly equally good alignments that should also be considered. This uncertainty is hidden by many popular alignment programs and is rarely correctly taken into account in the downstream analyses. The quest for finding and improving the alignment solution is complicated by the lack of suitable measures of alignment goodness. The difficulty of comparing alternative solutions also affects benchmarks of alignment methods and the results strongly depend on the measure used. As the effects of alignment error cannot be predicted, comparing the alignments' performance in downstream analyses is recommended.
Phylogenetic overdispersion of plant species in southern Brazilian savannas.

PubMed

Silva, I A; Batalha, M A

2009-08-01

Ecological communities are the result of not only present ecological processes, such as competition among species and environmental filtering, but also past and continuing evolutionary processes. Based on these assumptions, we may infer mechanisms of contemporary coexistence from the phylogenetic relationships of the species in a community. We studied the phylogenetic structure of plant communities in four cerrado sites, in southeastern Brazil. We calculated two raw phylogenetic distances among the species sampled. We estimated the phylogenetic structure by comparing the observed phylogenetic distances to the distribution of phylogenetic distances in null communities. We obtained null communities by randomizing the phylogenetic relationships of the regional pool of species. We found a phylogenetic overdispersion of the cerrado species. Phylogenetic overdispersion has several explanations, depending on the phylogenetic history of traits and contemporary ecological interactions. However, based on coexistence models between grasses and trees, density-dependent ecological forces, and the evolutionary history of the cerrado flora, we argue that the phylogenetic overdispersion of cerrado species is predominantly due to competitive interactions, herbivores and pathogen attacks, and ecological speciation. Future studies will need to include information on the phylogenetic history of plant traits.
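The null-model comparison described in this abstract can be sketched as a standardized effect size of mean pairwise phylogenetic distance. This is only an illustration under stated assumptions: the distance matrix is a nested dictionary of pairwise distances over the regional pool, "randomizing the phylogenetic relationships" is read here as shuffling species labels across the tips, and all names (`mean_pairwise_distance`, `ses_mpd`) are hypothetical rather than taken from the study.

```python
import random
from itertools import combinations
from statistics import mean, stdev


def mean_pairwise_distance(dist, species):
    """Mean phylogenetic distance over all pairs of the given species."""
    return mean(dist[a][b] for a, b in combinations(species, 2))


def ses_mpd(dist, community, n_null=999, seed=0):
    """Standardized effect size of mean pairwise distance for one community.

    Null communities are built by shuffling the species labels of the
    regional pool (the keys of `dist`), which keeps the tree shape and the
    community size fixed. Positive values suggest phylogenetic
    overdispersion; negative values suggest clustering.
    """
    rng = random.Random(seed)
    pool = sorted(dist)
    observed = mean_pairwise_distance(dist, community)
    null_values = []
    for _ in range(n_null):
        shuffled = pool[:]
        rng.shuffle(shuffled)
        relabel = dict(zip(pool, shuffled))
        null_community = [relabel[s] for s in community]
        null_values.append(mean_pairwise_distance(dist, null_community))
    return (observed - mean(null_values)) / stdev(null_values)
```

With a pool of two distantly related clades, a community made of one species from each clade yields a positive value, while a community drawn from a single clade yields a negative one. The sign convention here is the opposite of the net relatedness index, which multiplies this quantity by −1.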
Phylogenetic Tools for Generalized HIV-1 Epidemics: Findings from the PANGEA-HIV Methods Comparison.

PubMed

Ratmann, Oliver; Hodcroft, Emma B; Pickles, Michael; Cori, Anne; Hall, Matthew; Lycett, Samantha; Colijn, Caroline; Dearlove, Bethany; Didelot, Xavier; Frost, Simon; Hossain, A S Md Mukarram; Joy, Jeffrey B; Kendall, Michelle; Kühnert, Denise; Leventhal, Gabriel E; Liang, Richard; Plazzotta, Giacomo; Poon, Art F Y; Rasmussen, David A; Stadler, Tanja; Volz, Erik; Weis, Caroline; Leigh Brown, Andrew J; Fraser, Christophe

2017-01-01

Viral phylogenetic methods contribute to understanding how HIV spreads in populations, and thereby help guide the design of prevention interventions. So far, most analyses have been applied to well-sampled concentrated HIV-1 epidemics in wealthy countries. To direct the use of phylogenetic tools to where the impact of HIV-1 is greatest, the Phylogenetics And Networks for Generalized HIV Epidemics in Africa (PANGEA-HIV) consortium generates full-genome viral sequences from across sub-Saharan Africa. Analyzing these data presents new challenges, since epidemics are principally driven by heterosexual transmission and a smaller fraction of cases is sampled. Here, we show that viral phylogenetic tools can be adapted and used to estimate epidemiological quantities of central importance to HIV-1 prevention in sub-Saharan Africa. We used a community-wide methods comparison exercise on simulated data, where participants were blinded to the true dynamics they were inferring. Two distinct simulations captured generalized HIV-1 epidemics, before and after a large community-level intervention that reduced infection levels. Five research groups participated. Structured coalescent modeling approaches were most successful: phylogenetic estimates of HIV-1 incidence, incidence reductions, and the proportion of transmissions from individuals in their first 3 months of infection correlated with the true values (Pearson correlation > 90%), with small bias. However, on some simulations, true values were markedly outside reported confidence or credibility intervals. The blinded comparison revealed current limits and strengths in using HIV phylogenetics in challenging settings, provided benchmarks for future methods' development, and supports using the latest generation of phylogenetic tools to advance HIV surveillance and prevention. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Molecular species delimitation methods recover most song-delimited cicada species in the European Cicadetta montana complex.

PubMed

Wade, E J; Hertach, T; Gogala, M; Trilar, T; Simon, C

2015-12-01

Molecular species delimitation is increasingly being used to discover and illuminate species level diversity, and a number of methods have been developed. Here, we compare the ability of two molecular species delimitation methods to recover song-delimited species in the Cicadetta montana cryptic species complex throughout Europe. Recent bioacoustics studies of male calling songs (premating reproductive barriers) have revealed cryptic species diversity in this complex. Maximum likelihood and Bayesian phylogenetic analyses were used to analyse the mitochondrial genes COI and COII and the nuclear genes EF1α and period for thirteen European Cicadetta species as well as the closely related monotypic genus Euboeana. Two molecular species delimitation methods, general mixed Yule-coalescent (GMYC) and Bayesian phylogenetics and phylogeography, identified the majority of song-delimited species and were largely congruent with each other. None of the molecular delimitation methods were able to fully recover a recent radiation of four Greek species. © 2015 European Society For Evolutionary Biology. Journal of Evolutionary Biology © 2015 European Society For Evolutionary Biology.
Towards an eco-phylogenetic framework for infectious disease ecology.

PubMed

Fountain-Jones, Nicholas M; Pearse, William D; Escobar, Luis E; Alba-Casals, Ana; Carver, Scott; Davies, T Jonathan; Kraberger, Simona; Papeş, Monica; Vandegrift, Kurt; Worsley-Tonks, Katherine; Craft, Meggan E

2018-05-01

Identifying patterns and drivers of infectious disease dynamics across multiple scales is a fundamental challenge for modern science. There is growing awareness that it is necessary to incorporate multi-host and/or multi-parasite interactions to understand and predict current and future disease threats better, and new tools are needed to help address this task. Eco-phylogenetics (phylogenetic community ecology) provides one avenue for exploring multi-host multi-parasite systems, yet the incorporation of eco-phylogenetic concepts and methods into studies of host pathogen dynamics has lagged behind. Eco-phylogenetics is a transformative approach that uses evolutionary history to infer present-day dynamics. Here, we present an eco-phylogenetic framework to reveal insights into parasite communities and infectious disease dynamics across spatial and temporal scales. We illustrate how eco-phylogenetic methods can help untangle the mechanisms of host-parasite dynamics from individual (e.g. co-infection) to landscape scales (e.g. parasite/host community structure). An improved ecological understanding of multi-host and multi-pathogen dynamics across scales will increase our ability to predict disease threats. © 2017 Cambridge Philosophical Society.
A congruent phylogenomic signal places eukaryotes within the Archaea.

PubMed

Williams, Tom A; Foster, Peter G; Nye, Tom M W; Cox, Cymon J; Embley, T Martin

2012-12-22

Determining the relationships among the major groups of cellular life is important for understanding the evolution of biological diversity, but is difficult given the enormous time spans involved. In the textbook 'three domains' tree based on informational genes, eukaryotes and Archaea share a common ancestor to the exclusion of Bacteria. However, some phylogenetic analyses of the same data have placed eukaryotes within the Archaea, as the nearest relatives of different archaeal lineages. We compared the support for these competing hypotheses using sophisticated phylogenetic methods and an improved sampling of archaeal biodiversity. We also employed both new and existing tests of phylogenetic congruence to explore the level of uncertainty and conflict in the data. Our analyses suggested that much of the observed incongruence is weakly supported or associated with poorly fitting evolutionary models. All of our phylogenetic analyses, whether on small subunit and large subunit ribosomal RNA or concatenated protein-coding genes, recovered a monophyletic group containing eukaryotes and the TACK archaeal superphylum comprising the Thaumarchaeota, Aigarchaeota, Crenarchaeota and Korarchaeota. Hence, while our results provide no support for the iconic three-domain tree of life, they are consistent with an extended eocyte hypothesis whereby vital components of the eukaryotic nuclear lineage originated from within the archaeal radiation.
Not a simple case - A first comprehensive phylogenetic hypothesis for the Midas cichlid complex in Nicaragua (Teleostei: Cichlidae: Amphilophus).

PubMed

Geiger, Matthias F; McCrary, Jeffrey K; Schliewen, Ulrich K

2010-09-01

Nicaraguan Midas cichlids from crater lakes have recently attracted attention as potential model systems for speciation research, but no attempt has been made to comprehensively reconstruct phylogenetic relationships of this highly diverse and recently evolved species complex. We present a first AFLP (2793 loci) and mtDNA based phylogenetic hypothesis including all described and several undescribed species from six crater lakes (Apoyeque, Apoyo, Asososca Leon, Masaya, Tiscapa and Xiloá), the two great Lakes Managua and Nicaragua and the San Juan River. Our analyses demonstrate that the relationships between the Midas cichlid members are complex, and that phylogenetic information from different markers and methods does not always yield congruent results. Nevertheless, monophyly support for crater lake assemblages from Lakes Apoyeque, Apoyo and Asososca Leon is high compared with that for L. Xiloá, indicating occurrence of sympatric speciation. Further, we demonstrate that a 'three species' concept for the Midas cichlid complex is inapplicable and consequently that an individualized and voucher based approach in speciation research of the Midas cichlid complex is necessary at least as long as there is no comprehensive revision of the species complex available. Copyright 2010 Elsevier Inc. All rights reserved.
Phylogenetic Factor Analysis.

PubMed

Tolkoff, Max R; Alfaro, Michael E; Baele, Guy; Lemey, Philippe; Suchard, Marc A

2018-05-01

Phylogenetic comparative methods explore the relationships between quantitative traits adjusting for shared evolutionary history. This adjustment often occurs through a Brownian diffusion process along the branches of the phylogeny that generates model residuals or the traits themselves. For high-dimensional traits, inferring all pair-wise correlations within the multivariate diffusion is limiting. To circumvent this problem, we propose phylogenetic factor analysis (PFA) that assumes a small unknown number of independent evolutionary factors arise along the phylogeny and these factors generate clusters of dependent traits. Set in a Bayesian framework, PFA provides measures of uncertainty on the factor number and groupings, combines both continuous and discrete traits, integrates over missing measurements and incorporates phylogenetic uncertainty with the help of molecular sequences. We develop Gibbs samplers based on dynamic programming to estimate the PFA posterior distribution, over 3-fold faster than for multivariate diffusion and a further order-of-magnitude more efficiently in the presence of latent traits. We further propose a novel marginal likelihood estimator for previously impractical models with discrete data and find that PFA also provides a better fit than multivariate diffusion in evolutionary questions in columbine flower development, placental reproduction transitions and triggerfish fin morphometry.

Homology and the optimization of DNA sequence data

NASA Technical Reports Server (NTRS)

Wheeler, W.

2001-01-01

Three methods of nucleotide character analysis are discussed. Their implications for molecular sequence homology and phylogenetic analysis are compared. The criterion of inter-data set congruence, both character-based and topological, is applied to two data sets to elucidate and potentially discriminate among these parsimony-based ideas. © 2001 The Willi Hennig Society.
Phylogenetic diversity, functional trait diversity and extinction: avoiding tipping points and worst-case losses

PubMed Central

Faith, Daniel P.

2015-01-01

The phylogenetic diversity measure (‘PD’) measures the relative feature diversity of different subsets of taxa from a phylogeny. At the level of feature diversity, PD supports the broad goal of biodiversity conservation to maintain living variation and option values. PD calculations at the level of lineages and features include those integrating probabilities of extinction, providing estimates of expected PD. This approach has known advantages over the evolutionarily distinct and globally endangered (EDGE) methods. Expected PD methods also have limitations. An alternative notion of expected diversity, expected functional trait diversity, relies on an alternative non-phylogenetic model and allows inferences of diversity at the level of functional traits. Expected PD also faces challenges in helping to address phylogenetic tipping points and worst-case PD losses. Expected PD may not choose conservation options that best avoid worst-case losses of long branches from the tree of life. We can expand the range of useful calculations based on expected PD, including methods for identifying phylogenetic key biodiversity areas. PMID:25561672
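Expected PD, as referred to in this abstract, is commonly computed by weighting each branch by the probability that at least one of its descendant taxa survives. The sketch below assumes independent extinction probabilities per taxon and a tree supplied as a list of branches with their descendant tips; this input format and the function name `expected_pd` are assumptions made for illustration.

```python
from math import prod


def expected_pd(branches, p_extinct):
    """Expected phylogenetic diversity under independent extinctions.

    branches  : list of (branch_length, [tips descending from the branch])
    p_extinct : dict mapping each tip to its probability of extinction

    A branch persists if at least one descendant tip survives, so it
    contributes length * (1 - product of the tips' extinction probabilities).
    """
    total = 0.0
    for length, tips in branches:
        p_all_lost = prod(p_extinct[t] for t in tips)
        total += length * (1.0 - p_all_lost)
    return total


# Tree ((A:1,B:1):2,C:3) written as branches with their descendant tips.
tree = [(1, ["A"]), (1, ["B"]), (2, ["A", "B"]), (3, ["C"])]
print(expected_pd(tree, {"A": 0.5, "B": 0.5, "C": 0.0}))  # 5.5
```

In this toy tree the expectation of 5.5 (out of a total PD of 7) hides an outcome with probability 0.25 in which both A and B are lost together with their shared branch of length 2, a loss of 4 units. That is one way to picture the worst-case concern the abstract raises about relying on the expectation alone.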
Species divergence and phylogenetic variation of ecophysiological traits in lianas and trees.

PubMed

Rios, Rodrigo S; Salgado-Luarte, Cristian; Gianoli, Ernesto

2014-01-01

The climbing habit is an evolutionary key innovation in plants because it is associated with enhanced clade diversification. We tested whether patterns of species divergence and variation of three ecophysiological traits that are fundamental for plant adaptation to light environments (maximum photosynthetic rate [A(max)], dark respiration rate [R(d)], and specific leaf area [SLA]) are consistent with this key innovation. Using data reported from four tropical forests and three temperate forests, we compared phylogenetic distance among species as well as the evolutionary rate, phylogenetic distance and phylogenetic signal of those traits in lianas and trees. Estimates of evolutionary rates showed that R(d) evolved faster in lianas, while SLA evolved faster in trees. The mean phylogenetic distance was 1.2 times greater among liana species than among tree species. Likewise, estimates of phylogenetic distance indicated that lianas were less related than by chance alone (phylogenetic evenness across 63 species), and trees were more related than expected by chance (phylogenetic clustering across 71 species). Lianas showed evenness for R(d), while trees showed phylogenetic clustering for this trait. In contrast, for SLA, lianas exhibited phylogenetic clustering and trees showed phylogenetic evenness. Lianas and trees showed patterns of ecophysiological trait variation among species that were independent of phylogenetic relatedness. We found support for the expected pattern of greater species divergence in lianas, but did not find consistent patterns regarding ecophysiological trait evolution and divergence. R(d) followed the species-level pattern, i.e., greater divergence/evolution in lianas compared to trees, while the opposite occurred for SLA and no pattern was detected for A(max). R(d) may have driven lianas' divergence across forest environments, and might contribute to diversification in climber clades.
Species Divergence and Phylogenetic Variation of Ecophysiological Traits in Lianas and Trees

PubMed Central

Rios, Rodrigo S.; Salgado-Luarte, Cristian; Gianoli, Ernesto

2014-01-01

The climbing habit is an evolutionary key innovation in plants because it is associated with enhanced clade diversification. We tested whether patterns of species divergence and variation of three ecophysiological traits that are fundamental for plant adaptation to light environments (maximum photosynthetic rate [Amax], dark respiration rate [Rd], and specific leaf area [SLA]) are consistent with this key innovation. Using data reported from four tropical forests and three temperate forests, we compared phylogenetic distance among species as well as the evolutionary rate, phylogenetic distance and phylogenetic signal of those traits in lianas and trees. Estimates of evolutionary rates showed that Rd evolved faster in lianas, while SLA evolved faster in trees. The mean phylogenetic distance was 1.2 times greater among liana species than among tree species. Likewise, estimates of phylogenetic distance indicated that lianas were less related than by chance alone (phylogenetic evenness across 63 species), and trees were more related than expected by chance (phylogenetic clustering across 71 species). Lianas showed evenness for Rd, while trees showed phylogenetic clustering for this trait. In contrast, for SLA, lianas exhibited phylogenetic clustering and trees showed phylogenetic evenness. Lianas and trees showed patterns of ecophysiological trait variation among species that were independent of phylogenetic relatedness. We found support for the expected pattern of greater species divergence in lianas, but did not find consistent patterns regarding ecophysiological trait evolution and divergence. Rd followed the species-level pattern, i.e., greater divergence/evolution in lianas compared to trees, while the opposite occurred for SLA and no pattern was detected for Amax. Rd may have driven lianas' divergence across forest environments, and might contribute to diversification in climber clades. PMID:24914958
[Phylogenetic relationship of street rabies virus strains and their antigenic reactivity with antibodies induced by vaccine strains. I. Analysis of phylogenetic relationship of street rabies virus strains isolated in Poland].

PubMed

Sadkowska-Todys, M

2000-01-01

The aims of these studies were the genetic characterization of street rabies virus strains isolated from different animal species in Poland and the determination of their phylogenetic relationships to reference laboratory strains of the street rabies viruses belonging to genotypes 1 and 5. The variability of rabies isolates and their phylogenetic relationships were studied by comparing the nucleotide sequence of a fragment of the virus genome. The Polish strains of genotype 1 belong to four phylogenetic groups (NE, CE, NEE, EE) corresponding to four variants: fox-raccoon dog (F-RD), European fox 1 (F1), European fox 2 (F2), and European fox 3 (F3). No rabies strains representing the dog-wolf variant or the variant typical of the arctic fox occur in Poland. The nucleotide and amino acid sequences of street rabies strains belonging to genotype 1 are highly similar to those of the laboratory strain CVS: about 91% at the nucleotide level and 95% at the amino acid level. The CVS strain is only about 69% and 74% similar to genotype 5 bat strains (EBL 1) at the nucleotide and amino acid levels, respectively. The genetic divergence of rabies strains circulating in Poland calls for continuous epidemiological and virological surveillance. The genotype and variant of isolated strains should be determined (using PCR and RFLP methods).
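The similarity percentages quoted in this abstract are the kind of figure obtained by comparing aligned sequences position by position. The sketch below shows one common way to compute such a value. The function name `percent_identity`, the choice to skip gapped columns, and the two toy fragments are assumptions made for illustration; they are not taken from the study.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of compared alignment columns where both sequences agree.

    Columns in which either sequence has a gap ('-') are skipped. This is
    one common convention, assumed here; the study may have used another.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # ignore gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments, invented for illustration (not real rabies sequences).
street = "ATGGATGCCGACAAGATTGT"
cvs = "ATGGATGCCGATAAGATCGT"
print(percent_identity(street, cvs))
```

The same function works for amino acid alignments, since it only compares characters column by column.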
A generalized K statistic for estimating phylogenetic signal from shape and other high-dimensional multivariate data.

PubMed

Adams, Dean C

2014-09-01

Phylogenetic signal is the tendency for closely related species to display similar trait values due to their common ancestry. Several methods have been developed for quantifying phylogenetic signal in univariate traits and for sets of traits treated simultaneously, and the statistical properties of these approaches have been extensively studied. However, methods for assessing phylogenetic signal in high-dimensional multivariate traits like shape are less well developed, and their statistical performance is not well characterized. In this article, I describe a generalization of the K statistic of Blomberg et al. that is useful for quantifying and evaluating phylogenetic signal in highly dimensional multivariate data. The method (K(mult)) is found from the equivalency between statistical methods based on covariance matrices and those based on distance matrices. Using computer simulations based on Brownian motion, I demonstrate that the expected value of K(mult) remains at 1.0 as trait variation among species is increased or decreased, and as the number of trait dimensions is increased. By contrast, estimates of phylogenetic signal found with a squared-change parsimony procedure for multivariate data change with increasing trait variation among species and with increasing numbers of trait dimensions, confounding biological interpretations. I also evaluate the statistical performance of hypothesis testing procedures based on K(mult) and find that the method displays appropriate Type I error and high statistical power for detecting phylogenetic signal in high-dimensional data. Statistical properties of K(mult) were consistent for simulations using bifurcating and random phylogenies, for simulations using different numbers of species, for simulations that varied the number of trait dimensions, and for different underlying models of trait covariance structure. 
Overall these findings demonstrate that K(mult) provides a useful means of evaluating phylogenetic signal in high-dimensional multivariate traits. Finally, I illustrate the utility of the new approach by evaluating the strength of phylogenetic signal for head shape in a lineage of Plethodon salamanders. © The Author(s) 2014. Published by Oxford University Press, on behalf of the Society of Systematic Biologists. All rights reserved.
Tandem repeats analysis for the high resolution phylogenetic analysis of Yersinia pestis

PubMed Central

Pourcel, C; André-Mazeaud, F; Neubauer, H; Ramisse, F; Vergnaud, G

2004-01-01

Background Yersinia pestis, the agent of plague, is a young and highly monomorphic species. Three biovars, each one thought to be associated with the last three Y. pestis pandemics, have been defined based on biochemical assays. More recently, DNA based assays, including DNA sequencing, IS typing, DNA arrays, have significantly improved current knowledge on the origin and phylogenetic evolution of Y. pestis. However, these methods suffer either from a lack of resolution or from the difficulty to compare data. Variable number of tandem repeats (VNTRs) provides valuable polymorphic markers for genotyping and performing phylogenetic analyses in a growing number of pathogens and have given promising results for Y. pestis as well. Results In this study we have genotyped 180 Y. pestis isolates by multiple locus VNTR analysis (MLVA) using 25 markers. Sixty-one different genotypes were observed. The three biovars were distributed into three main branches, with some exceptions. In particular, the Medievalis phenotype is clearly heterogeneous, resulting from different mutation events in the napA gene. Antiqua strains from Asia appear to hold a central position compared to Antiqua strains from Africa. A subset of 7 markers is proposed for the quick comparison of a new strain with the collection typed here. This can be easily achieved using a Web-based facility, specifically set-up for running such identifications. Conclusion Tandem-repeat typing may prove to be a powerful complement to the existing phylogenetic tools for Y. pestis. Typing can be achieved quickly at a low cost in terms of consumables, technical expertise and equipment. The resulting data can be easily compared between different laboratories. The number and selection of markers will eventually depend upon the type and aim of investigations. PMID:15186506
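MLVA data of the kind described here reduce each isolate to a profile of repeat copy numbers, one per VNTR locus, so counting genotypes and comparing isolates becomes simple bookkeeping. The sketch below uses a categorical (mismatch) distance, which is one common choice and is assumed here rather than taken from the paper. The isolate names, the five loci, and all copy numbers are invented.

```python
from itertools import combinations

# Hypothetical repeat copy numbers at 5 VNTR loci per isolate (invented data).
profiles = {
    "isolate_1": (7, 3, 12, 5, 9),
    "isolate_2": (7, 3, 12, 5, 9),
    "isolate_3": (7, 4, 12, 5, 8),
    "isolate_4": (6, 4, 11, 5, 8),
}


def categorical_distance(p, q):
    """Fraction of loci at which two profiles carry different repeat counts."""
    if len(p) != len(q):
        raise ValueError("profiles must cover the same loci")
    return sum(a != b for a, b in zip(p, q)) / len(p)


# Distinct genotypes are simply the distinct profiles.
genotypes = set(profiles.values())

# Pairwise distance table, the usual input to a clustering method.
distances = {
    (a, b): categorical_distance(profiles[a], profiles[b])
    for a, b in combinations(sorted(profiles), 2)
}

print(len(genotypes), "genotypes")
for pair, d in distances.items():
    print(pair, d)
```

A new strain typed at the same loci can be compared with a collection in the same way, by computing its distance to every stored profile.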
["Long-branch Attraction" artifact in phylogenetic reconstruction].

PubMed

Li, Yi-Wei; Yu, Li; Zhang, Ya-Ping

2007-06-01

Phylogenetic reconstruction among various organisms not only helps us understand their evolutionary history but also addresses several fundamental evolutionary questions. Understanding the evolutionary relationships among organisms lays the foundation for investigations in other biological disciplines. However, almost all widely used phylogenetic methods have limitations and fail to eliminate systematic errors effectively, which prevents the reconstruction of true organismal relationships. The "Long-branch Attraction" (LBA) artifact is one of the most disturbing factors in phylogenetic reconstruction. This review summarizes the concept of LBA, the methods for analyzing it, and the strategies for avoiding it. In addition, several typical examples are provided, and approaches to avoiding and resolving the LBA artifact are discussed.
Are voluntary wheel running and open-field behavior correlated in mice? Different answers from comparative and artificial selection approaches.

PubMed

Careau, Vincent; Bininda-Emonds, Olaf R P; Ordonez, Genesis; Garland, Theodore

2012-09-01

Voluntary wheel running and open-field behavior are probably the two most widely used measures of locomotion in laboratory rodents. We tested whether these two behaviors are correlated in mice using two approaches: the phylogenetic comparative method using inbred strains of mice and an ongoing artificial selection experiment on voluntary wheel running. After taking into account the measurement error and phylogenetic relationships among inbred strains, we obtained a significant positive correlation between distance run on wheels and distance moved in the open-field for both sexes. Thigmotaxis was negatively correlated with distance run on wheels in females but not in males. By contrast, mice from four replicate lines bred for high wheel running did not differ in either distance covered or thigmotaxis in the open field as compared with mice from four non-selected control lines. Overall, results obtained in the selection experiment were generally opposite to those observed among inbred strains. Possible reasons for this discrepancy are discussed.
Transforming phylogenetic networks: Moving beyond tree space.

PubMed

Huber, Katharina T; Moulton, Vincent; Wu, Taoyang

2016-09-07

Phylogenetic networks are a generalization of phylogenetic trees that are used to represent reticulate evolution. Unrooted phylogenetic networks form a special class of such networks, which naturally generalize unrooted phylogenetic trees. In this paper we define two operations on unrooted phylogenetic networks, one of which is a generalization of the well-known nearest-neighbor interchange (NNI) operation on phylogenetic trees. We show that any unrooted phylogenetic network can be transformed into any other such network using only these operations. This generalizes the well-known fact that any phylogenetic tree can be transformed into any other such tree using only NNI operations. It also allows us to define a generalization of tree space and to define some new metrics on unrooted phylogenetic networks. To prove our main results, we employ some fascinating new connections between phylogenetic networks and cubic graphs that we have recently discovered. Our results should be useful in developing new strategies to search for optimal phylogenetic networks, a topic that has recently generated some interest in the literature, as well as for providing new ways to compare networks. Copyright © 2016 Elsevier Ltd. All rights reserved.
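For readers unfamiliar with the nearest-neighbor interchange that this paper generalizes, the sketch below shows the move on ordinary trees only, not on networks. Around any internal edge there are four subtrees, and one NNI regroups them in one of the two other possible ways. The nested-tuple representation and the function name `nni_neighbors` are assumptions made for illustration.

```python
def nni_neighbors(a, b, c, d):
    """Return the two trees reachable by one NNI across an internal edge.

    The starting arrangement is ((a, b), (c, d)): subtrees a and b hang on
    one side of the edge, c and d on the other. An NNI swaps one subtree
    from each side, giving the two alternative groupings.
    """
    return [((a, c), (b, d)), ((a, d), (b, c))]


start = (("A", "B"), ("C", "D"))
print("start:", start)
for tree in nni_neighbors("A", "B", "C", "D"):
    print("neighbor:", tree)
```

Because each subtree argument can itself be a nested tuple, the same function describes the move at any internal edge of a larger tree.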
Influence of cladogenesis on feeding structures in drums (Teleostei: Sciaenidae).

PubMed

Deary, Alison L; Hilton, Eric J

2017-02-01

Drums (family Sciaenidae) are common in tropical to temperate coastal and estuarine habitats worldwide and present a broad spectrum of morphological diversity. The anatomical variation in this family is particularly evident in their feeding apparatus, which may reflect the partitioning of adult foraging habitats. Adult and early life history stage sciaenids may display ecomorphological patterns in oral and pharyngeal jaw elements but because sciaenids are hierarchically related, the morphological variation of the feeding apparatus cannot be analyzed as independent data. Morphological patterns have been identified in three sciaenid genera from the Chesapeake Bay but it is not known if these patterns are present in other genera of the family and if such patterns are constrained by phylogenetic history. In this study, phylogenetic comparative methods were applied to two sets of oral jaw data obtained from growth series of 11 species of cleared and double-stained Chesapeake Bay sciaenids and alcohol-preserved museum specimens representing 65 of the 66 recognized genera to determine the magnitude of phylogenetic dependence present in the structure of the oral jaws using a recent molecular phylogeny of the family. Pagel's lambda, a measure of phylogenetic signal, was low for pelagic sciaenids in premaxilla, lower jaw, and ascending process lengths, indicating influence of selective forces on the condition of these traits. Conversely, for benthic sciaenids, phylogenetic signal was high for lower jaw and ascending process lengths, indicating significant phylogenetic constraint for their condition in these taxa. Pagel's lambda was intermediate for premaxilla length in benthic sciaenids, suggesting that the length of the premaxilla is influenced by a mix of selective forces and phylogenetic constraint. 
Although the ecomorphological patterns identified in the oral jaws of sciaenids are not entirely free of phylogenetic dependence, selective forces related to foraging are likely driving the evolution of these structures. Copyright © 2016 Elsevier GmbH. All rights reserved.
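As a reading aid for the lambda values reported above: Pagel's lambda is commonly defined as a multiplier applied to the off-diagonal entries of the among-species covariance matrix expected under Brownian motion, so that lambda = 0 removes all covariance due to shared ancestry and lambda = 1 leaves the Brownian expectation unchanged. The sketch below shows only this transformation. It does not estimate lambda, which requires fitting by likelihood, and the three-species matrix is hypothetical.

```python
def lambda_transform(cov, lam):
    """Scale the off-diagonal entries of a covariance matrix by lam.

    Diagonal entries (root-to-tip depths) are left untouched; off-diagonal
    entries (shared branch lengths between species pairs) are multiplied
    by lam.
    """
    n = len(cov)
    return [
        [cov[i][j] if i == j else lam * cov[i][j] for j in range(n)]
        for i in range(n)
    ]


# Hypothetical Brownian-motion covariance for three species: species 0 and 1
# share 0.6 units of history, species 2 shares none with either.
C = [
    [1.0, 0.6, 0.0],
    [0.6, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

for lam in (0.0, 0.5, 1.0):
    print(lam, lambda_transform(C, lam))
```

Under this definition, a low fitted lambda means the trait covaries among relatives less than the tree predicts, and a value near 1 means it covaries about as much as the tree predicts.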
Structural phylogeny by profile extraction and multiple superimposition using electrostatic congruence as a discriminator

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chakraborty, Sandeep; Rao, Basuthkar J.; Baker, Nathan A.

2013-04-01

Phylogenetic analysis of proteins using multiple sequence alignment (MSA) assumes an underlying evolutionary relationship in these proteins which occasionally remains undetected due to considerable sequence divergence. Structural alignment programs have been developed to unravel such fuzzy relationships. However, none of these structure based methods have used electrostatic properties to discriminate between spatially equivalent residues. We present a methodology for MSA of a set of related proteins with known structures using electrostatic properties as an additional discriminator (STEEP). STEEP first extracts a profile, then generates a multiple structural superimposition providing a consolidated spatial framework for comparing residues and finally emits the MSA. Residues that are aligned differently by including or excluding electrostatic properties can be targeted by directed evolution experiments to transform the enzymatic properties of one protein into another. We have compared STEEP results to those obtained from a MSA program (ClustalW) and a structural alignment method (MUSTANG) for chymotrypsin serine proteases. Subsequently, we used PhyML to generate phylogenetic trees for the serine and metallo-β-lactamase superfamilies from the STEEP generated MSA, and corroborated the accepted relationships in these superfamilies. We have observed that STEEP acts as a functional classifier when electrostatic congruence is used as a discriminator, and thus identifies potential targets for directed evolution experiments. In summary, STEEP is unique among phylogenetic methods for its ability to use electrostatic congruence to specify mutations that might be the source of the functional divergence in a protein family. Based on our results, we also hypothesize that the active site and its close vicinity contains enough information to infer the correct phylogeny for related proteins.
Fossils matter: improved estimates of divergence times in Pinus reveal older diversification.

PubMed

Saladin, Bianca; Leslie, Andrew B; Wüest, Rafael O; Litsios, Glenn; Conti, Elena; Salamin, Nicolas; Zimmermann, Niklaus E

2017-04-04

The taxonomy of pines (genus Pinus) is widely accepted and a robust gene tree based on entire plastome sequences exists. However, there is a large discrepancy in estimated divergence times of major pine clades among existing studies, mainly due to differences in fossil placement and dating methods used. We currently lack a dated molecular phylogeny that makes use of the rich pine fossil record, and this study is the first to estimate the divergence dates of pines based on a large number of fossils (21) evenly distributed across all major clades, in combination with applying both node and tip dating methods. We present a range of molecular phylogenetic trees of Pinus generated within a Bayesian framework. We find the origin of crown Pinus is likely up to 30 Myr older (Early Cretaceous) than inferred in most previous studies (Late Cretaceous) and propose generally older divergence times for major clades within Pinus than previously thought. Our age estimates vary significantly between the different dating approaches, but the results generally agree on older divergence times. We present a revised list of 21 fossils that are suitable to use in dating or comparative analyses of pines. Reliable estimates of divergence times in pines are essential if we are to link diversification processes and functional adaptation of this genus to geological events or to changing climates. In addition to older divergence times in Pinus, our results also indicate that node age estimates in pines depend on dating approaches and the specific fossil sets used, reflecting inherent differences in various dating approaches. The sets of dated phylogenetic trees of pines presented here provide a way to account for uncertainties in age estimations when applying comparative phylogenetic methods.
CDAO-Store: Ontology-driven Data Integration for Phylogenetic Analysis

PubMed Central

2011-01-01

Background The Comparative Data Analysis Ontology (CDAO) is an ontology developed, as part of the EvoInfo and EvoIO groups supported by the National Evolutionary Synthesis Center, to provide semantic descriptions of data and transformations commonly found in the domain of phylogenetic analysis. The core concepts of the ontology enable the description of phylogenetic trees and associated character data matrices. Results Using CDAO as the semantic back-end, we developed a triple-store, named CDAO-Store. CDAO-Store is a RDF-based store of phylogenetic data, including a complete import of TreeBASE. CDAO-Store provides a programmatic interface, in the form of web services, and a web-based front-end, to perform both user-defined as well as domain-specific queries; domain-specific queries include search for nearest common ancestors, minimum spanning clades, filter multiple trees in the store by size, author, taxa, tree identifier, algorithm or method. In addition, CDAO-Store provides a visualization front-end, called CDAO-Explorer, which can be used to view both character data matrices and trees extracted from the CDAO-Store. CDAO-Store provides import capabilities, enabling the addition of new data to the triple-store; files in PHYLIP, MEGA, nexml, and NEXUS formats can be imported and their CDAO representations added to the triple-store. Conclusions CDAO-Store is made up of a versatile and integrated set of tools to support phylogenetic analysis. To the best of our knowledge, CDAO-Store is the first semantically-aware repository of phylogenetic data with domain-specific querying capabilities. The portal to CDAO-Store is available at http://www.cs.nmsu.edu/~cdaostore. PMID:21496247
CDAO-store: ontology-driven data integration for phylogenetic analysis.

PubMed

Chisham, Brandon; Wright, Ben; Le, Trung; Son, Tran Cao; Pontelli, Enrico

2011-04-15

The Comparative Data Analysis Ontology (CDAO) is an ontology developed, as part of the EvoInfo and EvoIO groups supported by the National Evolutionary Synthesis Center, to provide semantic descriptions of data and transformations commonly found in the domain of phylogenetic analysis. The core concepts of the ontology enable the description of phylogenetic trees and associated character data matrices. Using CDAO as the semantic back-end, we developed a triple-store, named CDAO-Store. CDAO-Store is a RDF-based store of phylogenetic data, including a complete import of TreeBASE. CDAO-Store provides a programmatic interface, in the form of web services, and a web-based front-end, to perform both user-defined as well as domain-specific queries; domain-specific queries include search for nearest common ancestors, minimum spanning clades, filter multiple trees in the store by size, author, taxa, tree identifier, algorithm or method. In addition, CDAO-Store provides a visualization front-end, called CDAO-Explorer, which can be used to view both character data matrices and trees extracted from the CDAO-Store. CDAO-Store provides import capabilities, enabling the addition of new data to the triple-store; files in PHYLIP, MEGA, nexml, and NEXUS formats can be imported and their CDAO representations added to the triple-store. CDAO-Store is made up of a versatile and integrated set of tools to support phylogenetic analysis. To the best of our knowledge, CDAO-Store is the first semantically-aware repository of phylogenetic data with domain-specific querying capabilities. The portal to CDAO-Store is available at http://www.cs.nmsu.edu/~cdaostore.
Occurrence of lignin degradation genotypes and phenotypes among prokaryotes.

PubMed

Tian, Jiang-Hao; Pourcher, Anne-Marie; Bouchez, Théodore; Gelhaye, Eric; Peu, Pascal

2014-12-01

A number of prokaryotes actively contribute to lignin degradation in nature and their activity could be of interest for many applications including the production of biogas/biofuel from lignocellulosic biomass and biopulping. This review compares the reliability and efficiency of the culture-dependent screening methods currently used for the isolation of ligninolytic prokaryotes. Isolated prokaryotes exhibiting lignin-degrading potential are presented according to their phylogenetic groups. With the development of bioinformatics, culture-independent techniques are emerging that allow larger-scale data mining for ligninolytic prokaryotic functions but today, these techniques still have some limits. In this work, two phylogenetic affiliations of isolated prokaryotes exhibiting ligninolytic potential and laccase-encoding prokaryotes were determined on the basis of 16S rDNA sequences, providing a comparative view of results obtained by the two types of screening techniques. The combination of laboratory culture and bioinformatics approaches is a promising way to explore lignin-degrading prokaryotes.
IMGD: an integrated platform supporting comparative genomics and phylogenetics of insect mitochondrial genomes

PubMed Central

Lee, Wonhoon; Park, Jongsun; Choi, Jaeyoung; Jung, Kyongyong; Park, Bongsoo; Kim, Donghan; Lee, Jaeyoung; Ahn, Kyohun; Song, Wonho; Kang, Seogchan; Lee, Yong-Hwan; Lee, Seunghwan

2009-01-01

Background Sequences and organization of the mitochondrial genome have been used as markers to investigate evolutionary history and relationships in many taxonomic groups. The rapidly increasing mitochondrial genome sequences from diverse insects provide ample opportunities to explore various global evolutionary questions in the superclass Hexapoda. To adequately support such questions, it is imperative to establish an informatics platform that facilitates the retrieval and utilization of available mitochondrial genome sequence data. Results The Insect Mitochondrial Genome Database (IMGD) is a new integrated platform that archives the mitochondrial genome sequences from 25,747 hexapod species, including 112 completely sequenced and 20 nearly completed genomes and 113,985 partially sequenced mitochondrial genomes. The Species-driven User Interface (SUI) of IMGD supports data retrieval and diverse analyses at multi-taxon levels. The Phyloviewer implemented in IMGD provides three methods for drawing phylogenetic trees and displays the resulting trees on the web. The SNP database incorporated to IMGD presents the distribution of SNPs and INDELs in the mitochondrial genomes of multiple isolates within eight species. A newly developed comparative SNU Genome Browser supports the graphical presentation and interactive interface for the identified SNPs/INDELs. Conclusion The IMGD provides a solid foundation for the comparative mitochondrial genomics and phylogenetics of insects. All data and functions described here are available at the web site . PMID:19351385
Yeast species diversity in apple juice for cider production evidenced by culture-based method.

PubMed

Lorenzini, Marilinda; Simonato, Barbara; Zapparoli, Giacomo

2018-05-07

Yeasts isolated from apple juices of two cider houses (one located in a plain area and one in an alpine area) were identified by a culture-based method. Wallerstein Laboratory Nutrient Agar was used as the medium for isolation and preliminary yeast identification. A total of 20 species of yeasts belonging to ten different genera were identified using both the BLAST algorithm for pairwise sequence comparison and phylogenetic approaches. A wide variety of non-Saccharomyces species was found. Interestingly, Candida railenensis, Candida cylindracea, Hanseniaspora meyeri, Hanseniaspora pseudoguilliermondii, and Metschnikowia sinensis were recovered for the first time in the yeast community of an apple environment. Phylogenetic analysis revealed a better resolution in identifying Metschnikowia and Moesziomyces isolates than comparative analysis using the GenBank or YeastIP gene databases. This study provides important data on the yeast microbiota of apple juice and shows differences in species composition between the two geographical cider production areas.
Comparative molecular species delimitation in the charismatic Nawab butterflies (Nymphalidae, Charaxinae, Polyura).

PubMed

Toussaint, Emmanuel F A; Morinière, Jérôme; Müller, Chris J; Kunte, Krushnamegh; Turlin, Bernard; Hausmann, Axel; Balke, Michael

2015-10-01

The charismatic tropical Polyura Nawab butterflies are distributed across twelve biodiversity hotspots in the Indomalayan/Australasian archipelago. In this study, we tested an array of species delimitation methods and compared the results to existing morphology-based taxonomy. We sequenced two mitochondrial and two nuclear gene fragments to reconstruct phylogenetic relationships within Polyura using both Bayesian inference and maximum likelihood. Based on this phylogenetic framework, we used the recently introduced bGMYC, BPP and PTP methods to investigate species boundaries. Based on our results, we describe two new species, Polyura paulettae Toussaint sp. n. and Polyura smilesi Toussaint sp. n., propose one synonym, and raise five populations to species status. Most of the newly recognized species are single-island endemics likely resulting from the recent highly complex geological history of the Indomalayan-Australasian archipelago. Surprisingly, we also find two newly recognized species in the Indomalayan region where additional biotic or abiotic factors have fostered speciation. Species delimitation methods were largely congruent and succeeded in cross-validating most extant morphological species. PTP and BPP seem to yield more consistent and robust estimations of species boundaries with respect to morphological characters while bGMYC delivered contrasting results depending on the different gene trees considered. Our findings demonstrate the efficiency of comparative approaches using molecular species delimitation methods on empirical data. They also pave the way for the investigation of less well-known groups to unveil patterns of species richness and catalogue Earth's concealed, therefore unappreciated diversity. Published by Elsevier Inc.
The influence of molecular markers and methods on inferring the phylogenetic relationships between the representatives of the Arini (parrots, Psittaciformes), determined on the basis of their complete mitochondrial genomes.

PubMed

Urantowka, Adam Dawid; Kroczak, Aleksandra; Mackiewicz, Paweł

2017-07-14

Conures are a morphologically diverse group of Neotropical parrots classified as members of the tribe Arini, which has recently been subjected to a taxonomic revision. The previously broadly defined Aratinga genus of this tribe has been split into the 'true' Aratinga and three additional genera, Eupsittula, Psittacara and Thectocercus. Popular markers used in the reconstruction of the parrots' phylogenies derive from mitochondrial DNA. However, current phylogenetic analyses seem to indicate conflicting relationships between Aratinga and other conures, and also among other Arini members. Therefore, it is not clear if the mtDNA phylogenies can reliably define the species tree. The inconsistencies may result from the variable evolution rate of the markers used or their weak phylogenetic signal. To resolve these controversies and to assess to what extent the phylogenetic relationships in the tribe Arini can be inferred from mitochondrial genomes, we compared representative Arini mitogenomes as well as examined the usefulness of the individual mitochondrial markers and the efficiency of various phylogenetic methods. Single molecular markers produced inconsistent tree topologies, while different methods offered various topologies even for the same marker. A significant disagreement in these tree topologies occurred for cytb, nd2 and nd6 genes, which are commonly used in parrot phylogenies. The strongest phylogenetic signal was found in the control region and RNA genes. However, these markers cannot be used alone in inferring Arini phylogenies because they do not provide fully resolved trees. The most reliable phylogeny of the parrots under study is obtained only on the concatenated set of all mitochondrial markers. The analyses established significantly resolved relationships within the former Aratinga representatives and the main genera of the tribe Arini. 
Such mtDNA phylogeny can be in agreement with the species tree, owing to its match with synapomorphic features in plumage colouration. Phylogenetic relationships inferred from single mitochondrial markers can be incorrect and contradictory. Therefore, such phylogenies should be considered with caution. Reliable results can be produced by concatenated sets of all or at least the majority of mitochondrial genes and the control region. The results advance a new view on the relationships among the main genera of Arini and resolve the inconsistencies between the taxa that were previously classified as the broadly defined genus Aratinga. Although gene and species trees do not always have to be consistent, the mtDNA phylogenies for Arini can reflect the species tree.

Karyotype Evolution in Birds: From Conventional Staining to Chromosome Painting

PubMed Central

Ferguson-Smith, Malcolm A.

2018-01-01

In the last few decades, there have been great efforts to reconstruct the phylogeny of Neoaves based mainly on DNA sequencing. Despite the importance of karyotype data in phylogenetic studies, especially with the advent of fluorescence in situ hybridization (FISH) techniques using different types of probes, the use of chromosomal data to clarify phylogenetic proposals is still minimal. Additionally, comparative chromosome painting in birds is restricted to a few orders, while in mammals, for example, virtually all orders have already been analyzed using this method. Most reports are based on comparisons using Gallus gallus probes, and only a small number of species have been analyzed with more informative sets of probes, such as those from Leucopternis albicollis and Gyps fulvus, which show ancestral macrochromosomes rearranged in alternative patterns. Despite this, it is appropriate to review the available cytogenetic information and possible phylogenetic conclusions. In this report, the authors gather both classical and molecular cytogenetic data and describe some interesting and unique characteristics of karyotype evolution in birds. PMID:29584697
PoMo: An Allele Frequency-Based Approach for Species Tree Estimation

PubMed Central

De Maio, Nicola; Schrempf, Dominik; Kosiol, Carolin

2015-01-01

Incomplete lineage sorting can cause incongruencies of the overall species-level phylogenetic tree with the phylogenetic trees for individual genes or genomic segments. If these incongruencies are not accounted for, it is possible to incur several biases in species tree estimation. Here, we present a simple maximum likelihood approach that accounts for ancestral variation and incomplete lineage sorting. We use a POlymorphisms-aware phylogenetic MOdel (PoMo) that we have recently shown to efficiently estimate mutation rates and fixation biases from within and between-species variation data. We extend this model to perform efficient estimation of species trees. We test the performance of PoMo in several different scenarios of incomplete lineage sorting using simulations and compare it with existing methods both in accuracy and computational speed. In contrast to other approaches, our model does not use coalescent theory but is allele frequency based. We show that PoMo is well suited for genome-wide species tree estimation and that on such data it is more accurate than previous approaches. PMID:26209413
Phylogenetic relations of humans and African apes from DNA sequences in the Psi eta-globin region

DOE Office of Scientific and Technical Information (OSTI.GOV)

Miyamoto, M.M.; Slightom, J.L.; Goodman, M.

Sequences from the upstream and downstream flanking DNA regions of the Psi eta-globin locus in Pan troglodytes (common chimpanzee), Gorilla gorilla (gorilla), and Pongo pygmaeus (orangutan, the closest living relative to Homo, Pan, and Gorilla) provided further data for evaluating the phylogenetic relations of humans and African apes. These newly sequenced orthologs (an additional 4.9 kilobase pairs (kbp) for each species) were combined with published Psi eta-gene sequences and then compared to the same orthologous stretch (a continuous 7.1-kbp region) available for humans. Phylogenetic analysis of these nucleotide sequences by the parsimony method indicated (i) that human and chimpanzee are more closely related to each other than either is to gorilla and (ii) that the slowdown in the rate of sequence evolution evident in higher primates is especially pronounced in humans. These results indicate that features unique to African apes (but not to humans) are primitive and that even local molecular clocks should be applied with caution.
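The parsimony method mentioned in this abstract prefers the tree that explains the aligned sequences with the fewest substitutions. A minimal sketch of how candidate trees can be scored is given below, using Fitch's small-parsimony algorithm. This is a standard technique chosen for illustration, not necessarily the exact procedure the authors used. The taxon labels `t1` to `t4` and the five-site alignment are invented and carry no biological meaning.

```python
def fitch(tree, states):
    """Fitch pass for one site on a rooted binary tree of nested 2-tuples.

    Returns (possible ancestral states, minimum number of changes).
    """
    if isinstance(tree, str):  # leaf: its observed state, no changes
        return {states[tree]}, 0
    left, right = tree
    lset, lcost = fitch(left, states)
    rset, rcost = fitch(right, states)
    common = lset & rset
    if common:  # children can agree: no extra change needed here
        return common, lcost + rcost
    return lset | rset, lcost + rcost + 1  # disagreement costs one change


def parsimony_score(tree, alignment):
    """Sum the Fitch change counts over all alignment columns."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch(tree, {taxon: seq[i] for taxon, seq in alignment.items()})[1]
        for i in range(n_sites)
    )


# Invented toy alignment with four hypothetical taxa.
alignment = {"t1": "AAGCT", "t2": "AAGCC", "t3": "AGGTC", "t4": "GGGTC"}

tree_a = ((("t1", "t2"), "t3"), "t4")  # groups t1 with t2
tree_b = ((("t2", "t3"), "t1"), "t4")  # groups t2 with t3

print("tree_a:", parsimony_score(tree_a, alignment))
print("tree_b:", parsimony_score(tree_b, alignment))
```

On this toy data the tree needing fewer changes is the one parsimony would prefer; a real analysis would score or search over all candidate topologies and many more sites.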
Karyotype Evolution in Birds: From Conventional Staining to Chromosome Painting.

PubMed

Kretschmer, Rafael; Ferguson-Smith, Malcolm A; de Oliveira, Edivaldo Herculano Correa

2018-03-27

In the last few decades, there have been great efforts to reconstruct the phylogeny of Neoaves based mainly on DNA sequencing. Despite the importance of karyotype data in phylogenetic studies, especially with the advent of fluorescence in situ hybridization (FISH) techniques using different types of probes, the use of chromosomal data to clarify phylogenetic proposals is still minimal. Additionally, comparative chromosome painting in birds is restricted to a few orders, while in mammals, for example, virtually all orders have already been analyzed using this method. Most reports are based on comparisons using Gallus gallus probes, and only a small number of species have been analyzed with more informative sets of probes, such as those from Leucopternis albicollis and Gyps fulvus, which show ancestral macrochromosomes rearranged in alternative patterns. Despite this, it is appropriate to review the available cytogenetic information and possible phylogenetic conclusions. In this report, the authors gather both classical and molecular cytogenetic data and describe some interesting and unique characteristics of karyotype evolution in birds.
Phylogenetic Tools for Generalized HIV-1 Epidemics: Findings from the PANGEA-HIV Methods Comparison

PubMed Central

Ratmann, Oliver; Hodcroft, Emma B.; Pickles, Michael; Cori, Anne; Hall, Matthew; Lycett, Samantha; Colijn, Caroline; Dearlove, Bethany; Didelot, Xavier; Frost, Simon; Hossain, A.S. Md Mukarram; Joy, Jeffrey B.; Kendall, Michelle; Kühnert, Denise; Leventhal, Gabriel E.; Liang, Richard; Plazzotta, Giacomo; Poon, Art F.Y.; Rasmussen, David A.; Stadler, Tanja; Volz, Erik; Weis, Caroline; Leigh Brown, Andrew J.; Fraser, Christophe

2017-01-01

Viral phylogenetic methods contribute to understanding how HIV spreads in populations, and thereby help guide the design of prevention interventions. So far, most analyses have been applied to well-sampled concentrated HIV-1 epidemics in wealthy countries. To direct the use of phylogenetic tools to where the impact of HIV-1 is greatest, the Phylogenetics And Networks for Generalized HIV Epidemics in Africa (PANGEA-HIV) consortium generates full-genome viral sequences from across sub-Saharan Africa. Analyzing these data presents new challenges, since epidemics are principally driven by heterosexual transmission and a smaller fraction of cases is sampled. Here, we show that viral phylogenetic tools can be adapted and used to estimate epidemiological quantities of central importance to HIV-1 prevention in sub-Saharan Africa. We used a community-wide methods comparison exercise on simulated data, where participants were blinded to the true dynamics they were inferring. Two distinct simulations captured generalized HIV-1 epidemics, before and after a large community-level intervention that reduced infection levels. Five research groups participated. Structured coalescent modeling approaches were most successful: phylogenetic estimates of HIV-1 incidence, incidence reductions, and the proportion of transmissions from individuals in their first 3 months of infection correlated with the true values (Pearson correlation > 90%), with small bias. However, on some simulations, true values were markedly outside reported confidence or credibility intervals. The blinded comparison revealed current limits and strengths in using HIV phylogenetics in challenging settings, provided benchmarks for future methods’ development, and supports using the latest generation of phylogenetic tools to advance HIV surveillance and prevention. PMID:28053012
A program to compute the soft Robinson-Foulds distance between phylogenetic networks.

PubMed

Lu, Bingxin; Zhang, Louxin; Leong, Hon Wai

2017-03-14

Over the past two decades, phylogenetic networks have been studied to model reticulate evolutionary events. The relationships among phylogenetic networks, phylogenetic trees and clusters serve as the basis for reconstruction and comparison of phylogenetic networks. To understand these relationships, two problems are raised: the tree containment problem, which asks whether a phylogenetic tree is displayed in a phylogenetic network, and the cluster containment problem, which asks whether a cluster is represented at a node in a phylogenetic network. Both problems are NP-complete. A fast exponential-time algorithm for the cluster containment problem on arbitrary networks is developed and implemented in C. The resulting program is further extended into a computer program for fast computation of the Soft Robinson-Foulds distance between phylogenetic networks. Two computer programs are developed for facilitating reconstruction and validation of phylogenetic network models in evolutionary and comparative genomics. Our simulation tests indicated that they are fast enough for use in practice. Additionally, our simulation data indicate that the distribution of the Soft Robinson-Foulds distance between phylogenetic networks is unlikely to be normal.
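The cluster-based comparison described in this abstract can be illustrated on the simpler tree case. The sketch below computes the classical Robinson-Foulds distance between two rooted trees as the number of clusters found in exactly one of them. It is not the Soft Robinson-Foulds distance on networks that the authors' C program computes, and the nested-tuple tree encoding and function names are assumptions made for illustration only.

```python
def clusters(tree):
    """Return the non-trivial clusters (leaf sets below internal nodes) of a rooted tree.

    Hypothetical encoding: a leaf is a string, an internal node is a tuple of subtrees.
    """
    found = set()

    def leaves_below(node):
        if isinstance(node, str):
            return frozenset([node])
        leaf_set = frozenset().union(*(leaves_below(child) for child in node))
        found.add(leaf_set)
        return leaf_set

    all_leaves = leaves_below(tree)
    found.discard(all_leaves)  # the root cluster is common to all trees on the same taxa
    return found


def rf_distance(tree_a, tree_b):
    """Count the clusters present in exactly one of the two trees."""
    return len(clusters(tree_a) ^ clusters(tree_b))


t1 = ((("a", "b"), "c"), "d")
t2 = ((("a", "c"), "b"), "d")
print(rf_distance(t1, t2))  # the trees differ in the clusters {a, b} and {a, c}
```

On networks, deciding whether a cluster is represented at a node is the hard step (the cluster containment problem named in the abstract), which is why the tree case above is so much simpler.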
Phylogenetic Analyses: A Toolbox Expanding towards Bayesian Methods

PubMed Central

Aris-Brosou, Stéphane; Xia, Xuhua

2008-01-01

The reconstruction of phylogenies is becoming an increasingly simple activity. This is mainly due to two reasons: the democratization of computing power and the increased availability of sophisticated yet user-friendly software. This review describes some of the latest additions to the phylogenetic toolbox, along with some of their theoretical and practical limitations. It is shown that Bayesian methods are under heavy development, as they offer the possibility to solve a number of long-standing issues and to integrate several steps of the phylogenetic analyses into a single framework. Specific topics include not only phylogenetic reconstruction, but also the comparison of phylogenies, the detection of adaptive evolution, and the estimation of divergence times between species. PMID:18483574
Visualizing Phylogenetic Treespace Using Cartographic Projections

NASA Astrophysics Data System (ADS)

Sundberg, Kenneth; Clement, Mark; Snell, Quinn

Phylogenetic analysis is becoming an increasingly important tool for biological research. Applications include epidemiological studies, drug development, and evolutionary analysis. Phylogenetic search is a known NP-Hard problem. The size of the data sets which can be analyzed is limited by the exponential growth in the number of trees that must be considered as the problem size increases. A better understanding of the problem space could lead to better methods, which in turn could lead to the feasible analysis of more data sets. We present a definition of phylogenetic tree space and a visualization of this space that shows significant exploitable structure. This structure can be used to develop search methods capable of handling much larger datasets.
Phylogenetic rooting using minimal ancestor deviation.

PubMed

Tria, Fernando Domingues Kümmel; Landan, Giddy; Dagan, Tal

2017-06-19

Ancestor-descendent relations play a cardinal role in evolutionary theory. Those relations are determined by rooting phylogenetic trees. Existing rooting methods are hampered by evolutionary rate heterogeneity or the unavailability of auxiliary phylogenetic information. Here we present a rooting approach, the minimal ancestor deviation (MAD) method, which accommodates heterotachy by using all pairwise topological and metric information in unrooted trees. We demonstrate the performance of the method, in comparison to existing rooting methods, by the analysis of phylogenies from eukaryotes and prokaryotes. MAD correctly recovers the known root of eukaryotes and uncovers evidence for the origin of cyanobacteria in the ocean. MAD is more robust and consistent than existing methods, provides measures of the root inference quality and is applicable to any tree with branch lengths.
Phylogenetic analysis at deep timescales: unreliable gene trees, bypassed hidden support, and the coalescence/concatalescence conundrum.

PubMed

Gatesy, John; Springer, Mark S

2014-11-01

Large datasets are required to solve difficult phylogenetic problems that are deep in the Tree of Life. Currently, two divergent systematic methods are commonly applied to such datasets: the traditional supermatrix approach (= concatenation) and "shortcut" coalescence (= coalescence methods wherein gene trees and the species tree are not co-estimated). When applied to ancient clades, these contrasting frameworks often produce congruent results, but in recent phylogenetic analyses of Placentalia (placental mammals), this is not the case. A recent series of papers has alternatively disputed and defended the utility of shortcut coalescence methods at deep phylogenetic scales. Here, we examine this exchange in the context of published phylogenomic data from Mammalia; in particular we explore two critical issues - the delimitation of data partitions ("genes") in coalescence analysis and hidden support that emerges with the combination of such partitions in phylogenetic studies. Hidden support - increased support for a clade in combined analysis of all data partitions relative to the support evident in separate analyses of the various data partitions, is a hallmark of the supermatrix approach and a primary rationale for concatenating all characters into a single matrix. In the most extreme cases of hidden support, relationships that are contradicted by all gene trees are supported when all of the genes are analyzed together. A valid fear is that shortcut coalescence methods might bypass or distort character support that is hidden in individual loci because small gene fragments are analyzed in isolation. Given the extensive systematic database for Mammalia, the assumptions and applicability of shortcut coalescence methods can be assessed with rigor to complement a small but growing body of simulation work that has directly compared these methods to concatenation. 
We document several remarkable cases of hidden support in both supermatrix and coalescence paradigms and argue that in most instances, the emergent support in the shortcut coalescence analyses is an artifact. By referencing rigorous molecular clock studies of Mammalia, we suggest that inaccurate gene trees that imply unrealistically deep coalescences debilitate shortcut coalescence analyses of the placental dataset. We document contradictory coalescence results for Placentalia, and outline a critical conundrum that challenges the general utility of shortcut coalescence methods at deep phylogenetic scales. In particular, the basic unit of analysis in coalescence analysis, the coalescence-gene, is expected to shrink in size as more taxa are analyzed, but as the amount of data for reconstruction of a gene tree ratchets downward, the number of nodes in the gene tree that need to be resolved ratchets upward. Some advocates of shortcut coalescence methods have attempted to address problems with inaccurate gene trees by concatenating multiple coalescence-genes to yield "gene trees" that better match the species tree. However, this hybrid concatenation/coalescence approach, "concatalescence," contradicts the most basic biological rationale for performing a coalescence analysis in the first place. We discuss this reality in the context of recent simulation work that also suggests inaccurate reconstruction of gene trees is more problematic for shortcut coalescence methods than deep coalescence of independently segregating loci is for concatenation methods. Copyright © 2014 Elsevier Inc. All rights reserved.
Phylogenetic and Evolutionary Patterns in Microbial Carotenoid Biosynthesis Are Revealed by Comparative Genomics

PubMed Central

Klassen, Jonathan L.

2010-01-01

Background: Carotenoids are multifunctional, taxonomically widespread and biotechnologically important pigments. Their biosynthesis serves as a model system for understanding the evolution of secondary metabolism. Microbial carotenoid diversity and evolution has hitherto been analyzed primarily from structural and biosynthetic perspectives, with the few phylogenetic analyses of microbial carotenoid biosynthetic proteins either using limited datasets or lacking methodological rigor. Given the recent accumulation of microbial genome sequences, a reappraisal of microbial carotenoid biosynthetic diversity and evolution from the perspective of comparative genomics is warranted to validate and complement models of microbial carotenoid diversity and evolution based upon structural and biosynthetic data. Methodology/Principal Findings: Comparative genomics were used to identify and analyze in silico microbial carotenoid biosynthetic pathways. Four major phylogenetic lineages of carotenoid biosynthesis are suggested composed of: (i) Proteobacteria; (ii) Firmicutes; (iii) Chlorobi, Cyanobacteria and photosynthetic eukaryotes; and (iv) Archaea, Bacteroidetes and two separate sub-lineages of Actinobacteria. Using this phylogenetic framework, specific evolutionary mechanisms are proposed for carotenoid desaturase CrtI-family enzymes and carotenoid cyclases. Several phylogenetic lineage-specific evolutionary mechanisms are also suggested, including: (i) horizontal gene transfer; (ii) gene acquisition followed by differential gene loss; (iii) co-evolution with other biochemical structures such as proteorhodopsins; and (iv) positive selection. Conclusions/Significance: Comparative genomics analyses of microbial carotenoid biosynthetic proteins indicate a much greater taxonomic diversity than that identified based on structural and biosynthetic data, and divide microbial carotenoid biosynthesis into several, well-supported phylogenetic lineages not evident previously.
This phylogenetic framework is applicable to understanding the evolution of specific carotenoid biosynthetic proteins or the unique characteristics of carotenoid biosynthetic evolution in a specific phylogenetic lineage. Together, these analyses suggest a “bramble” model for microbial carotenoid biosynthesis whereby later biosynthetic steps exhibit greater evolutionary plasticity and reticulation compared to those closer to the biosynthetic “root”. Structural diversification may be constrained (“trimmed”) where selection is strong, but less so where selection is weaker. These analyses also highlight likely productive avenues for future research and bioprospecting by identifying both gaps in current knowledge and taxa which may particularly facilitate carotenoid diversification. PMID:20582313
Phylogenetic diversity, functional trait diversity and extinction: avoiding tipping points and worst-case losses.

PubMed

Faith, Daniel P

2015-02-19

The phylogenetic diversity measure ('PD') quantifies the relative feature diversity of different subsets of taxa from a phylogeny. At the level of feature diversity, PD supports the broad goal of biodiversity conservation to maintain living variation and option values. PD calculations at the level of lineages and features include those integrating probabilities of extinction, providing estimates of expected PD. This approach has known advantages over the evolutionarily distinct and globally endangered (EDGE) methods. Expected PD methods also have limitations. An alternative notion of expected diversity, expected functional trait diversity, relies on an alternative non-phylogenetic model and allows inferences of diversity at the level of functional traits. Expected PD also faces challenges in helping to address phylogenetic tipping points and worst-case PD losses. Expected PD may not choose conservation options that best avoid worst-case losses of long branches from the tree of life. We can expand the range of useful calculations based on expected PD, including methods for identifying phylogenetic key biodiversity areas. © 2015 The Author(s) Published by the Royal Society. All rights reserved.
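To make the PD and expected PD calculations in this abstract concrete, here is a minimal sketch. It assumes the common formulation in which a branch contributes its length whenever at least one descendant taxon is present, and in which extinctions are independent across taxa. The toy tree, the branch encoding, and the function names are hypothetical.

```python
from math import prod

# Hypothetical toy tree: each branch is (length, set of tips descending from it).
branches = [
    (1.0, {"A"}),
    (1.0, {"B"}),
    (2.0, {"A", "B"}),  # internal branch shared by A and B
    (3.0, {"C"}),
]


def phylo_diversity(branches, taxa):
    """PD of a taxon subset: total length of branches with a descendant in the subset."""
    return sum(length for length, tips in branches if tips & taxa)


def expected_pd(branches, p_extinct):
    """Expected PD, assuming taxa go extinct independently.

    A branch survives unless every one of its descendant tips is lost.
    """
    return sum(
        length * (1.0 - prod(p_extinct[tip] for tip in tips))
        for length, tips in branches
    )


print(phylo_diversity(branches, {"A", "C"}))                       # 1 + 2 + 3
print(expected_pd(branches, {"A": 0.5, "B": 0.5, "C": 0.0}))       # 0.5 + 0.5 + 1.5 + 3
```

The internal branch illustrates the worst-case concern raised above: its expected contribution is high (1.5 of 2.0), yet it is lost entirely in the one scenario where both A and B go extinct.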
Phylogenetic diversity of macromycetes and woody plants along an elevational gradient in Eastern Mexico

Treesearch

Marko Gomez-Hernandez; Guadalupe Williams-Linera; D. Jean Lodge; Roger Guevara; Eduardo Ruiz-Sanchez; Etelvina Gandara

2016-01-01

Phylogenetic information provides insight into the ecological and evolutionary processes that organize species assemblages. We compared patterns of phylogenetic diversity among macromycete and woody plant communities along a steep elevational gradient in eastern Mexico to better understand the evolutionary processes that structure their communities. Macrofungi and...
Phylogenetic analysis of a spontaneous cocoa bean fermentation metagenome reveals new insights into its bacterial and fungal community diversity.

PubMed

Illeghems, Koen; De Vuyst, Luc; Papalexandratou, Zoi; Weckx, Stefan

2012-01-01

This is the first report on the phylogenetic analysis of the community diversity of a single spontaneous cocoa bean box fermentation sample through a metagenomic approach involving 454 pyrosequencing. Several sequence-based and composition-based taxonomic profiling tools were used and evaluated to avoid software-dependent results and their outcome was validated by comparison with previously obtained culture-dependent and culture-independent data. Overall, this approach revealed a wider bacterial (mainly γ-Proteobacteria) and fungal diversity than previously found. Further, the use of a combination of different classification methods, in a software-independent way, helped to understand the actual composition of the microbial ecosystem under study. In addition, bacteriophage-related sequences were found. The bacterial diversity depended partially on the methods used, as composition-based methods predicted a wider diversity than sequence-based methods, and as classification methods based solely on phylogenetic marker genes predicted a more restricted diversity compared with methods that took all reads into account. The metagenomic sequencing analysis identified Hanseniaspora uvarum, Hanseniaspora opuntiae, Saccharomyces cerevisiae, Lactobacillus fermentum, and Acetobacter pasteurianus as the prevailing species. Also, the presence of occasional members of the cocoa bean fermentation process was revealed (such as Erwinia tasmaniensis, Lactobacillus brevis, Lactobacillus casei, Lactobacillus rhamnosus, Lactococcus lactis, Leuconostoc mesenteroides, and Oenococcus oeni). Furthermore, the sequence reads associated with viral communities were of a restricted diversity, dominated by Myoviridae and Siphoviridae, and reflecting Lactobacillus as the dominant host. To conclude, an accurate overview of all members of a cocoa bean fermentation process sample was revealed, indicating the superiority of metagenomic sequencing over previously used techniques.
Comparing Ontogenetic and Phylogenetic Stages of Human Development

ERIC Educational Resources Information Center

Clarken, Rodney H.

2005-01-01

This paper will present evidence to support ontogenetic and phylogenetic parallels and draw from these comparisons to further illuminate our understanding of micro and macro human development. Individual and collective stages of physical, psychological and spiritual development will be compared and their homologous structures examined.…
Fitting models of continuous trait evolution to incompletely sampled comparative data using approximate Bayesian computation.

PubMed

Slater, Graham J; Harmon, Luke J; Wegmann, Daniel; Joyce, Paul; Revell, Liam J; Alfaro, Michael E

2012-03-01

In recent years, a suite of methods has been developed to fit multiple rate models to phylogenetic comparative data. However, most methods have limited utility at broad phylogenetic scales because they typically require complete sampling of both the tree and the associated phenotypic data. Here, we develop and implement a new, tree-based method called MECCA (Modeling Evolution of Continuous Characters using ABC) that uses a hybrid likelihood/approximate Bayesian computation (ABC)-Markov-Chain Monte Carlo approach to simultaneously infer rates of diversification and trait evolution from incompletely sampled phylogenies and trait data. We demonstrate via simulation that MECCA has considerable power to choose among single versus multiple evolutionary rate models, and thus can be used to test hypotheses about changes in the rate of trait evolution across an incomplete tree of life. We finally apply MECCA to an empirical example of body size evolution in carnivores, and show that there is no evidence for an elevated rate of body size evolution in the pinnipeds relative to terrestrial carnivores. ABC approaches can provide a useful alternative set of tools for future macroevolutionary studies where likelihood-dependent approaches are lacking. © 2011 The Author(s). Evolution© 2011 The Society for the Study of Evolution.
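The ABC idea behind this abstract can be sketched with a plain rejection sampler. The example below is not MECCA, which uses a hybrid likelihood/ABC-MCMC scheme on real phylogenies. It assumes a star tree of unit depth, a flat prior on the Brownian rate, and the sample variance of tip values as the only summary statistic; all names and numeric settings are illustrative.

```python
import random
from statistics import mean, variance


def simulate_tips(sigma2, n_tips, rng):
    """Tip values after unit time of Brownian motion on a star tree (independent tips)."""
    sd = sigma2 ** 0.5
    return [rng.gauss(0.0, sd) for _ in range(n_tips)]


def abc_rejection(obs_var, n_tips, n_draws=5000, tol=0.3, seed=1):
    """Keep prior draws whose simulated summary statistic lands within tol of the observed one."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        sigma2 = rng.uniform(0.0, 10.0)  # assumed flat prior on the rate
        sim_var = variance(simulate_tips(sigma2, n_tips, rng))
        if abs(sim_var - obs_var) < tol:
            accepted.append(sigma2)
    return accepted


posterior = abc_rejection(obs_var=2.0, n_tips=50)
print(len(posterior), mean(posterior))
```

The accepted draws approximate the posterior of the rate without ever evaluating a likelihood, which is the property that makes ABC attractive when sampling is incomplete and no tractable likelihood is available.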
A comparison of the community diversity of foliar fungal endophytes between seedling and adult loblolly pines (Pinus taeda)

PubMed Central

Oono, Ryoko; Lefèvre, Emilie; Simha, Anita; Lutzoni, François

2015-01-01

Fungal endophytes represent one of the most ubiquitous plant symbionts on Earth and are phylogenetically diverse. The structure and diversity of endophyte communities have been shown to depend on host taxa and climate, but there have been relatively few studies exploring endophyte communities throughout host maturity. We compared foliar fungal endophyte communities between seedlings and adult trees of loblolly pines (Pinus taeda) at the same seasons and locations by culturing and culture-independent methods. We sequenced the internal transcribed spacer region and adjacent partial large subunit nuclear ribosomal RNA gene (ITS–LSU amplicon) to delimit operational taxonomic units (OTUs) and phylogenetically characterize the communities. Despite the lower infection frequency in seedlings compared to adult trees, seedling needles were receptive to a more diverse community of fungal endophytes. The culture-free method confirmed the presence of commonly cultured OTUs from adult needles but revealed several new OTUs from seedling needles that were not found with culturing methods. The two most commonly cultured OTUs in adults were rarely cultured from seedlings, suggesting that host age is correlated with a selective enrichment for specific endophytes. This shift in endophyte species dominance may be indicative of a functional change between these fungi and their loblolly pine hosts. PMID:26399186
Computing prokaryotic gene ubiquity: rescuing the core from extinction.

PubMed

Charlebois, Robert L; Doolittle, W Ford

2004-12-01

The genomic core concept has found several uses in comparative and evolutionary genomics. Defined as the set of all genes common to (ubiquitous among) all genomes in a phylogenetically coherent group, core size decreases as the number and phylogenetic diversity of the relevant group increases. Here, we focus on methods for defining the size and composition of the core of all genes shared by sequenced genomes of prokaryotes (Bacteria and Archaea). There are few (almost certainly less than 50) genes shared by all of the 147 genomes compared, surely insufficient to conduct all essential functions. Sequencing and annotation errors are responsible for the apparent absence of some genes, while very limited but genuine disappearances (from just one or a few genomes) can account for several others. Core size will continue to decrease as more genome sequences appear, unless the requirement for ubiquity is relaxed. Such relaxation seems consistent with any reasonable biological purpose for seeking a core, but it renders the problem of definition more problematic. We propose an alternative approach (the phylogenetically balanced core), which preserves some of the biological utility of the core concept. Cores, however delimited, preferentially contain informational rather than operational genes; we present a new hypothesis for why this might be so.
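The effect of relaxing the ubiquity requirement, as discussed in this abstract, can be shown with a small sketch. It implements only a simple frequency threshold, not the authors' phylogenetically balanced core. The genome names and gene sets are hypothetical toy data, and `min_fraction` is an assumed parameter name.

```python
def core_genes(genomes, min_fraction=1.0):
    """Return genes present in at least min_fraction of the genomes.

    min_fraction=1.0 gives the strict core (genes found in every genome).
    """
    counts = {}
    for gene_set in genomes.values():
        for gene in gene_set:
            counts[gene] = counts.get(gene, 0) + 1
    needed = min_fraction * len(genomes)
    return {gene for gene, count in counts.items() if count >= needed}


# Hypothetical toy data: recA is "missing" from one genome, e.g. through an annotation error.
genomes = {
    "genome1": {"rpoB", "recA", "lacZ"},
    "genome2": {"rpoB", "recA"},
    "genome3": {"rpoB", "recA"},
    "genome4": {"rpoB"},
}

print(sorted(core_genes(genomes)))                     # strict core
print(sorted(core_genes(genomes, min_fraction=0.75)))  # relaxed core
```

With the strict definition, a single absence removes a gene from the core, so the core can only shrink as genomes are added. The relaxed threshold tolerates isolated absences, but a plain fraction ignores how the genomes are distributed phylogenetically.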
Evolution of thermotolerance in hot spring cyanobacteria of the genus Synechococcus

NASA Technical Reports Server (NTRS)

Miller, S. R.; Castenholz, R. W.

2000-01-01

The extension of ecological tolerance limits may be an important mechanism by which microorganisms adapt to novel environments, but it may come at the evolutionary cost of reduced performance under ancestral conditions. We combined a comparative physiological approach with phylogenetic analyses to study the evolution of thermotolerance in hot spring cyanobacteria of the genus Synechococcus. Among the 20 laboratory clones of Synechococcus isolated from collections made along an Oregon hot spring thermal gradient, four different 16S rRNA gene sequences were identified. Phylogenies constructed by using the sequence data indicated that the clones were polyphyletic but that three of the four sequence groups formed a clade. Differences in thermotolerance were observed for clones with different 16S rRNA gene sequences, and comparison of these physiological differences within a phylogenetic framework provided evidence that more thermotolerant lineages of Synechococcus evolved from less thermotolerant ancestors. The extension of the thermal limit in these bacteria was correlated with a reduction in the breadth of the temperature range for growth, which provides evidence that enhanced thermotolerance has come at the evolutionary cost of increased thermal specialization. This study illustrates the utility of using phylogenetic comparative methods to investigate how evolutionary processes have shaped historical patterns of ecological diversification in microorganisms.
Comparison of microbial taxonomic and functional shift pattern along contamination gradient.

PubMed

Ren, Youhua; Niu, Jiaojiao; Huang, Wenkun; Peng, Deliang; Xiao, Yunhua; Zhang, Xian; Liang, Yili; Liu, Xueduan; Yin, Huaqun

2016-06-14

The interaction mechanism between microbial communities and environment is a key issue in microbial ecology. Microbial communities usually change significantly under environmental stress, which has been studied both phylogenetically and functionally; however, which method is more effective in assessing the relationship between microbial community shifts and environmental changes remains controversial. By comparing the microbial taxonomic and functional shift patterns along a heavy metal contamination gradient, we found that both sedimentary composition and function shifted significantly along the contamination gradient. For example, the relative abundance of Geobacter and Fusibacter decreased along the contamination gradient (from high to low), while Janthinobacterium and Arthrobacter increased their abundances. Most genes involved in heavy metal resistance (e.g., metc, aoxb and mer) showed higher intensity in sites with higher concentration of heavy metals. The two shift patterns were correlated: functional and phylogenetic β-diversities were significantly correlated, and many heavy metal resistance genes were derived from Geobacter, explaining their high abundance in heavily contaminated sites. However, there was a stronger link between functional composition and environmental drivers, while stochasticity played an important role in the formation and succession of phylogenetic composition, as demonstrated by a null model test. Overall, our research suggested that the responses of functional traits depended more on environmental changes, while stochasticity played an important role in the formation and succession of phylogenetic composition for microbial communities. Profiling microbial functional composition therefore seems more appropriate for studying the relationship between microbial communities and environment, as well as for exploring the adaptation and remediation mechanisms of microbial communities under heavy metal contamination.

Toward the resolution of an explosive radiation--a multilocus phylogeny of oceanic dolphins (Delphinidae).

PubMed

McGowen, Michael R

2011-09-01

Oceanic dolphins (Delphinidae) are the product of a rapid radiation that yielded ∼36 extant species of small to medium-sized cetaceans that first emerged in the Late Miocene. Although they are a charismatic group of organisms that have become poster children for marine conservation, many phylogenetic relationships within Delphinidae remain elusive due to the slow molecular evolution of the group and the difficulty of resolving short branches from successive cladogenic events. Here I combine existing and newly generated sequences from four mitochondrial (mt) genes and 20 nuclear (nu) genes to reconstruct a well-supported phylogenetic hypothesis for Delphinidae. This study compares maximum-likelihood and Bayesian inference methods of several data sets including mtDNA, combined nuDNA, gene trees of individual nuDNA loci, and concatenated mtDNA+nuDNA. In addition, I contrast these standard phylogenetic analyses with the species tree reconstruction method of Bayesian concordance analysis (BCA). Despite finding discordance between mtDNA and individual nuDNA loci, the concatenated matrix recovers a completely resolved and robustly supported phylogeny that is also broadly congruent with BCA trees. This study strongly supports groupings such as Delphininae, Lissodelphininae, Globicephalinae, Sotalia+Delphininae, Steno+Orcaella+Globicephalinae, and Leucopleurus acutus, Lagenorhynchus albirostris, and Orcinus orca as basal delphinid taxa. Copyright © 2011 Elsevier Inc. All rights reserved.
SWPhylo - A Novel Tool for Phylogenomic Inferences by Comparison of Oligonucleotide Patterns and Integration of Genome-Based and Gene-Based Phylogenetic Trees.

PubMed

Yu, Xiaoyu; Reva, Oleg N

2018-01-01

Modern phylogenetic studies may benefit from the analysis of complete genome sequences of various microorganisms. Evolutionary inferences based on genome-scale analysis are believed to be more accurate than the gene-based alternative. However, the computational complexity of current phylogenomic procedures, inappropriateness of standard phylogenetic tools to process genome-wide data, and lack of reliable substitution models which correlates with alignment-free phylogenomic approaches deter microbiologists from using these opportunities. For example, the super-matrix and super-tree approaches of phylogenomics use multiple integrated genomic loci or individual gene-based trees to infer an overall consensus tree. However, these approaches potentially multiply errors of gene annotation and sequence alignment, not to mention the computational complexity and laboriousness of the methods. In this article, we demonstrate that the annotation- and alignment-free comparison of genome-wide tetranucleotide frequencies, termed oligonucleotide usage patterns (OUPs), allowed a fast and reliable inference of phylogenetic trees. These were congruent to the corresponding whole genome super-matrix trees in terms of tree topology when compared with other known approaches including 16S ribosomal RNA and GyrA protein sequence comparison, complete genome-based MAUVE, and CVTree methods. A Web-based program to perform the alignment-free OUP-based phylogenomic inferences was implemented at http://swphylo.bi.up.ac.za/. Applicability of the tool was tested on different taxa from subspecies to intergeneric levels. Distinguishing between closely related taxonomic units may be enforced by providing the program with alignments of marker protein sequences, e.g., GyrA.
SWPhylo – A Novel Tool for Phylogenomic Inferences by Comparison of Oligonucleotide Patterns and Integration of Genome-Based and Gene-Based Phylogenetic Trees

PubMed Central

Yu, Xiaoyu; Reva, Oleg N

2018-01-01

Modern phylogenetic studies may benefit from the analysis of complete genome sequences of various microorganisms. Evolutionary inferences based on genome-scale analysis are believed to be more accurate than the gene-based alternative. However, the computational complexity of current phylogenomic procedures, inappropriateness of standard phylogenetic tools to process genome-wide data, and lack of reliable substitution models which correlates with alignment-free phylogenomic approaches deter microbiologists from using these opportunities. For example, the super-matrix and super-tree approaches of phylogenomics use multiple integrated genomic loci or individual gene-based trees to infer an overall consensus tree. However, these approaches potentially multiply errors of gene annotation and sequence alignment, not to mention the computational complexity and laboriousness of the methods. In this article, we demonstrate that the annotation- and alignment-free comparison of genome-wide tetranucleotide frequencies, termed oligonucleotide usage patterns (OUPs), allowed a fast and reliable inference of phylogenetic trees. These were congruent to the corresponding whole genome super-matrix trees in terms of tree topology when compared with other known approaches including 16S ribosomal RNA and GyrA protein sequence comparison, complete genome-based MAUVE, and CVTree methods. A Web-based program to perform the alignment-free OUP-based phylogenomic inferences was implemented at http://swphylo.bi.up.ac.za/. Applicability of the tool was tested on different taxa from subspecies to intergeneric levels. Distinguishing between closely related taxonomic units may be enforced by providing the program with alignments of marker protein sequences, e.g., GyrA. PMID:29511354
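The alignment-free comparison of tetranucleotide frequencies described in this abstract can be sketched as follows. The abstract does not specify how SWPhylo normalizes oligonucleotide usage patterns or which distance it uses, so the relative frequencies and Euclidean distance below are assumptions, and the function names and sequences are hypothetical.

```python
from itertools import product
from math import sqrt

# All 256 tetranucleotides in a fixed order, so frequency vectors are comparable.
KMERS = ["".join(letters) for letters in product("ACGT", repeat=4)]


def tetra_freqs(seq):
    """Relative frequency of each tetranucleotide in seq, using overlapping windows."""
    counts = dict.fromkeys(KMERS, 0)
    seq = seq.upper()
    for i in range(len(seq) - 3):
        word = seq[i:i + 4]
        if word in counts:  # skip windows containing N or other ambiguity codes
            counts[word] += 1
    total = sum(counts.values())
    return [counts[k] / total if total else 0.0 for k in KMERS]


def euclidean(u, v):
    """Euclidean distance between two frequency vectors (an assumed choice of metric)."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


genome_a = "ACGTACGTACGTTTGACA"
genome_b = "ACGTACGAACGTTTGACA"
print(euclidean(tetra_freqs(genome_a), tetra_freqs(genome_b)))
```

A matrix of such pairwise distances could then be passed to any distance-based tree-building method, such as neighbor joining, without annotation or alignment.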
Molecular evolution of ependymin and the phylogenetic resolution of early divergences among euteleost fishes.

PubMed

Ortí, G; Meyer, A

1996-04-01

The rate and pattern of DNA evolution of ependymin, a single-copy gene coding for a highly expressed glycoprotein in the brain matrix of teleost fishes, is characterized and its phylogenetic utility for fish systematics is assessed. DNA sequences were determined from catfish, electric fish, and characiforms and compared with published ependymin sequences from cyprinids, salmon, pike, and herring. Among these groups, ependymin amino acid sequences were highly divergent (up to 60% sequence difference), but had surprisingly similar hydropathy profiles and invariant glycosylation sites, suggesting that functional properties of the proteins are conserved. Comparison of base composition at third codon positions and introns revealed AT-rich introns and GC-rich third codon positions, suggesting that the biased codon usage observed might not be due to mutational bias. Phylogenetic information content of third codon positions was surprisingly high and sufficient to recover the most basal nodes of the tree, in spite of the observation that pairwise distances (at third codon positions) were well above the presumed saturation level. This finding can be explained by the high proportion of phylogenetically informative nonsynonymous changes at third codon positions among these highly divergent proteins. Ependymin DNA sequences have established the first molecular evidence for the monophyly of a group containing salmonids and esociforms. In addition, ependymin suggests a sister group relationship of electric fish (Gymnotiformes) and Characiformes, constituting a significant departure from currently accepted classifications. However, relationships among characiform lineages were not completely resolved by ependymin sequences in spite of seemingly appropriate levels of variation among taxa and considerably low levels of homoplasy in the data (consistency index = 0.7). 
If the diversification of Characiformes took place in an "explosive" manner, over a relatively short period of time, this pattern should also be observed using other phylogenetic markers. Poor conservation of ependymin's primary structure hinders the design of efficient primers for PCR that could be used in wide-ranging fish systematic studies. However, alternative methods like PCR amplification from cDNA used here should provide promising comparative sequence data for the resolution of phylogenetic relationships among other basal lineages of teleost fishes.
Pollinator shifts as triggers of speciation in painted petal irises (Lapeirousia: Iridaceae)

PubMed Central

Forest, Félix; Goldblatt, Peter; Manning, John C.; Baker, David; Colville, Jonathan F.; Devey, Dion S.; Jose, Sarah; Kaye, Maria; Buerki, Sven

2014-01-01

Background and Aims Adaptation to different pollinators has been hypothesized as one of the main factors promoting the formation of new species in the Cape region of South Africa. Other researchers favour alternative causes such as shifts in edaphic preferences. Using a phylogenetic framework and taking into consideration the biogeographical scenario explaining the distribution of the group as well as the distribution of pollinators, this study compares pollination strategies with substrate adaptations to develop hypotheses of the primary factors leading to speciation in Lapeirousia (Iridaceae), a genus of corm-bearing geophytes well represented in the Cape and presenting an important diversity of pollination syndromes and edaphic preferences. Methods Phylogenetic relationships are reconstructed within Lapeirousia using nuclear and plastid DNA sequence data. State-of-the-art methods in biogeography, divergence time estimation, character optimization and diversification rate assessments are used to examine the evolution of pollination syndromes and substrate shifts in the history of the group. Based on the phylogenetic results, ecological factors are compared for nine sister species pairs in Lapeirousia. Key Results Seventeen pollinator shifts and ten changes in substrate types were inferred during the evolution of the genus Lapeirousia. Of the nine species pairs examined, all show divergence in pollination syndromes, while only four pairs present different substrate types. Conclusions The available evidence points to a predominant influence of pollinator shifts over substrate types on the speciation process within Lapeirousia, contrary to previous studies that favoured a more important role for edaphic factors in these processes. This work also highlights the importance of biogeographical patterns in the study of pollination syndromes. PMID:24323246
Comparative methods for the analysis of gene-expression evolution: an example using yeast functional genomic data.

PubMed

Oakley, Todd H; Gu, Zhenglong; Abouheif, Ehab; Patel, Nipam H; Li, Wen-Hsiung

2005-01-01

Understanding the evolution of gene function is a primary challenge of modern evolutionary biology. Despite an expanding database from genomic and developmental studies, we are lacking quantitative methods for analyzing the evolution of some important measures of gene function, such as gene-expression patterns. Here, we introduce phylogenetic comparative methods to compare different models of gene-expression evolution in a maximum-likelihood framework. We find that expression of duplicated genes has evolved according to a nonphylogenetic model, where closely related genes are no more likely than more distantly related genes to share common expression patterns. These results are consistent with previous studies that found rapid evolution of gene expression during the history of yeast. The comparative methods presented here are general enough to test a wide range of evolutionary hypotheses using genomic-scale data from any organism.
The phylogenetic distribution of extrafloral nectaries in plants

PubMed Central

Weber, Marjorie G.; Keeler, Kathleen H.

2013-01-01

Background and Aims Understanding the evolutionary patterns of ecologically relevant traits is a central goal in plant biology. However, for most important traits, we lack the comprehensive understanding of their taxonomic distribution needed to evaluate their evolutionary mode and tempo across the tree of life. Here we evaluate the broad phylogenetic patterns of a common plant-defence trait found across vascular plants: extrafloral nectaries (EFNs), plant glands that secrete nectar and are located outside the flower. EFNs typically defend plants indirectly by attracting invertebrate predators who reduce herbivory. Methods Records of EFNs published over the last 135 years were compiled. After accounting for changes in taxonomy, phylogenetic comparative methods were used to evaluate patterns of EFN evolution, using a phylogeny of over 55 000 species of vascular plants. Using comparisons of parametric and non-parametric models, the true number of species with EFNs likely to exist beyond the current list was estimated. Key Results To date, EFNs have been reported in 3941 species representing 745 genera in 108 families, about 1–2 % of vascular plant species and approx. 21 % of families. They are found in 33 of 65 angiosperm orders. Foliar nectaries are known in four of 36 fern families. Extrafloral nectaries are unknown in early angiosperms, magnoliids and gymnosperms. They occur throughout monocotyledons, yet most EFNs are found within eudicots, with the bulk of species with EFNs being rosids. Phylogenetic analyses strongly support the repeated gain and loss of EFNs across plant clades, especially in more derived dicot families, and suggest that EFNs are found in a minimum of 457 independent lineages. However, model selection methods estimate that the number of unreported cases of EFNs may be as high as the number of species already reported. 
Conclusions EFNs are widespread and evolutionarily labile traits that have repeatedly evolved a remarkable number of times in vascular plants. Our current understanding of the phylogenetic patterns of EFNs makes them powerful candidates for future work exploring the drivers of their evolutionary origins, shifts, and losses. PMID:23087129
Phylogenetic diversity and biodiversity indices on phylogenetic networks.

PubMed

Wicke, Kristina; Fischer, Mareike

2018-04-01

In biodiversity conservation it is often necessary to prioritize the species to conserve. Existing approaches to prioritization, e.g. the Fair Proportion Index and the Shapley Value, are based on phylogenetic trees and rank species according to their contribution to overall phylogenetic diversity. However, in many cases evolution is not treelike and thus, phylogenetic networks have been developed as a generalization of phylogenetic trees, allowing for the representation of non-treelike evolutionary events, such as hybridization. Here, we extend the concepts of phylogenetic diversity and phylogenetic diversity indices from phylogenetic trees to phylogenetic networks. On the one hand, we consider the treelike content of a phylogenetic network, e.g. the (multi)set of phylogenetic trees displayed by a network and the so-called lowest stable ancestor tree associated with it. On the other hand, we derive the phylogenetic diversity of subsets of taxa and biodiversity indices directly from the internal structure of the network. We consider both approaches that are independent of so-called inheritance probabilities as well as approaches that explicitly incorporate these probabilities. Furthermore, we introduce our software package NetDiversity, which is implemented in Perl and allows for the calculation of all generalized measures of phylogenetic diversity and generalized phylogenetic diversity indices established in this note that are independent of inheritance probabilities. We apply our methods to a phylogenetic network representing the evolutionary relationships among swordtails and platyfishes (Xiphophorus: Poeciliidae), a group of species characterized by widespread hybridization. Copyright © 2018 Elsevier Inc. All rights reserved.
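For readers unfamiliar with the Fair Proportion Index mentioned above, the following Python sketch computes it on an ordinary rooted tree: each branch length is shared equally among the leaves that descend from that branch. The tree, its branch lengths, and the function name are hypothetical, and the sketch covers only the tree case, not the network generalizations developed in the paper.

```python
def fair_proportion(tree, root):
    """Return the Fair Proportion score of every leaf.

    `tree` maps each internal node to a list of (child, branch_length) pairs.
    """
    fp = {}

    def visit(node):
        # Returns the list of leaves at or below `node`.
        children = tree.get(node, [])
        if not children:
            fp.setdefault(node, 0.0)
            return [node]
        leaves_below = []
        for child, length in children:
            leaves = visit(child)
            share = length / len(leaves)  # split the branch equally
            for leaf in leaves:
                fp[leaf] += share
            leaves_below.extend(leaves)
        return leaves_below

    visit(root)
    return fp


# Hypothetical three-taxon tree: ((A:2, B:2):1, C:3)
example_tree = {
    "root": [("AB", 1.0), ("C", 3.0)],
    "AB": [("A", 2.0), ("B", 2.0)],
}
scores = fair_proportion(example_tree, "root")
```

In this example A and B each score 2.5 and C scores 3.0, and the scores sum to 8.0, the total branch length (phylogenetic diversity) of the tree.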
Comparison of cluster-based and source-attribution methods for estimating transmission risk using large HIV sequence databases.

PubMed

Le Vu, Stéphane; Ratmann, Oliver; Delpech, Valerie; Brown, Alison E; Gill, O Noel; Tostevin, Anna; Fraser, Christophe; Volz, Erik M

2018-06-01

Phylogenetic clustering of HIV sequences from a random sample of patients can reveal epidemiological transmission patterns, but interpretation is hampered by limited theoretical support, and the statistical properties of clustering analysis remain poorly understood. Alternatively, source attribution methods allow fitting of HIV transmission models and thereby quantify aspects of disease transmission. A simulation study was conducted to assess error rates of clustering methods for detecting transmission risk factors. We modeled HIV epidemics among men having sex with men and generated phylogenies comparable to those that can be obtained from HIV surveillance data in the UK. Clustering and source attribution approaches were applied to evaluate their ability to identify patient attributes as transmission risk factors. We find that commonly used methods show a misleading association between cluster size or odds of clustering and covariates that are correlated with time since infection, regardless of their influence on transmission. Clustering methods usually have higher error rates and lower sensitivity than the source attribution method for identifying transmission risk factors. However, neither method provides robust estimates of transmission risk ratios. The source attribution method can alleviate drawbacks from phylogenetic clustering, but formal population genetic modeling may be required to estimate quantitative transmission risk factors. Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.
Phylogenetic Properties of RNA Viruses

PubMed Central

Pompei, Simone; Loreto, Vittorio; Tria, Francesca

2012-01-01

A new word, phylodynamics, was coined to emphasize the interconnection between phylogenetic properties, as observed for instance in a phylogenetic tree, and the epidemic dynamics of viruses, where selection, mediated by the host immune response, and transmission play a crucial role. The challenges faced when investigating the evolution of RNA viruses call for a virtuous loop of data collection, data analysis and modeling. This has already resulted both in the collection of massive sequence databases and in the formulation of hypotheses on the main mechanisms driving qualitative differences observed in the (reconstructed) evolutionary patterns of different RNA viruses. Qualitatively, it has been observed that selection driven by the host immune response induces an uneven survival ability among co-existing strains. As a consequence, the imbalance level of the phylogenetic tree is manifestly more pronounced than in the case when the interaction with the host immune system does not play a central role in the evolutionary dynamics. While many imbalance metrics have been introduced, reliable methods to discriminate quantitatively between different levels of imbalance are still lacking. In our work, we reconstruct and analyze the phylogenetic trees of six RNA viruses, with a special emphasis on the human Influenza A virus, due to its relevance for vaccine preparation as well as for the theoretical challenges it poses due to its peculiar evolutionary dynamics. We focus in particular on topological properties. We point out the limitations of standard imbalance metrics, and we introduce a new methodology with which we assign the correct imbalance level of the phylogenetic trees, in agreement with the phylodynamics of the viruses. 
Our thorough quantitative analysis allows for a deeper understanding of the evolutionary dynamics of the considered RNA viruses, which is crucial in order to provide a valuable framework for a quantitative assessment of theoretical predictions. PMID:23028645
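To make "imbalance metric" concrete, here is a Python sketch of the Colless index, one widely used standard measure: the sum, over all internal nodes of a rooted binary tree, of the absolute difference between the leaf counts of the two subtrees. The abstract does not name the metrics it examined, so choosing Colless here is an assumption, and this is not the new methodology the authors propose. Trees are written as nested 2-tuples with string leaves, a representation chosen only for this example.

```python
def colless(tree):
    """Return (number_of_leaves, Colless index) for a rooted binary tree."""
    if isinstance(tree, str):  # a leaf
        return 1, 0
    left, right = tree
    n_left, i_left = colless(left)
    n_right, i_right = colless(right)
    return n_left + n_right, i_left + i_right + abs(n_left - n_right)


# A fully unbalanced ("caterpillar") tree and a balanced tree on four tips.
caterpillar = ((("A", "B"), "C"), "D")
balanced = (("A", "B"), ("C", "D"))

caterpillar_index = colless(caterpillar)[1]
balanced_index = colless(balanced)[1]
```

The caterpillar tree gives 3, the maximum for four tips, while the balanced tree gives 0.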
Entire plastid phylogeny of the carrot genus (Daucus, Apiaceae): Concordance with nuclear data and mitochondrial and nuclear DNA insertions to the plastid.

PubMed

Spooner, David M; Ruess, Holly; Iorizzo, Massimo; Senalik, Douglas; Simon, Philipp

2017-02-01

We explored the phylogenetic utility of entire plastid DNA sequences in Daucus and compared the results with prior phylogenetic results using plastid and nuclear DNA sequences. We used Illumina sequencing to obtain full plastid sequences of 37 accessions of 20 Daucus taxa and outgroups, analyzed the data with phylogenetic methods, and examined evidence for mitochondrial DNA transfer to the plastid (DcMP). Our phylogenetic trees of the entire data set were highly resolved, with 100% bootstrap support for most of the external and many of the internal clades, except for the clade of D. carota and its most closely related species D. syrticus. Subsets of the data, including regions traditionally used as phylogenetically informative regions, provide various degrees of soft congruence with the entire data set. There are areas of hard incongruence, however, with phylogenies using nuclear data. We extended knowledge of a mitochondrial to plastid DNA insertion sequence previously named DcMP and identified the first instance in flowering plants of a sequence of potential nuclear genome origin inserted into the plastid genome. There is a relationship of inverted repeat junction classes and repeat DNA to phylogeny, but no such relationship with nonsynonymous mutations. Our data have allowed us to (1) produce a well-resolved plastid phylogeny of Daucus, (2) evaluate subsets of the entire plastid data for phylogeny, (3) examine evidence for plastid and nuclear DNA phylogenetic incongruence, and (4) examine mitochondrial and nuclear DNA insertion into the plastid. © 2017 Spooner et al. Published by the Botanical Society of America. This work is licensed under a Creative Commons public domain license (CC0 1.0).
On the use of cartographic projections in visualizing phylogenetic tree space

PubMed Central

2010-01-01

Phylogenetic analysis is becoming an increasingly important tool for biological research. Applications include epidemiological studies, drug development, and evolutionary analysis. Phylogenetic search is a known NP-Hard problem. The size of the data sets which can be analyzed is limited by the exponential growth in the number of trees that must be considered as the problem size increases. A better understanding of the problem space could lead to better methods, which in turn could lead to the feasible analysis of more data sets. We present a definition of phylogenetic tree space and a visualization of this space that shows significant exploitable structure. This structure can be used to develop search methods capable of handling much larger data sets. PMID:20529355
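The growth in the number of candidate trees can be made concrete with a short calculation. For n labeled taxa there are (2n-5)!! = 1 x 3 x 5 x ... x (2n-5) distinct unrooted binary tree topologies, a standard combinatorial result that is not stated in the abstract itself. The function name below is chosen only for this sketch.

```python
def num_unrooted_binary_trees(n_taxa):
    """Number of unrooted binary topologies on n labeled taxa: (2n-5)!!."""
    if n_taxa < 3:
        return 1
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n-5.
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count


for n in (4, 5, 10, 20):
    print(n, num_unrooted_binary_trees(n))
```

Ten taxa already admit 2,027,025 topologies, which is why exhaustive search becomes infeasible so quickly and why structure in tree space is worth exploiting.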
A new algorithm to construct phylogenetic networks from trees.

PubMed

Wang, J

2014-03-06

Developing appropriate methods for constructing phylogenetic networks from tree sets is an important problem, and much research is currently being undertaken in this area. BIMLR is an algorithm that constructs phylogenetic networks from tree sets. The algorithm can construct a much simpler network than other available methods. Here, we introduce an improved version of the BIMLR algorithm, QuickCass. QuickCass changes the selection strategy of the labels of leaves below the reticulate nodes, i.e., the nodes with an indegree of at least 2 in BIMLR. We show that QuickCass can construct simpler phylogenetic networks than BIMLR. Furthermore, we show that QuickCass is a polynomial-time algorithm when the output network that is constructed by QuickCass is binary.
Independent contrasts and PGLS regression estimators are equivalent.

PubMed

Blomberg, Simon P; Lefevre, James G; Wells, Jessie A; Waterhouse, Mary

2012-05-01

We prove that the slope parameter of the ordinary least squares regression of phylogenetically independent contrasts (PICs) conducted through the origin is identical to the slope parameter of the method of generalized least squares (GLSs) regression under a Brownian motion model of evolution. This equivalence has several implications: 1. Understanding the structure of the linear model for GLS regression provides insight into when and why phylogeny is important in comparative studies. 2. The limitations of the PIC regression analysis are the same as the limitations of the GLS model. In particular, phylogenetic covariance applies only to the response variable in the regression and the explanatory variable should be regarded as fixed. Calculation of PICs for explanatory variables should be treated as a mathematical idiosyncrasy of the PIC regression algorithm. 3. Since the GLS estimator is the best linear unbiased estimator (BLUE), the slope parameter estimated using PICs is also BLUE. 4. If the slope is estimated using different branch lengths for the explanatory and response variables in the PIC algorithm, the estimator is no longer the BLUE, so this is not recommended. Finally, we discuss whether or not and how to accommodate phylogenetic covariance in regression analyses, particularly in relation to the problem of phylogenetic uncertainty. This discussion is from both frequentist and Bayesian perspectives.
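The stated equivalence can be checked numerically on a toy case. The Python sketch below computes the slope from standardized contrasts regressed through the origin and, separately, the GLS slope with an intercept under a Brownian motion covariance matrix. The three-tip tree ((A:1,B:1):1,C:2), the trait values, the nested-tuple tree encoding, and all function names are hypothetical choices for this illustration; it is a numerical check, not the authors' proof.

```python
def solve(matrix, rhs):
    """Solve matrix @ v = rhs by Gauss-Jordan elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def contrasts(node, x, y, out):
    """Felsenstein's algorithm: append standardized (x, y) contrasts to out."""
    label, length = node
    if isinstance(label, str):  # tip
        return x[label], y[label], length
    xl, yl, vl = contrasts(label[0], x, y, out)
    xr, yr, vr = contrasts(label[1], x, y, out)
    scale = (vl + vr) ** 0.5
    out.append(((xl - xr) / scale, (yl - yr) / scale))
    wl, wr = 1 / vl, 1 / vr  # weights for the ancestral estimate
    x_anc = (wl * xl + wr * xr) / (wl + wr)
    y_anc = (wl * yl + wr * yr) / (wl + wr)
    return x_anc, y_anc, length + vl * vr / (vl + vr)


def pic_slope(tree, x, y):
    """OLS slope of y-contrasts on x-contrasts, forced through the origin."""
    pairs = []
    contrasts(tree, x, y, pairs)
    return sum(cx * cy for cx, cy in pairs) / sum(cx * cx for cx, _ in pairs)


def covariance(node, depth, cov):
    """Fill cov with shared root-to-ancestor path lengths; return the tips."""
    label, length = node
    depth += length
    if isinstance(label, str):
        cov[(label, label)] = depth
        return [label]
    left = covariance(label[0], depth, cov)
    right = covariance(label[1], depth, cov)
    for i in left:
        for j in right:
            cov[(i, j)] = cov[(j, i)] = depth
    return left + right


def gls_slope(tree, x, y):
    """GLS slope (model with intercept) under Brownian motion covariance C."""
    cov = {}
    tips = covariance(tree, 0.0, cov)
    c = [[cov[(i, j)] for j in tips] for i in tips]
    xs, ys = [x[t] for t in tips], [y[t] for t in tips]
    w1, wx, wy = solve(c, [1.0] * len(tips)), solve(c, xs), solve(c, ys)
    s11, s12 = sum(w1), sum(wx)
    s22 = sum(a * b for a, b in zip(xs, wx))
    t1, t2 = sum(wy), sum(a * b for a, b in zip(xs, wy))
    return (s11 * t2 - s12 * t1) / (s11 * s22 - s12 ** 2)


# A node is (label_or_children, branch_length); the root has length 0.
ab = ((("A", 1.0), ("B", 1.0)), 1.0)
tree = ((ab, ("C", 2.0)), 0.0)
x = {"A": 1.0, "B": 3.0, "C": 6.0}
y = {"A": 2.0, "B": 3.0, "C": 7.0}
print(pic_slope(tree, x, y), gls_slope(tree, x, y))
```

Working this example by hand gives 43/46 (about 0.9348) by both routes, so the two printed values should agree up to floating-point rounding.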
Phylogenetic Placement of Exact Amplicon Sequences Improves Associations with Clinical Information

PubMed Central

McDonald, Daniel; Gonzalez, Antonio; Navas-Molina, Jose A.; Jiang, Lingjing; Xu, Zhenjiang Zech; Winker, Kevin; Kado, Deborah M.; Orwoll, Eric; Manary, Mark; Mirarab, Siavash

2018-01-01

ABSTRACT Recent algorithmic advances in amplicon-based microbiome studies enable the inference of exact amplicon sequence fragments. These new methods enable the investigation of sub-operational taxonomic units (sOTU) by removing erroneous sequences. However, short (e.g., 150-nucleotide [nt]) DNA sequence fragments do not contain sufficient phylogenetic signal to reproduce a reasonable tree, introducing a barrier in the utilization of critical phylogenetically aware metrics such as Faith’s PD or UniFrac. Although fragment insertion methods do exist, those methods have not been tested for sOTUs from high-throughput amplicon studies in insertions against a broad reference phylogeny. We benchmarked the SATé-enabled phylogenetic placement (SEPP) technique explicitly against 16S V4 sequence fragments and showed that it outperforms the conceptually problematic but often-used practice of reconstructing de novo phylogenies. In addition, we provide a BSD-licensed QIIME2 plugin (https://github.com/biocore/q2-fragment-insertion) for SEPP and integration into the microbial study management platform QIITA. IMPORTANCE The move from OTU-based to sOTU-based analysis, while providing additional resolution, also introduces computational challenges. We demonstrate that one popular method of dealing with sOTUs (building a de novo tree from the short sequences) can provide incorrect results in human gut metagenomic studies and show that phylogenetic placement of the new sequences with SEPP resolves this problem while also yielding other benefits over existing methods. PMID:29719869
Accurate Phylogenetic Tree Reconstruction from Quartets: A Heuristic Approach

PubMed Central

Reaz, Rezwana; Bayzid, Md. Shamsuzzoha; Rahman, M. Sohel

2014-01-01

Supertree methods construct trees on a set of taxa (species) by combining many smaller trees on overlapping subsets of the entire set of taxa. A ‘quartet’ is an unrooted tree over four taxa; hence, quartet-based supertree methods combine many 4-taxon unrooted trees into a single, coherent tree over the complete set of taxa. Quartet-based phylogeny reconstruction methods have been receiving considerable attention in recent years. An accurate and efficient quartet-based method might be competitive with the current best phylogenetic tree reconstruction methods (such as maximum likelihood or Bayesian MCMC analyses), without being as computationally intensive. In this paper, we present a novel and highly accurate quartet-based phylogenetic tree reconstruction method. We performed an extensive experimental study to evaluate the accuracy and scalability of our approach on both simulated and biological datasets. PMID:25117474
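The relationship between a tree and its quartets (unrooted trees on four taxa) can be sketched in Python as follows. A tree is represented by one side of each of its non-trivial splits, and a quartet ab|cd is induced when some split separates {a, b} from {c, d}. The representation, the function name, and the five-taxon example are assumptions for illustration; the sketch shows only this decomposition, which runs in the opposite direction to the authors' reconstruction heuristic.

```python
from itertools import combinations


def induced_quartet(splits, four):
    """Return the quartet topology a tree induces on four taxa, or None.

    `splits` holds one side (a frozenset of taxa) of each non-trivial split.
    """
    a, b, c, d = four
    for pair, rest in (((a, b), (c, d)), ((a, c), (b, d)), ((a, d), (b, c))):
        for side in splits:
            pair_in = [t in side for t in pair]
            rest_in = [t in side for t in rest]
            if (all(pair_in) and not any(rest_in)) or (
                all(rest_in) and not any(pair_in)
            ):
                return frozenset(pair), frozenset(rest)
    return None  # unresolved (star-like) on these four taxa


# Hypothetical unrooted tree ((A,B),C,(D,E)): two non-trivial splits.
taxa = ["A", "B", "C", "D", "E"]
splits = [frozenset({"A", "B"}), frozenset({"D", "E"})]
quartets = {four: induced_quartet(splits, four) for four in combinations(taxa, 4)}
```

Five taxa yield C(5,4) = 5 quartets; for example, the tree above induces AB|CD on the taxa A, B, C, D. A quartet-based method works from such 4-taxon trees back to a single tree on all taxa.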
Molecular Phylogenetics: Concepts for a Newcomer.

PubMed

Ajawatanawong, Pravech

Molecular phylogenetics is the study of evolutionary relationships among organisms using molecular sequence data. The aim of this review is to introduce the important terminology and general concepts of tree reconstruction to biologists who lack a strong background in the field of molecular evolution. Some modern phylogenetic programs are easy to use because of their user-friendly interfaces, but understanding the phylogenetic algorithms and substitution models, which are based on advanced statistics, is still important for the analysis and interpretation without a guide. Briefly, there are five general steps in carrying out a phylogenetic analysis: (1) sequence data preparation, (2) sequence alignment, (3) choosing a phylogenetic reconstruction method, (4) identification of the best tree, and (5) evaluating the tree. Concepts in this review enable biologists to grasp the basic ideas behind phylogenetic analysis and also help provide a sound basis for discussions with expert phylogeneticists.
The phylogenetic distribution of extrafloral nectaries in plants.

PubMed

Weber, Marjorie G; Keeler, Kathleen H

2013-06-01

Understanding the evolutionary patterns of ecologically relevant traits is a central goal in plant biology. However, for most important traits, we lack the comprehensive understanding of their taxonomic distribution needed to evaluate their evolutionary mode and tempo across the tree of life. Here we evaluate the broad phylogenetic patterns of a common plant-defence trait found across vascular plants: extrafloral nectaries (EFNs), plant glands that secrete nectar and are located outside the flower. EFNs typically defend plants indirectly by attracting invertebrate predators who reduce herbivory. Records of EFNs published over the last 135 years were compiled. After accounting for changes in taxonomy, phylogenetic comparative methods were used to evaluate patterns of EFN evolution, using a phylogeny of over 55 000 species of vascular plants. Using comparisons of parametric and non-parametric models, the true number of species with EFNs likely to exist beyond the current list was estimated. To date, EFNs have been reported in 3941 species representing 745 genera in 108 families, about 1-2 % of vascular plant species and approx. 21 % of families. They are found in 33 of 65 angiosperm orders. Foliar nectaries are known in four of 36 fern families. Extrafloral nectaries are unknown in early angiosperms, magnoliids and gymnosperms. They occur throughout monocotyledons, yet most EFNs are found within eudicots, with the bulk of species with EFNs being rosids. Phylogenetic analyses strongly support the repeated gain and loss of EFNs across plant clades, especially in more derived dicot families, and suggest that EFNs are found in a minimum of 457 independent lineages. However, model selection methods estimate that the number of unreported cases of EFNs may be as high as the number of species already reported. EFNs are widespread and evolutionarily labile traits that have repeatedly evolved a remarkable number of times in vascular plants. 
Our current understanding of the phylogenetic patterns of EFNs makes them powerful candidates for future work exploring the drivers of their evolutionary origins, shifts, and losses.
Organellar phylogenomics of an emerging model system: Sphagnum (peatmoss)

PubMed Central

Jonathan Shaw, A.; Devos, Nicolas; Liu, Yang; Cox, Cymon J.; Goffinet, Bernard; Flatberg, Kjell Ivar; Shaw, Blanka

2016-01-01

Background and Aims Sphagnum-dominated peatlands contain approx. 30 % of the terrestrial carbon pool in the form of partially decomposed plant material (peat), and, as a consequence, Sphagnum is currently a focus of studies on biogeochemistry and control of global climate. Sphagnum species differ in ecologically important traits that scale up to impact ecosystem function, and sequencing of the genome from selected Sphagnum species is currently underway. As an emerging model system, these resources for Sphagnum will facilitate linking nucleotide variation to plant functional traits, and through those traits to ecosystem processes. A solid phylogenetic framework for Sphagnum is crucial to comparative analyses of species-specific traits, but relationships among major clades within Sphagnum have been recalcitrant to resolution because the genus underwent a rapid radiation. Herein a well-supported hypothesis for phylogenetic relationships among major clades within Sphagnum based on organellar genome sequences (plastid, mitochondrial) is provided. Methods We obtained nucleotide sequences (273 753 nucleotides in total) from the two organellar genomes from 38 species (including three outgroups). Phylogenetic analyses were conducted using a variety of methods applied to nucleotide and amino acid sequences. The Sphagnum phylogeny was rooted with sequences from the related Sphagnopsida genera, Eosphagnum and Flatbergium. Key Results Phylogenetic analyses of the data converge on the following subgeneric relationships: (Rigida (((Subsecunda) (Cuspidata)) ((Sphagnum) (Acutifolia)))). All relationships were strongly supported. Species in the two major clades (i.e. Subsecunda + Cuspidata and Sphagnum + Acutifolia), which include >90 % of all Sphagnum species, differ in ecological niches and these differences correlate with other functional traits that impact biogeochemical cycling. Mitochondrial intron presence/absence are variable among species and genera of the Sphagnopsida. 
Two new nomenclatural combinations are made, in the genera Eosphagnum and Flatbergium. Conclusions Newly resolved relationships now permit phylogenetic analyses of morphological, biochemical and ecological traits among Sphagnum species. The results clarify long-standing disagreements about subgeneric relationships and intrageneric classification. PMID:27268484
Compression-based distance (CBD): a simple, rapid, and accurate method for microbiota composition comparison

PubMed Central

2013-01-01

Background Perturbations in intestinal microbiota composition have been associated with a variety of gastrointestinal tract-related diseases. The alleviation of symptoms has been achieved using treatments that alter the gastrointestinal tract microbiota toward that of healthy individuals. Identifying differences in microbiota composition through the use of 16S rRNA gene hypervariable tag sequencing has profound health implications. Current computational methods for comparing microbial communities are usually based on multiple alignments and phylogenetic inference, making them time consuming and requiring exceptional expertise and computational resources. As sequencing data rapidly grows in size, simpler analysis methods are needed to meet the growing computational burdens of microbiota comparisons. Thus, we have developed a simple, rapid, and accurate method, independent of multiple alignments and phylogenetic inference, to support microbiota comparisons. Results We create a metric, called compression-based distance (CBD) for quantifying the degree of similarity between microbial communities. CBD uses the repetitive nature of hypervariable tag datasets and well-established compression algorithms to approximate the total information shared between two datasets. Three published microbiota datasets were used as test cases for CBD as an applicable tool. Our study revealed that CBD recaptured 100% of the statistically significant conclusions reported in the previous studies, while achieving a decrease in computational time required when compared to similar tools without expert user intervention. Conclusion CBD provides a simple, rapid, and accurate method for assessing distances between gastrointestinal tract microbiota 16S hypervariable tag datasets. PMID:23617892
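The idea of approximating shared information with a general-purpose compressor can be sketched in Python using the standard `zlib` module. The formula below is the widely used normalized compression distance; the abstract does not give the exact CBD formula, so treating the two as comparable is an assumption, and the toy byte strings stand in for real 16S tag datasets.

```python
import random
import zlib


def compressed_size(data):
    """Length in bytes of the zlib-compressed input."""
    return len(zlib.compress(data, 9))


def compression_distance(x, y):
    """Normalized compression distance between two byte strings."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


# Toy "datasets": two nearly identical repetitive samples and one
# pseudo-random sample (fixed seed so the example is reproducible).
rng = random.Random(0)
sample_a = b"ACGTTGCA" * 200
sample_b = b"ACGTTGCA" * 190 + b"ACGTTGCC" * 10
sample_c = bytes(rng.choice(b"ACGT") for _ in range(1600))

print(compression_distance(sample_a, sample_b))
print(compression_distance(sample_a, sample_c))
```

When two inputs share most of their content, compressing them together costs little more than compressing one alone, so the distance is small; unrelated inputs give a value close to 1. No alignment or tree inference is involved.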

Compression-based distance (CBD): a simple, rapid, and accurate method for microbiota composition comparison.

PubMed

Yang, Fang; Chia, Nicholas; White, Bryan A; Schook, Lawrence B

2013-04-23

Perturbations in intestinal microbiota composition have been associated with a variety of gastrointestinal tract-related diseases. The alleviation of symptoms has been achieved using treatments that alter the gastrointestinal tract microbiota toward that of healthy individuals. Identifying differences in microbiota composition through the use of 16S rRNA gene hypervariable tag sequencing has profound health implications. Current computational methods for comparing microbial communities are usually based on multiple alignments and phylogenetic inference, making them time consuming and requiring exceptional expertise and computational resources. As sequencing data rapidly grows in size, simpler analysis methods are needed to meet the growing computational burdens of microbiota comparisons. Thus, we have developed a simple, rapid, and accurate method, independent of multiple alignments and phylogenetic inference, to support microbiota comparisons. We create a metric, called compression-based distance (CBD) for quantifying the degree of similarity between microbial communities. CBD uses the repetitive nature of hypervariable tag datasets and well-established compression algorithms to approximate the total information shared between two datasets. Three published microbiota datasets were used as test cases for CBD as an applicable tool. Our study revealed that CBD recaptured 100% of the statistically significant conclusions reported in the previous studies, while achieving a decrease in computational time required when compared to similar tools without expert user intervention. CBD provides a simple, rapid, and accurate method for assessing distances between gastrointestinal tract microbiota 16S hypervariable tag datasets.
Towards a formal genealogical classification of the Lezgian languages (North Caucasus): testing various phylogenetic methods on lexical data.

PubMed

Kassian, Alexei

2015-01-01

A lexicostatistical classification is proposed for 20 languages and dialects of the Lezgian group of the North Caucasian family, based on meticulously compiled 110-item wordlists, published as part of the Global Lexicostatistical Database project. The lexical data have been subsequently analyzed with the aid of the principal phylogenetic methods, both distance-based and character-based: Starling neighbor joining (StarlingNJ), Neighbor joining (NJ), Unweighted pair group method with arithmetic mean (UPGMA), Bayesian Markov chain Monte Carlo (MCMC), Unweighted maximum parsimony (UMP). Cognation indexes within the input matrix were marked by two different algorithms: the traditional etymological approach and phonetic similarity, i.e., the automatic method of consonant classes (Levenshtein distances). For several reasons (above all, the high lexicographic quality of the wordlists and the consensus among Caucasologists about the Lezgian phylogeny), the Lezgian database is an ideal testing ground for appraising phylogenetic methods. For the etymology-based input matrix, all the phylogenetic methods, with the possible exception of UMP, have yielded trees that are sufficiently compatible with each other to generate a consensus phylogenetic tree of the Lezgian lects. The obtained consensus tree agrees with the traditional expert classification as well as some of the previously proposed formal classifications of this linguistic group. Contrary to theoretical expectations, the UMP method has suggested the least plausible tree of all. In the case of the phonetic similarity-based input matrix, the distance-based methods (StarlingNJ, NJ, UPGMA) have produced trees that are rather close to the consensus etymology-based tree and the traditional expert classification, whereas the character-based methods (Bayesian MCMC, UMP) have yielded less likely topologies.
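The Levenshtein distance mentioned in this abstract can be sketched with the standard dynamic-programming recurrence. How the study encodes words as consonant classes and turns distances into cognation indexes is not described here, so the length normalization and the example strings below are assumptions made purely for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-symbol insertions, deletions and
    substitutions needed to turn string a into string b."""
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def normalized_distance(a: str, b: str) -> float:
    """Levenshtein distance divided by the longer length (an assumed
    normalization; 0.0 for two empty strings)."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


# Hypothetical consonant-class strings for the same wordlist item in two lects.
print(levenshtein("KTR", "KTN"), normalized_distance("KTR", "KTN"))
```

A matrix of such pairwise distances is the kind of input that distance-based methods such as NJ and UPGMA operate on.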
Towards a Formal Genealogical Classification of the Lezgian Languages (North Caucasus): Testing Various Phylogenetic Methods on Lexical Data

PubMed Central

Kassian, Alexei

2015-01-01

A lexicostatistical classification is proposed for 20 languages and dialects of the Lezgian group of the North Caucasian family, based on meticulously compiled 110-item wordlists, published as part of the Global Lexicostatistical Database project. The lexical data have been subsequently analyzed with the aid of the principal phylogenetic methods, both distance-based and character-based: Starling neighbor joining (StarlingNJ), Neighbor joining (NJ), Unweighted pair group method with arithmetic mean (UPGMA), Bayesian Markov chain Monte Carlo (MCMC), Unweighted maximum parsimony (UMP). Cognation indexes within the input matrix were marked by two different algorithms: the traditional etymological approach and phonetic similarity, i.e., the automatic method of consonant classes (Levenshtein distances). For several reasons (above all, the high lexicographic quality of the wordlists and the consensus among Caucasologists about the Lezgian phylogeny), the Lezgian database is an ideal testing ground for appraising phylogenetic methods. For the etymology-based input matrix, all the phylogenetic methods, with the possible exception of UMP, have yielded trees that are sufficiently compatible with each other to generate a consensus phylogenetic tree of the Lezgian lects. The obtained consensus tree agrees with the traditional expert classification as well as some of the previously proposed formal classifications of this linguistic group. Contrary to theoretical expectations, the UMP method has suggested the least plausible tree of all. In the case of the phonetic similarity-based input matrix, the distance-based methods (StarlingNJ, NJ, UPGMA) have produced trees that are rather close to the consensus etymology-based tree and the traditional expert classification, whereas the character-based methods (Bayesian MCMC, UMP) have yielded less likely topologies. PMID:25719456
Evaluation of a Method Using Three Genomic Guided Escherichia coli Markers for Phylogenetic Typing of E. coli Isolates of Various Genetic Backgrounds

PubMed Central

Hamamoto, Kouta; Ueda, Shuhei; Yamamoto, Yoshimasa

2015-01-01

Genotyping and characterization of bacterial isolates are essential steps in the identification and control of antibiotic-resistant bacterial infections. Recently, one novel genotyping method using three genomic guided Escherichia coli markers (GIG-EM), dinG, tonB, and dipeptide permease (DPP), was reported. Because GIG-EM has not been fully evaluated using clinical isolates, we assessed this typing method with 72 E. coli collection of reference (ECOR) environmental E. coli reference strains and 63 E. coli isolates of various genetic backgrounds. In this study, we designated 768 bp of dinG, 745 bp of tonB, and 655 bp of DPP target sequences for use in the typing method. Concatenations of the processed marker sequences were used to draw GIG-EM phylogenetic trees. E. coli isolates with identical sequence types as identified by the conventional multilocus sequence typing (MLST) method were localized to the same branch of the GIG-EM phylogenetic tree. Sixteen clinical E. coli isolates were utilized as test isolates without prior characterization by conventional MLST and phylogenetic grouping before GIG-EM typing. Of these, 14 clinical isolates were assigned to a branch including only isolates of a pandemic clone, E. coli B2-ST131-O25b, and these results were confirmed by conventional typing methods. Our results suggested that the GIG-EM typing method and its application to phylogenetic trees might be useful tools for the molecular characterization and determination of the genetic relationships among E. coli isolates. PMID:25809972
Evaluation of a Method Using Three Genomic Guided Escherichia coli Markers for Phylogenetic Typing of E. coli Isolates of Various Genetic Backgrounds.

PubMed

Hamamoto, Kouta; Ueda, Shuhei; Yamamoto, Yoshimasa; Hirai, Itaru

2015-06-01

Genotyping and characterization of bacterial isolates are essential steps in the identification and control of antibiotic-resistant bacterial infections. Recently, one novel genotyping method using three genomic guided Escherichia coli markers (GIG-EM), dinG, tonB, and dipeptide permease (DPP), was reported. Because GIG-EM has not been fully evaluated using clinical isolates, we assessed this typing method with 72 E. coli collection of reference (ECOR) environmental E. coli reference strains and 63 E. coli isolates of various genetic backgrounds. In this study, we designated 768 bp of dinG, 745 bp of tonB, and 655 bp of DPP target sequences for use in the typing method. Concatenations of the processed marker sequences were used to draw GIG-EM phylogenetic trees. E. coli isolates with identical sequence types as identified by the conventional multilocus sequence typing (MLST) method were localized to the same branch of the GIG-EM phylogenetic tree. Sixteen clinical E. coli isolates were utilized as test isolates without prior characterization by conventional MLST and phylogenetic grouping before GIG-EM typing. Of these, 14 clinical isolates were assigned to a branch including only isolates of a pandemic clone, E. coli B2-ST131-O25b, and these results were confirmed by conventional typing methods. Our results suggested that the GIG-EM typing method and its application to phylogenetic trees might be useful tools for the molecular characterization and determination of the genetic relationships among E. coli isolates. Copyright © 2015, American Society for Microbiology. All Rights Reserved.
Phylogenetic framework for coevolutionary studies: a compass for exploring jungles of tangled trees.

PubMed

Martínez-Aquino, Andrés

2016-08-01

Phylogenetics is used to detect past evolutionary events, from how species originated to how their ecological interactions with other species arose, which can mirror cophylogenetic patterns. Cophylogenetic reconstructions uncover past ecological relationships between taxa through inferred coevolutionary events on trees, for example, codivergence, duplication, host-switching, and loss. These events can be detected by cophylogenetic analyses based on nodes and the length and branching pattern of the phylogenetic trees of symbiotic associations, for example, host-parasite. In the past 2 decades, algorithms have been developed for cophylogenetic analyses and implemented in different software, for example, statistical congruence index and event-based methods. Based on the combination of these approaches, it is possible to integrate temporal information into cophylogenetic inference, such as estimates of lineage divergence times between 2 taxa, for example, hosts and parasites. Additionally, advances in phylogenetic biogeography that apply methods based on parametric process models and combined Bayesian approaches can be useful for interpreting coevolutionary histories in a scenario of biogeographical area connectivity through time. This article briefly reviews the basics of parasitology and provides an overview of software packages in cophylogenetic methods. Thus, the objective here is to present a phylogenetic framework for coevolutionary studies, with special emphasis on groups of parasitic organisms. Researchers wishing to undertake phylogeny-based coevolutionary studies can use this review as a "compass" when "walking" through jungles of tangled phylogenetic trees.
Phylogenetic framework for coevolutionary studies: a compass for exploring jungles of tangled trees

PubMed Central

2016-01-01

Phylogenetics is used to detect past evolutionary events, from how species originated to how their ecological interactions with other species arose, which can mirror cophylogenetic patterns. Cophylogenetic reconstructions uncover past ecological relationships between taxa through inferred coevolutionary events on trees, for example, codivergence, duplication, host-switching, and loss. These events can be detected by cophylogenetic analyses based on nodes and the length and branching pattern of the phylogenetic trees of symbiotic associations, for example, host–parasite. In the past 2 decades, algorithms have been developed for cophylogenetic analyses and implemented in different software, for example, statistical congruence index and event-based methods. Based on the combination of these approaches, it is possible to integrate temporal information into cophylogenetic inference, such as estimates of lineage divergence times between 2 taxa, for example, hosts and parasites. Additionally, advances in phylogenetic biogeography that apply methods based on parametric process models and combined Bayesian approaches can be useful for interpreting coevolutionary histories in a scenario of biogeographical area connectivity through time. This article briefly reviews the basics of parasitology and provides an overview of software packages in cophylogenetic methods. Thus, the objective here is to present a phylogenetic framework for coevolutionary studies, with special emphasis on groups of parasitic organisms. Researchers wishing to undertake phylogeny-based coevolutionary studies can use this review as a “compass” when “walking” through jungles of tangled phylogenetic trees. PMID:29491928
FPGA Acceleration of the phylogenetic likelihood function for Bayesian MCMC inference methods.

PubMed

Zierke, Stephanie; Bakos, Jason D

2010-04-12

Likelihood (ML)-based phylogenetic inference has become a popular method for estimating the evolutionary relationships among species based on genomic sequence data. This method is used in applications such as RAxML, GARLI, MrBayes, PAML, and PAUP. The Phylogenetic Likelihood Function (PLF) is an important kernel computation for this method. The PLF consists of a loop with no conditional behavior or dependencies between iterations. As such it contains a high potential for exploiting parallelism using micro-architectural techniques. In this paper, we describe a technique for mapping the PLF and supporting logic onto a Field Programmable Gate Array (FPGA)-based co-processor. By leveraging the FPGA's on-chip DSP modules and the high-bandwidth local memory attached to the FPGA, the resultant co-processor can accelerate ML-based methods and outperform state-of-the-art multi-core processors. We use the MrBayes 3 tool as a framework for designing our co-processor. For large datasets, we estimate that our accelerated MrBayes, if run on a current-generation FPGA, achieves a 10x speedup relative to software running on a state-of-the-art server-class microprocessor. The FPGA-based implementation achieves its performance by deeply pipelining the likelihood computations, performing multiple floating-point operations in parallel, and through a natural log approximation that is chosen specifically to leverage a deeply pipelined custom architecture. Heterogeneous computing, which combines general-purpose processors with special-purpose co-processors such as FPGAs and GPUs, is a promising approach for high-performance phylogeny inference as shown by the growing body of literature in this field. FPGAs in particular are well-suited for this task because of their low power consumption as compared to many-core processors and Graphics Processor Units (GPUs).
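The description of the PLF as a loop whose iterations are independent can be made concrete with a small software sketch of one step of Felsenstein's pruning recursion. This is not the co-processor design or MrBayes code: the Jukes-Cantor substitution model, the function names, and the two-tip example are assumptions chosen to keep the illustration short.

```python
import math

STATES = 4  # nucleotide states in the order A, C, G, T


def jc69_matrix(t):
    """Jukes-Cantor transition probabilities for a branch of length t
    (expected substitutions per site). Used only as a simple example model."""
    e = math.exp(-4.0 * t / 3.0)
    same, diff = 0.25 + 0.75 * e, 0.25 - 0.25 * e
    return [[same if i == j else diff for j in range(STATES)]
            for i in range(STATES)]


def combine_children(left, right, p_left, p_right):
    """Conditional likelihoods at a parent node for every alignment site.

    left, right: per-site lists of 4 conditional likelihoods for each child.
    p_left, p_right: 4x4 transition matrices for the two child branches.
    Each pass of the outer loop reads and writes only its own site, so the
    iterations are independent and could run in parallel or in a pipeline.
    """
    parent = []
    for l_site, r_site in zip(left, right):
        site = []
        for s in range(STATES):
            from_left = sum(p_left[s][x] * l_site[x] for x in range(STATES))
            from_right = sum(p_right[s][x] * r_site[x] for x in range(STATES))
            site.append(from_left * from_right)
        parent.append(site)
    return parent


# Two tips observed at two sites; observed states are encoded one-hot.
tips_x = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]  # sites: A, C
tips_y = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0]]  # sites: A, T
p = jc69_matrix(0.1)
root = combine_children(tips_x, tips_y, p, p)
# Site likelihoods under equal base frequencies (0.25 each).
site_likelihoods = [sum(0.25 * v for v in site) for site in root]
print(site_likelihoods)
```

In a full analysis the per-site likelihoods are log-transformed and summed, which is presumably where a logarithm approximation such as the one mentioned in the abstract would come into play.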
Taking the First Steps towards a Standard for Reporting on Phylogenies: Minimal Information about a Phylogenetic Analysis (MIAPA)

PubMed Central

LEEBENS-MACK, JIM; VISION, TODD; BRENNER, ERIC; BOWERS, JOHN E.; CANNON, STEVEN; CLEMENT, MARK J.; CUNNINGHAM, CLIFFORD W.; dePAMPHILIS, CLAUDE; deSALLE, ROB; DOYLE, JEFF J.; EISEN, JONATHAN A.; GU, XUN; HARSHMAN, JOHN; JANSEN, ROBERT K.; KELLOGG, ELIZABETH A.; KOONIN, EUGENE V.; MISHLER, BRENT D.; PHILIPPE, HERVÉ; PIRES, J. CHRIS; QIU, YIN-LONG; RHEE, SEUNG Y.; SJÖLANDER, KIMMEN; SOLTIS, DOUGLAS E.; SOLTIS, PAMELA S.; STEVENSON, DENNIS W.; WALL, KERR; WARNOW, TANDY; ZMASEK, CHRISTIAN

2011-01-01

In the eight years since phylogenomics was introduced as the intersection of genomics and phylogenetics, the field has provided fundamental insights into gene function, genome history and organismal relationships. The utility of phylogenomics is growing with the increase in the number and diversity of taxa for which whole genome and large transcriptome sequence sets are being generated. We assert that the synergy between genomic and phylogenetic perspectives in comparative biology would be enhanced by the development and refinement of minimal reporting standards for phylogenetic analyses. Encouraged by the development of the Minimum Information About a Microarray Experiment (MIAME) standard, we propose a similar roadmap for the development of a Minimal Information About a Phylogenetic Analysis (MIAPA) standard. Key in the successful development and implementation of such a standard will be broad participation by developers of phylogenetic analysis software, phylogenetic database developers, practitioners of phylogenomics, and journal editors. PMID:16901231
Single-cell analysis of uncultured magnetotactic bacteria via fluorescence-coupled electron microscopy approach

NASA Astrophysics Data System (ADS)

Li, J.; Zhang, H.; Liu, P.; Menguy, N.; Pan, Y.

2017-12-01

Magnetotactic bacteria (MTB) are phylogenetically diverse and can biomineralize magnetic nanocrystals of magnetite or greigite in intracellular structures termed magnetosomes. Their remains within sediments or sedimentary rocks, i.e. magnetofossils, have been used to retrieve paleomagnetic and paleoenvironmental information of deposition time, as well as to trace the origin and evolution of life on Earth and even perhaps Mars. A precise identification of magnetofossils heavily depends on our knowledge of phylogenetic diversity and magnetosomal biomineralization within natural MTB. In this paper, we will present a novel method which can rapidly characterize both the phylogenetic and biomineralogical properties of uncultured MTB at the single-cell level by coupling fluorescence and electron microscopy. Using this method, we have successfully identified several uncultured MTB strains from natural environments in China. These MTB are phylogenetically affiliated with the Alphaproteobacteria, Deltaproteobacteria, Gammaproteobacteria and Nitrospirae phylum, and form octahedral, cuboctahedral, prismatic, tooth-like and bullet-shaped magnetite magnetosomes. A corresponding analysis of magnetosome morphology and bacterial phylogenetics on each MTB strain has shown a species/strain-specific magnetosome biomineralization. The new method is not only promising for better understanding the correlation between magnetosome mineral habits and MTB phylogenies, but also crucial for unambiguously identifying magnetofossils.
Phylogenetic turnover during subtropical forest succession across environmental and phylogenetic scales.

PubMed

Purschke, Oliver; Michalski, Stefan G; Bruelheide, Helge; Durka, Walter

2017-12-01

Although spatial and temporal patterns of phylogenetic community structure during succession are inherently interlinked and assembly processes vary with environmental and phylogenetic scales, successional studies of community assembly have yet to integrate spatial and temporal components of community structure, while accounting for scaling issues. To gain insight into the processes that generate biodiversity after disturbance, we combine analyses of spatial and temporal phylogenetic turnover across phylogenetic scales, accounting for covariation with environmental differences. We compared phylogenetic turnover, at the species- and individual-level, within and between five successional stages, representing woody plant communities in a subtropical forest chronosequence. We decomposed turnover at different phylogenetic depths and assessed its covariation with between-plot abiotic differences. Phylogenetic turnover between stages was low relative to species turnover and was not explained by abiotic differences. However, within the late-successional stages, there was high presence-/absence-based turnover (clustering) that occurred deep in the phylogeny and covaried with environmental differentiation. Our results support a deterministic model of community assembly where (i) phylogenetic composition is constrained through successional time, but (ii) toward late succession, species sorting into preferred habitats according to niche traits that are conserved deep in phylogeny, becomes increasingly important.
The phylogenetic roots of human lethal violence.

PubMed

Gómez, José María; Verdú, Miguel; González-Megías, Adela; Méndez, Marcos

2016-10-13

The psychological, sociological and evolutionary roots of conspecific violence in humans are still debated, despite attracting the attention of intellectuals for over two millennia. Here we propose a conceptual approach towards understanding these roots based on the assumption that aggression in mammals, including humans, has a significant phylogenetic component. By compiling sources of mortality from a comprehensive sample of mammals, we assessed the percentage of deaths due to conspecifics and, using phylogenetic comparative tools, predicted this value for humans. The proportion of human deaths phylogenetically predicted to be caused by interpersonal violence stood at 2%. This value was similar to the one phylogenetically inferred for the evolutionary ancestor of primates and apes, indicating that a certain level of lethal violence arises owing to our position within the phylogeny of mammals. It was also similar to the percentage seen in prehistoric bands and tribes, indicating that we were as lethally violent then as common mammalian evolutionary history would predict. However, the level of lethal violence has changed through human history and can be associated with changes in the socio-political organization of human populations. Our study provides a detailed phylogenetic and historical context against which to compare levels of lethal violence observed throughout our history.
Angiosperm phylogeny inferred from multiple genes as a tool for comparative biology.

PubMed

Soltis, P S; Soltis, D E; Chase, M W

1999-11-25

Comparative biology requires a firm phylogenetic foundation to uncover and understand patterns of diversification and evaluate hypotheses of the processes responsible for these patterns. In the angiosperms, studies of diversification in floral form, stamen organization, reproductive biology, photosynthetic pathway, nitrogen-fixing symbioses and life histories have relied on either explicit or implied phylogenetic trees. Furthermore, to understand the evolution of specific genes and gene families, evaluate the extent of conservation of plant genomes and make proper sense of the huge volume of molecular genetic data available for model organisms such as Arabidopsis, Antirrhinum, maize, rice and wheat, a phylogenetic perspective is necessary. Here we report the results of parsimony analyses of DNA sequences of the plastid genes rbcL and atpB and the nuclear 18S rDNA for 560 species of angiosperms and seven non-flowering seed plants and show a well-resolved and well-supported phylogenetic tree for the angiosperms for use in comparative biology.
Toward a method for tracking virus evolutionary trajectory applied to the pandemic H1N1 2009 influenza virus.

PubMed

Squires, R Burke; Pickett, Brett E; Das, Sajal; Scheuermann, Richard H

2014-12-01

In 2009 a novel pandemic H1N1 influenza virus (H1N1pdm09) emerged as the first official influenza pandemic of the 21st century. Early genomic sequence analysis pointed to the swine origin of the virus. Here we report a novel computational approach to determine the evolutionary trajectory of viral sequences that uses data-driven estimations of nucleotide substitution rates to track the gradual accumulation of observed sequence alterations over time. Phylogenetic analysis and multiple sequence alignments show that sequences belonging to the resulting evolutionary trajectory of the H1N1pdm09 lineage exhibit a gradual accumulation of sequence variations and tight temporal correlations in the topological structure of the phylogenetic trees. These results suggest that our evolutionary trajectory analysis (ETA) can more effectively pinpoint the evolutionary history of viruses, including the host and geographical location traversed by each segment, when compared against either BLAST or traditional phylogenetic analysis alone. Copyright © 2014 Elsevier B.V. All rights reserved.
Molecular characterization and phylogenetic analysis of the causative agent of hemoplasma infection in small Indian Mongoose (Herpestes Javanicus).

PubMed

Sharifiyazdi, Hassan; Nazifi, Saeed; Shirzad Aski, Hesamaddin; Shayegh, Hossein

2014-09-01

Hemoplasma is the trivial name for a group of erythrocyte-parasitizing bacteria of the genus Mycoplasma. This study is the first report of hemoplasma infection in the small Indian mongoose (Herpestes javanicus) based on molecular analysis of 16S rDNA. Whole blood samples were collected by sterile methods from 14 live-captured mongooses in the south of Iran. A Candidatus Mycoplasma turicensis (CMt)-like hemoplasma was detected in blood samples from one of the animals tested. BLAST search and phylogenetic analysis of the partial 16S rDNA sequence (933 bp) of the hemoplasma from the small Indian mongoose (KJ530704) revealed only 96-97% identity to the previously described CMt, followed by 95% and 91% similarity with Mycoplasma coccoides and Mycoplasma haemomuris, respectively. Accordingly, the Iranian mongoose CMt isolate showed high intra-specific genetic variation compared to all previously reported CMt strains in GenBank. Further molecular studies using multiple phylogenetic markers are required to characterize the exact species of the mongoose-derived hemoplasma. Copyright © 2014 Elsevier Ltd. All rights reserved.
Estimating Bayesian Phylogenetic Information Content

PubMed Central

Lewis, Paul O.; Chen, Ming-Hui; Kuo, Lynn; Lewis, Louise A.; Fučíková, Karolina; Neupane, Suman; Wang, Yu-Bo; Shi, Daoyuan

2016-01-01

Measuring the phylogenetic information content of data has a long history in systematics. Here we explore a Bayesian approach to information content estimation. The entropy of the posterior distribution compared with the entropy of the prior distribution provides a natural way to measure information content. If the data have no information relevant to ranking tree topologies beyond the information supplied by the prior, the posterior and prior will be identical. Information in data discourages consideration of some hypotheses allowed by the prior, resulting in a posterior distribution that is more concentrated (has lower entropy) than the prior. We focus on measuring information about tree topology using marginal posterior distributions of tree topologies. We show that both the accuracy and the computational efficiency of topological information content estimation improve with use of the conditional clade distribution, which also allows topological information content to be partitioned by clade. We explore two important applications of our method: providing a compelling definition of saturation and detecting conflict among data partitions that can negatively affect analyses of concatenated data. [Bayesian; concatenation; conditional clade distribution; entropy; information; phylogenetics; saturation.] PMID:27155008
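The comparison of prior and posterior entropy described above can be written out for the simplest case, a discrete distribution over tree topologies. The sketch below works from explicit topology probabilities rather than the conditional clade distribution used in the article, and the function names and the example posterior are illustrative assumptions.

```python
import math


def entropy(probabilities):
    """Shannon entropy in nats, treating 0 * log(0) as 0."""
    return -sum(p * math.log(p) for p in probabilities if p > 0.0)


def information_content(prior, posterior):
    """Entropy drop from prior to posterior over the same set of topologies.

    Returns (gained, fraction): the absolute drop in nats and the drop as a
    fraction of the prior entropy (0.0 if the prior has zero entropy).
    """
    h_prior, h_post = entropy(prior), entropy(posterior)
    gained = h_prior - h_post
    return gained, (gained / h_prior if h_prior > 0.0 else 0.0)


# Four taxa admit three unrooted binary topologies; assume a flat prior
# and a made-up posterior that favors the first topology.
prior = [1 / 3, 1 / 3, 1 / 3]
posterior = [0.90, 0.05, 0.05]
print(information_content(prior, posterior))
```

If the posterior equals the prior, the function returns zero information; if all posterior mass sits on one topology, the fraction reaches 1, matching the two limiting cases described in the abstract.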
A molecular phylogeny of the stingless bee genus Melipona (Hymenoptera: Apidae).

PubMed

Ramírez, Santiago R; Nieh, James C; Quental, Tiago B; Roubik, David W; Imperatriz-Fonseca, Vera L; Pierce, Naomi E

2010-08-01

Stingless bees (Meliponini) constitute a diverse group of highly eusocial insects that occur throughout tropical regions around the world. The meliponine genus Melipona is restricted to the New World tropics and has over 50 described species. Melipona, like Apis, possesses the remarkable ability to use representational communication to indicate the location of foraging patches. Although Melipona has been the subject of numerous behavioral, ecological, and genetic studies, the evolutionary history of this genus remains largely unexplored. Here, we implement a multigene phylogenetic approach based on nuclear, mitochondrial, and ribosomal loci, coupled with molecular clock methods, to elucidate the phylogenetic relationships and antiquity of subgenera and species of Melipona. Our phylogenetic analysis resolves the relationship among subgenera and tends to agree with morphology-based classification hypotheses. Our molecular clock analysis indicates that the genus Melipona shared a most recent common ancestor at least approximately 14-17 million years (My) ago. These results provide the groundwork for future comparative analyses aimed at understanding the evolution of complex communication mechanisms in eusocial Apidae. Copyright 2010 Elsevier Inc. All rights reserved.
Phylogenetic analysis of DNA and RNA polymerases from a Moniliophthora perniciosa mitochondrial plasmid reveals probable lateral gene transfer.

PubMed

Andrade, B S; Góes-Neto, A

2015-10-30

The filamentous fungus Moniliophthora perniciosa is a hemibiotrophic basidiomycete that causes witches' broom disease of cacao (Theobroma cacao L.). Many fungal mitochondrial plasmids are DNA and RNA polymerase-encoding invertrons with terminal inverted repeats and 5'-linked proteins. The aim of this study was to carry out comparative and phylogenetic analyses of DNA and RNA polymerases for all known linear mitochondrial plasmids in fungi. We performed these analyses at both gene and protein levels and assessed differences between fungal and viral polymerases in order to test the lateral gene transfer (LGT) hypothesis. We analyzed all mitochondrial plasmids of the invertron type within the fungal clade, including five from Ascomycota, seven from Basidiomycota, and one from Chytridiomycota. All phylogenetic analyses generated similar tree topologies regardless of the methods and datasets used. It is likely that DNA and RNA polymerase genes were inserted into the mitochondrial genomes of the 13 fungal species examined in our study as a result of different LGT events. These findings are important for a better understanding of the evolutionary relationships between fungal mitochondrial plasmids.
Phylogeny of the Genus Flavivirus

PubMed Central

Kuno, Goro; Chang, Gwong-Jen J.; Tsuchiya, K. Richard; Karabatsos, Nick; Cropp, C. Bruce

1998-01-01

We undertook a comprehensive phylogenetic study to establish the genetic relationship among the viruses of the genus Flavivirus and to compare the classification based on molecular phylogeny with the existing serologic method. By using a combination of quantitative definitions (bootstrap support level and the pairwise nucleotide sequence identity), the viruses could be classified into clusters, clades, and species. Our phylogenetic study revealed for the first time that from the putative ancestor two branches, non-vector and vector-borne virus clusters, evolved and from the latter cluster emerged tick-borne and mosquito-borne virus clusters. Provided that the theory of arthropod association being an acquired trait was correct, pairwise nucleotide sequence identity among these three clusters provided supporting data for a possibility that the non-vector cluster evolved first, followed by the separation of tick-borne and mosquito-borne virus clusters in that order. Clades established in our study correlated significantly with existing antigenic complexes. We also resolved many of the past taxonomic problems by establishing phylogenetic relationships of the antigenically unclassified viruses with the well-established viruses and by identifying synonymous viruses. PMID:9420202
Phylogeny of the genus Flavivirus.

PubMed

Kuno, G; Chang, G J; Tsuchiya, K R; Karabatsos, N; Cropp, C B

1998-01-01

We undertook a comprehensive phylogenetic study to establish the genetic relationship among the viruses of the genus Flavivirus and to compare the classification based on molecular phylogeny with the existing serologic method. By using a combination of quantitative definitions (bootstrap support level and the pairwise nucleotide sequence identity), the viruses could be classified into clusters, clades, and species. Our phylogenetic study revealed for the first time that from the putative ancestor two branches, non-vector and vector-borne virus clusters, evolved and from the latter cluster emerged tick-borne and mosquito-borne virus clusters. Provided that the theory of arthropod association being an acquired trait was correct, pairwise nucleotide sequence identity among these three clusters provided supporting data for a possibility that the non-vector cluster evolved first, followed by the separation of tick-borne and mosquito-borne virus clusters in that order. Clades established in our study correlated significantly with existing antigenic complexes. We also resolved many of the past taxonomic problems by establishing phylogenetic relationships of the antigenically unclassified viruses with the well-established viruses and by identifying synonymous viruses.

Molecular phylogenetic position of hexactinellid sponges in relation to the Protista and Demospongiae.

PubMed

West, L; Powers, D

1993-01-01

Although it is generally accepted that the first multicellular organisms arose from unicellular ancestors, the phylogenetic relationships linking these groups remain unclear. Anatomical, physiological, and molecular studies of current multicellular organisms with relatively simple body organization suggest key characteristics of the earliest multicellular lineages. Glass sponges, the Hexactinellida, possess cellular characteristics that resemble some unicellular protistan organisms. These unique sponges were abundant in shallow seas of the early Cambrian, but they are currently restricted to polar habitats or very deep regions of the world oceans. Due in part to their relative inaccessibility, their potential significance to the early phylogeny of the eukaryotic kingdoms has been largely overlooked. We used sequences of the 18S ribosomal RNA gene of Farrea occa, a representative of the deep-water hexactinellid sponges, and Coelocarteria singaporense, a representative of the more common demosponges, and compared them with selected ribosomal RNA gene sequences available within the Protista. Using four computational methods for phylogenetic analysis of ribosomal DNA sequences, we found that the hexactinellid sponge-demosponge cluster is most closely related to Volvox and Acanthamoeba.
Fast algorithms for computing phylogenetic divergence time.

PubMed

Crosby, Ralph W; Williams, Tiffani L

2017-12-06

The inference of species divergence time is a key step in most phylogenetic studies. Methods have been available for the last ten years to perform the inference, but the performance of the methods does not yet scale well to studies with hundreds of taxa and thousands of DNA base pairs. For example, a study of 349 primate taxa was estimated to require over 9 months of processing time. In this work, we present a new algorithm, AncestralAge, that significantly improves the performance of the divergence time process. As part of AncestralAge, we demonstrate a new method for the computation of phylogenetic likelihood, and our experiments show a 90% improvement in likelihood computation time on the aforementioned dataset of 349 primate taxa with over 60,000 DNA base pairs. Additionally, we show that our new method for the computation of the Bayesian prior on node ages reduces the running time for this computation on the 349-taxon dataset by 99%. Through the use of these new algorithms we open up the ability to perform divergence time inference on large phylogenetic studies.
Feasibility of nuclear ribosomal region ITS1 over ITS2 in barcoding taxonomically challenging genera of subtribe Cassiinae (Fabaceae).

PubMed

Mishra, Priyanka; Kumar, Amit; Rodrigues, Vereena; Shukla, Ashutosh K; Sundaresan, Velusamy

2016-01-01

The internal transcribed spacer (ITS) region is situated between 18S and 26S in a polycistronic rRNA precursor transcript. It has proved to be the most commonly sequenced region across plant species to resolve phylogenetic relationships ranging from shallow to deep taxonomic levels. Despite several taxonomical revisions in Cassiinae, a stable phylogeny remains elusive at the molecular level, particularly concerning the delineation of species in the genera Cassia, Senna and Chamaecrista. This study addresses the comparative potential of ITS datasets (ITS1, ITS2 and concatenated) in resolving the underlying morphological disparity in the highly complex genera, to assess their discriminatory power as potential barcode candidates in Cassiinae. A combination of experimental data and an in-silico approach based on threshold genetic distances, sequence similarity based and hierarchical tree-based methods was performed to decipher the discriminating power of ITS datasets on 18 different species of the Cassiinae complex. Lab-generated sequences were compared against those available in GenBank using BLAST and were aligned through MUSCLE 3.8.31 and analysed in PAUP 4.0 and BEAST 1.8 using parsimony ratchet, maximum likelihood and Bayesian inference (BI) methods of gene and species tree reconciliation with bootstrapping. DNA barcoding gap was realized based on the Kimura two-parameter distance model (K2P) in TaxonDNA and MEGA. Based on the K2P distance, significant divergences between the inter- and intra-specific genetic distances were observed, while the presence of a DNA barcoding gap was obvious. The ITS1 region efficiently identified 81.63% and 90% of species using TaxonDNA and BI methods, respectively. The PWG-distance method based on simple pairwise matching indicated the significance of ITS1, whereby the highest number of variable (210) and informative sites (206) were obtained.
The BI tree-based methods outperformed the similarity-based methods producing well-resolved phylogenetic trees with many nodes well supported by bootstrap analyses. The reticulated phylogenetic hypothesis using the ITS1 region mainly supported the relationship between the species of Cassiinae established by traditional morphological methods. The ITS1 region showed a higher discrimination power and desirable characteristics as compared to ITS2 and ITS1 + 2, thereby concluding to be the locus of choice. Considering the complexity of the group and the underlying biological ambiguities, the results presented here are encouraging for developing DNA barcoding as a useful tool for resolving taxonomical challenges in corroboration with morphological framework.
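The Kimura two-parameter (K2P) distance named in the record above has a standard closed form, d = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)), where P and Q are the proportions of transitional and transversional differences between two aligned sequences. The following is a minimal Python sketch of that formula only; it is not the TaxonDNA or MEGA implementation, and the function name and the choice to skip gaps and ambiguity codes are assumptions made for illustration.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences.

    Hypothetical helper: sites holding a gap or ambiguity code in either
    sequence are skipped (an assumption, not taken from the abstract).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; every other change is a transversion
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent (saturated)
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

A barcoding gap in this sense would show up as inter-specific K2P distances that are consistently larger than intra-specific ones.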
Open Reading Frame Phylogenetic Analysis on the Cloud

PubMed Central

2013-01-01

Phylogenetic analysis has become essential in researching the evolutionary relationships between viruses. These relationships are depicted on phylogenetic trees, in which viruses are grouped based on sequence similarity. Viral evolutionary relationships are identified from open reading frames rather than from complete sequences. Recently, cloud computing has become popular for developing internet-based bioinformatics tools. Biocloud is an efficient, scalable, and robust bioinformatics computing service. In this paper, we propose a cloud-based open reading frame phylogenetic analysis service. The proposed service integrates the Hadoop framework, virtualization technology, and phylogenetic analysis methods to provide a high-availability, large-scale bioservice. In a case study, we analyze the phylogenetic relationships among Norovirus. Evolutionary relationships are elucidated by aligning different open reading frame sequences. The proposed platform correctly identifies the evolutionary relationships between members of Norovirus. PMID:23671843
Applying phylogenetic analysis to viral livestock diseases: moving beyond molecular typing.

PubMed

Olvera, Alex; Busquets, Núria; Cortey, Marti; de Deus, Nilsa; Ganges, Llilianne; Núñez, José Ignacio; Peralta, Bibiana; Toskano, Jennifer; Dolz, Roser

2010-05-01

Changes in livestock production systems in recent years have altered the presentation of many diseases resulting in the need for more sophisticated control measures. At the same time, new molecular assays have been developed to support the diagnosis of animal viral disease. Nucleotide sequences generated by these diagnostic techniques can be used in phylogenetic analysis to infer phenotypes by sequence homology and to perform molecular epidemiology studies. In this review, some key elements of phylogenetic analysis are highlighted, such as the selection of the appropriate neutral phylogenetic marker, the proper phylogenetic method and different techniques to test the reliability of the resulting tree. Examples are given of current and future applications of phylogenetic reconstructions in viral livestock diseases. Copyright 2009 Elsevier Ltd. All rights reserved.
Mammalian phylogenetic diversity-area relationships at a continental scale

PubMed Central

Mazel, Florent; Renaud, Julien; Guilhaumon, François; Mouillot, David; Gravel, Dominique; Thuiller, Wilfried

2015-01-01

In analogy to the species-area relationship (SAR), one of the few laws in Ecology, the phylogenetic diversity-area relationship (PDAR) describes the tendency of phylogenetic diversity (PD) to increase with area. Although investigating PDAR has the potential to unravel the underlying processes shaping assemblages across spatial scales and to predict PD loss through habitat reduction, it has been little investigated so far. Focusing on PD has noticeable advantages compared to species richness (SR) since PD also gives insights on processes such as speciation/extinction, assembly rules and ecosystem functioning. Here we investigate the universality and pervasiveness of the PDAR at continental scale using terrestrial mammals as a study case. We define the relative robustness of PD (compared to SR) to habitat loss as the area between the standardized PDAR and standardized SAR (i.e. standardized by the diversity of the largest spatial window) divided by the area under the standardized SAR only. This metric quantifies the relative increase of PD robustness compared to SR robustness. We show that PD robustness is higher than SR robustness but that it varies among continents. We further use a null model approach to disentangle the relative effect of phylogenetic tree shape and non-random spatial distribution of evolutionary history on the PDAR. We find that for most spatial scales and for all continents except Eurasia, PDARs are not different from those expected under a model using only the observed SAR and the shape of the phylogenetic tree at continental scale. Interestingly, we detect a strong phylogenetic structure of the Eurasian PDAR that can be predicted by a model that specifically accounts for a finer biogeographical delineation of this continent. In conclusion, the relative robustness of PD to habitat loss compared to species richness is determined by the phylogenetic tree shape but also depends on the spatial structure of PD. PMID:26649401
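The robustness metric defined in the record above (area between the standardized PDAR and standardized SAR, divided by the area under the standardized SAR) can be sketched numerically. The Python below is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the curves are assumed to be sampled at the same window areas sorted in ascending order, integration uses the trapezoid rule, and the area axis is used untransformed.

```python
def trapezoid_area(x, y):
    """Area under the piecewise-linear curve through the points (x[i], y[i])."""
    return sum((x[i + 1] - x[i]) * (y[i] + y[i + 1]) / 2 for i in range(len(x) - 1))


def pd_relative_robustness(areas, pd_values, sr_values):
    """Relative robustness of PD compared to SR, following the verbal definition.

    areas      -- spatial window areas, ascending (assumption)
    pd_values  -- phylogenetic diversity observed in each window
    sr_values  -- species richness observed in each window
    """
    # Standardize both curves by the diversity of the largest spatial window
    pd_std = [v / pd_values[-1] for v in pd_values]
    sr_std = [v / sr_values[-1] for v in sr_values]
    # Area between the standardized PDAR and the standardized SAR
    between = trapezoid_area(areas, [p - s for p, s in zip(pd_std, sr_std)])
    # Area under the standardized SAR only
    under_sar = trapezoid_area(areas, sr_std)
    return between / under_sar
```

Under this reading, a positive value means the standardized PDAR lies above the standardized SAR on balance, i.e. PD is retained better than SR as the window shrinks.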
Comparative evaluation of the identification of rapidly growing non-tuberculous mycobacteria by mass spectrometry (MALDI-TOF MS), GenoType Mycobacterium CM/AS assay and partial sequencing of the rpoβ gene with phylogenetic analysis as a reference method.

PubMed

Costa-Alcalde, José Javier; Barbeito-Castiñeiras, Gema; González-Alba, José María; Aguilera, Antonio; Galán, Juan Carlos; Pérez-Del-Molino, María Luisa

2018-06-02

The American Thoracic Society and the Infectious Diseases Society of America recommend that clinically significant non-tuberculous mycobacteria (NTM) should be identified to the species level in order to determine their clinical significance. The aim of this study was to evaluate identification of rapidly growing NTM (RGM) isolated from clinical samples by using MALDI-TOF MS and a commercial molecular system. The results were compared with identification using a reference method. We included 46 clinical isolates of RGM and identified them using the commercial molecular system GenoType® CM/AS (Hain Lifescience, Germany), MALDI-TOF MS (Bruker) and, as reference method, partial rpoβ gene sequencing followed by BLAST and phylogenetic analysis with the 1093 sequences available in GenBank. The degree of agreement between GenoType® and MALDI-TOF MS and the reference method, partial rpoβ sequencing, was 27/43 (62.8%) and 38/43 cases (88.3%) respectively. For all the samples correctly classified by GenoType®, we obtained the same result with MALDI-TOF MS (27/27). However, MALDI-TOF MS also correctly identified 68.75% (11/16) of the samples that GenoType® had misclassified (p=0.005). MALDI-TOF MS classified significantly better than GenoType®. When a MALDI-TOF MS score >1.85 was achieved, MALDI-TOF MS and partial rpoβ gene sequencing were equivalent. GenoType® was not able to distinguish between species belonging to the M. fortuitum complex. MALDI-TOF MS methodology is simple, rapid and associated with lower consumable costs than GenoType®. The partial rpoβ sequencing methods with BLAST and phylogenetic analysis were not able to identify some RGM unequivocally. Therefore, sequencing of additional regions would be indicated in these cases. Copyright © 2018 Elsevier España, S.L.U. and Sociedad Española de Enfermedades Infecciosas y Microbiología Clínica. All rights reserved.
Bottomless barrel-sponge species in the Indo-Pacific?

PubMed

Setiawan, Edwin; Voogd, Nicole J De; Wörheide, Gert; Erpenbeck, Dirk

2016-07-06

The use of nuclear markers, in addition to traditional mitochondrial markers, helps to clarify hidden patterns of genetic structure in natural populations (Palumbi & Baker, 1994). This is particularly evident among demosponges that possess slow mitochondrial evolutionary rates compared to Bilateria, where nuclear intron markers can aid in the understanding of shallow-level phylogenetic relationships (Shearer et al., 2002). Ideally, these nuclear markers (i) are evolutionarily well conserved across different lineages, (ii) produce amplicons holding a number of sites with sufficient variability to answer the relevant phylogenetic question, and (iii) derive from single-copy genes (see review in Zhang & Hewitt, 2003). A popular method to amplify intron markers uses EPIC (Exon-Primed, Intron-Crossing) primers that anneal to the more conserved flanking exon regions and subsequently bridge the intron during amplification (Palumbi & Baker, 1994).
Covariant Evolutionary Event Analysis for Base Interaction Prediction Using a Relational Database Management System for RNA.

PubMed

Xu, Weijia; Ozer, Stuart; Gutell, Robin R

2009-01-01

With an increasingly large amount of sequences properly aligned, comparative sequence analysis can accurately identify not only common structures formed by standard base pairing but also new types of structural elements and constraints. However, traditional methods are too computationally expensive to perform well on large scale alignment and less effective with the sequences from diversified phylogenetic classifications. We propose a new approach that utilizes coevolutional rates among pairs of nucleotide positions using phylogenetic and evolutionary relationships of the organisms of aligned sequences. With a novel data schema to manage relevant information within a relational database, our method, implemented with a Microsoft SQL Server 2005, showed 90% sensitivity in identifying base pair interactions among 16S ribosomal RNA sequences from Bacteria, at a scale 40 times bigger and 50% better sensitivity than a previous study. The results also indicated covariation signals for a few sets of cross-strand base stacking pairs in secondary structure helices, and other subtle constraints in the RNA structure.
Covariant Evolutionary Event Analysis for Base Interaction Prediction Using a Relational Database Management System for RNA

PubMed Central

Xu, Weijia; Ozer, Stuart; Gutell, Robin R.

2010-01-01

With an increasingly large amount of sequences properly aligned, comparative sequence analysis can accurately identify not only common structures formed by standard base pairing but also new types of structural elements and constraints. However, traditional methods are too computationally expensive to perform well on large scale alignment and less effective with the sequences from diversified phylogenetic classifications. We propose a new approach that utilizes coevolutional rates among pairs of nucleotide positions using phylogenetic and evolutionary relationships of the organisms of aligned sequences. With a novel data schema to manage relevant information within a relational database, our method, implemented with a Microsoft SQL Server 2005, showed 90% sensitivity in identifying base pair interactions among 16S ribosomal RNA sequences from Bacteria, at a scale 40 times bigger and 50% better sensitivity than a previous study. The results also indicated covariation signals for a few sets of cross-strand base stacking pairs in secondary structure helices, and other subtle constraints in the RNA structure. PMID:20502534
Bacterial genomes in epidemiology—present and future

PubMed Central

Croucher, Nicholas J.; Harris, Simon R.; Grad, Yonatan H.; Hanage, William P.

2013-01-01

Sequence data are well established in the reconstruction of the phylogenetic and demographic scenarios that have given rise to outbreaks of viral pathogens. The application of similar methods to bacteria has been hindered in the main by the lack of high-resolution nucleotide sequence data from quality samples. Developing and already available genomic methods have greatly increased the amount of data that can be used to characterize an isolate and its relationship to others. However, differences in sequencing platforms and data analysis mean that these enhanced data come with a cost in terms of portability: results from one laboratory may not be directly comparable with those from another. Moreover, genomic data for many bacteria bear the mark of a history including extensive recombination, which has the potential to greatly confound phylogenetic and coalescent analyses. Here, we discuss the exacting requirements of genomic epidemiology, and means by which the distorting signal of recombination can be minimized to permit the leverage of growing datasets of genomic data from bacterial pathogens. PMID:23382424
Evolution of gastropod mitochondrial genome arrangements

PubMed Central

2008-01-01

Background Gastropod mitochondrial genomes exhibit an unusually great variety of gene orders compared to other metazoan mitochondrial genomes, e.g., those of vertebrates. Hence, gastropod mitochondrial genomes constitute a good model system to study patterns, rates, and mechanisms of mitochondrial genome rearrangement. However, this kind of evolutionary comparative analysis requires a robust phylogenetic framework of the group under study, which has been elusive so far for gastropods in spite of the efforts carried out during the last two decades. Here, we report the complete nucleotide sequence of five mitochondrial genomes of gastropods (Pyramidella dolabrata, Ascobulla fragilis, Siphonaria pectinata, Onchidella celtica, and Myosotella myosotis), and we analyze them together with another ten complete mitochondrial genomes of gastropods currently available in molecular databases in order to reconstruct the phylogenetic relationships among the main lineages of gastropods. Results Comparative analyses with other mollusk mitochondrial genomes allowed us to describe molecular features and general trends in the evolution of mitochondrial genome organization in gastropods. Phylogenetic reconstruction with commonly used methods of phylogenetic inference (ME, MP, ML, BI) arrived at a single topology, which was used to reconstruct the evolution of mitochondrial gene rearrangements in the group. Conclusion Four main lineages were identified within gastropods: Caenogastropoda, Vetigastropoda, Patellogastropoda, and Heterobranchia. Caenogastropoda and Vetigastropoda are sister taxa, as are Patellogastropoda and Heterobranchia. This result rejects the validity of the derived clade Apogastropoda (Caenogastropoda + Heterobranchia). The position of Patellogastropoda remains unclear, likely due to long-branch attraction biases. Within Heterobranchia, the most heterogeneous group of gastropods, neither Euthyneura (because of the inclusion of P. dolabrata) nor Pulmonata (polyphyletic) nor Opisthobranchia (because of the inclusion of S. pectinata) were recovered as monophyletic groups. The gene order of the Vetigastropoda might represent the ancestral mitochondrial gene order for Gastropoda and we propose that at least three major rearrangements have taken place in the evolution of gastropods: one in the ancestor of Caenogastropoda, another in the ancestor of Patellogastropoda, and one more in the ancestor of Heterobranchia. PMID:18302768
Comparative transcriptomics of early dipteran development

PubMed Central

2013-01-01

Background Modern sequencing technologies have massively increased the amount of data available for comparative genomics. Whole-transcriptome shotgun sequencing (RNA-seq) provides a powerful basis for comparative studies. In particular, this approach holds great promise for emerging model species in fields such as evolutionary developmental biology (evo-devo). Results We have sequenced early embryonic transcriptomes of two non-drosophilid dipteran species: the moth midge Clogmia albipunctata, and the scuttle fly Megaselia abdita. Our analysis includes a third, published, transcriptome for the hoverfly Episyrphus balteatus. These emerging models for comparative developmental studies close an important phylogenetic gap between Drosophila melanogaster and other insect model systems. In this paper, we provide a comparative analysis of early embryonic transcriptomes across species, and use our data for a phylogenomic re-evaluation of dipteran phylogenetic relationships. Conclusions We show how comparative transcriptomics can be used to create useful resources for evo-devo, and to investigate phylogenetic relationships. Our results demonstrate that de novo assembly of short (Illumina) reads yields high-quality, high-coverage transcriptomic data sets. We use these data to investigate deep dipteran phylogenetic relationships. Our results, based on a concatenation of 160 orthologous genes, provide support for the traditional view of Clogmia being the sister group of Brachycera (Megaselia, Episyrphus, Drosophila), rather than that of Culicomorpha (which includes mosquitoes and blackflies). PMID:23432914
Phylogenetics of modern birds in the era of genomics

PubMed Central

Edwards, Scott V; Bryan Jennings, W; Shedlock, Andrew M

2005-01-01

In the 14 years since the first higher-level bird phylogenies based on DNA sequence data, avian phylogenetics has witnessed the advent and maturation of the genomics era, the completion of the chicken genome and a suite of technologies that promise to add considerably to the agenda of avian phylogenetics. In this review, we summarize current approaches and data characteristics of recent higher-level bird studies and suggest a number of as yet untested molecular and analytical approaches for the unfolding tree of life for birds. A variety of comparative genomics strategies, including adoption of objective quality scores for sequence data, analysis of contiguous DNA sequences provided by large-insert genomic libraries, and the systematic use of retroposon insertions and other rare genomic changes all promise an integrated phylogenetics that is solidly grounded in genome evolution. The avian genome is an excellent testing ground for such approaches because of the more balanced representation of single-copy and repetitive DNA regions than in mammals. Although comparative genomics has a number of obvious uses in avian phylogenetics, its application to large numbers of taxa poses a number of methodological and infrastructural challenges, and can be greatly facilitated by a ‘community genomics’ approach in which the modest sequencing throughputs of single PI laboratories are pooled to produce larger, complementary datasets. Although the polymerase chain reaction era of avian phylogenetics is far from complete, the comparative genomics era—with its ability to vastly increase the number and type of molecular characters and to provide a genomic context for these characters—will usher in a host of new perspectives and opportunities for integrating genome evolution and avian phylogenetics. PMID:16024355
Treetrimmer: a method for phylogenetic dataset size reduction.

PubMed

Maruyama, Shinichiro; Eveleigh, Robert J M; Archibald, John M

2013-04-12

With rapid advances in genome sequencing and bioinformatics, it is now possible to generate phylogenetic trees containing thousands of operational taxonomic units (OTUs) from a wide range of organisms. However, use of rigorous tree-building methods on such large datasets is prohibitive and manual 'pruning' of sequence alignments is time consuming and raises concerns over reproducibility. There is a need for bioinformatic tools with which to objectively carry out such pruning procedures. Here we present 'TreeTrimmer', a bioinformatics procedure that removes unnecessary redundancy in large phylogenetic datasets, alleviating the size effect on more rigorous downstream analyses. The method identifies and removes user-defined 'redundant' sequences, e.g., orthologous sequences from closely related organisms and 'recently' evolved lineage-specific paralogs. Representative OTUs are retained for more rigorous re-analysis. TreeTrimmer reduces the OTU density of phylogenetic trees without sacrificing taxonomic diversity while retaining the original tree topology, thereby speeding up downstream computer-intensive analyses, e.g., Bayesian and maximum likelihood tree reconstructions, in a reproducible fashion.
Application of unweighted pair group methods with arithmetic average (UPGMA) for identification of kinship types and spreading of ebola virus through establishment of phylogenetic tree

NASA Astrophysics Data System (ADS)

Andriani, Tri; Irawan, Mohammad Isa

2017-08-01

Ebola Virus Disease (EVD) is a disease caused by a virus of the genus Ebolavirus (EBOV), family Filoviridae. Ebola virus is classified into five types, namely Zaire ebolavirus (ZEBOV), Sudan ebolavirus (SEBOV), Bundibugyo ebolavirus (BEBOV), Tai Forest ebolavirus, also known as Cote d'Ivoire ebolavirus (CIEBOV), and Reston ebolavirus (REBOV). Kinship among the types of Ebola virus can be identified using phylogenetic trees. In this study, the phylogenetic tree was constructed by the UPGMA method, with multiple alignment performed by the progressive method. The resulting tree shows that Tai Forest ebolavirus is closely related to Bundibugyo ebolavirus, although the areas in which their epidemics spread lie far apart. The genetic distance between Bundibugyo ebolavirus and Tai Forest ebolavirus is 0.3725. The similarity of Tai Forest ebolavirus to Bundibugyo ebolavirus is therefore not influenced by the geographic proximity of the areas of epidemic spread.
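UPGMA, the clustering method used in the record above, repeatedly merges the two closest clusters and averages their distances to the remaining clusters, weighted by cluster size. The following is a minimal Python sketch of that procedure on a generic distance table. The function name, the input format (a dict keyed by label pairs) and the nested-tuple tree output are illustrative assumptions; it does not reproduce the study's alignment step or its data.

```python
from itertools import combinations


def upgma(labels, distances):
    """Cluster taxa with UPGMA.

    labels    -- list of taxon names
    distances -- dict mapping (name_a, name_b) tuples to pairwise distances
    Returns a list of (cluster, height) merges in the order they were made;
    the last entry holds the full tree as nested tuples.
    """
    sizes = {label: 1 for label in labels}  # cluster -> number of leaves
    d = {frozenset(pair): dist for pair, dist in distances.items()}
    merges = []
    while len(sizes) > 1:
        # Pick the closest pair of current clusters
        a, b = min(combinations(sizes, 2), key=lambda pair: d[frozenset(pair)])
        height = d[frozenset((a, b))] / 2  # node sits at half the merge distance
        new = (a, b)
        # Size-weighted average distance from the new cluster to every other one
        for other in sizes:
            if other == a or other == b:
                continue
            d[frozenset((new, other))] = (
                sizes[a] * d[frozenset((a, other))]
                + sizes[b] * d[frozenset((b, other))]
            ) / (sizes[a] + sizes[b])
        sizes[new] = sizes[a] + sizes[b]
        del sizes[a], sizes[b]
        merges.append((new, height))
    return merges
```

Because UPGMA assumes a constant rate of change along all lineages, the heights it returns are only meaningful as relative ages when that assumption roughly holds.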
A Complete Fossil-Calibrated Phylogeny of Seed Plant Families as a Tool for Comparative Analyses: Testing the ‘Time for Speciation’ Hypothesis

PubMed Central

Harris, Liam W.; Davies, T. Jonathan

2016-01-01

Explaining the uneven distribution of species richness across the branches of the tree of life has been a major challenge for evolutionary biologists. Advances in phylogenetic reconstruction, allowing the generation of large, well-sampled, phylogenetic trees have provided an opportunity to contrast competing hypotheses. Here, we present a new time-calibrated phylogeny of seed plant families using Bayesian methods and 26 fossil calibrations. While there are various published phylogenetic trees for plants which have a greater density of species sampling, we are still a long way from generating a complete phylogeny for all ~300,000+ plants. Our phylogeny samples all seed plant families and is a useful tool for comparative analyses. We use this new phylogenetic hypothesis to contrast two alternative explanations for differences in species richness among higher taxa: time for speciation versus ecological limits. We calculated net diversification rate for each clade in the phylogeny and assessed the relationship between clade age and species richness. We then fit models of speciation and extinction to individual branches in the tree to identify major rate-shifts. Our data suggest that the majority of lineages are diversifying very slowly while a few lineages, distributed throughout the tree, are diversifying rapidly. Diversification is unrelated to clade age, no matter the age range of the clades being examined, contrary to both the assumption of an unbounded lineage increase through time, and the paradigm of fixed ecological limits. These findings are consistent with the idea that ecology plays a role in diversification, but rather than imposing a fixed limit, it may have variable effects on per lineage diversification rates through time. PMID:27706173
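The record above reports net diversification rates per clade without stating the estimator in the abstract. As a point of reference only, the simplest widely used form assumes pure birth (no extinction): r = ln(N)/t for a stem age, and r = (ln(N) - ln 2)/t for a crown age. The Python sketch below implements just that zero-extinction case; the function name and parameters are hypothetical and the authors' actual method may differ.

```python
import math


def net_diversification_rate(n_species, clade_age, crown=False):
    """Pure-birth net diversification rate (zero extinction assumed).

    n_species -- extant species richness of the clade
    clade_age -- stem age by default, crown age if crown=True (same time unit)
    """
    if n_species < 1 or clade_age <= 0:
        raise ValueError("need at least one species and a positive age")
    if crown:
        # A crown group starts from two lineages rather than one
        return (math.log(n_species) - math.log(2)) / clade_age
    return math.log(n_species) / clade_age
```

If diversification were unbounded and rates similar across clades, richness would rise with clade age under this formula; the abstract reports that no such relationship was found.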
Allometry of sexual size dimorphism in turtles: a comparison of mass and length data.

PubMed

Regis, Koy W; Meik, Jesse M

2017-01-01

The macroevolutionary pattern of Rensch's Rule (positive allometry of sexual size dimorphism) has had mixed support in turtles. Using the largest carapace length dataset and only large-scale body mass dataset assembled for this group, we determine (a) whether turtles conform to Rensch's Rule at the order, suborder, and family levels, and (b) whether inferences regarding allometry of sexual size dimorphism differ based on choice of body size metric used for analyses. We compiled databases of mean body mass and carapace length for males and females for as many populations and species of turtles as possible. We then determined scaling relationships between males and females for average body mass and straight carapace length using traditional and phylogenetic comparative methods. We also used regression analyses to evaluate sex-specific differences in the variance explained by carapace length on body mass. Using traditional (non-phylogenetic) analyses, body mass supports Rensch's Rule, whereas straight carapace length supports isometry. Using phylogenetic independent contrasts, both body mass and straight carapace length support Rensch's Rule with strong congruence between metrics. At the family level, support for Rensch's Rule is more frequent when mass is used and in phylogenetic comparative analyses. Turtles do not differ in slopes of sex-specific mass-to-length regressions and more variance in body size within each sex is explained by mass than by carapace length. Turtles display Rensch's Rule overall and within families of Cryptodires, but not within Pleurodire families. Mass and length are strongly congruent with respect to Rensch's Rule across turtles, and discrepancies are observed mostly at the family level (the level where Rensch's Rule is most often evaluated). At macroevolutionary scales, the purported advantages of length measurements over weight are not supported in turtles.
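One common convention for testing Rensch's Rule is to regress log male size on log female size across species: a slope of 1 indicates isometry and a slope above 1 is read as support for the rule. The Python sketch below computes that slope by ordinary least squares. It corresponds only to the "traditional (non-phylogenetic)" kind of analysis mentioned above; the function name is hypothetical, and the choice of ordinary least squares (rather than, for example, major-axis regression or independent contrasts) is an assumption.

```python
import math


def allometric_slope(female_sizes, male_sizes):
    """OLS slope of log10(male size) on log10(female size) across species."""
    if len(female_sizes) != len(male_sizes) or len(female_sizes) < 2:
        raise ValueError("need paired values for at least two species")
    x = [math.log10(v) for v in female_sizes]
    y = [math.log10(v) for v in male_sizes]
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Slope = covariance(x, y) / variance(x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx
```

Running the same function once on mass and once on carapace length would mirror the comparison of body size metrics described in the abstract, with the caveat that species values are not statistically independent without a phylogenetic correction.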
Genealogical Working Distributions for Bayesian Model Testing with Phylogenetic Uncertainty

PubMed Central

Baele, Guy; Lemey, Philippe; Suchard, Marc A.

2016-01-01

Marginal likelihood estimates to compare models using Bayes factors frequently accompany Bayesian phylogenetic inference. Approaches to estimate marginal likelihoods have garnered increased attention over the past decade. In particular, the introduction of path sampling (PS) and stepping-stone sampling (SS) into Bayesian phylogenetics has tremendously improved the accuracy of model selection. These sampling techniques are now used to evaluate complex evolutionary and population genetic models on empirical data sets, but considerable computational demands hamper their widespread adoption. Further, when very diffuse, but proper priors are specified for model parameters, numerical issues complicate the exploration of the priors, a necessary step in marginal likelihood estimation using PS or SS. To avoid such instabilities, generalized SS (GSS) has recently been proposed, introducing the concept of “working distributions” to facilitate—or shorten—the integration process that underlies marginal likelihood estimation. However, the need to fix the tree topology currently limits GSS in a coalescent-based framework. Here, we extend GSS by relaxing the fixed underlying tree topology assumption. To this purpose, we introduce a “working” distribution on the space of genealogies, which enables estimating marginal likelihoods while accommodating phylogenetic uncertainty. We propose two different “working” distributions that help GSS to outperform PS and SS in terms of accuracy when comparing demographic and evolutionary models applied to synthetic data and real-world examples. Further, we show that the use of very diffuse priors can lead to a considerable overestimation in marginal likelihood when using PS and SS, while still retrieving the correct marginal likelihood using both GSS approaches. The methods used in this article are available in BEAST, a powerful user-friendly software package to perform Bayesian evolutionary analyses. PMID:26526428
MASTtreedist: visualization of tree space based on maximum agreement subtree.

PubMed

Huang, Hong; Li, Yongji

2013-01-01

The phylogenetic tree construction process might produce many candidate trees as the "best estimates." As the number of constructed phylogenetic trees grows, the need to efficiently compare their topological or physical structures arises. One tree comparison software tool, Mesquite's Tree Set Viz module, allows the rapid and efficient visualization of the tree comparison distances using multidimensional scaling (MDS). Tree-distance measures, such as Robinson-Foulds (RF), for the topological distance among different trees have been implemented in Tree Set Viz. New and more sophisticated measures such as Maximum Agreement Subtree (MAST) can be built upon Tree Set Viz. MAST can detect the common substructures among trees and provide more precise information on the similarity of the trees, but it is NP-hard and difficult to implement. In this article, we present a practical tree-distance metric: MASTtreedist, a MAST-based comparison metric in Mesquite's Tree Set Viz module. In this metric, the efficient optimizations for the maximum weight clique problem are applied. The results suggest that the proposed method can efficiently compute the MAST distances among trees, and such tree topological differences can be translated as a scatter of points in two-dimensional (2D) space. We also provide a statistical evaluation of the provided measures with respect to RF using experimental data sets. This new comparison module provides a new tree-tree pairwise comparison metric based on the differences of the number of MAST leaves among constructed phylogenetic trees. Such a new phylogenetic tree comparison metric improves the visualization of taxa differences by discriminating small divergences of subtree structures for phylogenetic tree reconstruction.
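For orientation, the Robinson-Foulds (RF) distance that the record above uses as a baseline counts the groupings present in one tree but not the other. The Python sketch below computes a rooted, clade-based variant on trees written as nested tuples of leaf names. It is not the Tree Set Viz or MASTtreedist implementation; the tree encoding, the function names and the use of rooted clades instead of unrooted bipartitions are assumptions made to keep the example short.

```python
def clades(tree):
    """Return the set of leaf sets (clades) defined by the internal nodes of a tree.

    A tree is a nested tuple; anything that is not a tuple is treated as a leaf.
    """
    found = set()

    def visit(node):
        if not isinstance(node, tuple):
            return frozenset([node])
        # A node's clade is the union of its children's leaf sets
        leaves = frozenset().union(*(visit(child) for child in node))
        found.add(leaves)
        return leaves

    visit(tree)
    return found


def rf_distance(tree_a, tree_b):
    """Rooted Robinson-Foulds distance: clades found in exactly one of the two trees."""
    return len(clades(tree_a) ^ clades(tree_b))
```

A matrix of such pairwise distances over a set of candidate trees is the kind of input that multidimensional scaling turns into the 2D scatter described in the abstract.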

pplacer: linear time maximum-likelihood and Bayesian phylogenetic placement of sequences onto a fixed reference tree

PubMed Central

2010-01-01

Background Likelihood-based phylogenetic inference is generally considered to be the most reliable classification method for unknown sequences. However, traditional likelihood-based phylogenetic methods cannot be applied to large volumes of short reads from next-generation sequencing due to computational complexity issues and lack of phylogenetic signal. "Phylogenetic placement," where a reference tree is fixed and the unknown query sequences are placed onto the tree via a reference alignment, is a way to bring the inferential power offered by likelihood-based approaches to large data sets. Results This paper introduces pplacer, a software package for phylogenetic placement and subsequent visualization. The algorithm can place twenty thousand short reads on a reference tree of one thousand taxa per hour per processor, has essentially linear time and memory complexity in the number of reference taxa, and is easy to run in parallel. Pplacer features calculation of the posterior probability of a placement on an edge, which is a statistically rigorous way of quantifying uncertainty on an edge-by-edge basis. It also can inform the user of the positional uncertainty for query sequences by calculating expected distance between placement locations, which is crucial in the estimation of uncertainty with a well-sampled reference tree. The software provides visualizations using branch thickness and color to represent number of placements and their uncertainty. A simulation study using reads generated from 631 COG alignments shows a high level of accuracy for phylogenetic placement over a wide range of alignment diversity, and the power of edge uncertainty estimates to measure placement confidence. Conclusions Pplacer enables efficient phylogenetic placement and subsequent visualization, making likelihood-based phylogenetics methodology practical for large collections of reads; it is freely available as source code, binaries, and a web service. PMID:21034504
Evidence for a close phylogenetic relationship between Melissococcus pluton, the causative agent of European foulbrood disease, and the genus Enterococcus.

PubMed

Cai, J; Collins, M D

1994-04-01

The 16S rRNA gene sequence of Melissococcus pluton, the causative agent of European foulbrood disease, was determined in order to investigate the phylogenetic relationships between this organism and other low-G + C-content gram-positive bacteria. A comparative sequence analysis revealed that M. pluton is a close phylogenetic relative of the genus Enterococcus.
Phylogenetic position of the genus Perkinsus (Protista, Apicomplexa) based on small subunit ribosomal RNA.

PubMed

Goggin, C L; Barker, S C

1993-07-01

Parasites of the genus Perkinsus destroy marine molluscs worldwide. Their phylogenetic position within the kingdom Protista is controversial. Nucleotide sequence data (1792 bp) from the small subunit rRNA gene of Perkinsus sp. from Anadara trapezia (Mollusca: Bivalvia) from Moreton Bay, Queensland, was used to examine the phylogenetic affinities of this enigmatic genus. These data were aligned with nucleotide sequences from 6 apicomplexans, 3 ciliates, 3 flagellates, a dinoflagellate, 3 fungi, maize and human. Phylogenetic trees were constructed after analysis with maximum parsimony and distance matrix methods. Our analyses indicate that Perkinsus is phylogenetically closer to dinoflagellates and to coccidean and piroplasm apicomplexans than to fungi or flagellates.
CtGEM typing: Discrimination of Chlamydia trachomatis ocular and urogenital strains and major evolutionary lineages by high resolution melting analysis of two amplified DNA fragments.

PubMed

Giffard, Philip M; Andersson, Patiyan; Wilson, Judith; Buckley, Cameron; Lilliebridge, Rachael; Harris, Tegan M; Kleinecke, Mariana; O'Grady, Kerry-Ann F; Huston, Wilhelmina M; Lambert, Stephen B; Whiley, David M; Holt, Deborah C

2018-01-01

Chlamydia trachomatis infects the urogenital tract (UGT) and eyes. Anatomical tropism is correlated with variation in the major outer membrane protein encoded by ompA. Strains possessing the ocular ompA variants A, B, Ba and C are typically found within the phylogenetically coherent "classical ocular lineage". However, variants B, Ba and C have also been found within three distinct strains in Australia, all associated with ocular disease in children and outside the classical ocular lineage. CtGEM genotyping is a method for detecting and discriminating ocular strains and also the major phylogenetic lineages. The rationale was facilitation of surveillance to inform responses to C. trachomatis detection in UGT specimens from young children. CtGEM typing is based on high resolution melting analysis (HRMA) of two PCR amplified fragments with high combinatorial resolving power, as defined by computerised comparison of 65 whole genomes. One fragment is from the hypothetical gene defined by Jali-1891 in the C. trachomatis B_Jali20 genome, while the other is from ompA. Twenty combinatorial CtGEM types have been shown to exist, and these encompass unique genotypes for all known ocular strains, and also delineate the T1 and T2 major phylogenetic lineages, identify LGV strains and provide additional resolution beyond this. CtGEM typing and Sanger sequencing were compared with 42 C. trachomatis positive clinical specimens, and there were no disjunctions. CtGEM typing is a highly efficient method designed and tested using large scale comparative genomics. It divides C. trachomatis into clinically and biologically meaningful groups, and may have broad application in surveillance.
Multilocus variable-number tandem repeat analysis for molecular typing and phylogenetic analysis of Shigella flexneri

PubMed Central

2009-01-01

Background Shigella flexneri is one of the causative agents of shigellosis, a major cause of childhood mortality in developing countries. Multilocus variable-number tandem repeat (VNTR) analysis (MLVA) is a prominent subtyping method to resolve closely related bacterial isolates for investigation of disease outbreaks and provide information for establishing phylogenetic patterns among isolates. The present study aimed to develop an MLVA method for S. flexneri and the VNTR loci identified were tested on 242 S. flexneri isolates to evaluate their variability in various serotypes. The isolates were also analyzed by pulsed-field gel electrophoresis (PFGE) to compare the discriminatory power and to evaluate the usefulness of MLVA as a tool for phylogenetic analysis of S. flexneri. Results Thirty-six VNTR loci were identified by exploring the repeat sequence loci in genomic sequences of Shigella species and by testing the loci on nine isolates of different subserotypes. The VNTR loci in different serotype groups differed greatly in their variability. The discriminatory power of an MLVA assay based on four most variable VNTR loci was higher, though not significantly, than PFGE for the total isolates, a panel of 2a isolates, which were relatively diverse, and a panel of 4a/Y isolates, which were closely-related. Phylogenetic groupings based on PFGE patterns and MLVA profiles were considerably concordant. The genetic relationships among the isolates were correlated with serotypes. The phylogenetic trees constructed using PFGE patterns and MLVA profiles presented two distinct clusters for the isolates of serotype 3 and one distinct cluster for each of the serotype groups, 1a/1b/NT, 2a/2b/X/NT, 4a/Y, and 6. Isolates that had different serotypes but had closer genetic relatedness than those with the same serotype were observed between serotype Y and subserotype 4a, serotype X and subserotype 2b, subserotype 1a and 1b, and subserotype 3a and 3b. 
Conclusions The 36 VNTR loci identified exhibited considerably different degrees of variability among S. flexneri serotype groups. A VNTR locus could be highly variable in one serotype but invariable in others. An MLVA assay based on four highly variable loci could display a resolving power comparable to PFGE in discriminating isolates. MLVA is also a prominent molecular tool for phylogenetic analysis of S. flexneri; the resulting data are beneficial for establishing clear clonal patterns among different serotype groups and for discerning clonal groups among isolates within the same serotype. As highly variable VNTR loci could be serotype-specific, a common MLVA protocol that consists of only a small set of loci, for example four to eight loci, and that provides high resolving power for all S. flexneri serotypes may not be obtainable. PMID:20042119
Evolutionary History of the Asian Horned Frogs (Megophryinae): Integrative Approaches to Timetree Dating in the Absence of a Fossil Record.

PubMed

Mahony, Stephen; Foley, Nicole M; Biju, S D; Teeling, Emma C

2017-03-01

Molecular dating studies typically need fossils to calibrate the analyses. Unfortunately, the fossil record is extremely poor or presently nonexistent for many species groups, rendering such dating analysis difficult. One such group is the Asian horned frogs (Megophryinae). Sampling all generic nomina, we combined a novel ∼5 kb dataset composed of four nuclear and three mitochondrial gene fragments to produce a robust phylogeny, with an extensive external morphological study to produce a working taxonomy for the group. Expanding the molecular dataset to include out-groups of fossil-represented ancestral anuran families, we compared the priorless RelTime dating method with the widely used prior-based Bayesian timetree method, MCMCtree, utilizing a novel combination of fossil priors for anuran phylogenetic dating. The phylogeny was then subjected to ancestral phylogeographic analyses, and dating estimates were compared with likely biogeographic vicariant events. Phylogenetic analyses demonstrated that previously proposed systematic hypotheses were incorrect due to the paraphyly of genera. Molecular phylogenetic, morphological, and timetree results support the recognition of Megophryinae as a single genus, Megophrys, with a subgenus level classification. Timetree results using RelTime better corresponded with the known fossil record for the out-group anuran tree. For the priorless in-group, it also outperformed MCMCtree when node date estimates were compared with likely influential historical biogeographic events, providing novel insights into the evolutionary history of this pan-Asian anuran group. Given a relatively small molecular dataset, and limited prior knowledge, this study demonstrates that the computationally rapid RelTime dating tool may outperform more popular and complex prior reliant timetree methodologies. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. 
Invasions but not extinctions change phylogenetic diversity of angiosperm assemblage on southeastern Pacific Oceanic islands

PubMed Central

2017-01-01

We assessed changes in phylogenetic diversity of angiosperm flora on six oceanic islands located in the southeastern Pacific Ocean, by comparing flora from two periods: the pre-European colonization of islands and current times. We hypothesize that, in the time between these periods, extinction of local plant species and addition of exotic plants modified phylogenetic-α-diversity at different levels (deeper and terminal phylogeny) and increased phylo-β-diversity among islands. Based on floristic studies, we assembled a phylogenetic tree from occurrence data that includes 921 species, of which 165 and 756 were native and exotic in origin, respectively. Then, we studied change in the phylo-α-diversity and phylo-β-diversity (1 − PhyloSor) by comparing pre-European and current times. Despite extinction of 18 native angiosperm species, an increase in species richness and phylo-α-diversity was observed for all islands studied, attributed to introduction of exotic plants (between 6 and 477 species per island). We did not observe significant variation of mean phylogenetic distance (MPD), a measure of the ‘deeper’ phylogenetic diversity of assemblages (e.g., orders, families), suggesting that neither extinctions nor introductions altered phylogenetic structure of the angiosperms of these islands. In regard to phylo-β-diversity, we detected temporal turnover (variation in phylogenetic composition) in the flora between periods (0.38 ± 0.11). However, when analyses were performed only considering native plants, we did not observe significant temporal turnover between periods (0.07 ± 0.06). These results indicate that introduction of exotic angiosperms has contributed more notably than extinctions to the configuration of plant assemblages and phylogenetic diversity on the studied islands.
Because phylogenetic diversity is closely related to functional diversity (species trait variations and roles performed by organisms), our results suggest that the introduction of exotic plants to these islands could have detrimental impacts on the ecosystem functions and ecosystem services that islands provide (e.g. productivity). PMID:28763508
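The two quantities named in the abstract above, mean phylogenetic distance (MPD) and the PhyloSor similarity (reported there as 1 − PhyloSor), can be illustrated on a toy tree. The following is a minimal sketch and not the authors' pipeline: the tree, the taxon names and all function names are hypothetical, and PhyloSor is assumed to be computed on rooted branch sets, i.e. every tip contributes its full path to the root.

```python
from itertools import combinations

# Hypothetical toy tree: child -> (parent, branch length); "root" has no entry.
TREE = {
    "A": ("n1", 1.0),
    "B": ("n1", 1.0),
    "n1": ("root", 2.0),
    "C": ("root", 3.0),
}

def path_to_root(tip):
    """Edges (named by their child node) on the path from tip to the root."""
    edges = set()
    node = tip
    while node in TREE:
        edges.add(node)
        node = TREE[node][0]
    return edges

def branch_set(community):
    """Union of root paths for all tips present in a community."""
    edges = set()
    for tip in community:
        edges |= path_to_root(tip)
    return edges

def total_length(edges):
    return sum(TREE[e][1] for e in edges)

def phylosor(comm_a, comm_b):
    """Shared branch length divided by the mean branch length of both communities."""
    ea, eb = branch_set(comm_a), branch_set(comm_b)
    return 2.0 * total_length(ea & eb) / (total_length(ea) + total_length(eb))

def tip_distance(x, y):
    """Patristic distance: edges on exactly one of the two root paths."""
    return total_length(path_to_root(x) ^ path_to_root(y))

def mpd(community):
    """Mean pairwise phylogenetic distance among the tips of a community."""
    pairs = list(combinations(sorted(community), 2))
    return sum(tip_distance(x, y) for x, y in pairs) / len(pairs)
```

A temporal turnover value of the kind quoted in the abstract would then be `1 - phylosor(past, present)` for the species lists of one island in the two periods.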
Taming the BEAST—A Community Teaching Material Resource for BEAST 2

PubMed Central

Barido-Sottani, Joëlle; Bošková, Veronika; Plessis, Louis Du; Kühnert, Denise; Magnus, Carsten; Mitov, Venelin; Müller, Nicola F.; Pečerska, Jūlija; Rasmussen, David A.; Zhang, Chi; Drummond, Alexei J.; Heath, Tracy A.; Pybus, Oliver G.; Vaughan, Timothy G.; Stadler, Tanja

2018-01-01

Phylogenetics and phylodynamics are central topics in modern evolutionary biology. Phylogenetic methods reconstruct the evolutionary relationships among organisms, whereas phylodynamic approaches reveal the underlying diversification processes that lead to the observed relationships. These two fields have many practical applications in disciplines as diverse as epidemiology, developmental biology, palaeontology, ecology, and linguistics. The combination of increasingly large genetic data sets and increases in computing power is facilitating the development of more sophisticated phylogenetic and phylodynamic methods. Big data sets allow us to answer complex questions. However, since the required analyses are highly specific to the particular data set and question, a black-box method is not sufficient anymore. Instead, biologists are required to be actively involved with modeling decisions during data analysis. The modular design of the Bayesian phylogenetic software package BEAST 2 enables, and in fact enforces, this involvement. At the same time, the modular design enables computational biology groups to develop new methods at a rapid rate. A thorough understanding of the models and algorithms used by inference software is a critical prerequisite for successful hypothesis formulation and assessment. In particular, there is a need for more readily available resources aimed at helping interested scientists equip themselves with the skills to confidently use cutting-edge phylogenetic analysis software. These resources will also benefit researchers who do not have access to similar courses or training at their home institutions. Here, we introduce the “Taming the Beast” (https://taming-the-beast.github.io/) resource, which was developed as part of a workshop series bearing the same name, to facilitate the usage of the Bayesian phylogenetic software package BEAST 2. PMID:28673048
Taming the BEAST-A Community Teaching Material Resource for BEAST 2.

PubMed

Barido-Sottani, Joëlle; Bošková, Veronika; Plessis, Louis Du; Kühnert, Denise; Magnus, Carsten; Mitov, Venelin; Müller, Nicola F; Pecerska, Julija; Rasmussen, David A; Zhang, Chi; Drummond, Alexei J; Heath, Tracy A; Pybus, Oliver G; Vaughan, Timothy G; Stadler, Tanja

2018-01-01

Phylogenetics and phylodynamics are central topics in modern evolutionary biology. Phylogenetic methods reconstruct the evolutionary relationships among organisms, whereas phylodynamic approaches reveal the underlying diversification processes that lead to the observed relationships. These two fields have many practical applications in disciplines as diverse as epidemiology, developmental biology, palaeontology, ecology, and linguistics. The combination of increasingly large genetic data sets and increases in computing power is facilitating the development of more sophisticated phylogenetic and phylodynamic methods. Big data sets allow us to answer complex questions. However, since the required analyses are highly specific to the particular data set and question, a black-box method is not sufficient anymore. Instead, biologists are required to be actively involved with modeling decisions during data analysis. The modular design of the Bayesian phylogenetic software package BEAST 2 enables, and in fact enforces, this involvement. At the same time, the modular design enables computational biology groups to develop new methods at a rapid rate. A thorough understanding of the models and algorithms used by inference software is a critical prerequisite for successful hypothesis formulation and assessment. In particular, there is a need for more readily available resources aimed at helping interested scientists equip themselves with the skills to confidently use cutting-edge phylogenetic analysis software. These resources will also benefit researchers who do not have access to similar courses or training at their home institutions. Here, we introduce the "Taming the Beast" (https://taming-the-beast.github.io/) resource, which was developed as part of a workshop series bearing the same name, to facilitate the usage of the Bayesian phylogenetic software package BEAST 2. © The Author(s) 2017. 
Published by Oxford University Press, on behalf of the Society of Systematic Biologists.
ProtPhylo: identification of protein-phenotype and protein-protein functional associations via phylogenetic profiling.

PubMed

Cheng, Yiming; Perocchi, Fabiana

2015-07-01

ProtPhylo is a web-based tool to identify proteins that are functionally linked to either a phenotype or a protein of interest based on co-evolution. ProtPhylo infers functional associations by comparing protein phylogenetic profiles (co-occurrence patterns of orthology relationships) for more than 9.7 million non-redundant protein sequences from all three domains of life. Users can query any of 2048 fully sequenced organisms, including 1678 bacteria, 255 eukaryotes and 115 archaea. In addition, they can tailor ProtPhylo to a particular kind of biological question by choosing among four main orthology inference methods based either on pair-wise sequence comparisons (One-way Best Hits and Best Reciprocal Hits) or clustering of orthologous proteins across multiple species (OrthoMCL and eggNOG). Next, ProtPhylo ranks phylogenetic neighbors of query proteins or phenotypic properties using the Hamming distance as a measure of similarity between pairs of phylogenetic profiles. Candidate hits can be easily and flexibly prioritized by complementary clues on subcellular localization, known protein-protein interactions, membrane spanning regions and protein domains. The resulting protein list can be quickly exported into a csv text file for further analyses. ProtPhylo is freely available at http://www.protphylo.org. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
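The ranking step described above, which uses Hamming distance between presence/absence profiles, can be sketched in a few lines. This is an illustrative sketch only and not ProtPhylo's implementation: the profile encoding (one 0/1 entry per genome), the protein names and the function names are assumptions.

```python
def hamming(profile_a, profile_b):
    """Number of genomes in which two presence/absence profiles disagree."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same set of genomes")
    return sum(a != b for a, b in zip(profile_a, profile_b))

def rank_neighbors(query, profiles):
    """Sort protein names by Hamming distance to the query profile.

    Ties are broken alphabetically so the ordering is deterministic.
    """
    return sorted(profiles, key=lambda name: (hamming(query, profiles[name]), name))

# Hypothetical profiles over six genomes (1 = ortholog present, 0 = absent).
query = (1, 1, 0, 0, 1, 0)
profiles = {
    "protX": (1, 1, 0, 0, 1, 0),  # identical pattern, distance 0
    "protY": (1, 0, 0, 0, 1, 0),  # one mismatch
    "protZ": (0, 0, 1, 1, 0, 1),  # exact complement, distance 6
}
ranking = rank_neighbors(query, profiles)
```

A smaller distance means a more similar co-occurrence pattern across genomes, which is the signal used to propose a functional link.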
Radiating despite a Lack of Character: Ecological Divergence among Closely Related, Morphologically Similar Honeyeaters (Aves: Meliphagidae) Co-occurring in Arid Australian Environments.

PubMed

Miller, Eliot T; Wagner, Sarah K; Harmon, Luke J; Ricklefs, Robert E

2017-02-01

Quantifying the relationship between form and function can inform use of morphology as a surrogate for ecology. How the strength of this relationship varies continentally can inform understanding of evolutionary radiations; for example, does the relationship break down when certain lineages invade and diversify in novel habitats? The 75 species of Australian honeyeaters (Meliphagidae) are morphologically and ecologically diverse, with species feeding on nectar, insects, fruit, and other resources. We investigated Meliphagidae ecomorphology and community structure by (1) quantifying the concordance between morphology and ecology (foraging behavior), (2) estimating rates of trait evolution in relation to the packing of ecological space, and (3) comparing phylogenetic and trait community structure across the broad environmental gradients of the continent. We found that morphology explained 37% of the variance in ecology (and 62% vice versa), and we uncovered well-known bivariate relationships among the multivariate ecomorphological data. Ecological trait diversity declined less rapidly than phylogenetic diversity along a gradient of decreasing precipitation. We employ a new method (trait fields) and extend another (phylogenetic fields) to show that while species in phylogenetically clustered, arid-environment assemblages are similar morphologically, they are as varied in foraging behavior as those from more diverse assemblages. Thus, although closely related and similar morphologically, these arid-adapted species have diverged in ecological space to a similar degree as their mesic counterparts.
Inferring species trees from incongruent multi-copy gene trees using the Robinson-Foulds distance

PubMed Central

2013-01-01

Background Constructing species trees from multi-copy gene trees remains a challenging problem in phylogenetics. One difficulty is that the underlying genes can be incongruent due to evolutionary processes such as gene duplication and loss, deep coalescence, or lateral gene transfer. Gene tree estimation errors may further exacerbate the difficulties of species tree estimation. Results We present a new approach for inferring species trees from incongruent multi-copy gene trees that is based on a generalization of the Robinson-Foulds (RF) distance measure to multi-labeled trees (mul-trees). We prove that it is NP-hard to compute the RF distance between two mul-trees; however, it is easy to calculate this distance between a mul-tree and a singly-labeled species tree. Motivated by this, we formulate the RF problem for mul-trees (MulRF) as follows: Given a collection of multi-copy gene trees, find a singly-labeled species tree that minimizes the total RF distance from the input mul-trees. We develop and implement a fast SPR-based heuristic algorithm for the NP-hard MulRF problem. We compare the performance of the MulRF method (available at http://genome.cs.iastate.edu/CBL/MulRF/) with several gene tree parsimony approaches using gene tree simulations that incorporate gene tree error, gene duplications and losses, and/or lateral transfer. The MulRF method produces more accurate species trees than gene tree parsimony approaches. We also demonstrate that the MulRF method infers in minutes a credible plant species tree from a collection of nearly 2,000 gene trees. Conclusions Our new phylogenetic inference method, based on a generalized RF distance, makes it possible to quickly estimate species trees from large genomic data sets. 
Since the MulRF method, unlike gene tree parsimony, is based on a generic tree distance measure, it is appealing for analyses of genomic data sets, in which many processes such as deep coalescence, recombination, gene duplication and losses as well as phylogenetic error may contribute to gene tree discord. In experiments, the MulRF method estimated species trees accurately and quickly, demonstrating MulRF as an efficient alternative approach for phylogenetic inference from large-scale genomic data sets. PMID:24180377
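For orientation, the ordinary Robinson-Foulds distance that MulRF generalizes can be written as the size of the symmetric difference between the sets of non-trivial splits of two trees on the same taxa. The sketch below covers only this singly-labeled case, not the mul-tree generalization or the SPR heuristic from the abstract. The nested-tuple tree encoding and the function names are assumptions made for illustration.

```python
def leaf_set(tree):
    """All leaf labels of a tree given as nested tuples of strings."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))

def splits(tree):
    """Non-trivial bipartitions, each stored as the side without the reference leaf."""
    all_leaves = leaf_set(tree)
    ref = min(all_leaves)  # fixed reference leaf makes each split canonical
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        clade = frozenset().union(*(visit(child) for child in node))
        side = all_leaves - clade if ref in clade else clade
        # keep only splits with at least two leaves on both sides
        if 1 < len(side) < len(all_leaves) - 1:
            found.add(side)
        return clade

    visit(tree)
    return found

def rf_distance(tree_a, tree_b):
    """Number of splits present in exactly one of the two trees."""
    if leaf_set(tree_a) != leaf_set(tree_b):
        raise ValueError("trees must be on the same leaf set")
    return len(splits(tree_a) ^ splits(tree_b))

t1 = ((("A", "B"), "C"), ("D", "E"))
t2 = ((("A", "C"), "B"), ("D", "E"))
```

Because splits are unrooted bipartitions, the two children of a binary root contribute the same split and are counted once.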
Phylogenetic Molecular Species Delimitations Unravel Potential New Species in the Pest Genus Spodoptera Guenée, 1852 (Lepidoptera, Noctuidae)

PubMed Central

Dumas, Pascaline; Barbut, Jérôme; Le Ru, Bruno; Silvain, Jean-François; Clamens, Anne-Laure; d’Alençon, Emmanuelle; Kergoat, Gael J.

2015-01-01

Nowadays molecular species delimitation methods promote the identification of species boundaries within complex taxonomic groups by adopting innovative species concepts and theories (e.g. branching patterns, coalescence). As some of them can efficiently deal with large single-locus datasets, they could speed up the process of species discovery compared to more time-consuming molecular methods, and benefit from the existence of large public datasets; these methods can also particularly favour scientific research and actions dealing with threatened or economically important taxa. In this study we aim to investigate and clarify the status of economically important moth species belonging to the genus Spodoptera (Lepidoptera, Noctuidae), a complex group in which previous phylogenetic analyses and integrative approaches already suggested the possible occurrence of cryptic species and taxonomic ambiguities. In this work, the effectiveness of innovative (and faster) species delimitation approaches to infer putative species boundaries has been successfully tested in Spodoptera, by processing the most comprehensive dataset (in terms of number of species and specimens) ever achieved; results are congruent and reliable, irrespective of the set of parameters and phylogenetic models applied. Our analyses confirm the existence of three potential new species clusters (for S. exigua (Hübner, 1808), S. frugiperda (J.E. Smith, 1797) and S. mauritia (Boisduval, 1833)) and support the synonymy of S. marima (Schaus, 1904) with S. ornithogalli (Guenée, 1852). They also highlight the ambiguity of the status of S. cosmiodes (Walker, 1858) and S. descoinsi Lalanne-Cassou & Silvain, 1994. This case study highlights the interest of molecular species delimitation methods as valuable tools for species discovery and for emphasizing taxonomic ambiguities. PMID:25853412
Nodal distances for rooted phylogenetic trees.

PubMed

Cardona, Gabriel; Llabrés, Mercè; Rosselló, Francesc; Valiente, Gabriel

2010-08-01

Dissimilarity measures for (possibly weighted) phylogenetic trees based on the comparison of their vectors of path lengths between pairs of taxa, have been present in the systematics literature since the early seventies. For rooted phylogenetic trees, however, these vectors can only separate non-weighted binary trees, and therefore these dissimilarity measures are metrics only on this class of rooted phylogenetic trees. In this paper we overcome this problem, by splitting in a suitable way each path length between two taxa into two lengths. We prove that the resulting splitted path lengths matrices single out arbitrary rooted phylogenetic trees with nested taxa and arcs weighted in the set of positive real numbers. This allows the definition of metrics on this general class of rooted phylogenetic trees by comparing these matrices through metrics in spaces $M_n(\mathbb{R})$ of real-valued $n \times n$ matrices. We conclude this paper by establishing some basic facts about the metrics for non-weighted phylogenetic trees defined in this way using $L^p$ metrics on $M_n(\mathbb{R})$, with $p \in \mathbb{R}_{>0}$.
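One way to read the construction above is that, for each ordered pair of taxa (i, j), the path between them is split at their lowest common ancestor, and the entry for (i, j) records only the part from that ancestor down to i. The sketch below follows that reading for non-weighted trees and compares two matrices entrywise with an L^p formula. It is an interpretation of the abstract rather than the paper's exact definitions, and the child-to-parent encoding and all names are hypothetical.

```python
def ancestors(parent, node):
    """Chain node, parent(node), ..., root for a child -> parent mapping."""
    chain = [node]
    while chain[-1] in parent:
        chain.append(parent[chain[-1]])
    return chain

def split_path_lengths(parent, taxa):
    """Entry (i, j): number of arcs from the lowest common ancestor of i and j down to i."""
    matrix = {}
    for i in taxa:
        up_i = ancestors(parent, i)
        for j in taxa:
            up_j = set(ancestors(parent, j))
            # first node on i's root path that is also an ancestor of j
            matrix[(i, j)] = next(k for k, a in enumerate(up_i) if a in up_j)
    return matrix

def nodal_distance(parent_a, parent_b, taxa, p=1):
    """Entrywise L^p comparison of the two split path length matrices."""
    la = split_path_lengths(parent_a, taxa)
    lb = split_path_lengths(parent_b, taxa)
    return sum(abs(la[key] - lb[key]) ** p for key in la) ** (1.0 / p)

# Hypothetical rooted trees on taxa A, B, C.
TREE_1 = {"A": "x", "B": "x", "x": "r", "C": "r"}  # ((A,B),C)
TREE_2 = {"A": "r", "B": "y", "C": "y", "y": "r"}  # (A,(B,C))
TAXA = ["A", "B", "C"]
```

The matrix is not symmetric in general: in `TREE_1` the entry for (A, C) is 2 while the entry for (C, A) is 1, which is the extra information gained by splitting each path length in two.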
Incompletely resolved phylogenetic trees inflate estimates of phylogenetic conservatism.

PubMed

Davies, T Jonathan; Kraft, Nathan J B; Salamin, Nicolas; Wolkovich, Elizabeth M

2012-02-01

The tendency for more closely related species to share similar traits and ecological strategies can be explained by their longer shared evolutionary histories and represents phylogenetic conservatism. How strongly species traits co-vary with phylogeny can significantly impact how we analyze cross-species data and can influence our interpretation of assembly rules in the rapidly expanding field of community phylogenetics. Phylogenetic conservatism is typically quantified by analyzing the distribution of species values on the phylogenetic tree that connects them. Many phylogenetic approaches, however, assume a completely sampled phylogeny: while we have good estimates of deeper phylogenetic relationships for many species-rich groups, such as birds and flowering plants, we often lack information on more recent interspecific relationships (i.e., within a genus). A common solution has been to represent these relationships as polytomies on trees using taxonomy as a guide. Here we show that such trees can dramatically inflate estimates of phylogenetic conservatism quantified using S. P. Blomberg et al.'s K statistic. Using simulations, we show that even randomly generated traits can appear to be phylogenetically conserved on poorly resolved trees. We provide a simple rarefaction-based solution that can reliably retrieve unbiased estimates of K, and we illustrate our method using data on first flowering times from Thoreau's woods (Concord, Massachusetts, USA).
Simultaneously estimating evolutionary history and repeated traits phylogenetic signal: applications to viral and host phenotypic evolution

PubMed Central

Vrancken, Bram; Lemey, Philippe; Rambaut, Andrew; Bedford, Trevor; Longdon, Ben; Günthard, Huldrych F.; Suchard, Marc A.

2014-01-01

Phylogenetic signal quantifies the degree to which resemblance in continuously-valued traits reflects phylogenetic relatedness. Measures of phylogenetic signal are widely used in ecological and evolutionary research, and are recently gaining traction in viral evolutionary studies. Standard estimators of phylogenetic signal frequently condition on data summary statistics of the repeated trait observations and fixed phylogenetic trees, resulting in information loss and potential bias. To incorporate the observation process and phylogenetic uncertainty in a model-based approach, we develop a novel Bayesian inference method to simultaneously estimate the evolutionary history and phylogenetic signal from molecular sequence data and repeated multivariate traits. Our approach builds upon a phylogenetic diffusion framework that models continuous trait evolution as a Brownian motion process and incorporates Pagel’s λ transformation parameter to estimate dependence among traits. We provide a computationally efficient inference implementation in the BEAST software package. We evaluate the synthetic performance of the Bayesian estimator of phylogenetic signal against standard estimators, and demonstrate the use of our coherent framework to address several virus-host evolutionary questions, including virulence heritability for HIV, antigenic evolution in influenza and HIV, and Drosophila sensitivity to sigma virus infection. Finally, we discuss model extensions that will make useful contributions to our flexible framework for simultaneously studying sequence and trait evolution. PMID:25780554
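As background for the λ parameter mentioned above: in its usual form, Pagel's λ rescales the phylogenetic variance-covariance matrix implied by Brownian motion by multiplying the off-diagonal entries (shared branch lengths) by λ and leaving the diagonal untouched. The sketch below shows only that textbook transformation, not the BEAST implementation from the abstract. The example matrix, the restriction of λ to [0, 1] and the function name are assumptions.

```python
def pagel_lambda(vcv, lam):
    """Scale off-diagonal entries of a phylogenetic covariance matrix by lam.

    vcv[i][j] is the shared root-to-ancestor path length of tips i and j;
    vcv[i][i] is the root-to-tip length of tip i and is left unchanged.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("this sketch assumes 0 <= lambda <= 1")
    n = len(vcv)
    return [
        [vcv[i][j] if i == j else lam * vcv[i][j] for j in range(n)]
        for i in range(n)
    ]

# Hypothetical three-tip tree ((A,B),C), all tips at depth 3; A and B share 2 units.
VCV = [
    [3.0, 2.0, 0.0],
    [2.0, 3.0, 0.0],
    [0.0, 0.0, 3.0],
]
star_like = pagel_lambda(VCV, 0.0)   # no phylogenetic covariance left
half = pagel_lambda(VCV, 0.5)
unchanged = pagel_lambda(VCV, 1.0)   # pure Brownian motion expectation
```

With λ = 0 the traits are treated as independent across tips, and with λ = 1 the covariance matches Brownian motion on the original tree, so intermediate values measure how much of that expected signal is present.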
Phylogenetic versus functional signals in the evolution of form-function relationships in terrestrial vision.

PubMed

Motani, Ryosuke; Schmitz, Lars

2011-08-01

Phylogeny is deeply pertinent to evolutionary studies. Traits that perform a body function are expected to be strongly influenced by physical "requirements" of the function. We investigated if such traits exhibit phylogenetic signals, and, if so, how phylogenetic noise biases quantification of form-function relationships. A form-function system that is strongly influenced by physics, namely the relationship between eye morphology and visual optics in amniotes, was used. We quantified the correlation between form (i.e., eye morphology) and function (i.e., ocular optics) while varying the level of phylogenetic bias removal through adjusting Pagel's λ. Ocular soft-tissue dimensions exhibited the highest correlation with ocular optics when 1% of phylogenetic bias expected from Brownian motion was removed (i.e., λ = 0.01); the value for hard-tissue data was 8%. A small degree of phylogenetic bias therefore exists in morphology despite the stringent functional constraints. We also devised a phylogenetically informed discriminant analysis and recorded the effects of phylogenetic bias on this method using the same data. Use of proper λ values during phylogenetic bias removal improved misidentification rates in resulting classifications when prior probabilities were assumed to be equal. Even a small degree of phylogenetic bias affected the classification resulting from phylogenetically informed discriminant analysis. © 2011 The Author(s). Evolution © 2011 The Society for the Study of Evolution.
Measuring the distance between multiple sequence alignments.

PubMed

Blackburne, Benjamin P; Whelan, Simon

2012-02-15

Multiple sequence alignment (MSA) is a core method in bioinformatics. The accuracy of such alignments may influence the success of downstream analyses such as phylogenetic inference, protein structure prediction, and functional prediction. The importance of MSA has led to the proliferation of MSA methods, with different objective functions and heuristics to search for the optimal MSA. Different methods of inferring MSAs produce different results in all but the most trivial cases. By measuring the differences between inferred alignments, we may be able to develop an understanding of how these differences (i) relate to the objective functions and heuristics used in MSA methods, and (ii) affect downstream analyses. We introduce four metrics to compare MSAs, which include the position in a sequence where a gap occurs or the location on a phylogenetic tree where an insertion or deletion (indel) event occurs. We use both real and synthetic data to explore the information given by these metrics and demonstrate how the different metrics in combination can yield more information about MSA methods and the differences between them. MetAl is a free software implementation of these metrics in Haskell. Source and binaries for Windows, Linux and Mac OS X are available from http://kumiho.smith.man.ac.uk/whelan/software/metal/.
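To make the idea of a distance between alignments concrete, here is a minimal sketch of one simple approach: collect the pairs of residues that each alignment places in the same column, then compare the two sets. This is an illustrative homology-set distance written for this listing, not necessarily one of the four MetAl metrics, and the function names and toy alignments are hypothetical.

```python
from itertools import combinations


def homology_pairs(alignment, gap="-"):
    """Return the set of residue pairs that the alignment places in one column.

    Each residue is identified as (sequence index, ungapped position), so the
    same residue gets the same identity in any alignment of the same sequences.
    """
    counters = [0] * len(alignment)
    pairs = set()
    for col in range(len(alignment[0])):
        present = []
        for seq_index, row in enumerate(alignment):
            if row[col] != gap:
                present.append((seq_index, counters[seq_index]))
                counters[seq_index] += 1
        pairs.update(combinations(present, 2))
    return pairs


def alignment_distance(aln_a, aln_b):
    """One minus the fraction of homology pairs shared by the two alignments."""
    pairs_a, pairs_b = homology_pairs(aln_a), homology_pairs(aln_b)
    union = pairs_a | pairs_b
    if not union:
        return 0.0
    return 1.0 - len(pairs_a & pairs_b) / len(union)


# Two hypothetical alignments of the same sequences (ACGT and ACAGT)
# that differ only in where the gap is placed.
aln_a = ["AC-GT", "ACAGT"]
aln_b = ["ACG-T", "ACAGT"]
distance = alignment_distance(aln_a, aln_b)
```

Because only aligned residue pairs are compared, this sketch ignores where gaps fall relative to each other, which is exactly the kind of information the gap-position and tree-aware metrics described above are meant to add.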
Picante: R tools for integrating phylogenies and ecology.

PubMed

Kembel, Steven W; Cowan, Peter D; Helmus, Matthew R; Cornwell, William K; Morlon, Helene; Ackerly, David D; Blomberg, Simon P; Webb, Campbell O

2010-06-01

Picante is a software package that provides a comprehensive set of tools for analyzing the phylogenetic and trait diversity of ecological communities. The package calculates phylogenetic diversity metrics, performs trait comparative analyses, manipulates phenotypic and phylogenetic data, and performs tests for phylogenetic signal in trait distributions, community structure and species interactions. Picante is a package for the R statistical language and environment written in R and C, released under a GPL v2 open-source license, and freely available on the web (http://picante.r-forge.r-project.org) and from CRAN (http://cran.r-project.org).
One tree to link them all: a phylogenetic dataset for the European tetrapoda.

PubMed

Roquet, Cristina; Lavergne, Sébastien; Thuiller, Wilfried

2014-08-08

With the ever-increasing availability of phylogenetically informative data, the last decade has seen an upsurge of ecological studies incorporating information on evolutionary relationships among species. However, detailed species-level phylogenies, which are necessary for comprehensive large-scale eco-phylogenetic analyses, are still lacking for many large groups and regions. Here, we provide a dataset of 100 dated phylogenetic trees for all European tetrapods based on a mixture of supermatrix and supertree approaches. Phylogenetic inference was performed separately for each of the main Tetrapoda groups of Europe except mammals (i.e. amphibians, birds, squamates and turtles) by means of maximum likelihood (ML) analyses of a supermatrix applying a tree constraint at the family (amphibians and squamates) or order (birds and turtles) levels based on consensus knowledge. For each group, we inferred 100 ML trees to be able to provide a phylogenetic dataset that accounts for phylogenetic uncertainty, and assessed node support with bootstrap analyses. Each tree was dated using penalized-likelihood and fossil calibration. The trees obtained were well-supported by existing knowledge and previous phylogenetic studies. For mammals, we modified the most complete supertree dataset available in the literature to include a recent update of the Carnivora clade. As a final step, we merged the phylogenetic trees of all groups to obtain a set of 100 phylogenetic trees for all European Tetrapoda species for which data was available (91%). We provide this phylogenetic dataset (100 chronograms) for the purpose of comparative analyses, macro-ecological or community ecology studies aiming to incorporate phylogenetic information while accounting for phylogenetic uncertainty.

Universal artifacts affect the branching of phylogenetic trees, not universal scaling laws.

PubMed

Altaba, Cristian R

2009-01-01

The superficial resemblance of phylogenetic trees to other branching structures allows searching for macroevolutionary patterns. However, such trees are just statistical inferences of particular historical events. Recent meta-analyses report finding regularities in the branching pattern of phylogenetic trees. But is this supported by evidence, or are such regularities just methodological artifacts? If so, is there any signal in a phylogeny? In order to evaluate the impact of polytomies and imbalance on tree shape, the distribution of all binary and polytomic trees of up to 7 taxa was assessed in tree-shape space. The relationship between the proportion of outgroups and the amount of imbalance introduced with them was assessed applying four different tree-building methods to 100 combinations from a set of 10 ingroup and 9 outgroup species, and performing covariance analyses. The relevance of this analysis was explored taking 61 published phylogenies, based on nucleic acid sequences and involving various taxa, taxonomic levels, and tree-building methods. All methods of phylogenetic inference are quite sensitive to the artifacts introduced by outgroups. However, published phylogenies appear to be subject to a rather effective, albeit rather intuitive control against such artifacts. The data and methods used to build phylogenetic trees are varied, so any meta-analysis is subject to pitfalls due to their uneven intrinsic merits, which translate into artifacts in tree shape. The binary branching pattern is an imposition of methods, and seldom reflects true relationships in intraspecific analyses, yielding artifactual polytomies in short trees. Above the species level, the departure of real trees from simplistic random models is caused at least by two natural factors--uneven speciation and extinction rates; and artifacts such as choice of taxa included in the analysis, and imbalance introduced by outgroups and basal paraphyletic taxa. 
This artifactual imbalance accounts for tree shape convergence of large trees. There is no evidence for any universal scaling in the tree of life. Instead, there is a need for improved methods of tree analysis that can be used to discriminate the noise due to outgroups from the phylogenetic signal within the taxon of interest, and to evaluate realistic models of evolution, correcting the retrospective perspective and explicitly recognizing extinction as a driving force. Artifacts are pervasive, and can only be overcome through understanding the structure and biological meaning of phylogenetic trees.
Phylogenetic perspective and the search for life on earth and elsewhere

NASA Technical Reports Server (NTRS)

Pace, Norman R.

1989-01-01

Any search for microbial life on Mars cannot rely upon cultivation of indigenous organisms. Only a minority of even terrestrial organisms that are observed in mixed, naturally-occurring microbial populations can be cultivated in the laboratory. Consequently, methods are being developed for analyzing the phylogenetic affiliations of the constituents of natural microbial populations without the need for their cultivation. This is more than an exercise in taxonomy, for the extent of phylogenetic relatedness between unknown and known organisms is some measure of the extent of their biochemical commonalities. In one approach, total DNA is isolated from natural microbial populations and 16S rRNA genes are shotgun cloned for rapid sequence determinations and phylogenetic analyses. A second approach employs oligodeoxynucleotide hybridization probes that bind to phylogenetic group-specific sequences in 16S rRNA. Since each actively growing cell contains about 10⁴ ribosomes, the binding of the diagnostic probes to single cells can be visualized by radioactivity or fluorescence. The application of these methods and the use of in situ cultivation techniques are illustrated using submarine hydrothermal vent communities. Recommendations are made regarding planning toward future Mars missions.
The spatial sensitivity of the spectral diversity-biodiversity relationship: an experimental test in a prairie grassland.

PubMed

Wang, Ran; Gamon, John A; Cavender-Bares, Jeannine; Townsend, Philip A; Zygielbaum, Arthur I

2018-03-01

Remote sensing has been used to detect plant biodiversity in a range of ecosystems based on the varying spectral properties of different species or functional groups. However, the most appropriate spatial resolution necessary to detect diversity remains unclear. At coarse resolution, differences among spectral patterns may be too weak to detect. In contrast, at fine resolution, redundant information may be introduced. To explore the effect of spatial resolution, we studied the scale dependence of spectral diversity in a prairie ecosystem experiment at Cedar Creek Ecosystem Science Reserve, Minnesota, USA. Our study involved a scaling exercise comparing synthetic pixels resampled from high-resolution images within manipulated diversity treatments. Hyperspectral data were collected using several instruments on both ground and airborne platforms. We used the coefficient of variation (CV) of spectral reflectance in space as the indicator of spectral diversity and then compared CV at different scales ranging from 1 mm² to 1 m² to conventional biodiversity metrics, including species richness, Shannon's index, Simpson's index, phylogenetic species variation, and phylogenetic species evenness. In this study, higher species richness plots generally had higher CV. CV showed higher correlations with Shannon's index and Simpson's index than did species richness alone, indicating evenness contributed to the spectral diversity. Correlations with species richness and Simpson's index were generally higher than with phylogenetic species variation and evenness measured at comparable spatial scales, indicating weaker relationships between spectral diversity and phylogenetic diversity metrics than with species diversity metrics. High resolution imaging spectrometer data (1 mm² pixels) showed the highest sensitivity to diversity level. 
With decreasing spatial resolution, the difference in CV between diversity levels decreased and greatly reduced the optical detectability of biodiversity. The optimal pixel size for distinguishing α diversity in these prairie plots appeared to be around 1 mm to 10 cm, a spatial scale similar to the size of an individual herbaceous plant. These results indicate a strong scale-dependence of the spectral diversity-biodiversity relationships, with spectral diversity best able to detect a combination of species richness and evenness, and more weakly detecting phylogenetic diversity. These findings can be used to guide airborne studies of biodiversity and develop more effective large-scale biodiversity sampling methods. ©2018 The Authors Ecological Applications published by Wiley Periodicals, Inc. on behalf of Ecological Society of America.
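The spectral diversity indicator described in this abstract, the coefficient of variation (CV) of reflectance across space, can be sketched as follows. The abstract does not spell out how CV is aggregated, so this sketch assumes that CV is computed per spectral band across the pixels of a plot (population standard deviation divided by mean) and then averaged over bands. The function name and the toy reflectance values are hypothetical.

```python
from statistics import mean, pstdev


def spectral_cv(pixels):
    """Average per-band coefficient of variation across the pixels of one plot.

    pixels is a list of spectra, one per pixel, each a list of reflectance
    values with one entry per band. Assumption: CV is taken band by band
    across pixels and then averaged over bands.
    """
    n_bands = len(pixels[0])
    per_band_cv = []
    for band in range(n_bands):
        values = [pixel[band] for pixel in pixels]
        per_band_cv.append(pstdev(values) / mean(values))
    return mean(per_band_cv)


# Hypothetical two-band plots: one spectrally uniform, one mixed.
uniform_plot = [[0.1, 0.4], [0.1, 0.4]]
mixed_plot = [[0.1, 0.4], [0.3, 0.2]]

cv_uniform = spectral_cv(uniform_plot)
cv_mixed = spectral_cv(mixed_plot)
```

Under these assumptions a plot whose pixels all share one spectrum has a CV of zero, and CV rises as pixels diverge, which matches the qualitative pattern the abstract reports for higher-richness plots.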
Deduction of probable events of lateral gene transfer through comparison of phylogenetic trees by recursive consolidation and rearrangement

PubMed Central

MacLeod, Dave; Charlebois, Robert L; Doolittle, Ford; Bapteste, Eric

2005-01-01

Background: When organismal phylogenies based on sequences of single marker genes are poorly resolved, a logical approach is to add more markers, on the assumption that weak but congruent phylogenetic signal will be reinforced in such multigene trees. Such approaches are valid only when the several markers indeed have identical phylogenies, an issue which many multigene methods (such as the use of concatenated gene sequences or the assembly of supertrees) do not directly address. Indeed, even when the true history is a mixture of vertical descent for some genes and lateral gene transfer (LGT) for others, such methods produce unique topologies. Results: We have developed software that aims to extract evidence for vertical and lateral inheritance from a set of gene trees compared against an arbitrary reference tree. This evidence is then displayed as a synthesis showing support over the tree for vertical inheritance, overlaid with explicit lateral gene transfer (LGT) events inferred to have occurred over the history of the tree. Like splits-tree methods, one can thus identify nodes at which conflict occurs. Additionally one can make reasonable inferences about vertical and lateral signal, assigning putative donors and recipients. Conclusion: A tool such as ours can serve to explore the reticulated dimensionality of molecular evolution, by dissecting vertical and lateral inheritance at high resolution. By this, we mean that individual nodes can be examined not only for congruence, but also for coherence in light of LGT. We assert that our tools will facilitate the comparison of phylogenetic trees, and the interpretation of conflicting data. PMID:15819979
Clustering Genes of Common Evolutionary History

PubMed Central

Gori, Kevin; Suchan, Tomasz; Alvarez, Nadir; Goldman, Nick; Dessimoz, Christophe

2016-01-01

Phylogenetic inference can potentially result in a more accurate tree using data from multiple loci. However, if the loci are incongruent—due to events such as incomplete lineage sorting or horizontal gene transfer—it can be misleading to infer a single tree. To address this, many previous contributions have taken a mechanistic approach, by modeling specific processes. Alternatively, one can cluster loci without assuming how these incongruencies might arise. Such “process-agnostic” approaches typically infer a tree for each locus and cluster these. There are, however, many possible combinations of tree distance and clustering methods; their comparative performance in the context of tree incongruence is largely unknown. Furthermore, because standard model selection criteria such as AIC cannot be applied to problems with a variable number of topologies, the issue of inferring the optimal number of clusters is poorly understood. Here, we perform a large-scale simulation study of phylogenetic distances and clustering methods to infer loci of common evolutionary history. We observe that the best-performing combinations are distances accounting for branch lengths followed by spectral clustering or Ward’s method. We also introduce two statistical tests to infer the optimal number of clusters and show that they strongly outperform the silhouette criterion, a general-purpose heuristic. We illustrate the usefulness of the approach by 1) identifying errors in a previous phylogenetic analysis of yeast species and 2) identifying topological incongruence among newly sequenced loci of the globeflower fly genus Chiastocheta. We release treeCl, a new program to cluster genes of common evolutionary history (http://git.io/treeCl). PMID:26893301
Aquatic insect ecophysiological traits reveal phylogenetically based differences in dissolved cadmium susceptibility.

PubMed

Buchwalter, David B; Cain, Daniel J; Martin, Caitrin A; Xie, Lingtian; Luoma, Samuel N; Garland, Theodore

2008-06-17

We used a phylogenetically based comparative approach to evaluate the potential for physiological studies to reveal patterns of diversity in traits related to susceptibility to an environmental stressor, the trace metal cadmium (Cd). Physiological traits related to Cd bioaccumulation, compartmentalization, and ultimately susceptibility were measured in 21 aquatic insect species representing the orders Ephemeroptera, Plecoptera, and Trichoptera. We mapped these experimentally derived physiological traits onto a phylogeny and quantified the tendency for related species to be similar (phylogenetic signal). All traits related to Cd bioaccumulation and susceptibility exhibited statistically significant phylogenetic signal, although the signal strength varied among traits. Conventional and phylogenetically based regression models were compared, revealing great variability within orders but consistent, strong differences among insect families. Uptake and elimination rate constants were positively correlated among species, but only when effects of body size and phylogeny were incorporated in the analysis. Together, uptake and elimination rates predicted dramatic Cd bioaccumulation differences among species that agreed with field-based measurements. We discovered a potential tradeoff between the ability to eliminate Cd and the ability to detoxify it across species, particularly mayflies. The best-fit regression models were driven by phylogenetic parameters (especially differences among families) rather than functional traits, suggesting that it may eventually be possible to predict a taxon's physiological performance based on its phylogenetic position, provided adequate physiological information is available for close relatives. There appears to be great potential for evolutionary physiological approaches to augment our understanding of insect responses to environmental stressors in nature.
Aquatic insect ecophysiological traits reveal phylogenetically based differences in dissolved cadmium susceptibility

PubMed Central

Buchwalter, David B.; Cain, Daniel J.; Martin, Caitrin A.; Xie, Lingtian; Luoma, Samuel N.; Garland, Theodore

2008-01-01

We used a phylogenetically based comparative approach to evaluate the potential for physiological studies to reveal patterns of diversity in traits related to susceptibility to an environmental stressor, the trace metal cadmium (Cd). Physiological traits related to Cd bioaccumulation, compartmentalization, and ultimately susceptibility were measured in 21 aquatic insect species representing the orders Ephemeroptera, Plecoptera, and Trichoptera. We mapped these experimentally derived physiological traits onto a phylogeny and quantified the tendency for related species to be similar (phylogenetic signal). All traits related to Cd bioaccumulation and susceptibility exhibited statistically significant phylogenetic signal, although the signal strength varied among traits. Conventional and phylogenetically based regression models were compared, revealing great variability within orders but consistent, strong differences among insect families. Uptake and elimination rate constants were positively correlated among species, but only when effects of body size and phylogeny were incorporated in the analysis. Together, uptake and elimination rates predicted dramatic Cd bioaccumulation differences among species that agreed with field-based measurements. We discovered a potential tradeoff between the ability to eliminate Cd and the ability to detoxify it across species, particularly mayflies. The best-fit regression models were driven by phylogenetic parameters (especially differences among families) rather than functional traits, suggesting that it may eventually be possible to predict a taxon's physiological performance based on its phylogenetic position, provided adequate physiological information is available for close relatives. There appears to be great potential for evolutionary physiological approaches to augment our understanding of insect responses to environmental stressors in nature. PMID:18559853
PhyLIS: a simple GNU/Linux distribution for phylogenetics and phyloinformatics.

PubMed

Thomson, Robert C

2009-07-30

PhyLIS is a free GNU/Linux distribution that is designed to provide a simple, standardized platform for phylogenetic and phyloinformatic analysis. The operating system incorporates most commonly used phylogenetic software, which has been pre-compiled and pre-configured, allowing for straightforward application of phylogenetic methods and development of phyloinformatic pipelines in a stable Linux environment. The software is distributed as a live CD and can be installed directly or run from the CD without making changes to the computer. PhyLIS is available for free at http://www.eve.ucdavis.edu/rcthomson/phylis/.
PhyLIS: A Simple GNU/Linux Distribution for Phylogenetics and Phyloinformatics

PubMed Central

Thomson, Robert C.

2009-01-01

PhyLIS is a free GNU/Linux distribution that is designed to provide a simple, standardized platform for phylogenetic and phyloinformatic analysis. The operating system incorporates most commonly used phylogenetic software, which has been pre-compiled and pre-configured, allowing for straightforward application of phylogenetic methods and development of phyloinformatic pipelines in a stable Linux environment. The software is distributed as a live CD and can be installed directly or run from the CD without making changes to the computer. PhyLIS is available for free at http://www.eve.ucdavis.edu/rcthomson/phylis/. PMID:19812729
Phylogenetic Analysis of Genome Rearrangements among Five Mammalian Orders

PubMed Central

Luo, Haiwei; Arndt, William; Zhang, Yiwei; Shi, Guanqun; Alekseyev, Max; Tang, Jijun; Hughes, Austin L.; Friedman, Robert

2015-01-01

Evolutionary relationships among placental mammalian orders have been controversial. Whole genome sequencing and new computational methods offer opportunities to resolve the relationships among 10 genomes belonging to the mammalian orders Primates, Rodentia, Carnivora, Perissodactyla and Artiodactyla. By application of the double cut and join distance metric, where gene order is the phylogenetic character, we computed genomic distances among the sampled mammalian genomes. With a marsupial outgroup, the gene order tree supported a topology in which Rodentia fell outside the cluster of Primates, Carnivora, Perissodactyla, and Artiodactyla. Results of breakpoint reuse rate and synteny block length analyses were consistent with the prediction of the random breakage model, which provided a diagnostic test to support use of gene order as an appropriate phylogenetic character in this study. We discuss the influence of rate differences among lineages and other factors that may contribute to different resolutions of mammalian ordinal relationships by different methods of phylogenetic reconstruction. PMID:22929217
MGUPGMA: A Fast UPGMA Algorithm With Multiple Graphics Processing Units Using NCCL

PubMed Central

Hua, Guan-Jie; Hung, Che-Lun; Lin, Chun-Yuan; Wu, Fu-Che; Chan, Yu-Wei; Tang, Chuan Yi

2017-01-01

A phylogenetic tree is a visual diagram of the relationships among a set of biological species. Scientists commonly use it to analyze many characteristics of the species. Distance-matrix methods, such as Unweighted Pair Group Method with Arithmetic Mean and Neighbor Joining, construct a phylogenetic tree by calculating pairwise genetic distances between taxa. These methods suffer from poor computational performance. Although several new methods with high-performance hardware and frameworks have been proposed, the issue still exists. In this work, a novel parallel Unweighted Pair Group Method with Arithmetic Mean approach on multiple Graphics Processing Units is proposed to construct a phylogenetic tree from an extremely large set of sequences. The experimental results show that the proposed approach on a DGX-1 server with 8 NVIDIA P100 graphics cards achieves approximately 3-fold to 7-fold speedup over the implementation of Unweighted Pair Group Method with Arithmetic Mean on a modern CPU and a single GPU, respectively. PMID:29051701
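For readers unfamiliar with the distance-matrix method named in this abstract, the following is a minimal single-threaded sketch of classic UPGMA. It is a plain reference version for illustration only and does not reflect the multi-GPU MGUPGMA design; the function name, the nested-tuple tree representation, and the toy distance matrix are assumptions made here.

```python
def upgma(names, dist):
    """Build a UPGMA tree from taxon names and a symmetric distance matrix.

    Returns a nested tuple (left, right, height), with leaves as plain names.
    """
    # Active clusters: id -> (subtree, number of leaves in the subtree).
    clusters = {i: (name, 1) for i, name in enumerate(names)}
    # Pairwise distances between active clusters, keyed by (smaller id, larger id).
    d = {
        (i, j): dist[i][j]
        for i in range(len(names))
        for j in range(i + 1, len(names))
    }
    next_id = len(names)
    while len(clusters) > 1:
        # Pick the closest pair of clusters.
        (a, b), d_ab = min(d.items(), key=lambda item: item[1])
        (tree_a, size_a), (tree_b, size_b) = clusters.pop(a), clusters.pop(b)
        # Distance from the merged cluster to each remaining cluster is the
        # size-weighted average, i.e. the mean over all leaf pairs.
        merged = {}
        for c in clusters:
            d_ac = d[(min(a, c), max(a, c))]
            d_bc = d[(min(b, c), max(b, c))]
            merged[c] = (size_a * d_ac + size_b * d_bc) / (size_a + size_b)
        # Drop every distance that involved a or b, then add the new ones.
        d = {pair: v for pair, v in d.items() if a not in pair and b not in pair}
        for c, value in merged.items():
            d[(c, next_id)] = value  # c is always smaller than next_id
        clusters[next_id] = ((tree_a, tree_b, d_ab / 2), size_a + size_b)
        next_id += 1
    return next(iter(clusters.values()))[0]


# Hypothetical distances: A and B are close, C is distant from both.
taxa = ["A", "B", "C"]
matrix = [
    [0, 2, 6],
    [2, 0, 6],
    [6, 6, 0],
]
tree = upgma(taxa, matrix)
```

The repeated search for the minimum entry and the update of the distance table are the steps whose cost grows quickly with the number of sequences, which is why parallel implementations target them.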
MGUPGMA: A Fast UPGMA Algorithm With Multiple Graphics Processing Units Using NCCL.

PubMed

Hua, Guan-Jie; Hung, Che-Lun; Lin, Chun-Yuan; Wu, Fu-Che; Chan, Yu-Wei; Tang, Chuan Yi

2017-01-01

A phylogenetic tree is a visual diagram of the relationships among a set of biological species. Scientists commonly use it to analyze many characteristics of the species. Distance-matrix methods, such as Unweighted Pair Group Method with Arithmetic Mean and Neighbor Joining, construct a phylogenetic tree by calculating pairwise genetic distances between taxa. These methods suffer from poor computational performance. Although several new methods with high-performance hardware and frameworks have been proposed, the issue still exists. In this work, a novel parallel Unweighted Pair Group Method with Arithmetic Mean approach on multiple Graphics Processing Units is proposed to construct a phylogenetic tree from an extremely large set of sequences. The experimental results show that the proposed approach on a DGX-1 server with 8 NVIDIA P100 graphics cards achieves approximately 3-fold to 7-fold speedup over the implementation of Unweighted Pair Group Method with Arithmetic Mean on a modern CPU and a single GPU, respectively.
Coalescent methods for estimating phylogenetic trees.

PubMed

Liu, Liang; Yu, Lili; Kubatko, Laura; Pearl, Dennis K; Edwards, Scott V

2009-10-01

We review recent models to estimate phylogenetic trees under the multispecies coalescent. Although the distinction between gene trees and species trees has come to the fore of phylogenetics, only recently have methods been developed that explicitly estimate species trees. Of the several factors that can cause gene tree heterogeneity and discordance with the species tree, deep coalescence due to random genetic drift in branches of the species tree has been modeled most thoroughly. Bayesian approaches to estimating species trees utilize two likelihood functions, one of which has been widely used in traditional phylogenetics and involves the model of nucleotide substitution, and the second of which is less familiar to phylogeneticists and involves the probability distribution of gene trees given a species tree. Other recent parametric and nonparametric methods for estimating species trees involve parsimony criteria, summary statistics, supertree and consensus methods. Species tree approaches are an appropriate goal for systematics, appear to work well in some cases where concatenation can be misleading, and suggest that sampling many independent loci will be paramount. Such methods can also be challenging to implement because of the complexity of the models and computational time. In addition, further elaboration of the simplest of coalescent models will be required to incorporate commonly known issues such as deviation from the molecular clock, gene flow and other genetic forces.
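The "probability distribution of gene trees given a species tree" has a simple closed form in the smallest nontrivial case, which may help make the idea concrete. The sketch below uses the standard three-taxon result under the multispecies coalescent: with one sampled lineage per species and an internal branch of length t in coalescent units, the gene tree matches the species tree with probability 1 - (2/3)e^(-t), and each of the two discordant topologies has probability (1/3)e^(-t). The function name and dictionary keys are hypothetical and this is not code from the reviewed methods.

```python
import math


def three_taxon_gene_tree_probs(t):
    """Gene tree topology probabilities for a three-species tree.

    t is the internal branch length in coalescent units. Assumes one sampled
    lineage per species and no gene flow, as in the basic multispecies
    coalescent.
    """
    if t < 0:
        raise ValueError("branch length must be non-negative")
    # Probability that the two sister lineages fail to coalesce in the
    # internal branch is exp(-t); after that, the three possible pairings
    # in the ancestral population are equally likely.
    each_discordant = math.exp(-t) / 3.0
    return {
        "concordant": 1.0 - 2.0 * each_discordant,
        "each_discordant": each_discordant,
    }


short_branch = three_taxon_gene_tree_probs(0.1)
long_branch = three_taxon_gene_tree_probs(5.0)
```

Short internal branches leave the concordant topology only slightly more probable than either discordant one, which is the deep-coalescence situation in which the abstract notes that concatenation can be misleading.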
Trends in the sand: Directional evolution in the shell shape of recessing scallops (Bivalvia: Pectinidae).

PubMed

Sherratt, Emma; Alejandrino, Alvin; Kraemer, Andrew C; Serb, Jeanne M; Adams, Dean C

2016-09-01

Directional evolution is one of the most compelling evolutionary patterns observed in macroevolution. Yet, despite its importance, detecting such trends in multivariate data remains a challenge. In this study, we evaluate multivariate evolution of shell shape in 93 bivalved scallop species, combining geometric morphometrics and phylogenetic comparative methods. Phylomorphospace visualization described the history of morphological diversification in the group; revealing that taxa with a recessing life habit were the most distinctive in shell shape, and appeared to display a directional trend. To evaluate this hypothesis empirically, we extended existing methods by characterizing the mean directional evolution in phylomorphospace for recessing scallops. We then compared this pattern to what was expected under several alternative evolutionary scenarios using phylogenetic simulations. The observed pattern did not fall within the distribution obtained under multivariate Brownian motion, enabling us to reject this evolutionary scenario. By contrast, the observed pattern was more similar to, and fell within, the distribution obtained from simulations using Brownian motion combined with a directional trend. Thus, the observed data are consistent with a pattern of directional evolution for this lineage of recessing scallops. We discuss this putative directional evolutionary trend in terms of its potential adaptive role in exploiting novel habitats. © 2016 The Author(s). Evolution © 2016 The Society for the Study of Evolution.
MEGA5: Molecular Evolutionary Genetics Analysis Using Maximum Likelihood, Evolutionary Distance, and Maximum Parsimony Methods

PubMed Central

Tamura, Koichiro; Peterson, Daniel; Peterson, Nicholas; Stecher, Glen; Nei, Masatoshi; Kumar, Sudhir

2011-01-01

Comparative analysis of molecular sequence data is essential for reconstructing the evolutionary histories of species and inferring the nature and extent of selective forces shaping the evolution of genes and species. Here, we announce the release of Molecular Evolutionary Genetics Analysis version 5 (MEGA5), which is a user-friendly software for mining online databases, building sequence alignments and phylogenetic trees, and using methods of evolutionary bioinformatics in basic biology, biomedicine, and evolution. The newest addition in MEGA5 is a collection of maximum likelihood (ML) analyses for inferring evolutionary trees, selecting best-fit substitution models (nucleotide or amino acid), inferring ancestral states and sequences (along with probabilities), and estimating evolutionary rates site-by-site. In computer simulation analyses, ML tree inference algorithms in MEGA5 compared favorably with other software packages in terms of computational efficiency and the accuracy of the estimates of phylogenetic trees, substitution parameters, and rate variation among sites. The MEGA user interface has now been enhanced to be activity driven to make it easier for the use of both beginners and experienced scientists. This version of MEGA is intended for the Windows platform, and it has been configured for effective use on Mac OS X and Linux desktops. It is available free of charge from http://www.megasoftware.net. PMID:21546353
Studying Biological Responses to Global Change in Atmospheric Oxygen

PubMed Central

Powell, Frank L.

2010-01-01

A popular book recently hypothesized that change in atmospheric oxygen over geological time is the most important physical factor in the evolution of many fundamental characteristics of modern terrestrial animals. This hypothesis is generated primarily using fossil data but the present paper considers how modern experimental biology can be used to test it. Comparative physiology and experimental evolution clearly show that changes in atmospheric O2 over the ages had the potential to drive evolution, assuming the physiological O2-sensitivity of animals today is similar to the past. Established methods, such as phylogenetically independent contrasts, as well as new approaches, such as adding environmental history to phylogenetic analyses or modeling interactions between environmental stresses and biological responses with different rate constants, may be useful for testing (disproving) hypotheses about biological adaptations to changes in atmospheric O2. PMID:20385257
Ribosomal RNA: a key to phylogeny

NASA Technical Reports Server (NTRS)

Olsen, G. J.; Woese, C. R.

1993-01-01

As molecular phylogeny increasingly shapes our understanding of organismal relationships, no molecule has been applied to more questions than have ribosomal RNAs. We review this role of the rRNAs and some of the insights that have been gained from them. We also offer some of the practical considerations in extracting the phylogenetic information from the sequences. Finally, we stress the importance of comparing results from multiple molecules, both as a method for testing the overall reliability of the organismal phylogeny and as a method for more broadly exploring the history of the genome.
Curious parallels and curious connections--phylogenetic thinking in biology and historical linguistics.

PubMed

Atkinson, Quentin D; Gray, Russell D

2005-08-01

In The Descent of Man (1871), Darwin observed "curious parallels" between the processes of biological and linguistic evolution. These parallels mean that evolutionary biologists and historical linguists seek answers to similar questions and face similar problems. As a result, the theory and methodology of the two disciplines have evolved in remarkably similar ways. In addition to Darwin's curious parallels of process, there are a number of equally curious parallels and connections between the development of methods in biology and historical linguistics. Here we briefly review the parallels between biological and linguistic evolution and contrast the historical development of phylogenetic methods in the two disciplines. We then look at a number of recent studies that have applied phylogenetic methods to language data and outline some current problems shared by the two fields.
AST: an automated sequence-sampling method for improving the taxonomic diversity of gene phylogenetic trees.

PubMed

Zhou, Chan; Mao, Fenglou; Yin, Yanbin; Huang, Jinling; Gogarten, Johann Peter; Xu, Ying

2014-01-01

A challenge in phylogenetic inference of gene trees is how to properly sample a large pool of homologous sequences to derive a good representative subset of sequences. Such a need arises in various applications, e.g. when (1) accuracy-oriented phylogenetic reconstruction methods may not be able to deal with a large pool of sequences due to their high demand in computing resources; (2) applications analyzing a collection of gene trees may prefer to use trees with fewer operational taxonomic units (OTUs), for instance for the detection of horizontal gene transfer events by identifying phylogenetic conflicts; and (3) the pool of available sequences is biased towards extensively studied species. In the past, the creation of subsamples often relied on manual selection. Here we present an Automated sequence-Sampling method for improving the Taxonomic diversity of gene phylogenetic trees, AST, to obtain representative sequences that maximize the taxonomic diversity of the sampled sequences. To demonstrate the effectiveness of AST, we have tested it to solve four problems, namely, inference of the evolutionary histories of the small ribosomal subunit protein S5 of E. coli, 16S ribosomal RNAs and glycosyl-transferase gene family 8, and a study of ancient horizontal gene transfers from bacteria to plants. Our results show that the resolution of our computational results is almost as good as that of manual inference by domain experts, hence making the tool generally useful to phylogenetic studies by non-phylogeny specialists. The program is available at http://csbl.bmb.uga.edu/~zhouchan/AST.php.
AST: An Automated Sequence-Sampling Method for Improving the Taxonomic Diversity of Gene Phylogenetic Trees

PubMed Central

Zhou, Chan; Mao, Fenglou; Yin, Yanbin; Huang, Jinling; Gogarten, Johann Peter; Xu, Ying

2014-01-01

A challenge in phylogenetic inference of gene trees is how to properly sample a large pool of homologous sequences to derive a good representative subset of sequences. Such a need arises in various applications, e.g. when (1) accuracy-oriented phylogenetic reconstruction methods may not be able to deal with a large pool of sequences due to their high demand in computing resources; (2) applications analyzing a collection of gene trees may prefer to use trees with fewer operational taxonomic units (OTUs), for instance for the detection of horizontal gene transfer events by identifying phylogenetic conflicts; and (3) the pool of available sequences is biased towards extensively studied species. In the past, the creation of subsamples often relied on manual selection. Here we present an Automated sequence-Sampling method for improving the Taxonomic diversity of gene phylogenetic trees, AST, to obtain representative sequences that maximize the taxonomic diversity of the sampled sequences. To demonstrate the effectiveness of AST, we have tested it to solve four problems, namely, inference of the evolutionary histories of the small ribosomal subunit protein S5 of E. coli, 16S ribosomal RNAs and glycosyl-transferase gene family 8, and a study of ancient horizontal gene transfers from bacteria to plants. Our results show that the resolution of our computational results is almost as good as that of manual inference by domain experts, hence making the tool generally useful to phylogenetic studies by non-phylogeny specialists. The program is available at http://csbl.bmb.uga.edu/~zhouchan/AST.php. PMID:24892935

Bayesian nonparametric clustering in phylogenetics: modeling antigenic evolution in influenza.

PubMed

Cybis, Gabriela B; Sinsheimer, Janet S; Bedford, Trevor; Rambaut, Andrew; Lemey, Philippe; Suchard, Marc A

2018-01-30

Influenza is responsible for up to 500,000 deaths every year, and antigenic variability represents much of its epidemiological burden. To visualize antigenic differences across many viral strains, antigenic cartography methods use multidimensional scaling on binding assay data to map influenza antigenicity onto a low-dimensional space. Analysis of such assay data ideally leads to natural clustering of influenza strains of similar antigenicity that correlate with sequence evolution. To understand the dynamics of these antigenic groups, we present a framework that jointly models genetic and antigenic evolution by combining multidimensional scaling of binding assay data, Bayesian phylogenetic machinery and nonparametric clustering methods. We propose a phylogenetic Chinese restaurant process that extends the current process to incorporate the phylogenetic dependency structure between strains in the modeling of antigenic clusters. With this method, we are able to use the genetic information to better understand the evolution of antigenicity throughout epidemics, as shown in applications of this model to H1N1 influenza. Copyright © 2017 John Wiley & Sons, Ltd.
Posterior Predictive Bayesian Phylogenetic Model Selection

PubMed Central

Lewis, Paul O.; Xie, Wangang; Chen, Ming-Hui; Fan, Yu; Kuo, Lynn

2014-01-01

We present two distinctly different posterior predictive approaches to Bayesian phylogenetic model selection and illustrate these methods using examples from green algal protein-coding cpDNA sequences and flowering plant rDNA sequences. The Gelfand–Ghosh (GG) approach allows dissection of an overall measure of model fit into components due to posterior predictive variance (GGp) and goodness-of-fit (GGg), which distinguishes this method from the posterior predictive P-value approach. The conditional predictive ordinate (CPO) method provides a site-specific measure of model fit useful for exploratory analyses and can be combined over sites yielding the log pseudomarginal likelihood (LPML) which is useful as an overall measure of model fit. CPO provides a useful cross-validation approach that is computationally efficient, requiring only a sample from the posterior distribution (no additional simulation is required). Both GG and CPO add new perspectives to Bayesian phylogenetic model selection based on the predictive abilities of models and complement the perspective provided by the marginal likelihood (including Bayes Factor comparisons) based solely on the fit of competing models to observed data. [Bayesian; conditional predictive ordinate; CPO; L-measure; LPML; model selection; phylogenetics; posterior predictive.] PMID:24193892
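As a rough illustration of the CPO and LPML quantities named in the abstract above, the sketch below uses the standard harmonic-mean estimator: the CPO of a site is the harmonic mean of that site's likelihood over posterior samples, and LPML is the sum of log CPO over sites. The function names and the example numbers are hypothetical, and this is not the authors' implementation.

```python
import math

def log_mean_exp(values):
    """Numerically stable log of the mean of exp(values)."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))

def log_cpo(site_loglik_samples):
    """log CPO for one site from its log-likelihoods over posterior samples.

    CPO is the harmonic mean of the site likelihoods, so
    log CPO = -log(mean(exp(-loglik))).
    """
    return -log_mean_exp([-ll for ll in site_loglik_samples])

def lpml(loglik_matrix):
    """Sum of log CPO over sites; rows are posterior samples, columns are sites."""
    n_sites = len(loglik_matrix[0])
    return sum(log_cpo([row[i] for row in loglik_matrix]) for i in range(n_sites))

# Hypothetical site log-likelihoods: 3 posterior samples x 2 sites.
example = [[-1.0, -2.0], [-1.2, -2.5], [-0.9, -1.8]]
print(lpml(example))
```

Only the per-site log-likelihoods already evaluated on the posterior sample are needed, which matches the abstract's remark that no additional simulation is required.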
A novel approach to identifying regulatory motifs in distantly related genomes

PubMed Central

Van Hellemont, Ruth; Monsieurs, Pieter; Thijs, Gert; De Moor, Bart; Van de Peer, Yves; Marchal, Kathleen

2005-01-01

Although proven successful in the identification of regulatory motifs, phylogenetic footprinting methods still show some shortcomings. To assess these difficulties, most apparent when applying phylogenetic footprinting to distantly related organisms, we developed a two-step procedure that combines the advantages of sequence alignment and motif detection approaches. The results on well-studied benchmark datasets indicate that the presented method outperforms other methods when the sequences become either too long or too heterogeneous in size. PMID:16420672
The effect of miniaturized body size on skeletal morphology in frogs.

PubMed

Yeh, Jennifer

2002-03-01

Miniaturization has evolved numerous times and reached impressive extremes in the Anura. I compared the skeletons of miniature frog species to those of closely related larger species to assess patterns of morphological change, sampling 129 species from 12 families. Two types of morphological data were examined: (1) qualitative data on bone presence and absence; and (2) thin-plate spline morphometric descriptions of skull structure and bone shape. Phylogenetic comparative methods were used to address the shared history of species. Miniature anurans were more likely to lose skull bones and phalangeal elements of the limbs. Their skulls also showed consistent differences compared to those of their larger relatives, including relatively larger braincases and sensory capsules, verticalization of lateral elements, rostral displacement of the jaw joint, and reduction of some skull elements. These features are explained by functional constraints and by paedomorphosis. Variation among lineages in the morphological response to miniaturization was also explored. Certain lineages appear to be unusually resistant to the morphological trends that characterize miniature frogs as a whole. This study represents the first large-scale examination of morphology and miniaturization across a major, diverse group of organisms conducted in a phylogenetic framework and with statistical rigor.
Phylogenetic effective sample size.

PubMed

Bartoszek, Krzysztof

2016-10-21

In this paper I address the question: how large is a phylogenetic sample? I propose a definition of a phylogenetic effective sample size for Brownian motion and Ornstein-Uhlenbeck processes: the regression effective sample size. I discuss how mutual information can be used to define an effective sample size in the non-normal process case and compare these two definitions to an already present concept of effective sample size (the mean effective sample size). Through a simulation study I find that the AICc is robust if one corrects for the number of species or effective number of species. Lastly I discuss how the concept of the phylogenetic effective sample size can be useful for biodiversity quantification, identification of interesting clades and deciding on the importance of phylogenetic correlations. Copyright © 2016 Elsevier Ltd. All rights reserved.
Assessing the influence of biogeographical region and phylogenetic history on chemical defences and herbivory in Quercus species.

PubMed

Moreira, Xoaquín; Abdala-Roberts, Luis; Galmán, Andrea; Francisco, Marta; Fuente, María de la; Butrón, Ana; Rasmann, Sergio

2018-06-07

Biogeographical factors and phylogenetic history are key determinants of inter-specific variation in plant defences. However, few studies have conducted broad-scale geographical comparisons of plant defences while controlling for phylogenetic relationships, and, in doing so, none have separated constitutive from induced defences. This gap has limited our understanding of how historical or large-scale processes mediate biogeographical patterns in plant defences since these may be contingent upon shared evolutionary history and phylogenetic constraints. We conducted a phylogenetically-controlled experiment testing for differences in constitutive leaf chemical defences and their inducibility between Palearctic and Nearctic oak species (Quercus, total 18 species). We induced defences in one-year old plants by inflicting damage by gypsy moth larvae (Lymantria dispar), estimated the amount of leaf area consumed, and quantified various groups of phenolic compounds. There was no detectable phylogenetic signal for constitutive or induced levels of most defensive traits except for constitutive condensed tannins, as well as no phylogenetic signal in leaf herbivory. We did, however, find marked differences in defence levels between oak species from each region: Palearctic species had higher levels of constitutive condensed tannins, but less constitutive lignins and less constitutive and induced hydrolysable tannins compared with Nearctic species. Additionally, Palearctic species had lower levels of leaf damage compared with Nearctic species. These differences in leaf damage, lignins and hydrolysable (but not condensed) tannins were lost after accounting for phylogeny, suggesting that geographical structuring of phylogenetic relationships mediated biogeographical differences in defences and herbivore resistance. 
Together, these findings suggest that historical processes and large-scale drivers have shaped differences in allocation to constitutive defences (and in turn resistance) between Palearctic and Nearctic oaks. Moreover, although evidence of phylogenetic conservatism in the studied traits is rather weak, shared evolutionary history appears to mediate some of these biogeographical patterns in allocation to chemical defences. Copyright © 2018 Elsevier Ltd. All rights reserved.
Phylogeny of the Genus Drosophila

PubMed Central

O’Grady, Patrick M.; DeSalle, Rob

2018-01-01

Understanding phylogenetic relationships among taxa is key to designing and implementing comparative analyses. The genus Drosophila, which contains over 1600 species, is one of the most important model systems in the biological sciences. For over a century, one species in this group, Drosophila melanogaster, has been key to studies of animal development and genetics, genome organization and evolution, and human disease. As whole-genome sequencing becomes more cost-effective, there is increasing interest in other members of this morphologically, ecologically, and behaviorally diverse genus. Phylogenetic relationships within Drosophila are complicated, and the goal of this paper is to provide a review of the recent taxonomic changes and phylogenetic relationships in this genus to aid in further comparative studies. PMID:29716983
On the quirks of maximum parsimony and likelihood on phylogenetic networks.

PubMed

Bryant, Christopher; Fischer, Mareike; Linz, Simone; Semple, Charles

2017-03-21

Maximum parsimony is one of the most frequently-discussed tree reconstruction methods in phylogenetic estimation. However, in recent years it has become more and more apparent that phylogenetic trees are often not sufficient to describe evolution accurately. For instance, processes like hybridization or lateral gene transfer that are commonplace in many groups of organisms and result in mosaic patterns of relationships cannot be represented by a single phylogenetic tree. This is why phylogenetic networks, which can display such events, are becoming of more and more interest in phylogenetic research. It is therefore necessary to extend concepts like maximum parsimony from phylogenetic trees to networks. Several suggestions for possible extensions can be found in recent literature, for instance the softwired and the hardwired parsimony concepts. In this paper, we analyze the so-called big parsimony problem under these two concepts, i.e. we investigate maximum parsimonious networks and analyze their properties. In particular, we show that finding a softwired maximum parsimony network is possible in polynomial time. We also show that the set of maximum parsimony networks for the hardwired definition always contains at least one phylogenetic tree. Lastly, we investigate some parallels of parsimony to different likelihood concepts on phylogenetic networks. Copyright © 2017 Elsevier Ltd. All rights reserved.
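For readers unfamiliar with the quantity being extended, here is a minimal sketch of the parsimony score on an ordinary rooted binary tree, computed with Fitch's algorithm for a single character. It covers only the tree case; the softwired and hardwired network definitions discussed in the abstract are not implemented. The tree encoding (nested tuples) and the example data are assumptions made for illustration.

```python
def fitch_score(tree, states):
    """Return the minimum number of state changes for one character.

    tree: nested 2-tuples for internal nodes, leaf names (str) at the tips.
    states: dict mapping each leaf name to its observed character state.
    """
    def visit(node):
        if isinstance(node, str):
            return {states[node]}, 0
        left_set, left_cost = visit(node[0])
        right_set, right_cost = visit(node[1])
        common = left_set & right_set
        if common:
            # The children can agree on a state, so no extra change is needed.
            return common, left_cost + right_cost
        # The children disagree, so one change is charged at this node.
        return left_set | right_set, left_cost + right_cost + 1
    return visit(tree)[1]

# Hypothetical four-taxon tree and two characters.
tree = (("A", "B"), ("C", "D"))
print(fitch_score(tree, {"A": "G", "B": "G", "C": "T", "D": "T"}))  # 1
print(fitch_score(tree, {"A": "G", "B": "T", "C": "G", "D": "T"}))  # 2
```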
Comparative evolutionary diversity and phylogenetic structure across multiple forest dynamics plots: a mega-phylogeny approach

PubMed Central

Erickson, David L.; Jones, Frank A.; Swenson, Nathan G.; Pei, Nancai; Bourg, Norman A.; Chen, Wenna; Davies, Stuart J.; Ge, Xue-jun; Hao, Zhanqing; Howe, Robert W.; Huang, Chun-Lin; Larson, Andrew J.; Lum, Shawn K. Y.; Lutz, James A.; Ma, Keping; Meegaskumbura, Madhava; Mi, Xiangcheng; Parker, John D.; Fang-Sun, I.; Wright, S. Joseph; Wolf, Amy T.; Ye, W.; Xing, Dingliang; Zimmerman, Jess K.; Kress, W. John

2014-01-01

Forest dynamics plots, which now span longitudes, latitudes, and habitat types across the globe, offer unparalleled insights into the ecological and evolutionary processes that determine how species are assembled into communities. Understanding phylogenetic relationships among species in a community has become an important component of assessing assembly processes. However, the application of evolutionary information to questions in community ecology has been limited in large part by the lack of accurate estimates of phylogenetic relationships among individual species found within communities, and is particularly limiting in comparisons between communities. Therefore, streamlining and maximizing the information content of these community phylogenies is a priority. To test the viability and advantage of a multi-community phylogeny, we constructed a multi-plot mega-phylogeny of 1347 species of trees across 15 forest dynamics plots in the ForestGEO network using DNA barcode sequence data (rbcL, matK, and psbA-trnH) and compared community phylogenies for each individual plot with respect to support for topology and branch lengths, which affect evolutionary inference of community processes. The levels of taxonomic differentiation across the phylogeny were examined by quantifying the frequency of resolved nodes throughout. In addition, three phylogenetic distance (PD) metrics that are commonly used to infer assembly processes were estimated for each plot [PD, Mean Phylogenetic Distance (MPD), and Mean Nearest Taxon Distance (MNTD)]. Lastly, we examine the partitioning of phylogenetic diversity among community plots through quantification of inter-community MPD and MNTD. Overall, evolutionary relationships were highly resolved across the DNA barcode-based mega-phylogeny, and phylogenetic resolution for each community plot was improved when estimated within the context of the mega-phylogeny. 
Likewise, when compared with phylogenies for individual plots, estimates of phylogenetic diversity in the mega-phylogeny were more consistent, thereby removing a potential source of bias at the plot-level, and demonstrating the value of assessing phylogenetic relationships simultaneously within a mega-phylogeny. An unexpected result of the comparisons among plots based on the mega-phylogeny was that the communities in the ForestGEO plots in general appear to be assemblages of more closely related species than expected by chance, and that differentiation among communities is very low, suggesting deep floristic connections among communities and new avenues for future analyses in community ecology. PMID:25414723
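Two of the metrics named in the abstract above, MPD and MNTD, can be sketched directly from a matrix of pairwise phylogenetic distances. The sketch below uses the common unweighted definitions (no abundance weighting, no null-model standardization); the distance values and species labels are hypothetical, and the study's own pipeline is not reproduced here.

```python
from itertools import combinations

def mpd(dist, community):
    """Mean pairwise phylogenetic distance among species in the community."""
    pairs = list(combinations(community, 2))
    return sum(dist[frozenset(p)] for p in pairs) / len(pairs)

def mntd(dist, community):
    """Mean distance from each species to its nearest relative in the community."""
    nearest = [
        min(dist[frozenset((a, b))] for b in community if b != a)
        for a in community
    ]
    return sum(nearest) / len(nearest)

# Hypothetical patristic distances among four species.
dist = {
    frozenset(("a", "b")): 2.0,
    frozenset(("a", "c")): 6.0,
    frozenset(("a", "d")): 6.0,
    frozenset(("b", "c")): 6.0,
    frozenset(("b", "d")): 6.0,
    frozenset(("c", "d")): 4.0,
}
plot = ["a", "b", "c"]
print(mpd(dist, plot))   # (2 + 6 + 6) / 3
print(mntd(dist, plot))  # (2 + 2 + 6) / 3
```

Because both metrics depend only on the distances, the same functions could be applied to distances taken from a single-plot tree or from a larger shared tree, which is the comparison the abstract describes.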
Bayesian phylogenetic analysis supports an agricultural origin of Japonic languages

PubMed Central

Lee, Sean; Hasegawa, Toshikazu

2011-01-01

Languages, like genes, evolve by a process of descent with modification. This striking similarity between biological and linguistic evolution allows us to apply phylogenetic methods to explore how languages, as well as the people who speak them, are related to one another through evolutionary history. Language phylogenies constructed with lexical data have so far revealed population expansions of Austronesian, Indo-European and Bantu speakers. However, how robustly a phylogenetic approach can chart the history of language evolution and what language phylogenies reveal about human prehistory must be investigated more thoroughly on a global scale. Here we report a phylogeny of 59 Japonic languages and dialects. We used this phylogeny to estimate time depth of its root and compared it with the time suggested by an agricultural expansion scenario for Japanese origin. In agreement with the scenario, our results indicate that Japonic languages descended from a common ancestor approximately 2182 years ago. Together with archaeological and biological evidence, our results suggest that the first farmers of Japan had a profound impact on the origins of both people and languages. On a broader level, our results are consistent with a theory that agricultural expansion is the principal factor for shaping global linguistic diversity. PMID:21543358
Mitochondrial genome of Pteronotus personatus (Chiroptera: Mormoopidae): comparison with selected bats and phylogenetic considerations.

PubMed

López-Wilchis, Ricardo; Del Río-Portilla, Miguel Ángel; Guevara-Chumacero, Luis Manuel

2017-02-01

We described the complete mitochondrial genome (mitogenome) of the Wagner's mustached bat, Pteronotus personatus, a species belonging to the family Mormoopidae, and compared it with other published mitogenomes of bats (Chiroptera). The mitogenome of P. personatus was 16,570 bp long and contained a typically conserved structure including 13 protein-coding genes, 22 transfer RNA genes, two ribosomal RNA genes, and one control region (D-loop). Most of the genes were encoded on the H-strand, except for eight tRNA and the ND6 genes. The order of protein-coding and rRNA genes was highly conserved in all mitogenomes. All protein-coding genes started with an ATG codon, except for ND2, ND3, and ND5, which initiated with ATA, and terminated with the typical stop codon TAA/TAG or the codon AGA. Phylogenetic trees constructed using Maximum Parsimony, Maximum Likelihood, and Bayesian inference methods showed an identical topology and indicated the monophyly of different families of bats (Mormoopidae, Phyllostomidae, Vespertilionidae, Rhinolophidae, and Pteropodidae) and the existence of two major clades corresponding to the suborders Yangochiroptera and Yinpterochiroptera. The mitogenome sequence provided here will be useful for further phylogenetic analyses and population genetic studies in mormoopid bats.
Genetic Identification of Orientobilharzia turkestanicum from Sheep Isolates in Iran.

PubMed

Tabaripour, Reza; Youssefi, Mohammad Reza; Tabaripour, Rabeeh

2015-01-01

Adult worms of Orientobilharzia turkestanicum live in the portal or intestinal veins of cattle, sheep, goats and many other mammals, causing orientobilharziasis. Orientobilharziasis causes significant economic losses to the livestock industry of Iran. However, there is limited information about genotypes of O. turkestanicum in Iran. In this study, 30 isolates of O. turkestanicum obtained from sheep were characterized by sequencing the mitochondrial cytochrome c oxidase subunit 1 (cox1) and nicotinamide adenine dinucleotide dehydrogenase subunit 1 (nad1) genes. The mitochondrial cox1 and nad1 DNA were amplified by polymerase chain reaction (PCR), sequenced, and compared with sequences of O. turkestanicum and other members of the Schistosomatidae available in GenBank. Phylogenetic relationships among them were reconstructed using the maximum parsimony method. The phylogenetic analyses in the present study placed O. turkestanicum within the genus Schistosoma and indicated that O. turkestanicum was phylogenetically closer to the African schistosome group than to the Asian schistosome group. Comparison of the nad1 and cox1 sequences of O. turkestanicum obtained in this study with the corresponding sequences available in GenBank revealed some sequence variation and provided evidence for the presence of microvariants in Iran.
A new monster from southwest Oregon forests: Cryptomaster behemoth sp. n. (Opiliones, Laniatores, Travunioidea)

PubMed Central

Starrett, James; Derkarabetian, Shahan; Richart, Casey H.; Cabrero, Allan; Hedin, Marshal

2016-01-01

The monotypic genus Cryptomaster Briggs, 1969 was described based on individuals from a single locality in southwestern Oregon. The described species Cryptomaster leviathan Briggs, 1969 was named for its large body size compared to most travunioid Laniatores. However, as the generic name suggests, Cryptomaster are notoriously difficult to find, and few subsequent collections have been recorded for this genus. Here, we increase sampling of Cryptomaster to 15 localities, extending their known range from the Coast Range northeast to the western Cascade Mountains of southern Oregon. Phylogenetic analyses of mitochondrial and nuclear DNA sequence data reveal deep phylogenetic breaks consistent with independently evolving lineages. We use discovery and validation species delimitation approaches to generate and test species hypotheses, including a coalescent species delimitation method to test multi-species hypotheses. For delimited species, we use light microscopy and SEM to discover diagnostic morphological characters. Although Cryptomaster has a small geographic distribution, this taxon is consistent with other short-range endemics in having deep phylogenetic breaks indicative of species level divergences. Herein we describe Cryptomaster behemoth sp. n., and provide morphological diagnostic characters for identifying Cryptomaster leviathan and Cryptomaster behemoth. PMID:26877685
Social Mating System and Sex-Biased Dispersal in Mammals and Birds: A Phylogenetic Analysis

PubMed Central

Mabry, Karen E.; Shelley, Erin L.; Davis, Katie E.; Blumstein, Daniel T.; Van Vuren, Dirk H.

2013-01-01

The hypothesis that patterns of sex-biased dispersal are related to social mating system in mammals and birds has gained widespread acceptance over the past 30 years. However, two major complications have obscured the relationship between these two behaviors: 1) dispersal frequency and dispersal distance, which measure different aspects of the dispersal process, have often been confounded, and 2) the relationship between mating system and sex-biased dispersal in these vertebrate groups has not been examined using modern phylogenetic comparative methods. Here, we present a phylogenetic analysis of the relationship between mating system and sex-biased dispersal in mammals and birds. Results indicate that the evolution of female-biased dispersal in mammals may be more likely on monogamous branches of the phylogeny, and that females may disperse farther than males in socially monogamous mammalian species. However, we found no support for a relationship between social mating system and sex-biased dispersal in birds when the effects of phylogeny are taken into consideration. We caution that although there are larger-scale behavioral differences in mating system and sex-biased dispersal between mammals and birds, mating system and sex-biased dispersal are far from perfectly associated within these taxa. PMID:23483957
Pan-genome and phylogeny of Bacillus cereus sensu lato.

PubMed

Bazinet, Adam L

2017-08-02

Bacillus cereus sensu lato (s. l.) is an ecologically diverse bacterial group of medical and agricultural significance. In this study, I use publicly available genomes and novel bioinformatic workflows to characterize the B. cereus s. l. pan-genome and perform the largest phylogenetic and population genetic analyses of this group to date in terms of the number of genes and taxa included. With these fundamental data in hand, I identify genes associated with particular phenotypic traits (i.e., "pan-GWAS" analysis), and quantify the degree to which taxa sharing common attributes are phylogenetically clustered. A rapid k-mer based approach (Mash) was used to create reduced representations of selected Bacillus genomes, and a fast distance-based phylogenetic analysis of this data (FastME) was performed to determine which species should be included in B. cereus s. l. The complete genomes of eight B. cereus s. l. species were annotated de novo with Prokka, and these annotations were used by Roary to produce the B. cereus s. l. pan-genome. Scoary was used to associate gene presence and absence patterns with various phenotypes. The orthologous protein sequence clusters produced by Roary were filtered and used to build HaMStR databases of gene models that were used in turn to construct phylogenetic data matrices. Phylogenetic analyses used RAxML, DendroPy, ClonalFrameML, PAUP*, and SplitsTree. Bayesian model-based population genetic analysis assigned taxa to clusters using hierBAPS. The genealogical sorting index was used to quantify the phylogenetic clustering of taxa sharing common attributes. The B. cereus s. l. pan-genome currently consists of ≈60,000 genes, ≈600 of which are "core" (common to at least 99% of taxa sampled). Pan-GWAS analysis revealed genes associated with phenotypes such as isolation source, oxygen requirement, and ability to cause diseases such as anthrax or food poisoning. 
Extensive phylogenetic analyses using an unprecedented amount of data produced phylogenies that were largely concordant with each other and with previous studies. Phylogenetic support as measured by bootstrap probabilities increased markedly when all suitable pan-genome data was included in phylogenetic analyses, as opposed to when only core genes were used. Bayesian population genetic analysis recommended subdividing the three major clades of B. cereus s. l. into nine clusters. Taxa sharing common traits and species designations exhibited varying degrees of phylogenetic clustering. All phylogenetic analyses recapitulated two previously used classification systems, and taxa were consistently assigned to the same major clade and group. By including accessory genes from the pan-genome in the phylogenetic analyses, I produced an exceptionally well-supported phylogeny of 114 complete B. cereus s. l. genomes. The best-performing methods were used to produce a phylogeny of all 498 publicly available B. cereus s. l. genomes, which was in turn used to compare three different classification systems and to test the monophyly status of various B. cereus s. l. species. The majority of the methodology used in this study is generic and could be leveraged to produce pan-genome estimates and similarly robust phylogenetic hypotheses for other bacterial groups.
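As background for the k-mer step mentioned in the abstract above, the sketch below computes a Jaccard index between two k-mer sets and converts it with the Mash distance formula, D = -(1/k) ln(2j / (1 + j)). It is a simplification under stated assumptions: exact k-mer sets stand in for the MinHash sketches that Mash actually uses, reverse complements are ignored, and the sequences and k value are hypothetical.

```python
import math

def kmers(seq, k):
    """Set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def jaccard(a, b):
    """Jaccard index of two sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def mash_distance(j, k):
    """Mash distance from a Jaccard index j; 1.0 when no k-mers are shared."""
    if j == 0:
        return 1.0
    return max(0.0, -math.log(2 * j / (1 + j)) / k)

# Hypothetical toy sequences; real genomes would use a much larger k.
k = 4
j = jaccard(kmers("ACGTACGT", k), kmers("ACGTACGA", k))
print(j, mash_distance(j, k))
```

A matrix of such pairwise distances is the kind of input that a distance-based tree builder such as FastME takes.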
Metabolic Pathway Assignment of Plant Genes based on Phylogenetic Profiling–A Feasibility Study

PubMed Central

Weißenborn, Sandra; Walther, Dirk

2017-01-01

Despite many developed experimental and computational approaches, functional gene annotation remains challenging. With the rapidly growing number of sequenced genomes, the concept of phylogenetic profiling, which predicts functional links between genes that share a common co-occurrence pattern across different genomes, has gained renewed attention as it promises to annotate gene functions based on presence/absence calls alone. We applied phylogenetic profiling to the problem of metabolic pathway assignments of plant genes with a particular focus on secondary metabolism pathways. We determined phylogenetic profiles for 40,960 metabolic pathway enzyme genes with assigned EC numbers from 24 plant species based on sequence and pathway annotation data from KEGG and Ensembl Plants. For gene sequence family assignments, needed to determine the presence or absence of particular gene functions in the given plant species, we included data of all 39 species available at the Ensembl Plants database and established gene families based on pairwise sequence identities and annotation information. Aside from performing profiling comparisons, we used machine learning approaches to predict pathway associations from phylogenetic profiles alone. Selected metabolic pathways were indeed found to be composed of gene families of greater than expected phylogenetic profile similarity. This was particularly evident for primary metabolism pathways, whereas for secondary pathways, both the available annotation in different species as well as the abstraction of functional association via distinct pathways proved limiting. While phylogenetic profile similarity was generally not found to correlate with gene co-expression, direct physical interactions of proteins were reflected by a significantly increased profile similarity suggesting an application of phylogenetic profiling methods as a filtering step in the identification of protein-protein interactions. 
This feasibility study highlights the potential and challenges associated with phylogenetic profiling methods for the detection of functional relationships between genes as well as the need to enlarge the set of plant genes with proven secondary metabolism involvement as well as the limitations of distinct pathways as abstractions of relationships between genes. PMID:29163570
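The core idea of phylogenetic profiling described above, comparing presence/absence patterns of gene families across genomes, can be sketched in a few lines. The Jaccard index used here is one common similarity choice and is an assumption, as are the family names and profiles; the abstract does not state which similarity measure the study used.

```python
def jaccard(profile_a, profile_b):
    """Jaccard similarity between two presence/absence profiles (0/1 lists)."""
    both = sum(1 for a, b in zip(profile_a, profile_b) if a and b)
    either = sum(1 for a, b in zip(profile_a, profile_b) if a or b)
    return both / either if either else 0.0

# Hypothetical gene families scored across five genomes (1 = present).
profiles = {
    "famA": [1, 1, 0, 1, 0],
    "famB": [1, 1, 0, 1, 0],
    "famC": [0, 1, 1, 0, 1],
}
print(jaccard(profiles["famA"], profiles["famB"]))  # 1.0
print(jaccard(profiles["famA"], profiles["famC"]))  # 0.2
```

Families with identical co-occurrence patterns, such as famA and famB here, would be the candidates for a functional link under the profiling hypothesis.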
Rooting phylogenetic trees under the coalescent model using site pattern probabilities.

PubMed

Tian, Yuan; Kubatko, Laura

2017-12-19

Phylogenetic tree inference is a fundamental tool to estimate ancestor-descendant relationships among different species. In phylogenetic studies, identification of the root - the most recent common ancestor of all sampled organisms - is essential for complete understanding of the evolutionary relationships. Rooted trees benefit most downstream applications of phylogenies, such as species classification or the study of adaptation. Often, trees can be rooted by using outgroups, which are species that are known to be more distantly related to the sampled organisms than any other species in the phylogeny. However, outgroups are not always available in evolutionary research. In this study, we develop a new method for rooting species trees under the coalescent model, by developing a series of hypothesis tests for rooting quartet phylogenies using site pattern probabilities. The power of this method is examined by simulation studies and by application to an empirical North American rattlesnake data set. The method shows high accuracy across the simulation conditions considered, and performs well for the rattlesnake data. Thus, it provides a computationally efficient way to accurately root species-level phylogenies that incorporates the coalescent process. The method is robust to variation in substitution model, but is sensitive to the assumption of a molecular clock. Our study establishes a computationally practical method for rooting species trees that is more efficient than traditional methods. The method will benefit numerous evolutionary studies that require rooting a phylogenetic tree without having to specify outgroups.
Understanding phylogenetic incongruence: lessons from phyllostomid bats

PubMed Central

Dávalos, Liliana M; Cirranello, Andrea L; Geisler, Jonathan H; Simmons, Nancy B

2012-01-01

All characters and trait systems in an organism share a common evolutionary history that can be estimated using phylogenetic methods. However, differential rates of change and the evolutionary mechanisms driving those rates result in pervasive phylogenetic conflict. These drivers need to be uncovered because mismatches between evolutionary processes and phylogenetic models can lead to high confidence in incorrect hypotheses. Incongruence between phylogenies derived from morphological versus molecular analyses, and between trees based on different subsets of molecular sequences has become pervasive as datasets have expanded rapidly in both characters and species. For more than a decade, evolutionary relationships among members of the New World bat family Phyllostomidae inferred from morphological and molecular data have been in conflict. Here, we develop and apply methods to minimize systematic biases, uncover the biological mechanisms underlying phylogenetic conflict, and outline data requirements for future phylogenomic and morphological data collection. We introduce new morphological data for phyllostomids and outgroups and expand previous molecular analyses to eliminate methodological sources of phylogenetic conflict such as taxonomic sampling, sparse character sampling, or use of different algorithms to estimate the phylogeny. We also evaluate the impact of biological sources of conflict: saturation in morphological changes and molecular substitutions, and other processes that result in incongruent trees, including convergent morphological and molecular evolution. Methodological sources of incongruence play some role in generating phylogenetic conflict, and are relatively easy to eliminate by matching taxa, collecting more characters, and applying the same algorithms to optimize phylogeny. 
The evolutionary patterns uncovered are consistent with multiple biological sources of conflict, including saturation in morphological and molecular changes, adaptive morphological convergence among nectar-feeding lineages, and incongruent gene trees. Applying methods to account for nucleotide sequence saturation reduces, but does not completely eliminate, phylogenetic conflict. We ruled out paralogy, lateral gene transfer, and poor taxon sampling and outgroup choices among the processes leading to incongruent gene trees in phyllostomid bats. Uncovering and countering the possible effects of introgression and lineage sorting of ancestral polymorphism on gene trees will require great leaps in genomic and allelic sequencing in this species-rich mammalian family. We also found evidence for adaptive molecular evolution leading to convergence in mitochondrial proteins among nectar-feeding lineages. In conclusion, the biological processes that generate phylogenetic conflict are ubiquitous, and overcoming incongruence requires better models and more data than have been collected even in well-studied organisms such as phyllostomid bats. PMID:22891620
From symmetry to asymmetry: Phylogenetic patterns of asymmetry variation in animals and their evolutionary significance

PubMed Central

Palmer, A. Richard

1996-01-01

Phylogenetic analyses of asymmetry variation offer a powerful tool for exploring the interplay between ontogeny and evolution because (i) conspicuous asymmetries exist in many higher metazoans with widely varying modes of development, (ii) patterns of bilateral variation within species may identify genetically and environmentally triggered asymmetries, and (iii) asymmetries arising at different times during development may be more sensitive to internal cytoplasmic inhomogeneities compared to external environmental stimuli. Using four broadly comparable asymmetry states (symmetry, antisymmetry, dextral, and sinistral), and two stages at which asymmetry appears developmentally (larval and postlarval), I evaluated relations between ontogenetic and phylogenetic patterns of asymmetry variation. Among 140 inferred phylogenetic transitions between asymmetry states, recorded from 11 classes in five phyla, directional asymmetry (dextral or sinistral) evolved directly from symmetrical ancestors proportionally more frequently among larval asymmetries. In contrast, antisymmetry, either as an end state or as a transitional stage preceding directional asymmetry, was confined primarily to postlarval asymmetries. The ontogenetic origin of asymmetry thus significantly influences its subsequent evolution. Furthermore, because antisymmetry typically signals an environmentally triggered asymmetry, the phylogenetic transition from antisymmetry to directional asymmetry suggests that many cases of laterally fixed asymmetries evolved via genetic assimilation. PMID:8962039
Cophenetic metrics for phylogenetic trees, after Sokal and Rohlf.

PubMed

Cardona, Gabriel; Mir, Arnau; Rosselló, Francesc; Rotger, Lucía; Sánchez, David

2013-01-16

Phylogenetic tree comparison metrics are an important tool in the study of evolution, and hence the definition of such metrics is an interesting problem in phylogenetics. In a paper in Taxon fifty years ago, Sokal and Rohlf proposed to measure quantitatively the difference between a pair of phylogenetic trees by first encoding them by means of their half-matrices of cophenetic values, and then comparing these matrices. This idea has been used several times since then to define dissimilarity measures between phylogenetic trees but, to our knowledge, no proper metric on weighted phylogenetic trees with nested taxa based on this idea has been formally defined and studied yet. Actually, the cophenetic values of pairs of different taxa alone are not enough to single out phylogenetic trees with weighted arcs or nested taxa. For every (rooted) phylogenetic tree T, let its cophenetic vector φ(T) consist of all pairs of cophenetic values between pairs of taxa in T and all depths of taxa in T. It turns out that these cophenetic vectors single out weighted phylogenetic trees with nested taxa. We then define a family of cophenetic metrics d_{φ,p} by comparing these cophenetic vectors by means of L^p norms, and we study, either analytically or numerically, some of their basic properties: neighbors, diameter, distribution, and their rank correlation with each other and with other metrics. The cophenetic metrics can be safely used on weighted phylogenetic trees with nested taxa and no restriction on degrees, and they can be computed in O(n^2) time, where n stands for the number of taxa. The metrics d_{φ,1} and d_{φ,2} have positive skewed distributions, and they show a low rank correlation with the Robinson-Foulds metric and the nodal metrics, and a very high correlation with each other and with the splitted nodal metrics. The diameter of d_{φ,p}, for p ≥ 1, is in O(n^{(p+2)/p}), and thus for low p they are more discriminative, having a wider range of values.
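The definition in this abstract can be illustrated with a minimal sketch. It assumes, as in the usual reading of "cophenetic value", that φ(i, j) is the depth of the lowest common ancestor of taxa i and j, and that φ(i, i) is the depth of taxon i itself. The tree encoding (a child-to-parent dictionary), the function names, and the two three-taxon example trees are hypothetical choices made for illustration, and the naive ancestor walk below is not the O(n^2) algorithm the authors refer to.

```python
from itertools import combinations_with_replacement

# A rooted tree is encoded (hypothetically) as: node -> (parent, arc weight).
# The root is the only node that does not appear as a key.

def ancestors(parent, v):
    """Nodes on the path from v up to the root, v included."""
    chain = [v]
    while chain[-1] in parent:
        chain.append(parent[chain[-1]][0])
    return chain

def depth(parent, v):
    """Sum of arc weights on the path from the root down to v."""
    return sum(parent[u][1] for u in ancestors(parent, v) if u in parent)

def cophenetic_vector(parent, taxa):
    """phi(i, j) = depth of the lowest common ancestor; phi(i, i) = depth of i."""
    vec = []
    for i, j in combinations_with_replacement(sorted(taxa), 2):
        anc_j = set(ancestors(parent, j))
        lca = next(u for u in ancestors(parent, i) if u in anc_j)
        vec.append(depth(parent, lca))
    return vec

def cophenetic_distance(t1, t2, taxa, p=1):
    """L^p norm of the difference between the two cophenetic vectors."""
    v1, v2 = cophenetic_vector(t1, taxa), cophenetic_vector(t2, taxa)
    return sum(abs(a - b) ** p for a, b in zip(v1, v2)) ** (1.0 / p)

# Illustrative trees on taxa a, b, c: ((a,b),c) versus (a,(b,c)).
taxa = ["a", "b", "c"]
t1 = {"x": ("r", 1), "a": ("x", 1), "b": ("x", 1), "c": ("r", 2)}
t2 = {"y": ("r", 1), "b": ("y", 1), "c": ("y", 1), "a": ("r", 2)}

print(cophenetic_vector(t1, taxa))          # [2, 1, 0, 2, 0, 2]
print(cophenetic_vector(t2, taxa))          # [2, 0, 0, 2, 1, 2]
print(cophenetic_distance(t1, t2, taxa, 1)) # 2.0
```

The two vectors differ only in the entries for the pairs (a, b) and (b, c), which is where the two topologies disagree; the taxon depths are identical in both trees.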

The origin and diversification of eukaryotes: problems with molecular phylogenetics and molecular clock estimation

PubMed Central

Roger, Andrew J; Hug, Laura A

2006-01-01

Determining the relationships among and divergence times for the major eukaryotic lineages remains one of the most important and controversial outstanding problems in evolutionary biology. The sequencing and phylogenetic analyses of ribosomal RNA (rRNA) genes led to the first nearly comprehensive phylogenies of eukaryotes in the late 1980s, and supported a view where cellular complexity was acquired during the divergence of extant unicellular eukaryote lineages. More recently, however, refinements in analytical methods coupled with the availability of many additional genes for phylogenetic analysis showed that much of the deep structure of early rRNA trees was artefactual. Recent phylogenetic analyses of multiple genes and the discovery of important molecular and ultrastructural phylogenetic characters have resolved eukaryotic diversity into six major hypothetical groups. Yet relationships among these groups remain poorly understood because of saturation of sequence changes on the billion-year time-scale, possible rapid radiations of major lineages, phylogenetic artefacts and endosymbiotic or lateral gene transfer among eukaryotes. Estimating the divergence dates between the major eukaryote lineages using molecular analyses is even more difficult than phylogenetic estimation. Error in such analyses comes from a myriad of sources including: (i) calibration fossil dates, (ii) the assumed phylogenetic tree, (iii) the nucleotide or amino acid substitution model, (iv) substitution number (branch length) estimates, (v) the model of how rates of evolution change over the tree, (vi) error inherent in the time estimates for a given model and (vii) how multiple gene data are treated. By reanalysing datasets from recently published molecular clock studies, we show that when errors from these various sources are properly accounted for, the confidence intervals on inferred dates can be very large.
Furthermore, estimated dates of divergence vary hugely depending on the methods used and their assumptions. Accurate dating of divergence times among the major eukaryote lineages will require a robust tree of eukaryotes, a much richer Proterozoic fossil record of microbial eukaryotes assignable to extant groups for calibration, more sophisticated relaxed molecular clock methods and many more genes sampled from the full diversity of microbial eukaryotes. PMID:16754613
Comparative genomic and phylogenetic investigation of the xenobiotic metabolizing arylamine N-acetyltransferase enzyme family

USDA-ARS's Scientific Manuscript database

Arylamine N-acetyltransferases (NATs) are xenobiotic metabolizing enzymes characterized in several bacteria and eukaryotic organisms. We report a comprehensive phylogenetic analysis employing an exhaustive dataset of NAT-homologous sequences recovered through inspection of 2445 genomes. We describe ...
Aquatic insect ecophysiological traits reveal phylogenetically based differences in dissolved cadmium susceptibility

USGS Publications Warehouse

Buchwalter, D.B.; Cain, D.J.; Martin, C.A.; Xie, Lingtian; Luoma, S.N.; Garland, T.

2008-01-01

We used a phylogenetically based comparative approach to evaluate the potential for physiological studies to reveal patterns of diversity in traits related to susceptibility to an environmental stressor, the trace metal cadmium (Cd). Physiological traits related to Cd bioaccumulation, compartmentalization, and ultimately susceptibility were measured in 21 aquatic insect species representing the orders Ephemeroptera, Plecoptera, and Trichoptera. We mapped these experimentally derived physiological traits onto a phylogeny and quantified the tendency for related species to be similar (phylogenetic signal). All traits related to Cd bioaccumulation and susceptibility exhibited statistically significant phylogenetic signal, although the signal strength varied among traits. Conventional and phylogenetically based regression models were compared, revealing great variability within orders but consistent, strong differences among insect families. Uptake and elimination rate constants were positively correlated among species, but only when effects of body size and phylogeny were incorporated in the analysis. Together, uptake and elimination rates predicted dramatic Cd bioaccumulation differences among species that agreed with field-based measurements. We discovered a potential tradeoff between the ability to eliminate Cd and the ability to detoxify it across species, particularly mayflies. The best-fit regression models were driven by phylogenetic parameters (especially differences among families) rather than functional traits, suggesting that it may eventually be possible to predict a taxon's physiological performance based on its phylogenetic position, provided adequate physiological information is available for close relatives. There appears to be great potential for evolutionary physiological approaches to augment our understanding of insect responses to environmental stressors in nature. © 2008 by The National Academy of Sciences of the USA.
SNPhylo: a pipeline to construct a phylogenetic tree from huge SNP data.

PubMed

Lee, Tae-Ho; Guo, Hui; Wang, Xiyin; Kim, Changsoo; Paterson, Andrew H

2014-02-26

Phylogenetic trees are widely used for genetic and evolutionary studies in various organisms. Advanced sequencing technology has dramatically enriched data available for constructing phylogenetic trees based on single nucleotide polymorphisms (SNPs). However, massive SNP data makes it difficult to perform reliable analysis, and there has been no ready-to-use pipeline to generate phylogenetic trees from these data. We developed a new pipeline, SNPhylo, to construct phylogenetic trees based on large SNP datasets. The pipeline may enable users to construct a phylogenetic tree from three representative SNP data file formats. In addition, in order to increase reliability of a tree, the pipeline has steps such as removing low quality data and considering linkage disequilibrium. A maximum likelihood method for the inference of phylogeny is also adopted in generation of a tree in our pipeline. Using SNPhylo, users can easily produce a reliable phylogenetic tree from a large SNP data file. Thus, this pipeline can help a researcher focus more on interpretation of the results of analysis of voluminous data sets, rather than manipulations necessary to accomplish the analysis.
Neutrophilic Iron-Oxidizing Zetaproteobacteria and Mild Steel Corrosion in Nearshore Marine Environments

DTIC Science & Technology

2011-02-16

were checked for the presence of heterotrophic bacteria by streaking a sample on ASW-R2A agar plates. DNA extraction and analysis of phylogenetic ...Bellerophon v. 3 (greengenes.lbl.gov) and Pintail (www.bioinformatics-toolkit.org/Web-Pintail/). Phylogenetic trees were constructed for SSU rRNA gene...CLUSTALW (44), and phylogenetic analyses were conducted in MEGA4 (42). The evolutionary history was inferred using the neighbor-joining method (39), and
The origins of modern biodiversity on land

PubMed Central

Benton, Michael J.

2010-01-01

Comparative studies of large phylogenies of living and extinct groups have shown that most biodiversity arises from a small number of highly species-rich clades. To understand biodiversity, it is important to examine the history of these clades on geological time scales. This is part of a distinct ‘phylogenetic expansion’ view of macroevolution, and contrasts with the alternative, non-phylogenetic ‘equilibrium’ approach to the history of biodiversity. The latter viewpoint focuses on density-dependent models in which all life is described by a single global-scale model, and a case is made here that this approach may be less successful at representing the shape of the evolution of life than the phylogenetic expansion approach. The terrestrial fossil record is patchy, but is adequate for coarse-scale studies of groups such as vertebrates that possess fossilizable hard parts. New methods in phylogenetic analysis, morphometrics and the study of exceptional biotas allow new approaches. Models for diversity regulation through time range from the entirely biotic to the entirely physical, with many intermediates. Tetrapod diversity has risen as a result of the expansion of ecospace, rather than niche subdivision or regional-scale endemicity resulting from continental break-up. Tetrapod communities on land have been remarkably stable and have changed only when there was a revolution in floras (such as the demise of the Carboniferous coal forests, or the Cretaceous radiation of angiosperms) or following particularly severe mass extinction events, such as that at the end of the Permian. PMID:20980315
The relationship between energy expenditure and speed during pedestrian locomotion in birds: a morphological basis for the elevated y-intercept?

PubMed

Halsey, Lewis G

2013-06-01

The slope of the typically linear relationship between metabolic rate and walking speed represents the net cost of transport (NCOT). The extrapolated y-intercept is often greater than resting metabolic rate, thus representing a fixed cost associated with pedestrian transport including body maintenance costs. The full cause of the elevated y-intercept remains elusive and it could simply represent experimental stresses. The present literature-based study compares the mass-independent energetic cost of pedestrian locomotion in birds (excluding those with an upright posture, i.e. penguins), represented by the y-intercept, to a known predictor of cost of transport, hip height. Both phylogenetically informed and non-phylogenetically informed analyses were undertaken to determine if patterns of association between hip height, body mass, and the y-intercept are robust with respect to the method of analysis. Body mass and hip height were significant predictors of the y-intercept in the best phylogenetically-informed and non-phylogenetically informed models. Thus there is evidence that, in birds at least, the elevated y-intercept is a legitimate component of locomotion energy expenditure. Hip height is probably a good proxy of effective limb length and thus perhaps birds with greater hip heights have lower y-intercepts because their longer legs more efficiently accommodate body motion and/or because their limbs are more aligned with the ground reaction forces. Copyright © 2013 Elsevier Inc. All rights reserved.
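The quantities named in this abstract can be made concrete with a minimal sketch: an ordinary least-squares line is fitted to metabolic rate against walking speed, the slope is read as the net cost of transport (NCOT), and the extrapolated y-intercept is compared with resting metabolic rate. All numbers below are invented for illustration and are not taken from the study, the function and variable names are hypothetical, and the fit is the plain (non-phylogenetic) regression for a single animal, not the cross-species models the author compares.

```python
def fit_line(speeds, rates):
    """Ordinary least-squares fit of rate = intercept + slope * speed."""
    n = len(speeds)
    mean_x = sum(speeds) / n
    mean_y = sum(rates) / n
    sxx = sum((x - mean_x) ** 2 for x in speeds)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(speeds, rates))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical measurements for one bird (arbitrary but consistent units).
speeds = [0.5, 1.0, 1.5, 2.0]       # walking speed
rates = [7.0, 10.0, 13.0, 16.0]     # metabolic rate at each speed
resting_rate = 3.0                  # metabolic rate at rest

ncot, y_intercept = fit_line(speeds, rates)
elevation = y_intercept - resting_rate  # the "elevated y-intercept"

print(ncot)         # slope: net cost of transport
print(y_intercept)  # extrapolated rate at zero speed
print(elevation)    # amount by which the intercept exceeds resting rate
```

With these made-up values the intercept exceeds the resting rate, which is the pattern the abstract describes; whether that excess reflects a real locomotion cost or experimental stress is the question the study addresses.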
Comparative analysis of Campylobacter isolates from wild birds and chickens using MALDI-TOF MS, biochemical testing, and DNA sequencing.

PubMed

Lawton, Samantha J; Weis, Allison M; Byrne, Barbara A; Fritz, Heather; Taff, Conor C; Townsend, Andrea K; Weimer, Bart C; Mete, Aslı; Wheeler, Sarah; Boyce, Walter M

2018-05-01

Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) was compared to conventional biochemical testing methods and nucleic acid analyses (16S rDNA sequencing, hippurate hydrolysis gene testing, whole genome sequencing [WGS]) for species identification of Campylobacter isolates obtained from chickens (Gallus gallus domesticus, n = 8), American crows (Corvus brachyrhynchos, n = 17), a mallard duck (Anas platyrhynchos, n = 1), and a western scrub-jay (Aphelocoma californica, n = 1). The test results for all 27 isolates were in 100% agreement between MALDI-TOF MS, the combined results of 16S rDNA sequencing, and the hippurate hydrolysis gene PCR (p = 0.0027, kappa = 1). Likewise, the identifications derived from WGS from a subset of 14 isolates were in 100% agreement with the MALDI-TOF MS identification. In contrast, biochemical testing misclassified 5 isolates of C. jejuni as C. coli, and 16S rDNA sequencing alone was not able to differentiate between C. coli and C. jejuni for 11 sequences (p = 0.1573, kappa = 0.0857) when compared to MALDI-TOF MS and WGS. No agreement was observed between MALDI-TOF MS dendrograms and the phylogenetic relationships revealed by rDNA sequencing or WGS. Our results confirm that MALDI-TOF MS is a fast and reliable method for identifying Campylobacter isolates to the species level from wild birds and chickens, but not for elucidating phylogenetic relationships among Campylobacter isolates.
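The kappa values quoted in this abstract are agreement statistics between two identification methods. As a rough illustration of how such a figure is obtained, the sketch below computes Cohen's kappa for two lists of species calls. The abstract does not state which kappa variant or software was used, so the choice of Cohen's kappa is an assumption, and the isolate labels are invented and do not reproduce the study's data.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: chance-corrected agreement between two raters or methods."""
    n = len(labels_a)
    # Fraction of items on which the two methods give the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance from each method's label frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    # Undefined (division by zero) if both methods use a single identical label.
    return (observed - expected) / (1 - expected)

# Hypothetical species calls for five isolates.
method_1 = ["jejuni", "jejuni", "jejuni", "coli", "coli"]
method_2 = ["jejuni", "jejuni", "jejuni", "coli", "coli"]  # identical calls
method_3 = ["jejuni", "coli", "coli", "coli", "coli"]      # two disagreements

print(cohens_kappa(method_1, method_2))  # complete agreement gives kappa = 1
print(cohens_kappa(method_1, method_3))  # partial agreement gives a lower kappa
```

Kappa discounts the agreement that two methods would reach by chance alone, which is why a method that rarely separates two species can show a kappa near zero even when many individual calls happen to match.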
rbcL gene sequences provide evidence for the evolutionary lineages of leptosporangiate ferns.

PubMed

Hasebe, M; Omori, T; Nakazawa, M; Sano, T; Kato, M; Iwatsuki, K

1994-06-07

Pteridophytes have a longer evolutionary history than any other vascular land plant and, therefore, have endured greater loss of phylogenetically informative information. This factor has resulted in substantial disagreements in evaluating characters and, thus, controversy in establishing a stable classification. To compare competing classifications, we obtained DNA sequences of a chloroplast gene. The sequence of 1206 nt of the large subunit of the ribulose-bisphosphate carboxylase gene (rbcL) was determined from 58 species, representing almost all families of leptosporangiate ferns. Phylogenetic trees were inferred by the neighbor-joining and the parsimony methods. The two methods produced almost identical phylogenetic trees that provided insights concerning major general evolutionary trends in the leptosporangiate ferns. Interesting findings were as follows: (i) two morphologically distinct heterosporous water ferns, Marsilea and Salvinia, are sister genera; (ii) the tree ferns (Cyatheaceae, Dicksoniaceae, and Metaxyaceae) are monophyletic; and (iii) polypodioids are distantly related to the gleichenioids in spite of the similarity of their exindusiate soral morphology and are close to the higher indusiate ferns. In addition, the affinities of several "problematic genera" were assessed.
Assessment of phylogenetic sensitivity for reconstructing HIV-1 epidemiological relationships.

PubMed

Beloukas, Apostolos; Magiorkinis, Emmanouil; Magiorkinis, Gkikas; Zavitsanou, Asimina; Karamitros, Timokratis; Hatzakis, Angelos; Paraskevis, Dimitrios

2012-06-01

Phylogenetic analysis has been extensively used as a tool for the reconstruction of epidemiological relations for research or for forensic purposes. It was our objective to assess the sensitivity of different phylogenetic methods and various phylogenetic programs to reconstruct epidemiological links among HIV-1 infected patients, that is, the probability of revealing a true transmission relationship. Multiple datasets (90) were prepared consisting of HIV-1 sequences in protease (PR) and partial reverse transcriptase (RT) sampled from patients with documented epidemiological relationship (target population), and from unrelated individuals (control population) belonging to the same HIV-1 subtype as the target population. Each dataset varied regarding the number, the geographic origin and the transmission risk groups of the sequences among the control population. Phylogenetic trees were inferred by neighbor-joining (NJ), maximum likelihood heuristics (hML) and Bayesian methods. All clusters of sequences belonging to the target population were correctly reconstructed by NJ and Bayesian methods receiving high bootstrap and posterior probability (PP) support, respectively. On the other hand, TreePuzzle failed to reconstruct or provide significant support for several clusters; high puzzling step support was associated with the inclusion of control sequences from the same geographic area as the target population. In contrast, all clusters were correctly reconstructed by hML as implemented in PhyML 3.0 receiving high bootstrap support. We report that under the conditions of our study, hML using PhyML, NJ and Bayesian methods were the most sensitive for the reconstruction of epidemiological links mostly from sexually infected individuals. Copyright © 2012 Elsevier B.V. All rights reserved.
A phylogenetically-based nomenclature for Cordycipitaceae (Hypocreales)

USDA-ARS's Scientific Manuscript database

Changes in Article 59 of the International Code of Nomenclature for algae, fungi, and plants (ICN) disallow the use of dual nomenclatural systems for fungi. This change requires the reconciliation of competing names, ideally linked through culture based or molecular methods. The phylogenetic syste...
Phylogenetic Structure of Tree Species across Different Life Stages from Seedlings to Canopy Trees in a Subtropical Evergreen Broad-Leaved Forest.

PubMed

Jin, Yi; Qian, Hong; Yu, Mingjian

2015-01-01

Investigating patterns of phylogenetic structure across different life stages of tree species in forests is crucial to understanding forest community assembly, and investigating forest gap influence on the phylogenetic structure of forest regeneration is necessary for understanding forest community assembly. Here, we examine the phylogenetic structure of tree species across life stages from seedlings to canopy trees, as well as forest gap influence on the phylogenetic structure of forest regeneration in a forest of the subtropical region in China. We investigate changes in phylogenetic relatedness (measured as NRI) of tree species from seedlings, saplings, treelets to canopy trees; we compare the phylogenetic turnover (measured as βNRI) between canopy trees and seedlings in forest understory with that between canopy trees and seedlings in forest gaps. We found that phylogenetic relatedness generally increases from seedlings through saplings and treelets up to canopy trees, and that phylogenetic relatedness does not differ between seedlings in forest understory and those in forest gaps, but phylogenetic turnover between canopy trees and seedlings in forest understory is lower than that between canopy trees and seedlings in forest gaps. We conclude that tree species tend to be more closely related from seedling to canopy layers, and that forest gaps alter the seedling phylogenetic turnover of the studied forest. It is likely that the increasing trend of phylogenetic clustering as tree stem size increases observed in this subtropical forest is primarily driven by abiotic filtering processes, which select a set of closely related evergreen broad-leaved tree species whose regeneration has adapted to the closed canopy environments of the subtropical forest developed under the regional monsoon climate.
Phylogenetic Structure of Tree Species across Different Life Stages from Seedlings to Canopy Trees in a Subtropical Evergreen Broad-Leaved Forest

PubMed Central

Jin, Yi; Qian, Hong; Yu, Mingjian

2015-01-01

Investigating patterns of phylogenetic structure across different life stages of tree species in forests is crucial to understanding forest community assembly, and investigating forest gap influence on the phylogenetic structure of forest regeneration is necessary for understanding forest community assembly. Here, we examine the phylogenetic structure of tree species across life stages from seedlings to canopy trees, as well as forest gap influence on the phylogenetic structure of forest regeneration in a forest of the subtropical region in China. We investigate changes in phylogenetic relatedness (measured as NRI) of tree species from seedlings, saplings, treelets to canopy trees; we compare the phylogenetic turnover (measured as βNRI) between canopy trees and seedlings in forest understory with that between canopy trees and seedlings in forest gaps. We found that phylogenetic relatedness generally increases from seedlings through saplings and treelets up to canopy trees, and that phylogenetic relatedness does not differ between seedlings in forest understory and those in forest gaps, but phylogenetic turnover between canopy trees and seedlings in forest understory is lower than that between canopy trees and seedlings in forest gaps. We conclude that tree species tend to be more closely related from seedling to canopy layers, and that forest gaps alter the seedling phylogenetic turnover of the studied forest. It is likely that the increasing trend of phylogenetic clustering as tree stem size increases observed in this subtropical forest is primarily driven by abiotic filtering processes, which select a set of closely related evergreen broad-leaved tree species whose regeneration has adapted to the closed canopy environments of the subtropical forest developed under the regional monsoon climate. PMID:26098916
Symbiosis between hydra and chlorella: molecular phylogenetic analysis and experimental study provide insight into its origin and evolution.

PubMed

Kawaida, Hitomi; Ohba, Kohki; Koutake, Yuhki; Shimizu, Hiroshi; Tachida, Hidenori; Kobayakawa, Yoshitaka

2013-03-01

Although many physiological studies have been reported on the symbiosis between hydra and green algae, very little information from a molecular phylogenetic aspect of symbiosis is available. In order to understand the origin and evolution of symbiosis between the two organisms, we compared the phylogenetic relationships among symbiotic green algae with the phylogenetic relationships among host hydra strains. To do so, we reconstructed molecular phylogenetic trees of several strains of symbiotic chlorella harbored in the endodermal epithelial cells of viridissima group hydra strains and investigated their congruence with the molecular phylogenetic trees of the host hydra strains. To examine the species specificity between the host and the symbiont with respect to the genetic distance, we also tried to introduce chlorella strains into two aposymbiotic strains of viridissima group hydra in which symbiotic chlorella had been eliminated in advance. We discussed the origin and history of symbiosis between hydra and green algae based on the analysis. Copyright © 2012 Elsevier Inc. All rights reserved.
Methods for determining the genetic affinity of microorganisms and viruses

NASA Technical Reports Server (NTRS)

Fox, George E. (Inventor); Willson, III, Richard C. (Inventor); Zhang, Zhengdong (Inventor)

2012-01-01

Selecting which sub-sequences in a database of nucleic acid such as 16S rRNA are highly characteristic of particular groupings of bacteria, microorganisms, fungi, etc. on a substantially phylogenetic tree. Also applicable to viruses comprising viral genomic RNA or DNA. A catalogue of highly characteristic sequences identified by this method is assembled to establish the genetic identity of an unknown organism. The characteristic sequences are used to design nucleic acid hybridization probes that include the characteristic sequence or its complement, or are derived from one or more characteristic sequences. A plurality of these characteristic sequences is used in hybridization to determine the phylogenetic tree position of the organism(s) in a sample. Those target organisms represented in the original sequence database and sufficient characteristic sequences can identify to the species or subspecies level. Oligonucleotide arrays of many probes are especially preferred. A hybridization signal can comprise fluorescence, chemiluminescence, or isotopic labeling, etc.; or sequences in a sample can be detected by direct means, e.g. mass spectrometry. The method's characteristic sequences can also be used to design specific PCR primers. The method uniquely identifies the phylogenetic affinity of an unknown organism without requiring prior knowledge of what is present in the sample. Even if the organism has not been previously encountered, the method still provides useful information about which phylogenetic tree bifurcation nodes encompass the organism.
False discovery rate control incorporating phylogenetic tree increases detection power in microbiome-wide multiple testing.

PubMed

Xiao, Jian; Cao, Hongyuan; Chen, Jun

2017-09-15

Next generation sequencing technologies have enabled the study of the human microbiome through direct sequencing of microbial DNA, resulting in an enormous amount of microbiome sequencing data. One unique characteristic of microbiome data is the phylogenetic tree that relates all the bacterial species. Closely related bacterial species have a tendency to exhibit a similar relationship with the environment or disease. Thus, incorporating the phylogenetic tree information can potentially improve the detection power for microbiome-wide association studies, where hundreds or thousands of tests are conducted simultaneously to identify bacterial species associated with a phenotype of interest. Despite much progress in multiple testing procedures such as false discovery rate (FDR) control, methods that take into account the phylogenetic tree are largely limited. We propose a new FDR control procedure that incorporates the prior structure information and apply it to microbiome data. The proposed procedure is based on a hierarchical model, where a structure-based prior distribution is designed to utilize the phylogenetic tree. By borrowing information from neighboring bacterial species, we are able to improve the statistical power of detecting associated bacterial species while controlling the FDR at desired levels. When the phylogenetic tree is mis-specified or non-informative, our procedure achieves a similar power as traditional procedures that do not take into account the tree structure. We demonstrate the performance of our method through extensive simulations and real microbiome datasets. We identified far more alcohol-drinking associated bacterial species than traditional methods. R package StructFDR is available from CRAN. Contact: chen.jun2@mayo.edu. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved.
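For context on the "traditional procedures that do not take into account the tree structure", the sketch below implements the Benjamini-Hochberg step-up procedure, a standard tree-agnostic form of FDR control. The abstract does not name its baseline, so using Benjamini-Hochberg here is an assumption. This is not the authors' hierarchical model and does not use the StructFDR package; the function name and the p-values are hypothetical.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return the indices of hypotheses rejected at FDR level alpha."""
    m = len(pvalues)
    # Rank hypotheses from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        # Step-up rule: remember the largest rank whose p-value
        # falls under its threshold alpha * rank / m.
        if pvalues[i] <= alpha * rank / m:
            cutoff = rank
    # Reject every hypothesis ranked at or below the cutoff.
    return sorted(order[:cutoff])

# Hypothetical p-values, one per bacterial species tested.
pvals = [0.02, 0.021, 0.03, 0.5]
print(benjamini_hochberg(pvals))  # [0, 1, 2]
```

In this example the smallest p-value misses its own threshold (0.02 > 0.0125), yet it is still rejected because a higher-ranked p-value passes; that is the step-up behaviour. A tree-aware procedure such as the one described in the abstract would additionally borrow strength from neighbouring species on the phylogeny, which this baseline ignores.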
Mitochondrial genomes of Meloidogyne chitwoodi and M. incognita (Nematoda: Tylenchina): comparative analysis, gene order and phylogenetic relationships with other nematodes.

PubMed

Humphreys-Pereira, Danny A; Elling, Axel A

2014-01-01

Root-knot nematodes (Meloidogyne spp.) are among the most important plant pathogens. In this study, the mitochondrial (mt) genomes of the root-knot nematodes, M. chitwoodi and M. incognita were sequenced. PCR analyses suggest that both mt genomes are circular, with an estimated size of 19.7 and 18.6-19.1kb, respectively. The mt genomes each contain a large non-coding region with tandem repeats and the control region. The mt gene arrangement of M. chitwoodi and M. incognita is unlike that of other nematodes. Sequence alignments of the two Meloidogyne mt genomes showed three translocations; two in transfer RNAs and one in cox2. Compared with other nematode mt genomes, the gene arrangement of M. chitwoodi and M. incognita was most similar to Pratylenchus vulnus. Phylogenetic analyses (Maximum Likelihood and Bayesian inference) were conducted using 78 complete mt genomes of diverse nematode species. Analyses based on nucleotides and amino acids of the 12 protein-coding mt genes showed strong support for the monophyly of class Chromadorea, but only amino acid-based analyses supported the monophyly of class Enoplea. The suborder Spirurina was not monophyletic in any of the phylogenetic analyses, contradicting the Clade III model, which groups Ascaridomorpha, Spiruromorpha and Oxyuridomorpha based on the small subunit ribosomal RNA gene. Importantly, comparisons of mt gene arrangement and tree-based methods placed Meloidogyne as sister taxa of Pratylenchus, a migratory plant endoparasitic nematode, and not with the sedentary endoparasitic Heterodera. Thus, comparative analyses of mt genomes suggest that sedentary endoparasitism in Meloidogyne and Heterodera is based on convergent evolution. Copyright © 2014 Elsevier B.V. All rights reserved.
Computing and Applying Atomic Regulons to Understand Gene Expression and Regulation

DOE PAGES

Faria, José P.; Davis, James J.; Edirisinghe, Janaka N.; ...

2016-11-24

Understanding gene function and regulation is essential for the interpretation, prediction, and ultimate design of cell responses to changes in the environment. A multitude of technologies, abstractions, and interpretive frameworks have emerged to answer the challenges presented by genome function and regulatory network inference. Here, we propose a new approach for producing biologically meaningful clusters of coexpressed genes, called Atomic Regulons (ARs), based on expression data, gene context, and functional relationships. We demonstrate this new approach by computing ARs for Escherichia coli, which we compare with the coexpressed gene clusters predicted by two prevalent existing methods: hierarchical clustering and k-means clustering. We test the consistency of ARs predicted by all methods against expected interactions predicted by the Context Likelihood of Relatedness (CLR) mutual-information-based method, finding that the ARs produced by our approach show better agreement with CLR interactions. We then apply our method to compute ARs for four other genomes: Shewanella oneidensis, Pseudomonas aeruginosa, Thermus thermophilus, and Staphylococcus aureus. We compare the AR clusters from all genomes to study the similarity of coexpression among a phylogenetically diverse set of species, identifying subsystems that show remarkable similarity over wide phylogenetic distances. We also study the sensitivity of our method for computing ARs to the expression data used in the computation, showing that our new approach requires less data than competing approaches to converge to a near-final configuration of ARs. We go on to use our sensitivity analysis to identify the specific experiments that lead most rapidly to the final set of ARs for E. coli. As a result, this analysis produces insights into improving the design of gene expression experiments.
Missing Data and Influential Sites: Choice of Sites for Phylogenetic Analysis Can Be As Important As Taxon Sampling and Model Choice

PubMed Central

Shavit Grievink, Liat; Penny, David; Holland, Barbara R.

2013-01-01

Phylogenetic studies based on molecular sequence alignments are expected to become more accurate as the number of sites in the alignments increases. With the advent of genomic-scale data, where alignments have very large numbers of sites, bootstrap values close to 100% and posterior probabilities close to 1 are the norm, suggesting that the number of sites is now seldom a limiting factor on phylogenetic accuracy. This provokes the question, should we be fussy about the sites we choose to include in a genomic-scale phylogenetic analysis? If some sites contain missing data, ambiguous character states, or gaps, then why not just throw them away before conducting the phylogenetic analysis? Indeed, this is exactly the approach taken in many phylogenetic studies. Here, we present an example where the decision on how to treat sites with missing data is of equal importance to decisions on taxon sampling and model choice, and we introduce a graphical method for illustrating this. PMID:23471508
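To make concrete what "throwing away" sites means, the following minimal sketch removes every alignment column in which any taxon has a gap or ambiguous state. The function name, the toy alignment, and the choice of `-`, `?` and `N` as missing-data symbols are assumptions for illustration; the abstract's point is that this filtering step is a consequential analytical decision, not that it should be applied.

```python
def drop_incomplete_sites(alignment, missing="-?N"):
    """Keep only alignment columns with no missing or ambiguous states.

    alignment: dict mapping taxon name -> aligned sequence (equal lengths).
    missing:   characters treated as missing data (an assumption; 'N' is
               only appropriate for nucleotide alignments).
    """
    sequences = list(alignment.values())
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must have the same aligned length")
    # Indices of columns in which every taxon has an unambiguous state.
    keep = [i for i in range(length)
            if all(seq[i].upper() not in missing for seq in sequences)]
    return {taxon: "".join(seq[i] for i in keep)
            for taxon, seq in alignment.items()}


# Hypothetical three-taxon alignment with one gap, one '?' and one 'N'.
toy_alignment = {
    "taxonA": "ACGT-A",
    "taxonB": "AC?TTA",
    "taxonC": "ACGTTN",
}
print(drop_incomplete_sites(toy_alignment))
```

In this toy case half of the columns are discarded because a single taxon is incomplete at each of them, which illustrates how quickly complete-site filtering can shrink a data set.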

[Phylogeny of protostome moulting animals (Ecdysozoa) inferred from 18S and 28S rRNA gene sequences].

PubMed

Petrov, N B; Vladychenskaia, N S

2005-01-01

The reliability of reconstructing phylogenetic relationships within the protostome moulting animals was evaluated by comparing sets of 18S and 28S rRNA gene sequences, analysed both separately and combined. Reliability was assessed by the bootstrap support of the major nodes of the phylogenetic tree and by the degree of congruence among phylogenetic trees inferred by various methods. By both criteria, phylogenetic trees reconstructed from the combined 18S and 28S rRNA gene sequences were better than those inferred from the 18S and 28S sequences taken separately. The results are consistent with the phylogenetic hypothesis that separates protostome animals into two major clades: the moulting Ecdysozoa (Priapulida + Kinorhyncha, Nematoda + Nematomorpha, Onychophora + Tardigrada, Myriapoda + Chelicerata, Crustacea + Hexapoda) and the non-moulting Lophotrochozoa (Plathelminthes, Nemertini, Annelida, Mollusca, Echiura, Sipuncula). The clade Cephalorhyncha does not include the nematomorphs (Nematomorpha). It is concluded that combined 18S and 28S data should be used in phylogenetic studies.
Lactobacillus nantensis sp. nov., isolated from French wheat sourdough.

PubMed

Valcheva, Rosica; Ferchichi, Mounir F; Korakli, Maher; Ivanova, Iskra; Gänzle, Michael G; Vogel, Rudi F; Prévost, Hervé; Onno, Bernard; Dousset, Xavier

2006-03-01

A polyphasic taxonomic study of the bacterial flora isolated from traditional French wheat sourdough, using phenotypic characterization and phylogenetic as well as genetic methods, revealed a consistent group of isolates that could not be assigned to any recognized species. These results were confirmed by randomly amplified polymorphic DNA and amplified fragment length polymorphism fingerprinting analyses. Cells were Gram-positive, homofermentative rods. Comparative 16S rRNA gene sequence analysis of the representative strain LP33T indicated that these strains belong to the genus Lactobacillus and that they formed a branch distinct from their closest relatives Lactobacillus farciminis, Lactobacillus alimentarius, Lactobacillus paralimentarius and Lactobacillus mindensis. DNA-DNA reassociation experiments with the three phylogenetically closest Lactobacillus species confirmed that LP33T (= DSM 16982T = CIP 108546T = TMW 1.1265T) represents the type strain of a novel species, for which the name Lactobacillus nantensis sp. nov. is proposed.
The Use of Weighted Graphs for Large-Scale Genome Analysis

PubMed Central

Zhou, Fang; Toivonen, Hannu; King, Ross D.

2014-01-01

There is an acute need for better tools to extract knowledge from the growing flood of sequence data. For example, thousands of complete genomes have been sequenced, and their metabolic networks inferred. Such data should enable a better understanding of evolution. However, most existing network analysis methods are based on pair-wise comparisons, and these do not scale to thousands of genomes. Here we propose the use of weighted graphs as a data structure to enable large-scale phylogenetic analysis of networks. We have developed three types of weighted graph for enzymes: taxonomic (these summarize phylogenetic importance), isoenzymatic (these summarize enzymatic variety/redundancy), and sequence-similarity (these summarize sequence conservation); and we applied these types of weighted graph to survey prokaryotic metabolism. To demonstrate the utility of this approach we have compared and contrasted the large-scale evolution of metabolism in Archaea and Eubacteria. Our results provide evidence for limits to the contingency of evolution. PMID:24619061
Using hybridization networks to retrace the evolution of Indo-European languages.

PubMed

Willems, Matthieu; Lord, Etienne; Laforest, Louise; Labelle, Gilbert; Lapointe, François-Joseph; Di Sciullo, Anna Maria; Makarenkov, Vladimir

2016-09-06

Curious parallels between the processes of species and language evolution have been observed by many researchers. Retracing the evolution of Indo-European (IE) languages remains one of the most intriguing intellectual challenges in historical linguistics. Most of the IE language studies use the traditional phylogenetic tree model to represent the evolution of natural languages, thus not taking into account reticulate evolutionary events, such as language hybridization and word borrowing, which can be associated with species hybridization and horizontal gene transfer, respectively. More recently, implicit evolutionary networks, such as split graphs and minimal lateral networks, have been used to account for reticulate evolution in linguistics. Striking parallels existing between the evolution of species and natural languages allowed us to apply three computational biology methods for reconstruction of phylogenetic networks to model the evolution of IE languages. We show how the transfer of methods between the two disciplines can be achieved, making necessary methodological adaptations. Considering basic vocabulary data from the well-known Dyen's lexical database, which contains word forms in 84 IE languages for the meanings of a 200-meaning Swadesh list, we adapt a recently developed computational biology algorithm for building explicit hybridization networks to study the evolution of IE languages and compare our findings to the results provided by the split graph and galled network methods. We conclude that explicit phylogenetic networks can be successfully used to identify donors and recipients of lexical material as well as the degree of influence of each donor language on the corresponding recipient languages. We show that our algorithm is well suited to detect reticulate relationships among languages, and present some historical and linguistic justification for the results obtained. Our findings could be further refined if relevant syntactic, phonological and morphological data could be analyzed along with the available lexical data.
Fixing Formalin: A Method to Recover Genomic-Scale DNA Sequence Data from Formalin-Fixed Museum Specimens Using High-Throughput Sequencing

PubMed Central

Hykin, Sarah M.; Bi, Ke; McGuire, Jimmy A.

2015-01-01

For 150 years or more, specimens were routinely collected and deposited in natural history collections without preserving fresh tissue samples for genetic analysis. In the case of most herpetological specimens (i.e. amphibians and reptiles), attempts to extract and sequence DNA from formalin-fixed, ethanol-preserved specimens, particularly for use in phylogenetic analyses, have been laborious and largely ineffective due to the highly fragmented nature of the DNA. As a result, tens of thousands of specimens in herpetological collections have not been available for sequence-based phylogenetic studies. Massively parallel High-Throughput Sequencing methods and the associated bioinformatics, however, are particularly suited to recovering meaningful genetic markers from severely degraded/fragmented DNA sequences such as DNA damaged by formalin-fixation. In this study, we compared previously published DNA extraction methods on three tissue types subsampled from formalin-fixed specimens of Anolis carolinensis, followed by sequencing. DNA of sufficient quality was recovered from liver tissue, making this technique minimally destructive to museum specimens. Sequencing was only successful for the more recently collected specimen (collected ~30 ybp). We suspect this could be due to the conditions of preservation, the amount of tissue used for extraction, or both. For the successfully sequenced sample, we found a high rate of base misincorporation. After rigorous trimming, we successfully mapped 27.93% of the cleaned reads to the reference genome, were able to reconstruct the complete mitochondrial genome, and recovered an accurate phylogenetic placement for our specimen. We conclude that the amount of DNA available, which can vary depending on specimen age and preservation conditions, will determine if sequencing will be successful. The technique described here will greatly improve the value of museum collections by making many formalin-fixed specimens available for genetic analysis. PMID:26505622
Conservation Action Based on Threatened Species Capture Taxonomic and Phylogenetic Richness in Breeding and Wintering Populations of Central Asian Birds

PubMed Central

Schweizer, Manuel; Ayé, Raffael; Kashkarov, Roman; Roth, Tobias

2014-01-01

Although phylogenetic diversity has been suggested to be relevant from a conservation point of view, its role is still limited in applied nature conservation. Recently, the practice of investing conservation resources based on threatened species was identified as a reason for the slow integration of phylogenetic diversity in nature conservation planning. One of the main arguments is based on the observation that threatened species are not evenly distributed over the phylogenetic tree. However this argument seems to dismiss the fact that conservation action is a spatially explicit process, and even if threatened species are not evenly distributed over the phylogenetic tree, the occurrence of threatened species could still indicate areas with above average phylogenetic diversity and consequently could protect phylogenetic diversity. Here we aim to study the selection of important bird areas in Central Asia, which were nominated largely based on the presence of threatened bird species. We show that although threatened species occurring in Central Asia do not capture phylogenetically more distinct species than expected by chance, the current spatially explicit conservation approach of selecting important bird areas covers above average taxonomic and phylogenetic diversity of breeding and wintering birds. We conclude that the spatially explicit processes of conservation actions need to be considered in the current discussion of whether new prioritization methods are needed to complement conservation action based on threatened species. PMID:25337861
Optimal network alignment with graphlet degree vectors.

PubMed

Milenković, Tijana; Ng, Weng Leong; Hayes, Wayne; Przulj, Natasa

2010-06-30

Important biological information is encoded in the topology of biological networks. Comparative analyses of biological networks are proving to be valuable, as they can lead to transfer of knowledge between species and give deeper insights into biological function, disease, and evolution. We introduce a new method that uses the Hungarian algorithm to produce optimal global alignment between two networks using any cost function. We design a cost function based solely on network topology and use it in our network alignment. Our method can be applied to any two networks, not just biological ones, since it is based only on network topology. We use our new method to align protein-protein interaction networks of two eukaryotic species and demonstrate that our alignment exposes large and topologically complex regions of network similarity. At the same time, our alignment is biologically valid, since many of the aligned protein pairs perform the same biological function. From the alignment, we predict function of yet unannotated proteins, many of which we validate in the literature. Also, we apply our method to find topological similarities between metabolic networks of different species and build phylogenetic trees based on our network alignment score. The phylogenetic trees obtained in this way bear a striking resemblance to the ones obtained by sequence alignments. Our method detects topologically similar regions in large networks that are statistically significant. It does this independent of protein sequence or any other information external to network topology.
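The idea of aligning two networks by minimizing a topology-only cost can be sketched on a toy example. Two simplifications are assumed here and are not the authors' method: the node cost is just the absolute difference in node degree (far cruder than graphlet degree vectors), and the optimal assignment is found by brute force over permutations rather than by the Hungarian algorithm, which solves the same assignment problem in polynomial time. All names are hypothetical.

```python
from itertools import permutations


def degrees(nodes, edges):
    """Count the degree of each node in an undirected edge list."""
    deg = {node: 0 for node in nodes}
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def optimal_alignment(nodes_a, edges_a, nodes_b, edges_b):
    """Map every node of network A to a distinct node of network B so that
    the summed degree difference is minimal.

    Brute force over all injective mappings: only feasible for tiny
    networks, and it requires len(nodes_a) <= len(nodes_b).
    """
    deg_a = degrees(nodes_a, edges_a)
    deg_b = degrees(nodes_b, edges_b)
    best_cost, best_map = None, None
    for perm in permutations(nodes_b, len(nodes_a)):
        cost = sum(abs(deg_a[a] - deg_b[b]) for a, b in zip(nodes_a, perm))
        if best_cost is None or cost < best_cost:
            best_cost, best_map = cost, dict(zip(nodes_a, perm))
    return best_cost, best_map


# Two toy three-node paths; the middle nodes (a2, b2) have degree 2.
net_a_nodes = ["a1", "a2", "a3"]
net_a_edges = [("a1", "a2"), ("a2", "a3")]
net_b_nodes = ["b1", "b2", "b3"]
net_b_edges = [("b2", "b1"), ("b2", "b3")]

alignment_cost, alignment_map = optimal_alignment(
    net_a_nodes, net_a_edges, net_b_nodes, net_b_edges)
print(alignment_cost, alignment_map)
```

Because the cost depends only on topology, the same function applies to any pair of small networks, biological or not, which mirrors the generality claimed in the abstract.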
Phylogenetic studies of transmission dynamics in generalized HIV epidemics: An essential tool where the burden is greatest?

PubMed Central

Dennis, Ann M.; Herbeck, Joshua T.; Brown, Andrew Leigh; Kellam, Paul; de Oliveira, Tulio; Pillay, Deenan; Fraser, Christophe; Cohen, Myron S.

2014-01-01

Efficient and effective HIV prevention measures for generalized epidemics in sub-Saharan Africa have not yet been validated at the population-level. Design and impact evaluation of such measures requires fine-scale understanding of local HIV transmission dynamics. The novel tools of HIV phylogenetics and molecular epidemiology may elucidate these transmission dynamics. Such methods have been incorporated into studies of concentrated HIV epidemics to identify proximate and determinant traits associated with ongoing transmission. However, applying similar phylogenetic analyses to generalized epidemics, including the design and evaluation of prevention trials, presents additional challenges. Here we review the scope of these methods and present examples of their use in concentrated epidemics in the context of prevention. Next, we describe the current uses for phylogenetics in generalized epidemics, and discuss their promise for elucidating transmission patterns and informing prevention trials. Finally, we review logistic and technical challenges inherent to large-scale molecular epidemiological studies of generalized epidemics, and suggest potential solutions. PMID:24977473
Primate comparative neuroscience using magnetic resonance imaging: promises and challenges

PubMed Central

Mars, Rogier B.; Neubert, Franz-Xaver; Verhagen, Lennart; Sallet, Jérôme; Miller, Karla L.; Dunbar, Robin I. M.; Barton, Robert A.

2014-01-01

Primate comparative anatomy is an established field that has made rich and substantial contributions to neuroscience. However, the labor-intensive techniques employed mean that most comparisons are often based on a small number of species, which limits the conclusions that can be drawn. In this review we explore how new developments in magnetic resonance imaging have the potential to apply comparative neuroscience to a much wider range of species, allowing it to realize an even greater potential. We discuss (1) new advances in the types of data that can be acquired, (2) novel methods for extracting meaningful measures from such data that can be compared between species, and (3) methods to analyse these measures within a phylogenetic framework. Together these developments will allow researchers to characterize the relationship between different brains, the ecological niche they occupy, and the behavior they produce in more detail than ever before. PMID:25339857
Using Genotype Abundance to Improve Phylogenetic Inference

PubMed Central

Mesin, Luka; Victora, Gabriel D; Minin, Vladimir N; Matsen, Frederick A

2018-01-01

Modern biological techniques enable very dense genetic sampling of unfolding evolutionary histories, and thus frequently sample some genotypes multiple times. This motivates strategies to incorporate genotype abundance information in phylogenetic inference. In this article, we synthesize a stochastic process model with standard sequence-based phylogenetic optimality, and show that tree estimation is substantially improved by doing so. Our method is validated with extensive simulations and an experimental single-cell lineage tracing study of germinal center B cell receptor affinity maturation. PMID:29474671
Dimensional Reduction for the General Markov Model on Phylogenetic Trees.

PubMed

Sumner, Jeremy G

2017-03-01

We present a method of dimensional reduction for the general Markov model of sequence evolution on a phylogenetic tree. We show that taking certain linear combinations of the associated random variables (site pattern counts) reduces the dimensionality of the model from exponential in the number of extant taxa, to quadratic in the number of taxa, while retaining the ability to statistically identify phylogenetic divergence events. A key feature is the identification of an invariant subspace which depends only bilinearly on the model parameters, in contrast to the usual multi-linear dependence in the full space. We discuss potential applications including the computation of split (edge) weights on phylogenetic trees from observed sequence data.
Divergent ancestral lineages of newfound hantaviruses harbored by phylogenetically related crocidurine shrew species in Korea

PubMed Central

Arai, Satoru; Gu, Se Hun; Baek, Luck Ju; Tabara, Kenji; Bennett, Shannon; Oh, Hong-Shik; Takada, Nobuhiro; Kang, Hae Ji; Tanaka-Taya, Keiko; Morikawa, Shigeru; Okabe, Nobuhiko; Yanagihara, Richard; Song, Jin-Won

2012-01-01

Spurred by the recent isolation of a novel hantavirus, named Imjin virus (MJNV), from the Ussuri white-toothed shrew (Crocidura lasiura), targeted trapping was conducted for the phylogenetically related Asian lesser white-toothed shrew (Crocidura shantungensis). Pair-wise alignment and comparison of the S, M and L segments of a newfound hantavirus, designated Jeju virus (JJUV), indicated remarkably low nucleotide and amino acid sequence similarity with MJNV. Phylogenetic analyses, using maximum likelihood and Bayesian methods, showed divergent ancestral lineages for JJUV and MJNV, despite the close phylogenetic relationship of their reservoir soricid hosts. Also, no evidence of host switching was apparent in tanglegrams, generated by TreeMap 2.0β. PMID:22230701
Phylogenetics.

PubMed

Sleator, Roy D

2011-04-01

The recent rapid expansion in the DNA and protein databases, arising from large-scale genomic and metagenomic sequence projects, has forced significant development in the field of phylogenetics: the study of the evolutionary relatedness of the planet's inhabitants. Advances in phylogenetic analysis have greatly transformed our view of the landscape of evolutionary biology, transcending the view of the tree of life that has shaped evolutionary theory since Darwinian times. Indeed, modern phylogenetic analysis no longer focuses on the restricted Darwinian-Mendelian model of vertical gene transfer, but must also consider the significant degree of lateral gene transfer, which connects and shapes almost all living things. Herein, I review the major tree-building methods, their strengths, weaknesses and future prospects.
Determining protein function and interaction from genome analysis

DOEpatents

Eisenberg, David; Marcotte, Edward M.; Thompson, Michael J.; Pellegrini, Matteo; Yeates, Todd O.

2004-08-03

A computational method, system, and computer program are provided for inferring functional links from genome sequences. One method is based on the observation that some pairs of proteins A' and B' have homologs in another organism fused into a single protein chain AB. A trans-genome comparison of sequences can reveal these AB sequences, which are Rosetta Stone sequences because they decipher an interaction between A' and B'. Another method compares the genomic sequences of two or more organisms to create a phylogenetic profile for each protein, indicating its presence or absence across all the genomes. The profile provides information regarding functional links between different families of proteins. In yet another method, a combination of the above two methods is used to predict functional links.
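The phylogenetic-profile idea described above can be illustrated with a short sketch: each protein gets a presence/absence vector across genomes, and proteins with matching vectors are flagged as candidate functional partners. The function names, the mismatch threshold, and the toy data are assumptions for illustration and are not taken from the patent.

```python
def phylogenetic_profiles(protein_presence, genomes):
    """Build a 0/1 presence vector per protein, ordered like `genomes`.

    protein_presence: dict mapping protein -> set of genomes with a homolog.
    """
    return {protein: tuple(int(genome in present) for genome in genomes)
            for protein, present in protein_presence.items()}


def linked_pairs(profiles, max_mismatches=0):
    """Return protein pairs whose profiles differ in at most
    `max_mismatches` genomes (Hamming distance on the profile vectors)."""
    names = sorted(profiles)
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            mismatches = sum(x != y for x, y in
                             zip(profiles[first], profiles[second]))
            if mismatches <= max_mismatches:
                pairs.append((first, second))
    return pairs


# Hypothetical homolog occurrences of four proteins in four genomes.
toy_genomes = ["g1", "g2", "g3", "g4"]
toy_presence = {
    "P1": {"g1", "g3"},
    "P2": {"g1", "g3"},
    "P3": {"g2", "g4"},
    "P4": {"g1", "g2", "g3"},
}
toy_profiles = phylogenetic_profiles(toy_presence, toy_genomes)
print(toy_profiles)
print(linked_pairs(toy_profiles))                     # identical profiles only
print(linked_pairs(toy_profiles, max_mismatches=1))   # allow one difference
```

In practice, proteins present in every genome or in none carry no signal under this scheme, so a real analysis would likely filter them out before comparing profiles.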
The riddle of Tasmanian languages

PubMed Central

Bowern, Claire

2012-01-01

Recent work which combines methods from linguistics and evolutionary biology has been fruitful in discovering the history of major language families because of similarities in evolutionary processes. Such work opens up new possibilities for language research on previously unsolvable problems, especially in areas where information from other sources may be lacking. I use phylogenetic methods to investigate Tasmanian languages. Existing materials are so fragmentary that scholars have been unable to discover how many languages are represented in the sources. Using a clustering algorithm which identifies admixture, source materials representing more than one language are identified. Using the Neighbor-Net algorithm, 12 languages are identified in five clusters. Bayesian phylogenetic methods reveal that the families are not demonstrably related; an important result, given the importance of Tasmanian Aborigines for information about how societies have responded to population collapse in prehistory. This work provides insight into the societies of prehistoric Tasmania and illustrates a new utility of phylogenetics in reconstructing linguistic history. PMID:23015621
Monte Carlo estimation of total variation distance of Markov chains on large spaces, with application to phylogenetics.

PubMed

Herbei, Radu; Kubatko, Laura

2013-03-26

Markov chains are widely used for modeling in many areas of molecular biology and genetics. As the complexity of such models advances, it becomes increasingly important to assess the rate at which a Markov chain converges to its stationary distribution in order to carry out accurate inference. A common measure of convergence to the stationary distribution is the total variation distance, but this measure can be difficult to compute when the state space of the chain is large. We propose a Monte Carlo method to estimate the total variation distance that can be applied in this situation, and we demonstrate how the method can be efficiently implemented by taking advantage of GPU computing techniques. We apply the method to two Markov chains on the space of phylogenetic trees, and discuss the implications of our findings for the development of algorithms for phylogenetic inference.
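As a reminder of the quantity being estimated, the sketch below computes the total variation distance between the distribution of a chain after t steps and its stationary distribution. It does so exactly on a two-state chain, which is only feasible because the state space is tiny; the Monte Carlo estimator and GPU implementation described in the abstract target the case where such exact enumeration is impossible. The transition matrix and function names are illustrative assumptions.

```python
def step(dist, transition):
    """One step of the chain: new_dist[j] = sum_i dist[i] * P[i][j]."""
    n = len(dist)
    return [sum(dist[i] * transition[i][j] for i in range(n))
            for j in range(n)]


def total_variation(p, q):
    """Total variation distance: half the L1 distance between p and q."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def tvd_after_steps(start, transition, stationary, t):
    """Distance to stationarity after t steps from the distribution `start`."""
    dist = list(start)
    for _ in range(t):
        dist = step(dist, transition)
    return total_variation(dist, stationary)


# Hypothetical two-state chain; its stationary distribution is (2/3, 1/3).
toy_transition = [[0.9, 0.1],
                  [0.2, 0.8]]
toy_stationary = [2 / 3, 1 / 3]

for t in (0, 1, 5, 20):
    print(t, tvd_after_steps([1.0, 0.0], toy_transition, toy_stationary, t))
```

For this chain the distance shrinks by a factor of 0.7 (the second eigenvalue of the transition matrix) at every step, so convergence is geometric; on the space of phylogenetic trees no such closed form is available, which is what motivates a sampling-based estimate.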
Molecular phylogeny of the aquatic beetle family Noteridae (Coleoptera: Adephaga) with an emphasis on data partitioning strategies.

PubMed

Baca, Stephen M; Toussaint, Emmanuel F A; Miller, Kelly B; Short, Andrew E Z

2017-02-01

The first molecular phylogenetic hypothesis for the aquatic beetle family Noteridae is inferred using DNA sequence data from five gene fragments (mitochondrial and nuclear): COI, H3, 16S, 18S, and 28S. Our analysis is the most comprehensive phylogenetic reconstruction of Noteridae to date, and includes 53 species representing all subfamilies, tribes and 16 of the 17 genera within the family. We examine the impact of data partitioning on phylogenetic inference by comparing two different algorithm-based partitioning strategies: one using predefined subsets of the dataset, and another recently introduced method, which uses the k-means algorithm to iteratively divide the dataset into clusters of sites evolving at similar rates across sampled loci. We conducted both maximum likelihood and Bayesian inference analyses using these different partitioning schemes. Resulting trees are strongly incongruent with prior classifications of Noteridae. We recover variant tree topologies and support values among the implemented partitioning schemes. Bayes factors calculated with marginal likelihoods of Bayesian analyses support a priori partitioning over k-means and unpartitioned data strategies. Our study substantiates the importance of data partitioning in phylogenetic inference, and underscores the use of comparative analyses to determine optimal analytical strategies. Our analyses recover Noterini Thomson to be paraphyletic with respect to three other tribes. The genera Suphisellus Crotch and Hydrocanthus Say are also recovered as paraphyletic. Following the results of the preferred partitioning scheme, we here propose a revised classification of Noteridae, comprising two subfamilies, three tribes and 18 genera. The following taxonomic changes are made: Notomicrinae sensu n. (= Phreatodytinae syn. n.) is expanded to include the tribe Phreatodytini; Noterini sensu n. (= Neohydrocoptini syn. n., Pronoterini syn. n., Tonerini syn. n.) 
is expanded to include all genera of the Noterinae; the genus Suphisellus Crotch is expanded to include species of Pronoterus Sharp syn. n.; and the former subgenus Sternocanthus Guignot stat. rev. is resurrected from synonymy and elevated to genus rank. Copyright © 2016 Elsevier Inc. All rights reserved.
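The Bayes factor comparison of partitioning schemes described in this abstract can be illustrated with a minimal sketch. It assumes each scheme has a marginal log-likelihood estimate; the scheme names and numeric values below are hypothetical placeholders, not results from the study, and the helper names are invented for illustration.

```python
def two_ln_bayes_factor(lnl_a, lnl_b):
    """Return 2 * ln(BF) for model A over model B from marginal log-likelihoods."""
    return 2.0 * (lnl_a - lnl_b)


def rank_schemes(marginal_lnl):
    """Pick the scheme with the highest marginal log-likelihood and report
    2 * ln(BF) of that scheme against every other scheme."""
    best = max(marginal_lnl, key=marginal_lnl.get)
    support = {
        name: two_ln_bayes_factor(marginal_lnl[best], lnl)
        for name, lnl in marginal_lnl.items()
        if name != best
    }
    return best, support


# Hypothetical marginal log-likelihoods (illustrative values only).
marginal_lnl = {
    "a_priori": -45210.4,
    "k_means": -45302.9,
    "unpartitioned": -46011.7,
}

best, support = rank_schemes(marginal_lnl)
print(best, support)
```

On the commonly used Kass and Raftery scale, values of 2 ln BF above 10 are read as very strong support for the better-scoring model.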
Reticulate evolutionary history and extensive introgression in mosquito species revealed by phylogenetic network analysis

PubMed Central

Wen, Dingqiao; Yu, Yun; Hahn, Matthew W.; Nakhleh, Luay

2016-01-01

The role of hybridization and subsequent introgression has been demonstrated in an increasing number of species. Recently, Fontaine et al. (Science, 347, 2015, 1258524) conducted a phylogenomic analysis of six members of the Anopheles gambiae species complex. Their analysis revealed a reticulate evolutionary history and pointed to extensive introgression on all four autosomal arms. The study further highlighted the complex evolutionary signals that the co-occurrence of incomplete lineage sorting (ILS) and introgression can give rise to in phylogenomic analyses. While tree-based methodologies were used in the study, phylogenetic networks provide a more natural model to capture reticulate evolutionary histories. In this work, we reanalyse the Anopheles data using a recently devised framework that combines the multispecies coalescent with phylogenetic networks. This framework allows us to capture ILS and introgression simultaneously, and forms the basis for statistical methods for inferring reticulate evolutionary histories. The new analysis reveals a phylogenetic network with multiple hybridization events, some of which differ from those reported in the original study. To elucidate the extent and patterns of introgression across the genome, we devise a new method that quantifies the use of reticulation branches in the phylogenetic network by each genomic region. Applying the method to the mosquito data set reveals the evolutionary history of all the chromosomes. This study highlights the utility of ‘network thinking’ and the new insights it can uncover, in particular in phylogenomic analyses of large data sets with extensive gene tree incongruence. PMID:26808290
Relationships among genera of the Saccharomycotina (Ascomycota) from multigene phylogenetic analysis of type species

USDA-ARS's Scientific Manuscript database

Phylogenetic relatedness among ascomycetous yeast genera (subphylum Saccharomycotina, phylum Ascomycota) has been uncertain. In the present study, type species of 70 currently recognized genera are compared from divergence in the nearly entire nuclear gene sequences for large subunit rRNA, small sub...

Constructing Student Problems in Phylogenetic Tree Construction.

ERIC Educational Resources Information Center

Brewer, Steven D.

Evolution is often equated with natural selection and is taught from a primarily functional perspective while comparative and historical approaches, which are critical for developing an appreciation of the power of evolutionary theory, are often neglected. This report describes a study of expert problem-solving in phylogenetic tree construction.…
Cyber-infrastructure for Fusarium (CiF): Three integrated platforms supporting strain identification, phylogenetics, comparative genomics, and knowledge sharing

USDA-ARS's Scientific Manuscript database

The fungal genus Fusarium includes many plant and/or animal pathogenic species and produces diverse toxins. Although accurate identification is critical for managing such threats, it is difficult to identify Fusarium morphologically. Fortunately, extensive molecular phylogenetic studies, founded on ...
Disentangling the phylogenetic and ecological components of spider phenotypic variation.

PubMed

Gonçalves-Souza, Thiago; Diniz-Filho, José Alexandre Felizola; Romero, Gustavo Quevedo

2014-01-01

An understanding of how the degree of phylogenetic relatedness influences the ecological similarity among species is crucial to inferring the mechanisms governing the assembly of communities. We evaluated the relative importance of spider phylogenetic relationships and ecological niche (plant morphological variables) to the variation in spider body size and shape by comparing spiders at different scales: (i) between bromeliads and dicot plants (i.e., habitat scale) and (ii) among bromeliads with distinct architectural features (i.e., microhabitat scale). We partitioned the interspecific variation in body size and shape into phylogenetic (that express trait values as expected by phylogenetic relationships among species) and ecological components (that express trait values independent of phylogenetic relationships). At the habitat scale, bromeliad spiders were larger and flatter than spiders associated with the surrounding dicots. At this scale, plant morphology sorted out closely related spiders. Our results showed that spider flatness is phylogenetically clustered at the habitat scale, whereas it is phylogenetically overdispersed at the microhabitat scale, although phylogenetic signal is present at both scales. Taken together, these results suggest that whereas at the habitat scale selective colonization affects spider body size and shape, at fine scales both selective colonization and adaptive evolution determine spider body shape. By partitioning the phylogenetic and ecological components of phenotypic variation, we were able to disentangle the evolutionary history of distinct spider traits and show that plant architecture plays a role in the evolution of spider body size and shape. We also discussed the relevance of considering multiple scales when studying phylogenetic community structure.
Disentangling the Phylogenetic and Ecological Components of Spider Phenotypic Variation

PubMed Central

Gonçalves-Souza, Thiago; Diniz-Filho, José Alexandre Felizola; Romero, Gustavo Quevedo

2014-01-01

An understanding of how the degree of phylogenetic relatedness influences the ecological similarity among species is crucial to inferring the mechanisms governing the assembly of communities. We evaluated the relative importance of spider phylogenetic relationships and ecological niche (plant morphological variables) to the variation in spider body size and shape by comparing spiders at different scales: (i) between bromeliads and dicot plants (i.e., habitat scale) and (ii) among bromeliads with distinct architectural features (i.e., microhabitat scale). We partitioned the interspecific variation in body size and shape into phylogenetic (that express trait values as expected by phylogenetic relationships among species) and ecological components (that express trait values independent of phylogenetic relationships). At the habitat scale, bromeliad spiders were larger and flatter than spiders associated with the surrounding dicots. At this scale, plant morphology sorted out closely related spiders. Our results showed that spider flatness is phylogenetically clustered at the habitat scale, whereas it is phylogenetically overdispersed at the microhabitat scale, although phylogenetic signal is present at both scales. Taken together, these results suggest that whereas at the habitat scale selective colonization affects spider body size and shape, at fine scales both selective colonization and adaptive evolution determine spider body shape. By partitioning the phylogenetic and ecological components of phenotypic variation, we were able to disentangle the evolutionary history of distinct spider traits and show that plant architecture plays a role in the evolution of spider body size and shape. We also discussed the relevance of considering multiple scales when studying phylogenetic community structure. PMID:24651264
Phylogenetic analysis of molecular and morphological data highlights uncertainty in the relationships of fossil and living species of Elopomorpha (Actinopterygii: Teleostei).

PubMed

Dornburg, Alex; Friedman, Matt; Near, Thomas J

2015-08-01

Elopomorpha is one of the three main clades of living teleost fishes and includes a range of disparate lineages including eels, tarpons, bonefishes, and halosaurs. Elopomorphs were among the first groups of fishes investigated using Hennigian phylogenetic methods and continue to be the object of intense phylogenetic scrutiny due to their economic significance, diversity, and crucial evolutionary status as the sister group of all other teleosts. While portions of the phylogenetic backbone for Elopomorpha are consistent between studies, the relationships among Albula, Pterothrissus, Notacanthiformes, and Anguilliformes remain contentious and difficult to evaluate. This lack of phylogenetic resolution is problematic as fossil lineages are often described and placed taxonomically based on an assumed sister group relationship between Albula and Pterothrissus. In addition, phylogenetic studies using morphological data that sample elopomorph fossil lineages often do not include notacanthiform or anguilliform lineages, potentially introducing a bias toward interpreting fossils as members of the common stem of Pterothrissus and Albula. Here we provide a phylogenetic analysis of DNA sequences sampled from multiple nuclear genes that include representative taxa from Albula, Pterothrissus, Notacanthiformes and Anguilliformes. We integrate our molecular dataset with a morphological character matrix that spans both living and fossil elopomorph lineages. Our results reveal substantial uncertainty in the placement of Pterothrissus as well as all sampled fossil lineages, questioning the stability of the taxonomy of fossil Elopomorpha. However, despite topological uncertainty, our integration of fossil lineages into a Bayesian time calibrated framework provides divergence time estimates for the clade that are consistent with previously published age estimates based on the elopomorph fossil record and molecular estimates resulting from traditional node-dating methods. 
Copyright © 2015 Elsevier Inc. All rights reserved.
On the distribution of interspecies correlation for Markov models of character evolution on Yule trees.

PubMed

Mulder, Willem H; Crawford, Forrest W

2015-01-07

Efforts to reconstruct phylogenetic trees and understand evolutionary processes depend fundamentally on stochastic models of speciation and mutation. The simplest continuous-time model for speciation in phylogenetic trees is the Yule process, in which new species are "born" from existing lineages at a constant rate. Recent work has illuminated some of the structural properties of Yule trees, but it remains mostly unknown how these properties affect sequence and trait patterns observed at the tips of the phylogenetic tree. Understanding the interplay between speciation and mutation under simple models of evolution is essential for deriving valid phylogenetic inference methods and gives insight into the optimal design of phylogenetic studies. In this work, we derive the probability distribution of interspecies covariance under Brownian motion and Ornstein-Uhlenbeck models of phenotypic change on a Yule tree. We compute the probability distribution of the number of mutations shared between two randomly chosen taxa in a Yule tree under discrete Markov mutation models. Our results suggest summary measures of phylogenetic information content, illuminate the correlation between site patterns in sequences or traits of related organisms, and provide heuristics for experimental design and reconstruction of phylogenetic trees. Copyright © 2014 Elsevier Ltd. All rights reserved.
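The quantity studied in this abstract can be explored numerically with a small simulation. The sketch below grows a Yule (pure-birth) tree and records, for every pair of tips, the time from the root to their most recent common ancestor. Under Brownian motion with rate sigma^2, the covariance between two tip values equals sigma^2 times that shared time. The function names, the convention of starting from a single root lineage at time 0, and the choice to stop just before the next speciation are assumptions made for illustration, not details taken from the paper.

```python
import random


def simulate_yule_shared_times(n_tips, birth_rate, rng):
    """Simulate a Yule tree with n_tips tips.

    Returns (shared, height), where shared[i][j] is the time from the root
    to the most recent common ancestor of tips i and j, and shared[i][i]
    is the total tree height.
    """
    shared = [[0.0]]  # one root lineage; diagonal is filled in at the end
    t = 0.0
    while len(shared) < n_tips:
        k = len(shared)
        # With k lineages, the waiting time to the next split is Exp(k * rate).
        t += rng.expovariate(k * birth_rate)
        parent = rng.randrange(k)
        # The new lineage inherits the parent's shared history with all others.
        new_row = [shared[parent][j] for j in range(k)]
        new_row[parent] = t  # parent and child diverge at the current time
        for j in range(k):
            shared[j].append(new_row[j])
        shared.append(new_row + [0.0])
    # Let all tips evolve for one more waiting interval before sampling.
    t += rng.expovariate(n_tips * birth_rate)
    for i in range(n_tips):
        shared[i][i] = t
    return shared, t


def bm_covariance(shared, sigma2):
    """Brownian-motion covariance matrix: sigma2 times shared path length."""
    return [[sigma2 * value for value in row] for row in shared]


rng = random.Random(42)
shared, height = simulate_yule_shared_times(6, 1.0, rng)
cov = bm_covariance(shared, sigma2=2.0)
print(round(height, 3), [round(v, 3) for v in cov[0]])
```

Repeating the simulation many times and collecting the off-diagonal entries gives an empirical view of the interspecies covariance distribution that the paper derives analytically.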
Phylogeny and phylogenetic classification of the antbirds, ovenbirds, woodcreepers, and allies (Aves: Passeriformes: Infraorder Furnariides)

USGS Publications Warehouse

Moyle, R.G.; Chesser, R.T.; Brumfield, R.T.; Tello, J.G.; Marchese, D.J.; Cracraft, J.

2009-01-01

The infraorder Furnariides is a diverse group of suboscine passerine birds comprising a substantial component of the Neotropical avifauna. The included species encompass a broad array of morphologies and behaviours, making them appealing for evolutionary studies, but the size of the group (ca. 600 species) has limited well-sampled higher-level phylogenetic studies. Using DNA sequence data from the nuclear RAG-1 and RAG-2 exons, we undertook a phylogenetic analysis of the Furnariides sampling 124 (more than 88%) of the genera. Basal relationships among family-level taxa differed depending on phylogenetic method, but all topologies had little nodal support, mirroring the results from earlier studies in which discerning relationships at the base of the radiation was also difficult. In contrast, branch support for family-rank taxa and for many relationships within those clades was generally high. Our results support the Melanopareidae and Grallariidae as distinct from the Rhinocryptidae and Formicariidae, respectively. Within the Furnariides our data contradict some recent phylogenetic hypotheses and suggest that further study is needed to resolve these discrepancies. Of the few genera represented by multiple species, several were not monophyletic, indicating that additional systematic work remains within furnariine families and must include dense taxon sampling. We use this study as a basis for proposing a new phylogenetic classification for the group and in the process erect new family-group names for clades having high branch support across methods. © 2009 The Willi Hennig Society.
Phylogenic study of Lemnoideae (duckweeds) through complete chloroplast genomes for eight accessions

PubMed Central

Ding, Yanqiang; Fang, Yang; Guo, Ling; Li, Zhidan; He, Kaize

2017-01-01

Background Phylogenetic relationship within different genera of Lemnoideae, a kind of small aquatic monocotyledonous plants, was not well resolved, using either morphological characters or traditional markers. Given that rich genetic information in chloroplast genome makes them particularly useful for phylogenetic studies, we used chloroplast genomes to clarify the phylogeny within Lemnoideae. Methods DNAs were sequenced with next-generation sequencing. The duckweeds chloroplast genomes were indirectly filtered from the total DNA data, or directly obtained from chloroplast DNA data. To test the reliability of assembling the chloroplast genome based on the filtration of the total DNA, two methods were used to assemble the chloroplast genome of Landoltia punctata strain ZH0202. A phylogenetic tree was built on the basis of the whole chloroplast genome sequences using MrBayes v.3.2.6 and PhyML 3.0. Results Eight complete duckweeds chloroplast genomes were assembled, with lengths ranging from 165,775 bp to 171,152 bp, and each contains 80 protein-coding sequences, four rRNAs, 30 tRNAs and two pseudogenes. The identity of L. punctata strain ZH0202 chloroplast genomes assembled through two methods was 100%, and their sequences and lengths were completely identical. The chloroplast genome comparison demonstrated that the differences in chloroplast genome sizes among the Lemnoideae primarily resulted from variation in non-coding regions, especially from repeat sequence variation. The phylogenetic analysis demonstrated that the different genera of Lemnoideae are derived from each other in the following order: Spirodela, Landoltia, Lemna, Wolffiella, and Wolffia. Discussion This study demonstrates potential of whole chloroplast genome DNA as an effective option for phylogenetic studies of Lemnoideae. 
It also showed the possibility of using chloroplast DNA data to elucidate those phylogenies which were not yet solved well by traditional methods even in plants other than duckweeds. PMID:29302399
Conformation of phylogenetic relationship of Penaeidae shrimp based on morphometric and molecular investigations.

PubMed

Rajakumaran, P; Vaseeharan, B; Jayakumar, R; Chidambara, R

2014-01-01

An accurate understanding of the phylogenetic relationships among Penaeidae shrimp is important for both academic research and the fisheries industry. Morphometric and randomly amplified polymorphic DNA (RAPD) analyses were used to infer the phylogenetic relationships among 13 Penaeidae shrimp species. For the morphometric analysis, forty variables and the total length of the shrimp were measured for each species, and the effect of size variation was removed. The size-normalized values were subjected to UPGMA (Unweighted Pair-Group Method with Arithmetic Mean) cluster analysis. For the RAPD analysis, four primers showed reliable differentiation between species, and the correlation coefficients between the DNA banding patterns of the 13 Penaeidae species were used to construct a UPGMA dendrogram. The phylogenetic relationships obtained from the morphometric and molecular analyses were found to be congruent. We conclude that, because the results of the morphometric investigation concur with those of the molecular one, the phylogenetic relationships obtained for the studied Penaeidae can be considered reliable.
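For readers unfamiliar with UPGMA, the clustering step named in this abstract can be sketched in a few lines. The example below repeatedly merges the two closest clusters and updates distances as size-weighted averages. The taxon labels and the distance values are hypothetical and are not data from the study; cluster names are built as Newick-like strings purely for readability.

```python
def upgma(labels, dist):
    """Cluster labels by UPGMA.

    dist maps frozenset({a, b}) to the distance between labels a and b.
    Returns a list of merges (cluster_a, cluster_b, node_height) in the
    order they were made; node_height is half the merged distance.
    """
    sizes = {label: 1 for label in labels}  # cluster name -> number of tips
    d = dict(dist)
    merges = []
    while len(sizes) > 1:
        # Find the closest pair of current clusters.
        pair = min(d, key=lambda p: d[p])
        a, b = sorted(pair)
        height = d.pop(pair) / 2.0
        size_a, size_b = sizes.pop(a), sizes.pop(b)
        new = f"({a},{b})"
        # Distance to the merged cluster is the size-weighted average.
        for c in sizes:
            d_ac = d.pop(frozenset((a, c)))
            d_bc = d.pop(frozenset((b, c)))
            d[frozenset((new, c))] = (size_a * d_ac + size_b * d_bc) / (size_a + size_b)
        sizes[new] = size_a + size_b
        merges.append((a, b, height))
    return merges


# Hypothetical pairwise distances among four taxa (illustrative only).
labels = ["A", "B", "C", "D"]
dist = {
    frozenset(("A", "B")): 2.0,
    frozenset(("A", "C")): 6.0,
    frozenset(("B", "C")): 6.0,
    frozenset(("A", "D")): 10.0,
    frozenset(("B", "D")): 10.0,
    frozenset(("C", "D")): 10.0,
}

merges = upgma(labels, dist)
for step in merges:
    print(step)
```

When the input is a similarity measure such as a correlation coefficient, it would first need to be converted to a distance (for example 1 minus the correlation) before being passed to a routine like this.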
Morphological, molecular and phylogenetic analyses of Diplotriaena bargusinica Skrjabin, 1917 (Nematoda: Diplotriaenidae).

PubMed

Dutra Vieira, Thainá; Pegoraro de Macedo, Marcia Raquel; Fedatto Bernardon, Fabiana; Müller, Gertrud

2017-10-01

The nematode Diplotriaena bargusinica is a bird air sac parasite, and its taxonomy is based mainly on morphological and morphometric characteristics. Increasing knowledge of genetic information variability has spurred the use of DNA markers in conjunction with morphological data for inferring phylogenetic relationships in different taxa. Considering the potential of molecular biology in taxonomy, this study presents the morphological and molecular characterization of D. bargusinica, and establishes the phylogenetic position of the nematode in Spirurina. Twenty partial sequences of the 18S region of D. bargusinica rDNA were generated. Phylogenetic trees were obtained through the Maximum Likelihood and Bayesian Inference methods where both had similar topology. The group Diplotriaenoidea is monophyletic and the topologies generated corroborate the phylogenetic studies based on traditional and previously performed molecular taxonomy. This study is the first to generate molecular data associated with the morphology of the species. Copyright © 2017 Elsevier B.V. All rights reserved.
dCITE: Measuring Necessary Cladistic Information Can Help You Reduce Polytomy Artefacts in Trees.

PubMed

Wise, Michael J

2016-01-01

Biologists regularly create phylogenetic trees to better understand the evolutionary origins of their species of interest, and often use genomes as their data source. However, as more and more incomplete genomes are published, in many cases it may not be possible to compute genome-based phylogenetic trees due to large gaps in the assembled sequences. In addition, comparison of complete genomes may not even be desirable due to the presence of horizontally acquired and homologous genes. A decision must therefore be made about which gene, or gene combinations, should be used to compute a tree. Deflated Cladistic Information based on Total Entropy (dCITE) is proposed as an easily computed metric for measuring the cladistic information in multiple sequence alignments representing a range of taxa, without the need to first compute the corresponding trees. dCITE scores can be used to rank candidate genes or decide whether input sequences provide insufficient cladistic information, making artefactual polytomies more likely. The dCITE method can be applied to protein, nucleotide or encoded phenotypic data, so can be used to select which data-type is most appropriate, given the choice. In a series of experiments the dCITE method was compared with related measures. Then, as a practical demonstration, the ideas developed in the paper were applied to a dataset representing species from the order Campylobacterales; trees based on sequence combinations, selected on the basis of their dCITE scores, were compared with a tree constructed to mimic Multi-Locus Sequence Typing (MLST) combinations of fragments. We see that the greater the dCITE score the more likely it is that the computed phylogenetic tree will be free of artefactual polytomies. Secondly, cladistic information saturates, beyond which little additional cladistic information can be obtained by adding additional sequences. Finally, sequences with high cladistic information produce more consistent trees for the same taxa.
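To give a feel for what an entropy-based score over a multiple sequence alignment looks like, the sketch below sums the Shannon entropy of each alignment column. This is a simplified stand-in and is not the published dCITE formula, whose deflation step and other details are not given in the abstract; the function names and the toy alignment are assumptions for illustration.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of one alignment column, ignoring gaps."""
    counts = Counter(ch for ch in column if ch != "-")
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def total_entropy(alignment):
    """Sum of per-column entropies for equal-length aligned sequences."""
    return sum(column_entropy(col) for col in zip(*alignment))


# Toy alignment: two invariant columns and two columns split evenly.
alignment = ["ACGT", "ACGA", "ACTA", "ACTT"]
score = total_entropy(alignment)
print(score)
```

Invariant columns contribute nothing to the score, so an alignment with little variation among taxa scores low, which is the intuition behind ranking candidate genes by information content before building trees.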
dCITE: Measuring Necessary Cladistic Information Can Help You Reduce Polytomy Artefacts in Trees

PubMed Central

2016-01-01

Biologists regularly create phylogenetic trees to better understand the evolutionary origins of their species of interest, and often use genomes as their data source. However, as more and more incomplete genomes are published, in many cases it may not be possible to compute genome-based phylogenetic trees due to large gaps in the assembled sequences. In addition, comparison of complete genomes may not even be desirable due to the presence of horizontally acquired and homologous genes. A decision must therefore be made about which gene, or gene combinations, should be used to compute a tree. Deflated Cladistic Information based on Total Entropy (dCITE) is proposed as an easily computed metric for measuring the cladistic information in multiple sequence alignments representing a range of taxa, without the need to first compute the corresponding trees. dCITE scores can be used to rank candidate genes or decide whether input sequences provide insufficient cladistic information, making artefactual polytomies more likely. The dCITE method can be applied to protein, nucleotide or encoded phenotypic data, so can be used to select which data-type is most appropriate, given the choice. In a series of experiments the dCITE method was compared with related measures. Then, as a practical demonstration, the ideas developed in the paper were applied to a dataset representing species from the order Campylobacterales; trees based on sequence combinations, selected on the basis of their dCITE scores, were compared with a tree constructed to mimic Multi-Locus Sequence Typing (MLST) combinations of fragments. We see that the greater the dCITE score the more likely it is that the computed phylogenetic tree will be free of artefactual polytomies. Secondly, cladistic information saturates, beyond which little additional cladistic information can be obtained by adding additional sequences. Finally, sequences with high cladistic information produce more consistent trees for the same taxa. 
PMID:27898695
Using phylogenetically-informed annotation (PIA) to search for light-interacting genes in transcriptomes from non-model organisms.

PubMed

Speiser, Daniel I; Pankey, M Sabrina; Zaharoff, Alexander K; Battelle, Barbara A; Bracken-Grissom, Heather D; Breinholt, Jesse W; Bybee, Seth M; Cronin, Thomas W; Garm, Anders; Lindgren, Annie R; Patel, Nipam H; Porter, Megan L; Protas, Meredith E; Rivera, Ajna S; Serb, Jeanne M; Zigler, Kirk S; Crandall, Keith A; Oakley, Todd H

2014-11-19

Tools for high throughput sequencing and de novo assembly make the analysis of transcriptomes (i.e. the suite of genes expressed in a tissue) feasible for almost any organism. Yet a challenge for biologists is that it can be difficult to assign identities to gene sequences, especially from non-model organisms. Phylogenetic analyses are one useful method for assigning identities to these sequences, but such methods tend to be time-consuming because of the need to re-calculate trees for every gene of interest and each time a new data set is analyzed. In response, we employed existing tools for phylogenetic analysis to produce a computationally efficient, tree-based approach for annotating transcriptomes or new genomes that we term Phylogenetically-Informed Annotation (PIA), which places uncharacterized genes into pre-calculated phylogenies of gene families. We generated maximum likelihood trees for 109 genes from a Light Interaction Toolkit (LIT), a collection of genes that underlie the function or development of light-interacting structures in metazoans. To do so, we searched protein sequences predicted from 29 fully-sequenced genomes and built trees using tools for phylogenetic analysis in the Osiris package of Galaxy (an open-source workflow management system). Next, to rapidly annotate transcriptomes from organisms that lack sequenced genomes, we repurposed a maximum likelihood-based Evolutionary Placement Algorithm (implemented in RAxML) to place sequences of potential LIT genes on to our pre-calculated gene trees. Finally, we implemented PIA in Galaxy and used it to search for LIT genes in 28 newly-sequenced transcriptomes from the light-interacting tissues of a range of cephalopod mollusks, arthropods, and cubozoan cnidarians. Our new trees for LIT genes are available on the Bitbucket public repository ( http://bitbucket.org/osiris_phylogenetics/pia/ ) and we demonstrate PIA on a publicly-accessible web server ( http://galaxy-dev.cnsi.ucsb.edu/pia/ ). 
Our new trees for LIT genes will be a valuable resource for researchers studying the evolution of eyes or other light-interacting structures. We also introduce PIA, a high throughput method for using phylogenetic relationships to identify LIT genes in transcriptomes from non-model organisms. With simple modifications, our methods may be used to search for different sets of genes or to annotate data sets from taxa outside of Metazoa.
Phylogenomic Reconstruction of the Oomycete Phylogeny Derived from 37 Genomes

PubMed Central

McCarthy, Charley G. P.

2017-01-01

ABSTRACT The oomycetes are a class of microscopic, filamentous eukaryotes within the Stramenopiles-Alveolata-Rhizaria (SAR) supergroup which includes ecologically significant animal and plant pathogens, most infamously the causative agent of potato blight Phytophthora infestans. Single-gene and concatenated phylogenetic studies both of individual oomycete genera and of members of the larger class have resulted in conflicting conclusions concerning species phylogenies within the oomycetes, particularly for the large Phytophthora genus. Genome-scale phylogenetic studies have successfully resolved many eukaryotic relationships by using supertree methods, which combine large numbers of potentially disparate trees to determine evolutionary relationships that cannot be inferred from individual phylogenies alone. With a sufficient amount of genomic data now available, we have undertaken the first whole-genome phylogenetic analysis of the oomycetes using data from 37 oomycete species and 6 SAR species. In our analysis, we used established supertree methods to generate phylogenies from 8,355 homologous oomycete and SAR gene families and have complemented those analyses with both phylogenomic network and concatenated supermatrix analyses. Our results show that a genome-scale approach to oomycete phylogeny resolves oomycete classes and individual clades within the problematic Phytophthora genus. Support for the resolution of the inferred relationships between individual Phytophthora clades varies depending on the methodology used. Our analysis represents an important first step in large-scale phylogenomic analysis of the oomycetes. IMPORTANCE The oomycetes are a class of eukaryotes and include ecologically significant animal and plant pathogens. 
Single-gene and multigene phylogenetic studies of individual oomycete genera and of members of the larger classes have resulted in conflicting conclusions concerning interspecies relationships among these species, particularly for the Phytophthora genus. The onset of next-generation sequencing techniques now means that a wealth of oomycete genomic data is available. For the first time, we have used genome-scale phylogenetic methods to resolve oomycete phylogenetic relationships. We used supertree methods to generate single-gene and multigene species phylogenies. Overall, our supertree analyses utilized phylogenetic data from 8,355 oomycete gene families. We have also complemented our analyses with superalignment phylogenies derived from 131 single-copy ubiquitous gene families. Our results show that a genome-scale approach to oomycete phylogeny resolves oomycete classes and clades. Our analysis represents an important first step in large-scale phylogenomic analysis of the oomycetes. PMID:28435885
[New isolation methods and phylogenetic diversity of actinobacteria from hypersaline beach in Aksu].

PubMed

Zhang, Yao; Xia, Zhanfeng; Cao, Xinbo; Li, Jun; Zhang, Lili

2013-08-04

We explored 4 new methods to improve the isolation of actinobacterial resources from high-salt areas. Optimized media based on 4 new strategies were used for isolating actinobacteria from hypersaline beaches. Glycerin-arginine, trehalose-creatine, glycerol-aspartic acid, mannitol-casein, casein-mannitol, mannitol-alanine, chitosan-asparagine and Gauze's No. 1 were used as basic media. The new isolation strategy includes 4 methods: ten-fold dilution culture, simulation of the original environment, actinobacterial culture guided by culture-independent molecular detection, and reference to actinobacterial media for brackish marine environments. The 16S rRNA genes of the isolates were amplified with bacterial universal primers. The 16S rRNA gene sequences were compared with sequences obtained from the GenBank databases. We constructed a phylogenetic tree with the neighbor-joining method. No actinobacterial strains were isolated on the 8 media of the control group, while 403 strains were isolated by the new strategies. The isolates obtained by the new methods were members of 14 genera (Streptomyces, Streptomonospora, Saccharomonospora, Plantactinospora, Nocardia, Amycolatopsis, Glycomyces, Micromonospora, Nocardiopsis, Isoptericola, Nonomuraea, Thermobifida, Actinopolyspora, Actinomadura) of 10 families in 8 suborders. The most abundant and diverse isolates belonged to the two suborders Streptomycineae (69.96%) and Streptosporangineae (9.68%) within the phylum Actinobacteria, including 9 potential novel species. The new isolation methods significantly improved the culturability of actinobacteria from hypersaline areas and yielded many potential novel species, providing a new and more effective way to isolate actinobacterial resources from hypersaline environments.
A scalable method for identifying frequent subtrees in sets of large phylogenetic trees.

PubMed

Ramu, Avinash; Kahveci, Tamer; Burleigh, J Gordon

2012-10-03

We consider the problem of finding the maximum frequent agreement subtrees (MFASTs) in a collection of phylogenetic trees. Existing methods for this problem often do not scale beyond datasets with around 100 taxa. Our goal is to address this problem for datasets with over a thousand taxa and hundreds of trees. We develop a heuristic solution that aims to find MFASTs in sets of many, large phylogenetic trees. Our method works in multiple phases. In the first phase, it identifies small candidate subtrees from the set of input trees which serve as the seeds of larger subtrees. In the second phase, it combines these small seeds to build larger candidate MFASTs. In the final phase, it performs a post-processing step that ensures that we find a frequent agreement subtree that is not contained in a larger frequent agreement subtree. We demonstrate that this heuristic can easily handle data sets with 1000 taxa, greatly extending the estimation of MFASTs beyond current methods. Although this heuristic does not guarantee to find all MFASTs or the largest MFAST, it found the MFAST in all of our synthetic datasets where we could verify the correctness of the result. It also performed well on large empirical data sets. Its performance is robust to the number and size of the input trees. Overall, this method provides a simple and fast way to identify strongly supported subtrees within large phylogenetic hypotheses.
A scalable method for identifying frequent subtrees in sets of large phylogenetic trees

PubMed Central

2012-01-01

Background We consider the problem of finding the maximum frequent agreement subtrees (MFASTs) in a collection of phylogenetic trees. Existing methods for this problem often do not scale beyond datasets with around 100 taxa. Our goal is to address this problem for datasets with over a thousand taxa and hundreds of trees. Results We develop a heuristic solution that aims to find MFASTs in sets of many, large phylogenetic trees. Our method works in multiple phases. In the first phase, it identifies small candidate subtrees from the set of input trees which serve as the seeds of larger subtrees. In the second phase, it combines these small seeds to build larger candidate MFASTs. In the final phase, it performs a post-processing step that ensures that we find a frequent agreement subtree that is not contained in a larger frequent agreement subtree. We demonstrate that this heuristic can easily handle data sets with 1000 taxa, greatly extending the estimation of MFASTs beyond current methods. Conclusions Although this heuristic does not guarantee to find all MFASTs or the largest MFAST, it found the MFAST in all of our synthetic datasets where we could verify the correctness of the result. It also performed well on large empirical data sets. Its performance is robust to the number and size of the input trees. Overall, this method provides a simple and fast way to identify strongly supported subtrees within large phylogenetic hypotheses. PMID:23033843
Community phylogenetics at the biogeographical scale: cold tolerance, niche conservatism and the structure of North American forests.

PubMed

Hawkins, Bradford A; Rueda, Marta; Rangel, Thiago F; Field, Richard; Diniz-Filho, José Alexandre F; Linder, Peter

2014-01-01

Aim The fossil record has led to a historical explanation for forest diversity gradients within the cool parts of the Northern Hemisphere, founded on a limited ability of woody angiosperm clades to adapt to mid-Tertiary cooling. We tested four predictions of how this should be manifested in the phylogenetic structure of 91,340 communities: (1) forests to the north should comprise species from younger clades (families) than forests to the south; (2) average cold tolerance at a local site should be associated with the mean family age (MFA) of species; (3) minimum temperature should account for MFA better than alternative environmental variables; and (4) traits associated with survival in cold climates should evolve under a niche conservatism constraint. Location The contiguous United States. Methods We extracted angiosperms from the US Forest Service's Forest Inventory and Analysis database. MFA was calculated by assigning age of the family to which each species belongs and averaging across the species in each community. We developed a phylogeny to identify phylogenetic signal in five traits: realized cold tolerance, seed size, seed dispersal mode, leaf phenology and height. Phylogenetic signal representation curves and phylogenetic generalized least squares were used to compare patterns of trait evolution against Brownian motion. Eleven predictors structured at broad or local scales were generated to explore relationships between environment and MFA using random forest and general linear models. 
Results Consistent with predictions, (1) southern communities comprise angiosperm species from older families than northern communities, (2) cold tolerance is the trait most strongly associated with local MFA, (3) minimum temperature in the coldest month is the environmental variable that best describes MFA, broad-scale variables being much stronger correlates than local-scale variables, and (4) the phylogenetic structures of cold tolerance and at least one other trait associated with survivorship in cold climates indicate niche conservatism. Main conclusions Tropical niche conservatism in the face of long-term climate change, probably initiated in the Late Cretaceous associated with the rise of the Rocky Mountains, is a strong driver of the phylogenetic structure of the angiosperm component of forest communities across the USA. However, local deterministic and/or stochastic processes account for perhaps a quarter of the variation in the MFA of local communities.
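The mean family age (MFA) calculation described in the Methods can be sketched in a few lines of Python. The species names, family names and ages below are placeholders invented for the example, not values from the study, and the function name is likewise an assumption.

```python
def mean_family_age(community, family_of, family_age):
    """Average, over the species in a community, the age of each species' family."""
    ages = [family_age[family_of[species]] for species in community]
    return sum(ages) / len(ages)


# Placeholder data (hypothetical): species -> family, family -> age in Myr.
family_of = {"sp1": "FamA", "sp2": "FamA", "sp3": "FamB"}
family_age = {"FamA": 90.0, "FamB": 30.0}

print(mean_family_age(["sp1", "sp2", "sp3"], family_of, family_age))
```

Because each species contributes its family's age once, a family represented by several species in a community weighs proportionally more in the average.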
Let them fall where they may: congruence analysis in massive phylogenetically messy data sets.

PubMed

Leigh, Jessica W; Schliep, Klaus; Lopez, Philippe; Bapteste, Eric

2011-10-01

Interest in congruence in phylogenetic data has largely focused on issues affecting multicellular organisms, and animals in particular, in which the level of incongruence is expected to be relatively low. In addition, assessment methods developed in the past have been designed for reasonably small numbers of loci and scale poorly for larger data sets. However, there are currently over a thousand complete genome sequences available and of interest to evolutionary biologists, and these sequences are predominantly from microbial organisms, whose molecular evolution is much less frequently tree-like than that of multicellular life forms. As such, the level of incongruence in these data is expected to be high. We present a congruence method that accommodates both very large numbers of genes and high degrees of incongruence. Our method uses clustering algorithms to identify subsets of genes based on similarity of phylogenetic signal. It involves only a single phylogenetic analysis per gene, and therefore, computation time scales nearly linearly with the number of genes in the data set. We show that our method performs very well with sets of sequence alignments simulated under a wide variety of conditions. In addition, we present an analysis of core genes of prokaryotes, often assumed to have been largely vertically inherited, in which we identify two highly incongruent classes of genes. This result is consistent with the complexity hypothesis.
Rosetta stone method for detecting protein function and protein-protein interactions from genome sequences

DOEpatents

Eisenberg, David; Marcotte, Edward M.; Pellegrini, Matteo; Thompson, Michael J.; Yeates, Todd O.

2002-10-15

A computational method, system, and computer program are provided for inferring functional links from genome sequences. One method is based on the observation that some pairs of proteins A' and B' have homologs in another organism fused into a single protein chain AB. A trans-genome comparison of sequences can reveal these AB sequences, which are Rosetta Stone sequences because they decipher an interaction between A' and B'. Another method compares the genomic sequences of two or more organisms to create a phylogenetic profile for each protein indicating its presence or absence across all the genomes. The profile provides information regarding functional links between different families of proteins. In yet another method a combination of the above two methods is used to predict functional links.
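The phylogenetic-profile idea can be illustrated with a short Python sketch: each protein family gets a presence/absence vector across genomes, and families with matching vectors are flagged as candidate functional links. The input layout, the function names and the mismatch threshold are assumptions for this example; the patent abstract does not specify how profiles are compared.

```python
from itertools import combinations


def build_profiles(genomes):
    """Map each protein family to a 0/1 tuple over the genomes (sorted by name)."""
    names = sorted(genomes)
    families = set().union(*genomes.values())
    return {f: tuple(int(f in genomes[g]) for g in names) for f in families}


def linked_pairs(profiles, max_mismatches=0):
    """Pairs of families whose profiles differ in at most `max_mismatches` genomes."""
    pairs = []
    for a, b in combinations(sorted(profiles), 2):
        mismatches = sum(x != y for x, y in zip(profiles[a], profiles[b]))
        if mismatches <= max_mismatches:
            pairs.append((a, b))
    return pairs


# Hypothetical genomes, each listed as the set of protein families it encodes.
genomes = {"g1": {"P1", "P2", "P3"}, "g2": {"P3"}, "g3": {"P1", "P2"}}
print(linked_pairs(build_profiles(genomes)))
```

Here P1 and P2 occur in exactly the same genomes, so they are the only pair reported when no mismatches are allowed.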

Comparative phylogenetic analyses of Halomonas variabilis and related organisms based on 16S rRNA, gyrB and ectBC gene sequences.

PubMed

Okamoto, Takuji; Maruyama, Akihiko; Imura, Satoshi; Takeyama, Haruko; Naganuma, Takeshi

2004-05-01

Halomonas variabilis and phylogenetically related organisms were isolated from various habitats such as Antarctic terrain and saline ponds, deep-sea sediment, deep-sea waters affected by hydrothermal plumes, and hydrothermal vent fluids. Ten strains were selected for physiological and phylogenetic characterization in detail. All of those strains were found to be piezotolerant and psychrotolerant, as well as euryhaline halophilic or halotolerant. Their stress tolerance may facilitate their wide occurrence, even in so-called extreme environments. The 16S rDNA-based phylogenetic relationship was complemented by analyses of the DNA gyrase subunit B gene (gyrB) and genes involved in the synthesis of the major compatible solute, ectoine: diaminobutyric acid aminotransferase gene (ectB) and ectoine synthase gene (ectC). The phylogenetic relationships of H. variabilis and related organisms were very similar in terms of 16S rDNA, gyrB, and ectB. The ectC-based tree was inconsistent with the other phylogenetic trees. For that reason, ectC was inferred to derive from horizontal transfer.
Phylomemetics—Evolutionary Analysis beyond the Gene

PubMed Central

Howe, Christopher J.; Windram, Heather F.

2011-01-01

Genes are propagated by error-prone copying, and the resulting variation provides the basis for phylogenetic reconstruction of evolutionary relationships. Horizontal gene transfer may be superimposed on a tree-like evolutionary pattern, with some relationships better depicted as networks. The copying of manuscripts by scribes is very similar to the replication of genes, and phylogenetic inference programs can be used directly for reconstructing the copying history of different versions of a manuscript text. Phylogenetic methods have also been used for some time to analyse the evolution of languages and the development of physical cultural artefacts. These studies can help to answer a range of anthropological questions. We propose the adoption of the term “phylomemetics” for phylogenetic analysis of reproducing non-genetic elements. PMID:21655311
Evolutionary Association of Stomatal Traits with Leaf Vein Density in Paphiopedilum, Orchidaceae

PubMed Central

Sun, Mei; Zhang, Juan-Juan; Cao, Kun-Fang; Hu, Hong

2012-01-01

Background Both leaf attributes and stomatal traits are linked to water economy in land plants. However, it is unclear whether these two components are associated evolutionarily. Methodology/Principal Findings In characterizing the possible effect of phylogeny on leaf attributes and stomatal traits, we hypothesized that a correlated evolution exists between the two. Using a phylogenetic comparative method, we analyzed 14 leaf attributes and stomatal traits for 17 species in Paphiopedilum. Stomatal length (SL), stomatal area (SA), upper cuticular thickness (UCT), and total cuticular thickness (TCT) showed strong phylogenetic conservatism whereas stomatal density (SD) and stomatal index (SI) were significantly convergent. Leaf vein density was correlated with SL and SD whether or not phylogeny was considered. The lower epidermal thickness (LET) was correlated positively with SL, SA, and stomatal width but negatively with SD when phylogeny was not considered. When this phylogenetic influence was factored in, only the significant correlation between SL and LET remained. Conclusion/Significance Our results support the hypothesis for correlated evolution between stomatal traits and vein density in Paphiopedilum. However, they do not provide evidence for an evolutionary association between stomata and leaf thickness. These findings lend insight into the evolution of traits related to water economy for orchids under natural selection. PMID:22768224
Systematics of marine brown alga Sargassum from Thailand: A preliminary study based on morphological data and nuclear ribosomal internal transcribed spacer 2 (ITS2) sequences

NASA Astrophysics Data System (ADS)

Kantachumpoo, Attachai; Uwai, Shinya; Noiraksar, Thidarat; Komatsu, Teruhisa

2015-06-01

The marine brown algal genus Sargassum has been investigated extensively based on genetic information. In this report, we performed the first comparative study of morphological and molecular data among common species of Sargassum found in Thailand and explored the phylogenetic diversity within the genus. Our results revealed an incongruent pattern for species classification in Thai Sargassum. Morphologically, our Sargassum specimens were distinguishable and represented 8 species, namely, S. aquifolium (Turner) C. Agardh, S. baccularia (Mertens) C. Agardh, S. cinereum J. Agardh, S. ilicifolium (Turner) C. Agardh, S. oligocystum Montagne, S. plagiophyllum C. Agardh, S. polycystum C. Agardh and S. swartzii (Turner) C. Agardh. In contrast, using three different methods, phylogenetic analysis of nuclear ribosomal internal transcribed spacer 2 (ITS2) revealed six distinct clades, including the S. baccularia/S. oligocystum clade, S. aquifolium/S. swartzii clade, S. cinereum clade, S. aquifolium/S. ilicifolium clade, S. polycystum clade, and S. plagiophyllum clade, which was suggestive of a phenotypic plasticity species complex. Our molecular data also confirmed the paraphyletic relationship in the section Binderianae and suggested that this section requires reassessment. Overall, further studies are required to increase our understanding of the taxonomy, phylogenetic relationships and species boundaries among Sargassum species in Thailand.
Bayesian, maximum parsimony and UPGMA models for inferring the phylogenies of antelopes using mitochondrial markers.

PubMed

Khan, Haseeb A; Arif, Ibrahim A; Bahkali, Ali H; Al Farhan, Ahmad H; Al Homaidan, Ali A

2008-10-06

This investigation aimed to compare the inference of antelope phylogenies resulting from the 16S rRNA, cytochrome-b (cyt-b) and d-loop segments of mitochondrial DNA using three different computational models: Bayesian (BA), maximum parsimony (MP) and unweighted pair group method with arithmetic mean (UPGMA). The respective nucleotide sequences of three Oryx species (Oryx leucoryx, Oryx dammah and Oryx gazella) and an out-group (Addax nasomaculatus) were aligned and subjected to BA, MP and UPGMA models for comparing the topologies of respective phylogenetic trees. The 16S rRNA region possessed the highest frequency of conserved sequences (97.65%) followed by cyt-b (94.22%) and d-loop (87.29%). There were few transitions (2.35%) and no transversions in 16S rRNA as compared to cyt-b (5.61% transitions and 0.17% transversions) and d-loop (11.57% transitions and 1.14% transversions) while comparing the four taxa. All three mitochondrial segments clearly differentiated the genus Addax from Oryx using the BA or UPGMA models. The topologies of all the gamma-corrected Bayesian trees were identical irrespective of the marker type. The UPGMA trees resulting from 16S rRNA and d-loop sequences were also identical (Oryx dammah grouped with Oryx leucoryx) to Bayesian trees except that the UPGMA tree based on cyt-b showed a slightly different phylogeny (Oryx dammah grouped with Oryx gazella) with a low bootstrap support. However, the MP model failed to differentiate the genus Addax from Oryx. These findings demonstrate the efficiency and robustness of BA and UPGMA methods for phylogenetic analysis of antelopes using mitochondrial markers.
Bayesian, Maximum Parsimony and UPGMA Models for Inferring the Phylogenies of Antelopes Using Mitochondrial Markers

PubMed Central

Khan, Haseeb A.; Arif, Ibrahim A.; Bahkali, Ali H.; Al Farhan, Ahmad H.; Al Homaidan, Ali A.

2008-01-01

This investigation aimed to compare the inference of antelope phylogenies resulting from the 16S rRNA, cytochrome-b (cyt-b) and d-loop segments of mitochondrial DNA using three different computational models: Bayesian (BA), maximum parsimony (MP) and unweighted pair group method with arithmetic mean (UPGMA). The respective nucleotide sequences of three Oryx species (Oryx leucoryx, Oryx dammah and Oryx gazella) and an out-group (Addax nasomaculatus) were aligned and subjected to BA, MP and UPGMA models for comparing the topologies of respective phylogenetic trees. The 16S rRNA region possessed the highest frequency of conserved sequences (97.65%) followed by cyt-b (94.22%) and d-loop (87.29%). There were few transitions (2.35%) and no transversions in 16S rRNA as compared to cyt-b (5.61% transitions and 0.17% transversions) and d-loop (11.57% transitions and 1.14% transversions) while comparing the four taxa. All three mitochondrial segments clearly differentiated the genus Addax from Oryx using the BA or UPGMA models. The topologies of all the gamma-corrected Bayesian trees were identical irrespective of the marker type. The UPGMA trees resulting from 16S rRNA and d-loop sequences were also identical (Oryx dammah grouped with Oryx leucoryx) to Bayesian trees except that the UPGMA tree based on cyt-b showed a slightly different phylogeny (Oryx dammah grouped with Oryx gazella) with a low bootstrap support. However, the MP model failed to differentiate the genus Addax from Oryx. These findings demonstrate the efficiency and robustness of BA and UPGMA methods for phylogenetic analysis of antelopes using mitochondrial markers. PMID:19204824
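The transition and transversion percentages quoted above rest on classifying each differing site by base type. A minimal Python sketch of that classification for one pair of aligned sequences follows; the function name and the toy sequences are assumptions, and the study itself compared four taxa rather than a single pair.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_substitutions(seq_a, seq_b):
    """Count (transitions, transversions) between two aligned DNA sequences.

    Sites with gaps or ambiguous bases are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1  # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1  # purine<->pyrimidine
    return transitions, transversions


# Toy aligned sequences (hypothetical, not from the study).
print(count_substitutions("ACGTAC", "GCGTCA"))
```

Dividing each count by the alignment length would give percentages of the kind reported in the abstract.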
Choosing and Using Introns in Molecular Phylogenetics

PubMed Central

Creer, Simon

2007-01-01

Introns are now commonly used in molecular phylogenetics in an attempt to recover gene trees that are concordant with species trees, but there is a range of genomic, logistical and analytical considerations that are infrequently discussed in empirical studies that utilize intron data. This review outlines expedient approaches for locus selection, overcoming paralogy problems, recombination detection methods and the identification and incorporation of LVHs in molecular systematics. A range of parsimony and Bayesian analytical approaches are also described in order to highlight the methods that can currently be employed to align sequences and treat indels in subsequent analyses. By covering the main points associated with the generation and analysis of intron data, this review aims to provide a comprehensive introduction to using introns (or any non-coding nuclear data partition) in contemporary phylogenetics. PMID:19461984
Octocoral Species Assembly and Coexistence in Caribbean Coral Reefs

PubMed Central

Velásquez, Johanna; Sánchez, Juan A.

2015-01-01

Background What are the determinant factors of community assemblies in the most diverse ecosystem in the ocean? Coral reefs can be divided into continental (i.e., reefs that develop on the continental shelf, including siliciclastic reefs) and oceanic (i.e., far off the continental shelf, usually on volcanic substratum); whether or not these habitat differences impose community-wide ecological divergence or species exclusion/coexistence with evolutionary consequences is unknown. Methods Studying Caribbean octocorals as a model system, we determined the phylogenetic community structure in a coral reef community, with an emphasis on species coexistence as evidenced by trait evolution and environmental feedbacks. Forty-nine species represented in five families constituted the species pool from which a phylogenetic tree was reconstructed using mtDNA. We included data from 11 localities in the Western Caribbean (Colombia) including most reef types. To test diversity-environment and phenotype-environment relationships, phylogenetic community structure and trait evolution, we carried out comparative analyses implementing ecological and evolutionary approaches. Results Phylogenetic inferences suggest clustering of oceanic reefs (e.g., atolls) contrasting with phylogenetic overdispersion of continental reefs (e.g., reef banks). Additionally, atolls and barrier reefs had the highest species diversity (Shannon index) whereas phylogenetic diversity was higher in reef banks. The discriminant component analysis supported this differentiation between oceanic and continental reefs, where continental octocoral species tend to have greater calyx apertures, thicker branches, prominent calyces and azooxanthellate species. This analysis also indicated a clear separation between the slope and the remaining habitats, caused by the presence or absence of Symbiodinium. K statistic analysis showed that this trait is conserved as well as the branch shape. 
Discussion There was strong octocoral community structure with opposite diversity and composition patterns between oceanic and continental reefs. Even habitats with similar depths and overall environmental conditions did not share similar communities between oceanic and continental reefs. This indicates a strong regional influence over the local communities, probably due to water transparency differences between major reef types, i.e., oceanic vs. continental shelf-neritic. This was supported by contrasting patterns found in morphology, composition and evolutionary history of the species between atolls and reef banks. PMID:26177191
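The species diversity measure named in the Results, the Shannon index, is H = -sum(p_i * ln p_i) over the relative abundances p_i of the species in a community. The short Python sketch below computes it from raw counts; the function name and the example counts are assumptions, and the natural logarithm is used here although the abstract does not state which base the authors chose.

```python
import math


def shannon_index(abundances):
    """Shannon diversity H = -sum(p * ln p) from per-species abundance counts."""
    total = sum(abundances)
    return -sum((n / total) * math.log(n / total) for n in abundances if n > 0)


# Hypothetical counts: four equally abundant species versus one dominant species.
print(shannon_index([10, 10, 10, 10]))
print(shannon_index([37, 1, 1, 1]))
```

With the same number of species, the evenly distributed community scores higher than the one dominated by a single species, which is why the index captures evenness as well as richness.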
Phylogenetic Status of an Unrecorded Species of Curvularia, C. spicifera, Based on Current Classification System of Curvularia and Bipolaris Group Using Multi Loci.

PubMed

Jeon, Sun Jeong; Nguyen, Thi Thuong Thuong; Lee, Hyang Burm

2015-09-01

A seed-borne fungus, Curvularia sp. EML-KWD01, was isolated from an indigenous wheat seed by the standard blotter method. This fungus was characterized based on morphological characteristics and molecular phylogenetic analysis. The phylogenetic status of the fungus was determined using sequences of three loci: rDNA internal transcribed spacer, large ribosomal subunit, and glyceraldehyde 3-phosphate dehydrogenase gene. Multilocus sequence analysis revealed that this fungus was Curvularia spicifera within Curvularia group 2 of the family Pleosporaceae.
Evolutionary history of the Afro-Madagascan Ixora species (Rubiaceae): species diversification and distribution of key morphological traits inferred from dated molecular phylogenetic trees

PubMed Central

Tosh, J.; Dessein, S.; Buerki, S.; Groeninckx, I.; Mouly, A.; Bremer, B.; Smets, E. F.; De Block, P.

2013-01-01

Background and Aims Previous work on the pantropical genus Ixora has revealed an Afro-Madagascan clade, but as yet no study has focused in detail on the evolutionary history and morphological trends in this group. Here the evolutionary history of Afro-Madagascan Ixora spp. (a clade of approx. 80 taxa) is investigated and the phylogenetic trees compared with several key morphological traits in taxa occurring in Madagascar. Methods Phylogenetic relationships of Afro-Madagascan Ixora are assessed using sequence data from four plastid regions (petD, rps16, rpoB-trnC and trnL-trnF) and nuclear ribosomal external transcribed spacer (ETS) and internal transcribed spacer (ITS) regions. The phylogenetic distribution of key morphological characters is assessed. Bayesian inference (implemented in BEAST) is used to estimate the temporal origin of Ixora based on fossil evidence. Key Results Two separate lineages of Madagascan taxa are recovered, one of which is nested in a group of East African taxa. Divergence in Ixora is estimated to have commenced during the mid Miocene, with extensive cladogenesis occurring in the Afro-Madagascan clade during the Pliocene onwards. Conclusions Both lineages of Madagascan Ixora exhibit morphological innovations that are rare throughout the rest of the genus, including a trend towards pauciflorous inflorescences and a trend towards extreme corolla tube length, suggesting that the same ecological and selective pressures are acting upon taxa from both Madagascan lineages. Novel ecological opportunities resulting from climate-induced habitat fragmentation and corolla tube length diversification are likely to have facilitated species radiation on Madagascar. PMID:24142919
Mapping Biodiversity and Setting Conservation Priorities for SE Queensland’s Rainforests Using DNA Barcoding

PubMed Central

Shapcott, Alison; Forster, Paul I.; Guymer, Gordon P.; McDonald, William J. F.; Faith, Daniel P.; Erickson, David; Kress, W. John

2015-01-01

Australian rainforests have been fragmented due to past climatic changes and more recently landscape change as a result of clearing for agriculture and urban spread. The subtropical rainforests of South Eastern Queensland are significantly more fragmented than the tropical World Heritage listed northern rainforests and are subject to much greater human population pressures. The Australian rainforest flora is relatively taxonomically rich at the family level, but less so at the species level. Current methods to assess biodiversity based on species numbers fail to adequately capture this richness at higher taxonomic levels. We developed a DNA barcode library for the SE Queensland rainforest flora to support a methodology for biodiversity assessment that incorporates both taxonomic diversity and phylogenetic relationships. We placed our SE Queensland phylogeny based on a three marker DNA barcode within a larger international rainforest barcode library and used this to calculate phylogenetic diversity (PD). We compared phylodiversity measures, species composition and richness and ecosystem diversity of the SE Queensland rainforest estate to identify which bio subregions contain the greatest rainforest biodiversity, subregion relationships and their level of protection. We identified areas of highest conservation priority. Diversity was not correlated with rainforest area in SE Queensland subregions but PD was correlated with both the percent of the subregion occupied by rainforest and the diversity of regional ecosystems (RE) present. The patterns of species diversity and phylogenetic diversity suggest a strong influence of historical biogeography. Some subregions contain significantly more PD than expected by chance, consistent with the concept of refugia, while others were significantly phylogenetically clustered, consistent with recent range expansions. PMID:25803607
Mapping biodiversity and setting conservation priorities for SE Queensland's rainforests using DNA barcoding.

PubMed

Shapcott, Alison; Forster, Paul I; Guymer, Gordon P; McDonald, William J F; Faith, Daniel P; Erickson, David; Kress, W John

2015-01-01

Australian rainforests have been fragmented due to past climatic changes and more recently landscape change as a result of clearing for agriculture and urban spread. The subtropical rainforests of South Eastern Queensland are significantly more fragmented than the tropical World Heritage listed northern rainforests and are subject to much greater human population pressures. The Australian rainforest flora is relatively taxonomically rich at the family level, but less so at the species level. Current methods to assess biodiversity based on species numbers fail to adequately capture this richness at higher taxonomic levels. We developed a DNA barcode library for the SE Queensland rainforest flora to support a methodology for biodiversity assessment that incorporates both taxonomic diversity and phylogenetic relationships. We placed our SE Queensland phylogeny based on a three marker DNA barcode within a larger international rainforest barcode library and used this to calculate phylogenetic diversity (PD). We compared phylodiversity measures, species composition and richness and ecosystem diversity of the SE Queensland rainforest estate to identify which bio subregions contain the greatest rainforest biodiversity, subregion relationships and their level of protection. We identified areas of highest conservation priority. Diversity was not correlated with rainforest area in SE Queensland subregions but PD was correlated with both the percent of the subregion occupied by rainforest and the diversity of regional ecosystems (RE) present. The patterns of species diversity and phylogenetic diversity suggest a strong influence of historical biogeography. Some subregions contain significantly more PD than expected by chance, consistent with the concept of refugia, while others were significantly phylogenetically clustered, consistent with recent range expansions.
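Phylogenetic diversity (PD) in the sense of Faith is commonly computed as the total branch length of the part of the tree that connects a set of taxa. The Python sketch below assumes a rooted tree stored as a child-to-(parent, branch length) dictionary and includes the path to the root, which is one common convention; the tree, the taxon names and the function name are invented for illustration and are not taken from the study.

```python
def faith_pd(parent, taxa):
    """Sum the branch lengths on the paths from each taxon up to the root.

    `parent` maps each non-root node to (parent node, length of the branch above it).
    Every branch is counted once, even when several taxa share it.
    """
    visited = set()
    total = 0.0
    for tip in taxa:
        node = tip
        while node in parent and node not in visited:
            visited.add(node)
            up, length = parent[node]
            total += length
            node = up
    return total


# Hypothetical tree: ((A:2, B:2)n1:1, C:3)root
parent = {
    "n1": ("root", 1.0),
    "A": ("n1", 2.0),
    "B": ("n1", 2.0),
    "C": ("root", 3.0),
}
print(faith_pd(parent, ["A", "B"]))
print(faith_pd(parent, ["A", "C"]))
```

Two close relatives (A and B) share the branch above n1 and so contribute less PD than two distant taxa (A and C), which is the property that lets PD separate regions with equal species counts.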
Evolutionary relationships of Fusobacterium nucleatum based on phylogenetic analysis and comparative genomics

PubMed Central

Mira, Alex; Pushker, Ravindra; Legault, Boris A; Moreira, David; Rodríguez-Valera, Francisco

2004-01-01

Background The phylogenetic position and evolutionary relationships of Fusobacteria remain uncertain. Especially intriguing is their relatedness to low G+C Gram-positive bacteria (Firmicutes) by ribosomal molecular phylogenies, but their possession of a typical Gram-negative outer membrane. Taking advantage of the recent completion of the Fusobacterium nucleatum genome sequence we have examined the evolutionary relationships of Fusobacterium genes by phylogenetic analysis and comparative genomics tools. Results The data indicate that Fusobacterium has a core genome of a very different nature to other bacterial lineages, and branches out at the base of Firmicutes. However, depending on the method used, 35–56% of Fusobacterium genes appear to have a xenologous origin from bacteroidetes, proteobacteria, spirochaetes and the Firmicutes themselves. A high number of hypothetical ORFs with unusual codon usage and short lengths were found and hypothesized to be remnants of transferred genes that were discarded. Some proteins and operons are also hypothesized to be of mixed ancestry. A large portion of the Gram-negative cell wall-related genes seems to have been transferred from proteobacteria. Conclusions Many instances of similarity to other inhabitants of the dental plaque that have been sequenced were found. This suggests that the close physical contact found in this environment might facilitate horizontal gene transfer, supporting the idea of niche-specific gene pools. We hypothesize that at a point in time, probably associated with the rise of mammals, a strong selective pressure might have existed for a cell with a Clostridia-like metabolic apparatus but with the adhesive and immune camouflage features of Proteobacteria. PMID:15566569
Cenozoic climate change shaped the evolutionary ecophysiology of the Cupressaceae conifers

PubMed Central

Pittermann, Jarmila; Stuart, Stephanie A.; Dawson, Todd E.; Moreau, Astrid

2012-01-01

The Cupressaceae clade has the broadest diversity in habitat and morphology of any conifer family. This clade is characterized by highly divergent physiological strategies, with deciduous swamp-adapted genera like Taxodium at one extreme, and evergreen desert genera like Cupressus at the other. The size disparity within the Cupressaceae is equally impressive, with members ranging from 5-m-tall juniper shrubs to 100-m-tall redwood trees. Phylogenetic studies demonstrate that despite this variation, these taxa all share a single common ancestor; by extension, they also share a common ancestral habitat. Here, we use a common-garden approach to compare xylem and leaf-level physiology in this family. We then apply comparative phylogenetic methods to infer how Cenozoic climatic change shaped the morphological and physiological differences between modern-day members of the Cupressaceae. Our data show that drought-resistant crown clades (the Cupressoid and Callitroid clades) most likely evolved from drought-intolerant Mesozoic ancestors, and that this pattern is consistent with proposed shifts in post-Eocene paleoclimates. We also provide evidence that within the Cupressaceae, the evolution of drought-resistant xylem is coupled to increased carbon investment in xylem tissue, reduced xylem transport efficiency, and at the leaf level, reduced photosynthetic capacity. Phylogenetically based analyses suggest that the ancestors of the Cupressaceae were dependent upon moist habitats, and that drought-resistant physiology developed along with increasing habitat aridity from the Oligocene onward. We conclude that the modern biogeography of the Cupressaceae conifers was shaped in large part by their capacity to adapt to drought. PMID:22628565
Cenozoic climate change shaped the evolutionary ecophysiology of the Cupressaceae conifers.

PubMed

Pittermann, Jarmila; Stuart, Stephanie A; Dawson, Todd E; Moreau, Astrid

2012-06-12

The Cupressaceae clade has the broadest diversity in habitat and morphology of any conifer family. This clade is characterized by highly divergent physiological strategies, with deciduous swamp-adapted genera like Taxodium at one extreme, and evergreen desert genera like Cupressus at the other. The size disparity within the Cupressaceae is equally impressive, with members ranging from 5-m-tall juniper shrubs to 100-m-tall redwood trees. Phylogenetic studies demonstrate that despite this variation, these taxa all share a single common ancestor; by extension, they also share a common ancestral habitat. Here, we use a common-garden approach to compare xylem and leaf-level physiology in this family. We then apply comparative phylogenetic methods to infer how Cenozoic climatic change shaped the morphological and physiological differences between modern-day members of the Cupressaceae. Our data show that drought-resistant crown clades (the Cupressoid and Callitroid clades) most likely evolved from drought-intolerant Mesozoic ancestors, and that this pattern is consistent with proposed shifts in post-Eocene paleoclimates. We also provide evidence that within the Cupressaceae, the evolution of drought-resistant xylem is coupled to increased carbon investment in xylem tissue, reduced xylem transport efficiency, and at the leaf level, reduced photosynthetic capacity. Phylogenetically based analyses suggest that the ancestors of the Cupressaceae were dependent upon moist habitats, and that drought-resistant physiology developed along with increasing habitat aridity from the Oligocene onward. We conclude that the modern biogeography of the Cupressaceae conifers was shaped in large part by their capacity to adapt to drought.
treeman: an R package for efficient and intuitive manipulation of phylogenetic trees.

PubMed

Bennett, Dominic J; Sutton, Mark D; Turvey, Samuel T

2017-01-07

Phylogenetic trees are hierarchical structures used for representing the inter-relationships between biological entities. They are the most common tool for representing evolution and are essential to a range of fields across the life sciences. The manipulation of phylogenetic trees, in terms of adding or removing tips, is often performed by researchers not just for reasons of management but also for performing simulations in order to understand the processes of evolution. Despite this, the most common programming language among biologists, R, has few class structures well suited to these tasks. We present an R package that contains a new class, called TreeMan, for representing the phylogenetic tree. This class has a list structure allowing phylogenetic trees to be manipulated more efficiently. Computational running times are reduced because of the ready ability to vectorise and parallelise methods. Development is also improved due to fewer lines of code being required for performing manipulation processes. We present three use cases (pinning missing taxa to a supertree, simulating evolution with a tree-growth model and detecting significant phylogenetic turnover) that demonstrate the new package's speed and simplicity.
Alternative methods of phylogenetic inference for the Patagonian lizard group Liolaemus elongatus-kriegi (Iguania: Liolaemini) based on mitochondrial and nuclear markers.

PubMed

Medina, Cintia Débora; Avila, Luciano Javier; Sites, Jack Walter; Santos, Juan; Morando, Mariana

2018-03-01

We present different approaches to a multi-locus phylogeny for the Liolaemus elongatus-kriegi group, including almost all species and recognized lineages. We sequenced two mitochondrial and five nuclear gene regions for 123 individuals from 35 taxa, and compared relationships resolved from concatenated and species tree methods. The L. elongatus-kriegi group was inferred as monophyletic in three of the five analyses (concatenated mitochondrial, concatenated mitochondrial + nuclear gene trees, and SVD quartet species tree). The mitochondrial gene tree resolved four haploclades, three corresponding to the previously recognized complexes: L. elongatus, L. kriegi and L. petrophilus complexes, and the L. punmahuida group. The BEAST species tree approach included the L. punmahuida group within the L. kriegi complex, but the SVD quartet method placed it as sister to the L. elongatus-kriegi group. BEAST inferred species of the L. elongatus and L. petrophilus complexes as one clade, while SVDquartet inferred these two complexes as monophyletic (although with no statistical support for the L. petrophilus complex). The species tree approach also included the L. punmahuida group as part of the L. elongatus-kriegi group. Our study provides detailed multilocus phylogenetic hypotheses for the L. elongatus-kriegi group, and we discuss possible reasons for differences in the concatenation and species tree methods. Copyright © 2017 Elsevier Inc. All rights reserved.
A note on probabilistic models over strings: the linear algebra approach.

PubMed

Bouchard-Côté, Alexandre

2013-12-01

Probabilistic models over strings have played a key role in developing methods that take into consideration indels as phylogenetically informative events. There is an extensive literature on using automata and transducers on phylogenies to do inference on these probabilistic models, in which an important theoretical question is the complexity of computing the normalization of a class of string-valued graphical models. This question has been investigated using tools from combinatorics, dynamic programming, and graph theory, and has practical applications in Bayesian phylogenetics. In this work, we revisit this theoretical question from a different point of view, based on linear algebra. The main contribution is a set of results based on this linear algebra view that facilitate the analysis and design of inference algorithms on string-valued graphical models. As an illustration, we use this method to give a new elementary proof of a known result on the complexity of inference on the "TKF91" model, a well-known probabilistic model over strings. Compared to previous work, our proving method is easier to extend to other models, since it relies on a novel weak condition, triangular transducers, which is easy to establish in practice. The linear algebra view provides a concise way of describing transducer algorithms and their compositions, opens the possibility of transferring fast linear algebra libraries (for example, based on GPUs), as well as low rank matrix approximation methods, to string-valued inference problems.
Disturbance by an endemic rodent in an arid shrubland is a habitat filter: effects on plant invasion and taxonomical, functional and phylogenetic community structure

PubMed Central

Escobedo, Víctor M.; Rios, Rodrigo S.; Salgado-Luarte, Cristian; Stotz, Gisela C.

2017-01-01

Background and Aims: Disturbance often drives plant invasion and may modify community assembly. However, little is known about how these modifications of community patterns occur in terms of taxonomic, functional and phylogenetic structure. This study evaluated in an arid shrubland the influence of disturbance by an endemic rodent on community functional divergence and phylogenetic structure as well as on plant invasion. It was expected that disturbance would operate as a habitat filter favouring exotic species with short life cycles. Methods: Sixteen plots were sampled along a disturbance gradient caused by the endemic fossorial rodent Spalacopus cyanus, measuring community parameters and estimating functional divergence for life history traits (functional dispersion index) and the relative contribution to functional divergence of exotic and native species. The phylogenetic signal (Pagel’s lambda) and phylogenetic community structure (mean phylogenetic distance and mean nearest taxon phylogenetic distance) were also estimated. The use of a continuous approach to the disturbance gradient allowed the identification of non-linear relationships between disturbance and community parameters. Key Results: The relationship between disturbance and both species richness and abundance was positive for exotic species and negative for native species. Disturbance modified community composition, and exotic species were associated with more disturbed sites. Disturbance increased trait convergence, which resulted in phylogenetic clustering because traits showed a significant phylogenetic signal. The relative contribution of exotic species to functional divergence increased, while that of natives decreased, with disturbance. Exotic and native species were not phylogenetically distinct.
Conclusions: Disturbance by rodents in this arid shrubland constitutes a habitat filter over phylogeny-dependent life history traits, leading to phylogenetic clustering, and drives invasion by favouring species with short life cycles. Results can be explained by high phenotypic and phylogenetic resemblance between exotic and native species. The use of continuous gradients when studying the effects of disturbance on community assembly is advocated. PMID:28087661
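The two community-structure metrics named in the Methods, mean phylogenetic distance (MPD) and mean nearest taxon distance (MNTD), can be sketched in a few lines of Python. This is a minimal presence/absence version under stated assumptions: the abstract does not say whether abundance weighting or a null-model standardisation was used, and the species names and pairwise distances are invented for illustration.

```python
from itertools import combinations


def make_distances(pairs):
    """Turn {(sp1, sp2): distance} into a lookup keyed by unordered pair."""
    return {frozenset(pair): d for pair, d in pairs.items()}


def mpd(dist, community):
    """Mean pairwise phylogenetic distance among co-occurring species."""
    pairs = list(combinations(sorted(set(community)), 2))
    if not pairs:
        raise ValueError("need at least two species")
    return sum(dist[frozenset(p)] for p in pairs) / len(pairs)


def mntd(dist, community):
    """Mean distance from each species to its closest co-occurring relative."""
    species = sorted(set(community))
    if len(species) < 2:
        raise ValueError("need at least two species")
    nearest = [
        min(dist[frozenset((sp, other))] for other in species if other != sp)
        for sp in species
    ]
    return sum(nearest) / len(nearest)


# Hypothetical patristic distances: A and B are close relatives, as are C and D.
DIST = make_distances({
    ("A", "B"): 2.0, ("A", "C"): 6.0, ("A", "D"): 6.0,
    ("B", "C"): 6.0, ("B", "D"): 6.0, ("C", "D"): 4.0,
})
```

With these distances, a plot holding only the close relatives `A` and `B` has lower MPD and MNTD than a plot holding `A`, `B` and `C`, which is the direction of change that "phylogenetic clustering" refers to.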
The Probability of a Gene Tree Topology within a Phylogenetic Network with Applications to Hybridization Detection

PubMed Central

Yu, Yun; Degnan, James H.; Nakhleh, Luay

2012-01-01

Gene tree topologies have proven a powerful data source for various tasks, including species tree inference and species delimitation. Consequently, methods for computing probabilities of gene trees within species trees have been developed and widely used in probabilistic inference frameworks. All these methods assume an underlying multispecies coalescent model. However, when reticulate evolutionary events such as hybridization occur, these methods are inadequate, as they do not account for such events. Methods that account for both hybridization and deep coalescence in computing the probability of a gene tree topology currently exist for very limited cases. However, no such methods exist for general cases, owing primarily to the fact that it is currently unknown how to compute the probability of a gene tree topology within the branches of a phylogenetic network. Here we present a novel method for computing the probability of gene tree topologies on phylogenetic networks and demonstrate its application to the inference of hybridization in the presence of incomplete lineage sorting. We reanalyze a Saccharomyces species data set for which multiple analyses had converged on a species tree candidate. Using our method, though, we show that an evolutionary hypothesis involving hybridization in this group has better support than one of strict divergence. A similar reanalysis on a group of three Drosophila species shows that the data is consistent with hybridization. Further, using extensive simulation studies, we demonstrate the power of gene tree topologies at obtaining accurate estimates of branch lengths and hybridization probabilities of a given phylogenetic network. Finally, we discuss identifiability issues with detecting hybridization, particularly in cases that involve extinction or incomplete sampling of taxa. PMID:22536161
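For orientation, the simplest case that the tree-based methods mentioned above handle is a three-taxon species tree under the multispecies coalescent with one sampled lineage per species. The sketch below covers only that strict-tree case and does not reproduce the network method of the paper. It assumes a species tree ((A,B),C) whose internal branch has length `t` in coalescent units; the function name is illustrative.

```python
import math


def three_taxon_topology_probs(t):
    """Gene tree topology probabilities for species tree ((A,B),C).

    t is the internal branch length in coalescent units. The A and B
    lineages fail to coalesce on that branch with probability exp(-t);
    in that case the three topologies are equally likely.
    """
    if t < 0:
        raise ValueError("branch length must be non-negative")
    discordant = math.exp(-t) / 3.0
    return {
        "((A,B),C)": 1.0 - 2.0 * discordant,
        "((A,C),B)": discordant,
        "((B,C),A)": discordant,
    }
```

Under a strict tree the two discordant topologies are expected at equal frequency, so an excess of one of them is the kind of signal that motivates models allowing hybridization.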

Opposing assembly mechanisms in a neotropical dry forest: implications for phylogenetic and functional community ecology.

PubMed

Swenson, Nathan G; Enquist, Brian J

2009-08-01

Species diversity is promoted and maintained by ecological and evolutionary processes operating on species attributes through space and time. The degree to which variability in species function regulates distribution and promotes coexistence of species has been debated. Previous work has attempted to quantify the relative importance of species function by using phylogenetic relatedness as a proxy for functional similarity. The key assumption of this approach is that function is phylogenetically conserved. If this assumption is supported, then the phylogenetic dispersion in a community should mirror the functional dispersion. Here we quantify functional trait dispersion along several key axes of tree life-history variation and on multiple spatial scales in a Neotropical dry-forest community. We next compare these results to previously reported patterns of phylogenetic dispersion in this same forest. We find that, at small spatial scales, coexisting species are typically more functionally clustered than expected, but traits related to adult and regeneration niches are overdispersed. This outcome was repeated when the analyses were stratified by size class. Some of the trait dispersion results stand in contrast to the previously reported phylogenetic dispersion results. In order to address this inconsistency we examined the strength of phylogenetic signal in traits at different depths in the phylogeny. We argue that: (1) while phylogenetic relatedness may be a good general multivariate proxy for ecological similarity, it may have a reduced capacity to depict the functional mechanisms behind species coexistence when coexisting species simultaneously converge and diverge in function; and (2) the previously used metric of phylogenetic signal provided erroneous inferences about trait dispersion when married with patterns of phylogenetic dispersion.
Phylogenetic placement of two species known only from resting spores: Zoophthora independentia sp. nov. and Z. porteri comb. nov. (Entomophthorales: Entomophthoraceae)

USDA-ARS's Scientific Manuscript database

Molecular methods were used to determine the generic placement of two species of Entomophthorales known only from resting spores. Historically, these species would belong in the form-genus Tarichium, but this classification provides no information about phylogenetic relationships. Using DNA from res...
PHYLOGENETIC ANALYSIS OF 16S RRNA GENE SEQUENCES REVEALS THE PREVALENCE OF MYCOBACTERIA SP., ALPHA-PROTEOBACTERIA, AND UNCULTURED BACTERIA IN DRINKING WATER MICROBIAL COMMUNITIES

EPA Science Inventory

Previous studies have shown that culture-based methods tend to underestimate the densities and diversity of bacterial populations inhabiting water distribution systems (WDS). In this study, the phylogenetic diversity of drinking water bacteria was assessed using sequence analysis...
Phylogenetically informed logic relationships improve detection of biological network organization

PubMed Central

2011-01-01

Background: A "phylogenetic profile" refers to the presence or absence of a gene across a set of organisms, and it has been proven valuable for understanding gene functional relationships and network organization. Despite this success, few studies have attempted to search beyond just pairwise relationships among genes. Here we search for logic relationships involving three genes, and explore their potential application in gene network analyses. Results: Taking advantage of a phylogenetic matrix constructed from the large orthologs database Roundup, we invented a method to create balanced profiles for individual triplets of genes that guarantee equal weight on the different phylogenetic scenarios of coevolution between genes. When we applied this idea to LAPP, the method to search for logic triplets of genes, the balanced profiles resulted in significant performance improvement and the discovery of hundreds of thousands more putative triplets than unadjusted profiles. We found that logic triplets detected biological network organization and identified key proteins and their functions, ranging from neighbouring proteins in local pathways, to well separated proteins in the whole pathway, and to the interactions among different pathways at the system level. Finally, our case study suggested that the directionality in a logic relationship and the profile of a triplet could disclose the connectivity between the triplet and surrounding networks. Conclusion: Balanced profiles are superior to the raw profiles employed by traditional methods of phylogenetic profiling in searching for high order gene sets. Gene triplets can provide valuable information in detection of biological network organization and identification of key genes at different levels of cellular interaction. PMID:22172058
I-HEDGE: determining the optimum complementary sets of taxa for conservation using evolutionary isolation

PubMed Central

Jensen, Evelyn L.; Mooers, Arne Ø.; Caccone, Adalgisa; Russello, Michael A.

2016-01-01

In the midst of the current biodiversity crisis, conservation efforts might profitably be directed towards ensuring that extinctions do not result in inordinate losses of evolutionary history. Numerous methods have been developed to evaluate the importance of species based on their contribution to total phylogenetic diversity on trees and networks, but existing methods fail to take complementarity into account, and thus cannot identify the best order or subset of taxa to protect. Here, we develop a novel iterative calculation of the heightened evolutionary distinctiveness and globally endangered metric (I-HEDGE) that produces the optimal ranked list for conservation prioritization, taking into account complementarity and based on both phylogenetic diversity and extinction probability. We applied this metric to a phylogenetic network based on mitochondrial control region data from extant and recently extinct giant Galápagos tortoises, a highly endangered group of closely related species. We found that the restoration of two extinct species (a project currently underway) will contribute the greatest gain in phylogenetic diversity, and present an ordered list of rankings that is the optimum complementarity set for conservation prioritization. PMID:27635324
I-HEDGE: determining the optimum complementary sets of taxa for conservation using evolutionary isolation.

PubMed

Jensen, Evelyn L; Mooers, Arne Ø; Caccone, Adalgisa; Russello, Michael A

2016-01-01

In the midst of the current biodiversity crisis, conservation efforts might profitably be directed towards ensuring that extinctions do not result in inordinate losses of evolutionary history. Numerous methods have been developed to evaluate the importance of species based on their contribution to total phylogenetic diversity on trees and networks, but existing methods fail to take complementarity into account, and thus cannot identify the best order or subset of taxa to protect. Here, we develop a novel iterative calculation of the heightened evolutionary distinctiveness and globally endangered metric (I-HEDGE) that produces the optimal ranked list for conservation prioritization, taking into account complementarity and based on both phylogenetic diversity and extinction probability. We applied this metric to a phylogenetic network based on mitochondrial control region data from extant and recently extinct giant Galápagos tortoises, a highly endangered group of closely related species. We found that the restoration of two extinct species (a project currently underway) will contribute the greatest gain in phylogenetic diversity, and present an ordered list of rankings that is the optimum complementarity set for conservation prioritization.
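The role of complementarity in an iterative ranking can be illustrated with a small Python sketch. This is a simplified reading of the idea on a tree rather than a network, not the authors' implementation: each species is scored by the expected phylogenetic diversity secured if it alone were protected, the top species is selected, its extinction probability is set to zero, and all scores are recomputed. The branch list, extinction probabilities and function names are assumptions for illustration.

```python
def expected_gain(branches, p_ext, species):
    """Expected PD secured by protecting one species.

    branches: list of (length, set_of_descendant_tips).
    A branch is lost only if every tip below it goes extinct, so protecting
    a species secures length * P(all tips below the branch go extinct).
    """
    gain = 0.0
    for length, tips_below in branches:
        if species in tips_below:
            lost = 1.0
            for tip in tips_below:
                lost *= p_ext[tip]
            gain += length * lost
    return gain


def iterative_ranking(branches, p_ext):
    """Greedy ranking that rescores after each selection."""
    p = dict(p_ext)  # work on a copy; the caller's probabilities are untouched
    remaining = set(p)
    ranking = []
    while remaining:
        best = max(sorted(remaining), key=lambda s: expected_gain(branches, p, s))
        ranking.append(best)
        p[best] = 0.0  # treated as safe from now on
        remaining.remove(best)
    return ranking


# Hypothetical tree ((A,B),C): two threatened sister species and one
# less threatened species on a long branch.
BRANCHES = [(1.0, {"A"}), (1.0, {"B"}), (2.0, {"A", "B"}), (3.0, {"C"})]
P_EXT = {"A": 0.9, "B": 0.9, "C": 0.5}
```

In this toy example `A` and `B` start with equal, high scores, but once `A` is protected their shared branch is already secure, so `C` moves ahead of `B`. A one-pass ranking would have listed `A`, `B`, `C`.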
A review of bioinformatics platforms for comparative genomics. Recent developments of the EDGAR 2.0 platform and its utility for taxonomic and phylogenetic studies.

PubMed

Yu, J; Blom, J; Glaeser, S P; Jaenicke, S; Juhre, T; Rupp, O; Schwengers, O; Spänig, S; Goesmann, A

2017-11-10

The rapid development of next generation sequencing technology has greatly increased the amount of available microbial genomes. As a result of this development, there is a rising demand for fast and automated approaches in analyzing these genomes in a comparative way. Whole genome sequencing also bears a huge potential for obtaining a higher resolution in phylogenetic and taxonomic classification. During the last decade, several software tools and platforms have been developed in the field of comparative genomics. In this manuscript, we review the most commonly used platforms and approaches for ortholog group analyses with a focus on their potential for phylogenetic and taxonomic research. Furthermore, we describe the latest improvements of the EDGAR platform for comparative genome analyses and present recent examples of its application for the phylogenomic analysis of different taxa. Finally, we illustrate the role of the EDGAR platform as part of the BiGi Center for Microbial Bioinformatics within the German network on Bioinformatics Infrastructure (de.NBI). Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.
Use of phylogenetical analysis to predict susceptibility of pathogenic Candida spp. to antifungal drugs.

PubMed

Maheux, Andrée F; Sellam, Adnane; Piché, Yves; Boissinot, Maurice; Pelletier, René; Boudreau, Dominique K; Picard, François J; Trépanier, Hélène; Boily, Marie-Josée; Ouellette, Marc; Roy, Paul H; Bergeron, Michel G

2016-12-01

Successful treatment of a Candida infection relies on 1) an accurate identification of the pathogenic fungus and 2) on its susceptibility to antifungal drugs. In the present study we investigated the level of correlation between phylogenetical evolution and susceptibility of pathogenic Candida spp. to antifungal drugs. For this, we compared a phylogenetic tree, assembled with the concatenated sequences (2475-bp) of the ATP2, TEF1, and TUF1 genes from 20 representative Candida species, with published minimal inhibitory concentrations (MIC) of the four principal antifungal drug classes commonly used in the treatment of candidiasis: polyenes, triazoles, nucleoside analogues, and echinocandins. The phylogenetic tree revealed three distinct phylogenetic clusters among Candida species. Species within a given phylogenetic cluster have generally similar susceptibility profiles to antifungal drugs and species within Clusters II and III were less sensitive to antifungal drugs than Cluster I species. These results showed that phylogenetical relationship between clusters and susceptibility to several antifungal drugs could be used to guide therapy when only species identification is available prior to information pertaining to its resistance profile. An extended study comprising a large panel of clinical samples should be conducted to confirm the efficiency of this approach in the treatment of candidiasis. Copyright © 2016. Published by Elsevier B.V.
Evolutionary lineages of marine snails identified using molecular phylogenetics and geometric morphometric analysis of shells.

PubMed

Vaux, Felix; Trewick, Steven A; Crampton, James S; Marshall, Bruce A; Beu, Alan G; Hills, Simon F K; Morgan-Richards, Mary

2018-06-15

The relationship between morphology and inheritance is of perennial interest in evolutionary biology and palaeontology. Using three marine snail genera Penion, Antarctoneptunea and Kelletia, we investigate whether systematics based on shell morphology accurately reflect evolutionary lineages indicated by molecular phylogenetics. Members of these gastropod genera have been a taxonomic challenge due to substantial variation in shell morphology, conservative radular and soft tissue morphology, few known ecological differences, and geographical overlap between numerous species. Sampling all sixteen putative taxa identified across the three genera, we infer mitochondrial and nuclear ribosomal DNA phylogenetic relationships within the group, and compare this to variation in adult shell shape and size. Results of phylogenetic analysis indicate that each genus is monophyletic, although the status of some phylogenetically derived and likely more recently evolved taxa within Penion is uncertain. The recently described species P. lineatus is supported by genetic evidence. Morphology, captured using geometric morphometric analysis, distinguishes the genera and matches the molecular phylogeny, although using the same dataset, species and phylogenetic subclades are not identified with high accuracy. Overall, despite abundant variation, we find that shell morphology accurately reflects genus-level classification and the corresponding deep phylogenetic splits identified in this group of marine snails. Copyright © 2018 Elsevier Inc. All rights reserved.
Molecular epidemiology and phylogenetic analysis of Hepatitis B virus in a group of migrants in Italy.

PubMed

Villano, Umbertina; Lo Presti, Alessandra; Equestre, Michele; Cella, Eleonora; Pisani, Giulio; Giovanetti, Marta; Bruni, Roberto; Tritarelli, Elena; Amicosante, Massimo; Grifoni, Alba; Scarcella, Carmelo; El-Hamad, Issa; Pezzoli, Maria Chiara; Angeletti, Silvia; Ciccaglione, Anna Rita; Ciccozzi, Massimo

2015-07-25

Hepatitis B virus (HBV) infection is widespread and is considered a major health problem worldwide. The global distribution of HBV varies significantly between countries and between regions of the world. Among the many factors contributing to the changing epidemiology of viral hepatitis, the movement of people within and between countries is a potentially important one. In Italy, the number of migrant individuals has been increasing during the past 25 years. HBV genotype D has been found throughout the world, although its highest prevalence is in the Mediterranean area, the Middle East and southern Asia. We describe the molecular epidemiology of HBV in a chronically infected population of migrants living in Italy, using phylogenetic analysis. HBV-DNA was amplified and sequenced from 43 chronically infected patients. Phylogenetic and evolutionary analyses were performed using both maximum likelihood and Bayesian methods. Of the 43 HBV S gene isolates from migrants, 25 (58.1%) were classified as genotype D. Maximum likelihood analysis showed that Moldavian sequences intermixed with other foreign sequences more than with Italian ones. Italian sequences mostly clustered together in a main clade, separate from all others. The estimated time of the tree's root had a mean value of 17 years ago, dating the origin of the tree to 1992. The skyline plot showed that the number of infections increased gradually until around 2005, after which it reached a plateau. Comparison of the phylogenetic data with the migrants' dates of arrival in Italy suggests that the migrants may have arrived in Italy already infected in their country of origin. In conclusion, this is the first paper in which phylogenetic analysis and genetic evolution have been used to characterize the circulation of HBV sub-genotype D1 in a selected and homogeneous group of migrants coming from a restricted area of the Balkans, and to approximately define the period of infection in addition to the migration date.
Organellar phylogenomics of an emerging model system: Sphagnum (peatmoss).

PubMed

Jonathan Shaw, A; Devos, Nicolas; Liu, Yang; Cox, Cymon J; Goffinet, Bernard; Flatberg, Kjell Ivar; Shaw, Blanka

2016-08-01

Sphagnum-dominated peatlands contain approx. 30% of the terrestrial carbon pool in the form of partially decomposed plant material (peat), and, as a consequence, Sphagnum is currently a focus of studies on biogeochemistry and control of global climate. Sphagnum species differ in ecologically important traits that scale up to impact ecosystem function, and sequencing of the genome from selected Sphagnum species is currently underway. As an emerging model system, these resources for Sphagnum will facilitate linking nucleotide variation to plant functional traits, and through those traits to ecosystem processes. A solid phylogenetic framework for Sphagnum is crucial to comparative analyses of species-specific traits, but relationships among major clades within Sphagnum have been recalcitrant to resolution because the genus underwent a rapid radiation. Herein a well-supported hypothesis for phylogenetic relationships among major clades within Sphagnum based on organellar genome sequences (plastid, mitochondrial) is provided. We obtained nucleotide sequences (273 753 nucleotides in total) from the two organellar genomes from 38 species (including three outgroups). Phylogenetic analyses were conducted using a variety of methods applied to nucleotide and amino acid sequences. The Sphagnum phylogeny was rooted with sequences from the related Sphagnopsida genera, Eosphagnum and Flatbergium. Phylogenetic analyses of the data converge on the following subgeneric relationships: (Rigida (((Subsecunda) (Cuspidata)) ((Sphagnum) (Acutifolia)))). All relationships were strongly supported. Species in the two major clades (i.e. Subsecunda + Cuspidata and Sphagnum + Acutifolia), which include >90% of all Sphagnum species, differ in ecological niches and these differences correlate with other functional traits that impact biogeochemical cycling. Mitochondrial intron presence/absence is variable among species and genera of the Sphagnopsida. Two new nomenclatural combinations are made, in the genera Eosphagnum and Flatbergium. Newly resolved relationships now permit phylogenetic analyses of morphological, biochemical and ecological traits among Sphagnum species. The results clarify long-standing disagreements about subgeneric relationships and intrageneric classification. © The Author 2016. Published by Oxford University Press on behalf of the Annals of Botany Company. All rights reserved.
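The subgeneric relationships stated in the abstract can be written as a comma-separated Newick string and read into a nested structure. The sketch below is a minimal topology-only parser written for illustration; it ignores branch lengths and support values, and the Newick rendering of the abstract's bracket notation is an assumption about how that notation maps to standard syntax.

```python
def parse_newick(text):
    """Parse a topology-only Newick string into nested tuples of names."""
    text = "".join(text.split()).rstrip(";")
    pos = 0

    def peek():
        return text[pos] if pos < len(text) else ""

    def node():
        nonlocal pos
        if peek() == "(":
            pos += 1  # consume "("
            children = [node()]
            while peek() == ",":
                pos += 1  # consume ","
                children.append(node())
            if peek() != ")":
                raise ValueError("unbalanced parentheses")
            pos += 1  # consume ")"
            return tuple(children)
        start = pos
        while peek() not in ("", ",", "(", ")"):
            pos += 1
        if start == pos:
            raise ValueError("missing taxon name")
        return text[start:pos]

    tree = node()
    if pos != len(text):
        raise ValueError("unexpected trailing characters")
    return tree


def leaves(tree):
    """All tip names below a node, left to right."""
    if isinstance(tree, str):
        return [tree]
    return [name for child in tree for name in leaves(child)]


def clades(tree):
    """The tip set of every internal node, root first."""
    if isinstance(tree, str):
        return []
    found = [frozenset(leaves(tree))]
    for child in tree:
        found.extend(clades(child))
    return found


# The relationships from the abstract, rendered here as standard Newick.
SPHAGNUM = "(Rigida,((Subsecunda,Cuspidata),(Sphagnum,Acutifolia)));"
```

Calling `clades(parse_newick(SPHAGNUM))` lists the two major clades named in the abstract, Subsecunda + Cuspidata and Sphagnum + Acutifolia, along with the group that unites them to the exclusion of Rigida.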
Polynomial Supertree Methods Revisited

PubMed Central

Brinkmeyer, Malte; Griebel, Thasso; Böcker, Sebastian

2011-01-01

Supertree methods allow the reconstruction of large phylogenetic trees by combining smaller trees with overlapping leaf sets into one, more comprehensive supertree. The most commonly used supertree method, matrix representation with parsimony (MRP), produces accurate supertrees but is rather slow due to the underlying hard optimization problem. In this paper, we present an extensive simulation study comparing the performance of MRP and the polynomial supertree methods MinCut Supertree, Modified MinCut Supertree, Build-with-distances, PhySIC, PhySIC_IST, and super distance matrix. We consider both quality and resolution of the reconstructed supertrees. Our findings illustrate the tradeoff between accuracy and running time in supertree construction, as well as the pros and cons of voting- and veto-based supertree approaches. Based on our results, we make some general suggestions for supertree methods yet to come. PMID:22229028
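The "matrix representation" half of MRP is simple to sketch; the expensive part is the parsimony search that follows, which is not shown here. In the standard coding, every non-trivial clade of every source tree becomes one binary character: taxa inside the clade are scored 1, taxa in that source tree but outside the clade are scored 0, and taxa absent from that source tree are scored as missing (?). The source trees below are hypothetical, and the function names are chosen for illustration.

```python
def leaves(tree):
    """Tip names of a tree given as nested tuples of strings."""
    if isinstance(tree, str):
        return [tree]
    return [name for child in tree for name in leaves(child)]


def informative_clades(tree, is_root=True):
    """Tip sets of internal nodes, skipping the root (it holds every taxon)."""
    if isinstance(tree, str):
        return []
    found = [] if is_root else [frozenset(leaves(tree))]
    for child in tree:
        found.extend(informative_clades(child, is_root=False))
    return found


def mrp_matrix(trees):
    """Return {taxon: character string} with one column per source clade."""
    taxa = sorted({name for tree in trees for name in leaves(tree)})
    rows = {taxon: "" for taxon in taxa}
    for tree in trees:
        present = set(leaves(tree))
        for clade in informative_clades(tree):
            for taxon in taxa:
                if taxon in clade:
                    rows[taxon] += "1"
                elif taxon in present:
                    rows[taxon] += "0"
                else:
                    rows[taxon] += "?"  # taxon not sampled in this source tree
    return rows


# Two hypothetical source trees with overlapping leaf sets.
SOURCE_TREES = [(("A", "B"), ("C", "D")), (("B", "C"), "E")]
```

For these two trees the matrix has three columns (clades A+B, C+D and B+C), and the conflict between the source trees shows up as incompatible characters that the parsimony step must reconcile.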
Functional & phylogenetic diversity of copepod communities

NASA Astrophysics Data System (ADS)

Benedetti, F.; Ayata, S. D.; Blanco-Bercial, L.; Cornils, A.; Guilhaumon, F.

2016-02-01

The diversity of natural communities is classically estimated through species identification (taxonomic diversity) but can also be estimated from the ecological functions performed by the species (functional diversity), or from the phylogenetic relationships among them (phylogenetic diversity). Estimating functional diversity requires the definition of specific functional traits, i.e., phenotypic characteristics that impact fitness and are relevant to ecosystem functioning. Estimating phylogenetic diversity requires the description of phylogenetic relationships, for instance by using molecular tools. In the present study, we focused on the functional and phylogenetic diversity of copepod surface communities in the Mediterranean Sea. First, we implemented a specific trait database for the most commonly-sampled and abundant copepod species of the Mediterranean Sea. Our database includes 191 species, described by seven traits encompassing diverse ecological functions: minimal and maximal body length, trophic group, feeding type, spawning strategy, diel vertical migration and vertical habitat. Clustering analysis in the functional trait space revealed that Mediterranean copepods can be gathered into groups that have different ecological roles. Second, we reconstructed a phylogenetic tree using the available sequences of 18S rRNA. Our tree included 154 of the analyzed Mediterranean copepod species. We used these two datasets to describe the functional and phylogenetic diversity of copepod surface communities in the Mediterranean Sea. The replacement component (turn-over) and the species richness difference component (nestedness) of the beta diversity indices were identified. Finally, by comparing various and complementary aspects of plankton diversity (taxonomic, functional, and phylogenetic diversity) we were able to gain a better understanding of the relationships among the zooplankton community, biodiversity, ecosystem function, and environmental forcing.
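The split of beta diversity into a replacement component and a richness-difference component can be sketched for a pair of sites. The abstract does not say which index family was used, so the sketch assumes the Jaccard-based additive partition, in which total dissimilarity equals replacement plus richness difference; other partitions (for example Sørensen-based ones) define the components differently. Species names are placeholders.

```python
def beta_partition(site1, site2):
    """Jaccard dissimilarity split into replacement and richness difference.

    a = shared species, b = species only in site1, c = species only in site2.
    total = (b + c) / (a + b + c)
    replacement = 2 * min(b, c) / (a + b + c)
    richness_difference = |b - c| / (a + b + c)
    """
    s1, s2 = set(site1), set(site2)
    a = len(s1 & s2)
    b = len(s1 - s2)
    c = len(s2 - s1)
    n = a + b + c
    if n == 0:
        raise ValueError("both sites are empty")
    return {
        "total": (b + c) / n,
        "replacement": 2 * min(b, c) / n,
        "richness_difference": abs(b - c) / n,
    }


# Hypothetical species lists for two sampling stations.
STATION_1 = ["sp1", "sp2", "sp3", "sp4", "sp5"]
STATION_2 = ["sp4", "sp5", "sp6"]
```

Here two species are shared, three are unique to the first station and one to the second, so the total dissimilarity of 4/6 splits evenly into 2/6 replacement and 2/6 richness difference. The same calculation can be applied to functional or phylogenetic units in place of species.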
Phylogenetic support for the Tropical Niche Conservatism Hypothesis despite the absence of a clear latitudinal species richness gradient in Yunnan's woody flora

NASA Astrophysics Data System (ADS)

Tang, G.; Zhang, M. G.; Liu, C.; Zhou, Z.; Chen, W.; Slik, J. W. F.

2014-05-01

The Tropical Niche Conservatism Hypothesis (TCH) tries to explain the generally observed latitudinal gradient of increasing species diversity towards the tropics. To date, few studies have used phylogenetic approaches to assess its validity, even though such methods are especially suited to detect changes in niche structure. We test the TCH using modeled distributions of 1898 woody species in Yunnan Province (southwest China) in combination with a family level phylogeny. Contrary to predictions, species richness and phylogenetic diversity did not show a latitudinal gradient, but instead revealed two high-diversity zones, one in Northwest and one in South Yunnan. Despite this, the underlying residual phylogenetic diversity showed a clear decline away from the tropics, while the species composition became progressively more phylogenetically clustered towards the North. These latitudinal changes were strongly associated with more extreme temperature variability and declining precipitation and soil water availability, especially during the dry season. Our results suggest that the climatically more extreme conditions outside the tropics require adaptations for successful colonization, most likely related to the plant hydraulic system, that have been acquired by only a limited number of phylogenetically closely related plant lineages. We emphasize the importance of phylogenetic approaches for testing the TCH.
Evidence of a chimpanzee-sized ancestor of humans but a gibbon-sized ancestor of apes.

PubMed

Grabowski, Mark; Jungers, William L

2017-10-12

Body mass directly affects how an animal relates to its environment and has a wide range of biological implications. However, little is known about the mass of the last common ancestor (LCA) of humans and chimpanzees, hominids (great apes and humans), or hominoids (all apes and humans), which is needed to evaluate numerous paleobiological hypotheses at and prior to the root of our lineage. Here we use phylogenetic comparative methods and data from primates including humans, fossil hominins, and a wide sample of fossil primates including Miocene apes from Africa, Europe, and Asia to test alternative hypotheses of body mass evolution. Our results suggest, contrary to previous suggestions, that the LCA of all hominoids lived in an environment that favored a gibbon-like size, but a series of selective regime shifts, possibly due to resource availability, led to a decrease and then increase in body mass in early hominins from a chimpanzee-sized LCA. The pattern of body size evolution in hominids can provide insight into historical human ecology. Here, Grabowski and Jungers use comparative phylogenetic analysis to reconstruct the likely size of the ancestor of humans and chimpanzees and the evolutionary history of selection on body size in primates.
Component identification of electron transport chains in curdlan-producing Agrobacterium sp. ATCC 31749 and its genome-specific prediction using comparative genome and phylogenetic trees analysis.

PubMed

Zhang, Hongtao; Setubal, Joao Carlos; Zhan, Xiaobei; Zheng, Zhiyong; Yu, Lijun; Wu, Jianrong; Chen, Dingqiang

2011-06-01

Agrobacterium sp. ATCC 31749 (formerly named Alcaligenes faecalis var. myxogenes) is a non-pathogenic aerobic soil bacterium used in large scale biotechnological production of curdlan. However, little is known about its genome. Partial DNA sequences of electron transport chain (ETC) protein genes were obtained in order to understand the components of the ETC and its genome specificity in Agrobacterium sp. ATCC 31749. Degenerate primers were designed according to ETC conserved sequences in other reported species. Partial DNA sequences of ETC genes in Agrobacterium sp. ATCC 31749 were cloned by the PCR method using degenerate primers. Based on comparative genomic analysis, nine electron transport elements were ascertained, including NADH ubiquinone oxidoreductase, succinate dehydrogenase complex II, complex III, cytochrome c, ubiquinone biosynthesis protein ubiB, cytochrome d terminal oxidase, cytochrome bo terminal oxidase, cytochrome cbb(3)-type terminal oxidase and cytochrome caa(3)-type terminal oxidase. Similarity and phylogenetic analyses of these genes revealed that among fully sequenced Agrobacterium species, Agrobacterium sp. ATCC 31749 is closest to Agrobacterium tumefaciens C58. Based on these results a comprehensive ETC model for Agrobacterium sp. ATCC 31749 is proposed.
The significance of gtf genes in caries expression: a rapid identification of Streptococcus mutans from dental plaque of child patients.

PubMed

Mishra, Apurva; Pandey, Ramesh K; Manickam, Natesan

2015-01-01

Rapid phylogenetic and functional gene (gtfB) identification of S. mutans from the dental plaque derived from children. Dental plaque collected from fifteen patients of age group 7-12 underwent centrifugation followed by genomic DNA extraction for S. mutans. Genomic DNA was processed with S. mutans specific primers in suitable PCR conditions for phylogenetic and functional gene (gtfB) identification. The yield and results were confirmed by agarose gel electrophoresis. 1% agarose gel electrophoresis depicts the positive PCR amplification at 1,485 bp when compared with standard 1 kbp indicating the presence of S. mutans in the test sample. Another PCR reaction was set using gtfB primers specific for S. mutans for functional gene identification. 1.2% agarose gel electrophoresis was done and a positive amplification was observed at 192 bp when compared to 100 bp standards. With the advancement in molecular biology techniques, PCR based identification and quantification of the bacterial load can be done within hours using species-specific primers and DNA probes. Thus, this technique may reduce the laboratory time spent on conventional culture methods, reduce the possibility of colony identification errors, and is more sensitive than culture techniques.
Bacterial whole genome-based phylogeny: construction of a new benchmarking dataset and assessment of some existing methods.

PubMed

Ahrenfeldt, Johanne; Skaarup, Carina; Hasman, Henrik; Pedersen, Anders Gorm; Aarestrup, Frank Møller; Lund, Ole

2017-01-05

Whole genome sequencing (WGS) is increasingly used in diagnostics and surveillance of infectious diseases. A major application for WGS is to use the data for identifying outbreak clusters, and there is therefore a need for methods that can accurately and efficiently infer phylogenies from sequencing reads. In the present study we describe a new dataset that we have created for the purpose of benchmarking such WGS-based methods for epidemiological data, and also present an analysis where we use the data to compare the performance of some current methods. Our aim was to create a benchmark data set that mimics sequencing data of the sort that might be collected during an outbreak of an infectious disease. This was achieved by letting an E. coli hypermutator strain grow in the lab for 8 consecutive days, each day splitting the culture in two while also collecting samples for sequencing. The result is a data set consisting of 101 whole genome sequences with known phylogenetic relationship. Among the sequenced samples 51 correspond to internal nodes in the phylogeny because they are ancestral, while the remaining 50 correspond to leaves. We also used the newly created data set to compare three different methods, available online, that infer phylogenies from whole-genome sequencing reads: NDtree, CSI Phylogeny and REALPHY. One complication when comparing the output of these methods with the known phylogeny is that phylogenetic methods typically build trees where all observed sequences are placed as leaves, even though some of them are in fact ancestral. We therefore devised a method for post processing the inferred trees by collapsing short branches (thus relocating some leaves to internal nodes), and also present two new measures of tree similarity that take into account the identity of both internal and leaf nodes. Based on this analysis we find that, among the investigated methods, CSI Phylogeny had the best performance, correctly identifying 73% of all branches in the tree and 71% of all clades. We have made all data from this experiment (raw sequencing reads, consensus whole-genome sequences, as well as descriptions of the known phylogeny in a variety of formats) publicly available, with the hope that other groups may find this data useful for benchmarking and exploring the performance of epidemiological methods. All data is freely available at: https://cge.cbs.dtu.dk/services/evolution_data.php .
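The post-processing step mentioned in this abstract (collapsing short branches so that some leaves become labeled internal nodes) can be illustrated with a minimal sketch. The `Node` class, the function name `collapse_short_leaves`, and the threshold value are hypothetical choices made for illustration; the abstract does not specify the authors' actual procedure or cutoff.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical tree node: `name` is a sample label (None if unlabeled)."""
    name: Optional[str] = None
    length: float = 0.0                      # branch length to the parent
    children: List["Node"] = field(default_factory=list)


def collapse_short_leaves(node: Node, threshold: float) -> Node:
    """Move leaves on branches shorter than `threshold` onto their parent.

    A relocated leaf gives its label to the (previously unlabeled) parent,
    which mimics treating that sample as ancestral rather than as a tip.
    """
    kept = []
    for child in node.children:
        collapse_short_leaves(child, threshold)   # process subtrees first
        is_leaf = not child.children
        if is_leaf and child.length < threshold and node.name is None:
            node.name = child.name                # sample now labels the internal node
        else:
            kept.append(child)
    node.children = kept
    return node


# Example: sample "A" sits on a near-zero branch and is moved to the root.
root = Node(children=[
    Node("A", 0.0001),
    Node(None, 0.5, [Node("B", 0.3), Node("C", 0.4)]),
])
collapse_short_leaves(root, threshold=0.01)
print(root.name, [c.name for c in root.children[0].children])
```

In this sketch only one sample can label a given internal node; how ties between several short-branched leaves under the same parent should be resolved is left open.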
The phylogenetic and evolutionary history of a novel alpha-globin-type gene in orangutans (Pongo pygmaeus).

PubMed

Steiper, Michael E; Wolfe, Nathan D; Karesh, William B; Kilbourn, Annelisa M; Bosi, Edwin J; Ruvolo, Maryellen

2006-07-01

The alpha-globin genes are implicated in human resistance to malaria, a disease caused by Plasmodium parasites. This study is the first to analyze DNA sequences from a novel alpha-globin-type gene in orangutans, a species affected by Plasmodium. Phylogenetic methods show that the gene is a duplication of an alpha-globin gene and is located 5' of alpha-2 globin. The alpha-globin-type gene is notable for having four amino acid replacements relative to the orangutan's alpha-1 and alpha-2 globin genes, with no synonymous differences. Pairwise K(a)/K(s) methods and likelihood ratio tests (LRTs) revealed that the evolutionary history of the alpha-globin-type gene has been marked by either neutral or positive evolution, but not purifying selection. A comparative analysis of the amino acid replacements of the alpha-globin-type gene with human hemoglobinopathies and hemoglobin structure showed that two of the four replaced sites are members of the same molecular bond, one that is crucial to the proper functioning of the hemoglobin molecule. This suggested an adaptive evolutionary change. Functionally, this locus may result in a thalassemia-like phenotype in orangutans, possibly as an adaptation to combat Plasmodium.
Molecular phylogenetics of finches and sparrows: consequences of character state removal in cytochrome b sequences.

PubMed

Groth, J G

1998-12-01

The complete mitochondrial cytochrome b genes of 53 genera of oscine passerine birds representing the major groups of finches and some allies were compared. Phylogenetic trees resulting from three levels of character partition removal (no data removed, transitions at third positions of codons removed, and all transitions removed [transversion parsimony]) were generally concordant, and all supported several basic statements regarding relationships of finches and finch-like birds, including: (1) larks (Alaudidae) show no close relationship to any finch group; (2) Peucedramus (olive warbler) is phylogenetically far removed from true wood warblers; (3) a clade consisting of fringillids, passerids, motacillids, and emberizids is supported, and this clade is characterized by evolution of a vestigial 10th wing primary; and (4) Hawaiian honeycreepers are derived from within the cardueline finches. Excluding transition substitutions at third positions of codons resulted in phylogenetic trees similar to, but with greater bootstrap nodal support than, trees derived using either all data (equally weighted) or transversion parsimony. Relative to the shortest trees obtained using all data, the topologies obtained after elimination of third-position transitions showed only slight increases in realized treelength and homoplasy. These increases were negligible compared to increases in overall nodal support; therefore, this partition removal scheme may enhance recovery of deep phylogenetic signal in protein-coding DNA datasets. Copyright 1998 Academic Press.
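One common way to remove transitions from a nucleotide alignment is to recode bases as purines (R) and pyrimidines (Y), so that only transversions remain visible as character state changes. The sketch below applies that recoding either to third codon positions only or to all positions. It is an illustration under stated assumptions: the function names are hypothetical, the sequence is assumed to start at codon position 1, and the abstract does not say whether the author used recoding or a weighting scheme.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def ry_code(base: str) -> str:
    """Recode a base as R (purine) or Y (pyrimidine); leave anything else as is."""
    b = base.upper()
    if b in PURINES:
        return "R"
    if b in PYRIMIDINES:
        return "Y"
    return b  # gaps and ambiguity codes are passed through unchanged


def remove_transitions(seq: str, third_only: bool = True) -> str:
    """Hide transitions by RY-recoding third codon positions or every position.

    Assumes `seq` is in frame, i.e. index 0 is the first position of a codon.
    """
    out = []
    for i, base in enumerate(seq):
        is_third = (i % 3 == 2)
        if is_third or not third_only:
            out.append(ry_code(base))
        else:
            out.append(base.upper())
    return "".join(out)


print(remove_transitions("ATGGCA"))                    # third positions only
print(remove_transitions("ATGGCA", third_only=False))  # all positions
```

With `third_only=False` the recoding corresponds to the "all transitions removed" partition described in the abstract.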

Is invasion success of Australian trees mediated by their native biogeography, phylogenetic history, or both?

PubMed

Miller, Joseph T; Hui, Cang; Thornhill, Andrew; Gallien, Laure; Le Roux, Johannes J; Richardson, David M

2016-12-30

For a plant species to become invasive it has to progress along the introduction-naturalization-invasion (INI) continuum which reflects the joint direction of niche breadth. Identification of traits that correlate with and drive species invasiveness along the continuum is a major focus of invasion biology. If invasiveness is underlain by heritable traits, and if such traits are phylogenetically conserved, then we would expect non-native species with different introduction status (i.e. position along the INI continuum) to show phylogenetic signal. This study uses two clades that contain a large number of invasive tree species from the genera Acacia and Eucalyptus to test whether geographic distribution and a novel phylogenetic conservation method can predict which species have been introduced, become naturalized, and become invasive. Our results suggest that no phylogenetic signal underlies introduction status in either group of trees, except for introduced acacias. The more invasive acacia clade contains invasive species that have smoother geographic distributions and are more marginal in the phylogenetic network. The less invasive eucalyptus group contains invasive species that are more clustered geographically, more centrally located in the phylogenetic network and have phylogenetic distances between invasive and non-invasive species that are trending toward the mean pairwise distance. This suggests that highly invasive groups may be identified because they have invasive species with smoother and faster expanding native distributions and are located more to the edges of phylogenetic networks than less invasive groups. Published by Oxford University Press on behalf of the Annals of Botany Company.
Entire plastid phylogeny of the carrot genus (Daucus, Apiaceae): Concordance with nuclear data and mitochondrial and nuclear DNA insertions to the plastid

USDA-ARS's Scientific Manuscript database

We explored the phylogenetic utility of entire plastid DNA sequences in Daucus and compared the results to prior phylogenetic results using plastid, nuclear, and mitochondrial DNA sequences. We obtained, using Illumina sequencing, full plastid sequences of 37 accessions of 20 Daucus taxa and outgrou...
P-type ATPase superfamily: evidence for critical roles for kingdom evolution.

PubMed

Okamura, Hideyuki; Denawa, Masatsugu; Ohniwa, Ryosuke; Takeyasu, Kunio

2003-04-01

The P-type ATPase has become a protein superfamily. On the basis of sequence similarities, the phylogenetic analyses, and substrate specificities, this superfamily can be classified into 5 families and 11 subfamilies. A comparative phylogenetic analysis demonstrates the relationship between the molecular evolution of these subfamilies and the establishment of the kingdoms of living things.
The Cladistic Basis for the Phylogenetic Diversity (PD) Measure Links Evolutionary Features to Environmental Gradients and Supports Broad Applications of Microbial Ecology’s “Phylogenetic Beta Diversity” Framework

PubMed Central

Faith, Daniel P.; Lozupone, Catherine A.; Nipperess, David; Knight, Rob

2009-01-01

The PD measure of phylogenetic diversity interprets branch lengths cladistically to make inferences about feature diversity. PD calculations extend conventional species-level ecological indices to the features level. The “phylogenetic beta diversity” framework developed by microbial ecologists calculates PD-dissimilarities between community localities. Interpretation of these PD-dissimilarities at the feature level explains the framework’s success in producing ordinations revealing environmental gradients. An example gradients space using PD-dissimilarities illustrates how evolutionary features form unimodal response patterns to gradients. This features model supports new application of existing species-level methods that are robust to unimodal responses, plus novel applications relating to climate change, commercial products discovery, and community assembly. PMID:20087461
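To make the PD calculation concrete, the sketch below computes PD as the total length of the branches connecting a set of taxa to the root, and a PD-dissimilarity between two localities as the fraction of spanned branch length found in only one of them. The tree encoding (a `parent` dictionary mapping each node to its parent and branch length), the function names, and the choice of this particular dissimilarity are assumptions for illustration; the framework described in the abstract covers a family of such measures.

```python
def branches_spanned(taxa, parent):
    """Branches (named by their child node) on the paths from `taxa` to the root."""
    spanned = set()
    for taxon in taxa:
        node = taxon
        # Stop early once we reach a branch that is already counted.
        while node in parent and node not in spanned:
            spanned.add(node)
            node = parent[node][0]
    return spanned


def pd(taxa, parent):
    """Phylogenetic diversity: summed length of all spanned branches."""
    return sum(parent[b][1] for b in branches_spanned(taxa, parent))


def pd_dissimilarity(taxa_a, taxa_b, parent):
    """Share of the combined branch length that is unique to one locality."""
    sa = branches_spanned(taxa_a, parent)
    sb = branches_spanned(taxa_b, parent)
    unique = sum(parent[b][1] for b in sa ^ sb)
    total = sum(parent[b][1] for b in sa | sb)
    return unique / total if total else 0.0


# Toy rooted tree: root -> X (1.0), root -> D (3.0); X -> A (1.0), X -> B (2.0)
parent = {
    "A": ("X", 1.0),
    "B": ("X", 2.0),
    "X": ("root", 1.0),
    "D": ("root", 3.0),
}
print(pd({"A", "B"}, parent))                 # 4.0
print(pd_dissimilarity({"A"}, {"B"}, parent)) # 0.75
```

Because each branch stands in for the features that arose along it, shared branches count as shared features, which is the feature-level reading of PD that the abstract emphasizes.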
16S and 23S plastid rDNA phylogenies of Prototheca species and their auxanographic phenotypes.

PubMed

Ewing, Aren; Brubaker, Shane; Somanchi, Aravind; Yu, Esther; Rudenko, George; Reyes, Nina; Espina, Karen; Grossman, Arthur; Franklin, Scott

2014-08-01

Because algae have become more accepted as sources of human nutrition, phylogenetic analysis can help resolve the taxonomy of taxa that have not been well studied. This can help establish algal evolutionary relationships. Here, we compare Auxenochlorella protothecoides and 23 strains of Prototheca based on their complete 16S and partial 23S plastid rDNA sequences along with nutrient utilization (auxanographic) profiles. These data demonstrate that some of the species groupings are not in agreement with the molecular phylogenetic analyses and that auxanographic profiles are poor predictors of phylogenetic relationships.
16S and 23S plastid rDNA phylogenies of Prototheca species and their auxanographic phenotypes

PubMed Central

Ewing, Aren; Brubaker, Shane; Somanchi, Aravind; Yu, Esther; Rudenko, George; Reyes, Nina; Espina, Karen; Grossman, Arthur; Franklin, Scott

2014-01-01

Because algae have become more accepted as sources of human nutrition, phylogenetic analysis can help resolve the taxonomy of taxa that have not been well studied. This can help establish algal evolutionary relationships. Here, we compare Auxenochlorella protothecoides and 23 strains of Prototheca based on their complete 16S and partial 23S plastid rDNA sequences along with nutrient utilization (auxanographic) profiles. These data demonstrate that some of the species groupings are not in agreement with the molecular phylogenetic analyses and that auxanographic profiles are poor predictors of phylogenetic relationships. PMID:25937672
Preliminary Classification of Novel Hemorrhagic Fever-Causing Viruses Using Sequence-Based PAirwise Sequence Comparison (PASC) Analysis.

PubMed

Bào, Yīmíng; Kuhn, Jens H

2018-01-01

During the last decade, genome sequence-based classification of viruses has become increasingly prominent. Viruses can even be classified based on coding-complete genome sequence data alone. Nevertheless, classification remains arduous as experts are required to establish phylogenetic trees to depict the evolutionary relationships of such sequences for preliminary taxonomic placement. Pairwise sequence comparison (PASC) of genomes is one of several novel methods for establishing relationships among viruses. This method, provided by the US National Center for Biotechnology Information as an open-access tool, circumvents phylogenetics, and yet PASC results are often in agreement with those of phylogenetic analyses. Computationally inexpensive, PASC can be easily performed by non-taxonomists. Here we describe how to use the PASC tool for the preliminary classification of novel viral hemorrhagic fever-causing viruses.
BigFoot: Bayesian alignment and phylogenetic footprinting with MCMC

PubMed Central

Satija, Rahul; Novák, Ádám; Miklós, István; Lyngsø, Rune; Hein, Jotun

2009-01-01

Background: We have previously combined statistical alignment and phylogenetic footprinting to detect conserved functional elements without assuming a fixed alignment. Considering a probability-weighted distribution of alignments removes sensitivity to alignment errors, properly accommodates regions of alignment uncertainty, and increases the accuracy of functional element prediction. Our method utilized standard dynamic programming hidden Markov model algorithms to analyze up to four sequences. Results: We present a novel approach, implemented in the software package BigFoot, for performing phylogenetic footprinting on greater numbers of sequences. We have developed a Markov chain Monte Carlo (MCMC) approach which samples both sequence alignments and locations of slowly evolving regions. We implement our method as an extension of the existing StatAlign software package and test it on well-annotated regions controlling the expression of the even-skipped gene in Drosophila and the α-globin gene in vertebrates. The results exhibit how adding additional sequences to the analysis has the potential to improve the accuracy of functional predictions, and demonstrate how BigFoot outperforms existing alignment-based phylogenetic footprinting techniques. Conclusion: BigFoot extends a combined alignment and phylogenetic footprinting approach to analyze larger amounts of sequence data using MCMC. Our approach is robust to alignment error and uncertainty and can be applied to a variety of biological datasets. The source code and documentation are publicly available for download from PMID:19715598
BigFoot: Bayesian alignment and phylogenetic footprinting with MCMC.

PubMed

Satija, Rahul; Novák, Adám; Miklós, István; Lyngsø, Rune; Hein, Jotun

2009-08-28

We have previously combined statistical alignment and phylogenetic footprinting to detect conserved functional elements without assuming a fixed alignment. Considering a probability-weighted distribution of alignments removes sensitivity to alignment errors, properly accommodates regions of alignment uncertainty, and increases the accuracy of functional element prediction. Our method utilized standard dynamic programming hidden Markov model algorithms to analyze up to four sequences. We present a novel approach, implemented in the software package BigFoot, for performing phylogenetic footprinting on greater numbers of sequences. We have developed a Markov chain Monte Carlo (MCMC) approach which samples both sequence alignments and locations of slowly evolving regions. We implement our method as an extension of the existing StatAlign software package and test it on well-annotated regions controlling the expression of the even-skipped gene in Drosophila and the alpha-globin gene in vertebrates. The results exhibit how adding additional sequences to the analysis has the potential to improve the accuracy of functional predictions, and demonstrate how BigFoot outperforms existing alignment-based phylogenetic footprinting techniques. BigFoot extends a combined alignment and phylogenetic footprinting approach to analyze larger amounts of sequence data using MCMC. Our approach is robust to alignment error and uncertainty and can be applied to a variety of biological datasets. The source code and documentation are publicly available for download from http://www.stats.ox.ac.uk/~satija/BigFoot/
RNA-Seq based phylogeny recapitulates previous phylogeny of the genus Flaveria (Asteraceae) with some modifications.

PubMed

Lyu, Ming-Ju Amy; Gowik, Udo; Kelly, Steve; Covshoff, Sarah; Mallmann, Julia; Westhoff, Peter; Hibberd, Julian M; Stata, Matt; Sage, Rowan F; Lu, Haorong; Wei, Xiaofeng; Wong, Gane Ka-Shu; Zhu, Xin-Guang

2015-06-18

The genus Flaveria has been extensively used as a model to study the evolution of C4 photosynthesis as it contains C3 and C4 species as well as a number of species that exhibit intermediate types of photosynthesis. The current phylogenetic tree of the genus Flaveria contains 21 of the 23 known Flaveria species and has been previously constructed using a combination of morphological data and three non-coding DNA sequences (nuclear encoded ETS, ITS and chloroplast encoded trnL-F). Here we developed a new strategy to update the phylogenetic tree of 16 Flaveria species based on RNA-Seq data. The updated phylogeny is largely congruent with the previously published tree but with some modifications. We propose that the data collection method provided in this study can be used as a generic method for phylogenetic tree reconstruction if the target species has no genomic information. We also showed that a "F. pringlei" genotype recently used in a number of labs may be a hybrid between F. pringlei (C3) and F. angustifolia (C3-C4). We propose that the new strategy of obtaining phylogenetic sequences outlined in this study can be used to construct robust trees in a larger number of taxa. The updated Flaveria phylogenetic tree also supports a hypothesis of stepwise and parallel evolution of C4 photosynthesis in the Flaveria clade.
Detecting Network Communities: An Application to Phylogenetic Analysis

PubMed Central

Andrade, Roberto F. S.; Rocha-Neto, Ivan C.; Santos, Leonardo B. L.; de Santana, Charles N.; Diniz, Marcelo V. C.; Lobão, Thierry Petit; Goés-Neto, Aristóteles; Pinho, Suani T. R.; El-Hani, Charbel N.

2011-01-01

This paper proposes a new method to identify communities in generally weighted complex networks and applies it to phylogenetic analysis. In this case, weights correspond to the similarity indexes among protein sequences, which can be used for network construction so that the network structure can be analyzed to recover phylogenetically useful information from its properties. The analyses discussed here are mainly based on the modular character of protein similarity networks, explored through the Newman-Girvan algorithm, with the help of the neighborhood matrix. The most relevant networks are found when the network topology changes abruptly, revealing distinct modules related to the sets of organisms to which the proteins belong. Sound biological information can be retrieved by the computational routines used in the network approach, without using biological assumptions other than those incorporated by BLAST. Usually, all the main bacterial phyla and, in some cases, also some bacterial classes corresponded totally (100%) or to a great extent (>70%) to the modules. We checked for internal consistency in the obtained results, and we scored close to 84% of matches for community pertinence when comparisons between the results were performed. To illustrate how to use the network-based method, we employed data for enzymes involved in the chitin metabolic pathway that are present in more than 100 organisms from an original data set containing 1,695 organisms, downloaded from GenBank on May 19, 2007. A preliminary comparison between the outcomes of the network-based method and the results of methods based on Bayesian, distance, likelihood, and parsimony criteria suggests that the former is as reliable as these commonly used methods. We conclude that the network-based method can be used as a powerful tool for retrieving modularity information from weighted networks, which is useful for phylogenetic analysis. PMID:21573202
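The idea of building a network from pairwise similarity scores and watching how its structure changes with the similarity threshold can be sketched in a few lines. Note the substitution: this sketch uses plain connected components as a stand-in for modules, not the Newman-Girvan algorithm or the neighborhood matrix used in the paper, and the similarity values and function name are invented for illustration.

```python
def components_at_threshold(sim, threshold):
    """Connected components of the graph linking i and j when sim[i][j] >= threshold.

    `sim` is a symmetric matrix (list of lists) of pairwise similarity scores.
    Components are a simplified stand-in for the modules discussed above.
    """
    n = len(sim)
    seen = set()
    components = []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], []
        while stack:                      # depth-first search from `start`
            u = stack.pop()
            comp.append(u)
            for v in range(n):
                if v != u and v not in seen and sim[u][v] >= threshold:
                    seen.add(v)
                    stack.append(v)
        components.append(sorted(comp))
    return components


# Hypothetical similarity scores (percent) among three protein sequences.
sim = [
    [100, 90, 20],
    [90, 100, 30],
    [20, 30, 100],
]
for threshold in (10, 50, 95):
    print(threshold, components_at_threshold(sim, threshold))
```

Scanning the threshold and recording where the number of components jumps gives a rough analogue of the abrupt topology changes that the abstract identifies as marking the most informative networks.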
Using phylogenetic probes for quantification of stable isotope labeling and microbial community analysis

DOEpatents

Brodie, Eoin L; DeSantis, Todd Z; Karaoz, Ulas; Andersen, Gary L

2014-12-09

Herein are described methods providing a high-sensitivity means of measuring the incorporation of stable isotope labeled substrates into RNA following stable isotope probing (SIP) experiments. RNA is hybridized to a set of probes, such as phylogenetic microarrays, and isotope incorporation is quantified, for example by secondary ion mass spectrometer imaging (NanoSIMS).
Characterization of the complete mitochondrial genome of the hybrid Epinephelus moara♀ × Epinephelus lanceolatus♂, and phylogenetic analysis in subfamily epinephelinae

NASA Astrophysics Data System (ADS)

Gao, Fengtao; Wei, Min; Zhu, Ying; Guo, Hua; Chen, Songlin; Yang, Guanpin

2017-06-01

This study presents the complete mitochondrial genome of the hybrid Epinephelus moara♀× Epinephelus lanceolatus♂. The genome is 16886 bp in length, and contains 13 protein-coding genes, 2 rRNA genes, 22 tRNA genes, a light-strand replication origin and a control region. Additionally, phylogenetic analysis based on the nucleotide sequences of 13 conserved protein-coding genes using the maximum likelihood method indicated that the mitochondrial genome is maternally inherited. This study presents genomic data for studying phylogenetic relationships and breeding of hybrid Epinephelinae.
Synthesis of phylogeny and taxonomy into a comprehensive tree of life

PubMed Central

Hinchliff, Cody E.; Smith, Stephen A.; Allman, James F.; Burleigh, J. Gordon; Chaudhary, Ruchi; Coghill, Lyndon M.; Crandall, Keith A.; Deng, Jiabin; Drew, Bryan T.; Gazis, Romina; Gude, Karl; Hibbett, David S.; Katz, Laura A.; Laughinghouse, H. Dail; McTavish, Emily Jane; Midford, Peter E.; Owen, Christopher L.; Ree, Richard H.; Rees, Jonathan A.; Soltis, Douglas E.; Williams, Tiffani; Cranston, Karen A.

2015-01-01

Reconstructing the phylogenetic relationships that unite all lineages (the tree of life) is a grand challenge. The paucity of homologous character data across disparately related lineages currently renders direct phylogenetic inference untenable. To reconstruct a comprehensive tree of life, we therefore synthesized published phylogenies, together with taxonomic classifications for taxa never incorporated into a phylogeny. We present a draft tree containing 2.3 million tips—the Open Tree of Life. Realization of this tree required the assembly of two additional community resources: (i) a comprehensive global reference taxonomy and (ii) a database of published phylogenetic trees mapped to this taxonomy. Our open source framework facilitates community comment and contribution, enabling the tree to be continuously updated when new phylogenetic and taxonomic data become digitally available. Although data coverage and phylogenetic conflict across the Open Tree of Life illuminate gaps in both the underlying data available for phylogenetic reconstruction and the publication of trees as digital objects, the tree provides a compelling starting point for community contribution. This comprehensive tree will fuel fundamental research on the nature of biological diversity, ultimately providing up-to-date phylogenies for downstream applications in comparative biology, ecology, conservation biology, climate change, agriculture, and genomics. PMID:26385966
A new phylogenetic diversity measure generalizing the shannon index and its application to phyllostomid bats.

PubMed

Allen, Benjamin; Kon, Mark; Bar-Yam, Yaneer

2009-08-01

Protecting biodiversity involves preserving the maximum number and abundance of species while giving special attention to species with unique genetic or morphological characteristics. In balancing different priorities, conservation policymakers may consider quantitative measures that compare diversity across ecological communities. To serve this purpose, a measure should increase or decrease with changes in community composition in a way that reflects what is valued, including species richness, evenness, and distinctness. However, counterintuitively, studies have shown that established indices, including those that emphasize average interspecies phylogenetic distance, may increase with the elimination of species. We introduce a new diversity index, the phylogenetic entropy, which generalizes in a natural way the Shannon index to incorporate species relatedness. Phylogenetic entropy favors communities in which highly distinct species are more abundant, but it does not advocate decreasing any species proportion below a community structure-dependent threshold. We contrast the behavior of multiple indices on a community of phyllostomid bats in the Selva Lacandona. The optimal genus distribution for phylogenetic entropy populates all genera in a linear relationship to their total phylogenetic distance to other genera. Two other indices favor eliminating 12 out of the 23 genera.
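As commonly stated, the phylogenetic entropy sums, over every branch b of a rooted tree, the branch length times -p(b) ln p(b), where p(b) is the proportional abundance of all individuals descending from b. The sketch below follows that reading; the tree encoding (a `parent` dictionary of node to parent and branch length) and the function name are assumptions for illustration, not the authors' code.

```python
import math
from collections import defaultdict


def phylogenetic_entropy(abundance, parent):
    """H_P = -sum over branches b of length(b) * p(b) * ln p(b).

    `abundance` maps leaf names to counts; `parent` maps each non-root node
    to a (parent_name, branch_length) pair.
    """
    total = sum(abundance.values())
    p = defaultdict(float)
    for leaf, count in abundance.items():
        node = leaf
        while node in parent:             # add this leaf's share to every ancestral branch
            p[node] += count / total
            node = parent[node][0]
    return -sum(parent[b][1] * pb * math.log(pb) for b, pb in p.items() if pb > 0)


# On a star tree with unit branch lengths the index reduces to the Shannon index.
star = {"sp1": ("root", 1.0), "sp2": ("root", 1.0)}
print(phylogenetic_entropy({"sp1": 5, "sp2": 5}, star))   # equals ln 2
print(math.log(2))
```

Lengthening the branch of one species in this example raises its weight in the sum, which is how the index rewards abundance in highly distinct lineages.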
Synthesis of phylogeny and taxonomy into a comprehensive tree of life.

PubMed

Hinchliff, Cody E; Smith, Stephen A; Allman, James F; Burleigh, J Gordon; Chaudhary, Ruchi; Coghill, Lyndon M; Crandall, Keith A; Deng, Jiabin; Drew, Bryan T; Gazis, Romina; Gude, Karl; Hibbett, David S; Katz, Laura A; Laughinghouse, H Dail; McTavish, Emily Jane; Midford, Peter E; Owen, Christopher L; Ree, Richard H; Rees, Jonathan A; Soltis, Douglas E; Williams, Tiffani; Cranston, Karen A

2015-10-13

Reconstructing the phylogenetic relationships that unite all lineages (the tree of life) is a grand challenge. The paucity of homologous character data across disparately related lineages currently renders direct phylogenetic inference untenable. To reconstruct a comprehensive tree of life, we therefore synthesized published phylogenies, together with taxonomic classifications for taxa never incorporated into a phylogeny. We present a draft tree containing 2.3 million tips, the Open Tree of Life. Realization of this tree required the assembly of two additional community resources: (i) a comprehensive global reference taxonomy and (ii) a database of published phylogenetic trees mapped to this taxonomy. Our open source framework facilitates community comment and contribution, enabling the tree to be continuously updated when new phylogenetic and taxonomic data become digitally available. Although data coverage and phylogenetic conflict across the Open Tree of Life illuminate gaps in both the underlying data available for phylogenetic reconstruction and the publication of trees as digital objects, the tree provides a compelling starting point for community contribution. This comprehensive tree will fuel fundamental research on the nature of biological diversity, ultimately providing up-to-date phylogenies for downstream applications in comparative biology, ecology, conservation biology, climate change, agriculture, and genomics.
Morphometric study of phylogenetic and ecologic signals in procyonid (mammalia: carnivora) endocasts.

PubMed

Ahrens, Heather E

2014-12-01

Endocasts provide a proxy for brain morphology but are rarely incorporated in phylogenetic analyses despite the potential for new suites of characters. The phylogeny of Procyonidae, a carnivoran family with relatively limited taxonomic diversity, is not well resolved because morphological and molecular data yield conflicting topologies. The presence of phylogenetic and ecologic signals in the endocasts of procyonids will be determined using three-dimensional geometric morphometrics. Endocasts of seven ingroup species and four outgroup species were digitally rendered and 21 landmarks were collected from the endocast surface. Two phylogenetic hypotheses of Procyonidae will be examined using methods testing for phylogenetic signal in morphometric data. In analyses of all taxa, there is significant phylogenetic signal in brain shape for both the morphological and molecular topologies. However, the analyses of ingroup taxa recover a significant phylogenetic signal for the morphological topology only. These results indicate support for the molecular outgroup topology, but not the ingroup topology given the brain shape data. Further examination of brain shape using principal components analysis and wireframe comparisons suggests procyonids possess more developed areas of the brain associated with motor control, spatial perception, and balance relative to the basal musteloid condition. Within Procyonidae, similar patterns of variation are present, and may be associated with increased arboreality in certain taxa. Thus, brain shape derived from endocasts may be used to test for phylogenetic signal and preliminary analyses suggest an association with behavior and ecology. © 2014 Wiley Periodicals, Inc.
Sites Inferred by Metabolic Background Assertion Labeling (SIMBAL): adapting the Partial Phylogenetic Profiling algorithm to scan sequences for signatures that predict protein function

PubMed Central

2010-01-01

Background Comparative genomics methods such as phylogenetic profiling can mine powerful inferences from inherently noisy biological data sets. We introduce Sites Inferred by Metabolic Background Assertion Labeling (SIMBAL), a method that applies the Partial Phylogenetic Profiling (PPP) approach locally within a protein sequence to discover short sequence signatures associated with functional sites. The approach is based on the basic scoring mechanism employed by PPP, namely the use of binomial distribution statistics to optimize sequence similarity cutoffs during searches of partitioned training sets. Results Here we illustrate and validate the ability of the SIMBAL method to find functionally relevant short sequence signatures by application to two well-characterized protein families. In the first example, we partitioned a family of ABC permeases using a metabolic background property (urea utilization). Thus, the TRUE set for this family comprised members whose genome of origin encoded a urea utilization system. By moving a sliding window across the sequence of a permease, and searching each subsequence in turn against the full set of partitioned proteins, the method found which local sequence signatures best correlated with the urea utilization trait. Mapping of SIMBAL "hot spots" onto crystal structures of homologous permeases reveals that the significant sites are gating determinants on the cytosolic face rather than, say, docking sites for the substrate-binding protein on the extracellular face. In the second example, we partitioned a protein methyltransferase family using gene proximity as a criterion. In this case, the TRUE set comprised those methyltransferases encoded near the gene for the substrate RF-1. SIMBAL identifies sequence regions that map onto the substrate-binding interface while ignoring regions involved in the methyltransferase reaction mechanism in general. 
Neither method for training set construction requires any prior experimental characterization. Conclusions SIMBAL shows that, in functionally divergent protein families, selected short sequences often significantly outperform their full-length parent sequence for making functional predictions by sequence similarity, suggesting avenues for improved functional classifiers. When combined with structural data, SIMBAL affords the ability to localize and model functional sites. PMID:20102603
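
The sliding-window scan with binomial scoring described in this abstract can be sketched as follows. This is a minimal illustration, not the SIMBAL implementation: it assumes the training sequences are already aligned to the query and uses a plain identity count where the real method runs sequence similarity searches. The names `binom_tail`, `window_score` and `scan` are hypothetical.

```python
from math import comb, log10


def binom_tail(true_hits, depth, p_true):
    """P(X >= true_hits) for X ~ Binomial(depth, p_true)."""
    return sum(
        comb(depth, i) * p_true**i * (1 - p_true) ** (depth - i)
        for i in range(true_hits, depth + 1)
    )


def window_score(query, training, start, width):
    """Smallest binomial tail probability over all similarity cutoffs for one window.

    training is a list of (aligned_sequence, is_true) pairs (an assumed layout).
    Similarity is a simple identity count, a stand-in for a real search score.
    """
    p_true = sum(1 for _, flag in training if flag) / len(training)
    sub = query[start:start + width]
    scored = sorted(
        ((sum(a == b for a, b in zip(sub, seq[start:start + width])), flag)
         for seq, flag in training),
        key=lambda item: -item[0],
    )
    best, true_hits = 1.0, 0
    for depth, (score, flag) in enumerate(scored, start=1):
        true_hits += flag
        # Only evaluate a cutoff where the similarity score changes, so that
        # tied hits are never split arbitrarily.
        if depth == len(scored) or scored[depth][0] != score:
            best = min(best, binom_tail(true_hits, depth, p_true))
    return best


def scan(query, training, width):
    """Return (window_start, -log10 p) for every window across the query."""
    return [
        (start, -log10(window_score(query, training, start, width)))
        for start in range(len(query) - width + 1)
    ]
```

Windows whose best cutoff enriches strongly for TRUE-set members receive high scores, which corresponds to the "hot spots" mentioned in the abstract. The toy scoring here ignores gaps and substitution matrices.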
Improving phylogenetic analyses by incorporating additional information from genetic sequence databases.

PubMed

Liang, Li-Jung; Weiss, Robert E; Redelings, Benjamin; Suchard, Marc A

2009-10-01

Statistical analyses of phylogenetic data culminate in uncertain estimates of underlying model parameters. Lack of additional data hinders the ability to reduce this uncertainty, as the original phylogenetic dataset is often complete, containing the entire gene or genome information available for the given set of taxa. Informative priors in a Bayesian analysis can reduce posterior uncertainty; however, publicly available phylogenetic software specifies vague priors for model parameters by default. We build objective and informative priors using hierarchical random effect models that combine additional datasets whose parameters are not of direct interest but are similar to the analysis of interest. We propose principled statistical methods that permit more precise parameter estimates in phylogenetic analyses by creating informative priors for parameters of interest. Using additional sequence datasets from our lab or public databases, we construct a fully Bayesian semiparametric hierarchical model to combine datasets. A dynamic iteratively reweighted Markov chain Monte Carlo algorithm conveniently recycles posterior samples from the individual analyses. We demonstrate the value of our approach by examining the insertion-deletion (indel) process in the enolase gene across the Tree of Life using the phylogenetic software BALI-PHY; we incorporate prior information about indels from 82 curated alignments downloaded from the BAliBASE database.
Are Diet Preferences Associated to Skulls Shape Diversification in Xenodontine Snakes?

PubMed Central

Klaczko, Julia; Sherratt, Emma; Setz, Eleonore Z. F.

2016-01-01

Snakes are a highly successful group of vertebrates, with great diversity in habitat, diet, and morphology. The unique adaptations of the snake skull for ingesting large prey in more primitive macrostomatan snakes have been well documented. However, subsequent diversification in snake cranial shape in relation to dietary specializations has rarely been studied (e.g. piscivory in natricine snakes). Here we examine a large clade of snakes with a broad spectrum of diet preferences to test if diet preferences are correlated with shape variation in snake skulls. Specifically, we studied the Xenodontinae snakes, a speciose clade of South American snakes, which show a broad range of diets including invertebrates, amphibians, snakes, lizards, and small mammals. We characterized the skull morphology of 19 species of xenodontine snakes using geometric morphometric techniques, and used phylogenetic comparative methods to test the association between diet and skull morphology. Using phylogenetic partial least squares analysis (PPLS) we show that skull morphology is highly associated with diet preferences in xenodontine snakes. PMID:26886549

Analyses of the radiation of birnaviruses from diverse host phyla and of their evolutionary affinities with other double-stranded RNA and positive strand RNA viruses using robust structure-based multiple sequence alignments and advanced phylogenetic methods

PubMed Central

2013-01-01

Background Birnaviruses form a distinct family of double-stranded RNA viruses infecting animals as different as vertebrates, mollusks, insects and rotifers. With such a wide host range, they constitute a good model for studying the adaptation to the host. Additionally, several lines of evidence link birnaviruses to positive strand RNA viruses and suggest that phylogenetic analyses may provide clues about transition. Results We characterized the genome of a birnavirus from the rotifer Brachionus plicatilis. We used X-ray structures of RNA-dependent RNA polymerases and capsid proteins to obtain multiple structure alignments that allowed us to obtain reliable multiple sequence alignments and we employed “advanced” phylogenetic methods to study the evolutionary relationships between some positive strand and double-stranded RNA viruses. We showed that the rotifer birnavirus genome exhibited an organization remarkably similar to other birnaviruses. As this host was phylogenetically very distant from the other known species targeted by birnaviruses, we revisited the evolutionary pathways within the Birnaviridae family using phylogenetic reconstruction methods. We also applied a number of phylogenetic approaches based on structurally conserved domains/regions of the capsid and RNA-dependent RNA polymerase proteins to study the evolutionary relationships between birnaviruses, other double-stranded RNA viruses and positive strand RNA viruses. Conclusions We show that there is a good correlation between the phylogeny of the birnaviruses and that of their hosts at the phylum level using the RNA-dependent RNA polymerase (genomic segment B) on the one hand and a concatenation of the capsid protein, protease and ribonucleoprotein (genomic segment A) on the other hand. This correlation tends to vanish within phyla.
The use of advanced phylogenetic methods and robust structure-based multiple sequence alignments allowed us to obtain a more accurate picture (in terms of probability of the tree topologies) of the evolutionary affinities between double-stranded RNA and positive strand RNA viruses. In particular, we were able to show that there is good statistical support for the claims that dsRNA viruses are not monophyletic and that viruses with permuted RdRps belong to a common evolutionary lineage, as previously proposed by other groups. We also propose a tree topology with good statistical support describing the evolutionary relationships between the Picornaviridae, Caliciviridae, Flaviviridae families and a group including the Alphatetraviridae, Nodaviridae, Permutotetraviridae, Birnaviridae, and Cystoviridae families. PMID:23865988
Phylogenomics of Phrynosomatid Lizards: Conflicting Signals from Sequence Capture versus Restriction Site Associated DNA Sequencing

PubMed Central

Leaché, Adam D.; Chavez, Andreas S.; Jones, Leonard N.; Grummer, Jared A.; Gottscho, Andrew D.; Linkem, Charles W.

2015-01-01

Sequence capture and restriction site associated DNA sequencing (RADseq) are popular methods for obtaining large numbers of loci for phylogenetic analysis. These methods are typically used to collect data at different evolutionary timescales; sequence capture is primarily used for obtaining conserved loci, whereas RADseq is designed for discovering single nucleotide polymorphisms (SNPs) suitable for population genetic or phylogeographic analyses. Phylogenetic questions that span both “recent” and “deep” timescales could benefit from either type of data, but studies that directly compare the two approaches are lacking. We compared phylogenies estimated from sequence capture and double digest RADseq (ddRADseq) data for North American phrynosomatid lizards, a species-rich and diverse group containing nine genera that began diversifying approximately 55 Ma. Sequence capture resulted in 584 loci that provided a consistent and strong phylogeny using concatenation and species tree inference. However, the phylogeny estimated from the ddRADseq data was sensitive to the bioinformatics steps used for determining homology, detecting paralogs, and filtering missing data. The topological conflicts among the SNP trees were not restricted to any particular timescale, but instead were associated with short internal branches. Species tree analysis of the largest SNP assembly, which also included the most missing data, supported a topology that matched the sequence capture tree. This preferred phylogeny provides strong support for the paraphyly of the earless lizard genera Holbrookia and Cophosaurus, suggesting that the earless morphology either evolved twice or evolved once and was subsequently lost in Callisaurus. PMID:25663487
MaxAlign: maximizing usable data in an alignment.

PubMed

Gouveia-Oliveira, Rodrigo; Sackett, Peter W; Pedersen, Anders G

2007-08-28

The presence of gaps in an alignment of nucleotide or protein sequences is often an inconvenience for bioinformatical studies. In phylogenetic and other analyses, for instance, gapped columns are often discarded entirely from the alignment. MaxAlign is a program that optimizes the alignment prior to such analyses. Specifically, it maximizes the number of nucleotide (or amino acid) symbols that are present in gap-free columns - the alignment area - by selecting the optimal subset of sequences to exclude from the alignment. MaxAlign can be used prior to phylogenetic and bioinformatical analyses as well as in other situations where this form of alignment improvement is useful. In this work we test MaxAlign's performance in these tasks and compare the accuracy of phylogenetic estimates including and excluding gapped columns from the analysis, with and without processing with MaxAlign. In this paper we also introduce a new simple measure of tree similarity, Normalized Symmetric Similarity (NSS) that we consider useful for comparing tree topologies. We demonstrate how MaxAlign is helpful in detecting misaligned or defective sequences without requiring manual inspection. We also show that it is not advisable to exclude gapped columns from phylogenetic analyses unless MaxAlign is used first. Finally, we find that the sequences removed by MaxAlign from an alignment tend to be those that would otherwise be associated with low phylogenetic accuracy, and that the presence of gaps in any given sequence does not seem to disturb the phylogenetic estimates of other sequences. The MaxAlign web-server is freely available online at http://www.cbs.dtu.dk/services/MaxAlign where supplementary information can also be found. The program is also freely available as a Perl stand-alone package.
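
The "alignment area" objective described in this abstract can be illustrated with a short sketch. The abstract does not describe how MaxAlign searches for the optimal subset, so the greedy removal loop below is an assumed stand-in heuristic, and the function names are hypothetical.

```python
def alignment_area(seqs, gap="-"):
    """Number of symbols in gap-free columns: gap-free columns times sequences."""
    if not seqs:
        return 0
    gap_free_columns = sum(1 for column in zip(*seqs) if gap not in column)
    return gap_free_columns * len(seqs)


def greedy_exclude(alignment):
    """Greedily drop the sequence whose removal most increases the alignment area.

    alignment maps sequence name -> aligned sequence (all the same length).
    This is a simple heuristic, not necessarily the search MaxAlign performs.
    Returns (kept_alignment, removed_names).
    """
    kept = dict(alignment)
    removed = []
    while len(kept) > 1:
        best_name, best_area = None, alignment_area(list(kept.values()))
        for name in kept:
            rest = [seq for other, seq in kept.items() if other != name]
            area = alignment_area(rest)
            if area > best_area:
                best_name, best_area = name, area
        if best_name is None:
            break  # no single removal improves the area
        removed.append(best_name)
        del kept[best_name]
    return kept, removed
```

A greedy pass can miss cases where two sequences must be removed together before the area improves, so an exhaustive or smarter search may be preferable for real alignments.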
The morphological state space revisited: what do phylogenetic patterns in homoplasy tell us about the number of possible character states?

PubMed Central

Hoyal Cuthill, Jennifer F.

2015-01-01

Biological variety and major evolutionary transitions suggest that the space of possible morphologies may have varied among lineages and through time. However, most models of phylogenetic character evolution assume that the potential state space is finite. Here, I explore what the morphological state space might be like, by analysing trends in homoplasy (repeated derivation of the same character state). Analyses of ten published character matrices are compared against computer simulations with different state space models: infinite states, finite states, ordered states and an ‘inertial’ model, simulating phylogenetic constraints. Of these, only the infinite states model results in evolution without homoplasy, a prediction which is not generally met by real phylogenies. Many authors have interpreted the ubiquity of homoplasy as evidence that the number of evolutionary alternatives is finite. However, homoplasy is also predicted by phylogenetic constraints on the morphological distance that can be traversed between ancestor and descendant. Phylogenetic rarefaction (sub-sampling) shows that finite and inertial state spaces do produce contrasting trends in the distribution of homoplasy. Two clades show trends characteristic of phylogenetic inertia, with decreasing homoplasy (increasing consistency index) as we sub-sample more distantly related taxa. One clade shows increasing homoplasy, suggesting exhaustion of finite states. Different clades may, therefore, show different patterns of character evolution. However, when parsimony uninformative characters are excluded (which may occur without documentation in cladistic studies), it may no longer be possible to distinguish inertial and finite state spaces. Interestingly, inertial models predict that homoplasy should be clustered among comparatively close relatives (parallel evolution), whereas finite state models do not.
If morphological evolution is often inertial in nature, then homoplasy (false homology) may primarily occur between close relatives, perhaps being replaced by functional analogy at higher taxonomic scales. PMID:26640650
Limited overlap between phylogenetic HIV and hepatitis C virus clusters illustrates the dynamic sexual network structure of Dutch HIV-infected MSM.

PubMed

Vanhommerig, Joost W; Bezemer, Daniela; Molenkamp, Richard; Van Sighem, Ard I; Smit, Colette; Arends, Joop E; Lauw, Fanny N; Brinkman, Kees; Rijnders, Bart J; Newsum, Astrid M; Bruisten, Sylvia M; Prins, Maria; Van Der Meer, Jan T; Van De Laar, Thijs J; Schinkel, Janke

2017-09-24

MSM are at increased risk for infection with HIV-1 and hepatitis C virus (HCV). Is HIV/HCV coinfection confined to specific HIV transmission networks? An HIV phylogenetic tree was constructed for 5038 HIV-1 subtype B polymerase (pol) sequences obtained from MSM in the AIDS therapy evaluation in the Netherlands cohort. We investigated the existence of HIV clusters with increased HCV prevalence, the HIV phylogenetic density (i.e. the number of potential HIV transmission partners) of HIV/HCV-coinfected MSM compared with HIV-infected MSM without HCV, and the overlap in HIV and HCV phylogenies using HCV nonstructural protein 5B sequences from 183 HIV-infected MSM with acute HCV infection. Five hundred and sixty-three of 5038 (11.2%) HIV-infected MSM tested HCV positive. Phylogenetic analysis revealed 93 large HIV clusters (≥10 MSM), 370 small HIV clusters (2-9 MSM), and 867 singletons with a median HCV prevalence of 11.5, 11.6, and 9.3%, respectively. We identified six large HIV clusters with elevated HCV prevalence (range 23.5-46.2%). Median HIV phylogenetic densities for MSM with HCV (3, interquartile range 1-7) and without HCV (3, interquartile range 1-8) were similar. HCV phylogeny showed 12 MSM-specific HCV clusters (cluster size: 2-39 HCV sequences); 12.7% of HCV infections were part of the same HIV and HCV cluster. We observed few HIV clusters with elevated HCV prevalence, no increase in the HIV phylogenetic density of HIV/HCV-coinfected MSM compared to HIV-infected MSM without HCV, and limited overlap between HIV and HCV phylogenies among HIV/HCV-coinfected MSM. Our data do not support the existence of MSM-specific sexual networks that fuel both the HIV and HCV epidemic.
Soft-tissue anatomy of the extant hominoids: a review and phylogenetic analysis.

PubMed

Gibbs, S; Collard, M; Wood, B

2002-01-01

This paper reports the results of a literature search for information about the soft-tissue anatomy of the extant non-human hominoid genera, Pan, Gorilla, Pongo and Hylobates, together with the results of a phylogenetic analysis of these data plus comparable data for Homo. Information on the four extant non-human hominoid genera was located for 240 out of the 1783 soft-tissue structures listed in the Nomina Anatomica. Numerically these data are biased so that information about some systems (e.g. muscles) and some regions (e.g. the forelimb) are over-represented, whereas other systems and regions (e.g. the veins and the lymphatics of the vascular system, the head region) are either under-represented or not represented at all. Screening to ensure that the data were suitable for use in a phylogenetic analysis reduced the number of eligible soft-tissue structures to 171. These data, together with comparable data for modern humans, were converted into discontinuous character states suitable for phylogenetic analysis and then used to construct a taxon-by-character matrix. This matrix was used in two tests of the hypothesis that soft-tissue characters can be relied upon to reconstruct hominoid phylogenetic relationships. In the first, parsimony analysis was used to identify cladograms requiring the smallest number of character state changes. In the second, the phylogenetic bootstrap was used to determine the confidence intervals of the most parsimonious clades. The parsimony analysis yielded a single most parsimonious cladogram that matched the molecular cladogram. Similarly the bootstrap analysis yielded clades that were compatible with the molecular cladogram; a (Homo, Pan) clade was supported by 95% of the replicates, and a (Gorilla, Pan, Homo) clade by 96%. These are the first hominoid morphological data to provide statistically significant support for the clades favoured by the molecular evidence.
Phylogenic study of Lemnoideae (duckweeds) through complete chloroplast genomes for eight accessions.

PubMed

Ding, Yanqiang; Fang, Yang; Guo, Ling; Li, Zhidan; He, Kaize; Zhao, Yun; Zhao, Hai

2017-01-01

Phylogenetic relationships among the genera of Lemnoideae, a group of small aquatic monocotyledonous plants, have not been well resolved using either morphological characters or traditional markers. Given that the rich genetic information in chloroplast genomes makes them particularly useful for phylogenetic studies, we used chloroplast genomes to clarify the phylogeny within Lemnoideae. DNA was sequenced with next-generation sequencing. The duckweed chloroplast genomes were either filtered indirectly from the total DNA data or obtained directly from chloroplast DNA data. To test the reliability of assembling the chloroplast genome based on the filtration of the total DNA, two methods were used to assemble the chloroplast genome of Landoltia punctata strain ZH0202. A phylogenetic tree was built on the basis of the whole chloroplast genome sequences using MrBayes v.3.2.6 and PhyML 3.0. Eight complete duckweed chloroplast genomes were assembled, with lengths ranging from 165,775 bp to 171,152 bp, each containing 80 protein-coding sequences, four rRNAs, 30 tRNAs and two pseudogenes. The L. punctata strain ZH0202 chloroplast genomes assembled through the two methods were 100% identical in sequence and length. The chloroplast genome comparison demonstrated that the differences in chloroplast genome sizes among the Lemnoideae primarily resulted from variation in non-coding regions, especially from repeat sequence variation. The phylogenetic analysis demonstrated that the different genera of Lemnoideae are derived from each other in the following order: Spirodela, Landoltia, Lemna, Wolffiella, and Wolffia. This study demonstrates the potential of whole chloroplast genome DNA as an effective option for phylogenetic studies of Lemnoideae. It also shows the possibility of using chloroplast DNA data to elucidate phylogenies that have not yet been well resolved by traditional methods, even in plants other than duckweeds.
Evaluating the phylogenetic signal limit from mitogenomes, slow evolving nuclear genes, and the concatenation approach. New insights into the Lacertini radiation using fast evolving nuclear genes and species trees.

PubMed

Mendes, Joana; Harris, D James; Carranza, Salvador; Salvi, Daniele

2016-07-01

Estimating the phylogeny of lacertid lizards, and particularly the tribe Lacertini has been challenging, possibly due to the fast radiation of this group resulting in a hard polytomy. However this is still an open question, as concatenated data primarily from mitochondrial markers have been used so far whereas in a recent phylogeny based on a compilation of these data within a squamate supermatrix the basal polytomy seems to be resolved. In this study, we estimate phylogenetic relationships between all Lacertini genera using for the first time DNA sequences from five fast evolving nuclear genes (acm4, mc1r, pdc, βfib and reln) and two mitochondrial genes (nd4 and 12S). We generated a total of 529 sequences from 88 species and used Maximum Likelihood and Bayesian Inference methods based on concatenated multilocus dataset as well as a coalescent-based species tree approach with the aim of (i) shedding light on the basal relationships of Lacertini (ii) assessing the monophyly of genera which were previously questioned, and (iii) discussing differences between estimates from this and previous studies based on different markers, and phylogenetic methods. Results uncovered (i) a new phylogenetic clade formed by the monotypic genera Archaeolacerta, Zootoca, Teira and Scelarcis; and (ii) support for the monophyly of the Algyroides clade, with two sister species pairs represented by western (A. marchi and A. fitzingeri) and eastern (A. nigropunctatus and A. moreoticus) lineages. In both cases the members of these groups show peculiar morphology and very different geographical distributions, suggesting that they are relictual groups that were once diverse and widespread. They probably originated about 11-13 million years ago during early events of speciation in the tribe, and the split between their members is estimated to be only slightly older. 
This scenario may explain why mitochondrial markers (possibly saturated at higher divergence levels) or slower nuclear markers used in previous studies (likely lacking enough phylogenetic signal) failed to recover these relationships. Finally, the phylogenetic position of most remaining genera was unresolved, corroborating the hypothesis of a hard polytomy in the Lacertini phylogeny due to a fast radiation. This is in agreement with all previous studies but in sharp contrast with a recent squamate megaphylogeny. We show that the supermatrix approach may provide high support for incorrect nodes that are not supported either by original sequence data or by new data from this study. This finding suggests caution when using megaphylogenies to integrate inter-generic relationships in comparative ecological and evolutionary studies. Copyright © 2016 Elsevier Inc. All rights reserved.
Phylogenetic diversity anomaly in angiosperms between eastern Asia and eastern North America.

PubMed

Qian, Hong; Jin, Yi; Ricklefs, Robert E

2017-10-24

Although eastern Asia (EAS) and eastern North America (ENA) have similar climates, plant species richness in EAS greatly exceeds that in ENA. The degree to which this diversity difference reflects the ages of the floras or their rates of evolutionary diversification has not been quantified. Measures of species diversity that do not incorporate the ages of lineages disregard the evolutionary distinctiveness of species. In contrast, phylogenetic diversity integrates both the number of species and their history of evolutionary diversification. Here we compared species diversity and phylogenetic diversity in a large number of flowering plant (angiosperm) floras distributed across EAS and ENA, two regions with similar contemporary environments and broadly shared floristic history. After accounting for climate and sample area, we found both species diversity and phylogenetic diversity to be significantly higher in EAS than in ENA. When we controlled the number of species statistically, we found that phylogenetic diversity remained substantially higher in EAS than in ENA, although it tended to converge at high latitude. This pattern held independently for herbs, shrubs, and trees. The anomaly in species and phylogenetic diversity likely resulted from differences in regional processes, related in part to high climatic and topographic heterogeneity, and a strong monsoon climate, in EAS. The broad connection between tropical and temperate floras in southern Asia also might have played a role in creating the phylogenetic diversity anomaly.
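
To make the contrast between species richness and phylogenetic diversity concrete, here is a minimal sketch. The abstract does not say which phylogenetic diversity measure was used; the sketch assumes Faith's PD (the total branch length connecting a set of tips to the root), and the tree, node names and branch lengths are invented for illustration.

```python
def faith_pd(parent, tips):
    """Sum of branch lengths on the paths from each tip to the root.

    parent maps node -> (parent_node, branch_length); the root has no entry.
    Each branch is counted once even when several tips share it.
    """
    counted = set()
    total = 0.0
    for tip in tips:
        node = tip
        while node in parent and node not in counted:
            counted.add(node)
            up, length = parent[node]
            total += length
            node = up
    return total


# Hypothetical tree: ((A:1,B:1)X:2,(C:1.5,D:1.5)Y:1.5)root
toy_tree = {
    "A": ("X", 1.0), "B": ("X", 1.0), "X": ("root", 2.0),
    "C": ("Y", 1.5), "D": ("Y", 1.5), "Y": ("root", 1.5),
}
close_relatives = faith_pd(toy_tree, ["A", "B"])    # two sister species
distant_relatives = faith_pd(toy_tree, ["A", "C"])  # two species from different clades
```

Both toy floras hold two species, yet the one drawn from different clades spans more branch length. This is the kind of difference that can remain after species number is controlled statistically.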
Onto-phylogenetic aspect of myotomal myogenesis in Chordata.

PubMed

Kiełbówna, Leokadia; Daczewska, Małgorzata

2004-01-01

This paper presents an onto- and phylogenetic aspect of myotomal myogenesis in Chordata. A comparative analysis of early stages of myotomal myogenesis in Chordata indicates that the myogenic process in this phylum underwent evolutionary changes. The first stage of the process is myogenesis leading to development of mononucleate mature muscle cells, the most advanced stage is formation of multinucleate muscle fibres.
Multilocus phylogenetic analysis of true morels (Morchella) reveals high levels of endemics in Turkey relative to other regions of Europe

USDA-ARS's Scientific Manuscript database

The present study was conducted to better understand how the phylogenetic diversity of true morels (Morchella) in Turkey compares with species found in other regions of the world. The current research builds on our recently published survey of 10 Turkish provinces and another of the world in which D...
Phylogenetic Reconstruction as a Broadly Applicable Teaching Tool in the Biology Classroom: The Value of Data in Estimating Likely Answers

ERIC Educational Resources Information Center

Julius, Matthew L.; Schoenfuss, Heiko L.

2006-01-01

This laboratory exercise introduces students to a fundamental tool in evolutionary biology--phylogenetic inference. Students are required to create a data set via observation and through mining preexisting data sets. These student data sets are then used to develop and compare competing hypotheses of vertebrate phylogeny. The exercise uses readily…
Inferring explicit weighted consensus networks to represent alternative evolutionary histories

PubMed Central

2013-01-01

Background The advent of molecular biology techniques and constant increase in availability of genetic material have triggered the development of many phylogenetic tree inference methods. However, several reticulate evolution processes, such as horizontal gene transfer and hybridization, have been shown to blur the species evolutionary history by causing discordance among phylogenies inferred from different genes. Methods To tackle this problem, we hereby describe a new method for inferring and representing alternative (reticulate) evolutionary histories of species as an explicit weighted consensus network which can be constructed from a collection of gene trees with or without prior knowledge of the species phylogeny. Results We provide a way of building a weighted phylogenetic network for each of the following reticulation mechanisms: diploid hybridization, intragenic recombination and complete or partial horizontal gene transfer. We successfully tested our method on some synthetic and real datasets to infer the above-mentioned evolutionary events which may have influenced the evolution of many species. Conclusions Our weighted consensus network inference method allows one to infer, visualize and validate statistically major conflicting signals induced by the mechanisms of reticulate evolution. The results provided by the new method can be used to represent the inferred conflicting signals by means of explicit and easy-to-interpret phylogenetic networks. PMID:24359207
Molecular identification and phylogenetic analysis of important medicinal plant species in genus Paeonia based on rDNA-ITS, matK, and rbcL DNA barcode sequences.

PubMed

Kim, W J; Ji, Y; Choi, G; Kang, Y M; Yang, S; Moon, B C

2016-08-05

This study was performed to identify and analyze the phylogenetic relationship among four herbaceous species of the genus Paeonia, P. lactiflora, P. japonica, P. veitchii, and P. suffruticosa, using DNA barcodes. These four species, which are commonly used in traditional medicine as Paeoniae Radix and Moutan Radicis Cortex, are pharmaceutically defined in different ways in the national pharmacopoeias in Korea, Japan, and China. To authenticate the different species used in these medicines, we evaluated rDNA-internal transcribed spacers (ITS), matK and rbcL regions, which provide information capable of effectively distinguishing each species from one another. Seventeen samples were collected from different geographic regions in Korea and China, and DNA barcode regions were amplified using universal primers. Comparative analyses of these DNA barcode sequences revealed species-specific nucleotide sequences capable of discriminating the four Paeonia species. Among the entire sequences of three barcodes, marker nucleotides were identified at three positions in P. lactiflora, eleven in P. japonica, five in P. veitchii, and 25 in P. suffruticosa. Phylogenetic analyses also revealed four distinct clusters showing homogeneous clades with high resolution at the species level. The results demonstrate that the analysis of these three DNA barcode sequences is a reliable method for identifying the four Paeonia species and can be used to authenticate Paeoniae Radix and Moutan Radicis Cortex at the species level. Furthermore, based on the assessment of amplicon sizes, inter/intra-specific distances, marker nucleotides, and phylogenetic analysis, rDNA-ITS was the most suitable DNA barcode for identification of these species.
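
The idea of species-specific marker nucleotides can be sketched as a scan over aligned barcode sequences. This is an illustrative assumption about how such positions might be found, not the authors' procedure. It reports alignment columns where one species carries a single fixed base that appears in no other species. The function name and the toy sequences are hypothetical.

```python
def marker_positions(aligned, gap="-"):
    """Find diagnostic positions in aligned barcode sequences.

    aligned maps species label -> list of equal-length aligned sequences.
    Returns species label -> list of (zero_based_position, base) where the base
    is fixed within that species and absent from every other species.
    """
    length = len(next(iter(aligned.values()))[0])
    markers = {species: [] for species in aligned}
    for pos in range(length):
        column = {sp: {seq[pos] for seq in seqs} for sp, seqs in aligned.items()}
        for species, bases in column.items():
            if len(bases) != 1:
                continue  # variable within the species, so not diagnostic
            base = next(iter(bases))
            if base == gap:
                continue  # ignore alignment gaps
            if all(base not in other_bases
                   for other, other_bases in column.items() if other != species):
                markers[species].append((pos, base))
    return markers


# Toy data with made-up labels and sequences (not real Paeonia barcodes).
toy = {
    "sp1": ["ACGTA", "ACGTA"],
    "sp2": ["ACGTC", "ACGTC"],
    "sp3": ["TCGTC", "TCGTC"],
}
toy_markers = marker_positions(toy)
```

With several samples per species, requiring the base to be fixed within the species guards against reporting intraspecific variation as a marker.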
Conserved Nonexonic Elements: A Novel Class of Marker for Phylogenomics.

PubMed

Edwards, Scott V; Cloutier, Alison; Baker, Allan J

2017-11-01

Noncoding markers have a particular appeal as tools for phylogenomic analysis because, at least in vertebrates, they appear less subject to strong variation in GC content among lineages. Thus far, ultraconserved elements (UCEs) and introns have been the most widely used noncoding markers. Here we analyze and study the evolutionary properties of a new type of noncoding marker, conserved nonexonic elements (CNEEs), which consists of noncoding elements that are estimated to evolve slower than the neutral rate across a set of species. Although they often include UCEs, CNEEs are distinct from UCEs because they are not ultraconserved, and, most importantly, the core region alone is analyzed, rather than both the core and its flanking regions. Using a data set of 16 birds plus an alligator outgroup, and ∼3600-∼3800 loci per marker type, we found that although CNEEs were less variable than bioinformatically derived UCEs or introns and in some cases exhibited a slower approach to branch resolution as determined by phylogenomic subsampling, the quality of CNEE alignments was superior to those of the other markers, with fewer gaps and missing species. Phylogenetic resolution using coalescent approaches was comparable among the three marker types, with most nodes being fully and congruently resolved. Comparison of phylogenetic results across the three marker types indicated that one branch, the sister group to the passerine + falcon clade, was resolved differently and with moderate (>70%) bootstrap support between CNEEs and UCEs or introns. Overall, CNEEs appear to be promising as phylogenomic markers, yielding phylogenetic resolution as high as for UCEs and introns but with fewer gaps, less ambiguity in alignments and with patterns of nucleotide substitution more consistent with the assumptions of commonly used methods of phylogenetic analysis. © The Author(s) 2017. Published by Oxford University Press on behalf of the Systematic Biologists.
Australasian sky islands act as a diversity pump facilitating peripheral speciation and complex reversal from narrow endemic to widespread ecological supertramp

PubMed Central

Toussaint, Emmanuel F A; Sagata, Katayo; Surbakti, Suriani; Hendrich, Lars; Balke, Michael

2013-01-01

The Australasian archipelago is biologically extremely diverse as a result of a highly puzzling geological and biological evolution. Unveiling the underlying mechanisms has never been more attainable as molecular phylogenetic and geological methods improve, and has become a research priority considering increasing human-mediated loss of biodiversity. However, studies of finer-scaled evolutionary patterns remain rare, particularly for megadiverse Melanesian biota. While oceanic islands in the region have received some attention, insular mountain blocks that serve as species pumps remain understudied, even though Australasia, for example, features some of the most spectacular tropical alpine habitats in the world. Here, we sequenced almost 2 kb of mitochondrial DNA from the widespread diving beetle Rhantus suturalis from across Australasia and the Indomalayan Archipelago, including remote New Guinean highlands. Based on expert taxonomy with a multigene phylogenetic backbone study, and combining molecular phylogenetics, phylogeography, divergence time estimation, and historical demography, we recover comparably low geographic signal, but complex phylogenetic relationships and population structure within R. suturalis. Four narrowly endemic New Guinea highland species are subordinated and two populations (New Guinea, New Zealand) seem to constitute cases of ongoing speciation. We reveal repeated colonization of remote mountain chains where haplotypes out of a core clade of very widespread haplotypes syntopically might occur with well-isolated ones. These results are corroborated by a Pleistocene origin approximately 2.4 Ma ago, followed by a sudden demographic expansion 600,000 years ago that may have been initiated through climatic adaptations. This study is a snapshot of the early stages of lineage diversification by peripatric speciation in Australasia, and supports New Guinea sky islands as cradles of evolution, in line with geological evidence suggesting very recent origin of high altitudes in the region. PMID:23610642
Phylogenetic analysis of Helicobacter pylori cagA gene of Turkish isolates and the association with gastric pathology

PubMed Central

2013-01-01

Background: The cagA gene is one of the important virulence factors of Helicobacter pylori. The diversity of the cagA 5′ conserved region is thought to reflect the phylogenetic relationships between different H. pylori isolates and their association with peptic ulceration. Significant geographical differences among isolates have been reported. The aim of this study is to compare Turkish H. pylori isolates with isolates from different geographical locations and to correlate the association with peptic ulceration. Methods: A total of 52 isolates, of which 19 were Turkish and 33 were from other geographic locations, were studied. Gastric antral biopsies collected from 19 Turkish patients (gastritis = 12, ulcer = 7) were used to amplify the cagA 5′ region by PCR, followed by DNA sequencing. Results: The phylogenetic tree displayed 3 groups: A) a mix of 2 sub-groups “Asian” and “African/Anatolian/Asian/European”, B) “Anatolian/European” and C) “American-Indian”. Turkish H. pylori isolates clustered in the mixed sub-group A were mostly from gastritis patients while those clustered in group B were from peptic ulcer patients. A phylogenetic tree constructed for our Turkish isolates detected distinctive features among those from gastritis and ulcer patients. We found that 2/3 of the gastritis isolates were clustered alone while 1/3 were clustered together with the ulcer isolates. Several amino acids were found to be shared between the latter groups but not with the first gastritis group. Conclusions: This study provided an additional insight into the profile of our cagA gene which implies a relationship in geographic locations of the isolates. PMID:24245965
RevBayes: Bayesian Phylogenetic Inference Using Graphical Models and an Interactive Model-Specification Language

PubMed Central

Höhna, Sebastian; Landis, Michael J.

2016-01-01

Programs for Bayesian inference of phylogeny currently implement a unique and fixed suite of models. Consequently, users of these software packages are simultaneously forced to use a number of programs for a given study, while also lacking the freedom to explore models that have not been implemented by the developers of those programs. We developed a new open-source software package, RevBayes, to address these problems. RevBayes is entirely based on probabilistic graphical models, a powerful generic framework for specifying and analyzing statistical models. Phylogenetic-graphical models can be specified interactively in RevBayes, piece by piece, using a new succinct and intuitive language called Rev. Rev is similar to the R language and the BUGS model-specification language, and should be easy to learn for most users. The strength of RevBayes is the simplicity with which one can design, specify, and implement new and complex models. Fortunately, this tremendous flexibility does not come at the cost of slower computation; as we demonstrate, RevBayes outperforms competing software for several standard analyses. Compared with other programs, RevBayes has fewer black-box elements. Users need to explicitly specify each part of the model and analysis. Although this explicitness may initially be unfamiliar, we are convinced that this transparency will improve understanding of phylogenetic models in our field. Moreover, it will motivate the search for improvements to existing methods by brazenly exposing the model choices that we make to critical scrutiny. RevBayes is freely available at http://www.RevBayes.com. [Bayesian inference; Graphical models; MCMC; statistical phylogenetics.] PMID:27235697
Conserved Nonexonic Elements: A Novel Class of Marker for Phylogenomics

PubMed Central

Cloutier, Alison; Baker, Allan J.

2017-01-01

Noncoding markers have a particular appeal as tools for phylogenomic analysis because, at least in vertebrates, they appear less subject to strong variation in GC content among lineages. Thus far, ultraconserved elements (UCEs) and introns have been the most widely used noncoding markers. Here we analyze and study the evolutionary properties of a new type of noncoding marker, conserved nonexonic elements (CNEEs), which consists of noncoding elements that are estimated to evolve slower than the neutral rate across a set of species. Although they often include UCEs, CNEEs are distinct from UCEs because they are not ultraconserved, and, most importantly, the core region alone is analyzed, rather than both the core and its flanking regions. Using a data set of 16 birds plus an alligator outgroup, and ∼3600–∼3800 loci per marker type, we found that although CNEEs were less variable than bioinformatically derived UCEs or introns and in some cases exhibited a slower approach to branch resolution as determined by phylogenomic subsampling, the quality of CNEE alignments was superior to those of the other markers, with fewer gaps and missing species. Phylogenetic resolution using coalescent approaches was comparable among the three marker types, with most nodes being fully and congruently resolved. Comparison of phylogenetic results across the three marker types indicated that one branch, the sister group to the passerine + falcon clade, was resolved differently and with moderate (>70%) bootstrap support between CNEEs and UCEs or introns. Overall, CNEEs appear to be promising as phylogenomic markers, yielding phylogenetic resolution as high as for UCEs and introns but with fewer gaps, less ambiguity in alignments and with patterns of nucleotide substitution more consistent with the assumptions of commonly used methods of phylogenetic analysis. PMID:28637293
Diversity Measures in Environmental Sequences Are Highly Dependent on Alignment Quality—Data from ITS and New LSU Primers Targeting Basidiomycetes

PubMed Central

Fischer, Christiane; Daniel, Rolf; Wubet, Tesfaye

2012-01-01

The ribosomal DNA comprising the ITS1-5.8S-ITS2 regions is widely used as a fungal marker in molecular ecology and systematics but cannot be aligned with confidence across genetically distant taxa. In order to study the diversity of Agaricomycotina in forest soils, we designed primers targeting the more alignable 28S (LSU) gene, which should be more useful for phylogenetic analyses of the detected taxa. This paper compares the performance of the established ITS1F/4B primer pair, which targets basidiomycetes, to that of two new pairs. Key factors in the comparison were the diversity covered, off-target amplification, rarefaction at different Operational Taxonomic Unit (OTU) cutoff levels, sensitivity of the method used to process the alignment to missing data and insecure positional homology, and the congruence of monophyletic clades with OTU assignments and BLAST-derived OTU names. The ITS primer pair yielded no off-target amplification but also exhibited the least fidelity to the expected phylogenetic groups. The LSU primers give complementary pictures of diversity, but were more sensitive to modifications of the alignment such as the removal of difficult-to-align stretches. The LSU primers also yielded greater numbers of singletons but also had a greater tendency to produce OTUs containing sequences from a wider variety of species as judged by BLAST similarity. We introduced some new parameters to describe alignment heterogeneity based on Shannon entropy and the extent and contents of the OTUs in a phylogenetic tree space. Our results suggest that ITS should not be used when calculating phylogenetic trees from genetically distant sequences obtained from environmental DNA extractions and that it is inadvisable to define OTUs on the basis of very heterogeneous alignments. PMID:22363808
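As a rough illustration of what an entropy-based measure of alignment heterogeneity can look like, the Python sketch below computes the Shannon entropy of each alignment column. It is not the parameter set introduced by the authors, which the abstract does not specify. The function names and the toy alignment are hypothetical, and counting the gap character as an ordinary state is an assumption made here for simplicity.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of the character states in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def alignment_entropies(seqs):
    """Per-column entropies for a list of equal-length aligned sequences."""
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    # zip(*seqs) walks the alignment column by column.
    return [column_entropy(col) for col in zip(*seqs)]


# Hypothetical toy alignment; '-' (gap) is treated as a state (assumption).
toy_alignment = ["ACGT-", "ACGA-", "ACTAT", "ACTTT"]
entropies = alignment_entropies(toy_alignment)
mean_entropy = sum(entropies) / len(entropies)
```

Fully conserved columns score 0 bits, and columns split evenly between two states score 1 bit, so a higher mean suggests a more heterogeneous alignment under these assumptions.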

A multi-locus analysis of phylogenetic relationships within grass subfamily Pooideae (Poaceae) inferred from sequences of nuclear single copy gene regions compared with plastid DNA.

PubMed

Hochbach, Anne; Schneider, Julia; Röser, Martin

2015-06-01

To investigate phylogenetic relationships within the grass subfamily Pooideae we studied about 50 taxa covering all recognized tribes, using one plastid DNA (cpDNA) marker (matK gene-3'trnK exon) and for the first time four nuclear single copy gene loci. DNA sequence information from two parts of the nuclear genes topoisomerase 6 (Topo6) spanning the exons 8-13 and 17-19, the exons 9-13 encoding plastid acetyl-CoA-carboxylase (Acc1) and the partial exon 1 of phytochrome B (PhyB) were generated. Individual and nuclear combined data were evaluated using maximum parsimony, maximum likelihood and Bayesian methods. All of the phylogenetic results show Brachyelytrum and the tribe Nardeae as earliest diverging lineages within the subfamily. The 'core' Pooideae (Hordeeae and the Aveneae/Poeae tribe complex) are also strongly supported, as well as the monophyly of the tribes Brachypodieae, Meliceae and Stipeae (except PhyB). The beak grass tribe Diarrheneae and the tribe Duthieeae are not monophyletic in some of the analyses. However, the combined nuclear DNA (nDNA) tree yields the highest resolution and the best delimitation of the tribes, and provides the following evolutionary hypothesis for the tribes: Brachyelytrum, Nardeae, Duthieeae, Meliceae, Stipeae, Diarrheneae, Brachypodieae and the 'core' Pooideae. Within the individual datasets, the phylogenetic trees obtained from Topo6 exon 8-13 show the most interesting results. The divergent positions of some clone sequences of Ampelodesmos mauritanicus and Trikeraia pappiformis, for instance, may indicate a hybrid origin of these stipoid taxa. Copyright © 2015 Elsevier Inc. All rights reserved.
RevBayes: Bayesian Phylogenetic Inference Using Graphical Models and an Interactive Model-Specification Language.

PubMed

Höhna, Sebastian; Landis, Michael J; Heath, Tracy A; Boussau, Bastien; Lartillot, Nicolas; Moore, Brian R; Huelsenbeck, John P; Ronquist, Fredrik

2016-07-01

Programs for Bayesian inference of phylogeny currently implement a unique and fixed suite of models. Consequently, users of these software packages are simultaneously forced to use a number of programs for a given study, while also lacking the freedom to explore models that have not been implemented by the developers of those programs. We developed a new open-source software package, RevBayes, to address these problems. RevBayes is entirely based on probabilistic graphical models, a powerful generic framework for specifying and analyzing statistical models. Phylogenetic-graphical models can be specified interactively in RevBayes, piece by piece, using a new succinct and intuitive language called Rev. Rev is similar to the R language and the BUGS model-specification language, and should be easy to learn for most users. The strength of RevBayes is the simplicity with which one can design, specify, and implement new and complex models. Fortunately, this tremendous flexibility does not come at the cost of slower computation; as we demonstrate, RevBayes outperforms competing software for several standard analyses. Compared with other programs, RevBayes has fewer black-box elements. Users need to explicitly specify each part of the model and analysis. Although this explicitness may initially be unfamiliar, we are convinced that this transparency will improve understanding of phylogenetic models in our field. Moreover, it will motivate the search for improvements to existing methods by brazenly exposing the model choices that we make to critical scrutiny. RevBayes is freely available at http://www.RevBayes.com [Bayesian inference; Graphical models; MCMC; statistical phylogenetics.]. © The Author(s) 2016. Published by Oxford University Press, on behalf of the Society of Systematic Biologists.
Reconstruction of phylogenetic trees of prokaryotes using maximal common intervals.

PubMed

Heydari, Mahdi; Marashi, Sayed-Amir; Tusserkani, Ruzbeh; Sadeghi, Mehdi

2014-10-01

One of the fundamental problems in bioinformatics is phylogenetic tree reconstruction, which can be used for classifying living organisms into different taxonomic clades. The classical approach to this problem is based on a marker such as 16S ribosomal RNA. Since evolutionary events like genomic rearrangements are not included in reconstructions of phylogenetic trees based on single genes, much effort has been made to find other characteristics for phylogenetic reconstruction in recent years. With the increasing availability of completely sequenced genomes, gene order can be considered as a new solution for this problem. In the present work, we applied maximal common intervals (MCIs) in two or more genomes to infer their distance and to reconstruct their evolutionary relationship. Additionally, measures based on uncommon segments (UCS's), i.e., those genomic segments which are not detected as part of any of the MCIs, are also used for phylogenetic tree reconstruction. We applied these two types of measures for reconstructing the phylogenetic tree of 63 prokaryotes with known COG (clusters of orthologous groups) families. Similarity between the MCI-based (resp. UCS-based) reconstructed phylogenetic trees and the phylogenetic tree obtained from NCBI taxonomy browser is as high as 93.1% (resp. 94.9%). We show that in the case of this diverse dataset of prokaryotes, tree reconstruction based on MCI and UCS outperforms most of the currently available methods based on gene orders, including breakpoint distance and DCJ. We additionally tested our new measures on a dataset of 13 closely-related bacteria from the genus Prochlorococcus. In this case, distances like rearrangement distance, breakpoint distance and DCJ proved to be useful, while our new measures are still appropriate for phylogenetic reconstruction. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
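To make the notion of a common interval concrete, here is a minimal Python sketch that enumerates gene sets that are contiguous in two gene orders. It is a simplified illustration, not the authors' MCI or UCS measures. It assumes both genomes are permutations of the same duplicate-free gene set, whereas real genomes differ in gene content. The function name and the toy gene orders are hypothetical.

```python
def common_intervals(order_a, order_b):
    """Return all sets of two or more genes that are contiguous in both gene orders.

    Assumes order_a and order_b are permutations of the same gene set with no
    duplicated genes (a simplification for illustration only).
    """
    position_in_b = {gene: i for i, gene in enumerate(order_b)}
    found = []
    for i in range(len(order_a)):
        lo = hi = position_in_b[order_a[i]]
        for j in range(i + 1, len(order_a)):
            p = position_in_b[order_a[j]]
            lo, hi = min(lo, p), max(hi, p)
            # order_a[i..j] holds j - i + 1 genes; they are contiguous in
            # order_b exactly when they span j - i + 1 positions there.
            if hi - lo == j - i:
                found.append(frozenset(order_a[i:j + 1]))
    return found


# Hypothetical gene orders for two genomes.
genome_a = [1, 2, 3, 4, 5]
genome_b = [2, 1, 3, 5, 4]
intervals = common_intervals(genome_a, genome_b)
```

This quadratic scan is adequate for small examples. A distance between genomes could then be derived from how many or how large the shared intervals are, though the specific measure used in the study is not given in the abstract.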
Conservation threats and the phylogenetic utility of IUCN Red List rankings in Incilius toads.

PubMed

Schachat, Sandra R; Mulcahy, Daniel G; Mendelson, Joseph R

2016-02-01

Phylogenetic analysis of extinction threat is an emerging tool in the field of conservation. However, there are problems with the methods and data as commonly used. Phylogenetic sampling usually extends to the level of family or genus, but International Union for Conservation of Nature (IUCN) rankings are available only for individual species, and, although different species within a taxonomic group may have the same IUCN rank, the species may have been ranked as such for different reasons. Therefore, IUCN rank may not reflect evolutionary history and thus may not be appropriate for use in a phylogenetic context. To be used appropriately, threat-risk data should reflect the cause of extinction threat rather than the IUCN threat ranking. In a case study of the toad genus Incilius, with phylogenetic sampling at the species level (so that the resolution of the phylogeny matches character data from the IUCN Red List), we analyzed causes of decline and IUCN threat rankings by calculating metrics of phylogenetic signal (such as Fritz and Purvis' D). We also analyzed the extent to which cause of decline and threat ranking overlap by calculating phylogenetic correlation between these 2 types of character data. Incilius species varied greatly in both threat ranking and cause of decline; this variability would be lost at a coarser taxonomic resolution. We found far more phylogenetic signal, likely correlated with evolutionary history, for causes of decline than for IUCN threat ranking. Individual causes of decline and IUCN threat rankings were largely uncorrelated on the phylogeny. Our results demonstrate the importance of character selection and taxonomic resolution when extinction threat is analyzed in a phylogenetic context. © 2015 Society for Conservation Biology.
Easy-to-use phylogenetic analysis system for hepatitis B virus infection.

PubMed

Sugiyama, Masaya; Inui, Ayano; Shin-I, Tadasu; Komatsu, Haruki; Mukaide, Motokazu; Masaki, Naohiko; Murata, Kazumoto; Ito, Kiyoaki; Nakanishi, Makoto; Fujisawa, Tomoo; Mizokami, Masashi

2011-10-01

Molecular phylogenetic analysis has been broadly applied in clinical and virological studies. However, choosing and applying appropriate calculation parameters is difficult for non-specialists in molecular genetics. In the present study, a phylogenetic analysis tool was developed for the easy determination of genotypes and transmission routes. A total of 23 patients from 10 families infected with hepatitis B virus (HBV) were enrolled and expected to have undergone intrafamilial transmission. The extracted HBV DNA was amplified and sequenced in a region of the S gene. Software to automatically classify query sequences was constructed and installed on the Hepatitis Virus Database (HVDB). Reference sequences were retrieved from HVDB, which contained major genotypes from A to H. Multiple alignments using CLUSTAL W were performed before the genetic distance matrix was calculated with the six-parameter method. The phylogenetic tree was output by the neighbor-joining method. A web-browser user interface was also developed for intuitive control. This system was named the easy-to-use phylogenetic analysis system (E-PAS). Twenty-three sera from 10 families were analyzed to evaluate E-PAS. The queries obtained from nine families were genotype C and were located in one cluster per family. However, one patient of a family was classified into a cluster different from that of her family, suggesting that E-PAS detected a sample distinct from those of her family on the transmission route. E-PAS was developed to output a phylogenetic tree using sequence data as the only requisite material. E-PAS could be extended to determine HBV genotypes as well as transmission routes. © 2011 The Japan Society of Hepatology.
Methamphetamine injecting is associated with phylogenetic clustering of hepatitis C virus infection among street-involved youth in Vancouver, Canada

PubMed Central

Cunningham, Evan; Jacka, Brendan; DeBeck, Kora; Applegate, Tanya A; Harrigan, P. Richard; Krajden, Mel; Marshall, Brandon DL; Montaner, Julio; Lima, Viviane Dias; Olmstead, Andrea; Milloy, M-J; Wood, Evan; Grebely, Jason

2015-01-01

Background: Among prospective cohorts of people who inject drugs (PWID), phylogenetic clustering of HCV infection has been observed. However, the majority of studies have included older PWID, representing distant transmission events. The aim of this study was to investigate phylogenetic clustering of HCV infection among a cohort of street-involved youth. Methods: Data were derived from a prospective cohort of street-involved youth aged 14–26 recruited between 2005 and 2012 in Vancouver, Canada (At Risk Youth Study, ARYS). HCV RNA testing and sequencing (Core-E2) were performed on HCV positive participants. Phylogenetic trees were inferred using maximum likelihood methods and clusters were identified using ClusterPicker (Core-E2 without HVR1, 90% bootstrap threshold, 0.05 genetic distance threshold). Results: Among 945 individuals enrolled in ARYS, 16% (n=149, 100% recent injectors) were HCV antibody positive at baseline interview (n=86) or seroconverted during follow-up (n=63). Among HCV antibody positive participants with available samples (n=131), 75% (n=98) had detectable HCV RNA and 66% (n=65, mean age 23, 58% with recent methamphetamine injection, 31% female, 3% HIV+) had available Core-E2 sequences. Of those with Core-E2 sequence, 14% (n=9) were in a cluster (one cluster of three) or pair (two pairs), with all reporting recent methamphetamine injection. Recent methamphetamine injection was associated with membership in a cluster or pair (P=0.009). Conclusion: In this study of street-involved youth with HCV infection and recent injecting, 14% demonstrated phylogenetic clustering. Phylogenetic clustering was associated with recent methamphetamine injection, suggesting that methamphetamine drug injection may play an important role in networks of HCV transmission. PMID:25977204
Identification of characteristic oligonucleotides in the bacterial 16S ribosomal RNA sequence dataset

NASA Technical Reports Server (NTRS)

Zhang, Zhengdong; Willson, Richard C.; Fox, George E.

2002-01-01

MOTIVATION: The phylogenetic structure of the bacterial world has been intensively studied by comparing sequences of 16S ribosomal RNA (16S rRNA). This database of sequences is now widely used to design probes for the detection of specific bacteria or groups of bacteria one at a time. The success of such methods reflects the fact that there are local sequence segments that are highly characteristic of particular organisms or groups of organisms. It is not clear, however, the extent to which such signature sequences exist in the 16S rRNA dataset. A better understanding of the numbers and distribution of highly informative oligonucleotide sequences may facilitate the design of hybridization arrays that can characterize the phylogenetic position of an unknown organism or serve as the basis for the development of novel approaches for use in bacterial identification. RESULTS: A computer-based algorithm that characterizes the extent to which any individual oligonucleotide sequence in 16S rRNA is characteristic of any particular bacterial grouping was developed. A measure of signature quality, Q(s), was formulated and subsequently calculated for every individual oligonucleotide sequence in the size range of 5-11 nucleotides and for 15mers with reference to each cluster and subcluster in a 929 organism representative phylogenetic tree. Subsequently, the perfect signature sequences were compared to the full set of 7322 sequences to see how common false positives were. The work completed here establishes beyond any doubt that highly characteristic oligonucleotides exist in the bacterial 16S rRNA sequence dataset in large numbers. Over 16,000 15mers were identified that might be useful as signatures. Signature oligonucleotides are available for over 80% of the nodes in the representative tree.
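The idea of a perfect signature oligonucleotide can be sketched in a few lines of Python: a k-mer that occurs in every sequence of a target group and in no sequence outside it. This is only an illustration of the concept. The abstract does not give the formula for the signature quality measure Q(s), so none is attempted here. The function names and the toy sequences are hypothetical, and the target group is assumed to be non-empty.

```python
def kmers(seq, k):
    """Set of all length-k substrings (oligonucleotides) of a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def perfect_signatures(group, others, k):
    """k-mers found in every sequence of `group` and in no sequence of `others`.

    `group` must contain at least one sequence.
    """
    shared = set.intersection(*(kmers(s, k) for s in group))
    outside = set().union(*(kmers(s, k) for s in others))
    return shared - outside


# Hypothetical toy sequences standing in for 16S rRNA fragments.
target_group = ["ACGTAC", "TTACGT"]
other_sequences = ["GGGTAC", "CCCCGT"]
signatures = perfect_signatures(target_group, other_sequences, k=3)
```

Comparing candidates against a larger sequence set, as the abstract describes, amounts to enlarging `other_sequences` and checking which signatures survive.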
QueTAL: a suite of tools to classify and compare TAL effectors functionally and phylogenetically

PubMed Central

Pérez-Quintero, Alvaro L.; Lamy, Léo; Gordon, Jonathan L.; Escalon, Aline; Cunnac, Sébastien; Szurek, Boris; Gagnevin, Lionel

2015-01-01

Transcription Activator-Like (TAL) effectors from Xanthomonas plant pathogenic bacteria can bind to the promoter region of plant genes and induce their expression. DNA-binding specificity is governed by a central domain made of nearly identical repeats, each determining the recognition of one base pair via two amino acid residues (a.k.a. Repeat Variable Di-residue, or RVD). Knowing how TAL effectors differ from each other within and between strains would be useful to infer functional and evolutionary relationships, but their repetitive nature precludes reliable use of traditional alignment methods. The suite QueTAL was therefore developed to offer tailored tools for comparison of TAL effector genes. The program DisTAL considers each repeat as a unit, transforms a TAL effector sequence into a sequence of coded repeats and makes pair-wise alignments between these coded sequences to construct trees. The program FuncTAL is aimed at finding TAL effectors with similar DNA-binding capabilities. It calculates correlations between position weight matrices of potential target DNA sequence predicted from the RVD sequence, and builds trees based on these correlations. The programs accurately represented phylogenetic and functional relationships between TAL effectors using either simulated or literature-curated data. When using the programs on a large set of TAL effector sequences, the DisTAL tree largely reflected the expected species phylogeny. In contrast, FuncTAL showed that TAL effectors with similar binding capabilities can be found between phylogenetically distant taxa. This suite will help users to rapidly analyse any TAL effector genes of interest and compare them to other available TAL genes and should improve our understanding of TAL effectors evolution. It is available at http://bioinfo-web.mpl.ird.fr/cgi-bin2/quetal/quetal.cgi. PMID:26284082
Evidence for Widespread Reticulate Evolution within Human Duplicons

PubMed Central

Jackson, Michael S.; Oliver, Karen; Loveland, Jane; Humphray, Sean; Dunham, Ian; Rocchi, Mariano; Viggiano, Luigi; Park, Jonathan P.; Hurles, Matthew E.; Santibanez-Koref, Mauro

2005-01-01

Approximately 5% of the human genome consists of segmental duplications that can cause genomic mutations and may play a role in gene innovation. Reticulate evolutionary processes, such as unequal crossing-over and gene conversion, are known to occur within specific duplicon families, but the broader contribution of these processes to the evolution of human duplications remains poorly characterized. Here, we use phylogenetic profiling to analyze multiple alignments of 24 human duplicon families that span >8 Mb of DNA. Our results indicate that none of them are evolving independently, with all alignments showing sharp discontinuities in phylogenetic signal consistent with reticulation. To analyze these results in more detail, we have developed a quartet method that estimates the relative contribution of nucleotide substitution and reticulate processes to sequence evolution. Our data indicate that most of the duplications show a highly significant excess of sites consistent with reticulate evolution, compared with the number expected by nucleotide substitution alone, with 15 of 30 alignments showing a >20-fold excess over that expected. Using permutation tests, we also show that at least 5% of the total sequence shares 100% sequence identity because of reticulation, a figure that includes 74 independent tracts of perfect identity >2 kb in length. Furthermore, analysis of a subset of alignments indicates that the density of reticulation events is as high as 1 every 4 kb. These results indicate that phylogenetic relationships within recently duplicated human DNA can be rapidly disrupted by reticulate evolution. This finding has important implications for efforts to finish the human genome sequence, complicates comparative sequence analysis of duplicon families, and could profoundly influence the tempo of gene-family evolution. PMID:16252241
Partial gene sequences for the A subunit of methyl-coenzyme M reductase (mcrI) as a phylogenetic tool for the family Methanosarcinaceae

NASA Technical Reports Server (NTRS)

Springer, E.; Sachs, M. S.; Woese, C. R.; Boone, D. R.

1995-01-01

Representatives of the family Methanosarcinaceae were analyzed phylogenetically by comparing partial sequences of their methyl-coenzyme M reductase (mcrI) genes. A 490-bp fragment from the A subunit of the gene was selected, amplified by the PCR, cloned, and sequenced for each of 25 strains belonging to the Methanosarcinaceae. The sequences obtained were aligned with the corresponding portions of five previously published sequences, and all of the sequences were compared to determine phylogenetic distances by Fitch distance matrix methods. We prepared analogous trees based on 16S rRNA sequences; these trees corresponded closely to the mcrI trees, although the mcrI sequences of pairs of organisms had 3.01 +/- 0.541 times more changes than the respective pairs of 16S rRNA sequences, suggesting that the mcrI fragment evolved about three times more rapidly than the 16S rRNA gene. The qualitative similarity of the mcrI and 16S rRNA trees suggests that transfer of genetic information between dissimilar organisms has not significantly affected these sequences, although we found inconsistencies between some mcrI distances that we measured and previously published DNA reassociation data. It is unlikely that multiple mcrI isogenes were present in the organisms that we examined, because we found no major discrepancies in multiple determinations of mcrI sequences from the same organism. Our primers for the PCR also match analogous sites in the previously published mcrII sequences, but all of the sequences that we obtained from members of the Methanosarcinaceae were more closely related to mcrI sequences than to mcrII sequences, suggesting that members of the Methanosarcinaceae do not have distinct mcrII genes.
Physiological, behavioral and biochemical adaptations of intertidal fishes to hypoxia.

PubMed

Richards, Jeffrey G

2011-01-15

Hypoxia survival in fish requires a well-coordinated response to either secure more O(2) from the hypoxic environment or to limit the metabolic consequences of an O(2) restriction at the mitochondria. Although there is a considerable amount of information available on the physiological, behavioral, biochemical and molecular responses of fish to hypoxia, very little research has attempted to determine the adaptive value of these responses. This article will review current attempts to use the phylogenetically corrected comparative method to define physiological and behavioral adaptations to hypoxia in intertidal fish and further identify putatively adaptive biochemical traits that should be investigated in the future. In a group of marine fishes known as sculpins, from the family Cottidae, variation in hypoxia tolerance, measured as a critical O(2) tension (P(crit)), is primarily explained by variation in mass-specific gill surface area, red blood cell hemoglobin-O(2) binding affinity, and to a lesser extent variation in routine O(2) consumption rate (M(O(2))). The most hypoxia-tolerant sculpins consistently show aquatic surface respiration (ASR) and aerial emergence behavior during hypoxia exposure, but no phylogenetically independent relationship has been found between the thresholds for initiating these behaviors and P(crit). At O(2) levels below P(crit), hypoxia survival requires a rapid reorganization of cellular metabolism to suppress ATP consumption to match the limited capacity for O(2)-independent ATP production. Thus, it is reasonable to speculate that the degree of metabolic rate suppression and the quantity of stored fermentable fuel is strongly selected for in hypoxia-tolerant fishes; however, these assertions have not been tested in a phylogenetic comparative model.
Effects of rooting via out-groups on in-group topology in phylogeny.

PubMed

Ackerman, Margareta; Brown, Daniel G; Loker, David

2014-01-01

Users of phylogenetic methods require rooted trees, because the direction of time depends on the placement of the root. While phylogenetic trees are typically rooted by using an out-group, this mechanism is inappropriate when the addition of an out-group changes the in-group topology. We perform a formal analysis of phylogenetic algorithms under the inclusion of distant out-groups. It turns out that linkage-based algorithms (including UPGMA) and a class of bisecting methods do not modify the topology of the in-group when an out-group is included. By contrast, the popular neighbour joining algorithm fails this property in a strong sense: every data set can have its structure destroyed by some arbitrarily distant outlier. Furthermore, including multiple outliers can lead to an arbitrary topology on the in-group. The standard rooting approach that uses out-groups may be fundamentally unsuited for neighbour joining.
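The claim about linkage-based algorithms can be illustrated on a toy example. The sketch below is a minimal average-linkage (UPGMA-style) clustering written for illustration only; the taxon names, the distance values, and the helper names `average_distance` and `upgma_clades` are all hypothetical and do not come from the paper. It builds the in-group clades with and without a distant out-group "O" and compares them.

```python
from itertools import combinations

def average_distance(c1, c2, dist):
    """Mean pairwise leaf distance between two clusters (average linkage)."""
    return sum(dist[frozenset((a, b))] for a in c1 for b in c2) / (len(c1) * len(c2))

def upgma_clades(names, dist):
    """Return the set of clades (frozensets of leaves) formed by UPGMA merging."""
    clusters = [frozenset([n]) for n in names]
    clades = set()
    while len(clusters) > 1:
        # Merge the pair of clusters with the smallest average distance.
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: average_distance(pair[0], pair[1], dist))
        merged = a | b
        clades.add(merged)
        clusters = [c for c in clusters if c not in (a, b)] + [merged]
    return clades

# Hypothetical in-group distances (assumed values, for illustration only).
ingroup = ["A", "B", "C", "D"]
pairs = {("A", "B"): 2, ("C", "D"): 4, ("A", "C"): 8,
         ("A", "D"): 8, ("B", "C"): 8, ("B", "D"): 8}
dist = {frozenset(k): v for k, v in pairs.items()}
without_outgroup = upgma_clades(ingroup, dist)

# Add a distant out-group "O" and rebuild the tree.
for taxon in ingroup:
    dist[frozenset((taxon, "O"))] = 100
with_outgroup = upgma_clades(ingroup + ["O"], dist)

# Keep only the clades that do not involve the out-group.
ingroup_part = {c for c in with_outgroup if "O" not in c}
print(ingroup_part == without_outgroup)
```

On this toy matrix the in-group clades are unchanged by the out-group, which is consistent with the property stated for linkage-based methods; a single example does not, of course, demonstrate the general result or the contrasting behaviour of neighbour joining.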
Molecular phylogenetic trees - On the validity of the Goodman-Moore augmentation algorithm

NASA Technical Reports Server (NTRS)

Holmquist, R.

1979-01-01

A response is made to the reply of Nei and Tateno (1979) to the letter of Holmquist (1978) supporting the validity of the augmentation algorithm of Moore (1977) in reconstructions of nucleotide substitutions by means of the maximum parsimony principle. It is argued that the overestimation of the augmented numbers of nucleotide substitutions (augmented distances) found by Tateno and Nei (1978) is due to an unrepresentative data sample and that it is only necessary that evolution be stochastically uniform in different regions of the phylogenetic network for the augmentation method to be useful. The importance of the average value of the true distance over all links is explained, and the relative variances of the true and augmented distances are calculated to be almost identical. The effects of topological changes in the phylogenetic tree on the augmented distance and the question of the correctness of ancestral sequences inferred by the method of parsimony are also clarified.
Estimating Bacterial Diversity for Ecological Studies: Methods, Metrics, and Assumptions

PubMed Central

Birtel, Julia; Walser, Jean-Claude; Pichon, Samuel; Bürgmann, Helmut; Matthews, Blake

2015-01-01

Methods to estimate microbial diversity have developed rapidly in an effort to understand the distribution and diversity of microorganisms in natural environments. For bacterial communities, the 16S rRNA gene is the phylogenetic marker gene of choice, but most studies select only a specific region of the 16S rRNA to estimate bacterial diversity. Whereas biases derived from DNA extraction, primer choice and PCR amplification are well documented, we here address how the choice of variable region can influence a wide range of standard ecological metrics, such as species richness, phylogenetic diversity, β-diversity and rank-abundance distributions. We have used Illumina paired-end sequencing to estimate the bacterial diversity of 20 natural lakes across Switzerland derived from three trimmed variable 16S rRNA regions (V3, V4, V5). Species richness, phylogenetic diversity, community composition, β-diversity, and rank-abundance distributions differed significantly between 16S rRNA regions. Overall, patterns of diversity quantified by the V3 and V5 regions were more similar to one another than those assessed by the V4 region. Similar results were obtained when analyzing the datasets with different sequence similarity thresholds used during sequence clustering and when the same analysis was used on a reference dataset of sequences from the Greengenes database. In addition, we measured species richness from the same lake samples using ARISA fingerprinting, but did not find a strong relationship between species richness estimated by Illumina and ARISA. We conclude that the selection of 16S rRNA region significantly influences the estimation of bacterial diversity and species distributions and that caution is warranted when comparing data from different variable regions as well as when using different sequencing techniques. PMID:25915756
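Two of the metrics named above can be written down in a few lines. The sketch below computes species richness and a pairwise β-diversity value for two count tables. The abstract does not say which β-diversity index was used, so the Bray-Curtis dissimilarity here is an assumption, and the OTU names and counts are invented for illustration.

```python
def richness(counts):
    """Number of taxa observed at least once in a sample."""
    return sum(1 for c in counts.values() if c > 0)

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two samples given as taxon -> count dicts."""
    taxa = set(a) | set(b)
    numerator = sum(abs(a.get(t, 0) - b.get(t, 0)) for t in taxa)
    denominator = sum(a.get(t, 0) + b.get(t, 0) for t in taxa)
    return numerator / denominator

# Hypothetical counts for the same lake sample profiled with two 16S regions.
region_v3 = {"otu1": 10, "otu2": 5, "otu3": 0}
region_v4 = {"otu1": 8, "otu2": 0, "otu4": 7}

print(richness(region_v3), richness(region_v4))
print(bray_curtis(region_v3, region_v4))
```

A value of 0 means the two profiles are identical and 1 means they share no taxa, so region-dependent shifts in the counts translate directly into a non-zero dissimilarity for what is nominally the same community.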
Bovine leukaemia virus genotypes 5 and 6 are circulating in cattle from the state of São Paulo, Brazil.

PubMed

Gregory, Lilian; Carrillo Gaeta, Natália; Araújo, Jansen; Matsumiya Thomazelli, Luciano; Harakawa, Ricardo; Ikuno, Alice A; Hiromi Okuda, Liria; de Stefano, Eliana; Pituco, Edviges Maristela

2017-12-01

Enzootic bovine leucosis (EBL) is a silent disease caused by a retrovirus [bovine leukaemia virus (BLV)]. BLV is classified into almost 10 genotypes that are distributed in several countries. The present research aimed to describe two BLV gp51 env sequences of strains detected in the state of São Paulo, Brazil and perform a phylogenetic analysis to compare them to other BLV gp51 env sequences of strains around the world. Two bovines from different herds were admitted to the Bovine and Small Ruminant Hospital, School of Veterinary Medicine and Animal Science, University of São Paulo, Brazil. In both, lymphosarcoma was detected and the presence of BLV was confirmed by nested PCR. The neighbour-joining algorithm distance method was used to genotype the BLV sequences by phylogenetic reconstruction, and the maximum likelihood method was used for the phylogenetic reconstruction. The phylogeny estimates were calculated by performing 1000 bootstrap replicates. Analysis of the partial envelope glycoprotein (env) gene sequences from two isolates (25 and 31) revealed two different genotypes of BLV. Isolate 25 clustered with ten genotype 6 isolates from Brazil, Argentina, Thailand and Paraguay. On the other hand, isolate 31 clustered with two genotype 5 isolates (one was also from São Paulo and one was from Costa Rica). The detected genotypes corroborate the results of previous studies conducted in the state of São Paulo, Brazil. The prediction of amino acids showed substitutions, particularly between positions 136 and 150 in 11 out of 13 sequences analysed, including sequences from GenBank. BLV is still important in Brazil and this research should be continued.
Phylogeny of Bacteroides, Prevotella, and Porphyromonas spp. and related bacteria.

PubMed Central

Paster, B J; Dewhirst, F E; Olsen, I; Fraser, G J

1994-01-01

The phylogenetic structure of the bacteroides subgroup of the cytophaga-flavobacter-bacteroides (CFB) phylum was examined by 16S rRNA sequence comparative analysis. Approximately 95% of the 16S rRNA sequence was determined for 36 representative strains of species of Prevotella, Bacteroides, and Porphyromonas and related species by a modified Sanger sequencing method. A phylogenetic tree was constructed from a corrected distance matrix by the neighbor-joining method, and the reliability of tree branching was established by bootstrap analysis. The bacteroides subgroup was divided primarily into three major phylogenetic clusters which contained most of the species examined. The first cluster, termed the prevotella cluster, was composed of 16 species of Prevotella, including P. melaninogenica, P. intermedia, P. nigrescens, and the ruminal species P. ruminicola. Two oral species, P. zoogleoformans and P. heparinolytica, which had been recently placed in the genus Prevotella, did not fall within the prevotella cluster. These two species and six species of Bacteroides, including the type species B. fragilis, formed the second cluster, termed the bacteroides cluster. The third cluster, termed the porphyromonas cluster, was divided into two subclusters. The first contained Porphyromonas gingivalis, P. endodontalis, P. asaccharolytica, P. circumdentaria, P. salivosa, [Bacteroides] levii (the brackets around genus are used to indicate that the species does not belong to the genus by the sensu stricto definition), and [Bacteroides] macacae, and the second subcluster contained [Bacteroides] forsythus and [Bacteroides] distasonis. [Bacteroides] splanchnicus fell just outside the three major clusters but still belonged within the bacteroides subgroup. With few exceptions, the 16S rRNA data were in overall agreement with previously proposed reclassifications of species of Bacteroides, Prevotella, and Porphyromonas. Suggestions are made to accommodate those species which do not fit previous reclassification schemes. PMID:8300528
Overcoming deep roots, fast rates, and short internodes to resolve the ancient rapid radiation of eupolypod II ferns.

PubMed

Rothfels, Carl J; Larsson, Anders; Kuo, Li-Yaung; Korall, Petra; Chiou, Wen-Liang; Pryer, Kathleen M

2012-05-01

Backbone relationships within the large eupolypod II clade, which includes nearly a third of extant fern species, have resisted elucidation by both molecular and morphological data. Earlier studies suggest that much of the phylogenetic intractability of this group is due to three factors: (i) a long root that reduces apparent levels of support in the ingroup; (ii) long ingroup branches subtended by a series of very short backbone internodes (the "ancient rapid radiation" model); and (iii) significantly heterogeneous lineage-specific rates of substitution. To resolve the eupolypod II phylogeny, with a particular emphasis on the backbone internodes, we assembled a data set of five plastid loci (atpA, atpB, matK, rbcL, and trnG-R) from a sample of 81 accessions selected to capture the deepest divergences in the clade. We then evaluated our phylogenetic hypothesis against potential confounding factors, including those induced by rooting, ancient rapid radiation, rate heterogeneity, and the Bayesian star-tree paradox artifact. While the strong support we inferred for the backbone relationships proved robust to these potential problems, their investigation revealed unexpected model-mediated impacts of outgroup composition, divergent effects of methods for countering the star-tree paradox artifact, and gave no support to concerns about the applicability of the unrooted model to data sets with heterogeneous lineage-specific rates of substitution. This study is among few to investigate these factors with empirical data, and the first to compare the performance of the two primary methods for overcoming the Bayesian star-tree paradox artifact. Among the significant phylogenetic results is the near-complete support along the eupolypod II backbone, the demonstrated paraphyly of Woodsiaceae as currently circumscribed, and the well-supported placement of the enigmatic genera Homalosorus, Diplaziopsis, and Woodsia.
A Phylogenetic Comparative Study of Bantu Kinship Terminology Finds Limited Support for Its Co-Evolution with Social Organisation

PubMed Central

Guillon, Myrtille; Mace, Ruth

2016-01-01

The classification of kin into structured groups is a diverse phenomenon which is ubiquitous in human culture. For populations which are organized into large agropastoral groupings of sedentary residence but not governed within the context of a centralised state, such as our study sample of 83 historical Bantu-speaking groups of sub-Saharan Africa, cultural kinship norms guide all aspects of everyday life and social organization. Such rules operate in part through the use of differing terminological referential systems of familial organization. Although the cross-cultural study of kinship terminology was foundational in Anthropology, few modern studies have made use of statistical advances to further our sparse understanding of the structuring and diversification of terminological systems of kinship over time. In this study we use Bayesian Markov Chain Monte Carlo methods of phylogenetic comparison to investigate the evolution of Bantu kinship terminology and reconstruct the ancestral state and diversification of cousin terminology in this family of sub-Saharan ethnolinguistic groups. Using a phylogenetic tree of Bantu languages, we then test the prominent hypothesis that structured variation in systems of cousin terminology has co-evolved alongside adaptive change in patterns of descent organization, as well as rules of residence. We find limited support for this hypothesis, and argue that the shaping of systems of kinship terminology is a multifactorial process, concluding with possible avenues of future research. PMID:27008364
Phylogenetic Analysis of Myobia musculi (Schranck, 1781) by Using the 18S Small Ribosomal Subunit Sequence

PubMed Central

Feldman, Sanford H; Ntenda, Abraham M

2011-01-01

We used high-fidelity PCR to amplify 2 overlapping regions of the ribosomal gene complex from the rodent fur mite Myobia musculi. The amplicons encompassed a large portion of the mite's ribosomal gene complex spanning 3128 nucleotides containing the entire 18S rRNA, internal transcribed spacer (ITS) 1, 5.8S rRNA, ITS2, and a portion of the 5′-end of the 28S rRNA. M. musculi’s 179-nucleotide 5.8S rRNA sequence was not conserved, so this region was identified by conservation of rRNA secondary structure. Maximum likelihood and Bayesian inference phylogenetic analyses were performed by using a multiple sequence alignment consisting of 1524 nucleotides of M. musculi 18S rRNA and homologous sequences from 42 prostigmatid mites and the tick Dermacentor andersoni. The phylograms produced by both methods were in agreement regarding terminal, secondary, and some tertiary phylogenetic relationships among mites. Bayesian inference discriminated most infraordinal relationships between Eleutherengona and Parasitengona mites in the suborder Anystina. Basal relationships between suborders Anystina and Eupodina historically determined by comparing differences in anatomic characteristics were less well-supported by our molecular analysis. Our results recapitulated similar 18S rRNA sequence analyses recently reported. Our study supports M. musculi as belonging to the suborder Anystina, infraorder Eleutherengona, and superfamily Cheyletoidea. PMID:22330574
Streptococcus pharyngis sp. nov., a novel streptococcal species isolated from the respiratory tract of wild rabbits.

PubMed

Vela, Ana I; Casas-Díaz, Encarna; Lavín, Santiago; Domínguez, Lucas; Fernández-Garayzábal, Jose F

2015-09-01

Four isolates of an unknown Gram-stain-positive, catalase-negative coccus-shaped organism, isolated from the pharynx of four wild rabbits, were characterized by phenotypic and molecular genetic methods. The micro-organisms were tentatively assigned to the genus Streptococcus based on cellular morphological and biochemical criteria, although the organisms did not appear to correspond to any species with a validly published name. Comparative 16S rRNA gene sequencing confirmed their identification as members of the genus Streptococcus, being most closely related phylogenetically to Streptococcus porcorum 682-03(T) (96.9% 16S rRNA gene sequence similarity). Analysis of rpoB and sodA gene sequences showed divergence values between the novel species and S. porcorum 682-03(T) (the closest phylogenetic relative determined from 16S rRNA gene sequences) of 18.1 and 23.9%, respectively. The novel bacterial isolate could be distinguished from the type strain of S. porcorum by several biochemical characteristics, such as the production of glycyl-tryptophan arylamidase and α-chymotrypsin, and the non-acidification of different sugars. Based on both phenotypic and phylogenetic findings, it is proposed that the unknown bacterium be assigned to a novel species of the genus Streptococcus, and named Streptococcus pharyngis sp. nov. The type strain is DICM10-00796B(T) ( = CECT 8754(T) = CCUG 66496(T)).

Targeting legume loci: A comparison of three methods for target enrichment bait design in Leguminosae phylogenomics.

PubMed

Vatanparast, Mohammad; Powell, Adrian; Doyle, Jeff J; Egan, Ashley N

2018-03-01

The development of pipelines for locus discovery has spurred the use of target enrichment for plant phylogenomics. However, few studies have compared pipelines from locus discovery and bait design, through validation, to tree inference. We compared three methods within Leguminosae (Fabaceae) and present a workflow for future efforts. Using 30 transcriptomes, we compared Hyb-Seq, MarkerMiner, and the Yang and Smith (Y&S) pipelines for locus discovery, validated 7501 baits targeting 507 loci across 25 genera via Illumina sequencing, and inferred gene and species trees via concatenation- and coalescent-based methods. Hyb-Seq discovered loci with the longest mean length. MarkerMiner discovered the most conserved loci with the least flagged as paralogous. Y&S offered the most parsimony-informative sites and putative orthologs. Target recovery averaged 93% across taxa. We optimized our targeted locus set based on a workflow designed to minimize paralog/ortholog conflation and thus present 423 loci for legume phylogenomics. Methods differed across criteria important for phylogenetic marker development. We recommend Hyb-Seq as a method that may be useful for most phylogenomic projects. Our targeted locus set is a resource for future, community-driven efforts to reconstruct the legume tree of life.
Evolutionary profiles from the QR factorization of multiple sequence alignments

PubMed Central

Sethi, Anurag; O'Donoghue, Patrick; Luthey-Schulten, Zaida

2005-01-01

We present an algorithm to generate complete evolutionary profiles that represent the topology of the molecular phylogenetic tree of the homologous group. The method, based on the multidimensional QR factorization of numerically encoded multiple sequence alignments, removes redundancy from the alignments and orders the protein sequences by increasing linear dependence, resulting in the identification of a minimal basis set of sequences that spans the evolutionary space of the homologous group of proteins. We observe a general trend that these smaller, more evolutionarily balanced profiles have comparable and, in many cases, better performance in database searches than conventional profiles containing hundreds of sequences, constructed in an iterative and computationally intensive procedure. For more diverse families or superfamilies, with sequence identity <30%, structural alignments, based purely on the geometry of the protein structures, provide better alignments than pure sequence-based methods. Merging the structure and sequence information allows the construction of accurate profiles for distantly related groups. These structure-based profiles outperformed other sequence-based methods for finding distant homologs and were used to identify a putative class II cysteinyl-tRNA synthetase (CysRS) in several archaea that eluded previous annotation studies. Phylogenetic analysis showed the putative class II CysRSs to be a monophyletic group and homology modeling revealed a constellation of active site residues similar to that in the known class I CysRS. PMID:15741270
Reappraisal of the extinct seal “Phoca” vitulinoides from the Neogene of the North Sea Basin, with bearing on its geological age, phylogenetic affinities, and locomotion

PubMed Central

Louwye, Stephen

2017-01-01

Background: Discovered on the southern margin of the North Sea Basin, “Phoca” vitulinoides represents one of the best-known extinct species of Phocidae. However, little attention has been given to the species ever since its original 19th century description. Newly discovered material, including the most complete specimen of fossil Phocidae from the North Sea Basin, prompted the redescription of the species. Also, the type material of “Phoca” vitulinoides is lost. Methods: “Phoca” vitulinoides is redescribed. Its phylogenetic position among Phocinae is assessed through phylogenetic analysis. Dinoflagellate cyst biostratigraphy is used to determine and reassess the geological age of the species. Myological descriptions of extant taxa are used to infer muscle attachments, and basic comparative anatomy of the gross morphology and biomechanics are applied to reconstruct locomotion. Results: Detailed redescription of “Phoca” vitulinoides indicates relatively little affinity with the genus Phoca and instead calls for the establishment of a new genus: Nanophoca gen. nov. Hence, “Phoca” vitulinoides is recombined into Nanophoca vitulinoides. This reassignment is confirmed by the phylogenetic analysis, grouping the genus Nanophoca and other extinct phocine taxa as stem phocines. Biostratigraphy and lithostratigraphy expand the known stratigraphic range of N. vitulinoides from the late Langhian to the late Serravallian. The osteological anatomy of N. vitulinoides indicates a relatively strong development of muscles used for fore flipper propulsion and increased flexibility for the hind flipper. Discussion: The extended stratigraphic range of N. vitulinoides into the middle Miocene confirms relatively early diversification of Phocinae in the North Atlantic. Morphological features on the fore- and hindlimb of the species point toward an increased use of the fore flipper and greater flexibility of the hind flipper as compared to extant Phocinae, clearly indicating less derived locomotor strategies in this Miocene phocine species. Estimations of the overall body size indicate that N. vitulinoides is much smaller than Pusa, the smallest extant genus of Phocinae (and Phocidae), and than most extinct phocines. PMID:28533965
Nuclear introns outperform mitochondrial DNA in inter-specific phylogenetic reconstruction: Lessons from horseshoe bats (Rhinolophidae: Chiroptera).

PubMed

Dool, Serena E; Puechmaille, Sebastien J; Foley, Nicole M; Allegrini, Benjamin; Bastian, Anna; Mutumi, Gregory L; Maluleke, Tinyiko G; Odendaal, Lizelle J; Teeling, Emma C; Jacobs, David S

2016-04-01

Despite many studies illustrating the perils of utilising mitochondrial DNA in phylogenetic studies, it remains one of the most widely used genetic markers for this purpose. Over the last decade, nuclear introns have been proposed as alternative markers for phylogenetic reconstruction. However, the resolution capabilities of mtDNA and nuclear introns have rarely been quantified and compared. In the current study we generated a novel ∼5kb dataset comprising six nuclear introns and a mtDNA fragment. We assessed the relative resolution capabilities of the six intronic fragments with respect to each other, when used in various combinations together, and when compared to the traditionally used mtDNA. We focused on a major clade in the horseshoe bat family (Afro-Palaearctic clade; Rhinolophidae) as our case study. This old, widely distributed and speciose group contains a high level of conserved morphology. This morphological stasis renders the reconstruction of the phylogeny of this group with traditional morphological characters complex. We sampled multiple individuals per species to represent their geographic distributions as best as possible (122 individuals, 24 species, 68 localities). We reconstructed the species phylogeny using several complementary methods (partitioned Maximum Likelihood and Bayesian and Bayesian multispecies-coalescent) and made inferences based on consensus across these methods. We computed pairwise comparisons based on the Robinson-Foulds tree distance metric between all Bayesian topologies generated (27,000) for every gene(s) and visualised the tree space using multidimensional scaling (MDS) plots. Using our supported species phylogeny we estimated the ancestral state of key traits of interest within this group, e.g. echolocation peak frequency which has been implicated in speciation. Our results revealed many potential cryptic species within this group, even in taxa where this was not suspected a priori, and also found evidence for mtDNA introgression. We demonstrated that by using just two introns one can recover a better supported species tree than when using the mtDNA alone, despite the shorter overall length of the combined introns. Additionally, when combining any single intron with mtDNA, we showed that the result is highly similar to the mtDNA gene tree and far from the true species tree and therefore this approach should be avoided. We caution against the indiscriminate use of mtDNA in phylogenetic studies and advocate for pilot studies to select nuclear introns. The selection of marker type and number is a crucial step that is best based on critical examination of preliminary or previously published data. Based on our findings and previous publications, we recommend the following markers to recover phylogenetic relationships between recently diverged taxa (<20 My) in bats and other mammals: ACOX2, COPS7A, BGN, ROGDI and STAT5A. Copyright © 2016 Elsevier Inc. All rights reserved.
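The Robinson-Foulds comparison mentioned above counts the bipartitions (splits) found in one tree but not the other. The following is a minimal sketch of that idea, assuming trees are given as nested Python tuples of leaf names and compared as unrooted topologies over the same leaf set; the function names and the two five-taxon example trees are hypothetical and unrelated to the bat data.

```python
def leaf_sets(tree, collected):
    """Return the leaves under `tree`, recording the leaf set of every internal node."""
    if isinstance(tree, str):
        return frozenset([tree])
    below = frozenset().union(*(leaf_sets(child, collected) for child in tree))
    collected.append(below)
    return below

def splits(tree):
    """Non-trivial bipartitions of an unrooted tree, each stored as one canonical side."""
    collected = []
    all_leaves = leaf_sets(tree, collected)
    reference = min(all_leaves)  # fixed leaf used to pick a canonical side
    result = set()
    for clade in collected:
        # Always keep the side of the split that does NOT contain the reference leaf.
        side = all_leaves - clade if reference in clade else clade
        # Skip trivial splits (empty side, single leaf, or all-but-one leaf).
        if 1 < len(side) < len(all_leaves) - 1:
            result.add(side)
    return result

def robinson_foulds(tree1, tree2):
    """Number of splits present in exactly one of the two trees."""
    return len(splits(tree1) ^ splits(tree2))

# Two hypothetical topologies that disagree on one internal branch.
tree_x = ((("A", "B"), "C"), ("D", "E"))
tree_y = ((("A", "C"), "B"), ("D", "E"))
print(robinson_foulds(tree_x, tree_y))
```

Computing this value for every pair of trees yields the distance matrix that a multidimensional scaling step would then project into a low-dimensional plot; the MDS step itself is not shown here.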
Variance Component Selection With Applications to Microbiome Taxonomic Data.

PubMed

Zhai, Jing; Kim, Juhyun; Knox, Kenneth S; Twigg, Homer L; Zhou, Hua; Zhou, Jin J

2018-01-01

High-throughput sequencing technology has enabled population-based studies of the role of the human microbiome in disease etiology and exposure response. Microbiome data are summarized as counts or composition of the bacterial taxa at different taxonomic levels. An important problem is to identify the bacterial taxa that are associated with a response. One method is to test the association of a specific taxon with phenotypes in a linear mixed effect model, which incorporates phylogenetic information among bacterial communities. Another type of approach considers all taxa in a joint model and achieves selection via a penalization method, which ignores phylogenetic information. In this paper, we consider regression analysis by treating bacterial taxa at different levels as multiple random effects. For each taxon, a kernel matrix is calculated based on distance measures in the phylogenetic tree and acts as one variance component in the joint model. Then taxonomic selection is achieved by the lasso (least absolute shrinkage and selection operator) penalty on variance components. Our method integrates biological information into the variable selection problem and greatly improves selection accuracies. Simulation studies demonstrate the superiority of our methods versus existing methods, for example, group-lasso. Finally, we apply our method to a longitudinal microbiome study of Human Immunodeficiency Virus (HIV) infected patients. We implement our method using the high performance computing language Julia. Software and detailed documentation are freely available at https://github.com/JingZhai63/VCselection.
Toward a phylogenetic chronology of ancient Gaulish, Celtic, and Indo-European.

PubMed

Forster, Peter; Toth, Alfred

2003-07-22

Indo-European is the largest and best-documented language family in the world, yet the reconstruction of the Indo-European tree, first proposed in 1863, has remained controversial. Complications may include ascertainment bias when choosing the linguistic data, and disregard for the wave model of 1872 when attempting to reconstruct the tree. Essentially analogous problems were solved in evolutionary genetics by DNA sequencing and phylogenetic network methods, respectively. We now adapt these tools to linguistics, and analyze Indo-European language data, focusing on Celtic and in particular on the ancient Celtic language of Gaul (modern France), by using bilingual Gaulish-Latin inscriptions. Our phylogenetic network reveals an early split of Celtic within Indo-European. Interestingly, the next branching event separates Gaulish (Continental Celtic) from the British (Insular Celtic) languages, with Insular Celtic subsequently splitting into Brythonic (Welsh, Breton) and Goidelic (Irish and Scottish Gaelic). Taken together, the network thus suggests that the Celtic language arrived in the British Isles as a single wave (and then differentiated locally), rather than in the traditional two-wave scenario ("P-Celtic" to Britain and "Q-Celtic" to Ireland). The phylogenetic network furthermore permits the estimation of time in analogy to genetics, and we obtain tentative dates for Indo-European at 8100 BC +/- 1,900 years, and for the arrival of Celtic in Britain at 3200 BC +/- 1,500 years. The phylogenetic method is easily executed by hand and promises to be an informative approach for many problems in historical linguistics.
Flatfish monophyly refereed by the relationship of Psettodes in Carangimorphariae.

PubMed

Shi, Wei; Chen, Shixi; Kong, Xiaoyu; Si, Lizhen; Gong, Li; Zhang, Yanchun; Yu, Hui

2018-05-25

The monophyly of flatfishes has not been supported in many molecular phylogenetic studies. The monophyly of Pleuronectoidei, which comprises all but one family of flatfishes, is broadly supported. However, the Psettodoidei, comprising the single family Psettodidae, is often found to be most closely related to other carangimorphs based on substantial sequencing efforts and diverse analytical methods. In this study, we examined why this particular result is often obtained. The mitogenomes of five flatfishes were determined. Select mitogenomes of representative carangimorph species were further employed for phylogenetic and molecular clock analyses. Our phylogenetic results do not fully support Psettodes as a sister group to pleuronectoids or other carangimorphs. The results also supported the evidence of long-branch attraction between Psettodes and the adjacent clades. Two chronograms, derived from Bayesian relaxed-clock methods, suggest that over a short period in the early Paleocene, a series of important evolutionary events occurred in carangimorphs. Based on insights provided by the molecular clock, we propose the following evolutionary explanation for the difficulty in determining the phylogenetic position of Psettodes: The initial diversification of Psettodes was very close in time to the initial diversification of carangimorphs, and the primary diversification time of pleuronectoids, the other suborder of flatfishes, occurred later than that of some percomorph taxa. Additionally, the clade of Psettodes is a long and naked branch, which supports the uncertainty of its phylogenetic placement. Finally, we confirmed the monophyly of flatfishes, which is accepted by most ichthyologists.
YBYRÁ facilitates comparison of large phylogenetic trees.

PubMed

Machado, Denis Jacob

2015-07-01

The number and size of tree topologies that are being compared by phylogenetic systematists is increasing due to technological advancements in high-throughput DNA sequencing. However, we still lack tools to facilitate comparison among phylogenetic trees with a large number of terminals. The "YBYRÁ" project integrates software solutions for data analysis in phylogenetics. It comprises tools for (1) topological distance calculation based on the number of shared splits or clades, (2) sensitivity analysis and automatic generation of sensitivity plots and (3) clade diagnoses based on different categories of synapomorphies. YBYRÁ also provides (4) an original framework to facilitate the search for potential rogue taxa based on how much they affect average matching split distances (using MSdist). YBYRÁ facilitates comparison of large phylogenetic trees and outperforms competing software in terms of usability and time efficiency, especially for large data sets. The programs that comprise this toolkit are written in Python, hence they do not require installation and have minimum dependencies. The entire project is available under an open-source licence at http://www.ib.usp.br/grant/anfibios/researchSoftware.html .
Phylogenetic study of Geitlerinema and Microcystis (Cyanobacteria) using PC-IGS and 16S-23S ITS as markers: investigation of horizontal gene transfer.

PubMed

Piccin-Santos, Viviane; Brandão, Marcelo Mendes; Bittencourt-Oliveira, Maria Do Carmo

2014-08-01

Selection of genes that have not been horizontally transferred for prokaryote phylogenetic inferences is regarded as a challenging task. The markers internal transcribed spacer of ribosomal genes (16S-23S ITS) and phycocyanin intergenic spacer (PC-IGS), based on the operons of ribosomal and phycocyanin genes respectively, are among the most used markers in cyanobacteria. The region of the ribosomal genes has been considered stable, whereas the phycocyanin operon may have undergone horizontal transfer. To investigate the occurrence of horizontal transfer of PC-IGS, phylogenetic trees of Geitlerinema and Microcystis strains were generated using PC-IGS and 16S-23S ITS and compared. Phylogenetic trees based on the two markers were mostly congruent for Geitlerinema and Microcystis, indicating a common evolutionary history among ribosomal and phycocyanin genes with no evidence for horizontal transfer of PC-IGS. Thus, PC-IGS is a suitable marker, along with 16S-23S ITS for phylogenetic studies of cyanobacteria. © 2014 Phycological Society of America.
A Distance Measure for Genome Phylogenetic Analysis

NASA Astrophysics Data System (ADS)

Cao, Minh Duc; Allison, Lloyd; Dix, Trevor

Phylogenetic analyses of species based on single genes or parts of the genomes are often inconsistent because of factors such as variable rates of evolution and horizontal gene transfer. The availability of more and more sequenced genomes allows phylogeny construction from complete genomes that is less sensitive to such inconsistency. For such long sequences, construction methods like maximum parsimony and maximum likelihood are often not possible due to their intensive computational requirements. Another class of tree construction methods, namely distance-based methods, requires a measure of distance between any two genomes. Some measures, such as evolutionary edit distance of gene order and gene content, are computationally expensive or do not perform well when the gene content of the organisms is similar. This study presents an information theoretic measure of genetic distances between genomes based on the biological compression algorithm expert model. We demonstrate that our distance measure can be applied to reconstruct the consensus phylogenetic tree of a number of Plasmodium parasites from their genomes, the statistical bias of which would mislead conventional analysis methods. Our approach is also used to successfully construct a plausible evolutionary tree for the γ-Proteobacteria group whose genomes are known to contain many horizontally transferred genes.
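To give a feel for how a compressor can yield a distance between two sequences, the sketch below computes the normalized compression distance with the general-purpose zlib compressor. This is a stand-in technique chosen for illustration: it is not the expert-model measure described in the abstract, and the function names and toy sequences are assumptions.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (maximum compression)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def random_dna(length: int, seed: int) -> bytes:
    """Toy sequence generator used only to exercise the function."""
    rng = random.Random(seed)
    return "".join(rng.choice("ACGT") for _ in range(length)).encode()


seq_a = random_dna(2000, seed=0)
seq_b = random_dna(2000, seed=1)
# An identical pair should score lower than an unrelated pair.
print(ncd(seq_a, seq_a), ncd(seq_a, seq_b))
```

A matrix of such pairwise values could then be passed to any distance-based tree builder; a compressor tuned to biological sequences would be expected to discriminate far better than zlib.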
Estimating evolutionary rates using time-structured data: a general comparison of phylogenetic methods.

PubMed

Duchêne, Sebastián; Geoghegan, Jemma L; Holmes, Edward C; Ho, Simon Y W

2016-11-15

In rapidly evolving pathogens, including viruses and some bacteria, genetic change can accumulate over short time-frames. Accordingly, their sampling times can be used to calibrate molecular clocks, allowing estimation of evolutionary rates. Methods for estimating rates from time-structured data vary in how they treat phylogenetic uncertainty and rate variation among lineages. We compiled 81 virus data sets and estimated nucleotide substitution rates using root-to-tip regression, least-squares dating and Bayesian inference. Although estimates from these three methods were often congruent, this largely relied on the choice of clock model. In particular, relaxed-clock models tended to produce higher rate estimates than methods that assume constant rates. Discrepancies in rate estimates were also associated with high among-lineage rate variation, and phylogenetic and temporal clustering. These results provide insights into the factors that affect the reliability of rate estimates from time-structured sequence data, emphasizing the importance of clock-model testing. Contact: sduchene@unimelb.edu.au or garzonsebastian@hotmail.com. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
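Of the three approaches named above, root-to-tip regression is the simplest to sketch: root-to-tip genetic distances are regressed on sampling dates, the slope estimates the substitution rate, and the x-intercept estimates the date of the root. The function name and the toy numbers below are assumptions for illustration, not data from the study.

```python
def root_to_tip_regression(dates, distances):
    """Ordinary least-squares fit of root-to-tip distance on sampling date.

    Returns (rate, intercept, root_date), where rate is the slope in
    substitutions per site per time unit and root_date is the x-intercept.
    """
    n = len(dates)
    mean_t = sum(dates) / n
    mean_d = sum(distances) / n
    sxx = sum((t - mean_t) ** 2 for t in dates)
    sxy = sum((t - mean_t) * (d - mean_d) for t, d in zip(dates, distances))
    rate = sxy / sxx
    intercept = mean_d - rate * mean_t
    root_date = -intercept / rate  # date at which the fitted distance is zero
    return rate, intercept, root_date


# Toy example: tips sampled in three years under a strict clock.
sampling_years = [2000.0, 2005.0, 2010.0]
root_to_tip = [0.010, 0.015, 0.020]
rate, intercept, root_date = root_to_tip_regression(sampling_years, root_to_tip)
print(rate, root_date)
```

The fit treats tips as independent points and assumes a single constant rate, which is one reason its estimates can diverge from those of relaxed-clock models.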
Sampling strategies for improving tree accuracy and phylogenetic analyses: a case study in ciliate protists, with notes on the genus Paramecium.

PubMed

Yi, Zhenzhen; Strüder-Kypke, Michaela; Hu, Xiaozhong; Lin, Xiaofeng; Song, Weibo

2014-02-01

In order to assess how dataset-selection for multi-gene analyses affects the accuracy of inferred phylogenetic trees in ciliates, we chose five genes and the genus Paramecium, one of the most widely used model protist genera, and compared tree topologies of the single- and multi-gene analyses. Our empirical study shows that: (1) Using multiple genes improves phylogenetic accuracy, even when their one-gene topologies are in conflict with each other. (2) The impact of missing data on phylogenetic accuracy is ambiguous: resolution power and topological similarity, but not number of represented taxa, are the most important criteria of a dataset for inclusion in concatenated analyses. (3) As an example, we tested the three classification models of the genus Paramecium with a multi-gene based approach, and only the monophyly of the subgenus Paramecium is supported. Copyright © 2013 Elsevier Inc. All rights reserved.
The ovary structure and oogenesis in the basal crustaceans and hexapods. Possible phylogenetic significance.

PubMed

Jaglarz, Mariusz K; Kubrakiewicz, Janusz; Bilinski, Szczepan M

2014-07-01

Recent large-scale phylogenetic analyses of exclusively molecular or combined molecular and morphological characters support a close relationship between Crustacea and Hexapoda. The growing consensus on this phylogenetic link is reflected in uniting both taxa under the name Pancrustacea or Tetraconata. Several recent molecular phylogenies have also indicated that the monophyletic hexapods should be nested within paraphyletic crustaceans. However, it is still contentious exactly which crustacean taxon is the sister group to Hexapoda. Among the favored candidates are Branchiopoda, Malacostraca, Remipedia and Xenocarida (Remipedia + Cephalocarida). In this context, we review morphological and ultrastructural features of the ovary architecture and oogenesis in these crustacean groups in search of traits potentially suitable for phylogenetic considerations. We have identified a suite of morphological characters which may prove useful in further comparative studies. Copyright © 2014 Elsevier Ltd. All rights reserved.
Urbanisation and the loss of phylogenetic diversity in birds.

PubMed

Sol, Daniel; Bartomeus, Ignasi; González-Lagos, César; Pavoine, Sandrine

2017-06-01

Despite the recognised conservation value of phylogenetic diversity, little is known about how it is affected by the urbanisation process. Combining a complete avian phylogeny with surveys along urbanisation gradients from five continents, we show that highly urbanised environments supported on average 450 million fewer years of evolutionary history than the surrounding natural environments. This loss was primarily caused by species loss and could have been higher had it not been partially compensated by the addition of urban exploiters and some exotic species. Highly urbanised environments also supported fewer evolutionarily distinctive species, implying a disproportionate loss of evolutionary history. Compared with highly urbanised environments, changes in phylogenetic richness and evolutionary distinctiveness were less substantial in moderately urbanised environments. Protecting pristine environments is therefore essential for maintaining phylogenetic diversity, but moderate levels of urbanisation still preserve much of the original diversity. © 2017 John Wiley & Sons Ltd/CNRS.
Phylogenetic congruence between subtropical trees and their associated fungi.

PubMed

Liu, Xubing; Liang, Minxia; Etienne, Rampal S; Gilbert, Gregory S; Yu, Shixiao

2016-12-01

Recent studies have detected phylogenetic signals in pathogen-host networks for both soil-borne and leaf-infecting fungi, suggesting that pathogenic fungi may track or coevolve with their preferred hosts. However, a phylogenetically concordant relationship between multiple hosts and multiple fungi has rarely been investigated. Using next-generation high-throughput DNA sequencing techniques, we analyzed fungal taxa associated with diseased leaves, rotten seeds, and infected seedlings of subtropical trees. We compared the topologies of the phylogenetic trees of the soil and foliar fungi based on the internal transcribed spacer (ITS) region with the phylogeny of host tree species based on matK, rbcL, atpB, and 5.8S genes. We identified 37 foliar and 103 soil pathogenic fungi belonging to the Ascomycota and Basidiomycota phyla and detected significantly nonrandom host-fungus combinations, which clustered on both the fungus phylogeny and the host phylogeny. The explicit evidence of congruent phylogenies between tree hosts and their potential fungal pathogens suggests either diffuse coevolution among the plant-fungal interaction networks or that the distribution of fungal species tracked spatially associated hosts with phylogenetically conserved traits and habitat preferences. Phylogenetic conservatism in plant-fungal interactions within a local community promotes host and parasite specificity, which is integral to the important role of fungi in promoting species coexistence and maintaining biodiversity of forest communities.
Evolution of specialization: a phylogenetic study of host range in the red milkweed beetle (Tetraopes tetraophthalmus).

PubMed

Rasmann, Sergio; Agrawal, Anurag A

2011-06-01

Specialization is common in most lineages of insect herbivores, one of the most diverse groups of organisms on earth. To address how and why specialization is maintained over evolutionary time, we hypothesized that plant defense and other ecological attributes of potential host plants would predict the performance of a specialist root-feeding herbivore (the red milkweed beetle, Tetraopes tetraophthalmus). Using a comparative phylogenetic and functional trait approach, we assessed the determinants of insect host range across 18 species of Asclepias. Larval survivorship decreased with increasing phylogenetic distance from the true host, Asclepias syriaca, suggesting that adaptation to plant traits drives specialization. Among several root traits measured, only cardenolides (toxic defense chemicals) correlated with larval survival, and cardenolides also explained the phylogenetic distance effect in phylogenetically controlled multiple regression analyses. Additionally, milkweed species having a known association with other Tetraopes beetles were better hosts than species lacking Tetraopes herbivores, and milkweeds with specific leaf area values (a trait related to leaf function and habitat affiliation) similar to those of A. syriaca were better hosts than species having divergent values. We thus conclude that phylogenetic distance is an integrated measure of phenotypic and ecological attributes of Asclepias species, especially defensive cardenolides, which can be used to explain specialization and constraints on host shifts over evolutionary time.
Applying a multiobjective metaheuristic inspired by honey bees to phylogenetic inference.

PubMed

Santander-Jiménez, Sergio; Vega-Rodríguez, Miguel A

2013-10-01

The development of increasingly popular multiobjective metaheuristics has allowed bioinformaticians to deal with optimization problems in computational biology where multiple objective functions must be taken into account. One of the most relevant research topics that can benefit from these techniques is phylogenetic inference. Throughout the years, different researchers have proposed their own view about the reconstruction of ancestral evolutionary relationships among species. As a result, biologists often report different phylogenetic trees from the same dataset when considering distinct optimality principles. In this work, we detail a multiobjective swarm intelligence approach based on the novel Artificial Bee Colony algorithm for inferring phylogenies. The aim of this paper is to propose a complementary view of phylogenetics according to the maximum parsimony and maximum likelihood criteria, in order to generate a set of phylogenetic trees that represent a compromise between these principles. Experimental results on a variety of nucleotide data sets and statistical studies highlight the relevance of the proposal with regard to other multiobjective algorithms and state-of-the-art biological methods. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
A comparative study of the inner ear structures of artiodactyls and early cetaceans

DOE Office of Scientific and Technical Information (OSTI.GOV)

Klingshirn, M.A.; Luo, Z.

1994-12-31

It has been suggested that the order Cetacea (whales and porpoises) is closely related to artiodactyls, even-hoofed ungulate mammals such as the pig and cow. Paleontological and molecular data strongly support this concept of phylogenetic relationships. In a study of DNA sequences of two mitochondrial ribosomal gene segments of cetaceans, the artiodactyls were found to be the closest relatives of cetaceans. These well-accepted studies on the phylogenetic affinities of artiodactyls and cetaceans led us to conduct a comparative study of the bony structure of the inner ear of these two taxa.
Language evolution and human history: what a difference a date makes.

PubMed

Gray, Russell D; Atkinson, Quentin D; Greenhill, Simon J

2011-04-12

Historical inference is at its most powerful when independent lines of evidence can be integrated into a coherent account. Dating linguistic and cultural lineages can potentially play a vital role in the integration of evidence from linguistics, anthropology, archaeology and genetics. Unfortunately, although the comparative method in historical linguistics can provide a relative chronology, it cannot provide absolute date estimates and an alternative approach, called glottochronology, is fundamentally flawed. In this paper we outline how computational phylogenetic methods can reliably estimate language divergence dates and thus help resolve long-standing debates about human prehistory ranging from the origin of the Indo-European language family to the peopling of the Pacific.
Language evolution and human history: what a difference a date makes

PubMed Central

Gray, Russell D.; Atkinson, Quentin D.; Greenhill, Simon J.

2011-01-01

Historical inference is at its most powerful when independent lines of evidence can be integrated into a coherent account. Dating linguistic and cultural lineages can potentially play a vital role in the integration of evidence from linguistics, anthropology, archaeology and genetics. Unfortunately, although the comparative method in historical linguistics can provide a relative chronology, it cannot provide absolute date estimates and an alternative approach, called glottochronology, is fundamentally flawed. In this paper we outline how computational phylogenetic methods can reliably estimate language divergence dates and thus help resolve long-standing debates about human prehistory ranging from the origin of the Indo-European language family to the peopling of the Pacific. PMID:21357231

Environmental and spatial drivers of taxonomic, functional, and phylogenetic characteristics of bat communities in human-modified landscapes

PubMed Central

Fagan, Matthew E.; Willig, Michael R.

2016-01-01

Background: Assembly of species into communities following human disturbance (e.g., deforestation, fragmentation) may be governed by spatial (e.g., dispersal) or environmental (e.g., niche partitioning) mechanisms. Variation partitioning has been used to broadly disentangle spatial and environmental mechanisms, and approaches utilizing functional and phylogenetic characteristics of communities have been implemented to determine the relative importance of particular environmental (or niche-based) mechanisms. Nonetheless, few studies have integrated these quantitative approaches to comprehensively assess the relative importance of particular structuring processes. Methods: We employed a novel variation partitioning approach to evaluate the relative importance of particular spatial and environmental drivers of taxonomic, functional, and phylogenetic aspects of bat communities in a human-modified landscape in Costa Rica. Specifically, we estimated the amount of variation in species composition (taxonomic structure) and in two aspects of functional and phylogenetic structure (i.e., composition and dispersion) along a forest loss and fragmentation gradient that are uniquely explained by landscape characteristics (i.e., environment) or space to assess the importance of competing mechanisms. Results: The unique effects of space on taxonomic, functional and phylogenetic structure were consistently small. In contrast, landscape characteristics (i.e., environment) played an appreciable role in structuring bat communities. Spatially-structured landscape characteristics explained 84% of the variation in functional or phylogenetic dispersion, and the unique effects of landscape characteristics significantly explained 14% of the variation in species composition. Furthermore, variation in bat community structure was primarily due to differences in dispersion of species within functional or phylogenetic space along the gradient, rather than due to differences in functional or phylogenetic composition. Discussion: Variation among bat communities was related to environmental mechanisms, especially niche-based (i.e., environmental) processes, rather than spatial mechanisms. High variation in functional or phylogenetic dispersion, as opposed to functional or phylogenetic composition, suggests that loss or gain of niche space is driving the progressive loss or gain of species with particular traits from communities along the human-modified gradient. Thus, environmental characteristics associated with landscape structure influence functional or phylogenetic aspects of bat communities by effectively altering the ways in which species partition niche space. PMID:27761338
Selecting Species Traits for Biomonitoring Applications in light of Phylogenetic Relationships among Lotic Insects

NASA Astrophysics Data System (ADS)

Poff, N.; Vieira, N. K.; Simmons, M. P.; Olden, J. D.; Kondratieff, B. C.; Finn, D. S.

2005-05-01

The use of species traits as indicators of environmental disturbance is being considered for biomonitoring programs globally. As such, methods to select relevant and informative traits for inclusion in biometrics need to be developed. In this research, we identified 20 traits of aquatic insects within six trait groups: morphology, mobility, life-history strategy, thermal tolerance, feeding guild and ecology (e.g., habitat preference). We constructed phylogenetic trees for 1) all lotic insect species of North America and 2) all Ephemeroptera, Plecoptera and Trichoptera species based on morphology- and molecular-based analyses and classifications. We then measured variability (i.e., plasticity) of the 20 traits and six trait groups across the two phylogenetic trees. Traits with higher degrees of plasticity indicated traits that were less phylogenetically constrained, and were considered informative for biomonitoring purposes. Thermal tolerance, rheophily, body size at maturity and feeding guild showed the highest plasticity across both phylogenetic trees. Two mobility traits, occurrence in drift and adult dispersal distance, showed moderate plasticity. By contrast, adult exiting ability, degree of attachment, adult lifespan and body shape showed low variability and were thus less informative. Plastic species traits that are less phylogenetically constrained may be most useful in detecting community change along environmental gradients.
Detecting and overcoming systematic errors in genome-scale phylogenies.

PubMed

Rodríguez-Ezpeleta, Naiara; Brinkmann, Henner; Roure, Béatrice; Lartillot, Nicolas; Lang, B Franz; Philippe, Hervé

2007-06-01

Genome-scale data sets result in an enhanced resolution of the phylogenetic inference by reducing stochastic errors. However, there is also an increase of systematic errors due to model violations, which can lead to erroneous phylogenies. Here, we explore the impact of systematic errors on the resolution of the eukaryotic phylogeny using a data set of 143 nuclear-encoded proteins from 37 species. The initial observation was that, despite the impressive amount of data, some branches had no significant statistical support. To demonstrate that this lack of resolution is due to a mutual annihilation of phylogenetic and nonphylogenetic signals, we created a series of data sets with slightly different taxon sampling. As expected, these data sets yielded strongly supported but mutually exclusive trees, thus confirming the presence of conflicting phylogenetic and nonphylogenetic signals in the original data set. To decide on the correct tree, we applied several methods expected to reduce the impact of some kinds of systematic error. Briefly, we show that (i) removing fast-evolving positions, (ii) recoding amino acids into functional categories, and (iii) using a site-heterogeneous mixture model (CAT) are three effective means of increasing the ratio of phylogenetic to nonphylogenetic signal. Finally, our results allow us to formulate guidelines for detecting and overcoming phylogenetic artefacts in genome-scale phylogenetic analyses.
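Step (ii), recoding amino acids into functional categories, can be sketched as a simple character mapping. The abstract does not say which categories were used; the six Dayhoff groups below are a common choice and are an assumption here, as are the names `DAYHOFF_GROUPS` and `recode`.

```python
# Six Dayhoff categories (assumed grouping): each amino acid is replaced by
# the digit of the category it belongs to.
DAYHOFF_GROUPS = {
    "AGPST": "1",  # small
    "DENQ": "2",   # acidic and amide
    "HKR": "3",    # basic
    "ILMV": "4",   # aliphatic hydrophobic
    "FWY": "5",    # aromatic
    "C": "6",      # cysteine
}
RECODE = {aa: code for group, code in DAYHOFF_GROUPS.items() for aa in group}


def recode(sequence: str) -> str:
    """Recode a protein sequence; gaps and unknown symbols pass through."""
    return "".join(RECODE.get(aa, aa) for aa in sequence.upper())


print(recode("ACDHIF-"))
```

Collapsing the alphabet discards substitutions within a category, which tend to be frequent and saturated, so the remaining between-category changes may carry a cleaner signal at the cost of some information.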
DAMBE7: New and Improved Tools for Data Analysis in Molecular Biology and Evolution.

PubMed

Xia, Xuhua

2018-06-01

DAMBE is a comprehensive software package for genomic and phylogenetic data analysis on Windows, Linux, and Macintosh computers. New functions include imputing missing distances and phylogeny simultaneously (paving the way to build large phage and transposon trees), new bootstrapping/jackknifing methods for PhyPA (phylogenetics from pairwise alignments), and an improved function for fast and accurate estimation of the shape parameter of the gamma distribution for fitting rate heterogeneity over sites. The previous method corrects multiple hits for each site independently, whereas DAMBE's new method uses all sites simultaneously for correction. DAMBE, featuring a user-friendly graphic interface, is freely available from http://dambe.bio.uottawa.ca (last accessed, April 17, 2018).
Fast Construction of Near Parsimonious Hybridization Networks for Multiple Phylogenetic Trees.

PubMed

Mirzaei, Sajad; Wu, Yufeng

2016-01-01

Hybridization networks represent plausible evolutionary histories of species that are affected by reticulate evolutionary processes. An established computational problem on hybridization networks is constructing the most parsimonious hybridization network such that each of the given phylogenetic trees (called gene trees) is "displayed" in the network. There have been several previous approaches, including an exact method and several heuristics, for this NP-hard problem. However, the exact method is only applicable to a limited range of data, and heuristic methods can be less accurate and also slow sometimes. In this paper, we develop a new algorithm for constructing near parsimonious networks for multiple binary gene trees. This method is more efficient for large numbers of gene trees than previous heuristics. This new method also produces more parsimonious results on many simulated datasets as well as a real biological dataset than a previous method. We also show that our method produces topologically more accurate networks for many datasets.
Comparative cytogenetic analysis of some species of the Dendropsophus microcephalus group (Anura, Hylidae) in the light of phylogenetic inferences

PubMed Central

2013-01-01

Background: Dendropsophus is a monophyletic anuran genus with a diploid number of 30 chromosomes as an important synapomorphy. However, the internal phylogenetic relationships of this genus are poorly understood. Interestingly, an intriguing interspecific variation in the telocentric chromosome number has been useful in species identification. To address certain uncertainties related to one of the species groups of Dendropsophus, the D. microcephalus group, we carried out a cytogenetic analysis combined with phylogenetic inferences based on mitochondrial sequences, which aimed to aid in the analysis of chromosomal characters. Populations of Dendropsophus nanus, Dendropsophus walfordi, Dendropsophus sanborni, Dendropsophus jimi and Dendropsophus elianeae, ranging from the extreme south to the north of Brazil, were cytogenetically compared. A mitochondrial region of the ribosomal 12S gene from these populations, as well as from 30 other species of Dendropsophus, was used for the phylogenetic inferences. Phylogenetic relationships were inferred using maximum parsimony and Bayesian analyses. Results: The species D. nanus and D. walfordi exhibited identical karyotypes (2n = 30; FN = 52), with four pairs of telocentric chromosomes and a NOR located on metacentric chromosome pair 13. In all of the phylogenetic hypotheses, the paraphyly of D. nanus and D. walfordi was inferred. D. sanborni from Botucatu-SP and Torres-RS showed the same karyotype as D. jimi, with 5 pairs of telocentric chromosomes (2n = 30; FN = 50) and a terminal NOR in the long arm of the telocentric chromosome pair 12. Despite their karyotypic similarity, these species were not found to compose a monophyletic group. Finally, the phylogenetic and cytogenetic analyses did not cluster the specimens of D. elianeae according to their geographical occurrence or recognized morphotypes. Conclusions: We suggest that a taxonomic revision of the taxa D. nanus and D. walfordi is necessary. We also observe that the number of telocentric chromosomes is useful to distinguish among valid species in some cases, although it is unchanged in species that are not necessarily closely related phylogenetically. Therefore, inferences based on this chromosomal character must be made with caution; a proper evolutionary analysis of the karyotypic variation in Dendropsophus depends on further characterization of the telocentric chromosomes found in this group. PMID:23822759
Restriction Fragment Length Polymorphism Analysis Reveals High Levels of Genetic Divergence Among the Light Organ Symbionts of Flashlight Fish.

PubMed

Wolfe, C J; Haygood, M G

1991-08-01

Restriction fragment length polymorphisms within the lux and 16S ribosomal RNA gene regions were used to compare unculturable bacterial light organ symbionts of several anomalopid fish species. The method of Nei and Li (1979) was used to calculate phylogenetic distance from the patterns of restriction fragment lengths of the luxA and 16S rRNA regions. Phylogenetic trees constructed from each distance matrix (luxA and 16S rDNA data) have similar branching orders. The levels of divergence among the symbionts, relative to other culturable luminous bacteria, suggest that the symbionts differ at the level of species among host fish genera. Symbiont relatedness and host geographic location do not seem to be correlated, and the symbionts do not appear to be strains of common, free-living, luminous bacteria. In addition, the small number of hybridizing fragments within the 16S rRNA region of the symbionts, compared with that of the free-living species, suggests a decrease in copy number of rRNA operons relative to free-living species. At this level of investigation, the symbiont phylogeny is consistent with the proposed phylogeny of the host fish family and suggests that each symbiont strain coevolved with its host fish species.
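The first quantity in a Nei and Li style comparison is the proportion of restriction fragments shared by two samples, F = 2·n_xy / (n_x + n_y). The sketch below computes only that proportion; converting F into an estimate of substitutions per site is a further step that is omitted here. The function name and the fragment lengths are hypothetical.

```python
def shared_fragment_fraction(fragments_x, fragments_y):
    """Proportion of shared fragments, F = 2 * n_xy / (n_x + n_y).

    Fragments are identified by length, so two bands of equal length
    within one sample are counted once.
    """
    x, y = set(fragments_x), set(fragments_y)
    return 2 * len(x & y) / (len(x) + len(y))


# Hypothetical banding patterns (fragment lengths in base pairs).
symbiont_1 = [1200, 800, 450, 300]
symbiont_2 = [1200, 800, 500]
print(shared_fragment_fraction(symbiont_1, symbiont_2))
```

A value of 1 means identical banding patterns and 0 means no fragments in common, so 1 - F can serve as a crude dissimilarity when building a distance matrix.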
Personality structure and social style in macaques.

PubMed

Adams, Mark James; Majolo, Bonaventura; Ostner, Julia; Schülke, Oliver; De Marco, Arianna; Thierry, Bernard; Engelhardt, Antje; Widdig, Anja; Gerald, Melissa S; Weiss, Alexander

2015-08-01

Why regularities in personality can be described with particular dimensions is a basic question in differential psychology. Nonhuman primates can also be characterized in terms of personality structure. Comparative approaches can help reveal phylogenetic constraints and social and ecological patterns associated with the presence or absence of specific personality dimensions. We sought to determine how different personality structures are related to interspecific variation in social style. Specifically, we examined this question in 6 different species of macaques, because macaque social style is well characterized and can be categorized on a spectrum of despotic (Grade 1) versus tolerant (Grade 4) social styles. We derived personality structures from adjectival ratings of Japanese (Macaca fuscata; Grade 1), Assamese (M. assamensis; Grade 2), Barbary (M. sylvanus; Grade 3), Tonkean (M. tonkeana; Grade 4), and crested (M. nigra; Grade 4) macaques and compared these species with rhesus macaques (M. mulatta; Grade 1) whose personality was previously characterized. Using a nonparametric method, fuzzy set analysis, to identify commonalities in personality dimensions across species, we found that all but 1 species exhibited consistently defined Friendliness and Openness dimensions, but that similarities in personality dimensions capturing aggression and social competence reflect similarities in social styles. These findings suggest that social and phylogenetic relationships contribute to the origin, maintenance, and diversification of personality. (c) 2015 APA, all rights reserved.
On the phylogenetic placement of human T cell leukemia virus type 1 sequences associated with an Andean mummy.

PubMed

Coulthart, Michael B; Posada, David; Crandall, Keith A; Dekaban, Gregory A

2006-03-01

Recently, the putative finding of ancient human T cell leukemia virus type 1 (HTLV-1) long terminal repeat (LTR) DNA sequences in association with a 1500-year-old Chilean mummy has stirred vigorous debate. The debate is based partly on the inherent uncertainties associated with phylogenetic reconstruction when only short sequences of closely related genotypes are available. However, a full analysis of what phylogenetic information is present in the mummy data has not previously been published, leaving open the question of what precisely is the range of admissible interpretation. To fulfill this need, we re-analyzed the mummy data in a new way. We first performed phylogenetic analysis of 188 published LTR DNA sequences from extant strains belonging to the HTLV-1 Cosmopolitan clade, using the method of statistical parsimony which is designed both to optimize phylogenetic resolution among sequences with little evolutionary divergence, and to permit precise mapping of individual sequence mutations onto branches of a divergence network. We then deduced possible phylogenetic positions for the two main categories of published Chilean mummy sequences, based on their published 157-nucleotide LTR sequences. The possible phylogenetic placements for one of the mummy sequence categories are consistent with a modern origin. However, one of these placements for the other mummy sequence category falls very close to the root of the Cosmopolitan clade, consistent with an ancient origin for both this mummy sequence and the Cosmopolitan clade.
pez: phylogenetics for the environmental sciences.

PubMed

Pearse, William D; Cadotte, Marc W; Cavender-Bares, Jeannine; Ives, Anthony R; Tucker, Caroline M; Walker, Steve C; Helmus, Matthew R

2015-09-01

pez is an R package that permits measurement, modelling and simulation of phylogenetic structure in ecological data. pez contains the first implementation of many methods in R, and aggregates existing data structures and methods into a single, coherent package. pez is released under the GPL v3 open-source license, available on the Internet from CRAN (http://cran.r-project.org). The package is under active development, and the authors welcome contributions (see http://github.com/willpearse/pez). Contact: will.pearse@gmail.com. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Molecular, phylogenetic and comparative genomic analysis of the cytokinin oxidase/dehydrogenase gene family in the Poaceae.

PubMed

Mameaux, Sabine; Cockram, James; Thiel, Thomas; Steuernagel, Burkhard; Stein, Nils; Taudien, Stefan; Jack, Peter; Werner, Peter; Gray, John C; Greenland, Andy J; Powell, Wayne

2012-01-01

The genomes of cereals such as wheat (Triticum aestivum) and barley (Hordeum vulgare) are large and therefore problematic for the map-based cloning of agronomically important traits. However, comparative approaches within the Poaceae permit transfer of molecular knowledge between species, despite their divergence from a common ancestor sixty million years ago. The finding that null variants of the rice gene cytokinin oxidase/dehydrogenase 2 (OsCKX2) result in large yield increases provides an opportunity to explore whether similar gains could be achieved in other Poaceae members. Here, phylogenetic, molecular and comparative analyses of CKX families in the sequenced grass species rice, brachypodium, sorghum, maize and foxtail millet, as well as members identified from the transcriptomes/genomes of wheat and barley, are presented. Phylogenetic analyses define four Poaceae CKX clades. Comparative analyses showed that CKX phylogenetic groupings can largely be explained by a combination of local gene duplication, and the whole-genome duplication event that predates their speciation. Full-length OsCKX2 homologues in barley (HvCKX2.1, HvCKX2.2) and wheat (TaCKX2.3, TaCKX2.4, TaCKX2.5) are characterized, with comparative analysis at the DNA, protein and genetic/physical map levels suggesting that true CKX2 orthologs have been identified. Furthermore, our analysis shows CKX2 genes in barley and wheat have undergone a Triticeae-specific gene-duplication event. Finally, by identifying ten of the eleven CKX genes predicted to be present in barley by comparative analyses, we show that next-generation sequencing approaches can efficiently determine the gene space of large-genome crops. Together, this work provides the foundation for future functional investigation of CKX family members within the Poaceae. © 2011 National Institute of Agricultural Botany (NIAB).
Plant Biotechnology Journal © 2011 Society for Experimental Biology, Association of Applied Biologists and Blackwell Publishing Ltd.
Phylogenetic Analyses of Shigella and Enteroinvasive Escherichia coli for the Identification of Molecular Epidemiological Markers: Whole-Genome Comparative Analysis Does Not Support Distinct Genera Designation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Pettengill, Emily A.; Pettengill, James B.; Binet, Rachel

As a leading cause of bacterial dysentery, Shigella represents a significant threat to public health and food safety. Related, but often overlooked, enteroinvasive Escherichia coli (EIEC) can also cause dysentery. Current typing methods have limited ability to identify and differentiate between these pathogens despite the need for rapid and accurate identification of pathogens for clinical treatment and outbreak response. We present a comprehensive phylogeny of Shigella and EIEC using whole genome sequencing of 169 samples, constituting unparalleled strain diversity, and observe a lack of monophyly between Shigella and EIEC and among Shigella taxonomic groups. The evolutionary relationships in the phylogeny are supported by analyses of population structure and hierarchical clustering patterns of translated gene homolog abundance. Lastly, we identified a panel of 404 single nucleotide polymorphism (SNP) markers specific to each phylogenetic cluster for more accurate identification of Shigella and EIEC. Our findings show that Shigella and EIEC are not distinct evolutionary groups within the E. coli genus and, thus, EIEC as a group is not the ancestor to Shigella. The multiple analyses presented provide evidence for reconsidering the taxonomic placement of Shigella. The SNP markers offer more discriminatory power to molecular epidemiological typing methods involving these bacterial pathogens.
Phylogenetic Analyses of Shigella and Enteroinvasive Escherichia coli for the Identification of Molecular Epidemiological Markers: Whole-Genome Comparative Analysis Does Not Support Distinct Genera Designation

DOE PAGES

Pettengill, Emily A.; Pettengill, James B.; Binet, Rachel

2016-01-19

As a leading cause of bacterial dysentery, Shigella represents a significant threat to public health and food safety. Related, but often overlooked, enteroinvasive Escherichia coli (EIEC) can also cause dysentery. Current typing methods have limited ability to identify and differentiate between these pathogens despite the need for rapid and accurate identification of pathogens for clinical treatment and outbreak response. We present a comprehensive phylogeny of Shigella and EIEC using whole genome sequencing of 169 samples, constituting unparalleled strain diversity, and observe a lack of monophyly between Shigella and EIEC and among Shigella taxonomic groups. The evolutionary relationships in the phylogeny are supported by analyses of population structure and hierarchical clustering patterns of translated gene homolog abundance. Lastly, we identified a panel of 404 single nucleotide polymorphism (SNP) markers specific to each phylogenetic cluster for more accurate identification of Shigella and EIEC. Our findings show that Shigella and EIEC are not distinct evolutionary groups within the E. coli genus and, thus, EIEC as a group is not the ancestor to Shigella. The multiple analyses presented provide evidence for reconsidering the taxonomic placement of Shigella. The SNP markers offer more discriminatory power to molecular epidemiological typing methods involving these bacterial pathogens.
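The idea of SNP markers "specific to each phylogenetic cluster" can be illustrated with a small sketch. It is not the authors' pipeline: it assumes a pre-aligned set of sequences of equal length, a known cluster label per sample, and a simple criterion (an allele fixed within one cluster and never seen in any other). The function name, sample names and cluster labels are hypothetical, and gaps or ambiguous bases are not treated specially.

```python
def cluster_specific_snps(alignment, clusters):
    """Find alignment positions where one cluster carries a fixed allele
    that does not occur in any other cluster.

    alignment: dict mapping sample name -> aligned sequence (equal lengths)
    clusters:  dict mapping sample name -> cluster label
    Returns a list of (position, cluster, allele) tuples, 0-based positions.
    """
    samples = list(alignment)
    length = len(alignment[samples[0]])
    markers = []
    for pos in range(length):
        # Collect the set of alleles observed in each cluster at this site.
        alleles_by_cluster = {}
        for sample in samples:
            label = clusters[sample]
            alleles_by_cluster.setdefault(label, set()).add(alignment[sample][pos])
        for label, alleles in alleles_by_cluster.items():
            if len(alleles) != 1:
                continue  # allele is not fixed within this cluster
            allele = next(iter(alleles))
            others = set().union(
                *(a for c, a in alleles_by_cluster.items() if c != label)
            )
            # Require at least one other cluster, and no sharing of the allele.
            if others and allele not in others:
                markers.append((pos, label, allele))
    return markers


# Toy data: two clusters, four samples (all names are made up).
alignment = {"s1": "ACGT", "s2": "ACGT", "s3": "ATGA", "s4": "ATGC"}
clusters = {"s1": "X", "s2": "X", "s3": "Y", "s4": "Y"}
markers = cluster_specific_snps(alignment, clusters)
print(markers)
```

In the toy alignment, position 1 separates both clusters, while position 3 is diagnostic only for cluster X because cluster Y is polymorphic there.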
Ortholog-based screening and identification of genes related to intracellular survival.

PubMed

Yang, Xiaowen; Wang, Jiawei; Bing, Guoxia; Bie, Pengfei; De, Yanyan; Lyu, Yanli; Wu, Qingmin

2018-04-20

Bioinformatics and comparative genomics analysis methods were used to predict unknown pathogen genes based on homology with identified or functionally clustered genes. In this study, the genes of common pathogens were analyzed to screen and identify genes associated with intracellular survival through sequence similarity, phylogenetic tree analysis and the λ-Red recombination system test method. The total 38,952 protein-coding genes of common pathogens were divided into 19,775 clusters. As demonstrated through a COG analysis, information storage and processing genes might play an important role in intracellular survival. Only 19 clusters were present in facultative intracellular pathogens, and not all were present in extracellular pathogens. Construction of a phylogenetic tree selected 18 of these 19 clusters. Comparisons with the DEG database and previous research revealed that seven other clusters are considered essential gene clusters and that seven other clusters are associated with intracellular survival. Moreover, this study confirmed that clusters screened by orthologs with similar function could be replaced with an approved uvrY gene and its orthologs, and the results revealed that the usg gene is associated with intracellular survival. The study improves the current understanding of the characteristics of intracellular pathogens and allows further exploration of the intracellular survival-related gene modules in these pathogens. Copyright © 2018. Published by Elsevier B.V.
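The presence/absence screening step described above can be sketched as a set filter. The abstract does not spell out the exact criterion, so the sketch assumes one plausible reading: keep ortholog clusters found in every facultative intracellular genome but missing from at least one extracellular genome. The function name, cluster ids and genome names are hypothetical.

```python
def screen_clusters(clusters, intracellular, extracellular):
    """Select clusters present in all intracellular genomes and absent
    from at least one extracellular genome (an assumed criterion).

    clusters: dict mapping cluster id -> set of genomes containing it
    """
    intracellular = set(intracellular)
    extracellular = set(extracellular)
    selected = []
    for cluster_id, genomes in clusters.items():
        genomes = set(genomes)
        in_all_intra = intracellular <= genomes  # subset test
        in_all_extra = extracellular <= genomes
        if in_all_intra and not in_all_extra:
            selected.append(cluster_id)
    return sorted(selected)


# Toy presence/absence table (all names are made up).
clusters = {
    "c1": {"intraA", "intraB", "extraA"},
    "c2": {"intraA", "intraB", "extraA", "extraB"},
    "c3": {"intraA", "extraA"},
    "c4": {"intraA", "intraB"},
}
result = screen_clusters(clusters, ["intraA", "intraB"], ["extraA", "extraB"])
print(result)
```

Here c2 is dropped because it occurs in every genome, and c3 because it is missing from one intracellular genome.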
Whole-proteome phylogeny of large dsDNA viruses and parvoviruses through a composition vector method related to dynamical language model

PubMed Central

2010-01-01

Background: The vast sequence divergence among different virus groups has presented a great challenge to alignment-based analysis of virus phylogeny. Due to the problems caused by the uncertainty in alignment, existing tools for phylogenetic analysis based on multiple alignment could not be directly applied to the whole-genome comparison and phylogenomic studies of viruses. There has been a growing interest in alignment-free methods for phylogenetic analysis using complete genome data. Among the alignment-free methods, a dynamical language (DL) method proposed by our group has successfully been applied to the phylogenetic analysis of bacteria and chloroplast genomes. Results: In this paper, the DL method is used to analyze the whole-proteome phylogeny of 124 large dsDNA viruses and 30 parvoviruses, two data sets with a large difference in genome size. The trees from our analyses are in good agreement with the latest classification of large dsDNA viruses and parvoviruses by the International Committee on Taxonomy of Viruses (ICTV). Conclusions: The present method provides a new way for recovering the phylogeny of large dsDNA viruses and parvoviruses, and also some insights on the affiliation of a number of unclassified viruses. In comparison, some alignment-free methods such as the CV Tree method can be used for recovering the phylogeny of large dsDNA viruses, but they are not suitable for resolving the phylogeny of parvoviruses with a much smaller genome size. PMID:20565983
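The general flavor of alignment-free, composition-based comparison can be shown with a minimal sketch. It is deliberately simplified: it builds plain k-mer frequency vectors from protein sequences and compares them with a cosine-based distance, and it omits the background-noise subtraction that the DL and CV Tree methods apply. The function names and toy sequences are illustrative assumptions.

```python
from collections import Counter
from math import sqrt


def kmer_frequencies(proteome, k):
    """Relative frequencies of all overlapping k-mers in a list of sequences."""
    counts = Counter()
    for seq in proteome:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kmer: n / total for kmer, n in counts.items()}


def cosine_distance(u, v):
    """Distance (1 - cos) / 2 between two sparse frequency vectors."""
    dot = sum(u[key] * v[key] for key in u.keys() & v.keys())
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 1.0  # convention chosen here for empty vectors
    return (1.0 - dot / (norm_u * norm_v)) / 2.0


# Toy "proteomes" (made-up peptide strings).
a = kmer_frequencies(["MKVLA"], 2)
b = kmer_frequencies(["GGGG"], 2)
print(cosine_distance(a, a), cosine_distance(a, b))
```

A matrix of such pairwise distances could then be passed to a distance-based tree-building method such as neighbor joining.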
The Impact of Reconstruction Methods, Phylogenetic Uncertainty and Branch Lengths on Inference of Chromosome Number Evolution in American Daisies (Melampodium, Asteraceae)

PubMed Central

McCann, Jamie; Stuessy, Tod F.; Villaseñor, Jose L.; Weiss-Schneeweiss, Hanna

2016-01-01

Chromosome number change (polyploidy and dysploidy) plays an important role in plant diversification and speciation. Investigating chromosome number evolution commonly entails ancestral state reconstruction performed within a phylogenetic framework, which is, however, prone to uncertainty, whose effects on evolutionary inferences are insufficiently understood. Using the chromosomally diverse plant genus Melampodium (Asteraceae) as a model group, we assess the impact of reconstruction method (maximum parsimony, maximum likelihood, Bayesian methods), branch length model (phylograms versus chronograms) and phylogenetic uncertainty (topological and branch length uncertainty) on the inference of chromosome number evolution. We also address the suitability of the maximum clade credibility (MCC) tree as a single representative topology for chromosome number reconstruction. Each of the listed factors causes considerable incongruence among chromosome number reconstructions. Discrepancies between inferences on the MCC tree and those made by integrating over a set of trees are moderate for ancestral chromosome numbers, but severe for the difference of chromosome gains and losses, a measure of the directionality of dysploidy. Therefore, reliance on single trees, such as the MCC tree, is strongly discouraged and model averaging, taking both phylogenetic and model uncertainty into account, is recommended. For studying chromosome number evolution, dedicated models implemented in the program ChromEvol and ordered maximum parsimony may be most appropriate. Chromosome number evolution in Melampodium follows a pattern of bidirectional dysploidy (starting from x = 11 to x = 9 and x = 14, respectively) with no prevailing direction. PMID:27611687
The Impact of Reconstruction Methods, Phylogenetic Uncertainty and Branch Lengths on Inference of Chromosome Number Evolution in American Daisies (Melampodium, Asteraceae).

PubMed

McCann, Jamie; Schneeweiss, Gerald M; Stuessy, Tod F; Villaseñor, Jose L; Weiss-Schneeweiss, Hanna

2016-01-01

Chromosome number change (polyploidy and dysploidy) plays an important role in plant diversification and speciation. Investigating chromosome number evolution commonly entails ancestral state reconstruction performed within a phylogenetic framework, which is, however, prone to uncertainty, whose effects on evolutionary inferences are insufficiently understood. Using the chromosomally diverse plant genus Melampodium (Asteraceae) as a model group, we assess the impact of reconstruction method (maximum parsimony, maximum likelihood, Bayesian methods), branch length model (phylograms versus chronograms) and phylogenetic uncertainty (topological and branch length uncertainty) on the inference of chromosome number evolution. We also address the suitability of the maximum clade credibility (MCC) tree as a single representative topology for chromosome number reconstruction. Each of the listed factors causes considerable incongruence among chromosome number reconstructions. Discrepancies between inferences on the MCC tree and those made by integrating over a set of trees are moderate for ancestral chromosome numbers, but severe for the difference of chromosome gains and losses, a measure of the directionality of dysploidy. Therefore, reliance on single trees, such as the MCC tree, is strongly discouraged and model averaging, taking both phylogenetic and model uncertainty into account, is recommended. For studying chromosome number evolution, dedicated models implemented in the program ChromEvol and ordered maximum parsimony may be most appropriate. Chromosome number evolution in Melampodium follows a pattern of bidirectional dysploidy (starting from x = 11 to x = 9 and x = 14, respectively) with no prevailing direction.
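Why a reconstruction can depend on the tree it is run on may be easier to see in a small sketch. The example uses the bottom-up pass of unordered (Fitch) maximum parsimony, which is a simpler stand-in and not the ordered parsimony or ChromEvol models named in the abstract. Trees are assumed to be rooted, binary, and written as nested 2-tuples; the taxon names and chromosome numbers are made up.

```python
def fitch(node, states):
    """Bottom-up Fitch pass on a rooted binary tree.

    node:   a leaf name (str) or a 2-tuple of child nodes
    states: dict mapping leaf name -> observed state
    Returns (candidate state set at this node, minimum number of changes).
    """
    if isinstance(node, str):
        return {states[node]}, 0
    left, right = node
    left_set, left_changes = fitch(left, states)
    right_set, right_changes = fitch(right, states)
    common = left_set & right_set
    if common:
        # Children agree on at least one state: no extra change needed.
        return common, left_changes + right_changes
    # Children disagree: take the union and count one change.
    return left_set | right_set, left_changes + right_changes + 1


# Toy chromosome numbers for four hypothetical taxa.
states = {"a": 11, "b": 11, "c": 9, "d": 9}
tree1 = (("a", "b"), ("c", "d"))
tree2 = (("a", "c"), ("b", "d"))
results = [fitch(tree, states) for tree in (tree1, tree2)]
print(results)
```

Both toy topologies leave the root state ambiguous between 9 and 11, but the second implies two changes instead of one, which illustrates in miniature why summarizing over a set of trees can give a different picture than any single tree.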
The floral transcriptomes of four bamboo species (Bambusoideae; Poaceae): support for common ancestry among woody bamboos.

PubMed

Wysocki, William P; Ruiz-Sanchez, Eduardo; Yin, Yanbin; Duvall, Melvin R

2016-05-20

Next-generation sequencing now allows for total RNA extracts to be sequenced in non-model organisms such as bamboos, an economically and ecologically important group of grasses. Bamboos are divided into three lineages, two of which are woody perennials with bisexual flowers, which undergo gregarious monocarpy. The third lineage, which are herbaceous perennials, possesses unisexual flowers that undergo annual flowering events. Transcriptomes were assembled using both reference-based and de novo methods. These two methods were tested by characterizing transcriptome content using sequence alignment to previously characterized reference proteomes and by identifying Pfam domains. Because of the striking differences in floral morphology and phenology between the herbaceous and woody bamboo lineages, MADS-box genes, transcription factors that control floral development and timing, were characterized and analyzed in this study. Transcripts were identified using phylogenetic methods and categorized as A, B, C, D or E-class genes, which control floral development, or SOC or SVP-like genes, which control the timing of flowering events. Putative nuclear orthologues were also identified in bamboos to use as phylogenetic markers. Instances of gene copies exhibiting topological patterns that correspond to shared phenotypes were observed in several gene families including floral development and timing genes. Alignments and phylogenetic trees were generated for 3,878 genes and for all genes in a concatenated analysis. Both the concatenated analysis and those of 2,412 separate gene trees supported monophyly among the woody bamboos, which is incongruent with previous phylogenetic studies using plastid markers.
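Counting how many gene trees support monophyly of a group can be sketched as follows. This is an illustrative assumption about the bookkeeping, not the authors' workflow: trees are taken to be rooted and given as nested tuples of leaf names, a group counts as monophyletic when some clade contains exactly the group members present in that tree, and all taxon names are hypothetical.

```python
def leaf_set(node):
    """Set of leaf names below a node (a leaf is a str, a clade is a tuple)."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaf_set(child) for child in node))


def clades(node):
    """Yield the leaf set of every clade in the tree, including the root."""
    yield leaf_set(node)
    if not isinstance(node, str):
        for child in node:
            yield from clades(child)


def is_monophyletic(tree, group):
    """True if the group members present in the tree form exactly one clade."""
    present = frozenset(group) & leaf_set(tree)
    return bool(present) and any(clade == present for clade in clades(tree))


# Toy gene trees (made-up taxa).
woody = {"woodyA", "woodyB"}
gene_trees = [
    ((("woodyA", "woodyB"), "herbA"), "outgroup"),
    (("woodyA", "herbA"), ("woodyB", "outgroup")),
    (("woodyA", "woodyB"), ("herbA", "outgroup")),
]
support = sum(is_monophyletic(tree, woody) for tree in gene_trees)
print(support, "of", len(gene_trees), "gene trees support monophyly")
```

Recomputing leaf sets at every clade is quadratic in tree size, which is acceptable for a sketch but would be cached in a real analysis of thousands of gene trees.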
Cyber infrastructure for Fusarium: three integrated platforms supporting strain identification, phylogenetics, comparative genomics and knowledge sharing.

PubMed

Park, Bongsoo; Park, Jongsun; Cheong, Kyeong-Chae; Choi, Jaeyoung; Jung, Kyongyong; Kim, Donghan; Lee, Yong-Hwan; Ward, Todd J; O'Donnell, Kerry; Geiser, David M; Kang, Seogchan

2011-01-01

The fungal genus Fusarium includes many plant and/or animal pathogenic species and produces diverse toxins. Although accurate species identification is critical for managing such threats, it is difficult to identify Fusarium morphologically. Fortunately, extensive molecular phylogenetic studies, founded on well-preserved culture collections, have established a robust foundation for Fusarium classification. Genomes of four Fusarium species have been published with more being currently sequenced. The Cyber infrastructure for Fusarium (CiF; http://www.fusariumdb.org/) was built to support archiving and utilization of rapidly increasing data and knowledge and consists of Fusarium-ID, Fusarium Comparative Genomics Platform (FCGP) and Fusarium Community Platform (FCP). The Fusarium-ID archives phylogenetic marker sequences from most known species along with information associated with characterized isolates and supports strain identification and phylogenetic analyses. The FCGP currently archives five genomes from four species. Besides supporting genome browsing and analysis, the FCGP presents computed characteristics of multiple gene families and functional groups. The Cart/Favorite function allows users to collect sequences from Fusarium-ID and the FCGP and analyze them later using multiple tools without requiring repeated copying-and-pasting of sequences. The FCP is designed to serve as an online community forum for sharing and preserving accumulated experience and knowledge to support future research and education.
Comparative Study of Lectin Domains in Model Species: New Insights into Evolutionary Dynamics

PubMed Central

Van Holle, Sofie; De Schutter, Kristof; Eggermont, Lore; Tsaneva, Mariya; Dang, Liuyi; Van Damme, Els J. M.

2017-01-01

Lectins are present throughout the plant kingdom and are reported to be involved in diverse biological processes. In this study, we provide a comparative analysis of the lectin families from model species in a phylogenetic framework. The analysis focuses on the different plant lectin domains identified in five representative core angiosperm genomes (Arabidopsis thaliana, Glycine max, Cucumis sativus, Oryza sativa ssp. japonica and Oryza sativa ssp. indica). The genomes were screened for genes encoding lectin domains using a combination of Basic Local Alignment Search Tool (BLAST), hidden Markov models, and InterProScan analysis. Additionally, phylogenetic relationships were investigated by constructing maximum likelihood phylogenetic trees. The results demonstrate that the majority of the lectin families are present in each of the species under study. Domain organization analysis showed that most identified proteins are multi-domain proteins, owing to the modular rearrangement of protein domains during evolution. Most of these multi-domain proteins are widespread, while others display a lineage-specific distribution. Furthermore, the phylogenetic analyses reveal that some lectin families evolved to be similar to the phylogeny of the plant species, while others share a closer evolutionary history based on the corresponding protein domain architecture. Our results yield insights into the evolutionary relationships and functional divergence of plant lectins. PMID:28587095
