Science.gov

Sample records for alignment phylogenetic tree

  1. Detecting phylogenetic breakpoints and discordance from genome-wide alignments for species tree reconstruction.

    PubMed

    Ané, Cécile

    2011-01-01

    With the easy acquisition of sequence data, it is now possible to obtain and align whole genomes across multiple related species or populations. In this work, I assess the performance of a statistical method to reconstruct the whole distribution of phylogenetic trees along the genome, estimate the proportion of the genome for which a given clade is true, and infer a concordance tree that summarizes the dominant vertical inheritance pattern. There are two main issues when dealing with whole-genome alignments, as opposed to multiple genes: the size of the data and the detection of recombination breakpoints. These breakpoints partition the genomic alignment into phylogenetically homogeneous loci, where sites within a given locus all share the same phylogenetic tree topology. To delimitate these loci, I describe here a method based on the minimum description length (MDL) principle, implemented with dynamic programming for computational efficiency. Simulations show that combining MDL partitioning with Bayesian concordance analysis provides an efficient and robust way to estimate both the vertical inheritance signal and the horizontal phylogenetic signal. The method performed well both in the presence of incomplete lineage sorting and in the presence of horizontal gene transfer. A high level of systematic bias was found here, highlighting the need for good individual tree building methods, which form the basis for more elaborate gene tree/species tree reconciliation methods. PMID:21362638
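    The dynamic-programming partitioning mentioned in this abstract can be illustrated with a generic sketch. It is not the author's implementation: the cost function below (the number of sites disagreeing with a segment's majority label) and the fixed per-segment `penalty` are hypothetical stand-ins for a real description length, and the per-site labels are toy data.

```python
from collections import Counter

def segment_cost(labels, start, end):
    """Toy cost (assumption): sites in labels[start:end] that disagree
    with the majority label of that segment."""
    window = labels[start:end]
    return len(window) - Counter(window).most_common(1)[0][1]

def best_partition(labels, penalty):
    """Split labels into contiguous segments minimising the sum of
    segment costs plus a fixed penalty per segment."""
    n = len(labels)
    best = [0.0] + [float("inf")] * n   # best[j]: optimal score for labels[:j]
    back = [0] * (n + 1)                # back[j]: start of the last segment
    for end in range(1, n + 1):
        for start in range(end):
            score = best[start] + segment_cost(labels, start, end) + penalty
            if score < best[end]:
                best[end] = score
                back[end] = start
    # Walk the back-pointers to recover the interior breakpoints.
    breakpoints = []
    pos = n
    while pos > 0:
        pos = back[pos]
        if pos > 0:
            breakpoints.append(pos)
    return sorted(breakpoints), best[n]

breakpoints, score = best_partition("AAAABBBB", penalty=1.5)
print(breakpoints, score)
```

    In this toy input, each character stands for the topology favoured by one site, so the single recovered breakpoint separates the two homogeneous blocks. The double loop is quadratic in the number of sites.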

  2. Phylogenetic Inference from Conserved Sites Alignments

    SciTech Connect

    Grundy, W.N.; Naylor, G.J.P.

    1999-08-15

    Molecular sequences provide a rich source of data for inferring the phylogenetic relationships among species. However, recent work indicates that even an accurate multiple alignment of a large sequence set may yield an incorrect phylogeny and that the quality of the phylogenetic tree improves when the input consists only of the highly conserved, motif regions of the alignment. This work introduces two methods of producing multiple alignments that include only the conserved regions of the initial alignment. The first method retains conserved motifs, whereas the second retains individual conserved sites in the initial alignment. Using parsimony analysis on a mitochondrial data set containing 19 species among which the phylogenetic relationships are widely accepted, both conserved alignment methods produce better phylogenetic trees than the complete alignment. Unlike any of the 19 inference methods used before to analyze this data, both methods produce trees that are completely consistent with the known phylogeny. The motif-based method employs far fewer alignment sites for comparable error rates. For a larger data set containing mitochondrial sequences from 39 species, the site-based method produces a phylogenetic tree that is largely consistent with known phylogenetic relationships and suggests several novel placements.
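    As a rough illustration of the site-based idea, the sketch below keeps only the alignment columns in which a single residue dominates. The abstract does not state the conservation criterion, so the `min_fraction` threshold, the gap symbol and the function names are assumptions made for this example.

```python
from collections import Counter

def conserved_columns(alignment, min_fraction=0.8, gap="-"):
    """Return indices of columns whose most common non-gap residue is
    present in at least min_fraction of all sequences."""
    n_seqs = len(alignment)
    keep = []
    for col in range(len(alignment[0])):
        residues = [seq[col] for seq in alignment if seq[col] != gap]
        if not residues:
            continue  # column made up entirely of gaps
        top_count = Counter(residues).most_common(1)[0][1]
        if top_count / n_seqs >= min_fraction:
            keep.append(col)
    return keep

def filter_alignment(alignment, columns):
    """Build a reduced alignment containing only the given columns."""
    return ["".join(seq[c] for c in columns) for seq in alignment]

# Toy alignment: five sequences, five columns.
alignment = ["ACGT-", "ACGA-", "ACCTA", "ACGTA", "ATGTA"]
columns = conserved_columns(alignment)
reduced = filter_alignment(alignment, columns)
print(columns, reduced)
```

    The reduced alignment could then be passed to any tree-building program in place of the full one.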

  3. A Universal Phylogenetic Tree.

    ERIC Educational Resources Information Center

    Offner, Susan

    2001-01-01

    Presents a universal phylogenetic tree suitable for use in high school and college-level biology classrooms. Illustrates the antiquity of life and that all life is related, even if it dates back 3.5 billion years. Reflects important evolutionary relationships and provides an exciting way to learn about the history of life. (SAH)

  4. Interim Report on Multiple Sequence Alignments and TaqMan Signature Mapping to Phylogenetic Trees

    SciTech Connect

    Gardner, S; Jaing, C

    2012-03-27

    The goal of this project is to develop forensic genotyping assays for select agent viruses, addressing a significant capability gap for the viral bioforensics and law enforcement community. We used a multipronged approach combining bioinformatics analysis, PCR-enriched samples, microarrays and TaqMan assays to develop high resolution and cost effective genotyping methods for strain level forensic discrimination of viruses. We have leveraged substantial experience and efficiency gained through year 1 on software development, SNP discovery, TaqMan signature design and phylogenetic signature mapping to scale up the development of forensics signatures in year 2. In this report, we have summarized the TaqMan signature development for South American hemorrhagic fever viruses, tick-borne encephalitis viruses and henipaviruses, Old World Arenaviruses, filoviruses, Crimean-Congo hemorrhagic fever virus, Rift Valley fever virus and Japanese encephalitis virus.

  5. Phylogenetic trees in bioinformatics

    SciTech Connect

    Burr, Tom L

    2008-01-01

    Genetic data is often used to infer evolutionary relationships among a collection of viruses, bacteria, animal or plant species, or other operational taxonomic units (OTU). A phylogenetic tree depicts such relationships and provides a visual representation of the estimated branching order of the OTUs. Tree estimation is unique for several reasons, including: the types of data used to represent each OTU; the use of probabilistic nucleotide substitution models; the inference goals involving both tree topology and branch length, and the huge number of possible trees for a given sample of a very modest number of OTUs, which implies that finding the best tree(s) to describe the genetic data for each OTU is computationally demanding. Bioinformatics is too large a field to review here. We focus on that aspect of bioinformatics that includes study of similarities in genetic data from multiple OTUs. Although research questions are diverse, a common underlying challenge is to estimate the evolutionary history of the OTUs. Therefore, this paper reviews the role of phylogenetic tree estimation in bioinformatics, available methods and software, and identifies areas for additional research and development.
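    The "huge number of possible trees" can be made concrete with the standard count of labelled, fully bifurcating trees: (2n-5)!! unrooted topologies for n taxa, and (2n-3)!! rooted ones. The short sketch below evaluates these; the function names are chosen for this example only.

```python
def count_unrooted_trees(n_taxa):
    """Number of unrooted binary tree topologies on n_taxa labelled
    leaves: (2n - 5)!! = 3 * 5 * ... * (2n - 5)."""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    count = 1
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count

def count_rooted_trees(n_taxa):
    """Rooting adds one leaf-like attachment point, so the rooted count
    for n taxa equals the unrooted count for n + 1 taxa."""
    return count_unrooted_trees(n_taxa + 1)

for n in (4, 5, 10, 20):
    print(n, count_unrooted_trees(n), count_rooted_trees(n))
```

    Ten taxa already give more than two million unrooted topologies, which is why exhaustive search is feasible only for very small data sets.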

  6. Visualizing phylogenetic trees using TreeView.

    PubMed

    Page, Roderic D M

    2002-08-01

    TreeView provides a simple way to view the phylogenetic trees produced by a range of programs, such as PAUP*, PHYLIP, TREE-PUZZLE, and ClustalX. While some phylogenetic programs (such as the Macintosh version of PAUP*) have excellent tree printing facilities, many programs do not have the ability to generate publication quality trees. TreeView addresses this need. The program can read and write a range of tree file formats, display trees in a variety of styles, print trees, and save the tree as a graphic file. Protocols in this unit cover both displaying and printing a tree. Support protocols describe how to download and install TreeView, and how to display bootstrap values in trees generated by ClustalX and PAUP*. PMID:18792942

  7. PoInTree: a polar and interactive phylogenetic tree.

    PubMed

    Carreras, Marco; Gianti, Eleonora; Sartori, Luca; Plyte, Simon Edward; Isacchi, Antonella; Bosotti, Roberta

    2005-02-01

    PoInTree (Polar and Interactive Tree) is an application that allows users to build, visualize and customize phylogenetic trees in a polar, interactive and highly flexible view. It takes as input a FASTA file or multiple alignment formats. Phylogenetic tree calculation is based on a sequence distance method and utilizes the Neighbor Joining (NJ) algorithm. It also allows displaying precalculated trees of the major protein families based on Pfam classification. In PoInTree, nodes can be dynamically opened and closed and distances between genes are graphically represented. Tree root can be centered on a selected leaf. Text search mechanism, color-coding and labeling display are integrated. The visualizer can be connected to an Oracle database containing information on sequences and other biological data, helping to guide their interpretation within a given protein family across multiple species. The application is written in Borland Delphi and based on VCL Teechart Pro 6 graphical component (Steema software). PMID:16144524
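    For readers unfamiliar with Neighbor Joining, the following is a minimal textbook-style sketch of the algorithm on a small distance matrix. It is not PoInTree's code (which is written in Delphi); the edge-list output format, the `node0`, `node1`, ... labels for internal nodes and the five-taxon matrix are assumptions made for illustration.

```python
def neighbor_joining(names, dist):
    """Return a list of (node, neighbour, branch_length) edges.
    dist is a symmetric dict-of-dicts of pairwise distances."""
    nodes = list(names)
    d = {a: dict(dist[a]) for a in names}   # working copy
    edges = []
    next_id = 0
    while len(nodes) > 2:
        n = len(nodes)
        total = {a: sum(d[a][b] for b in nodes if b != a) for a in nodes}
        # Choose the pair minimising the NJ criterion Q.
        best = None
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                q = (n - 2) * d[a][b] - total[a] - total[b]
                if best is None or q < best[0]:
                    best = (q, a, b)
        _, a, b = best
        u = f"node{next_id}"
        next_id += 1
        # Branch lengths from the joined pair to the new internal node.
        len_a = 0.5 * d[a][b] + (total[a] - total[b]) / (2 * (n - 2))
        len_b = d[a][b] - len_a
        edges += [(a, u, len_a), (b, u, len_b)]
        # Distances from the new node to every remaining node.
        d[u] = {}
        for c in nodes:
            if c not in (a, b):
                d[u][c] = d[c][u] = 0.5 * (d[a][c] + d[b][c] - d[a][b])
        nodes = [c for c in nodes if c not in (a, b)] + [u]
    a, b = nodes
    edges.append((a, b, d[a][b]))
    return edges

names = ["a", "b", "c", "d", "e"]
matrix = [[0, 5, 9, 9, 8],
          [5, 0, 10, 10, 9],
          [9, 10, 0, 8, 7],
          [9, 10, 8, 0, 3],
          [8, 9, 7, 3, 0]]
dist = {x: {y: matrix[i][j] for j, y in enumerate(names)}
        for i, x in enumerate(names)}
edges = neighbor_joining(names, dist)
for edge in edges:
    print(edge)
```

    The example matrix is additive, so the path lengths in the resulting tree reproduce the input distances exactly; with real sequence distances the tree is only an approximation.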

  8. Interpreting the universal phylogenetic tree

    NASA Technical Reports Server (NTRS)

    Woese, C. R.

    2000-01-01

    The universal phylogenetic tree not only spans all extant life, but its root and earliest branchings represent stages in the evolutionary process before modern cell types had come into being. The evolution of the cell is an interplay between vertically derived and horizontally acquired variation. Primitive cellular entities were necessarily simpler and more modular in design than are modern cells. Consequently, horizontal gene transfer early on was pervasive, dominating the evolutionary dynamic. The root of the universal phylogenetic tree represents the first stage in cellular evolution when the evolving cell became sufficiently integrated and stable to the erosive effects of horizontal gene transfer that true organismal lineages could exist.

  9. Interpreting the universal phylogenetic tree

    PubMed Central

    Woese, Carl R.

    2000-01-01

    The universal phylogenetic tree not only spans all extant life, but its root and earliest branchings represent stages in the evolutionary process before modern cell types had come into being. The evolution of the cell is an interplay between vertically derived and horizontally acquired variation. Primitive cellular entities were necessarily simpler and more modular in design than are modern cells. Consequently, horizontal gene transfer early on was pervasive, dominating the evolutionary dynamic. The root of the universal phylogenetic tree represents the first stage in cellular evolution when the evolving cell became sufficiently integrated and stable to the erosive effects of horizontal gene transfer that true organismal lineages could exist. PMID:10900003

  10. Multipolar consensus for phylogenetic trees.

    PubMed

    Bonnard, Cécile; Berry, Vincent; Lartillot, Nicolas

    2006-10-01

    Collections of phylogenetic trees are usually summarized using consensus methods. These methods build a single tree, supposed to be representative of the collection. However, in the case of heterogeneous collections of trees, the resulting consensus may be poorly resolved (strict consensus, majority-rule consensus, ...), or may perform arbitrary choices among mutually incompatible clades, or splits (greedy consensus). Here, we propose an alternative method, which we call the multipolar consensus (MPC). Its aim is to display all the splits having a support above a predefined threshold, in a minimum number of consensus trees, or poles. We show that the problem is equivalent to a graph-coloring problem, and propose an implementation of the method. Finally, we apply the MPC to real data sets. Our results indicate that, typically, all the splits down to a weight of 10% can be displayed in no more than 4 trees. In addition, in some cases, biologically relevant secondary signals, which would not have been present in any of the classical consensus trees, are indeed captured by our method, indicating that the MPC provides a convenient exploratory method for phylogenetic analysis. The method was implemented in a package freely available at http://www.lirmm.fr/~cbonnard/MPC.html PMID:17060203
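    The link to graph coloring can be sketched as follows: two splits conflict when they cannot appear in the same tree, and assigning conflicting splits to different "poles" is a coloring of the conflict graph. The sketch uses a simple first-fit greedy coloring, which does not guarantee the minimum number of poles and is not the authors' implementation; splits are represented as one side of a bipartition, and all names are hypothetical.

```python
def compatible(split_a, split_b, taxa):
    """Two splits of the same taxon set are compatible unless all four
    pairwise intersections of their sides are non-empty."""
    a1, b1 = split_a, split_b
    a2, b2 = taxa - a1, taxa - b1
    return not (a1 & b1 and a1 & b2 and a2 & b1 and a2 & b2)

def greedy_poles(splits, taxa):
    """Place each split in the first pole whose splits are all
    compatible with it, opening a new pole when none fits."""
    poles = []
    for split in splits:   # assumed to be ordered by decreasing support
        for pole in poles:
            if all(compatible(split, other, taxa) for other in pole):
                pole.append(split)
                break
        else:
            poles.append([split])
    return poles

taxa = frozenset("ABCDE")
splits = [frozenset("AB"), frozenset("CD"), frozenset("BC")]
poles = greedy_poles(splits, taxa)
print(len(poles), poles)
```

    Here the split separating B and C from the rest conflicts with both other splits, so it is displayed in a second pole.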

  11. New substitution models for rooting phylogenetic trees

    PubMed Central

    Williams, Tom A.; Heaps, Sarah E.; Cherlin, Svetlana; Nye, Tom M. W.; Boys, Richard J.; Embley, T. Martin

    2015-01-01

    The root of a phylogenetic tree is fundamental to its biological interpretation, but standard substitution models do not provide any information on its position. Here, we describe two recently developed models that relax the usual assumptions of stationarity and reversibility, thereby facilitating root inference without the need for an outgroup. We compare the performance of these models on a classic test case for phylogenetic methods, before considering two highly topical questions in evolutionary biology: the deep structure of the tree of life and the root of the archaeal radiation. We show that all three alignments contain meaningful rooting information that can be harnessed by these new models, thus complementing and extending previous work based on outgroup rooting. In particular, our analyses exclude the root of the tree of life from the eukaryotes or Archaea, placing it on the bacterial stem or within the Bacteria. They also exclude the root of the archaeal radiation from several major clades, consistent with analyses using other rooting methods. Overall, our results demonstrate the utility of non-reversible and non-stationary models for rooting phylogenetic trees, and identify areas where further progress can be made. PMID:26323766

  12. New substitution models for rooting phylogenetic trees.

    PubMed

    Williams, Tom A; Heaps, Sarah E; Cherlin, Svetlana; Nye, Tom M W; Boys, Richard J; Embley, T Martin

    2015-09-26

    The root of a phylogenetic tree is fundamental to its biological interpretation, but standard substitution models do not provide any information on its position. Here, we describe two recently developed models that relax the usual assumptions of stationarity and reversibility, thereby facilitating root inference without the need for an outgroup. We compare the performance of these models on a classic test case for phylogenetic methods, before considering two highly topical questions in evolutionary biology: the deep structure of the tree of life and the root of the archaeal radiation. We show that all three alignments contain meaningful rooting information that can be harnessed by these new models, thus complementing and extending previous work based on outgroup rooting. In particular, our analyses exclude the root of the tree of life from the eukaryotes or Archaea, placing it on the bacterial stem or within the Bacteria. They also exclude the root of the archaeal radiation from several major clades, consistent with analyses using other rooting methods. Overall, our results demonstrate the utility of non-reversible and non-stationary models for rooting phylogenetic trees, and identify areas where further progress can be made. PMID:26323766

  13. On Tree-Based Phylogenetic Networks.

    PubMed

    Zhang, Louxin

    2016-07-01

    A large class of phylogenetic networks can be obtained from trees by the addition of horizontal edges between the tree edges. These networks are called tree-based networks. We present a simple necessary and sufficient condition for tree-based networks and prove that a universal tree-based network exists for any number of taxa that contains as its base every phylogenetic tree on the same set of taxa. This answers two problems posted by Francis and Steel recently. A byproduct is a computer program for generating random binary phylogenetic networks under the uniform distribution model. PMID:27228397

  14. Transforming phylogenetic networks: Moving beyond tree space.

    PubMed

    Huber, Katharina T; Moulton, Vincent; Wu, Taoyang

    2016-09-01

    Phylogenetic networks are a generalization of phylogenetic trees that are used to represent reticulate evolution. Unrooted phylogenetic networks form a special class of such networks, which naturally generalize unrooted phylogenetic trees. In this paper we define two operations on unrooted phylogenetic networks, one of which is a generalization of the well-known nearest-neighbor interchange (NNI) operation on phylogenetic trees. We show that any unrooted phylogenetic network can be transformed into any other such network using only these operations. This generalizes the well-known fact that any phylogenetic tree can be transformed into any other such tree using only NNI operations. It also allows us to define a generalization of tree space and to define some new metrics on unrooted phylogenetic networks. To prove our main results, we employ some fascinating new connections between phylogenetic networks and cubic graphs that we have recently discovered. Our results should be useful in developing new strategies to search for optimal phylogenetic networks, a topic that has recently generated some interest in the literature, as well as for providing new ways to compare networks. PMID:27224010

  15. A perl package and an alignment tool for phylogenetic networks

    PubMed Central

    Cardona, Gabriel; Rosselló, Francesc; Valiente, Gabriel

    2008-01-01

    Background: Phylogenetic networks are a generalization of phylogenetic trees that allow for the representation of evolutionary events acting at the population level, like recombination between genes, hybridization between lineages, and lateral gene transfer. While most phylogenetics tools implement a wide range of algorithms on phylogenetic trees, there exist only a few applications to work with phylogenetic networks, none of which are open-source libraries, and they do not allow for the comparative analysis of phylogenetic networks by computing distances between them or aligning them. Results: In order to improve this situation, we have developed a Perl package that relies on the BioPerl bundle and implements many algorithms on phylogenetic networks. We have also developed a Java applet that makes use of the aforementioned Perl package and allows the user to make simple experiments with phylogenetic networks without having to develop a program or Perl script by him or herself. Conclusion: The Perl package is available as part of the BioPerl bundle, and can also be downloaded. A web-based application is also available (see availability and requirements). The Perl package includes full documentation of all its features. PMID:18371228

  16. The space of ultrametric phylogenetic trees.

    PubMed

    Gavryushkin, Alex; Drummond, Alexei J

    2016-08-21

    The reliability of a phylogenetic inference method from genomic sequence data is ensured by its statistical consistency. Bayesian inference methods produce a sample of phylogenetic trees from the posterior distribution given sequence data. Hence the question of statistical consistency of such methods is equivalent to the consistency of the summary of the sample. More generally, statistical consistency is ensured by the tree space used to analyse the sample. In this paper, we consider two standard parameterisations of phylogenetic time-trees used in evolutionary models: inter-coalescent interval lengths and absolute times of divergence events. For each of these parameterisations we introduce a natural metric space on ultrametric phylogenetic trees. We compare the introduced spaces with existing models of tree space and formulate several formal requirements that a metric space on phylogenetic trees must possess in order to be a satisfactory space for statistical analysis, and justify them. We show that only a few known constructions of the space of phylogenetic trees satisfy these requirements. However, our results suggest that these basic requirements are not enough to distinguish between the two metric spaces we introduce and that the choice between metric spaces requires additional properties to be considered. Particularly, that the summary tree minimising the square distance to the trees from the sample might be different for different parameterisations. This suggests that further fundamental insight is needed into the problem of statistical consistency of phylogenetic inference methods. PMID:27188249

  17. Using tree diversity to compare phylogenetic heuristics

    PubMed Central

    Sul, Seung-Jin; Matthews, Suzanne; Williams, Tiffani L

    2009-01-01

    Background: Evolutionary trees are family trees that represent the relationships between a group of organisms. Phylogenetic heuristics are used to search stochastically for the best-scoring trees in tree space. Given that better tree scores are believed to be better approximations of the true phylogeny, traditional evaluation techniques have used tree scores to determine the heuristics that find the best scores in the fastest time. We develop new techniques to evaluate phylogenetic heuristics based on both tree scores and topologies to compare Pauprat and Rec-I-DCM3, two popular Maximum Parsimony search algorithms. Results: Our results show that although Pauprat and Rec-I-DCM3 find the trees with the same best scores, topologically these trees are quite different. Furthermore, the Rec-I-DCM3 trees cluster distinctly from the Pauprat trees. In addition to our heatmap visualizations of using parsimony scores and the Robinson-Foulds distance to compare best-scoring trees found by the two heuristics, we also develop entropy-based methods to show the diversity of the trees found. Overall, Pauprat identifies more diverse trees than Rec-I-DCM3. Conclusion: Overall, our work shows that there is value to comparing heuristics beyond the parsimony scores that they find. Pauprat is a slower heuristic than Rec-I-DCM3. However, our work shows that there is tremendous value in using Pauprat to reconstruct trees, especially since it finds identical-scoring but topologically distinct trees. Hence, instead of discounting Pauprat, effort should go into improving its implementation. Ultimately, improved performance measures lead to better phylogenetic heuristics and will result in better approximations of the true evolutionary history of the organisms of interest. PMID:19426451
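    The Robinson-Foulds distance used in this study counts the bipartitions (splits) found in one tree but not the other. The sketch below computes it for trees written as nested Python tuples with string leaves; that representation and the function names are assumptions for this example, and both trees are assumed to share the same taxon set.

```python
def leaf_sets(tree, out):
    """Return the leaf set of tree and add the leaf set of every
    internal node to out."""
    if isinstance(tree, str):
        return frozenset([tree])
    leaves = frozenset().union(*(leaf_sets(child, out) for child in tree))
    out.add(leaves)
    return leaves

def bipartitions(tree):
    """Non-trivial splits of the tree, each stored as the side that
    does not contain a fixed reference taxon (so rooting is ignored)."""
    clades = set()
    taxa = leaf_sets(tree, clades)
    reference = min(taxa)
    splits = set()
    for clade in clades:
        side = taxa - clade if reference in clade else clade
        if 1 < len(side) < len(taxa) - 1:
            splits.add(side)
    return splits

def robinson_foulds(tree_a, tree_b):
    """Size of the symmetric difference of the two split sets."""
    return len(bipartitions(tree_a) ^ bipartitions(tree_b))

t1 = ((("A", "B"), "C"), ("D", "E"))
t2 = ((("A", "C"), "B"), ("D", "E"))
t1_rerooted = ("A", ("B", ("C", ("D", "E"))))
print(robinson_foulds(t1, t2), robinson_foulds(t1, t1_rerooted))
```

    Two trees with equal parsimony scores can still have a large Robinson-Foulds distance, which is the kind of topological difference the abstract describes.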

  18. Terrestrial apes and phylogenetic trees

    PubMed Central

    Arsuaga, Juan Luis

    2010-01-01

    The image that best expresses Darwin’s thinking is the tree of life. However, Darwin’s human evolutionary tree lacked almost everything because only the Neanderthals were known at the time and they were considered one extreme expression of our own species. Darwin believed that the root of the human tree was very deep and in Africa. It was not until 1962 that the root was shown to be much more recent in time and definitively in Africa. On the other hand, some neo-Darwinians believed that our family tree was not a tree, because there were no branches, but, rather, a straight stem. The recent years have witnessed spectacular discoveries in Africa that take us close to the origin of the human tree and in Spain at Atapuerca that help us better understand the origin of the Neanderthals as well as our own species. The final form of the tree, and the number of branches, remains an object of passionate debate. PMID:20445090

  19. Analyzing and Synthesizing Phylogenies Using Tree Alignment Graphs

    PubMed Central

    Smith, Stephen A.; Brown, Joseph W.; Hinchliff, Cody E.

    2013-01-01

    Phylogenetic trees are used to analyze and visualize evolution. However, trees can be imperfect datatypes when summarizing multiple trees. This is especially problematic when accommodating biological phenomena such as horizontal gene transfer, incomplete lineage sorting, and hybridization, as well as topological conflict between datasets. Additionally, researchers may want to combine information from sets of trees that have partially overlapping taxon sets. To address the problem of analyzing sets of trees with conflicting relationships and partially overlapping taxon sets, we introduce methods for aligning, synthesizing and analyzing rooted phylogenetic trees within a graph, called a tree alignment graph (TAG). The TAG can be queried and analyzed to explore uncertainty and conflict. It can also be synthesized to construct trees, presenting an alternative to supertree approaches. We demonstrate these methods with two empirical datasets. In order to explore uncertainty, we constructed a TAG of the bootstrap trees from the Angiosperm Tree of Life project. Analysis of the resulting graph demonstrates that areas of the dataset that are unresolved in majority-rule consensus tree analyses can be understood in more detail within the context of a graph structure, using measures incorporating node degree and adjacency support. As an exercise in synthesis (i.e., summarization of a TAG constructed from the alignment trees), we also construct a TAG consisting of the taxonomy and source trees from a recent comprehensive bird study. We synthesized this graph into a tree that can be reconstructed in a repeatable fashion and where the underlying source information can be updated. The methods presented here are tractable for large scale analyses and serve as a basis for an alternative to consensus tree and supertree methods. Furthermore, the exploration of these graphs can expose structures and patterns within the dataset that are otherwise difficult to observe. PMID:24086118

  20. Relating phylogenetic trees to transmission trees of infectious disease outbreaks.

    PubMed

    Ypma, Rolf J F; van Ballegooijen, W Marijn; Wallinga, Jacco

    2013-11-01

    Transmission events are the fundamental building blocks of the dynamics of any infectious disease. Much about the epidemiology of a disease can be learned when these individual transmission events are known or can be estimated. Such estimations are difficult and generally feasible only when detailed epidemiological data are available. The genealogy estimated from genetic sequences of sampled pathogens is another rich source of information on transmission history. Optimal inference of transmission events calls for the combination of genetic data and epidemiological data into one joint analysis. A key difficulty is that the transmission tree, which describes the transmission events between infected hosts, differs from the phylogenetic tree, which describes the ancestral relationships between pathogens sampled from these hosts. The trees differ both in timing of the internal nodes and in topology. These differences become more pronounced when a higher fraction of infected hosts is sampled. We show how the phylogenetic tree of sampled pathogens is related to the transmission tree of an outbreak of an infectious disease, by the within-host dynamics of pathogens. We provide a statistical framework to infer key epidemiological and mutational parameters by simultaneously estimating the phylogenetic tree and the transmission tree. We test the approach using simulations and illustrate its use on an outbreak of foot-and-mouth disease. The approach unifies existing methods in the emerging field of phylodynamics with transmission tree reconstruction methods that are used in infectious disease epidemiology. PMID:24037268

  1. Genome-Scale Phylogenetics: Inferring the Plant Tree of Life from 18,896 Gene Trees

    PubMed Central

    Burleigh, J. Gordon; Bansal, Mukul S.; Eulenstein, Oliver; Hartmann, Stefanie; Wehe, André; Vision, Todd J.

    2011-01-01

    Phylogenetic analyses using genome-scale data sets must confront incongruence among gene trees, which in plants is exacerbated by frequent gene duplications and losses. Gene tree parsimony (GTP) is a phylogenetic optimization criterion in which a species tree that minimizes the number of gene duplications induced among a set of gene trees is selected. The run time performance of previous implementations has limited its use on large-scale data sets. We used new software that incorporates recent algorithmic advances to examine the performance of GTP on a plant data set consisting of 18,896 gene trees containing 510,922 protein sequences from 136 plant taxa (giving a combined alignment length of >2.9 million characters). The relationships inferred from the GTP analysis were largely consistent with previous large-scale studies of backbone plant phylogeny and resolved some controversial nodes. The placement of taxa that were present in few gene trees generally varied the most among GTP bootstrap replicates. Excluding these taxa either before or after the GTP analysis revealed high levels of phylogenetic support across plants. The analyses supported magnoliids sister to a eudicot + monocot clade and did not support the eurosid I and II clades. This study presents a nuclear genomic perspective on the broad-scale phylogenic relationships among plants, and it demonstrates that nuclear genes with a history of duplication and loss can be phylogenetically informative for resolving the plant tree of life. PMID:21186249

  2. Constructing Student Problems in Phylogenetic Tree Construction.

    ERIC Educational Resources Information Center

    Brewer, Steven D.

    Evolution is often equated with natural selection and is taught from a primarily functional perspective while comparative and historical approaches, which are critical for developing an appreciation of the power of evolutionary theory, are often neglected. This report describes a study of expert problem-solving in phylogenetic tree construction.…

  3. Current Methods for Automated Filtering of Multiple Sequence Alignments Frequently Worsen Single-Gene Phylogenetic Inference

    PubMed Central

    Tan, Ge; Muffato, Matthieu; Ledergerber, Christian; Herrero, Javier; Goldman, Nick; Gil, Manuel; Dessimoz, Christophe

    2015-01-01

    Phylogenetic inference is generally performed on the basis of multiple sequence alignments (MSA). Because errors in an alignment can lead to errors in tree estimation, there is a strong interest in identifying and removing unreliable parts of the alignment. In recent years several automated filtering approaches have been proposed, but despite their popularity, a systematic and comprehensive comparison of different alignment filtering methods on real data has been lacking. Here, we extend and apply recently introduced phylogenetic tests of alignment accuracy on a large number of gene families and contrast the performance of unfiltered versus filtered alignments in the context of single-gene phylogeny reconstruction. Based on multiple genome-wide empirical and simulated data sets, we show that the trees obtained from filtered MSAs are on average worse than those obtained from unfiltered MSAs. Furthermore, alignment filtering often leads to an increase in the proportion of well-supported branches that are actually wrong. We confirm that our findings hold for a wide range of parameters and methods. Although our results suggest that light filtering (up to 20% of alignment positions) has little impact on tree accuracy and may save some computation time, contrary to widespread practice, we do not generally recommend the use of current alignment filtering methods for phylogenetic inference. By providing a way to rigorously and systematically measure the impact of filtering on alignments, the methodology set forth here will guide the development of better filtering algorithms. PMID:26031838

  4. Current Methods for Automated Filtering of Multiple Sequence Alignments Frequently Worsen Single-Gene Phylogenetic Inference.

    PubMed

    Tan, Ge; Muffato, Matthieu; Ledergerber, Christian; Herrero, Javier; Goldman, Nick; Gil, Manuel; Dessimoz, Christophe

    2015-09-01

    Phylogenetic inference is generally performed on the basis of multiple sequence alignments (MSA). Because errors in an alignment can lead to errors in tree estimation, there is a strong interest in identifying and removing unreliable parts of the alignment. In recent years several automated filtering approaches have been proposed, but despite their popularity, a systematic and comprehensive comparison of different alignment filtering methods on real data has been lacking. Here, we extend and apply recently introduced phylogenetic tests of alignment accuracy on a large number of gene families and contrast the performance of unfiltered versus filtered alignments in the context of single-gene phylogeny reconstruction. Based on multiple genome-wide empirical and simulated data sets, we show that the trees obtained from filtered MSAs are on average worse than those obtained from unfiltered MSAs. Furthermore, alignment filtering often leads to an increase in the proportion of well-supported branches that are actually wrong. We confirm that our findings hold for a wide range of parameters and methods. Although our results suggest that light filtering (up to 20% of alignment positions) has little impact on tree accuracy and may save some computation time, contrary to widespread practice, we do not generally recommend the use of current alignment filtering methods for phylogenetic inference. By providing a way to rigorously and systematically measure the impact of filtering on alignments, the methodology set forth here will guide the development of better filtering algorithms. PMID:26031838

  5. Quantifying MCMC exploration of phylogenetic tree space.

    PubMed

    Whidden, Chris; Matsen, Frederick A

    2015-05-01

    In order to gain an understanding of the effectiveness of phylogenetic Markov chain Monte Carlo (MCMC), it is important to understand how quickly the empirical distribution of the MCMC converges to the posterior distribution. In this article, we investigate this problem on phylogenetic tree topologies with a metric that is especially well suited to the task: the subtree prune-and-regraft (SPR) metric. This metric directly corresponds to the minimum number of MCMC rearrangements required to move between trees in common phylogenetic MCMC implementations. We develop a novel graph-based approach to analyze tree posteriors and find that the SPR metric is much more informative than simpler metrics that are unrelated to MCMC moves. In doing so, we show conclusively that topological peaks do occur in Bayesian phylogenetic posteriors from real data sets as sampled with standard MCMC approaches, investigate the efficiency of Metropolis-coupled MCMC (MCMCMC) in traversing the valleys between peaks, and show that conditional clade distribution (CCD) can have systematic problems when there are multiple peaks. PMID:25631175

  6. Quantifying MCMC Exploration of Phylogenetic Tree Space

    PubMed Central

    Whidden, Chris; Matsen, Frederick A.

    2015-01-01

    In order to gain an understanding of the effectiveness of phylogenetic Markov chain Monte Carlo (MCMC), it is important to understand how quickly the empirical distribution of the MCMC converges to the posterior distribution. In this article, we investigate this problem on phylogenetic tree topologies with a metric that is especially well suited to the task: the subtree prune-and-regraft (SPR) metric. This metric directly corresponds to the minimum number of MCMC rearrangements required to move between trees in common phylogenetic MCMC implementations. We develop a novel graph-based approach to analyze tree posteriors and find that the SPR metric is much more informative than simpler metrics that are unrelated to MCMC moves. In doing so, we show conclusively that topological peaks do occur in Bayesian phylogenetic posteriors from real data sets as sampled with standard MCMC approaches, investigate the efficiency of Metropolis-coupled MCMC (MCMCMC) in traversing the valleys between peaks, and show that conditional clade distribution (CCD) can have systematic problems when there are multiple peaks. PMID:25631175

  7. PhyPA: Phylogenetic method with pairwise sequence alignment outperforms likelihood methods in phylogenetics involving highly diverged sequences.

    PubMed

    Xia, Xuhua

    2016-09-01

    While pairwise sequence alignment (PSA) by dynamic programming is guaranteed to generate one of the optimal alignments, multiple sequence alignment (MSA) of highly divergent sequences often results in poorly aligned sequences, plaguing all subsequent phylogenetic analysis. One way to avoid this problem is to use only PSA to reconstruct phylogenetic trees, which can only be done with distance-based methods. I compared the accuracy of this new computational approach (named PhyPA for phylogenetics by pairwise alignment) against the maximum likelihood method using MSA (the ML+MSA approach), based on nucleotide, amino acid and codon sequences simulated with different topologies and tree lengths. I present a surprising discovery that the fast PhyPA method consistently outperforms the slow ML+MSA approach for highly diverged sequences even when all optimization options were turned on for the ML+MSA approach. Only when sequences are not highly diverged (i.e., when a reliable MSA can be obtained) does the ML+MSA approach outperform PhyPA. The true topologies are always recovered by ML with the true alignment from the simulation. However, with MSA derived from alignment programs such as MAFFT or MUSCLE, the recovered topology consistently has higher likelihood than that for the true topology. Thus, the failure of the ML+MSA approach to recover the true topology is not because of insufficient search of tree space, but because of the distortion of phylogenetic signal by MSA methods. I have implemented PhyPA in DAMBE, along with two approaches that use multi-gene data sets to derive phylogenetic support for subtrees, equivalent to resampling techniques such as bootstrapping and jackknifing. PMID:27377322

  8. A Phylogenetic Analysis of the Brassicales Clade Based on an Alignment-Free Sequence Comparison Method

    PubMed Central

    Hatje, Klas; Kollmar, Martin

    2012-01-01

    Phylogenetic analyses reveal the evolutionary derivation of species. A phylogenetic tree can be inferred from multiple sequence alignments of proteins or genes. The alignment of whole genome sequences of higher eukaryotes is a computationally intensive and ambitious task, as is the computation of phylogenetic trees based on these alignments. To overcome these limitations, we here used an alignment-free method to compare genomes of the Brassicales clade. For each nucleotide sequence a Chaos Game Representation (CGR) can be computed, which represents each nucleotide of the sequence as a point in a square defined by the four nucleotides as vertices. Each CGR is therefore a unique fingerprint of the underlying sequence. If the CGRs are divided by grid lines, each grid square denotes the occurrence of oligonucleotides of a specific length in the sequence (Frequency Chaos Game Representation, FCGR). Here, we used distance measures between FCGRs to infer phylogenetic trees of Brassicales species. Three types of data were analyzed because of their different characteristics: (A) Whole genome assemblies as far as available for species belonging to the Malvidae taxon. (B) EST data of species of the Brassicales clade. (C) Mitochondrial genomes of the Rosids branch, a supergroup of the Malvidae. The trees reconstructed based on the Euclidean distance method are in general agreement with single gene trees. The Fitch–Margoliash and Neighbor joining algorithms resulted in similar to identical trees. Here, for the first time we have applied the bootstrap re-sampling concept to trees based on FCGRs to determine the support of the branchings. FCGRs have the advantage that they are fast to calculate, and can be used as additional information to alignment based data and morphological characteristics to improve the phylogenetic classification of species in ambiguous cases. PMID:22952468
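
    The CGR and FCGR construction described in this abstract can be illustrated with a minimal sketch. The corner assignment (A, C, G, T at the four corners of the unit square), the starting point at the centre, the oligonucleotide length k, and the function names are all assumptions made for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
from collections import Counter
from itertools import product
import math

# Assumed corner layout of the unit square (a common convention, not taken from the abstract).
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}

def cgr_points(seq):
    """Chaos Game Representation: each nucleotide moves the current point
    halfway towards the corner assigned to that nucleotide."""
    x, y = 0.5, 0.5  # assumed start: centre of the square
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue  # skip ambiguous symbols such as N
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points

def fcgr(seq, k=3):
    """Frequency CGR as a vector of relative k-mer frequencies.
    Each k-mer corresponds to one grid square of a 2^k x 2^k grid."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]

def euclidean(a, b):
    """Euclidean distance between two FCGR vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

    Pairwise distances computed this way between all genomes form a distance matrix, which a distance-based method such as neighbor joining can then turn into a tree.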

  9. BigFoot: Bayesian alignment and phylogenetic footprinting with MCMC

    PubMed Central

    Satija, Rahul; Novák, Ádám; Miklós, István; Lyngsø, Rune; Hein, Jotun

    2009-01-01

    Background: We have previously combined statistical alignment and phylogenetic footprinting to detect conserved functional elements without assuming a fixed alignment. Considering a probability-weighted distribution of alignments removes sensitivity to alignment errors, properly accommodates regions of alignment uncertainty, and increases the accuracy of functional element prediction. Our method utilized standard dynamic programming hidden Markov model algorithms to analyze up to four sequences. Results: We present a novel approach, implemented in the software package BigFoot, for performing phylogenetic footprinting on greater numbers of sequences. We have developed a Markov chain Monte Carlo (MCMC) approach which samples both sequence alignments and locations of slowly evolving regions. We implement our method as an extension of the existing StatAlign software package and test it on well-annotated regions controlling the expression of the even-skipped gene in Drosophila and the α-globin gene in vertebrates. The results show how adding sequences to the analysis has the potential to improve the accuracy of functional predictions, and demonstrate how BigFoot outperforms existing alignment-based phylogenetic footprinting techniques. Conclusion: BigFoot extends a combined alignment and phylogenetic footprinting approach to analyze larger amounts of sequence data using MCMC. Our approach is robust to alignment error and uncertainty and can be applied to a variety of biological datasets. The source code and documentation are publicly available for download from PMID:19715598

  10. Visualising very large phylogenetic trees in three dimensional hyperbolic space

    PubMed Central

    Hughes, Timothy; Hyun, Young; Liberles, David A

    2004-01-01

    Background: Common existing phylogenetic tree visualisation tools are not able to display readable trees with more than a few thousand nodes. These existing methodologies are based in two dimensional space. Results: We introduce the idea of visualising phylogenetic trees in three dimensional hyperbolic space with the Walrus graph visualisation tool and have developed a conversion tool that enables the conversion of standard phylogenetic tree formats to Walrus' format. With Walrus, it becomes possible to visualise and navigate phylogenetic trees with more than 100,000 nodes. Conclusion: Walrus enables desktop visualisation of very large phylogenetic trees in three dimensional hyperbolic space. This application is potentially useful for visualisation of the tree of life and for functional genomics derivatives, like The Adaptive Evolution Database (TAED). PMID:15117420

  11. DACTAL: divide-and-conquer trees (almost) without alignments

    PubMed Central

    Nelesen, Serita; Liu, Kevin; Wang, Li-San; Linder, C. Randal; Warnow, Tandy

    2012-01-01

    Motivation: While phylogenetic analyses of datasets containing 1000–5000 sequences are challenging for existing methods, the estimation of substantially larger phylogenies poses a problem of much greater complexity and scale. Methods: We present DACTAL, a method for phylogeny estimation that produces trees from unaligned sequence datasets without ever needing to estimate an alignment on the entire dataset. DACTAL combines iteration with a novel divide-and-conquer approach, so that each iteration begins with a tree produced in the prior iteration, decomposes the taxon set into overlapping subsets, estimates trees on each subset, and then combines the smaller trees into a tree on the full taxon set using a new supertree method. We prove that DACTAL is guaranteed to produce the true tree under certain conditions. We compare DACTAL to SATé and maximum likelihood trees on estimated alignments using simulated and real datasets with 1000–27 643 taxa. Results: Our studies show that on average DACTAL yields more accurate trees than the two-phase methods we studied on very large datasets that are difficult to align, and has approximately the same accuracy on the easier datasets. The comparison to SATé shows that both have the same accuracy, but that DACTAL achieves this accuracy in a fraction of the time. Furthermore, DACTAL can analyze larger datasets than SATé, including a dataset with almost 28 000 sequences. Availability: DACTAL source code and results of dataset analyses are available at www.cs.utexas.edu/users/phylo/software/dactal. Contact: tandy@cs.utexas.edu PMID:22689772

  12. Student Interpretations of Phylogenetic Trees in an Introductory Biology Course

    ERIC Educational Resources Information Center

    Dees, Jonathan; Momsen, Jennifer L.; Niemi, Jarad; Montplaisir, Lisa

    2014-01-01

    Phylogenetic trees are widely used visual representations in the biological sciences and the most important visual representations in evolutionary biology. Therefore, phylogenetic trees have also become an important component of biology education. We sought to characterize reasoning used by introductory biology students in interpreting taxa…

  13. Local search for the generalized tree alignment problem

    PubMed Central

    2013-01-01

    Background: A phylogeny postulates shared ancestry relationships among organisms in the form of a binary tree. Phylogenies attempt to answer an important question posed in biology: what are the ancestor-descendent relationships between organisms? At the core of every biological problem lies a phylogenetic component. The patterns that can be observed in nature are the product of complex interactions, constrained by the template that our ancestors provide. The problem of simultaneous tree and alignment estimation under Maximum Parsimony is known in combinatorial optimization as the Generalized Tree Alignment Problem (GTAP). The GTAP is the Steiner Tree Problem for the sequence edit distance. Like many biologically interesting problems, the GTAP is NP-Hard. Typically the Steiner Tree is presented under the Manhattan or the Hamming distances. Results: Experimentally, the accuracy of the GTAP has been subjected to evaluation. Results show that phylogenies selected using the GTAP from unaligned sequences are competitive with the best methods and algorithms available. Here, we implement and explore experimentally existing and new local search heuristics for the GTAP using simulated and real data. Conclusions: The methods presented here improve by more than three orders of magnitude in execution time the best local search heuristics existing to date when applied to real data. PMID:23441880
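
    To make the objective in this abstract concrete, the sketch below computes the unit-cost sequence edit distance by dynamic programming and sums it over the edges of a tree whose internal nodes already carry sequences. Unit costs for substitutions and indels are an assumption, as are the helper names `edit_distance` and `tree_cost`; the abstract does not give the cost scheme. The hard part of the GTAP, choosing the tree and the internal sequences, is not attempted here.

```python
def edit_distance(a: str, b: str) -> int:
    """Unit-cost edit (Levenshtein) distance via dynamic programming,
    keeping only the previous row of the table."""
    prev = list(range(len(b) + 1))  # distance from the empty prefix of a
    for i, ca in enumerate(a, 1):
        curr = [i]  # deleting the first i characters of a
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,               # deletion
                curr[j - 1] + 1,           # insertion
                prev[j - 1] + (ca != cb),  # match or substitution
            ))
        prev = curr
    return prev[-1]

def tree_cost(edges, sequences):
    """Total edit distance along the edges of a tree.
    `edges` is a list of (node, node) pairs; `sequences` maps every node,
    including internal (Steiner) nodes, to its sequence."""
    return sum(edit_distance(sequences[u], sequences[v]) for u, v in edges)
```

    A search heuristic would compare candidate trees and internal-node assignments by this cost and keep the cheapest one it finds.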

  14. Multiple sequence alignment: a major challenge to large-scale phylogenetics

    PubMed Central

    Liu, Kevin; Linder, C. Randal; Warnow, Tandy

    2011-01-01

    Over the last decade, dramatic advances have been made in developing methods for large-scale phylogeny estimation, so that it is now feasible for investigators with moderate computational resources to obtain reasonable solutions to maximum likelihood and maximum parsimony, even for datasets with a few thousand sequences. There has also been progress on developing methods for multiple sequence alignment, so that greater alignment accuracy (and subsequent improvement in phylogenetic accuracy) is now possible through automated methods. However, these methods have not been tested under conditions that reflect properties of datasets confronted by large-scale phylogenetic estimation projects. In this paper we report on a study that compares several alignment methods on a benchmark collection of nucleotide sequence datasets of up to 78,132 sequences. We show that as the number of sequences increases, the number of alignment methods that can analyze the datasets decreases. Furthermore, the most accurate alignment methods are unable to analyze the very largest datasets we studied, so that only moderately accurate alignment methods can be used on the largest datasets. As a result, alignments computed for large datasets have relatively large error rates, and maximum likelihood phylogenies computed on these alignments also have high error rates. Therefore, the estimation of highly accurate multiple sequence alignments is a major challenge for Tree of Life projects, and more generally for large-scale systematics studies. PMID:21113338

  15. Student Interpretations of Phylogenetic Trees in an Introductory Biology Course

    PubMed Central

    Dees, Jonathan; Niemi, Jarad; Montplaisir, Lisa

    2014-01-01

    Phylogenetic trees are widely used visual representations in the biological sciences and the most important visual representations in evolutionary biology. Therefore, phylogenetic trees have also become an important component of biology education. We sought to characterize reasoning used by introductory biology students in interpreting taxa relatedness on phylogenetic trees, to measure the prevalence of correct taxa-relatedness interpretations, and to determine how student reasoning and correctness change in response to instruction and over time. Counting synapomorphies and nodes between taxa were the most common forms of incorrect reasoning, which presents a pedagogical dilemma concerning labeled synapomorphies on phylogenetic trees. Students also independently generated an alternative form of correct reasoning using monophyletic groups, the use of which decreased in popularity over time. Approximately half of all students were able to correctly interpret taxa relatedness on phylogenetic trees, and many memorized correct reasoning without understanding its application. Broad initial instruction that allowed students to generate inferences on their own contributed very little to phylogenetic tree understanding, while targeted instruction on evolutionary relationships improved understanding to some extent. Phylogenetic trees, which can directly affect student understanding of evolution, appear to offer introductory biology instructors a formidable pedagogical challenge. PMID:25452489

  16. Evidence of Statistical Inconsistency of Phylogenetic Methods in the Presence of Multiple Sequence Alignment Uncertainty

    PubMed Central

    Md Mukarram Hossain, A.S.; Blackburne, Benjamin P.; Shah, Abhijeet; Whelan, Simon

    2015-01-01

    Evolutionary studies usually use a two-step process to investigate sequence data. Step one estimates a multiple sequence alignment (MSA) and step two applies phylogenetic methods to ask evolutionary questions of that MSA. Modern phylogenetic methods infer evolutionary parameters using maximum likelihood or Bayesian inference, mediated by a probabilistic substitution model that describes sequence change over a tree. The statistical properties of these methods mean that more data directly translates to an increased confidence in downstream results, providing the substitution model is adequate and the MSA is correct. Many studies have investigated the robustness of phylogenetic methods in the presence of substitution model misspecification, but few have examined the statistical properties of those methods when the MSA is unknown. This simulation study examines the statistical properties of the complete two-step process when inferring sequence divergence and the phylogenetic tree topology. Both nucleotide and amino acid analyses are negatively affected by the alignment step, both through inaccurate guide tree estimates and through overfitting to that guide tree. For many alignment tools these effects become more pronounced when additional sequences are added to the analysis. Nucleotide sequences are particularly susceptible, with MSA errors leading to statistical support for long-branch attraction artifacts, which are usually associated with gross substitution model misspecification. Amino acid MSAs are more robust, but do tend to arbitrarily resolve multifurcations in favor of the guide tree. No inference strategies produce consistently accurate estimates of divergence between sequences, although amino acid MSAs are again more accurate than their nucleotide counterparts. We conclude with some practical suggestions about how to limit the effect of MSA uncertainty on evolutionary inference. PMID:26139831

  17. TCS: a web server for multiple sequence alignment evaluation and phylogenetic reconstruction

    PubMed Central

    Chang, Jia-Ming; Di Tommaso, Paolo; Lefort, Vincent; Gascuel, Olivier; Notredame, Cedric

    2015-01-01

    This article introduces the Transitive Consistency Score (TCS) web server, a service making it possible to estimate the local reliability of protein multiple sequence alignments (MSAs) using the TCS index. The evaluation can be used to identify the aligned positions most likely to contain structurally analogous residues and also most likely to support an accurate phylogenetic reconstruction. The TCS scoring scheme has been shown to be an accurate predictor of structural alignment correctness among commonly used methods. It has also been shown to outperform common filtering schemes like Gblocks or trimAl when doing MSA post-processing prior to phylogenetic tree reconstruction. The web server is available from http://tcoffee.crg.cat/tcs. PMID:25855806

  18. W-IQ-TREE: a fast online phylogenetic tool for maximum likelihood analysis.

    PubMed

    Trifinopoulos, Jana; Nguyen, Lam-Tung; von Haeseler, Arndt; Minh, Bui Quang

    2016-07-01

    This article presents W-IQ-TREE, an intuitive and user-friendly web interface and server for IQ-TREE, an efficient phylogenetic software for maximum likelihood analysis. W-IQ-TREE supports multiple sequence types (DNA, protein, codon, binary and morphology) in common alignment formats and a wide range of evolutionary models including mixture and partition models. W-IQ-TREE performs fast model selection, partition scheme finding, efficient tree reconstruction, ultrafast bootstrapping, branch tests, and tree topology tests. All computations are conducted on a dedicated computer cluster and the users receive the results via URL or email. W-IQ-TREE is available at http://iqtree.cibiv.univie.ac.at. It is free and open to all users and there is no login requirement. PMID:27084950

  19. Phylogenetic tree construction based on 2D graphical representation

    NASA Astrophysics Data System (ADS)

    Liao, Bo; Shan, Xinzhou; Zhu, Wen; Li, Renfa

    2006-04-01

    A new approach based on the two-dimensional (2D) graphical representation of the whole genome sequence [Bo Liao, Chem. Phys. Lett., 401(2005) 196.] is proposed to analyze the phylogenetic relationships of genomes. The evolutionary distances are obtained by measuring the differences among the 2D curves. Fuzzy theory is used to construct the phylogenetic tree. The phylogenetic relationships of H5N1 avian influenza virus illustrate the utility of our approach.

  20. Discriminating the effects of phylogenetic hypothesis, tree resolution and clade age estimates on phylogenetic signal measurements.

    PubMed

    Seger, G D S; Duarte, L D S; Debastiani, V J; Kindel, A; Jarenkow, J A

    2013-09-01

    Understanding how species traits evolved over time is the central question to comprehend assembly rules that govern the phylogenetic structure of communities. The measurement of phylogenetic signal (PS) in ecologically relevant traits is a first step to understand phylogenetically structured community patterns. The different methods available to estimate PS make it difficult to choose which is most appropriate. Furthermore, alternative phylogenetic tree hypotheses, node resolution and clade age estimates might influence PS measurements. In this study, we evaluated to what extent these parameters affect different methods of PS analysis, and discuss advantages and disadvantages when selecting which method to use. We measured fruit/seed traits and flowering/fruiting phenology of endozoochoric species occurring in Southern Brazilian Araucaria forests and evaluated their PS using Mantel regressions, phylogenetic eigenvector regressions (PVR) and K statistic. Mantel regressions always gave less significant results compared to PVR and K statistic in all combinations of phylogenetic trees constructed. Moreover, a better phylogenetic resolution affected PS, independently of the method used to estimate it. Morphological seed traits tended to show higher PS than diaspores traits, while PS in flowering/fruiting phenology depended mostly on the method used to estimate it. This study demonstrates that different PS estimates are obtained depending on the chosen method and the phylogenetic tree resolution. This finding has implications for inferences on phylogenetic niche conservatism or ecological processes determining phylogenetic community structure. PMID:23368095

  1. Colloquium paper: terrestrial apes and phylogenetic trees.

    PubMed

    Arsuaga, Juan Luis

    2010-05-11

    The image that best expresses Darwin's thinking is the tree of life. However, Darwin's human evolutionary tree lacked almost everything because only the Neanderthals were known at the time and they were considered one extreme expression of our own species. Darwin believed that the root of the human tree was very deep and in Africa. It was not until 1962 that the root was shown to be much more recent in time and definitively in Africa. On the other hand, some neo-Darwinians believed that our family tree was not a tree, because there were no branches, but, rather, a straight stem. The recent years have witnessed spectacular discoveries in Africa that take us close to the origin of the human tree and in Spain at Atapuerca that help us better understand the origin of the Neanderthals as well as our own species. The final form of the tree, and the number of branches, remains an object of passionate debate. PMID:20445090

  2. Improving multiple sequence alignment by using better guide trees

    PubMed Central

    2015-01-01

    Progressive sequence alignment is one of the most commonly used methods for multiple sequence alignment. Roughly speaking, the method first builds a guide tree, and then aligns the sequences progressively according to the topology of the tree. It is believed that guide trees are very important to progressive alignment; a better guide tree will give an alignment with higher accuracy. Recently, we have proposed an adaptive method for constructing guide trees. This paper studies the quality of the guide trees constructed by this method. Our study showed that our adaptive method can be used to improve the accuracy of many different progressive MSA tools. In fact, we give evidence showing that the guide trees constructed by the adaptive method are among the best. PMID:25859903

  3. A Genome-Scale Investigation of How Sequence, Function, and Tree-Based Gene Properties Influence Phylogenetic Inference.

    PubMed

    Shen, Xing-Xing; Salichos, Leonidas; Rokas, Antonis

    2016-01-01

    Molecular phylogenetic inference is inherently dependent on choices in both methodology and data. Many insightful studies have shown how choices in methodology, such as the model of sequence evolution or optimality criterion used, can strongly influence inference. In contrast, much less is known about the impact of choices in the properties of the data, typically genes, on phylogenetic inference. We investigated the relationships between 52 gene properties (24 sequence-based, 19 function-based, and 9 tree-based) with each other and with three measures of phylogenetic signal in two assembled data sets of 2,832 yeast and 2,002 mammalian genes. We found that most gene properties, such as evolutionary rate (measured through the percent average of pairwise identity across taxa) and total tree length, were highly correlated with each other. Similarly, several gene properties, such as gene alignment length, Guanine-Cytosine content, and the proportion of tree distance on internal branches divided by relative composition variability (treeness/RCV), were strongly correlated with phylogenetic signal. Analysis of partial correlations between gene properties and phylogenetic signal in which gene evolutionary rate and alignment length were simultaneously controlled, showed similar patterns of correlations, albeit weaker in strength. Examination of the relative importance of each gene property on phylogenetic signal identified gene alignment length, alongside with number of parsimony-informative sites and variable sites, as the most important predictors. Interestingly, the subsets of gene properties that optimally predicted phylogenetic signal differed considerably across our three phylogenetic measures and two data sets; however, gene alignment length and RCV were consistently included as predictors of all three phylogenetic measures in both yeasts and mammals. 
    These results suggest that a handful of sequence-based gene properties are reliable predictors of phylogenetic signal

  4. A Genome-Scale Investigation of How Sequence, Function, and Tree-Based Gene Properties Influence Phylogenetic Inference

    PubMed Central

    Shen, Xing-Xing; Salichos, Leonidas; Rokas, Antonis

    2016-01-01

    Molecular phylogenetic inference is inherently dependent on choices in both methodology and data. Many insightful studies have shown how choices in methodology, such as the model of sequence evolution or optimality criterion used, can strongly influence inference. In contrast, much less is known about the impact of choices in the properties of the data, typically genes, on phylogenetic inference. We investigated the relationships between 52 gene properties (24 sequence-based, 19 function-based, and 9 tree-based) with each other and with three measures of phylogenetic signal in two assembled data sets of 2,832 yeast and 2,002 mammalian genes. We found that most gene properties, such as evolutionary rate (measured through the percent average of pairwise identity across taxa) and total tree length, were highly correlated with each other. Similarly, several gene properties, such as gene alignment length, Guanine-Cytosine content, and the proportion of tree distance on internal branches divided by relative composition variability (treeness/RCV), were strongly correlated with phylogenetic signal. Analysis of partial correlations between gene properties and phylogenetic signal in which gene evolutionary rate and alignment length were simultaneously controlled, showed similar patterns of correlations, albeit weaker in strength. Examination of the relative importance of each gene property on phylogenetic signal identified gene alignment length, alongside with number of parsimony-informative sites and variable sites, as the most important predictors. Interestingly, the subsets of gene properties that optimally predicted phylogenetic signal differed considerably across our three phylogenetic measures and two data sets; however, gene alignment length and RCV were consistently included as predictors of all three phylogenetic measures in both yeasts and mammals. 
    These results suggest that a handful of sequence-based gene properties are reliable predictors of phylogenetic signal

  5. Phylogenetic classification and the universal tree.

    PubMed

    Doolittle, W F

    1999-06-25

    From comparative analyses of the nucleotide sequences of genes encoding ribosomal RNAs and several proteins, molecular phylogeneticists have constructed a "universal tree of life," taking it as the basis for a "natural" hierarchical classification of all living things. Although confidence in some of the tree's early branches has recently been shaken, new approaches could still resolve many methodological uncertainties. More challenging is evidence that most archaeal and bacterial genomes (and the inferred ancestral eukaryotic nuclear genome) contain genes from multiple sources. If "chimerism" or "lateral gene transfer" cannot be dismissed as trivial in extent or limited to special categories of genes, then no hierarchical universal classification can be taken as natural. Molecular phylogeneticists will have failed to find the "true tree," not because their methods are inadequate or because they have chosen the wrong genes, but because the history of life cannot properly be represented as a tree. However, taxonomies based on molecular sequences will remain indispensable, and understanding of the evolutionary process will ultimately be enriched, not impoverished. PMID:10381871

  6. Inferring Epidemic Contact Structure from Phylogenetic Trees

    PubMed Central

    Leventhal, Gabriel E.; Kouyos, Roger; Stadler, Tanja; von Wyl, Viktor; Yerly, Sabine; Böni, Jürg; Cellerai, Cristina; Klimkait, Thomas; Günthard, Huldrych F.; Bonhoeffer, Sebastian

    2012-01-01

    Contact structure is believed to have a large impact on epidemic spreading and consequently using networks to model such contact structure continues to gain interest in epidemiology. However, detailed knowledge of the exact contact structure underlying real epidemics is limited. Here we address the question whether the structure of the contact network leaves a detectable genetic fingerprint in the pathogen population. To this end we compare phylogenies generated by disease outbreaks in simulated populations with different types of contact networks. We find that the shape of these phylogenies strongly depends on contact structure. In particular, measures of tree imbalance allow us to quantify to what extent the contact structure underlying an epidemic deviates from a null model contact network and illustrate this in the case of random mixing. Using a phylogeny from the Swiss HIV epidemic, we show that this epidemic has a significantly more unbalanced tree than would be expected from random mixing. PMID:22412361
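
    As an illustration of what a tree imbalance measure looks like, the sketch below computes the Colless index of a rooted binary tree: the sum, over all internal nodes, of the absolute difference in leaf counts between the two child subtrees. The abstract does not say which imbalance measures were used, so the choice of the Colless index, the nested-tuple tree encoding, and the function name are assumptions for illustration only.

```python
def colless(tree):
    """Colless imbalance index of a rooted binary tree.
    A leaf is any non-tuple value; an internal node is a (left, right) tuple."""
    def walk(node):
        # Returns (number of leaves below node, imbalance accumulated below node).
        if not isinstance(node, tuple):
            return 1, 0
        left, right = node
        n_left, c_left = walk(left)
        n_right, c_right = walk(right)
        return n_left + n_right, c_left + c_right + abs(n_left - n_right)
    return walk(tree)[1]

# A perfectly balanced four-leaf tree and a fully unbalanced (caterpillar) one.
balanced = (("A", "B"), ("C", "D"))
caterpillar = ((("A", "B"), "C"), "D")
print(colless(balanced), colless(caterpillar))
```

    Comparing such an index for an observed phylogeny against its distribution under a null model, for example trees simulated under random mixing, is one way to quantify how far the observed tree departs from that model.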

  7. Which Phylogenetic Networks are Merely Trees with Additional Arcs?

    PubMed

    Francis, Andrew R; Steel, Mike

    2015-09-01

    A binary phylogenetic network may or may not be obtainable from a tree by the addition of directed edges (arcs) between tree arcs. Here, we establish a precise and easily tested criterion (based on "2-SAT") that efficiently determines whether or not any given network can be realized in this way. Moreover, the proof provides a polynomial-time algorithm for finding one or more trees (when they exist) on which the network can be based. A number of interesting consequences are presented as corollaries; these lead to some further relevant questions and observations, which we outline in the conclusion. PMID:26070685

  8. Reconstruction of phylogenetic trees using the ant colony optimization paradigm.

    PubMed

    Perretto, Mauricio; Lopes, Heitor Silvério

    2005-01-01

    We developed a new approach for the reconstruction of phylogenetic trees using ant colony optimization metaheuristics. A tree is constructed using a fully connected graph and the problem is approached similarly to the well-known traveling salesman problem. This methodology was used to develop an algorithm for constructing a phylogenetic tree using a pheromone matrix. Two data sets were tested with the algorithm: complete mitochondrial genomes from mammals and DNA sequences of the p53 gene from several eutherians. This new methodology was found to be superior to other well-known software packages, at least for this data set. These results are very promising and warrant further development. PMID:16342043

  9. Tree phylogenetic diversity promotes host-parasitoid interactions.

    PubMed

    Staab, Michael; Bruelheide, Helge; Durka, Walter; Michalski, Stefan; Purschke, Oliver; Zhu, Chao-Dong; Klein, Alexandra-Maria

    2016-07-13

    Evidence from grassland experiments suggests that a plant community's phylogenetic diversity (PD) is a strong predictor of ecosystem processes, even stronger than species richness per se. This has, however, never been extended to species-rich forests and host-parasitoid interactions. We used cavity-nesting Hymenoptera and their parasitoids collected in a subtropical forest as a model system to test whether hosts, parasitoids, and their interactions are influenced by tree PD and a comprehensive set of environmental variables, including tree species richness. Parasitism rate and parasitoid abundance were positively correlated with tree PD. All variables describing parasitoids decreased with elevation, and were, except parasitism rate, dependent on host abundance. Quantitative descriptors of host-parasitoid networks were independent of the environment. Our study indicates that host-parasitoid interactions in species-rich forests are related to the PD of the tree community, which influences parasitism rates through parasitoid abundance. We show that effects of tree community PD are much stronger than effects of tree species richness, can cascade to high trophic levels, and promote trophic interactions. As during habitat modification phylogenetic information is usually lost non-randomly, even species-rich habitats may not be able to continuously provide the ecosystem process parasitism if the evolutionarily most distinct plant lineages vanish. PMID:27383815

  10. Probability Steiner trees and maximum parsimony in phylogenetic analysis.

    PubMed

    Weng, J F; Mareels, I; Thomas, D A

    2012-06-01

    The phylogenetic tree (PT) problem has been studied by a number of researchers as an application of the Steiner tree problem, a well-known network optimisation problem. Of all the methods developed for phylogenies the maximum parsimony (MP) method is a simple and commonly used method because it relies on directly observable changes in the input nucleotide or amino acid sequences. In this paper we show that the non-uniqueness of the evolutionary pathways in the MP method leads us to consider a new model of PTs. In this so-called probability representation model, for each site a node in a PT is modelled by a probability distribution of nucleotide or amino acid states, and hence the PT at a given site is a probability Steiner tree, i.e. a Steiner tree in a high-dimensional vector space. In spite of the generality of the probability representation model, in this paper we restrict our study to constructing probability phylogenetic trees (PPT) using the parsimony criterion, as well as discussing and comparing our approach with the classical MP method. We show that for a given input set although the optimal topology as well as the total tree length of the PPT is the same as the PT constructed by the classical MP method, the inferred ancestral states and branch lengths are different and the results given by our method provide a plausible alternative to the classical ones. PMID:21706222
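
    The non-uniqueness that the abstract attributes to the classical MP method can be seen in a small sketch of Fitch's algorithm for a single site. This is the classical small-parsimony procedure, not the authors' probability Steiner tree model. The nested-tuple tree encoding and the toy data are assumptions made for this example.

```python
def fitch(tree, states):
    """Return (candidate ancestral states at this node, minimum changes below it).

    tree   -- nested 2-tuples of leaf names (hypothetical encoding)
    states -- dict mapping each leaf name to its observed character at one site
    """
    if not isinstance(tree, tuple):
        return {states[tree]}, 0
    left_set, left_cost = fitch(tree[0], states)
    right_set, right_cost = fitch(tree[1], states)
    common = left_set & right_set
    if common:
        # The children agree on at least one state: no extra change is needed here.
        return common, left_cost + right_cost
    # The children disagree: keep every candidate state and count one change.
    return left_set | right_set, left_cost + right_cost + 1


tree = (("human", "chimp"), ("mouse", "rat"))
site = {"human": "A", "chimp": "A", "mouse": "G", "rat": "G"}

root_set, changes = fitch(tree, site)
print(sorted(root_set), changes)  # ['A', 'G'] 1
```

    The root set contains two states with the same parsimony score, so the ancestral state is not uniquely determined. An ambiguity of this kind is what a per-node probability distribution over states, as described in the abstract, could represent.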

  11. Testing robustness of relative complexity measure method constructing robust phylogenetic trees for Galanthus L. Using the relative complexity measure

    PubMed Central

    2013-01-01

    Background: Most phylogeny analysis methods based on molecular sequences use multiple alignment where the quality of the alignment, which is dependent on the alignment parameters, determines the accuracy of the resulting trees. Different parameter combinations chosen for the multiple alignment may result in different phylogenies. A new non-alignment based approach, Relative Complexity Measure (RCM), has been introduced to tackle this problem and proven to work in fungi and mitochondrial DNA. Results: In this work, we present an application of the RCM method to reconstruct robust phylogenetic trees using sequence data for genus Galanthus obtained from different regions in Turkey. Phylogenies have been analyzed using nuclear and chloroplast DNA sequences. Results showed that the tree obtained from nuclear ribosomal RNA gene sequences was more robust, while the tree obtained from the chloroplast DNA showed a higher degree of variation. Conclusions: Phylogenies generated by Relative Complexity Measure were found to be robust and results of RCM were more reliable than the compared techniques. In particular, RCM seems to be a reasonable way to overcome MSA-based problems and a good alternative to MSA-based phylogenetic analysis. We believe our method will become a mainstream phylogeny construction method especially for the highly variable sequence families where the accuracy of the MSA heavily depends on the alignment parameters. PMID:23323678

  12. Walking tree heuristics for biological string alignment, gene location, and phylogenies

    NASA Astrophysics Data System (ADS)

    Cull, P.; Holloway, J. L.; Cavener, J. D.

    1999-03-01

    Basic biological information is stored in strings of nucleic acids (DNA, RNA) or amino acids (proteins). Teasing out the meaning of these strings is a central problem of modern biology. Matching and aligning strings brings out their shared characteristics. Although string matching is well-understood in the edit-distance model, biological strings with transpositions and inversions violate this model's assumptions. We propose a family of heuristics called walking trees to align biologically reasonable strings. Both edit-distance and walking tree methods can locate specific genes within a large string when the genes' sequences are given. When we attempt to match whole strings, the walking tree matches most genes, while the edit-distance method fails. We also give examples in which the walking tree matches substrings even if they have been moved or inverted. The edit-distance method was not designed to handle these problems. We include an example in which the walking tree "discovered" a gene. Calculating scores for whole genome matches gives a method for approximating evolutionary distance. We show two evolutionary trees for the picornaviruses which were computed by the walking tree heuristic. Both of these trees show great similarity to previously constructed trees. The point of this demonstration is that WHOLE genomes can be matched and distances calculated. The first tree was created on a Sequent parallel computer and demonstrates that the walking tree heuristic can be efficiently parallelized. The second tree was created using a network of workstations and demonstrates that there is sufficient parallelism in the phylogenetic tree calculation that the sequential walking tree can be used effectively on a network.

  13. Performance comparison between k-tuple distance and four model-based distances in phylogenetic tree reconstruction.

    PubMed

    Yang, Kuan; Zhang, Liqing

    2008-03-01

    Phylogenetic tree reconstruction requires construction of a multiple sequence alignment (MSA) from sequences. Computationally, it is difficult to achieve an optimal MSA for many sequences. Moreover, even if an optimal MSA is obtained, it may not be the true MSA that reflects the evolutionary history of the underlying sequences. Therefore, errors can be introduced during MSA construction, which in turn affect the subsequent phylogenetic tree construction. In order to circumvent this issue, we extend the application of the k-tuple distance to phylogenetic tree reconstruction. The k-tuple distance between two sequences is the sum of the differences in frequency, over all possible tuples of length k, between the sequences and can be estimated without MSAs. It has been traditionally used to build a fast 'guide tree' to assist the construction of MSAs. Using 1470 simulated sets of sequences generated under different evolutionary scenarios, together with neighbor-joining and BioNJ trees, we compared the performance of the k-tuple distance with four commonly used distance estimators including Jukes-Cantor, Kimura, F84 and Tamura-Nei. These four distance estimators fall into the category of model-based distance estimators, as each of them takes account of a specific substitution model in order to compute the distance between a pair of already aligned sequences. Results show that trees constructed from the k-tuple distance are more accurate than those from other distances most of the time; when the divergence between underlying sequences is high, tree accuracy can be twice as high or more with the k-tuple distance than with other estimators. Furthermore, as the k-tuple distance avoids the need for constructing an MSA, it can save a tremendous amount of time for phylogenetic tree reconstructions when the data include a large number of sequences. PMID:18296485
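
    The definition of the k-tuple distance given above can be sketched in a few lines. The abstract does not say whether the differences are absolute or squared, or whether raw counts or normalized frequencies are compared. This sketch assumes absolute differences of relative frequencies, and the function names are hypothetical.

```python
from collections import Counter


def ktuple_frequencies(seq, k):
    """Relative frequency of every overlapping k-tuple in seq."""
    total = len(seq) - k + 1
    if total <= 0:
        raise ValueError("sequence shorter than k")
    counts = Counter(seq[i:i + k] for i in range(total))
    return {kmer: n / total for kmer, n in counts.items()}


def ktuple_distance(seq_a, seq_b, k=2):
    """Sum of absolute frequency differences over all k-tuples seen in either sequence.

    Assumption: absolute differences of relative frequencies; the paper may
    define the difference differently.
    """
    freq_a = ktuple_frequencies(seq_a, k)
    freq_b = ktuple_frequencies(seq_b, k)
    return sum(abs(freq_a.get(t, 0.0) - freq_b.get(t, 0.0))
               for t in set(freq_a) | set(freq_b))


print(ktuple_distance("ACGTACGT", "ACGTACGT"))  # 0.0
print(ktuple_distance("AAAA", "CCCC"))          # 2.0
```

    No alignment is needed and the two sequences may differ in length, which is why such a distance can feed a neighbor-joining step directly.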

  14. Why abundant tropical tree species are phylogenetically old.

    PubMed

    Wang, Shaopeng; Chen, Anping; Fang, Jingyun; Pacala, Stephen W

    2013-10-01

    Neutral models of species diversity predict patterns of abundance for communities in which all individuals are ecologically equivalent. These models were originally developed for Panamanian trees and successfully reproduce observed distributions of abundance. Neutral models also make macroevolutionary predictions that have rarely been evaluated or tested. Here we show that neutral models predict a humped or flat relationship between species age and population size. In contrast, ages and abundances of tree species in the Panamanian Canal watershed are found to be positively correlated, which falsifies the models. Speciation rates vary among phylogenetic lineages and are partially heritable from mother to daughter species. Variable speciation rates in an otherwise neutral model lead to a demographic advantage for species with low speciation rate. This demographic advantage results in a positive correlation between species age and abundance, as found in the Panamanian tropical forest community. PMID:24043767

  15. Network dynamics of eukaryotic LTR retroelements beyond phylogenetic trees

    PubMed Central

    Llorens, Carlos; Muñoz-Pomer, Alfonso; Bernad, Lucia; Botella, Hector; Moya, Andrés

    2009-01-01

    Background: Sequencing projects have allowed diverse retroviruses and LTR retrotransposons from different eukaryotic organisms to be characterized. It is known that retroviruses and other retro-transcribing viruses evolve from LTR retrotransposons and that this whole system clusters into five families: Ty3/Gypsy, Retroviridae, Ty1/Copia, Bel/Pao and Caulimoviridae. Phylogenetic analyses usually show that these split into multiple distinct lineages but what is yet to be understood is how deep evolution occurred in this system. Results: We combined phylogenetic and graph analyses to investigate the history of LTR retroelements both as a tree and as a network. We used 268 non-redundant LTR retroelements, many of them introduced for the first time in this work, to elucidate all possible LTR retroelement phylogenetic patterns. These were superimposed over the tree of eukaryotes to investigate the dynamics of the system, at distinct evolutionary times. Next, we investigated phenotypic features such as duplication and variability of amino acid motifs, and several differences in genomic ORF organization. Using this information we characterized eight reticulate evolution markers to construct phenotypic network models. Conclusion: The evolutionary history of LTR retroelements can be traced as a time-evolving network that depends on phylogenetic patterns, epigenetic host-factors and phenotypic plasticity. The Ty1/Copia and the Ty3/Gypsy families represent the oldest patterns in this network that we found mimics eukaryotic macroevolution. The emergence of the Bel/Pao, Retroviridae and Caulimoviridae families in this network can be related with distinct inflations of the Ty3/Gypsy family, at distinct evolutionary times. This suggests that Ty3/Gypsy ancestors diversified much more than their Ty1/Copia counterparts, at distinct geological eras. Consistent with the principle of preferential attachment, the connectivities among phenotypic markers, taken as network

  16. Phylogenetic Tree Reconstruction Accuracy and Model Fit when Proportions of Variable Sites Change across the Tree

    PubMed Central

    Grievink, Liat Shavit; Penny, David; Hendy, Michael D.; Holland, Barbara R.

    2010-01-01

    Commonly used phylogenetic models assume a homogeneous process through time in all parts of the tree. However, it is known that these models can be too simplistic as they do not account for nonhomogeneous lineage-specific properties. In particular, it is now widely recognized that as constraints on sequences evolve, the proportion and positions of variable sites can vary between lineages causing heterotachy. The extent to which this model misspecification affects tree reconstruction is still unknown. Here, we evaluate the effect of changes in the proportions and positions of variable sites on model fit and tree estimation. We consider 5 current models of nucleotide sequence evolution in a Bayesian Markov chain Monte Carlo framework as well as maximum parsimony (MP). We show that for a tree with 4 lineages where 2 nonsister taxa undergo a change in the proportion of variable sites tree reconstruction under the best-fitting model, which is chosen using a relative test, often results in the wrong tree. In this case, we found that an absolute test of model fit is a better predictor of tree estimation accuracy. We also found further evidence that MP is not immune to heterotachy. In addition, we show that increased sampling of taxa that have undergone a change in proportion and positions of variable sites is critical for accurate tree reconstruction. PMID:20525636

  17. How Ecology and Landscape Dynamics Shape Phylogenetic Trees.

    PubMed

    Gascuel, Fanny; Ferrière, Régis; Aguilée, Robin; Lambert, Amaury

    2015-07-01

    Whether biotic or abiotic factors are the dominant drivers of clade diversification is a long-standing question in evolutionary biology. The ubiquitous patterns of phylogenetic imbalance and branching slowdown have been taken as supporting the role of ecological niche filling and spatial heterogeneity in ecological features, and thus of biotic processes, in diversification. However, a proper theoretical assessment of the relative roles of biotic and abiotic factors in macroevolution requires models that integrate both types of factors, and such models have been lacking. In this study, we use an individual-based model to investigate the temporal patterns of diversification driven by ecological speciation in a stochastically fluctuating geographic landscape. The model generates phylogenies whose shape evolves as the clade ages. Stabilization of tree shape often occurs after ecological saturation, revealing species turnover caused by competition and demographic stochasticity. In the initial phase of diversification (allopatric radiation into an empty landscape), trees tend to be unbalanced and branching slows down. As diversification proceeds further due to landscape dynamics, balance and branching tempo may increase and become positive. Three main conclusions follow. First, the phylogenies of ecologically saturated clades do not always exhibit branching slowdown. Branching slowdown requires that competition be wide or heterogeneous across the landscape, or that the characteristics of landscape dynamics vary geographically. Conversely, branching acceleration is predicted under narrow competition or frequent local catastrophes. Second, ecological heterogeneity does not necessarily cause phylogenies to be unbalanced--short time in geographical isolation or frequent local catastrophes may lead to balanced trees despite spatial heterogeneity. Conversely, unbalanced trees can emerge without spatial heterogeneity, notably if competition is wide. 
Third, short isolation time

  18. Characterization of a branch of the phylogenetic tree

    SciTech Connect

    Samuel, Stuart A.; Weng, Gezhi

    2003-04-11

    We use a combination of analytic models and computer simulations to gain insight into the dynamics of evolution. Our results suggest that certain interesting phenomena should eventually emerge from the fossil record. For example, there should be a "tortoise and hare effect": Those genera with the smallest species death rate are likely to survive much longer than genera with large species birth and death rates. A complete characterization of the behavior of a branch of the phylogenetic tree corresponding to a genus and accurate mathematical representations of the various stages are obtained. We apply our results to address certain controversial issues that have arisen in paleontology such as the importance of punctuated equilibrium and whether unique Cambrian phyla have survived to the present.

  19. Characterization of a branch of the phylogenetic tree.

    PubMed

    Samuel, Stuart A; Weng, Gezhi

    2003-02-21

    We use a combination of analytic models and computer simulations to gain insight into the dynamics of evolution. Our results suggest that certain interesting phenomena should eventually emerge from the fossil record. For example, there should be a "tortoise and hare effect": those genera with the smallest species death rate are likely to survive much longer than genera with large species birth and death rates. A complete characterization of the behavior of a branch of the phylogenetic tree corresponding to a genus and accurate mathematical representations of the various stages are obtained. We apply our results to address certain controversial issues that have arisen in paleontology such as the importance of punctuated equilibrium and whether unique Cambrian phyla have survived to the present. PMID:12623281

  20. Large-Scale Multiple Sequence Alignment and Tree Estimation Using SATé

    PubMed Central

    Liu, Kevin; Warnow, Tandy

    2016-01-01

    SATé is a method for estimating multiple sequence alignments and trees that has been shown to produce highly accurate results for datasets with large numbers of sequences. Running SATé using its default settings is very simple, but improved accuracy can be obtained by modifying its algorithmic parameters. We provide a detailed introduction to the algorithmic approach used by SATé, and instructions for running a SATé analysis using the GUI under default settings. We also provide a discussion of how to modify these settings to obtain improved results, and how to use SATé in a phylogenetic analysis pipeline. PMID:24170406

  1. Phylogenetics.

    PubMed

    Sleator, Roy D

    2011-04-01

    The recent rapid expansion in the DNA and protein databases, arising from large-scale genomic and metagenomic sequence projects, has forced significant development in the field of phylogenetics: the study of the evolutionary relatedness of the planet's inhabitants. Advances in phylogenetic analysis have greatly transformed our view of the landscape of evolutionary biology, transcending the view of the tree of life that has shaped evolutionary theory since Darwinian times. Indeed, modern phylogenetic analysis no longer focuses on the restricted Darwinian-Mendelian model of vertical gene transfer, but must also consider the significant degree of lateral gene transfer, which connects and shapes almost all living things. Herein, I review the major tree-building methods, their strengths, weaknesses and future prospects. PMID:21249334

  2. AMY-tree: an algorithm to use whole genome SNP calling for Y chromosomal phylogenetic applications

    PubMed Central

    2013-01-01

    Background: Due to the rapid progress of next-generation sequencing (NGS) facilities, an explosion of human whole genome data will become available in the coming years. These data can be used to optimize and to increase the resolution of the phylogenetic Y chromosomal tree. Moreover, the exponential growth of known Y chromosomal lineages will require an automatic determination of the phylogenetic position of an individual based on whole genome SNP calling data and an up-to-date Y chromosomal tree. Results: We present an automated approach, ‘AMY-tree’, which is able to determine the phylogenetic position of a Y chromosome using a whole genome SNP profile, independently from the NGS platform and SNP calling program, whereby mistakes in the SNP calling or phylogenetic Y chromosomal tree are taken into account. Moreover, AMY-tree indicates ambiguities within the present phylogenetic tree and points out new Y-SNPs which may be phylogenetically relevant. The AMY-tree software package was validated successfully on 118 whole genome SNP profiles of 109 males with different origins. Moreover, support was found for an unknown recurrent mutation, wrongly reported mutation conversions and a large number of new interesting Y-SNPs. Conclusions: Therefore, AMY-tree is a useful tool to determine the Y lineage of a sample based on SNP calling, to identify Y-SNPs with yet unknown phylogenetic position and to optimize the Y chromosomal phylogenetic tree in the future. AMY-tree will not add lineages to the existing phylogenetic tree of the Y-chromosome but it is the first step to analyse whole genome SNP profiles in a phylogenetic framework. PMID:23405914

  3. Impact of gene family evolutionary histories on phylogenetic species tree inference by gene tree parsimony.

    PubMed

    Shi, Tao

    2016-03-01

    Complicated histories of gene duplication and loss pose a challenge to molecular phylogenetic inference, especially in deep phylogenies. However, phylogenomic approaches, such as gene tree parsimony (GTP), have an advantage over some other approaches in their ability to use gene families with duplications. GTP searches for the 'optimal' species tree by minimizing the total cost of biological events such as duplications, but the accuracy of GTP and the phylogenetic signal in the context of different gene families with distinct histories of duplication and loss are unclear. To evaluate how different evolutionary properties of different gene families can affect species tree inference, 3900 gene families from seven angiosperms encompassing a wide range of gene content, lineage-specific expansions and contractions were analyzed. It was found that the gene content and total duplication number in a gene family strongly influence species tree inference accuracy, with the highest accuracy achieved at either very low or very high gene content (or duplication number) and lowest accuracy centered in intermediate gene content (or duplication number), as the relationship can fit a binomial regression. Besides, for gene families of similar level of average gene content, those with relatively higher lineage-specific expansion or duplication rates tend to show lower accuracy. Additional correlation tests support that high accuracy for those gene families with large gene content may rely on abundant ancestral copies to provide many subtrees to resolve conflicts, whereas high accuracy for single or low copy gene families are just subject to sequence substitution per se. Very low accuracy reached by gene families of intermediate gene content or duplication number can be due to insufficient subtrees to resolve the conflicts from loss of alternative copies. 
As these evolutionary properties can significantly influence species tree accuracy, I discussed the potential weighting of the duplication cost by

  4. Self-Organized Criticality in Phylogenetic-Like Tree Growths

    NASA Astrophysics Data System (ADS)

    Vandewalle, N.; Ausloos, M.

    1995-08-01

    A simple stochastic model of Darwinistic evolution generating phylogenetic-like trees is developed. The model is based on a branching process taking competition-correlation effects into account. In the presence of finite and short range correlations, the process self-organizes into a critical steady-state in which intermittent bursts of activity of all sizes are generated. On a geological-like time scale, this behaviour agrees with punctuated equilibrium features of biological evolution. The simulated phylogenetic-like trees are found to be self-similar. The dynamics of the transient regimes show a power law decrease of the order parameter towards the 0^+ value which characterizes an unstable critical state. The genetic range k of competition-correlations between living species is found to be a relevant parameter which determines the universality class of the evolution process. An infinite competition-correlation range, however, destroys the self-organized critical behaviour. The fractal dimension D_f of the phylogenetic-like trees increases from 2.0 to infinity as k goes from 1 to infinity. The critical exponent tau of avalanche size-distribution decreases from about 3/2 (for k=1) and reaches about 1.2 for k=10. A hyperscaling relation seems to relate the various universality classes.

  5. trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses

    PubMed Central

    Capella-Gutiérrez, Salvador; Silla-Martínez, José M.; Gabaldón, Toni

    2009-01-01

    Summary: Multiple sequence alignments are central to many areas of bioinformatics. It has been shown that the removal of poorly aligned regions from an alignment increases the quality of subsequent analyses. Such an alignment trimming phase is complicated in large-scale phylogenetic analyses that deal with thousands of alignments. Here, we present trimAl, a tool for automated alignment trimming, which is especially suited for large-scale phylogenetic analyses. trimAl can consider several parameters, alone or in multiple combinations, for selecting the most reliable positions in the alignment. These include the proportion of sequences with a gap, the level of amino acid similarity and, if several alignments for the same set of sequences are provided, the level of consistency across different alignments. Moreover, trimAl can automatically select the parameters to be used in each specific alignment so that the signal-to-noise ratio is optimized. Availability: trimAl has been written in C++, it is portable to all platforms. trimAl is freely available for download (http://trimal.cgenomics.org) and can be used online through the Phylemon web server (http://phylemon2.bioinfo.cipf.es/). Supplementary Material is available at http://trimal.cgenomics.org/publications. Contact: tgabaldon@crg.es PMID:19505945
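
    One of the criteria named above, the proportion of sequences with a gap in a column, can be illustrated with a minimal sketch. This is not trimAl's implementation and does not use its command-line options. The function name, the 0.5 threshold, and the dict-based alignment format are assumptions made for this example.

```python
def trim_gappy_columns(alignment, max_gap_fraction=0.5, gap="-"):
    """Drop alignment columns whose gap fraction exceeds max_gap_fraction.

    alignment -- dict of sequence name -> aligned sequence (all the same length)
    """
    seqs = list(alignment.values())
    if not seqs:
        return {}
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("aligned sequences must have equal length")
    # Indices of columns that are sufficiently gap-free to keep.
    keep = [
        col for col in range(length)
        if sum(s[col] == gap for s in seqs) / len(seqs) <= max_gap_fraction
    ]
    return {name: "".join(seq[c] for c in keep) for name, seq in alignment.items()}


aln = {"s1": "AC-GT", "s2": "A--GT", "s3": "ACTG-"}
trimmed = trim_gappy_columns(aln)
print(trimmed)  # {'s1': 'ACGT', 's2': 'A-GT', 's3': 'ACG-'}
```

    Only the third column (two gaps out of three sequences) is removed. Similarity-based or consistency-based criteria, also mentioned in the abstract, would need additional scoring that this sketch leaves out.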

  6. Community Phylogenetics: Assessing Tree Reconstruction Methods and the Utility of DNA Barcodes

    PubMed Central

    Boyle, Elizabeth E.; Adamowicz, Sarah J.

    2015-01-01

    Studies examining phylogenetic community structure have become increasingly prevalent, yet little attention has been given to the influence of the input phylogeny on metrics that describe phylogenetic patterns of co-occurrence. Here, we examine the influence of branch length, tree reconstruction method, and amount of sequence data on measures of phylogenetic community structure, as well as the phylogenetic signal (Pagel’s λ) in morphological traits, using Trichoptera larval communities from Churchill, Manitoba, Canada. We find that model-based tree reconstruction methods and the use of a backbone family-level phylogeny improve estimations of phylogenetic community structure. In addition, trees built using the barcode region of cytochrome c oxidase subunit I (COI) alone accurately predict metrics of phylogenetic community structure obtained from a multi-gene phylogeny. Input tree did not alter overall conclusions drawn for phylogenetic signal, as significant phylogenetic structure was detected in two body size traits across input trees. As the discipline of community phylogenetics continues to expand, it is important to investigate the best approaches to accurately estimate patterns. Our results suggest that emerging large datasets of DNA barcode sequences provide a vast resource for studying the structure of biological communities. PMID:26110886

  7. Construction of molecular evolutionary phylogenetic trees from DNA sequences based on minimum complexity principle.

    PubMed

    Ren, F; Tanaka, H; Gojobori, T

    1995-02-01

    Ever since the discovery of a molecular clock, many methods have been developed to reconstruct the molecular evolutionary phylogenetic trees. In this paper, we deal with the problem from the viewpoint of an inductive inference and apply Rissanen's minimum description length principle to extract the minimum complexity phylogenetic tree. Our method describes the complexity of the molecular phylogenetic tree by three terms which are related to the tree topology, the sum of the branch lengths and the difference between the model and the data measured by logarithmic likelihood. Five mitochondrial DNA sequences, from the human, the common chimpanzee, the pygmy chimpanzee, the gorilla and the orangutan, are used for investigating the validity of this method. It is suggested that this method might be superior to the traditional method in that it still shows good accuracy even near the root of phylogenetic trees. PMID:7796581

  8. Edge-Related Loss of Tree Phylogenetic Diversity in the Severely Fragmented Brazilian Atlantic Forest

    PubMed Central

    Santos, Bráulio A.; Arroyo-Rodríguez, Víctor; Moreno, Claudia E.; Tabarelli, Marcelo

    2010-01-01

    Deforestation and forest fragmentation are known major causes of nonrandom extinction, but there is no information about their impact on the phylogenetic diversity of the remaining species assemblages. Using a large vegetation dataset from an old hyper-fragmented landscape in the Brazilian Atlantic rainforest we assess whether the local extirpation of tree species and functional impoverishment of tree assemblages reduce the phylogenetic diversity of the remaining tree assemblages. We detected a significant loss of tree phylogenetic diversity in forest edges, but not in core areas of small (<80 ha) forest fragments. This was attributed to a reduction of 11% in the average phylogenetic distance between any two randomly chosen individuals from forest edges; an increase of 17% in the average phylogenetic distance to closest non-conspecific relative for each individual in forest edges; and to the potential manifestation of late edge effects in the core areas of small forest remnants. We found no evidence supporting fragmentation-induced phylogenetic clustering or evenness. This could be explained by the low phylogenetic conservatism of key life-history traits corresponding to vulnerable species. Edge effects must be reduced to effectively protect tree phylogenetic diversity in the severely fragmented Brazilian Atlantic forest. PMID:20838613
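
    The two quantities reported above, the average phylogenetic distance between pairs and the average distance to the closest relative, correspond to what are commonly called mean pairwise distance (MPD) and mean nearest taxon distance (MNTD). The sketch below computes both at the species level from a pairwise distance table. The abstract describes them per individual, so abundance weighting is left out here, and the distance values and names are made up for illustration.

```python
from itertools import combinations


def mean_pairwise_distance(taxa, dist):
    """Average phylogenetic distance over all unordered pairs of taxa."""
    pairs = list(combinations(taxa, 2))
    return sum(dist[frozenset(pair)] for pair in pairs) / len(pairs)


def mean_nearest_taxon_distance(taxa, dist):
    """Average, over taxa, of the distance to the closest other taxon."""
    nearest = [
        min(dist[frozenset((taxon, other))] for other in taxa if other != taxon)
        for taxon in taxa
    ]
    return sum(nearest) / len(nearest)


# Hypothetical patristic distances for three species in one plot.
dist = {
    frozenset(("A", "B")): 2.0,
    frozenset(("A", "C")): 6.0,
    frozenset(("B", "C")): 6.0,
}
taxa = ["A", "B", "C"]

print(round(mean_pairwise_distance(taxa, dist), 3))       # 4.667
print(round(mean_nearest_taxon_distance(taxa, dist), 3))  # 3.333
```

    Comparing these values between edge and core plots, as the study does with its own data, is what would reveal a change in phylogenetic diversity. The numbers here carry no empirical meaning.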

  9. Unrooted unordered homeomorphic subtree alignment of RNA trees.

    PubMed

    Milo, Nimrod; Zakov, Shay; Katzenelson, Erez; Bachmat, Eitan; Dinitz, Yefim; Ziv-Ukelson, Michal

    2013-01-01

    We generalize some current approaches for RNA tree alignment, which are traditionally confined to ordered rooted mappings, to also consider unordered unrooted mappings. We define the Homeomorphic Subtree Alignment problem (HSA), and present a new algorithm which applies to several modes, combining global or local, ordered or unordered, and rooted or unrooted tree alignments. Our algorithm generalizes previous algorithms that either solved the problem in an asymmetric manner, or were restricted to the rooted and/or ordered cases. Focusing here on the most general unrooted unordered case, we show that for input trees T and S, our algorithm has an O(n_T n_S + min(d_T, d_S) L_T L_S) time complexity, where n_T, L_T and d_T are the number of nodes, the number of leaves, and the maximum node degree in T, respectively (satisfying d_T ≤ L_T ≤ n_T), and similarly for n_S, L_S and d_S with respect to the tree S. This improves the time complexity of previous algorithms for less general variants of the problem. In order to obtain this time bound for HSA, we developed new algorithms for a generalized variant of the Min-Cost Bipartite Matching problem (MCM), as well as for two derivatives of this problem, entitled All-Cavity-MCM and All-Pairs-Cavity-MCM. For two input sets of size n and m, where n ≤ m, MCM and both its cavity derivatives are solved in O(n^3 + nm) time, without the usage of priority queues (e.g. Fibonacci heaps) or other complex data structures. This gives the first cubic time algorithm for All-Pairs-Cavity-MCM, and improves the running times of MCM and All-Cavity-MCM problems in the unbalanced case where n ≪ m. We implemented the algorithm (in all modes mentioned above) as a graphical software tool which computes and displays similarities between secondary structures of RNA given as input, and applied it in a preliminary experiment in which we ran all-against-all inter-family pairwise alignments of RNAse P and Hammerhead RNA

  10. [A bird's eye view of the algorithms and software packages for reconstructing phylogenetic trees].

    PubMed

    Zhang, Li-Na; Rong, Chang-He; He, Yuan; Guan, Qiong; He, Bin; Zhu, Xing-Wen; Liu, Jia-Ni; Chen, Hong-Ju

    2013-12-01

    The prototype phylogenetic tree, i.e., evolutionary "tree" or "tree of life", was first conceived by Charles Darwin in his seminal book "The Origin of Species", and its reconstructions have been approached by generations of biologists ever since. In this article, we briefly reviewed the major algorithms and software packages for reconstructing phylogenetic trees. Specifically we discuss four categories of phylogeny algorithms including distance-matrix, maximum parsimony, maximum likelihood, and Bayesian framework, as well as software packages (PHYLIP, MEGA, MrBayes) based on them. PMID:24415699
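
    To make the distance-matrix category mentioned above concrete, here is a minimal sketch of UPGMA, one classical distance-matrix clustering method. It is not taken from the reviewed article or from any of the listed packages; the function name, the input layout (a dict keyed by frozenset pairs), and the toy distances are assumptions made for illustration. Only the topology is returned; branch lengths are not computed.

```python
from itertools import combinations


def upgma(names, dist):
    """Build a nested-tuple tree by UPGMA (average-linkage clustering).

    names: list of taxon labels
    dist:  dict mapping frozenset({a, b}) -> distance between taxa a and b
    """
    # Each cluster id maps to (subtree, number of taxa in the subtree).
    clusters = {i: (name, 1) for i, name in enumerate(names)}
    # Distances between cluster ids, keyed as (smaller id, larger id).
    d = {}
    for i, j in combinations(range(len(names)), 2):
        d[(i, j)] = dist[frozenset((names[i], names[j]))]
    next_id = len(names)

    while len(clusters) > 1:
        # Merge the closest pair of clusters.
        i, j = min(d, key=d.get)
        tree_i, n_i = clusters.pop(i)
        tree_j, n_j = clusters.pop(j)
        # Distance from the merged cluster to every other cluster is the
        # size-weighted average of the two old distances.
        for k in clusters:
            d_ik = d.pop((min(i, k), max(i, k)))
            d_jk = d.pop((min(j, k), max(j, k)))
            d[(k, next_id)] = (n_i * d_ik + n_j * d_jk) / (n_i + n_j)
        del d[(i, j)]
        clusters[next_id] = ((tree_i, tree_j), n_i + n_j)
        next_id += 1

    tree, _ = next(iter(clusters.values()))
    return tree


# Toy distances between four hypothetical taxa (invented values).
toy = {
    frozenset(("A", "B")): 2.0,
    frozenset(("A", "C")): 4.0,
    frozenset(("B", "C")): 4.0,
    frozenset(("A", "D")): 8.0,
    frozenset(("B", "D")): 8.0,
    frozenset(("C", "D")): 8.0,
}
print(upgma(["A", "B", "C", "D"], toy))
```

    UPGMA implicitly assumes a constant rate of evolution across lineages, which is one reason the other three categories (maximum parsimony, maximum likelihood, Bayesian) are often preferred in practice.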

  11. A novel approach to phylogenetic tree construction using stochastic optimization and clustering

    PubMed Central

    Qin, Ling; Chen, Yixin; Pan, Yi; Chen, Ling

    2006-01-01

    Background: The problem of inferring the evolutionary history and constructing the phylogenetic tree with high performance has become one of the major problems in computational biology. Results: A new phylogenetic tree construction method from a given set of objects (proteins, species, etc.) is presented. As an extension of ant colony optimization, this method proposes an adaptive phylogenetic clustering algorithm based on a digraph to find a tree structure that defines the ancestral relationships among the given objects. Conclusion: Our phylogenetic tree construction method is tested to compare its results with those of the genetic algorithm (GA). Experimental results show that our algorithm converges much faster and also achieves higher quality than GA. PMID:17217517

  12. Reverse transcriptase domain sequences from tree peony (Paeonia suffruticosa) long terminal repeat retrotransposons: sequence characterization and phylogenetic analysis

    PubMed Central

    Guo, Da-Long; Hou, Xiao-Gai; Jia, Tian

    2014-01-01

    Tree peony is an important horticultural plant worldwide of great ornamental and medicinal value. Long terminal repeat retrotransposons (LTR-retrotransposons) are the major components of most plant genomes and can substantially impact the genome in many ways. It is therefore crucial to understand their sequence characteristics, genetic distribution and transcriptional activity; however, no information about them is available for tree peony. In this study, Ty1-copia-like reverse transcriptase sequences were amplified from tree peony genomic DNA by polymerase chain reaction (PCR) with degenerate oligonucleotide primers corresponding to highly conserved domains of the Ty1-copia-like retrotransposons. PCR fragments of roughly 270 bp were isolated and cloned, and 33 sequences were obtained. According to alignment and phylogenetic analysis, all sequences were divided into six families. The observed differences in the degree of nucleotide sequence similarity indicate a high level of sequence heterogeneity among these clones. Most of these sequences have a frame shift, a stop codon, or both. Dot-blot analysis revealed that these sequences are distributed in all the studied tree peony species. However, different hybridization signals were detected among them, which is in agreement with previous systematics studies. Reverse transcriptase PCR (RT-PCR) indicated that Ty1-copia retrotransposons in tree peony were transcriptionally inactive. The results provide basic genetic and evolutionary information on the tree peony genome, and will provide valuable information for the further utilization of retrotransposons in tree peony. PMID:26019529

  13. PhySortR: a fast, flexible tool for sorting phylogenetic trees in R.

    PubMed

    Stephens, Timothy G; Bhattacharya, Debashish; Ragan, Mark A; Chan, Cheong Xin

    2016-01-01

    A frequent bottleneck in interpreting phylogenomic output is the need to screen often thousands of trees for features of interest, particularly robust clades of specific taxa, as evidence of monophyletic relationship and/or reticulated evolution. Here we present PhySortR, a fast, flexible R package for classifying phylogenetic trees. Unlike existing utilities, PhySortR allows for identification of both exclusive and non-exclusive clades uniting the target taxa based on tip labels (i.e., leaves) on a tree, with customisable options to assess clades within the context of the whole tree. Using simulated and empirical datasets, we demonstrate the potential and scalability of PhySortR in analysis of thousands of phylogenetic trees without a priori assumption of tree-rooting, and in yielding readily interpretable trees that unambiguously satisfy the query. PhySortR is a command-line tool that is freely available and easily automatable. PMID:27190724
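
    The notion of an exclusive clade defined by tip labels, as described above, can be illustrated with a short sketch. This is not PhySortR's interface (PhySortR is an R package); it is a Python illustration under stated assumptions: trees are written as nested tuples of tip-label strings, the function names are hypothetical, and only the exclusive case is covered. To avoid depending on where the tree is rooted, a clade is accepted if some branch splits the tips into exactly the target set and everything else.

```python
def leaf_sets(tree):
    """Return (all tip labels under tree, tip sets of every subtree)."""
    if isinstance(tree, str):
        tips = frozenset([tree])
        return tips, [tips]
    all_tips = frozenset()
    collected = []
    for child in tree:
        tips, subsets = leaf_sets(child)
        all_tips |= tips
        collected.extend(subsets)
    collected.append(all_tips)
    return all_tips, collected


def has_exclusive_clade(tree, is_target):
    """True if one branch separates all target tips from all other tips.

    is_target: function taking a tip label and returning True for target taxa.
    """
    tips, subsets = leaf_sets(tree)
    targets = frozenset(t for t in tips if is_target(t))
    if not targets:
        return False
    # Check both sides of every branch, so the chosen root does not matter.
    return any(s == targets or tips - s == targets for s in subsets)


# Hypothetical tip labels; the prefix before "_" names the taxon group.
tree = ((("algae_1", "algae_2"), "plant_1"), ("fungi_1", "animal_1"))
print(has_exclusive_clade(tree, lambda t: t.startswith("algae")))
print(has_exclusive_clade(tree, lambda t: t.startswith(("algae", "fungi"))))
```

    Screening thousands of trees then amounts to applying such a predicate to each tree and sorting the trees by the result.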

  14. PhySortR: a fast, flexible tool for sorting phylogenetic trees in R

    PubMed Central

    Stephens, Timothy G.; Bhattacharya, Debashish; Ragan, Mark A.

    2016-01-01

    A frequent bottleneck in interpreting phylogenomic output is the need to screen often thousands of trees for features of interest, particularly robust clades of specific taxa, as evidence of monophyletic relationship and/or reticulated evolution. Here we present PhySortR, a fast, flexible R package for classifying phylogenetic trees. Unlike existing utilities, PhySortR allows for identification of both exclusive and non-exclusive clades uniting the target taxa based on tip labels (i.e., leaves) on a tree, with customisable options to assess clades within the context of the whole tree. Using simulated and empirical datasets, we demonstrate the potential and scalability of PhySortR in analysis of thousands of phylogenetic trees without a priori assumption of tree-rooting, and in yielding readily interpretable trees that unambiguously satisfy the query. PhySortR is a command-line tool that is freely available and easily automatable. PMID:27190724

  15. EvolView, an online tool for visualizing, annotating and managing phylogenetic trees.

    PubMed

    Zhang, Huangkai; Gao, Shenghan; Lercher, Martin J; Hu, Songnian; Chen, Wei-Hua

    2012-07-01

    EvolView is a web application for visualizing, annotating and managing phylogenetic trees. First, EvolView is a phylogenetic tree viewer and customization tool; it visualizes trees in various formats, customizes them through built-in functions that can link information from external datasets, and exports the customized results to publication-ready figures. Second, EvolView is a tree and dataset management tool: users can easily organize related trees into distinct projects, add new datasets to trees and edit and manage existing trees and datasets. To make EvolView easy to use, it is equipped with an intuitive user interface. With a free account, users can save data and manipulations on the EvolView server. EvolView is freely available at: http://www.evolgenius.info/evolview.html. PMID:22695796

  16. Universal Artifacts Affect the Branching of Phylogenetic Trees, Not Universal Scaling Laws

    PubMed Central

    Altaba, Cristian R.

    2009-01-01

    Background: The superficial resemblance of phylogenetic trees to other branching structures allows searching for macroevolutionary patterns. However, such trees are just statistical inferences of particular historical events. Recent meta-analyses report finding regularities in the branching pattern of phylogenetic trees. But is this supported by evidence, or are such regularities just methodological artifacts? If so, is there any signal in a phylogeny? Methodology: In order to evaluate the impact of polytomies and imbalance on tree shape, the distribution of all binary and polytomic trees of up to 7 taxa was assessed in tree-shape space. The relationship between the proportion of outgroups and the amount of imbalance introduced with them was assessed by applying four different tree-building methods to 100 combinations from a set of 10 ingroup and 9 outgroup species, and performing covariance analyses. The relevance of this analysis was explored using 61 published phylogenies, based on nucleic acid sequences and involving various taxa, taxonomic levels, and tree-building methods. Principal Findings: All methods of phylogenetic inference are quite sensitive to the artifacts introduced by outgroups. However, published phylogenies appear to be subject to a rather effective, albeit rather intuitive control against such artifacts. The data and methods used to build phylogenetic trees are varied, so any meta-analysis is subject to pitfalls due to their uneven intrinsic merits, which translate into artifacts in tree shape. The binary branching pattern is an imposition of methods, and seldom reflects true relationships in intraspecific analyses, yielding artifactual polytomies in short trees. Above the species level, the departure of real trees from simplistic random models is caused by at least two natural factors (uneven speciation and extinction rates) and by artifacts such as the choice of taxa included in the analysis and the imbalance introduced by outgroups and basal paraphyletic

  17. Climate-driven extinctions shape the phylogenetic structure of temperate tree floras.

    PubMed

    Eiserhardt, Wolf L; Borchsenius, Finn; Plum, Christoffer M; Ordonez, Alejandro; Svenning, Jens-Christian

    2015-03-01

    When taxa go extinct, unique evolutionary history is lost. If extinction is selective, and the intrinsic vulnerabilities of taxa show phylogenetic signal, more evolutionary history may be lost than expected under random extinction. Under what conditions this occurs is insufficiently known. We show that late Cenozoic climate change induced phylogenetically selective regional extinction of northern temperate trees because of phylogenetic signal in cold tolerance, leading to significantly and substantially larger than random losses of phylogenetic diversity (PD). The surviving floras in regions that experienced stronger extinction are phylogenetically more clustered, indicating that non-random losses of PD are of increasing concern with increasing extinction severity. Using simulations, we show that a simple threshold model of survival given a physiological trait with phylogenetic signal reproduces our findings. Our results send a strong warning that we may expect future assemblages to be phylogenetically and possibly functionally depauperate if anthropogenic climate change affects taxa similarly. PMID:25604755

  18. Structure-Based Sequence Alignment of the Transmembrane Domains of All Human GPCRs: Phylogenetic, Structural and Functional Implications

    PubMed Central

    Cvicek, Vaclav; Goddard, William A.; Abrol, Ravinder

    2016-01-01

    The understanding of G-protein coupled receptors (GPCRs) is undergoing a revolution due to increased information about their signaling and the experimental determination of structures for more than 25 receptors. The availability of at least one receptor structure for each of the GPCR classes, well separated in sequence space, enables an integrated superfamily-wide analysis to identify signatures involving the role of conserved residues, conserved contacts, and downstream signaling in the context of receptor structures. In this study, we align the transmembrane (TM) domains of all experimental GPCR structures to maximize the conserved inter-helical contacts. The resulting superfamily-wide GpcR Sequence-Structure (GRoSS) alignment of the TM domains for all human GPCR sequences is sufficient to generate a phylogenetic tree that correctly distinguishes all different GPCR classes, suggesting that the class-level differences in the GPCR superfamily are encoded at least partly in the TM domains. The inter-helical contacts conserved across all GPCR classes describe the evolutionarily conserved GPCR structural fold. The corresponding structural alignment of the inactive and active conformations, available for a few GPCRs, identifies activation hot-spot residues in the TM domains that get rewired upon activation. Many GPCR mutations, known to alter receptor signaling and cause disease, are located at these conserved contact and activation hot-spot residue positions. The GRoSS alignment places the chemosensory receptor subfamilies for bitter taste (TAS2R) and pheromones (Vomeronasal, VN1R) in the rhodopsin family, known to contain the chemosensory olfactory receptor subfamily. The GRoSS alignment also enables the quantification of the structural variability in the TM regions of experimental structures, useful for homology modeling and structure prediction of receptors. 
Furthermore, this alignment identifies structurally and functionally important residues in all human GPCRs

  19. Structure-Based Sequence Alignment of the Transmembrane Domains of All Human GPCRs: Phylogenetic, Structural and Functional Implications.

    PubMed

    Cvicek, Vaclav; Goddard, William A; Abrol, Ravinder

    2016-03-01

    The understanding of G-protein coupled receptors (GPCRs) is undergoing a revolution due to increased information about their signaling and the experimental determination of structures for more than 25 receptors. The availability of at least one receptor structure for each of the GPCR classes, well separated in sequence space, enables an integrated superfamily-wide analysis to identify signatures involving the role of conserved residues, conserved contacts, and downstream signaling in the context of receptor structures. In this study, we align the transmembrane (TM) domains of all experimental GPCR structures to maximize the conserved inter-helical contacts. The resulting superfamily-wide GpcR Sequence-Structure (GRoSS) alignment of the TM domains for all human GPCR sequences is sufficient to generate a phylogenetic tree that correctly distinguishes all different GPCR classes, suggesting that the class-level differences in the GPCR superfamily are encoded at least partly in the TM domains. The inter-helical contacts conserved across all GPCR classes describe the evolutionarily conserved GPCR structural fold. The corresponding structural alignment of the inactive and active conformations, available for a few GPCRs, identifies activation hot-spot residues in the TM domains that get rewired upon activation. Many GPCR mutations, known to alter receptor signaling and cause disease, are located at these conserved contact and activation hot-spot residue positions. The GRoSS alignment places the chemosensory receptor subfamilies for bitter taste (TAS2R) and pheromones (Vomeronasal, VN1R) in the rhodopsin family, known to contain the chemosensory olfactory receptor subfamily. The GRoSS alignment also enables the quantification of the structural variability in the TM regions of experimental structures, useful for homology modeling and structure prediction of receptors. 
Furthermore, this alignment identifies structurally and functionally important residues in all human GPCRs

  20. An Alignment-Free Approach for Eukaryotic ITS2 Annotation and Phylogenetic Inference

    PubMed Central

    Hidalgo-Yanes, Pedro I.; Pérez-Castillo, Yunierkis; Molina-Ruiz, Reinaldo; Marchal, Kathleen; Vasconcelos, Vítor; Antunes, Agostinho

    2011-01-01

    The ITS2 gene class shows a high sequence divergence among its members that has complicated its annotation and its use for reconstructing phylogenies at a higher taxonomical level (beyond species and genus). Several alignment strategies have been implemented to improve the ITS2 annotation quality and its use for phylogenetic inferences. Although alignment-based methods have been pushed to the limit of their complexity to tackle both issues, no alignment-free approach has been able to successfully address both topics. By contrast, the use of simple alignment-free classifiers, like the topological indices (TIs) containing information about the sequence and structure of ITS2, may prove to be a useful approach for gene prediction and for assessing the phylogenetic relationships of the ITS2 class in eukaryotes. Thus, we used the TI2BioP (Topological Indices to BioPolymers) methodology [1], [2], freely available at http://ti2biop.sourceforge.net/, to calculate two different TIs. One class was derived from the ITS2 artificial 2D structures generated from DNA strings and the other from the secondary structure inferred from RNA folding algorithms. Two alignment-free models based on Artificial Neural Networks were developed for the ITS2 class prediction using the two classes of TIs referred to above. Both models showed similar performances on the training and the test sets, reaching values above 95% in the overall classification. Due to the importance of the ITS2 region for fungi identification, a novel ITS2 genomic sequence was isolated from Petrakia sp. This sequence and the test set were used to comparatively evaluate the conventional classification models based on multiple sequence alignments, like Hidden Markov based approaches, revealing the success of our models in identifying novel ITS2 members. The isolated sequence was assessed using traditional and alignment-free based techniques applied to phylogenetic inference to complement the taxonomy of the Petrakia sp

  1. Phylogenetic Impoverishment of Amazonian Tree Communities in an Experimentally Fragmented Forest Landscape

    PubMed Central

    Santos, Bráulio A.; Tabarelli, Marcelo; Melo, Felipe P. L.; Camargo, José L. C.; Andrade, Ana; Laurance, Susan G.; Laurance, William F.

    2014-01-01

    Amazonian rainforests sustain some of the richest tree communities on Earth, but their ecological and evolutionary responses to human threats remain poorly known. We used one of the largest experimental datasets currently available on tree dynamics in fragmented tropical forests and a recent phylogeny of angiosperms to test whether tree communities have lost phylogenetic diversity since their isolation about two decades previously. Our findings revealed an overall trend toward phylogenetic impoverishment across the experimentally fragmented landscape, irrespective of whether tree communities were in 1-ha, 10-ha, or 100-ha forest fragments, near forest edges, or in continuous forest. The magnitude of the phylogenetic diversity loss was low (<2% relative to before-fragmentation values) but widespread throughout the study landscape, occurring in 32 of 40 1-ha plots. Consistent with this loss in phylogenetic diversity, we observed a significant decrease of 50% in phylogenetic dispersion since forest isolation, irrespective of plot location. Analyses based on tree genera that have significantly increased (28 genera) or declined (31 genera) in abundance and basal area in the landscape revealed that increasing genera are more phylogenetically related than decreasing ones. Also, the loss of phylogenetic diversity was greater in tree communities where increasing genera proliferated and decreasing genera reduced their importance values, suggesting that this taxonomic replacement is partially underlying the phylogenetic impoverishment at the landscape scale. This finding has clear implications for the current debate about the role human-modified landscapes play in sustaining biodiversity persistence and key ecosystem services, such as carbon storage. Although the generalization of our findings to other fragmented tropical forests is uncertain, it could negatively affect ecosystem productivity and stability and have broader impacts on coevolved organisms. PMID:25409011

  2. Potential pitfalls of modelling ribosomal RNA data in phylogenetic tree reconstruction: Evidence from case studies in the Metazoa

    PubMed Central

    2011-01-01

    Background: Failure to account for covariation patterns in helical regions of ribosomal RNA (rRNA) genes has the potential to misdirect the estimation of the phylogenetic signal of the data. Furthermore, the extremes of length variation among taxa, combined with regional substitution rate variation, can mislead the alignment of rRNA sequences and thus distort subsequent tree reconstructions. However, recent developments in phylogenetic methodology now allow a comprehensive integration of secondary structures in alignment and tree reconstruction analyses based on rRNA sequences, which has been shown to correct some of these problems. Here, we explore the potential of RNA substitution models and the interactions of specific model setups with the inherent pattern of covariation in rRNA stems and substitution rate variation among loop regions. Results: We found an explicit impact of RNA substitution models on tree reconstruction analyses. The application of specific RNA models in tree reconstructions is hampered by interaction between the appropriate modelling of covarying sites in stem regions, and excessive homoplasy in some loop regions. RNA models often failed to recover reasonable trees when single-stranded regions are excessively homoplastic, because these regions contribute a greater proportion of the data when covarying sites are essentially downweighted. In this context, the RNA6A model outperformed all other models, including the more parametrized RNA7 and RNA16 models. Conclusions: Our results depict a trade-off between increased accuracy in estimation of interdependencies in helical regions and the risk of magnifying positions lacking phylogenetic signal. We can therefore conclude that caution is warranted when applying rRNA covariation models, and suggest that loop regions be independently screened for phylogenetic signal, and eliminated when they are indistinguishable from random noise. In addition to covariation and homoplasy, other factors, like non

  3. The impact of rRNA secondary structure consideration in alignment and tree reconstruction: simulated data and a case study on the phylogeny of hexapods.

    PubMed

    Letsch, Harald O; Kück, Patrick; Stocsits, Roman R; Misof, Bernhard

    2010-11-01

    The use of secondary structures has been advocated to improve both the alignment and the tree reconstruction processes of ribosomal RNA (rRNA) data sets. We used simulated and empirical rRNA data to test the impact of secondary structure consideration in both steps of molecular phylogenetic analyses. A simulation approach was used to generate realistic rRNA data sets based on real 16S, 18S, and 28S sequences and structures in combination with different branch length and topologies. Alignment and tree reconstruction performance of four recent structural alignment methods was compared with exclusively sequence-based approaches. As empirical data, we used a hexapod rRNA data set to study the influence of nucleotide interdependencies in sequence alignment and tree reconstruction. Structural alignment methods delivered significantly better sequence alignments compared with pure sequence-based methods. Also, structural alignment methods delivered better trees judged by topological congruence to simulation base trees. However, the advantage of structural alignments was less pronounced and even vanished in several instances. For simulated data, application of mixed RNA/DNA models to stems and loops, respectively, led to significantly shorter branches. The application of mixed RNA/DNA models in the hexapod analyses delivered partly implausible relationships. This can be interpreted as a stronger sensitivity of mixed model setups to nonphylogenetic signal. Secondary structure consideration clearly influenced sequence alignment and tree reconstruction of ribosomal genes. Although sequence alignment quality can considerably be improved by the use of secondary structure information, the application of mixed models in tree reconstructions needs further studies to understand the observed effects. PMID:20530152

  4. An approximately unbiased test of phylogenetic tree selection.

    PubMed

    Shimodaira, Hidetoshi

    2002-06-01

    An approximately unbiased (AU) test that uses a newly devised multiscale bootstrap technique was developed for general hypothesis testing of regions in an attempt to reduce test bias. It was applied to maximum-likelihood tree selection for obtaining the confidence set of trees. The AU test is based on the theory of Efron et al. (Proc. Natl. Acad. Sci. USA 93:13429-13434; 1996), but the new method provides higher-order accuracy yet simpler implementation. The AU test, like the Shimodaira-Hasegawa (SH) test, adjusts the selection bias overlooked in the standard use of the bootstrap probability and Kishino-Hasegawa tests. The selection bias comes from comparing many trees at the same time and often leads to overconfidence in the wrong trees. The SH test, though safe to use, may exhibit another type of bias such that it appears conservative. Here I show that the AU test is less biased than other methods in typical cases of tree selection. These points are illustrated in a simulation study as well as in the analysis of mammalian mitochondrial protein sequences. The theoretical argument provides a simple formula that covers the bootstrap probability test, the Kishino-Hasegawa test, the AU test, and the Zharkikh-Li test. A practical suggestion is provided as to which test should be used under particular circumstances. PMID:12079646

  5. Phylo.io: Interactive Viewing and Comparison of Large Phylogenetic Trees on the Web

    PubMed Central

    Robinson, Oscar; Dylus, David; Dessimoz, Christophe

    2016-01-01

    Phylogenetic trees are pervasively used to depict evolutionary relationships. Increasingly, researchers need to visualize large trees and compare multiple large trees inferred for the same set of taxa (reflecting uncertainty in the tree inference or genuine discordance among the loci analyzed). Existing tree visualization tools are however not well suited to these tasks. In particular, side-by-side comparison of trees can prove challenging beyond a few dozen taxa. Here, we introduce Phylo.io, a web application to visualize and compare phylogenetic trees side-by-side. Its distinctive features are: highlighting of similarities and differences between two trees, automatic identification of the best matching rooting and leaf order, scalability to large trees, high usability, multiplatform support via standard HTML5 implementation, and possibility to store and share visualizations. The tool can be freely accessed at http://phylo.io and can easily be embedded in other web servers. The code for the associated JavaScript library is available at https://github.com/DessimozLab/phylo-io under an MIT open source license. PMID:27189561

  6. Phylo.io: Interactive Viewing and Comparison of Large Phylogenetic Trees on the Web.

    PubMed

    Robinson, Oscar; Dylus, David; Dessimoz, Christophe

    2016-08-01

    Phylogenetic trees are pervasively used to depict evolutionary relationships. Increasingly, researchers need to visualize large trees and compare multiple large trees inferred for the same set of taxa (reflecting uncertainty in the tree inference or genuine discordance among the loci analyzed). Existing tree visualization tools are however not well suited to these tasks. In particular, side-by-side comparison of trees can prove challenging beyond a few dozen taxa. Here, we introduce Phylo.io, a web application to visualize and compare phylogenetic trees side-by-side. Its distinctive features are: highlighting of similarities and differences between two trees, automatic identification of the best matching rooting and leaf order, scalability to large trees, high usability, multiplatform support via standard HTML5 implementation, and possibility to store and share visualizations. The tool can be freely accessed at http://phylo.io and can easily be embedded in other web servers. The code for the associated JavaScript library is available at https://github.com/DessimozLab/phylo-io under an MIT open source license. PMID:27189561

  7. Analyzing phylogenetic trees with timed and probabilistic model checking: the lactose persistence case study.

    PubMed

    Requeno, José Ignacio; Colom, José Manuel

    2014-01-01

    Model checking is a generic verification technique that allows the phylogeneticist to focus on models and specifications instead of on implementation issues. Phylogenetic trees are considered as transition systems over which we interrogate phylogenetic questions written as formulas of temporal logic. Nonetheless, standard logics become insufficient for certain practices of phylogenetic analysis since they do not allow the inclusion of explicit time and probabilities. The aim of this paper is to extend the application of model checking techniques beyond qualitative phylogenetic properties and adapt the existing logical extensions and tools to the field of phylogeny. The introduction of time and probabilities in phylogenetic specifications is motivated by the study of a real example: the analysis of the ratio of lactose intolerance in some populations and the date of appearance of this phenotype. PMID:25339082

  8. A simulation approach for change-points on phylogenetic trees.

    PubMed

    Persing, Adam; Jasra, Ajay; Beskos, Alexandros; Balding, David; De Iorio, Maria

    2015-01-01

    We observe n sequences at each of m sites and assume that they have evolved from an ancestral sequence that forms the root of a binary tree of known topology and branch lengths, but the sequence states at internal nodes are unknown. The topology of the tree and branch lengths are the same for all sites, but the parameters of the evolutionary model can vary over sites. We assume a piecewise constant model for these parameters, with an unknown number of change-points and hence a transdimensional parameter space over which we seek to perform Bayesian inference. We propose two novel ideas to deal with the computational challenges of such inference. Firstly, we approximate the model based on the time machine principle: the top nodes of the binary tree (near the root) are replaced by an approximation of the true distribution; as more nodes are removed from the top of the tree, the cost of computing the likelihood is reduced linearly in n. The approach introduces a bias, which we investigate empirically. Secondly, we develop a particle marginal Metropolis-Hastings (PMMH) algorithm, that employs a sequential Monte Carlo (SMC) sampler and can use the first idea. Our time-machine PMMH algorithm copes well with one of the bottle-necks of standard computational algorithms: the transdimensional nature of the posterior distribution. The algorithm is implemented on simulated and real data examples, and we empirically demonstrate its potential to outperform competing methods based on approximate Bayesian computation (ABC) techniques. PMID:25506749

  9. Estimating the Effective Sample Size of Tree Topologies from Bayesian Phylogenetic Analyses

    PubMed Central

    Lanfear, Robert; Hua, Xia; Warren, Dan L.

    2016-01-01

    Bayesian phylogenetic analyses estimate posterior distributions of phylogenetic tree topologies and other parameters using Markov chain Monte Carlo (MCMC) methods. Before making inferences from these distributions, it is important to assess their adequacy. To this end, the effective sample size (ESS) estimates how many truly independent samples of a given parameter the output of the MCMC represents. The ESS of a parameter is frequently much lower than the number of samples taken from the MCMC because sequential samples from the chain can be non-independent due to autocorrelation. Typically, phylogeneticists use a rule of thumb that the ESS of all parameters should be greater than 200. However, we have no method to calculate an ESS of tree topology samples, despite the fact that the tree topology is often the parameter of primary interest and is almost always central to the estimation of other parameters. That is, we lack a method to determine whether we have adequately sampled one of the most important parameters in our analyses. In this study, we address this problem by developing methods to estimate the ESS for tree topologies. We combine these methods with two new diagnostic plots for assessing posterior samples of tree topologies, and compare their performance on simulated and empirical data sets. Combined, the methods we present provide new ways to assess the mixing and convergence of phylogenetic tree topologies in Bayesian MCMC analyses. PMID:27435794
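
    For a scalar parameter, the ESS described above can be sketched as below. This is a minimal illustration, not the topology-specific methods developed in the paper: it assumes the common estimator n / (1 + 2 Σ ρ_k), with the autocorrelation sum truncated at the first non-positive lag, and the helper `ar1_chain` is a hypothetical stand-in for MCMC output.

```python
import random


def autocorrelation(x, lag):
    """Sample autocorrelation of the sequence x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    if var == 0:
        return 0.0
    cov = sum((x[i] - mean) * (x[i + lag] - mean) for i in range(n - lag))
    return cov / var


def effective_sample_size(x, max_lag=None):
    """ESS = n / (1 + 2 * sum of positive-lag autocorrelations).

    The sum stops at the first non-positive autocorrelation (an assumed,
    simple truncation rule; other rules are in use).
    """
    n = len(x)
    if max_lag is None:
        max_lag = n // 2
    rho_sum = 0.0
    for lag in range(1, max_lag + 1):
        rho = autocorrelation(x, lag)
        if rho <= 0:
            break
        rho_sum += rho
    return n / (1 + 2 * rho_sum)


def ar1_chain(n, phi, seed=0):
    """Hypothetical stand-in for a trace: an AR(1) series with coefficient phi."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1)]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0, 1))
    return x
```

    A strongly autocorrelated trace such as `ar1_chain(2000, 0.9)` should yield an ESS far below its 2000 samples, whereas `ar1_chain(2000, 0.0)` should stay close to 2000. Tree topologies are not scalars, which is why the abstract notes that a dedicated approach is needed for them.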

  10. PhyloPen: Phylogenetic Tree Browsing Using a Pen and Touch Interface

    PubMed Central

    Wehrer, Anthony; Yee, Andrew; Lisle, Curtis; Hughes, Charles

    2015-01-01

    Phylogenetic trees are used by researchers across multiple fields of study to display historical relationships between organisms or genes. Trees are used to examine the speciation process in evolutionary biology, to classify families of viruses in epidemiology, to demonstrate co-speciation in host and pathogen studies, and to explore genetic changes occurring during the disease process in cancer, among other applications. Due to their complexity and the amount of data they present in visual form, phylogenetic trees have generally been difficult to render for publication and challenging to directly interact with in digital form. To address these limitations, we developed PhyloPen, an experimental novel multi-touch and pen application that renders a phylogenetic tree and allows users to interactively navigate within the tree, examining nodes, branches, and auxiliary information, and annotate the tree for note-taking and collaboration. We present a discussion of the interactions implemented in PhyloPen and the results of a formative study that examines how the application was received after use by practicing biologists -- faculty members and graduate students in the discipline. These results are to be later used for a fully supported implementation of the software where the community will be welcomed to participate in its development. PMID:26693078

  11. Evolview v2: an online visualization and management tool for customized and annotated phylogenetic trees.

    PubMed

    He, Zilong; Zhang, Huangkai; Gao, Shenghan; Lercher, Martin J; Chen, Wei-Hua; Hu, Songnian

    2016-07-01

    Evolview is an online visualization and management tool for customized and annotated phylogenetic trees. It allows users to visualize phylogenetic trees in various formats, customize the trees through built-in functions and user-supplied datasets and export the customization results to publication-ready figures. Its 'dataset system' contains not only the data to be visualized on the tree, but also 'modifiers' that control various aspects of the graphical annotation. Evolview is a single-page application (like Gmail); its carefully designed interface allows users to upload, visualize, manipulate and manage trees and datasets all in a single webpage. Developments since the last public release include a modern dataset editor with keyword highlighting functionality, seven newly added types of annotation datasets, collaboration support that allows users to share their trees and datasets and various improvements of the web interface and performance. In addition, we included eleven new 'Demo' trees to demonstrate the basic functionalities of Evolview, and five new 'Showcase' trees inspired by publications to showcase the power of Evolview in producing publication-ready figures. Evolview is freely available at: http://www.evolgenius.info/evolview/. PMID:27131786

  12. Sharing and re-use of phylogenetic trees (and associated data) to facilitate synthesis

    PubMed Central

    2012-01-01

    Background: Recently, various evolution-related journals adopted policies to encourage or require archiving of phylogenetic trees and associated data. Such attention to practices that promote sharing of data reflects rapidly improving information technology, and rapidly expanding potential to use this technology to aggregate and link data from previously published research. Nevertheless, little is known about current practices, or best practices, for publishing trees and associated data so as to promote re-use. Findings: Here we summarize results of an ongoing analysis of current practices for archiving phylogenetic trees and associated data, current practices of re-use, and current barriers to re-use. We find that the technical infrastructure is available to support rudimentary archiving, but the frequency of archiving is low. Currently, most phylogenetic knowledge is not easily re-used due to a lack of archiving, lack of awareness of best practices, and lack of community-wide standards for formatting data, naming entities, and annotating data. Most attempts at data re-use seem to end in disappointment. Nevertheless, we find many positive examples of data re-use, particularly those that involve customized species trees generated by grafting to, and pruning from, a much larger tree. Conclusions: The technologies and practices that facilitate data re-use can catalyze synthetic and integrative research. However, success will require engagement from various stakeholders including individual scientists who produce or consume shareable data, publishers, policy-makers, technology developers and resource-providers. The critical challenges for facilitating re-use of phylogenetic trees and associated data, we suggest, include: a broader commitment to public archiving; more extensive use of globally meaningful identifiers; development of user-friendly technology for annotating, submitting, searching, and retrieving data and their metadata; and development of a minimum reporting

  13. Climate Change Impacts on the Tree of Life: Changes in Phylogenetic Diversity Illustrated for Acropora Corals

    PubMed Central

    Faith, Daniel P.; Richards, Zoe T.

    2012-01-01

    The possible loss of whole branches from the tree of life is a dramatic, but under-studied, biological implication of climate change. The tree of life represents an evolutionary heritage providing both present and future benefits to humanity, often in unanticipated ways. Losses in this evolutionary (evo) life-support system represent losses in “evosystem” services, and are quantified using the phylogenetic diversity (PD) measure. High species-level biodiversity losses may or may not correspond to high PD losses. If climate change impacts are clumped on the phylogeny, then loss of deeper phylogenetic branches can mean disproportionately large PD loss for a given degree of species loss. Over time, successive species extinctions within a clade may each imply only a moderate loss of PD, until the last species within that clade goes extinct and PD drops precipitously. Emerging methods of “phylogenetic risk analysis” address such phylogenetic tipping points by adjusting conservation priorities to better reflect the risk of such worst-case losses. We have further developed and explored this approach for one of the most threatened taxonomic groups, corals. Based on a phylogenetic tree for the coral genus Acropora, we identify cases where worst-case PD losses may be avoided by designing risk-averse conservation priorities. We also propose spatial heterogeneity measures to assess possible changes in the geographic distribution of coral PD. PMID:24832524
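
    The tipping-point behaviour of PD described above can be illustrated with a small sketch. It assumes the usual rooted definition of PD (the total length of all branches on the paths from the root to the surviving species); the four-branch tree and its branch lengths are invented for the example and are not Acropora data.

```python
def phylogenetic_diversity(parent, species):
    """Sum the branch lengths spanned by `species` on a rooted tree.

    `parent` maps each non-root node to (parent_node, branch_length).
    Each branch is counted once, however many species share it.
    """
    counted = set()
    total = 0.0
    for tip in species:
        node = tip
        # Walk towards the root until reaching it or an already-counted branch.
        while node in parent and node not in counted:
            counted.add(node)
            up, length = parent[node]
            total += length
            node = up
    return total


# Hypothetical tree: clade X (species A and B) sits on a long branch; C is alone.
toy_parent = {
    "A": ("X", 1.0),
    "B": ("X", 1.0),
    "X": ("root", 5.0),
    "C": ("root", 6.0),
}
```

    With all three species present PD is 13.0. Losing A alone reduces it to 12.0, but losing B as well removes the deep branch leading to X and PD falls to 6.0, which is the kind of disproportionate loss the abstract refers to.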

  14. Building Phylogenetic Trees from DNA Sequence Data: Investigating Polar Bear and Giant Panda Ancestry.

    ERIC Educational Resources Information Center

    Maier, Caroline Alexandra

    2001-01-01

    Presents an activity in which students seek answers to questions about evolutionary relationships by using genetic databases and bioinformatics software. Students build genetic distance matrices and phylogenetic trees based on molecular sequence data using web-based resources. Provides a flowchart of steps involved in accessing, retrieving, and…
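
    A genetic distance matrix of the kind mentioned above can be sketched in a few lines. The record does not say which distance measure the activity uses, so this example assumes the simplest one, the p-distance (the proportion of aligned sites that differ, skipping gapped positions); the sequences are invented placeholders, not bear or panda data.

```python
def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Positions where either sequence has a gap ('-') are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def distance_matrix(sequences):
    """Return {(name_i, name_j): p-distance} for every ordered pair of names."""
    names = sorted(sequences)
    return {
        (i, j): p_distance(sequences[i], sequences[j])
        for i in names
        for j in names
    }


# Hypothetical aligned toy sequences (placeholders only).
toy_alignment = {
    "taxon_A": "ACGTACGT",
    "taxon_B": "ACGTACGA",
    "taxon_C": "ACGAACTA",
}
toy_matrix = distance_matrix(toy_alignment)
```

    In this toy matrix taxon_A and taxon_B are the closest pair, so a distance-based tree-building method would be expected to join them first.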

  15. Building a Phylogenetic Tree of the Human and Ape Superfamily Using DNA-DNA Hybridization Data

    ERIC Educational Resources Information Center

    Maier, Caroline Alexandra

    2004-01-01

    The study describes the process of DNA-DNA hybridization and the history of its use by Sibley and Ahlquist in simple, straightforward, and interesting language that students easily understand, so that they can create their own phylogenetic tree of the hominoid superfamily. They calibrate the DNA clock and use it to estimate the divergence dates of the various…

  16. Evola: Ortholog database of all human genes in H-InvDB with manual curation of phylogenetic trees.

    PubMed

    Matsuya, Akihiro; Sakate, Ryuichi; Kawahara, Yoshihiro; Koyanagi, Kanako O; Sato, Yoshiharu; Fujii, Yasuyuki; Yamasaki, Chisato; Habara, Takuya; Nakaoka, Hajime; Todokoro, Fusano; Yamaguchi, Kaori; Endo, Toshinori; Oota, Satoshi; Makalowski, Wojciech; Ikeo, Kazuho; Suzuki, Yoshiyuki; Hanada, Kousuke; Hashimoto, Katsuyuki; Hirai, Momoki; Iwama, Hisakazu; Saitou, Naruya; Hiraki, Aiko T; Jin, Lihua; Kaneko, Yayoi; Kanno, Masako; Murakami, Katsuhiko; Noda, Akiko Ogura; Saichi, Naomi; Sanbonmatsu, Ryoko; Suzuki, Mami; Takeda, Jun-ichi; Tanaka, Masayuki; Gojobori, Takashi; Imanishi, Tadashi; Itoh, Takeshi

    2008-01-01

    Orthologs are genes in different species that evolved from a common ancestral gene by speciation. Currently, with the rapid growth of transcriptome data of various species, more reliable orthology information is a prerequisite for further studies. However, detection of orthologs could be erroneous if pairwise distance-based methods, such as reciprocal BLAST searches, are utilized. Thus, as a sub-database of H-InvDB, an integrated database of annotated human genes (http://h-invitational.jp/), we constructed a fully curated database of evolutionary features of human genes, called 'Evola'. In the process of ortholog detection, computational analysis based on conserved genome synteny and transcript sequence similarity was followed by manual curation by researchers examining phylogenetic trees. In total, 18 968 human genes have orthologs among 11 vertebrates (chimpanzee, mouse, cow, chicken, zebrafish, etc.), either computationally detected or manually curated. Evola provides amino acid sequence alignments and phylogenetic trees of orthologs and homologs. In 'd(N)/d(S) view', natural selection on genes can be analyzed between human and other species. In 'Locus maps', all transcript variants and their exon/intron structures can be compared among orthologous gene loci. We expect Evola to serve as a comprehensive and reliable database to be utilized in comparative analyses for obtaining new knowledge about human genes. Evola is available at http://www.h-invitational.jp/evola/. PMID:17982176

  17. Species tree estimation for a deep phylogenetic divergence in the New World monkeys (Primates: Platyrrhini).

    PubMed

    Perez, S Ivan; Klaczko, Julia; dos Reis, Sérgio F

    2012-11-01

    The estimation of a robust phylogeny is a necessary first step in understanding the biological diversification of the platyrrhines. Although the most recent phylogenies are generally robust, they differ from one another in the relationship between Aotus and other genera as well as in the relationship between Pitheciidae and other families. Here, for the first time, we used coding and non-coding sequences to infer the species tree and embedded gene trees of the platyrrhine genera using the Bayesian Markov chain Monte Carlo method for the multispecies coalescent (*BEAST), and compared the results with those of a Bayesian concatenated phylogenetic analysis. Our species tree, based on all available sequences, shows a closer phylogenetic relationship between Atelidae and Cebidae and a closer relationship between Aotus and the Cebidae clade. The posterior probabilities are lower for these conflicting tree nodes compared to those in the concatenated analysis; this finding could be explained by some gene trees showing no concordant topologies between Aotus and the other genera. Moreover, the topology of our species tree also differs from the findings of previous molecular and morphological studies regarding the position of Aotus. The existence of discrepancies between morphological data, gene trees and the species tree is widely reported and can be related to processes such as incomplete lineage sorting or selection. Although these processes are common in species trees with low divergence, they can also occur in species trees with deep and rapid divergence. The sources of the inconsistency of morphological and molecular traits with the species tree could be a main focus of further research on platyrrhines. PMID:22841656

  18. Do Triplets Have Enough Information to Construct the Multi-Labeled Phylogenetic Tree?

    PubMed Central

    Hassanzadeh, Reza; Eslahchi, Changiz; Sung, Wing-Kin

    2014-01-01

    The evolutionary history of certain species such as polyploids is modeled by a generalization of phylogenetic trees called multi-labeled phylogenetic trees, or MUL trees for short. One problem that relates to inferring a MUL tree is how to construct the smallest possible MUL tree that is consistent with a given set of rooted triplets, or the SMRT problem for short. This problem is NP-hard. There is one algorithm for the SMRT problem which is exact and runs in time, where is the number of taxa. In this paper, we show that the SMRT does not seem to be an appropriate solution from the biological point of view. Indeed, we present a heuristic algorithm named MTRT for this problem and execute it on some real and simulated datasets. The results of MTRT show that triplets alone cannot provide enough information to infer the true MUL tree. So, it is inappropriate to infer a MUL tree using triplet information alone and considering the minimum number of duplications. Finally, we introduce some new problems which are more suitable from the biological point of view. PMID:25080217
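
    What it means for a tree to be consistent with a set of rooted triplets can be sketched as follows. This is a simplified illustration for ordinary single-labeled trees written as nested tuples, not the MTRT heuristic or the exact SMRT algorithm; a MUL tree, in which a label may occur on several leaves, would need a more careful definition. The function names are hypothetical.

```python
from itertools import combinations


def leaves(tree):
    """Set of leaf labels of a tree given as nested tuples of strings."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def rooted_triplets(tree):
    """All triplets ab|c displayed by the tree, as (frozenset({a, b}), c).

    ab|c means a and b share a more recent common ancestor with each other
    than either shares with c.
    """
    if isinstance(tree, str):
        return set()
    result = set()
    child_leaves = [leaves(child) for child in tree]
    for i, child in enumerate(tree):
        result |= rooted_triplets(child)
        inside = child_leaves[i]
        outside = frozenset().union(
            *(child_leaves[j] for j in range(len(tree)) if j != i)
        )
        # Any two leaves inside one child are closer to each other
        # than to any leaf outside that child.
        for a, b in combinations(sorted(inside), 2):
            for c in outside:
                result.add((frozenset((a, b)), c))
    return result


def is_consistent(tree, triplets):
    """True if every given triplet is displayed by the tree."""
    return set(triplets) <= rooted_triplets(tree)
```

    For the tree `((("a", "b"), "c"), "d")` the triplet ab|c is displayed, whereas ac|b is not, so a triplet set containing ac|b would be inconsistent with that tree.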

  19. Simultaneous alignment and folding of 28S rRNA sequences uncovers phylogenetic signal in structure variation.

    PubMed

    Letsch, Harald O; Greve, Carola; Kück, Patrick; Fleck, Günther; Stocsits, Roman R; Misof, Bernhard

    2009-12-01

    Secondary structure models of mitochondrial and nuclear (r)RNA sequences are frequently applied to aid the alignment of these molecules in phylogenetic analyses. Additionally, it is often speculated that structure variation of (r)RNA sequences might profitably be used as phylogenetic markers. The benefit of these approaches depends on the reliability of structure models. We used a recently developed approach to show that reliable inference of large (r)RNA secondary structures as a prerequisite of simultaneous sequence and structure alignment is feasible. The approach iteratively establishes local structure constraints of each sequence and infers fully folded individual structures by constrained MFE optimization. A comparison of structure edit distances of individual constraints and fully folded structures showed pronounced phylogenetic signal in fully folded structures. As model sequences we characterized secondary structures of 28S rRNA sequences of selected insects and examined their phylogenetic signal according to established phylogenetic hypotheses. PMID:19654047

  20. Molecular Phylogenetics and Systematics of the Bivalve Family Ostreidae Based on rRNA Sequence-Structure Models and Multilocus Species Tree

    PubMed Central

    Salvi, Daniele; Macali, Armando; Mariottini, Paolo

    2014-01-01

    The bivalve family Ostreidae has a worldwide distribution and includes species of high economic importance. Phylogenetics and systematics of oysters based on morphology have proved difficult because of their high phenotypic plasticity. In this study we explored the phylogenetic information of the DNA sequence and secondary structure of the nuclear, fast-evolving, ITS2 rRNA and the mitochondrial 16S rRNA genes from the Ostreidae and implemented a multi-locus framework based on four loci for oyster phylogenetics and systematics. Sequence-structure rRNA models aided sequence alignment and improved accuracy and nodal support of phylogenetic trees. In agreement with previous molecular studies, our phylogenetic results indicate that none of the currently recognized subfamilies, Crassostreinae, Ostreinae, and Lophinae, is monophyletic. Single gene trees based on maximum likelihood (ML) and Bayesian (BA) methods and on sequence-structure ML were congruent with multilocus trees based on concatenated (ML and BA) and coalescent-based (BA) approaches and consistently supported three main clades: (i) Crassostrea, (ii) Saccostrea, and (iii) an Ostreinae-Lophinae lineage. Therefore, the subfamily Crassostreinae (including Crassostrea), Saccostreinae subfam. nov. (including Saccostrea and tentatively Striostrea) and Ostreinae (including Ostreinae and Lophinae taxa) are recognized. Based on phylogenetic and biogeographical evidence the Asian species of Crassostrea from the Pacific Ocean are assigned to Magallana gen. nov., whereas an integrative taxonomic revision is required for the genera Ostrea and Dendostrea. This study pointed out the suitability of the ITS2 marker for DNA barcoding of oysters and the relevance of using sequence-structure rRNA models and features of the ITS2 folding in molecular phylogenetics and taxonomy. The multilocus approach allowed inferring a robust phylogeny of Ostreidae providing a broad molecular perspective on their systematics. PMID:25250663

  1. An algorithm for constructing parsimonious hybridization networks with multiple phylogenetic trees.

    PubMed

    Wu, Yufeng

    2013-10-01

    A phylogenetic network is a model for reticulate evolution. A hybridization network is one type of phylogenetic network for a set of discordant gene trees and "displays" each gene tree. A central computational problem on hybridization networks is: given a set of gene trees, reconstruct the minimum (i.e., most parsimonious) hybridization network that displays each given gene tree. This problem is known to be NP-hard, and existing approaches for this problem are either heuristics or make simplifying assumptions (e.g., work with only two input trees or assume some topological properties). In this article, we develop an exact algorithm (called PIRNC) for inferring the minimum hybridization networks from multiple gene trees. The PIRNC algorithm does not rely on structural assumptions (e.g., the so-called galled networks). To the best of our knowledge, PIRNC is the first exact algorithm implemented for this formulation. When the number of reticulation events is relatively small (say, four or fewer), PIRNC runs reasonably efficiently even for moderately large datasets. For building more complex networks, we also develop a heuristic version of PIRNC called PIRNCH. Simulation shows that PIRNCH usually produces networks with fewer reticulation events than those by an existing method. PIRNC and PIRNCH have been implemented as part of the software package called PIRN and are available online. PMID:24093230

  2. Effect of co-evolving amino acid residues on topology of phylogenetic trees.

    PubMed

    Sherbakov, D Yu; Triboy, T I

    2007-12-01

    The presence in proteins of amino acid residues that change in concert during evolution is associated with keeping the protein spatial structure and functions constant. As is the case with morphological features, correlated substitutions may become the cause of homoplasies--the independent evolution of identical non-homologous adaptations. Our data obtained on model phylogenetic trees and corresponding sets of sequences have shown that the presence of correlated substitutions distorts the results of phylogenetic reconstructions. A method for accounting for co-evolving amino acid residues in phylogenetic analysis is proposed. According to this method, only a single site from the group of correlated amino acid positions should remain, whereas other positions should not be used in further phylogenetic analysis. Simulations have shown that replacing, on average, 8% of variable positions in a pair of model sequences by coordinately evolving amino acid residues is able to change the tree topology. The removal of such amino acid residues from sequences before phylogenetic analysis restores the correct topology. PMID:18205620

  3. AST: An Automated Sequence-Sampling Method for Improving the Taxonomic Diversity of Gene Phylogenetic Trees

    PubMed Central

    Zhou, Chan; Mao, Fenglou; Yin, Yanbin; Huang, Jinling; Gogarten, Johann Peter; Xu, Ying

    2014-01-01

    A challenge in phylogenetic inference of gene trees is how to properly sample a large pool of homologous sequences to derive a good representative subset of sequences. Such a need arises in various applications, e.g. when (1) accuracy-oriented phylogenetic reconstruction methods may not be able to deal with a large pool of sequences due to their high demand in computing resources; (2) applications analyzing a collection of gene trees may prefer to use trees with fewer operational taxonomic units (OTUs), for instance for the detection of horizontal gene transfer events by identifying phylogenetic conflicts; and (3) the pool of available sequences is biased towards extensively studied species. In the past, the creation of subsamples often relied on manual selection. Here we present an Automated sequence-Sampling method for improving the Taxonomic diversity of gene phylogenetic trees, AST, to obtain representative sequences that maximize the taxonomic diversity of the sampled sequences. To demonstrate the effectiveness of AST, we have applied it to four problems, namely, inference of the evolutionary histories of the small ribosomal subunit protein S5 of E. coli, 16S ribosomal RNAs and glycosyl-transferase gene family 8, and a study of ancient horizontal gene transfers from bacteria to plants. Our results show that the resolution of our computational results is almost as good as that of manual inference by domain experts, hence making the tool generally useful to phylogenetic studies by non-phylogeny specialists. The program is available at http://csbl.bmb.uga.edu/~zhouchan/AST.php. PMID:24892935

  4. Local-scale Partitioning of Functional and Phylogenetic Beta Diversity in a Tropical Tree Assemblage

    PubMed Central

    Yang, Jie; Swenson, Nathan G.; Zhang, Guocheng; Ci, Xiuqin; Cao, Min; Sha, Liqing; Li, Jie; Ferry Slik, J. W.; Lin, Luxiang

    2015-01-01

    The relative degree to which stochastic and deterministic processes underpin community assembly is a central problem in ecology. Quantifying local-scale phylogenetic and functional beta diversity may shed new light on this problem. We used species distribution, soil, trait and phylogenetic data to quantify whether environmental distance, geographic distance or their combination are the strongest predictors of phylogenetic and functional beta diversity on local scales in a 20-ha tropical seasonal rainforest dynamics plot in southwest China. The patterns of phylogenetic and functional beta diversity were generally consistent. The phylogenetic and functional dissimilarity between subplots (10 × 10 m, 20 × 20 m, 50 × 50 m and 100 × 100 m) was often higher than that expected by chance. The turnover of lineages and species function within habitats was generally slower than that across habitats. Partitioning the variation in phylogenetic and functional beta diversity showed that environmental distance was generally a better predictor of beta diversity than geographic distance thereby lending relatively more support for deterministic environmental filtering over stochastic processes. Overall, our results highlight that deterministic processes play a stronger role than stochastic processes in structuring community composition in this diverse assemblage of tropical trees. PMID:26235237

  5. Local-scale Partitioning of Functional and Phylogenetic Beta Diversity in a Tropical Tree Assemblage.

    PubMed

    Yang, Jie; Swenson, Nathan G; Zhang, Guocheng; Ci, Xiuqin; Cao, Min; Sha, Liqing; Li, Jie; Ferry Slik, J W; Lin, Luxiang

    2015-01-01

    The relative degree to which stochastic and deterministic processes underpin community assembly is a central problem in ecology. Quantifying local-scale phylogenetic and functional beta diversity may shed new light on this problem. We used species distribution, soil, trait and phylogenetic data to quantify whether environmental distance, geographic distance or their combination are the strongest predictors of phylogenetic and functional beta diversity on local scales in a 20-ha tropical seasonal rainforest dynamics plot in southwest China. The patterns of phylogenetic and functional beta diversity were generally consistent. The phylogenetic and functional dissimilarity between subplots (10 × 10 m, 20 × 20 m, 50 × 50 m and 100 × 100 m) was often higher than that expected by chance. The turnover of lineages and species function within habitats was generally slower than that across habitats. Partitioning the variation in phylogenetic and functional beta diversity showed that environmental distance was generally a better predictor of beta diversity than geographic distance thereby lending relatively more support for deterministic environmental filtering over stochastic processes. Overall, our results highlight that deterministic processes play a stronger role than stochastic processes in structuring community composition in this diverse assemblage of tropical trees. PMID:26235237

  6. Automatic detection of key innovations, rate shifts, and diversity-dependence on phylogenetic trees.

    PubMed

    Rabosky, Daniel L

    2014-01-01

    A number of methods have been developed to infer differential rates of species diversification through time and among clades using time-calibrated phylogenetic trees. However, we lack a general framework that can delineate and quantify heterogeneous mixtures of dynamic processes within single phylogenies. I developed a method that can identify arbitrary numbers of time-varying diversification processes on phylogenies without specifying their locations in advance. The method uses reversible-jump Markov Chain Monte Carlo to move between model subspaces that vary in the number of distinct diversification regimes. The model assumes that changes in evolutionary regimes occur across the branches of phylogenetic trees under a compound Poisson process and explicitly accounts for rate variation through time and among lineages. Using simulated datasets, I demonstrate that the method can be used to quantify complex mixtures of time-dependent, diversity-dependent, and constant-rate diversification processes. I compared the performance of the method to the MEDUSA model of rate variation among lineages. As an empirical example, I analyzed the history of speciation and extinction during the radiation of modern whales. The method described here will greatly facilitate the exploration of macroevolutionary dynamics across large phylogenetic trees, which may have been shaped by heterogeneous mixtures of distinct evolutionary processes. PMID:24586858

  7. Automatic Detection of Key Innovations, Rate Shifts, and Diversity-Dependence on Phylogenetic Trees

    PubMed Central

    Rabosky, Daniel L.

    2014-01-01

    A number of methods have been developed to infer differential rates of species diversification through time and among clades using time-calibrated phylogenetic trees. However, we lack a general framework that can delineate and quantify heterogeneous mixtures of dynamic processes within single phylogenies. I developed a method that can identify arbitrary numbers of time-varying diversification processes on phylogenies without specifying their locations in advance. The method uses reversible-jump Markov Chain Monte Carlo to move between model subspaces that vary in the number of distinct diversification regimes. The model assumes that changes in evolutionary regimes occur across the branches of phylogenetic trees under a compound Poisson process and explicitly accounts for rate variation through time and among lineages. Using simulated datasets, I demonstrate that the method can be used to quantify complex mixtures of time-dependent, diversity-dependent, and constant-rate diversification processes. I compared the performance of the method to the MEDUSA model of rate variation among lineages. As an empirical example, I analyzed the history of speciation and extinction during the radiation of modern whales. The method described here will greatly facilitate the exploration of macroevolutionary dynamics across large phylogenetic trees, which may have been shaped by heterogeneous mixtures of distinct evolutionary processes. PMID:24586858

  8. Phylogenetic Stability, Tree Shape, and Character Compatibility: A Case Study Using Early Tetrapods.

    PubMed

    Bernardi, Massimo; Angielczyk, Kenneth D; Mitchell, Jonathan S; Ruta, Marcello

    2016-09-01

    Phylogenetic tree shape varies as the evolutionary processes affecting a clade change over time. In this study, we examined an empirical phylogeny of fossil tetrapods during several time intervals, and studied how temporal constraints manifested in patterns of tree imbalance and character change. The results indicate that the impact of temporal constraints on tree shape is minimal and highlight the stability through time of the reference tetrapod phylogeny. Unexpected values of imbalance for Mississippian and Pennsylvanian time slices strongly support the hypothesis that the Carboniferous was a period of explosive tetrapod radiation. Several significant diversification shifts take place in the Mississippian and underpin increased terrestrialization among the earliest limbed vertebrates. Character incompatibility is relatively high at the beginning of tetrapod history, but quickly decreases to a relatively stable lower level, relative to a null distribution based on constant rates of character change. This implies that basal tetrapods had high, but declining, rates of homoplasy early in their evolutionary history, although the origin of Lissamphibia is an exception to this trend. The time slice approach is a powerful method of phylogenetic analysis and a useful tool for assessing the impact of combining extinct and extant taxa in phylogenetic analyses of large and speciose clades. PMID:27288479
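
    Tree imbalance, as discussed above, can be quantified in several ways. The abstract does not name the statistic used, so the sketch below assumes one common choice, the Colless index, which sums over every internal node of a binary tree the absolute difference between the numbers of tips on its two sides. Trees are written as nested 2-tuples of tip names; the example trees are arbitrary.

```python
def colless_index(tree):
    """Colless imbalance of a strictly binary tree given as nested 2-tuples.

    Returns 0 for a perfectly balanced tree; larger values mean more imbalance.
    """
    def walk(node):
        # Returns (number of tips below node, imbalance accumulated below node).
        if isinstance(node, str):
            return 1, 0
        left, right = node
        n_left, imb_left = walk(left)
        n_right, imb_right = walk(right)
        return n_left + n_right, imb_left + imb_right + abs(n_left - n_right)

    return walk(tree)[1]


# Two arbitrary four-tip shapes for comparison.
balanced = (("a", "b"), ("c", "d"))
caterpillar = ((("a", "b"), "c"), "d")
```

    The balanced shape scores 0 and the fully pectinate ("caterpillar") shape scores 3, the maximum for four tips. Comparing such values against a null model is one way to flag unexpectedly unbalanced time slices.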

  9. Phylogenetic isolation of host trees affects assembly of local Heteroptera communities

    PubMed Central

    Vialatte, A.; Bailey, R. I.; Vasseur, C.; Matocq, A.; Gossner, M. M.; Everhart, D.; Vitrac, X.; Belhadj, A.; Ernoult, A.; Prinzing, A.

    2010-01-01

    A host may be physically isolated in space and then may correspond to a geographical island, but it may also be separated from its local neighbours by hundreds of millions of years of evolutionary history, and may form in this case an evolutionarily distinct island. We test how this affects the assembly processes of the host's colonizers, a question that has until now been raised only at the scale of physically distinct islands or patches. We studied the assembly of true bugs in crowns of oaks surrounded by phylogenetically more or less closely related trees. Despite the short distances (less than 150 m) between phylogenetically isolated and non-isolated trees, we found major differences between their Heteroptera faunas. We show that phylogenetically isolated trees support smaller numbers and fewer species of Heteroptera, an increasing proportion of phytophages and a decreasing proportion of omnivores, and proportionally more non-host-specialists. These differences were not due to changes in the nutritional quality of the trees, i.e. species sorting, which we accounted for. Comparison with predictions from meta-community theories suggests that the assembly of local Heteroptera communities may be strongly driven by independent metapopulation processes at the level of the individual species. We conclude that the assembly of communities on hosts separated from their neighbours by long periods of evolutionary history is qualitatively and quantitatively different from that on hosts surrounded by closely related trees. Potentially, the biotic selection pressure on a host might thus change with the evolutionary proximity of the surrounding hosts. PMID:20335208

  10. Chemical classification of cattle. 2. Phylogenetic tree and specific status of the Zebu.

    PubMed

    Manwell, C; Baker, C M

    1980-01-01

    Phylogenetic trees for the ten major breed groups of cattle were constructed by Farris's (1972) maximum parsimony method, or Fitch & Margoliash's (1967) method, which averages out the deviation over the entire assemblage. Both techniques yield essentially identical trees. The phylogenetic tree for the ten major cattle breed groups can be superimposed on a map of Europe and western Asia, the root of the tree being close to the 'fertile crescent' in Asia Minor, believed to be a primary centre of bovine domestication. For some but not all protein variants there is a cline of gene frequencies as one proceeds from the British Isles and northwest Europe towards southeast Europe and Asia Minor, with the most extreme gene frequencies in the Zebu breeds of India. It is not clear to what extent the observed clines are primary or secondary, i.e., consequent to the initial migrations of cattle towards the end of the Pleistocene or consequent to the many migrations of man with his domesticated cattle. Such clines as exist are not in themselves sufficient to prove either selection versus genetic drift or to establish taxonomic ranking. Contrary to some suggestions in the literature, the biochemical evidence supports Linnaeus's original conclusions: Bos taurus and Bos indicus are distinct species. PMID:7458002

  11. Phylogenetic tree derived from bacterial, cytosol and organelle 5S rRNA sequences.

    PubMed Central

    Küntzel, H; Heidrich, M; Piechulla, B

    1981-01-01

    A phylogenetic tree was constructed by computer analysis of 47 completely determined 5S rRNA sequences. The wheat mitochondrial sequence is significantly more related to prokaryotic than to eukaryotic sequences, and its affinity to that of the thermophilic Gram-negative bacterium Thermus aquaticus is comparable to the affinity between Anacystis nidulans and chloroplastic sequences. This strongly supports the idea of an endosymbiotic origin of plant mitochondria. A comparison of the plant cytosol and chloroplast sub-trees suggests a similar rate of nucleotide substitution in nuclear genes and chloroplastic genes. Other features of the tree are a common precursor of protozoa and metazoa, which appears to be more related to the fungal than to the plant protosequence, and an early divergence of the archebacterial sequence (Halobacterium cutirubrum) from the prokaryotic branch. PMID:6785727

  12. Phylogenetically diverse AM fungi from Ecuador strongly improve seedling growth of native potential crop trees.

    PubMed

    Schüßler, Arthur; Krüger, Claudia; Urgiles, Narcisa

    2016-04-01

    In many deforested regions of the tropics, afforestation with native tree species could valorize a growing reservoir of degraded, previously overused and abandoned land. The inoculation of tropical tree seedlings with arbuscular mycorrhizal fungi (AM fungi) can improve tree growth and viability, but efficiency may depend on plant and AM fungal genotype. To study such effects, seven phylogenetically diverse AM fungi, native to Ecuador, from seven genera and a non-native AM fungus (Rhizophagus irregularis DAOM197198) were used to inoculate the tropical potential crop tree (PCT) species Handroanthus chrysanthus (synonym Tabebuia chrysantha), Cedrela montana, and Heliocarpus americanus. Twenty-four plant-fungus combinations were studied in five different fertilization and AMF inoculation treatments. Numerous plant growth parameters and mycorrhizal root colonization were assessed. The inoculation with any of the tested AM fungi improved seedling growth significantly and in most cases reduced plant mortality. Plants produced up to threefold higher biomass, when compared to the standard nursery practice. AM fungal inoculation alone or in combination with low fertilization both outperformed full fertilization in terms of plant growth promotion. Interestingly, root colonization levels for individual fungi strongly depended on the host tree species, but surprisingly the colonization strength did not correlate with plant growth promotion. The combination of AM fungal inoculation with a low dosage of slow release fertilizer improved PCT seedling performance strongest, but also AM fungal treatments without any fertilization were highly efficient. The AM fungi tested are promising candidates to improve management practices in tropical tree seedling production. PMID:26260945

  13. Phylogeny and evolutionary histories of Pyrus L. revealed by phylogenetic trees and networks based on data from multiple DNA sequences

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Reconstructing the phylogeny of Pyrus has been difficult due to the wide distribution of the genus and lack of informative data. In this study, we collected 110 accessions representing 25 Pyrus species and constructed both phylogenetic trees and phylogenetic networks based on multiple DNA sequence d...

  14. Phylogeny and evolutionary histories of Pyrus L. revealed by phylogenetic trees and networks based on data from multiple DNA sequences

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Reconstructing the phylogeny of Pyrus has been difficult due to the wide distribution of the genus and lack of informative data. In this study, we collected 110 accessions representing 25 Pyrus species and constructed both phylogenetic trees and phylogenetic networks based on multiple DNA sequence d...

  15. aLeaves facilitates on-demand exploration of metazoan gene family trees on MAFFT sequence alignment server with enhanced interactivity.

    PubMed

    Kuraku, Shigehiro; Zmasek, Christian M; Nishimura, Osamu; Katoh, Kazutaka

    2013-07-01

    We report a new web server, aLeaves (http://aleaves.cdb.riken.jp/), for homologue collection from diverse animal genomes. In molecular comparative studies involving multiple species, orthology identification is the basis on which most subsequent biological analyses rely. It can be achieved most accurately by explicit phylogenetic inference. More and more species are subjected to large-scale sequencing, but the resultant resources are scattered in independent project-based, and multi-species, but separate, web sites. This complicates data access and is becoming a serious barrier to the comprehensiveness of molecular phylogenetic analysis. aLeaves, launched to overcome this difficulty, collects sequences similar to an input query sequence from various data sources. The collected sequences can be passed on to the MAFFT sequence alignment server (http://mafft.cbrc.jp/alignment/server/), which has been significantly improved in interactivity. This update enables switching between (i) sequence selection using the Archaeopteryx tree viewer, (ii) multiple sequence alignment and (iii) tree inference. This can be performed as a loop until one reaches a sensible data set, which minimizes redundancy for better visibility and handling in phylogenetic inference while covering relevant taxa. The work flow achieved by the seamless link between aLeaves and MAFFT provides a convenient online platform to address various questions in zoology and evolutionary biology. PMID:23677614

  16. Epidemic Reconstruction in a Phylogenetics Framework: Transmission Trees as Partitions of the Node Set

    PubMed Central

    Hall, Matthew; Woolhouse, Mark; Rambaut, Andrew

    2015-01-01

    The use of genetic data to reconstruct the transmission tree of infectious disease epidemics and outbreaks has been the subject of an increasing number of studies, but previous approaches have usually either made assumptions that are not fully compatible with phylogenetic inference, or, where they have based inference on a phylogeny, have employed a procedure that requires this tree to be fixed. At the same time, the coalescent-based models of the pathogen population that are employed in the methods usually used for time-resolved phylogeny reconstruction are a considerable simplification of epidemic process, as they assume that pathogen lineages mix freely. Here, we contribute a new method that is simultaneously a phylogeny reconstruction method for isolates taken from an epidemic, and a procedure for transmission tree reconstruction. We observe that, if one or more samples is taken from each host in an epidemic or outbreak and these are used to build a phylogeny, a transmission tree is equivalent to a partition of the set of nodes of this phylogeny, such that each partition element is a set of nodes that is connected in the full tree and contains all the tips corresponding to samples taken from one and only one host. We then implement a Monte Carlo Markov Chain (MCMC) procedure for simultaneous sampling from the spaces of both trees, utilising a newly-designed set of phylogenetic tree proposals that also respect node partitions. We calculate the posterior probability of these partitioned trees based on a model that acknowledges the population structure of an epidemic by employing an individual-based disease transmission model and a coalescent process taking place within each host. We demonstrate our method, first using simulated data, and then with sequences taken from the H7N7 avian influenza outbreak that occurred in the Netherlands in 2003. We show that it is superior to established coalescent methods for reconstructing the topology and node heights of the

  17. Use of Alignment-Free Phylogenetics for Rapid Genome Sequence-Based Typing of Helicobacter pylori Virulence Markers and Antibiotic Susceptibility

    PubMed Central

    Kusters, Johannes G.

    2015-01-01

    Whole-genome sequencing is becoming a leading technology in the typing and epidemiology of microbial pathogens, but the increase in genomic information necessitates significant investment in bioinformatic resources and expertise, and currently used methodologies struggle with genetically heterogeneous bacteria such as the human gastric pathogen Helicobacter pylori. Here we demonstrate that the alignment-free analysis method feature frequency profiling (FFP) can be used to rapidly construct phylogenetic trees of draft bacterial genome sequences on a standard desktop computer and that coupling with in silico genotyping methods gives useful information for comparative and clinical genomic and molecular epidemiology applications. FFP-based phylogenetic trees of seven gastric Helicobacter species matched those obtained by analysis of 16S rRNA genes and ribosomal proteins, and FFP- and core genome single nucleotide polymorphism-based analysis of 63 H. pylori genomes again showed comparable phylogenetic clustering, consistent with genomotypes assigned by using multilocus sequence typing (MLST). Analysis of 377 H. pylori genomes highlighted the conservation of genomotypes and linkage with phylogeographic characteristics and predicted the presence of an incomplete or nonfunctional cag pathogenicity island in 18/276 genomes. In silico analysis of antibiotic susceptibility markers suggests that most H. pylori hspAmerind and hspEAsia isolates are predicted to carry the T2812C mutation potentially conferring low-level clarithromycin resistance, while levels of metronidazole resistance were similar in all multilocus sequence types. In conclusion, the use of FFP phylogenetic clustering and in silico genotyping allows determination of genome evolution and phylogeographic clustering and can contribute to clinical microbiology by genomotyping for outbreak management and the prediction of pathogenic potential and antibiotic susceptibility. PMID:26135867
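
    The alignment-free idea behind feature frequency profiling can be illustrated with a minimal sketch: each sequence is reduced to a normalized k-mer frequency profile, and two profiles are compared with a divergence measure. The sketch below assumes the Jensen-Shannon divergence as the distance and a fixed feature length k; the function names and the toy sequences are illustrative only and are not taken from the FFP software described in the record.

```python
import math
from collections import Counter


def feature_frequency_profile(seq, k):
    """Return the relative frequency of every k-mer (feature) in seq."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


def jensen_shannon_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two frequency profiles."""
    jsd = 0.0
    for kmer in set(p) | set(q):
        pi = p.get(kmer, 0.0)
        qi = q.get(kmer, 0.0)
        mi = (pi + qi) / 2.0  # mixture frequency, positive for every listed k-mer
        if pi > 0:
            jsd += 0.5 * pi * math.log2(pi / mi)
        if qi > 0:
            jsd += 0.5 * qi * math.log2(qi / mi)
    return jsd


# Toy sequences (hypothetical, not real genome data).
profile_a = feature_frequency_profile("ACGTACGTACGT", 3)
profile_b = feature_frequency_profile("ACGTTCGTACGA", 3)
distance = jensen_shannon_divergence(profile_a, profile_b)
```

    A matrix of such pairwise distances could then be passed to a distance-based tree builder such as neighbor joining; that step is omitted here.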

  18. Algorithms for efficient near-perfect phylogenetic tree reconstruction in theory and practice.

    PubMed

    Sridhar, Srinath; Dhamdhere, Kedar; Blelloch, Guy; Halperin, Eran; Ravi, R; Schwartz, Russell

    2007-01-01

    We consider the problem of reconstructing near-perfect phylogenetic trees using binary character states (referred to as BNPP). A perfect phylogeny assumes that every character mutates at most once in the evolutionary tree, yielding an algorithm for binary character states that is computationally efficient but not robust to imperfections in real data. A near-perfect phylogeny relaxes the perfect phylogeny assumption by allowing at most a constant number of additional mutations. We develop two algorithms for constructing optimal near-perfect phylogenies and provide empirical evidence of their performance. The first simple algorithm is fixed parameter tractable when the number of additional mutations and the number of characters that share four gametes with some other character are constants. The second, more involved algorithm for the problem is fixed parameter tractable when only the number of additional mutations is fixed. We have implemented both algorithms and shown them to be extremely efficient in practice on biologically significant data sets. This work proves the BNPP problem fixed parameter tractable and provides the first practical phylogenetic tree reconstruction algorithms that find guaranteed optimal solutions while being easily implemented and computationally feasible for data sets of biologically meaningful size and complexity. PMID:17975268
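
    The "four gametes" condition mentioned above can be sketched in a few lines. For binary characters, a pair of characters is incompatible with a perfect phylogeny when all four state combinations (00, 01, 10, 11) occur among the taxa. The sketch below only performs this pairwise test; it is not either of the two algorithms described in the record, and the function names and input layout (a list of taxa, each a list of 0/1 states) are assumptions made for illustration.

```python
from itertools import combinations


def has_four_gametes(col_a, col_b):
    """True if two binary characters jointly show 00, 01, 10 and 11."""
    return len(set(zip(col_a, col_b))) == 4


def conflicting_pairs(matrix):
    """Return index pairs of characters (columns) that show all four gametes.

    matrix is a list of taxa; each taxon is a sequence of 0/1 states.
    """
    columns = list(zip(*matrix))
    return [
        (i, j)
        for i, j in combinations(range(len(columns)), 2)
        if has_four_gametes(columns[i], columns[j])
    ]


def admits_perfect_phylogeny(matrix):
    """A binary matrix admits a perfect phylogeny iff no pair conflicts."""
    return not conflicting_pairs(matrix)


# Hypothetical toy data: characters 0 and 1 conflict in the second matrix.
compatible = [[0, 0], [0, 1], [1, 1]]
incompatible = [[0, 0], [0, 1], [1, 0], [1, 1]]
```

    The number of characters that appear in at least one conflicting pair corresponds to the second parameter named in the abstract for the first, simpler algorithm.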

  19. SICLE: a high-throughput tool for extracting evolutionary relationships from phylogenetic trees

    PubMed Central

    Wisecaver, Jennifer H.

    2016-01-01

    We present the phylogeny analysis software SICLE (Sister Clade Extractor), an easy-to-use, high-throughput tool to describe the nearest neighbors to a node of interest in a phylogenetic tree as well as the support value for the relationship. The application is a command line utility that can be embedded into a phylogenetic analysis pipeline or can be used as a subroutine within another C++ program. As a test case, we applied this new tool to the published phylome of Salinibacter ruber, a species of halophilic Bacteroidetes, identifying 13 unique sister relationships to S. ruber across the 4,589 gene phylogenies. S. ruber grouped with bacteria, most often other Bacteroidetes, in the majority of phylogenies, but 91 phylogenies showed a branch-supported sister association between S. ruber and Archaea, an evolutionarily intriguing relationship indicative of horizontal gene transfer. This test case demonstrates how SICLE makes it possible to summarize the phylogenetic information produced by automated phylogenetic pipelines to rapidly identify and quantify the possible evolutionary relationships that merit further investigation. SICLE is available for free for noncommercial use at http://eebweb.arizona.edu/sicle/.
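
    The notion of a sister clade can be illustrated with a short sketch. SICLE itself is a C++ command line tool; the Python code below is not its implementation and ignores support values. It assumes a rooted tree written as nested tuples with leaf names as strings, and all names in the example tree are hypothetical.

```python
def leaves(node):
    """Return all leaf names under node (a string leaf or a tuple of children)."""
    if isinstance(node, str):
        return [node]
    return [name for child in node for name in leaves(child)]


def sister_leaves(tree, target):
    """Return the leaf names in the sibling subtree(s) of leaf target.

    Returns None if target is not a leaf of tree.
    """
    if isinstance(tree, str):
        return None
    for index, child in enumerate(tree):
        if child == target:
            siblings = [c for i, c in enumerate(tree) if i != index]
            return [name for sib in siblings for name in leaves(sib)]
    for child in tree:
        found = sister_leaves(child, target)
        if found is not None:
            return found
    return None


# Hypothetical gene tree; the taxon labels are placeholders.
gene_tree = ((("S_ruber", "Bacteroidetes_A"), "Bacteroidetes_B"), "Archaea_X")
sister = sister_leaves(gene_tree, "S_ruber")
```

    Running such a lookup over every gene tree in a phylome and tallying the taxonomic label of each sister group gives the kind of summary the abstract describes.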

  20. Molecular phylogenetic trees - On the validity of the Goodman-Moore augmentation algorithm

    NASA Technical Reports Server (NTRS)

    Holmquist, R.

    1979-01-01

    A response is made to the reply of Nei and Tateno (1979) to the letter of Holmquist (1978) supporting the validity of the augmentation algorithm of Moore (1977) in reconstructions of nucleotide substitutions by means of the maximum parsimony principle. It is argued that the overestimation of the augmented numbers of nucleotide substitutions (augmented distances) found by Tateno and Nei (1978) is due to an unrepresentative data sample and that it is only necessary that evolution be stochastically uniform in different regions of the phylogenetic network for the augmentation method to be useful. The importance of the average value of the true distance over all links is explained, and the relative variances of the true and augmented distances are calculated to be almost identical. The effects of topological changes in the phylogenetic tree on the augmented distance and the question of the correctness of ancestral sequences inferred by the method of parsimony are also clarified.

  1. Mapping the shapes of phylogenetic trees from human and zoonotic RNA viruses.

    PubMed

    Poon, Art F Y; Walker, Lorne W; Murray, Heather; McCloskey, Rosemary M; Harrigan, P Richard; Liang, Richard H

    2013-01-01

    A phylogeny is a tree-based model of common ancestry that is an indispensable tool for studying biological variation. Phylogenies play a special role in the study of rapidly evolving populations such as viruses, where the proliferation of lineages is constantly being shaped by the mode of virus transmission, by adaptation to immune systems, and by patterns of human migration and contact. These processes may leave an imprint on the shapes of virus phylogenies that can be extracted for comparative study; however, tree shapes are intrinsically difficult to quantify. Here we present a comprehensive study of phylogenies reconstructed from 38 different RNA viruses from 12 taxonomic families that are associated with human pathologies. To accomplish this, we have developed a new procedure for studying phylogenetic tree shapes based on the 'kernel trick', a technique that maps complex objects into a statistically convenient space. We show that our kernel method outperforms nine different tree balance statistics at correctly classifying phylogenies that were simulated under different evolutionary scenarios. Using the kernel method, we observe patterns in the distribution of RNA virus phylogenies in this space that reflect modes of transmission and pathogenesis. For example, viruses that can establish persistent chronic infections (such as HIV and hepatitis C virus) form a distinct cluster. Although the visibly 'star-like' shape characteristic of trees from these viruses has been well-documented, we show that established methods for quantifying tree shape fail to distinguish these trees from those of other viruses. The kernel approach presented here potentially represents an important new tool for characterizing the evolution and epidemiology of RNA viruses. PMID:24223766
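
    For context on what a tree balance statistic measures, the sketch below computes the Colless index, a widely used imbalance measure: the sum, over all internal nodes of a binary tree, of the absolute difference in leaf counts between the two child subtrees. The abstract does not list which nine statistics were compared, so treating Colless as representative is an assumption; the kernel method itself is not shown. Trees are assumed to be strictly binary nested tuples.

```python
def colless_index(tree):
    """Colless imbalance of a strictly binary tree given as nested 2-tuples."""

    def walk(node):
        # Returns (number of leaves, accumulated imbalance) for the subtree.
        if not isinstance(node, tuple):
            return 1, 0
        left, right = node
        n_left, imb_left = walk(left)
        n_right, imb_right = walk(right)
        return n_left + n_right, imb_left + imb_right + abs(n_left - n_right)

    return walk(tree)[1]


balanced = (("a", "b"), ("c", "d"))
caterpillar = ((("a", "b"), "c"), "d")
```

    A fully balanced four-leaf tree scores 0, whereas the four-leaf caterpillar scores 3, the maximum for that size.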

  2. Trends over time in tree and seedling phylogenetic diversity indicate regional differences in forest biodiversity change.

    PubMed

    Potter, Kevin M; Woodall, Christopher W

    2012-03-01

    Changing climate conditions may impact the short-term ability of forest tree species to regenerate in many locations. In the longer term, tree species may be unable to persist in some locations while they become established in new places. Over both time frames, forest tree biodiversity may change in unexpected ways. Using repeated inventory measurements five years apart from more than 7000 forested plots in the eastern United States, we tested three hypotheses: phylogenetic diversity is substantially different from species richness as a measure of biodiversity; forest communities have undergone recent changes in phylogenetic diversity that differ by size class, region, and seed dispersal strategy; and these patterns are consistent with expected early effects of climate change. Specifically, the magnitude of diversity change across broad regions should be greater among seedlings than in trees, should be associated with latitude and elevation, and should be greater among species with high dispersal capacity. Our analyses demonstrated that phylogenetic diversity and species richness are decoupled at small and medium scales and are imperfectly associated at large scales. This suggests that it is appropriate to apply indicators of biodiversity change based on phylogenetic diversity, which account for evolutionary relationships among species and may better represent community functional diversity. Our results also detected broadscale patterns of forest biodiversity change that are consistent with expected early effects of climate change. First, the statistically significant increase over time in seedling diversity in the South suggests that conditions there have become more favorable for the reproduction and dispersal of a wider variety of species, whereas the significant decrease in northern seedling diversity indicates that northern conditions have become less favorable. Second, we found weak correlations between seedling diversity change and latitude in both zones
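
    The distinction between phylogenetic diversity and species richness can be made concrete with a small sketch. It assumes Faith's phylogenetic diversity (the total branch length connecting a set of taxa to the root) as the metric, which the abstract does not specify, and represents the tree as a mapping from each node to its parent and branch length. The tree and its branch lengths are invented for illustration.

```python
def faith_pd(parent, taxa):
    """Sum the lengths of all distinct branches on the paths from taxa to the root.

    parent maps each non-root node to a (parent_node, branch_length) pair.
    """
    seen = set()
    total = 0.0
    for taxon in taxa:
        node = taxon
        # Climb until the root, or until a branch already counted is reached.
        while node in parent and node not in seen:
            seen.add(node)
            up, length = parent[node]
            total += length
            node = up
    return total


# Hypothetical tree: oak and beech are close relatives, pine is distant.
toy_tree = {
    "oak": ("n1", 1.0),
    "beech": ("n1", 1.0),
    "n1": ("root", 2.0),
    "pine": ("root", 5.0),
}
plot_a = ["oak", "beech"]
plot_b = ["oak", "pine"]
```

    Both plots have a species richness of 2, yet plot_b spans twice the branch length of plot_a (8.0 against 4.0), which is the kind of decoupling the abstract reports.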

  3. Bayesian Inference of the Evolution of a Phenotype Distribution on a Phylogenetic Tree

    PubMed Central

    Ansari, M. Azim; Didelot, Xavier

    2016-01-01

    The distribution of a phenotype on a phylogenetic tree is often a quantity of interest. Many phenotypes have imperfect heritability, so that a measurement of the phenotype for an individual can be thought of as a single realization from the phenotype distribution of that individual. If all individuals in a phylogeny had the same phenotype distribution, measured phenotypes would be randomly distributed on the tree leaves. This is, however, often not the case, implying that the phenotype distribution evolves over time. Here we propose a new model based on this principle of evolving phenotype distribution on the branches of a phylogeny, which is different from ancestral state reconstruction where the phenotype itself is assumed to evolve. We develop an efficient Bayesian inference method to estimate the parameters of our model and to test the evidence for changes in the phenotype distribution. We use multiple simulated data sets to show that our algorithm has good sensitivity and specificity properties. Since our method identifies branches on the tree on which the phenotype distribution has changed, it is able to break down a tree into components for which this distribution is unique and constant. We present two applications of our method, one investigating the association between HIV genetic variation and human leukocyte antigen and the other studying host range distribution in a lineage of Salmonella enterica, and we discuss many other potential applications. PMID:27412711

  4. Patterns of thinking about phylogenetic trees: A study of student learning and the potential of tree thinking to improve comprehension of biological concepts

    NASA Astrophysics Data System (ADS)

    Naegle, Erin

    Evolution education is a critical yet challenging component of teaching and learning biology. There is frequently an emphasis on natural selection when teaching about evolution and conducting educational research. A full understanding of evolution, however, integrates evolutionary processes, such as natural selection, with the resulting evolutionary patterns, such as species divergence. Phylogenetic trees are models of evolutionary patterns. The perspective gained from understanding biology through phylogenetic analyses is referred to as tree thinking. Due to the increasing prevalence of tree thinking in biology, understanding how to read phylogenetic trees is an important skill for students to learn. Interpreting graphics is not an intuitive process, as graphical representations are semiotic objects. This is certainly true concerning phylogenetic tree interpretation. Previous research and anecdotal evidence report that students struggle to correctly interpret trees. The objective of this research was to describe and investigate the rationale underpinning the prior knowledge of introductory biology students' tree thinking. Understanding prior knowledge is valuable as prior knowledge influences future learning. In Chapter 1, qualitative methods such as semi-structured interviews were used to explore patterns of student rationale in regard to tree thinking. Seven common tree thinking misconceptions are described: (1) Equating the degree of trait similarity with the extent of relatedness, (2) Environmental change is a necessary prerequisite to evolution, (3) Essentialism of species, (4) Evolution is inherently progressive, (5) Evolution is a linear process, (6) Not all species are related, and (7) Trees portray evolution through the hybridization of species. These misconceptions are based in students' incomplete or incorrect understanding of evolution. These misconceptions are often reinforced by the misapplication of cultural conventions to make sense of trees

  5. Multispecies Coalescent Analysis of the Early Diversification of Neotropical Primates: Phylogenetic Inference under Strong Gene Trees/Species Tree Conflict

    PubMed Central

    Schrago, Carlos G.; Menezes, Albert N.; Furtado, Carolina; Bonvicino, Cibele R.; Seuanez, Hector N.

    2014-01-01

    Neotropical primates (NP) are presently distributed in the New World from Mexico to northern Argentina, comprising three large families, Cebidae, Atelidae, and Pitheciidae, consequently to their diversification following their separation from Old World anthropoids near the Eocene/Oligocene boundary, some 40 Ma. The evolution of NP has been intensively investigated in the last decade by studies focusing on their phylogeny and timescale. However, despite major efforts, the phylogenetic relationship between these three major clades and the age of their last common ancestor are still controversial because these inferences were based on limited numbers of loci and dating analyses that did not consider the evolutionary variation associated with the distribution of gene trees within the proposed phylogenies. We show, by multispecies coalescent analyses of selected genome segments, spanning along 92,496,904 bp that the early diversification of extant NP was marked by a 2-fold increase of their effective population size and that Atelids and Cebids are more closely related respective to Pitheciids. The molecular phylogeny of NP has been difficult to solve because of population-level phenomena at the early evolution of the lineage. The association of evolutionary variation with the distribution of gene trees within proposed phylogenies is crucial for distinguishing the mean genetic divergence between species (the mean coalescent time between loci) from speciation time. This approach, based on extensive genomic data provided by new generation DNA sequencing, provides more accurate reconstructions of phylogenies and timescales for all organisms. PMID:25377940

  6. Conserving the functional and phylogenetic trees of life of European tetrapods.

    PubMed

    Thuiller, Wilfried; Maiorano, Luigi; Mazel, Florent; Guilhaumon, François; Ficetola, Gentile Francesco; Lavergne, Sébastien; Renaud, Julien; Roquet, Cristina; Mouillot, David

    2015-02-19

    Protected areas (PAs) are pivotal tools for biodiversity conservation on the Earth. Europe has had an extensive protection system since Natura 2000 areas were created in parallel with traditional parks and reserves. However, the extent to which this system covers not only taxonomic diversity but also other biodiversity facets, such as evolutionary history and functional diversity, has never been evaluated. Using high-resolution distribution data of all European tetrapods together with dated molecular phylogenies and detailed trait information, we first tested whether the existing European protection system effectively covers all species and in particular, those with the highest evolutionary or functional distinctiveness. We then tested the ability of PAs to protect the entire tetrapod phylogenetic and functional trees of life by mapping species' target achievements along the internal branches of these two trees. We found that the current system is adequately representative in terms of the evolutionary history of amphibians while it fails for the rest. However, the most functionally distinct species were better represented than they would be under random conservation efforts. These results imply better protection of the tetrapod functional tree of life, which could help to ensure long-term functioning of the ecosystem, potentially at the expense of conserving evolutionary history. PMID:25561666

  7. Conserving the functional and phylogenetic trees of life of European tetrapods

    PubMed Central

    Thuiller, Wilfried; Maiorano, Luigi; Mazel, Florent; Guilhaumon, François; Ficetola, Gentile Francesco; Lavergne, Sébastien; Renaud, Julien; Roquet, Cristina; Mouillot, David

    2015-01-01

    Protected areas (PAs) are pivotal tools for biodiversity conservation on the Earth. Europe has had an extensive protection system since Natura 2000 areas were created in parallel with traditional parks and reserves. However, the extent to which this system covers not only taxonomic diversity but also other biodiversity facets, such as evolutionary history and functional diversity, has never been evaluated. Using high-resolution distribution data of all European tetrapods together with dated molecular phylogenies and detailed trait information, we first tested whether the existing European protection system effectively covers all species and in particular, those with the highest evolutionary or functional distinctiveness. We then tested the ability of PAs to protect the entire tetrapod phylogenetic and functional trees of life by mapping species' target achievements along the internal branches of these two trees. We found that the current system is adequately representative in terms of the evolutionary history of amphibians while it fails for the rest. However, the most functionally distinct species were better represented than they would be under random conservation efforts. These results imply better protection of the tetrapod functional tree of life, which could help to ensure long-term functioning of the ecosystem, potentially at the expense of conserving evolutionary history. PMID:25561666

  8. TreSpEx—Detection of Misleading Signal in Phylogenetic Reconstructions Based on Tree Information

    PubMed Central

    Struck, Torsten H

    2014-01-01

    Phylogenies of species or genes are commonplace nowadays in many areas of comparative biological studies. However, for phylogenetic reconstructions one must refer to artificial signals such as paralogy, long-branch attraction, saturation, or conflict between different datasets. These signals might eventually mislead the reconstruction even in phylogenomic studies employing hundreds of genes. Unfortunately, there has been no program allowing the detection of such effects in combination with an implementation into automatic process pipelines. TreSpEx (Tree Space Explorer) now combines different approaches (including statistical tests), which utilize tree-based information like nodal support or patristic distances (PDs) to identify misleading signals. The program enables the parallel analysis of hundreds of trees and/or predefined gene partitions, and being command-line driven, it can be integrated into automatic process pipelines. TreSpEx is implemented in Perl and supported on Linux, Mac OS X, and MS Windows. Source code, binaries, and additional material are freely available at http://www.annelida.de/research/bioinformatics/software.html. PMID:24701118
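
    A patristic distance is the sum of branch lengths along the path joining two tips of a tree. The sketch below shows one way to compute it; it is written in Python rather than Perl and is not taken from TreSpEx. The parent-map representation, the function names, and the branch lengths are assumptions for illustration.

```python
def distances_to_ancestors(parent, tip):
    """Map tip and each of its ancestors to the branch-length distance from tip."""
    dist = {tip: 0.0}
    node, travelled = tip, 0.0
    while node in parent:
        up, length = parent[node]
        travelled += length
        dist[up] = travelled
        node = up
    return dist


def patristic_distance(parent, tip_a, tip_b):
    """Sum of branch lengths on the path between two tips of a rooted tree."""
    up_a = distances_to_ancestors(parent, tip_a)
    node, travelled = tip_b, 0.0
    # Climb from tip_b until reaching a node that is also an ancestor of tip_a.
    while node not in up_a:
        up, length = parent[node]
        travelled += length
        node = up
    return up_a[node] + travelled


# Hypothetical tree: each node maps to (parent, branch length).
toy_tree = {
    "A": ("n1", 0.1),
    "B": ("n1", 0.1),
    "n1": ("root", 0.2),
    "C": ("root", 1.5),  # an unusually long branch
}
```

    A tip whose patristic distances to all other tips are unusually large, such as C here, is the kind of candidate a long-branch screen might flag for inspection.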

  9. Regional and phylogenetic variation of wood density across 2456 Neotropical tree species.

    PubMed

    Chave, Jérôme; Muller-Landau, Helene C; Baker, Timothy R; Easdale, Tomás A; ter Steege, Hans; Webb, Campbell O

    2006-12-01

    Wood density is a crucial variable in carbon accounting programs of both secondary and old-growth tropical forests. It also is the best single descriptor of wood: it correlates with numerous morphological, mechanical, physiological, and ecological properties. To explore the extent to which wood density could be estimated for rare or poorly censused taxa, and possible sources of variation in this trait, we analyzed regional, taxonomic, and phylogenetic variation in wood density among 2456 tree species from Central and South America. Wood density varied over more than one order of magnitude across species, with an overall mean of 0.645 g/cm3. Our geographical analysis showed significant decreases in wood density with increasing altitude and significant differences among low-altitude geographical regions: wet forests of Central America and western Amazonia have significantly lower mean wood density than dry forests of Central and South America, eastern and central Amazonian forests, and the Atlantic forests of Brazil; and eastern Amazonian forests have lower wood densities than the dry forests and the Atlantic forest. A nested analysis of variance showed that 74% of the species-level wood density variation was explained at the genus level, 34% at the Angiosperm Phylogeny Group (APG) family level, and 19% at the APG order level. This indicates that genus-level means give reliable approximations of values of species, except in a few hypervariable genera. We also studied which evolutionary shifts in wood density occurred in the phylogeny of seed plants using a composite phylogenetic tree. Major changes were observed at deep nodes (Eurosid 1), and also in more recent divergences (for instance in the Rhamnoids, Simaroubaceae, and Anacardiaceae). Our unprecedented wood density data set yields consistent guidelines for estimating wood densities when species-level information is lacking and should significantly reduce error in Central and South American carbon accounting

  10. A Rank-Based Sequence Aligner with Applications in Phylogenetic Analysis

    PubMed Central

    2014-01-01

    Recent tools for aligning short DNA reads have been designed to optimize the trade-off between correctness and speed. This paper introduces a method for assigning a set of short DNA reads to a reference genome, under Local Rank Distance (LRD). The rank-based aligner proposed in this work aims to improve correctness over speed. However, some indexing strategies to speed up the aligner are also investigated. The LRD aligner is improved in terms of speed by storing k-mer positions in a hash table for each read. Another improvement, which produces an approximate LRD aligner, is to consider only the positions in the reference that are likely to represent a good positional match of the read. The proposed aligner is evaluated and compared to other state-of-the-art alignment tools in several experiments. A set of experiments is conducted to determine the precision and the recall of the proposed aligner, in the presence of contaminated reads. In another set of experiments, the proposed aligner is used to find the order, the family, or the species of a new (or unknown) organism, given only a set of short Next-Generation Sequencing DNA reads. The empirical results show that the aligner proposed in this work is highly accurate from a biological point of view. Compared to the other evaluated tools, the LRD aligner has the important advantage of being very accurate even for a very low base coverage. Thus, the LRD aligner can be considered as a good alternative to standard alignment tools, especially when the accuracy of the aligner is of high importance. Source code and UNIX binaries of the aligner are freely available for future development and use at http://lrd.herokuapp.com/aligners. The software is implemented in C++ and Java, being supported on UNIX and MS Windows. PMID:25133391
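
    The indexing idea mentioned in this abstract (hashing k-mer positions of a read, then keeping only reference positions that look like a plausible match) can be illustrated with a minimal sketch. This is not the LRD aligner itself: the function names, the voting scheme, and the choice of k are assumptions made purely for illustration.

```python
from collections import defaultdict


def build_kmer_index(read: str, k: int) -> dict[str, list[int]]:
    """Map every k-mer of the read to the list of positions where it starts."""
    index: dict[str, list[int]] = defaultdict(list)
    for i in range(len(read) - k + 1):
        index[read[i:i + k]].append(i)
    return dict(index)


def candidate_offsets(reference: str, read: str, k: int) -> dict[int, int]:
    """Hypothetical filter: count, per reference offset, how many k-mers agree.

    A shared k-mer at reference position j and read position i votes for
    placing the read at offset j - i. Offsets with many votes are the
    positions worth passing to a slower, exact distance computation.
    """
    index = build_kmer_index(read, k)
    votes: dict[int, int] = defaultdict(int)
    for j in range(len(reference) - k + 1):
        for i in index.get(reference[j:j + k], ()):
            if j - i >= 0:  # ignore placements that start before the reference
                votes[j - i] += 1
    return dict(votes)


if __name__ == "__main__":
    votes = candidate_offsets("TTACGACTT", "ACGAC", k=2)
    best = max(votes, key=votes.get)
    print(best, votes[best])
```

    In this toy input the read occurs at offset 2 of the reference, and that offset collects the most votes; a real aligner would then score only the top-voted offsets.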

  11. Comparison of methods for rooting phylogenetic trees: a case study using Orcuttieae (Poaceae: Chloridoideae).

    PubMed

    Boykin, Laura M; Kubatko, Laura Salter; Lowrey, Timothy K

    2010-03-01

    DNA sequence data (cpDNA trnL intron and nrDNA ITS1 and ITS2) were analyzed to identify relationships within Orcuttieae, a small tribe of endangered grasses endemic to vernal pools in California and Baja California. The tribe includes three genera: Orcuttia, Tuctoria, and Neostapfia. All three genera carry out C(4) photosynthesis but aquatic taxa of Orcuttia lack Kranz anatomy. The unusual habitat preference of the tribe is coupled with the atypical development of C(4) photosynthesis without Kranz anatomy. Furthermore, the tribe has no known close relatives and has been noted to be phylogenetically isolated within the subfamily Chloridoideae. In this study we examine the problem of inferring the root of the tribe in the absence of an identified outgroup, analyze the phylogenetic relationships of the constituent taxa, and evaluate the evolutionary development of C(4) photosynthesis. We compare four methods for inferring the root of the tree: (1) the outgroup method, (2) midpoint rooting, and the imposition of a molecular clock for both (3) maximum likelihood (ML) and (4) Bayesian analysis. We examine the consequences of each method for the inferred phylogenetic relationships. Three of the methods (outgroup rooting and the ML and Bayesian molecular clock analyses) suggest that the root of Orcuttieae is between Neostapfia and the Tuctoria/Orcuttia lineage, while midpoint rooting gives a different root. The Bayesian method additionally provides information about probabilities associated with other possible root locations. Assuming that the true root of Orcuttieae is between Neostapfia and the Tuctoria/Orcuttia lineage, our data indicate that Neostapfia and Orcuttia are both monophyletic, while Tuctoria is paraphyletic (with no synapomorphies in either dataset), forms a grade between the other two genera, and needs taxonomic revision. Our data support the hypothesis that Orcuttieae was derived from a terrestrial ancestor and evolved specializations to an aquatic environment

  12. Not Seeing the Forest for the Trees: Size of the Minimum Spanning Trees (MSTs) Forest and Branch Significance in MST-Based Phylogenetic Analysis

    PubMed Central

    Teixeira, Andreia Sofia; Monteiro, Pedro T.; Carriço, João A; Ramirez, Mário; Francisco, Alexandre P.

    2015-01-01

    Trees, including minimum spanning trees (MSTs), are commonly used in phylogenetic studies. But, for the research community, it may be unclear that the presented tree is just a hypothesis, chosen from among many possible alternatives. In this scenario, it is important to quantify our confidence in both the trees and the branches/edges included in such trees. In this paper, we address this problem for MSTs by introducing a new edge betweenness metric for undirected and weighted graphs. This spanning edge betweenness metric is defined as the fraction of equivalent MSTs where a given edge is present. The metric provides a per-edge statistic that is similar to that of the bootstrap approach frequently used in phylogenetics to support the grouping of taxa. We provide methods for the exact computation of this metric based on the well-known Kirchhoff’s matrix tree theorem. Moreover, we implement and make available a module for the PHYLOViZ software and evaluate the proposed metric concerning both effectiveness and computational performance. Analysis of trees generated using multilocus sequence typing data (MLST) and the goeBURST algorithm revealed that the space of possible MSTs in real data sets is extremely large. Selection of the edge to be represented using bootstrap could lead to unreliable results since alternative edges are present in the same fraction of equivalent MSTs. The choice of the MST to be presented results from criteria implemented in the algorithm, which must be based on biologically plausible models. PMID:25799056
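
    The metric described in this abstract can be illustrated in a simplified setting. The sketch below assumes an unweighted connected graph, so that every spanning tree counts as an "equivalent MST"; the weighted case treated in the paper needs additional handling that is not shown. It uses Kirchhoff's matrix tree theorem (the number of spanning trees equals any cofactor of the graph Laplacian) and the identity that the fraction of spanning trees containing an edge is 1 minus the ratio of tree counts without and with that edge. The function names are hypothetical.

```python
from fractions import Fraction


def count_spanning_trees(n: int, edges: list[tuple[int, int]]) -> int:
    """Kirchhoff's theorem: determinant of the Laplacian minus row/column 0."""
    lap = [[Fraction(0)] * n for _ in range(n)]
    for u, v in edges:
        lap[u][u] += 1
        lap[v][v] += 1
        lap[u][v] -= 1
        lap[v][u] -= 1
    m = [row[1:] for row in lap[1:]]  # reduced Laplacian
    size = n - 1
    det = Fraction(1)
    # Exact Gaussian elimination over rationals (no floating-point error).
    for c in range(size):
        pivot = next((r for r in range(c, size) if m[r][c] != 0), None)
        if pivot is None:
            return 0  # singular matrix: the graph is disconnected
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            det = -det
        det *= m[c][c]
        for r in range(c + 1, size):
            f = m[r][c] / m[c][c]
            for k in range(c, size):
                m[r][k] -= f * m[c][k]
    return int(det)


def spanning_edge_fraction(n, edges, edge) -> Fraction:
    """Fraction of spanning trees that contain `edge` (unweighted graphs only).

    `edge` must appear in `edges` with the same orientation.
    """
    total = count_spanning_trees(n, edges)
    if total == 0:
        raise ValueError("graph is not connected")
    remaining = list(edges)
    remaining.remove(edge)  # drop a single copy of the edge
    return Fraction(total - count_spanning_trees(n, remaining), total)


if __name__ == "__main__":
    triangle = [(0, 1), (1, 2), (0, 2)]
    print(spanning_edge_fraction(3, triangle, (0, 1)))
```

    For a triangle, each edge appears in two of the three spanning trees, whereas a bridge appears in all of them; this is the sense in which the fraction acts as a per-edge support value.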

  13. Not seeing the forest for the trees: size of the minimum spanning trees (MSTs) forest and branch significance in MST-based phylogenetic analysis.

    PubMed

    Teixeira, Andreia Sofia; Monteiro, Pedro T; Carriço, João A; Ramirez, Mário; Francisco, Alexandre P

    2015-01-01

    Trees, including minimum spanning trees (MSTs), are commonly used in phylogenetic studies. But, for the research community, it may be unclear that the presented tree is just a hypothesis, chosen from among many possible alternatives. In this scenario, it is important to quantify our confidence in both the trees and the branches/edges included in such trees. In this paper, we address this problem for MSTs by introducing a new edge betweenness metric for undirected and weighted graphs. This spanning edge betweenness metric is defined as the fraction of equivalent MSTs where a given edge is present. The metric provides a per-edge statistic that is similar to that of the bootstrap approach frequently used in phylogenetics to support the grouping of taxa. We provide methods for the exact computation of this metric based on the well-known Kirchhoff's matrix tree theorem. Moreover, we implement and make available a module for the PHYLOViZ software and evaluate the proposed metric concerning both effectiveness and computational performance. Analysis of trees generated using multilocus sequence typing data (MLST) and the goeBURST algorithm revealed that the space of possible MSTs in real data sets is extremely large. Selection of the edge to be represented using bootstrap could lead to unreliable results since alternative edges are present in the same fraction of equivalent MSTs. The choice of the MST to be presented results from criteria implemented in the algorithm, which must be based on biologically plausible models. PMID:25799056

  14. Bears in a Forest of Gene Trees: Phylogenetic Inference Is Complicated by Incomplete Lineage Sorting and Gene Flow

    PubMed Central

    Kutschera, Verena E.; Bidon, Tobias; Hailer, Frank; Rodi, Julia L.; Fain, Steven R.; Janke, Axel

    2014-01-01

    Ursine bears are a mammalian subfamily that comprises six morphologically and ecologically distinct extant species. Previous phylogenetic analyses of concatenated nuclear genes could not resolve all relationships among bears, and appeared to conflict with the mitochondrial phylogeny. Evolutionary processes such as incomplete lineage sorting and introgression can cause gene tree discordance and complicate phylogenetic inferences, but are not accounted for in phylogenetic analyses of concatenated data. We generated a high-resolution data set of autosomal introns from several individuals per species and of Y-chromosomal markers. Incorporating intraspecific variability in coalescence-based phylogenetic and gene flow estimation approaches, we traced the genealogical history of individual alleles. Considerable heterogeneity among nuclear loci and discordance between nuclear and mitochondrial phylogenies were found. A species tree with divergence time estimates indicated that ursine bears diversified within less than 2 My. Consistent with a complex branching order within a clade of Asian bear species, we identified unidirectional gene flow from Asian black into sloth bears. Moreover, gene flow detected from brown into American black bears can explain the conflicting placement of the American black bear in mitochondrial and nuclear phylogenies. These results highlight that both incomplete lineage sorting and introgression are prominent evolutionary forces even on time scales up to several million years. Complex evolutionary patterns are not adequately captured by strictly bifurcating models, and can only be fully understood when analyzing multiple independently inherited loci in a coalescence framework. Phylogenetic incongruence among gene trees hence needs to be recognized as a biologically meaningful signal. PMID:24903145

  15. Bears in a forest of gene trees: phylogenetic inference is complicated by incomplete lineage sorting and gene flow.

    PubMed

    Kutschera, Verena E; Bidon, Tobias; Hailer, Frank; Rodi, Julia L; Fain, Steven R; Janke, Axel

    2014-08-01

    Ursine bears are a mammalian subfamily that comprises six morphologically and ecologically distinct extant species. Previous phylogenetic analyses of concatenated nuclear genes could not resolve all relationships among bears, and appeared to conflict with the mitochondrial phylogeny. Evolutionary processes such as incomplete lineage sorting and introgression can cause gene tree discordance and complicate phylogenetic inferences, but are not accounted for in phylogenetic analyses of concatenated data. We generated a high-resolution data set of autosomal introns from several individuals per species and of Y-chromosomal markers. Incorporating intraspecific variability in coalescence-based phylogenetic and gene flow estimation approaches, we traced the genealogical history of individual alleles. Considerable heterogeneity among nuclear loci and discordance between nuclear and mitochondrial phylogenies were found. A species tree with divergence time estimates indicated that ursine bears diversified within less than 2 My. Consistent with a complex branching order within a clade of Asian bear species, we identified unidirectional gene flow from Asian black into sloth bears. Moreover, gene flow detected from brown into American black bears can explain the conflicting placement of the American black bear in mitochondrial and nuclear phylogenies. These results highlight that both incomplete lineage sorting and introgression are prominent evolutionary forces even on time scales up to several million years. Complex evolutionary patterns are not adequately captured by strictly bifurcating models, and can only be fully understood when analyzing multiple independently inherited loci in a coalescence framework. Phylogenetic incongruence among gene trees hence needs to be recognized as a biologically meaningful signal. PMID:24903145

  16. Bioinformatics analysis and construction of phylogenetic tree of aquaporins from Echinococcus granulosus.

    PubMed

    Wang, Fen; Ye, Bin

    2016-09-01

    Cystic echinococcosis, caused by the metacestode larvae of Echinococcus granulosus (Eg), is a chronic, severe zoonotic parasitosis with a worldwide distribution. Treatment of cystic echinococcosis remains difficult, since surgery does not meet the needs of all patients and drugs can lead to serious adverse events as well as resistance. Screening for target proteins that interact with new anti-hydatidosis drugs is urgently needed to meet these challenges. Here, we analyzed the sequences and structural properties of E. granulosus aquaporins (EgAQPs) and constructed a phylogenetic tree using bioinformatics methods. The MIP family signature and protein kinase C phosphorylation sites were predicted in all nine EgAQPs. α-helix and random coil were the main secondary structures of EgAQPs. The number of transmembrane regions ranged from three to six, which indicated that EgAQPs contain multiple hydrophobic regions. A neighbor-joining tree indicated that EgAQPs fall into two branches: seven EgAQPs formed a clade with human AQP1, a "strict" aquaporin, and the other two formed a clade with human AQP9, an aquaglyceroporin. Unfortunately, homology modeling of EgAQPs was aborted. These results provide a foundation for understanding and further research on the biological function of E. granulosus. PMID:27164831

  17. Interactive tree of life (iTOL) v3: an online tool for the display and annotation of phylogenetic and other trees

    PubMed Central

    Letunic, Ivica; Bork, Peer

    2016-01-01

    Interactive Tree Of Life (http://itol.embl.de) is a web-based tool for the display, manipulation and annotation of phylogenetic trees. It is freely available and open to everyone. The current version was completely redesigned and rewritten, utilizing current web technologies for speedy and streamlined processing. Numerous new features were introduced and several new data types are now supported. Trees with up to 100,000 leaves can now be efficiently displayed. Full interactive control over precise positioning of various annotation features and an unlimited number of datasets allow the easy creation of complex tree visualizations. iTOL 3 is the first tool which supports direct visualization of the recently proposed phylogenetic placements format. Finally, iTOL's account system has been redesigned to simplify the management of trees in user-defined workspaces and projects, as it is heavily used and currently handles already more than 500,000 trees from more than 10,000 individual users. PMID:27095192

  18. Interactive tree of life (iTOL) v3: an online tool for the display and annotation of phylogenetic and other trees.

    PubMed

    Letunic, Ivica; Bork, Peer

    2016-07-01

    Interactive Tree Of Life (http://itol.embl.de) is a web-based tool for the display, manipulation and annotation of phylogenetic trees. It is freely available and open to everyone. The current version was completely redesigned and rewritten, utilizing current web technologies for speedy and streamlined processing. Numerous new features were introduced and several new data types are now supported. Trees with up to 100,000 leaves can now be efficiently displayed. Full interactive control over precise positioning of various annotation features and an unlimited number of datasets allow the easy creation of complex tree visualizations. iTOL 3 is the first tool which supports direct visualization of the recently proposed phylogenetic placements format. Finally, iTOL's account system has been redesigned to simplify the management of trees in user-defined workspaces and projects, as it is heavily used and currently handles already more than 500,000 trees from more than 10,000 individual users. PMID:27095192

  19. Molecular phylogenetics reveal multiple tertiary vicariance origins of the African rain forest trees

    PubMed Central

    Couvreur, Thomas LP; Chatrou, Lars W; Sosef, Marc SM; Richardson, James E

    2008-01-01

    Background: Tropical rain forests are the most diverse terrestrial ecosystems on the planet. How this diversity evolved remains largely unexplained. In Africa, rain forests are situated in two geographically isolated regions: the West-Central Guineo-Congolian region and the coastal and montane regions of East Africa. These regions have strong floristic affinities with each other, suggesting a former connection via an Eocene pan-African rain forest. High levels of endemism observed in both regions have been hypothesized to be the result of either 1) a single break-up followed by a long isolation or 2) multiple fragmentation and reconnection since the Oligocene. To test these hypotheses the evolutionary history of endemic taxa within a rain forest restricted African lineage of the plant family Annonaceae was studied. Molecular phylogenies and divergence dates were estimated using a Bayesian relaxed uncorrelated molecular clock assumption accounting for both calibration and phylogenetic uncertainties. Results: Our results provide strong evidence that East African endemic lineages of Annonaceae have multiple origins dated to significantly different times spanning the Oligocene and Miocene epochs. Moreover, these successive origins (c. 33, 16 and 8 million years – Myr) coincide with known periods of aridification and geological activity in Africa that would have recurrently isolated the Guineo-Congolian rain forest from the East African one. All East African taxa were found to have diversified prior to Pleistocene times. Conclusion: Molecular phylogenetic dating analyses of this large pan-African clade of Annonaceae unravel an interesting pattern of diversification for rain forest restricted trees co-occurring in West/Central and East African rain forests. Our results suggest that repeated reconnections between the West/Central and East African rain forest blocks allowed for biotic exchange while the break-ups induced speciation via vicariance, enhancing the levels of

  20. Phylogenetic Analysis of Local-Scale Tree Soil Associations in a Lowland Moist Tropical Forest

    PubMed Central

    Schreeg, Laura A.; Kress, W. John; Erickson, David L.; Swenson, Nathan G.

    2010-01-01

    Background: Local plant-soil associations are commonly studied at the species-level, while associations at the level of nodes within a phylogeny have been less well explored. Understanding associations within a phylogenetic context, however, can improve our ability to make predictions across systems and can advance our understanding of the role of evolutionary history in structuring communities. Methodology/Principal Findings: Here we quantified evolutionary signal in plant-soil associations using a DNA sequence-based community phylogeny and several soil variables (e.g., extractable phosphorus, aluminum and manganese, pH, and slope as a proxy for soil water). We used published plant distributional data from the 50-ha plot on Barro Colorado Island (BCI), Republic of Panamá. Our results suggest some groups of closely related species do share similar soil associations. Most notably, the node shared by Myrtaceae and Vochysiaceae was associated with high levels of aluminum, a potentially toxic element. The node shared by Apocynaceae was associated with high extractable phosphorus, a nutrient that could be limiting on a taxon specific level. The node shared by the large group of Laurales and Magnoliales was associated with both low extractable phosphorus and with steeper slope. Despite significant node-specific associations, this study detected little to no phylogeny-wide signal. We consider the majority of the ‘traits’ (i.e., soil variables) evaluated to fall within the category of ecological traits. We suggest that, given this category of traits, phylogeny-wide signal might not be expected while node-specific signals can still indicate phylogenetic structure with respect to the variable of interest. Conclusions: Within the BCI forest dynamics plot, distributions of some plant taxa are associated with local-scale differences in soil variables when evaluated at individual nodes within the phylogenetic tree, but they are not detectable by phylogeny-wide signal. Trends

  1. Phylogenetic diversity of endophytic leaf fungus isolates from the medicinal tree Trichilia elegans (Meliaceae).

    PubMed

    Rhoden, S A; Garcia, A; Rubin Filho, C J; Azevedo, J L; Pamphile, J A

    2012-01-01

    Various types of organisms, mainly fungi and bacteria, live within vegetal organs and tissues, without causing damage to the plant. These microorganisms, which are called endophytes, can be useful for biological control and plant growth promotion; bioactive compounds from these organisms may have medical and pharmaceutical applications. Trichilia elegans (Meliaceae) is a native tree that grows abundantly in several regions of Brazil. Preparations using the leaves, seeds, bark, and roots of many species of the Meliaceae family have been widely used in traditional medicine, and some members of the Trichilia genus are used in Brazilian popular medicine. We assessed the diversity of endophytic fungi from two wild specimens of T. elegans, collected from a forest remnant, by sequencing ITS1-5.8S-ITS2 of rDNA of the isolates. The fungi were isolated and purified; 97 endophytic fungi were found; they were separated into 17 morpho-groups. Of the 97 endophytic fungi, four genera (Phomopsis, Diaporthe, Dothideomycete, and Cordyceps) with 11 morpho-groups were identified. Phomopsis was the most frequent genus among the identified endophytes. Phylogenetic analysis showed two major clades: Sordariomycetes, which includes three genera, Phomopsis, Diaporthe, and Cordyceps, and the clade Dothideomycetes, which was represented by the order Pleosporales. PMID:22782630

  2. First record of Bursaphelenchus rainulfi on pine trees from eastern China and its phylogenetic relationship with intro-genus species

    PubMed Central

    Jiang, Li-qin; Li, Xu-qing; Zheng, Jing-wu

    2007-01-01

    Bursaphelenchus rainulfi isolated from dead pine trees in Zhejiang, China, is described and illustrated. Some molecular characters of the Chinese population are also provided, including the PCR-RFLP patterns and sequences of the ITS region and the D2-D3 expansion region of the large subunit (LSU) rRNA gene. Both the morphological characters and ITS-RFLP patterns match the original description. Phylogenetic trees based on 13 sequences of the D2-D3 expansion region of the LSU rRNA gene and of the ITS region of Bursaphelenchus species were constructed separately, and both showed similar clades. The phylogenetic relationship based on the molecular data is similar to that based on morphological characters. This is the first report of the species on pine wood in eastern China. PMID:17542063

  3. Analyses of the radiation of birnaviruses from diverse host phyla and of their evolutionary affinities with other double-stranded RNA and positive strand RNA viruses using robust structure-based multiple sequence alignments and advanced phylogenetic methods

    PubMed Central

    2013-01-01

    Background: Birnaviruses form a distinct family of double-stranded RNA viruses infecting animals as different as vertebrates, mollusks, insects and rotifers. With such a wide host range, they constitute a good model for studying the adaptation to the host. Additionally, several lines of evidence link birnaviruses to positive strand RNA viruses and suggest that phylogenetic analyses may provide clues about transition. Results: We characterized the genome of a birnavirus from the rotifer Brachionus plicatilis. We used X-ray structures of RNA-dependent RNA polymerases and capsid proteins to obtain multiple structure alignments that allowed us to obtain reliable multiple sequence alignments and we employed “advanced” phylogenetic methods to study the evolutionary relationships between some positive strand and double-stranded RNA viruses. We showed that the rotifer birnavirus genome exhibited an organization remarkably similar to other birnaviruses. As this host was phylogenetically very distant from the other known species targeted by birnaviruses, we revisited the evolutionary pathways within the Birnaviridae family using phylogenetic reconstruction methods. We also applied a number of phylogenetic approaches based on structurally conserved domains/regions of the capsid and RNA-dependent RNA polymerase proteins to study the evolutionary relationships between birnaviruses, other double-stranded RNA viruses and positive strand RNA viruses. Conclusions: We show that there is a good correlation between the phylogeny of the birnaviruses and that of their hosts at the phylum level using the RNA-dependent RNA polymerase (genomic segment B) on the one hand and a concatenation of the capsid protein, protease and ribonucleoprotein (genomic segment A) on the other hand. This correlation tends to vanish within phyla. The use of advanced phylogenetic methods and robust structure-based multiple sequence alignments allowed us to obtain a more accurate picture (in terms of

  4. Quantification and functional analysis of modular protein evolution in a dense phylogenetic tree.

    PubMed

    Moore, Andrew D; Grath, Sonja; Schüler, Andreas; Huylmans, Ann K; Bornberg-Bauer, Erich

    2013-05-01

    Modularity is a hallmark of molecular evolution. Whether considering gene regulation, the components of metabolic pathways or signaling cascades, the ability to reuse autonomous modules in different molecular contexts can expedite evolutionary innovation. Similarly, protein domains are the modules of proteins, and modular domain rearrangements can create diversity with seemingly few operations in turn allowing for swift changes to an organism's functional repertoire. Here, we assess the patterns and functional effects of modular rearrangements at high resolution. Using a well resolved and diverse group of pancrustaceans, we illustrate arrangement diversity within closely related organisms, estimate arrangement turnover frequency and establish, for the first time, branch-specific rate estimates for fusion, fission, domain addition and terminal loss. Our results show that roughly 16 new arrangements arise per million years and that between 64% and 81% of these can be explained by simple, single-step modular rearrangement events. We find evidence that the frequencies of fission and terminal deletion events increase over time, and that modular rearrangements impact all levels of the cellular signaling apparatus and thus may have strong adaptive potential. Novel arrangements that cannot be explained by simple modular rearrangements contain a significant amount of repeat domains that occur in complex patterns which we term "supra-repeats". Furthermore, these arrangements are significantly longer than those with a single-step rearrangement solution, suggesting that such arrangements may result from multi-step events. In summary, our analysis provides an integrated view and initial quantification of the patterns and functional impact of modular protein evolution in a well resolved phylogenetic tree. This article is part of a Special Issue entitled: The emerging dynamic view of proteins: Protein plasticity in allostery, evolution and self-assembly. PMID:23376183

  5. Visualizing differences in phylogenetic information content of alignments and distinction of three classes of long-branch effects

    PubMed Central

    Wägele, Johann Wolfgang; Mayer, Christoph

    2007-01-01

    Background: Published molecular phylogenies are usually based on data whose quality has not been explored prior to tree inference. This leads to errors because trees obtained with conventional methods suppress conflicting evidence, and because support values may be high even if there is no distinct phylogenetic signal. Tools that allow an a priori examination of data quality are rarely applied. Results: Using data from published molecular analyses on the phylogeny of crustaceans it is shown that tree topologies and popular support values do not show existing differences in data quality. To visualize variations in signal distinctness, we use network analyses based on split decomposition and split support spectra. Both methods show the same differences in data quality and the same clade-supporting patterns. Both methods are useful to discover long-branch effects. We discern three classes of long branch effects. Class I effects consist of attraction of terminal taxa caused by symplesiomorphies, which results in a false monophyly of paraphyletic groups. Addition of carefully selected taxa can fix this effect. Class II effects are caused by drastic signal erosion. Long branches affected by this phenomenon usually slip down the tree to form false clades that in reality are polyphyletic. To recover the correct phylogeny, more conservative genes must be used. Class III effects consist of attraction due to accumulated chance similarities or convergent character states. This sort of noise can be reduced by selecting less variable portions of the data set, avoiding biases, and adding slower genes. Conclusion: To increase confidence in molecular phylogenies an exploratory analysis of the signal-to-noise ratio can be conducted with split decomposition methods. If long-branch effects are detected, it is necessary to discern between three classes of effects to find the best approach for an improvement of the raw data. PMID:17725833

  6. The Hymenopteran Tree of Life: Evidence from Protein-Coding Genes and Objectively Aligned Ribosomal Data

    PubMed Central

    Klopfstein, Seraina; Vilhelmsen, Lars; Heraty, John M.; Sharkey, Michael; Ronquist, Fredrik

    2013-01-01

    Previous molecular analyses of higher hymenopteran relationships have largely been based on subjectively aligned ribosomal sequences (18S and 28S). Here, we reanalyze the 18S and 28S data (unaligned about 4.4 kb) using an objective and a semi-objective alignment approach, based on MAFFT and BAli-Phy, respectively. Furthermore, we present the first analyses of a substantial protein-coding data set (4.6 kb from one mitochondrial and four nuclear genes). Our results indicate that previous studies may have suffered from inflated support values due to subjective alignment of the ribosomal sequences, but apparently not from significant biases. The protein data provide independent confirmation of several earlier results, including the monophyly of non-xyelid hymenopterans, Pamphilioidea + Unicalcarida, Unicalcarida, Vespina, Apocrita, Proctotrupomorpha and core Proctotrupomorpha. The protein data confirm that Aculeata are nested within a paraphyletic Evaniomorpha, but cast doubt on the monophyly of Evanioidea. Combining the available morphological, ribosomal and protein-coding data, we examine the total-evidence signal as well as congruence and conflict among the three data sources. Despite an emerging consensus on many higher-level hymenopteran relationships, several problems remain unresolved or contentious, including rooting of the hymenopteran tree, relationships of the woodwasps, placement of Stephanoidea and Ceraphronoidea, and the sister group of Aculeata. PMID:23936325

  7. Multiple Amino Acid Sequence Alignment Nitrogenase Component 1: Insights into Phylogenetics and Structure-Function Relationships

    PubMed Central

    Howard, James B.; Kechris, Katerina J.; Rees, Douglas C.; Glazer, Alexander N.

    2013-01-01

    Amino acid residues critical for a protein's structure-function are retained by natural selection and these residues are identified by the level of variance in co-aligned homologous protein sequences. The relevant residues in the nitrogen fixation Component 1 α- and β-subunits were identified by the alignment of 95 protein sequences. Proteins were included from species encompassing multiple microbial phyla and diverse ecological niches as well as the nitrogen fixation genotypes, anf, nif, and vnf, which encode proteins associated with cofactors differing at one metal site. After adjusting for differences in sequence length, insertions, and deletions, the remaining >85% of the sequence co-aligned the subunits from the three genotypes. Six Groups, designated Anf, Vnf, and Nif I-IV, were assigned based upon genetic origin, sequence adjustments, and conserved residues. Both subunits subdivided into the same groups. Invariant and single variant residues were identified and were defined as “core” for nitrogenase function. Three species in Group Nif-III, Candidatus Desulforudis audaxviator, Desulfotomaculum kuznetsovii, and Thermodesulfatator indicus, were found to have a seleno-cysteine that replaces one cysteinyl ligand of the 8Fe:7S, P-cluster. Subsets of invariant residues, limited to individual groups, were identified; these unique residues help identify the gene of origin (anf, nif, or vnf) yet should not be considered diagnostic of the metal content of associated cofactors. Fourteen of the 19 residues that compose the cofactor pocket are invariant or single variant; the other five residues are highly variable but do not correlate with the putative metal content of the cofactor. The variable residues are clustered on one side of the cofactor, away from other functional centers in the three-dimensional structure. Many of the invariant and single variant residues were not previously recognized as potentially critical and their identification provides the bases

  8. An African American Paternal Lineage Adds an Extremely Ancient Root to the Human Y Chromosome Phylogenetic Tree

    PubMed Central

    Mendez, Fernando L.; Krahn, Thomas; Schrack, Bonnie; Krahn, Astrid-Maria; Veeramah, Krishna R.; Woerner, August E.; Fomine, Forka Leypey Mathew; Bradman, Neil; Thomas, Mark G.; Karafet, Tatiana M.; Hammer, Michael F.

    2013-01-01

    We report the discovery of an African American Y chromosome that carries the ancestral state of all SNPs that defined the basal portion of the Y chromosome phylogenetic tree. We sequenced ∼240 kb of this chromosome to identify private, derived mutations on this lineage, which we named A00. We then estimated the time to the most recent common ancestor (TMRCA) for the Y tree as 338 thousand years ago (kya) (95% confidence interval = 237–581 kya). Remarkably, this exceeds current estimates of the mtDNA TMRCA, as well as those of the age of the oldest anatomically modern human fossils. The extremely ancient age combined with the rarity of the A00 lineage, which we also find at very low frequency in central Africa, point to the importance of considering more complex models for the origin of Y chromosome diversity. These models include ancient population structure and the possibility of archaic introgression of Y chromosomes into anatomically modern humans. The A00 lineage was discovered in a large database of consumer samples of African Americans and has not been identified in traditional hunter-gatherer populations from sub-Saharan Africa. This underscores how the stochastic nature of the genealogical process can affect inference from a single locus and warrants caution during the interpretation of the geographic location of divergent branches of the Y chromosome phylogenetic tree for the elucidation of human origins. PMID:23453668

  9. Comprehensive phylogenetic reconstruction of relationships in Octocorallia (Cnidaria: Anthozoa) from the Atlantic ocean using mtMutS and nad2 genes tree reconstructions

    NASA Astrophysics Data System (ADS)

    Morris, K. J.; Herrera, S.; Gubili, C.; Tyler, P. A.; Rogers, A.; Hauton, C.

    2012-12-01

    Despite being an abundant group of significant ecological importance, the phylogenetic relationships of the Octocorallia remain poorly understood and very much understudied. We used 1132 bp of two mitochondrial protein-coding genes, nad2 and mtMutS (previously referred to as msh1), to construct a phylogeny for 161 octocoral specimens from the Atlantic, including both Isididae and non-Isididae species. We found that four clades were supported using a concatenated alignment. Two of these (A and B) were in general agreement with the Holaxonia-Alcyoniina and Anthomastus-Corallium clades identified by previous work. The third and fourth clades represent a split of the Calcaxonia-Pennatulacea clade, resulting in a clade containing the Pennatulacea and a small number of Isididae specimens and a second clade containing the remaining Calcaxonia. When individual genes were considered, nad2 largely agreed with previous work, with mtMutS also producing a fourth clade corresponding to a split of Isididae species from the Calcaxonia-Pennatulacea clade. It is expected that these differences are a consequence of the inclusion of Isididae species that have undergone a gene inversion in the mtMutS gene, causing their separation in the mtMutS-only tree. The fourth clade in the concatenated tree is also suspected to be a result of this gene inversion, as there were very few Isididae species included in previous work and thus this separation would not be clearly resolved. A larger phylogeny including both Isididae and non-Isididae species is required to further resolve these clades.

  10. Phylogenetic trait conservation in the partner choice of a group of ectomycorrhizal trees.

    PubMed

    Hayward, Jeremy; Horton, Thomas R

    2014-10-01

    Ecological interactions are frequently conserved across evolutionary time. In the case of mutualisms, these conserved interactions may play a large role in structuring mutualist communities. We hypothesized that phylogenetic trait conservation could play a key role in determining patterns of association in the ectomycorrhizal symbiosis, a globally important trophic mutualism. We used the association between members of the pantropical plant tribe Pisonieae and its fungal mutualist partners as a model system to test the prediction that Pisonieae-associating ectomycorrhizal fungi will be more closely related than expected by chance, reflecting a conserved trait. We tested this prediction using previously published and newly generated sequences in a Bayesian framework incorporating phylogenetic uncertainty. We report that phylogenetic trait conservation does exist in this association. We generated a five-marker phylogeny of members of the Pisonieae and used this phylogeny in a Bayesian relaxed molecular clock analysis. We established that the most recent common ancestors of Pisonieae species and Pisonieae-associating fungi sharing phylogenetic conservation of their patterns of ectomycorrhizal association occurred no more recently than 14.2 Ma. We therefore suggest that phylogenetic trait conservation in the Pisonieae ectomycorrhizal mutualism association represents an inherited syndrome which has existed for at least 14 Myr. PMID:25169622

  11. [Estimating genetic distance and phylogenetic tree of HPA-1-3, 5, and 15 in different populations].

    PubMed

    Feng, Ming-Liang; Huang, Hui; Shen, Tong; Zhang, Xi; Yin, Biao; Yang, Jian-Hao; Liu, Da-Zhuang

    2008-07-01

    Based on human platelet alloantigen (HPA) polymorphisms in five systems, the distributions of the HPA-1-3, 5, and 15 systems in 1 000 Chinese donors were determined by using a polymerase chain reaction with sequence-specific primers (PCR-SSP) method. The genetic distance and phylogenetic tree between Chinese Hans and other populations were estimated by using DISPAN and PHYLIP software. As presented by the phylogenetic tree, Asians clustered with Europeans first, and then grouped together with Africans. The Beninese, who came from Africa, were at the top of the dendrogram. Indians were located between Asians and Europeans. Brazilians clustered with other European populations. Oceanian Polynexiya were shown specifically to cluster with Asian populations. These results supported the "out of Africa theory" from one side, and also confirmed that the early migration of Asians was from south to southeast and east Asia; thus it is probable that Europeans migrated from south to north and west Europe. As genetic distance was estimated effectively by HPA systems, HPA systems could serve as a genetic marker in human migration and evolution research. PMID:18779125
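
    The abstract above does not say which distance measure was computed. As a minimal sketch, assuming Nei's standard genetic distance and biallelic loci, the calculation from allele frequencies could look like the following. The population frequencies and the function name are hypothetical and are not data from the study.

```python
import math

def nei_standard_distance(pop_x, pop_y):
    """Nei's standard genetic distance D = -ln(Jxy / sqrt(Jx * Jy)).

    pop_x, pop_y: dict mapping locus name -> tuple of allele frequencies.
    Both populations must list the same loci and the same allele order.
    """
    loci = list(pop_x)
    n = len(loci)
    # Average (over loci) probability that two alleles are identical
    jx = sum(sum(p * p for p in pop_x[l]) for l in loci) / n
    jy = sum(sum(p * p for p in pop_y[l]) for l in loci) / n
    jxy = sum(sum(p * q for p, q in zip(pop_x[l], pop_y[l])) for l in loci) / n
    identity = jxy / math.sqrt(jx * jy)
    return -math.log(identity)

# Hypothetical frequencies of the "a" and "b" alleles in two populations
pop_a = {"HPA-1": (0.99, 0.01), "HPA-3": (0.60, 0.40), "HPA-15": (0.53, 0.47)}
pop_b = {"HPA-1": (0.85, 0.15), "HPA-3": (0.62, 0.38), "HPA-15": (0.50, 0.50)}

print(nei_standard_distance(pop_a, pop_b))
```

    A matrix of such pairwise distances is the kind of input a distance-based tree-building program would take.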

  12. Determining the Position of Storks on the Phylogenetic Tree of Waterbirds by Retroposon Insertion Analysis

    PubMed Central

    Kuramoto, Tae; Nishihara, Hidenori; Watanabe, Maiko; Okada, Norihiro

    2015-01-01

    Despite many studies on avian phylogenetics in recent decades that used morphology, mitochondrial genomes, and/or nuclear genes, the phylogenetic positions of several birds (e.g., storks) remain unsettled. In addition to the aforementioned approaches, analysis of retroposon insertions, which are nearly homoplasy-free phylogenetic markers, has also been used in avian phylogenetics. However, the first step in the analysis of retroposon insertions, that is, isolation of retroposons from genomic libraries, is a costly and time-consuming procedure. Therefore, we developed a high-throughput and cost-effective protocol to collect retroposon insertion information based on next-generation sequencing technology, which we call here the STRONG (Screening of Transposons Obtained by Next Generation Sequencing) method, and applied it to 3 waterbird species, for which we identified 35,470 loci containing chicken repeat 1 retroposons (CR1). Our analysis of the presence/absence of 30 CR1 insertions demonstrated the intra- and interordinal phylogenetic relationships in the waterbird assemblage, namely 1) Loons diverged first among the waterbirds, 2) penguins (Sphenisciformes) and petrels (Procellariiformes) diverged next, and 3) among the remaining families of waterbirds traditionally classified in Ciconiiformes/Pelecaniformes, storks (Ciconiidae) diverged first. Furthermore, our genome-scale, in silico retroposon analysis based on published genome data uncovered a complex divergence history among pelican, heron, and ibis lineages, presumably involving ancient interspecies hybridization between the heron and ibis lineages. Thus, our retroposon-based waterbird phylogeny and the established phylogenetic position of storks will help to understand the evolutionary processes of aquatic adaptation and related morphological convergent evolution. PMID:26527652
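
    The presence/absence logic described above can be sketched as follows: if insertions are treated as nearly homoplasy-free, a locus whose insertion is shared by some but not all taxa supports grouping those taxa together. The taxon names, locus names, and matrix values below are made up for illustration and are not results from the study; the function name is also hypothetical.

```python
from collections import Counter

def clade_support(matrix):
    """Count loci supporting each candidate grouping of taxa.

    matrix: dict locus -> dict taxon -> 1 (present), 0 (absent), None (missing).
    Returns a Counter keyed by the frozenset of taxa sharing the insertion.
    """
    support = Counter()
    for states in matrix.values():
        if any(s is None for s in states.values()):
            continue  # skip loci with missing data
        present = frozenset(t for t, s in states.items() if s == 1)
        # Informative only if shared by at least two taxa but not by all
        if 2 <= len(present) < len(states):
            support[present] += 1
    return support

matrix = {
    "locus1": {"taxonA": 1, "taxonB": 1, "taxonC": 1, "taxonD": 0},
    "locus2": {"taxonA": 0, "taxonB": 1, "taxonC": 1, "taxonD": 0},
    "locus3": {"taxonA": 0, "taxonB": 1, "taxonC": 1, "taxonD": 0},
    "locus4": {"taxonA": 1, "taxonB": 1, "taxonC": 1, "taxonD": 1},     # present in all
    "locus5": {"taxonA": 0, "taxonB": None, "taxonC": 1, "taxonD": 0},  # missing data
}

for taxa, n_loci in clade_support(matrix).most_common():
    print(sorted(taxa), n_loci)
```

    Conflicting groupings with comparable support would be one signal of the kind of complex history (for example, hybridization) that the abstract mentions.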

  13. A Revised Root for the Human Y Chromosomal Phylogenetic Tree: The Origin of Patrilineal Diversity in Africa

    PubMed Central

    Cruciani, Fulvio; Trombetta, Beniamino; Massaia, Andrea; Destro-Bisol, Giovanni; Sellitto, Daniele; Scozzari, Rosaria

    2011-01-01

    To shed light on the structure of the basal backbone of the human Y chromosome phylogeny, we sequenced about 200 kb of the male-specific region of the human Y chromosome (MSY) from each of seven Y chromosomes belonging to clades A1, A2, A3, and BT. We detected 146 biallelic variant sites through this analysis. We used these variants to construct a patrilineal tree, without taking into account any previously reported information regarding the phylogenetic relationships among the seven Y chromosomes here analyzed. There are several key changes at the basal nodes as compared with the most recent reference Y chromosome tree. A different position of the root was determined, with important implications for the origin of human Y chromosome diversity. An estimate of 142 KY was obtained for the coalescence time of the revised MSY tree, which is earlier than that obtained in previous studies and easier to reconcile with plausible scenarios of modern human origin. The number of deep branchings leading to African-specific clades has doubled, further strengthening the MSY-based evidence for a modern human origin in the African continent. An analysis of 2204 African DNA samples showed that the deepest clades of the revised MSY phylogeny are currently found in central and northwest Africa, opening new perspectives on early human presence in the continent. PMID:21601174

  14. [Phylogeny of genus Spermophilus and position of Alashan ground squirrel (Spermophilus alashanicus, Buchner, 1888) on phylogenetic tree of Paleartic short-tailed ground squirrels].

    PubMed

    Kapustina, S Yu; Brandler, O V; Adiya, Ya

    2015-01-01

    Phylogenetic relationships within a group of Palearctic short-tailed ground squirrels (Spermophilus), recently defined as a genus, are not sufficiently clear and need a critical revision. Interspecies hybridization, found in Eurasian Spermophilus, can affect the results of reconstruction of molecular phylogeny. The position of the Alashan ground squirrel on the phylogenetic tree needs clarification. We analyzed eight nucleotide sequences of the cytb gene of S. alashanicus and 127 sequences of other Spermophilus species from GenBank. A close phylogenetic relationship between S. alashanicus and S. dauricus, and their affinity to ancestral forms of the group, are revealed. Monophyly of the Colobotis subgenus was confirmed. Paraphyly of eastern and western forms of S. relictus was shown. PMID:26107897

  15. Phylogenetic revision of Minyomerus Horn, 1876 sec. Jansen & Franz, 2015 (Coleoptera, Curculionidae) using taxonomic concept annotations and alignments

    PubMed Central

    Jansen, M. Andrew; Franz, Nico M.

    2015-01-01

    Abstract This contribution adopts the taxonomic concept annotation and alignment approach. Accordingly, and where indicated, previous and newly inferred meanings of taxonomic names are individuated according to one specific source. Articulations among these concepts and pairwise, logically consistent alignments of original and revisionary classifications are also provided, in addition to conventional nomenclatural provenance information. A phylogenetic revision of the broad-nosed weevil genera Minyomerus Horn, 1876 sec. O’Brien & Wibmer (1982), and Piscatopus Sleeper, 1960 sec. O’Brien & Wibmer (1982) (Curculionidae [non-focal]: Entiminae [non-focal]: Tanymecini [non-focal]) is presented. Prior to this study, Minyomerus sec. O’Brien & Wibmer (1982) contained seven species, whereas the monotypic Piscatopus sec. O’Brien & Wibmer (1982) was comprised solely of Piscatopus griseus Sleeper, 1960 sec. O’Brien & Wibmer (1982). We thoroughly redescribe these recognized species-level entities and furthermore describe ten species as new to science: Minyomerus bulbifrons sec. Jansen & Franz (2015) (henceforth: [JF2015]), sp. n., Minyomerus aeriballux [JF2015], sp. n., Minyomerus cracens [JF2015], sp. n., Minyomerus gravivultus [JF2015], sp. n., Minyomerus imberbus [JF2015], sp. n., Minyomerus reburrus [JF2015], sp. n., Minyomerus politus [JF2015], sp. n., Minyomerus puticulatus [JF2015], sp. n., Minyomerus rutellirostris [JF2015], sp. n., and Minyomerus trisetosus [JF2015], sp. n. A cladistic analysis using 46 morphological characters of 22 terminal taxa (5/17 outgroup/ingroup) yielded a single most-parsimonious cladogram (L = 82, CI = 65, RI = 82). The analysis strongly supports the monophyly of Minyomerus [JF2015] with eight unreversed synapomorphies, and places Piscatopus griseus sec. O’Brien & Wibmer (1982) within the genus as sister to Minyomerus rutellirostris [JF2015]. Accordingly, Piscatopus sec. Sleeper (1960), syn. n. is changed to junior synonymy of

  16. Phylogenetic revision of Minyomerus Horn, 1876 sec. Jansen & Franz, 2015 (Coleoptera, Curculionidae) using taxonomic concept annotations and alignments.

    PubMed

    Jansen, M Andrew; Franz, Nico M

    2015-01-01

    This contribution adopts the taxonomic concept annotation and alignment approach. Accordingly, and where indicated, previous and newly inferred meanings of taxonomic names are individuated according to one specific source. Articulations among these concepts and pairwise, logically consistent alignments of original and revisionary classifications are also provided, in addition to conventional nomenclatural provenance information. A phylogenetic revision of the broad-nosed weevil genera Minyomerus Horn, 1876 sec. O'Brien & Wibmer (1982), and Piscatopus Sleeper, 1960 sec. O'Brien & Wibmer (1982) (Curculionidae [non-focal]: Entiminae [non-focal]: Tanymecini [non-focal]) is presented. Prior to this study, Minyomerus sec. O'Brien & Wibmer (1982) contained seven species, whereas the monotypic Piscatopus sec. O'Brien & Wibmer (1982) was comprised solely of Piscatopus griseus Sleeper, 1960 sec. O'Brien & Wibmer (1982). We thoroughly redescribe these recognized species-level entities and furthermore describe ten species as new to science: Minyomerus bulbifrons sec. Jansen & Franz (2015) (henceforth: [JF2015]), sp. n., Minyomerus aeriballux [JF2015], sp. n., Minyomerus cracens [JF2015], sp. n., Minyomerus gravivultus [JF2015], sp. n., Minyomerus imberbus [JF2015], sp. n., Minyomerus reburrus [JF2015], sp. n., Minyomerus politus [JF2015], sp. n., Minyomerus puticulatus [JF2015], sp. n., Minyomerus rutellirostris [JF2015], sp. n., and Minyomerus trisetosus [JF2015], sp. n. A cladistic analysis using 46 morphological characters of 22 terminal taxa (5/17 outgroup/ingroup) yielded a single most-parsimonious cladogram (L = 82, CI = 65, RI = 82). The analysis strongly supports the monophyly of Minyomerus [JF2015] with eight unreversed synapomorphies, and places Piscatopus griseus sec. O'Brien & Wibmer (1982) within the genus as sister to Minyomerus rutellirostris [JF2015]. Accordingly, Piscatopus sec. Sleeper (1960), syn. n. is changed to junior synonymy of Minyomerus [JF2015], and

  17. Extreme convergence in stick insect evolution: phylogenetic placement of the Lord Howe Island tree lobster

    PubMed Central

    Buckley, Thomas R.; Attanayake, Dilini; Bradler, Sven

    2008-01-01

    The ‘tree lobsters’ are an enigmatic group of robust, ground-dwelling stick insects (order Phasmatodea) from the subfamily Eurycanthinae, distributed in New Guinea, New Caledonia and associated islands. Its most famous member is the Lord Howe Island stick insect Dryococelus australis (Montrouzier), which was believed to have become extinct but was rediscovered in 2001 and is considered to be one of the rarest insects in the world. To resolve the evolutionary position of Dryococelus, we constructed a phylogeny from approximately 2.4 kb of mitochondrial and nuclear sequence data from representatives of all major phasmatodean lineages. Our data placed Dryococelus and the New Caledonian tree lobsters outside the New Guinean Eurycanthinae as members of an unrelated Australasian stick insect clade, the Lanceocercata. These results suggest a convergent origin of the ‘tree lobster’ body form. Our reanalysis of tree lobster characters provides additional support for our hypothesis of convergent evolution. We conclude that the phenotypic traits leading to the traditional classification are convergent adaptations to ground-living behaviour. Our molecular dating analyses indicate an ancient divergence (more than 22 Myr ago) between Dryococelus and its Australian relatives. Hence, Dryococelus represents a long-standing separate evolutionary lineage within the stick insects and must be regarded as a key taxon to protect with respect to phasmatodean diversity. PMID:19129110

  18. An Efficient Independence Sampler for Updating Branches in Bayesian Markov chain Monte Carlo Sampling of Phylogenetic Trees.

    PubMed

    Aberer, Andre J; Stamatakis, Alexandros; Ronquist, Fredrik

    2016-01-01

    Sampling tree space is the most challenging aspect of Bayesian phylogenetic inference. The sheer number of alternative topologies is problematic by itself. In addition, the complex dependency between branch lengths and topology increases the difficulty of moving efficiently among topologies. Current tree proposals are fast but sample new trees using primitive transformations or re-mappings of old branch lengths. This reduces acceptance rates and presumably slows down convergence and mixing. Here, we explore branch proposals that do not rely on old branch lengths but instead are based on approximations of the conditional posterior. Using a diverse set of empirical data sets, we show that most conditional branch posteriors can be accurately approximated via a [Formula: see text] distribution. We empirically determine the relationship between the logarithmic conditional posterior density, its derivatives, and the characteristics of the branch posterior. We use these relationships to derive an independence sampler for proposing branches with an acceptance ratio of ~90% on most data sets. This proposal samples branches between 2× and 3× more efficiently than traditional proposals with respect to the effective sample size per unit of runtime. We also compare the performance of standard topology proposals with hybrid proposals that use the new independence sampler to update those branches that are most affected by the topological change. Our results show that hybrid proposals can sometimes noticeably decrease the number of generations necessary for topological convergence. Inconsistent performance gains indicate that branch updates are not the limiting factor in improving topological convergence for the currently employed set of proposals. However, our independence sampler might be essential for the construction of novel tree proposals that apply more radical topology changes. PMID:26231183
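
    To illustrate the general idea of an independence sampler mentioned above: candidates are drawn from a fixed proposal q that ignores the current state, and a candidate t' is accepted with probability min(1, [p(t')/q(t')] / [p(t)/q(t)]), where p is the target density. The sketch below applies this to a single positive branch length. The target density is a toy function, and because the source elides the approximating distribution ("[Formula: see text]"), the exponential proposal used here is an arbitrary stand-in, not the authors' approximation. All names and constants are assumptions.

```python
import math
import random

def log_target(t):
    """Toy unnormalized log posterior for a branch length t > 0 (assumption)."""
    return 2.0 * math.log(t) - 20.0 * t

PROPOSAL_RATE = 10.0  # rate of the stand-in exponential proposal (assumption)

def log_proposal(t):
    """Log density of the exponential proposal at t."""
    return math.log(PROPOSAL_RATE) - PROPOSAL_RATE * t

def independence_sampler(n_steps, seed=0):
    """Run an independence Metropolis-Hastings chain; return samples and acceptance rate."""
    rng = random.Random(seed)
    t = 0.1  # arbitrary starting branch length
    samples, accepted = [], 0
    for _ in range(n_steps):
        cand = rng.expovariate(PROPOSAL_RATE)  # does not depend on the current t
        if cand > 0.0:
            # Ratio of importance weights p/q for candidate versus current state
            log_ratio = (log_target(cand) - log_proposal(cand)) - (
                log_target(t) - log_proposal(t)
            )
            if rng.random() < math.exp(min(0.0, log_ratio)):
                t = cand
                accepted += 1
        samples.append(t)
    return samples, accepted / n_steps

samples, rate = independence_sampler(20000)
print(f"mean branch length: {sum(samples) / len(samples):.3f}")
print(f"acceptance rate: {rate:.2f}")
```

    The closer the proposal matches the target, the closer the acceptance rate gets to 1, which is the motivation for approximating the conditional posterior well.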

  19. PHYLOViZ Online: web-based tool for visualization, phylogenetic inference, analysis and sharing of minimum spanning trees.

    PubMed

    Ribeiro-Gonçalves, Bruno; Francisco, Alexandre P; Vaz, Cátia; Ramirez, Mário; Carriço, João André

    2016-07-01

    High-throughput sequencing methods generated allele and single nucleotide polymorphism information for thousands of bacterial strains that are publicly available in online repositories and created the possibility of generating similar information for hundreds to thousands of strains more in a single study. Minimum spanning tree analysis of allelic data offers a scalable and reproducible methodological alternative to traditional phylogenetic inference approaches, useful in epidemiological investigations and population studies of bacterial pathogens. PHYLOViZ Online was developed to allow users to do these analyses without software installation and to enable easy accessing and sharing of data and analyses results from any Internet enabled computer. PHYLOViZ Online also offers a RESTful API for programmatic access to data and algorithms, allowing it to be seamlessly integrated into any third party web service or software. PHYLOViZ Online is freely available at https://online.phyloviz.net. PMID:27131357
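
    As a rough illustration of minimum spanning tree analysis of allelic data, the sketch below links a few made-up allelic profiles using the number of differing loci (Hamming distance) as the edge weight and Prim's algorithm to build the tree. The profile values, strain names, and function names are hypothetical. This is a generic textbook approach and should not be read as PHYLOViZ Online's own algorithm or API.

```python
def hamming(a, b):
    """Number of loci at which two allelic profiles differ."""
    return sum(x != y for x, y in zip(a, b))

def minimum_spanning_tree(profiles):
    """Prim's algorithm over all pairwise Hamming distances.

    profiles: dict strain name -> tuple of allele numbers (same loci order).
    Returns a list of (strain_in_tree, new_strain, distance) edges.
    """
    names = sorted(profiles)
    in_tree = {names[0]}
    edges = []
    while len(in_tree) < len(names):
        # Cheapest edge from the current tree to any strain not yet in it
        dist, u, v = min(
            (hamming(profiles[u], profiles[v]), u, v)
            for u in in_tree
            for v in names
            if v not in in_tree
        )
        edges.append((u, v, dist))
        in_tree.add(v)
    return edges

# Hypothetical seven-locus allelic profiles
profiles = {
    "ST1": (1, 1, 1, 1, 1, 1, 1),
    "ST2": (1, 1, 1, 1, 1, 1, 2),
    "ST3": (1, 1, 1, 1, 1, 3, 2),
    "ST4": (4, 4, 1, 1, 1, 1, 1),
}

for u, v, d in minimum_spanning_tree(profiles):
    print(f"{u} -- {v} (differs at {d} loci)")
```

    Computing all pairwise distances is quadratic in the number of strains, so a tool handling thousands of strains would need a more careful implementation than this sketch.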

  20. The Deinococcus-Thermus phylum and the effect of rRNA composition on phylogenetic tree construction

    NASA Technical Reports Server (NTRS)

    Weisburg, W. G.; Giovannoni, S. J.; Woese, C. R.

    1989-01-01

    Through comparative analysis of 16S ribosomal RNA sequences, it can be shown that two seemingly dissimilar types of eubacteria, Deinococcus and the ubiquitous hot spring organism Thermus, are distantly but specifically related to one another. This confirms an earlier report based upon 16S rRNA oligonucleotide cataloging studies (Hensel et al., 1986). Their two lineages form a distinctive grouping within the eubacteria that deserves the taxonomic status of a phylum. The (partial) sequence of T. aquaticus rRNA appears relatively close to those of other thermophilic eubacteria, e.g. Thermotoga maritima and Thermomicrobium roseum. However, this closeness does not reflect a true evolutionary closeness; rather it is due to a "thermophilic convergence", the result of unusually high G+C composition in the rRNAs of thermophilic bacteria. Unless such compositional biases are taken into account, the branching order and root of phylogenetic trees can be incorrectly inferred.

  1. PHYLOViZ Online: web-based tool for visualization, phylogenetic inference, analysis and sharing of minimum spanning trees

    PubMed Central

    Ribeiro-Gonçalves, Bruno; Francisco, Alexandre P.; Vaz, Cátia; Ramirez, Mário; Carriço, João André

    2016-01-01

    High-throughput sequencing methods generated allele and single nucleotide polymorphism information for thousands of bacterial strains that are publicly available in online repositories and created the possibility of generating similar information for hundreds to thousands of strains more in a single study. Minimum spanning tree analysis of allelic data offers a scalable and reproducible methodological alternative to traditional phylogenetic inference approaches, useful in epidemiological investigations and population studies of bacterial pathogens. PHYLOViZ Online was developed to allow users to do these analyses without software installation and to enable easy accessing and sharing of data and analyses results from any Internet enabled computer. PHYLOViZ Online also offers a RESTful API for programmatic access to data and algorithms, allowing it to be seamlessly integrated into any third party web service or software. PHYLOViZ Online is freely available at https://online.phyloviz.net. PMID:27131357

  2. A metacalibrated time-tree documents the early rise of flowering plant phylogenetic diversity.

    PubMed

    Magallón, Susana; Gómez-Acevedo, Sandra; Sánchez-Reyes, Luna L; Hernández-Hernández, Tania

    2015-07-01

    The establishment of modern terrestrial life is indissociable from angiosperm evolution. While available molecular clock estimates of angiosperm age range from the Paleozoic to the Late Cretaceous, the fossil record is consistent with angiosperm diversification in the Early Cretaceous. The time-frame of angiosperm evolution is here estimated using a sample representing 87% of families and sequences of five plastid and nuclear markers, implementing penalized likelihood and Bayesian relaxed clocks. A literature-based review of the palaeontological record yielded calibrations for 137 phylogenetic nodes. The angiosperm crown age was bound within a confidence interval calculated with a method that considers the fossil record of the group. An Early Cretaceous crown angiosperm age was estimated with high confidence. Magnoliidae, Monocotyledoneae and Eudicotyledoneae diversified synchronously 135-130 million yr ago (Ma); Pentapetalae is 126-121 Ma; and Rosidae (123-115 Ma) preceded Asteridae (119-110 Ma). Family stem ages are continuously distributed between c. 140 and 20 Ma. This time-frame documents an early phylogenetic proliferation that led to the establishment of major angiosperm lineages, and the origin of over half of extant families, in the Cretaceous. While substantial amounts of angiosperm morphological and functional diversity have deep evolutionary roots, extant species richness was probably acquired later. PMID:25615647

  3. Phylogenetic assemblage structure of North American trees is more strongly shaped by glacial-interglacial climate variability in gymnosperms than in angiosperms.

    PubMed

    Ma, Ziyu; Sandel, Brody; Svenning, Jens-Christian

    2016-05-01

    How fast does biodiversity respond to climate change? The relationship of past and current climate with phylogenetic assemblage structure helps us to understand this question. Studies of angiosperm tree diversity in North America have already suggested effects of current water-energy balance and tropical niche conservatism. However, the role of glacial-interglacial climate variability remains to be determined, and little is known about any of these relationships for gymnosperms. Moreover, phylogenetic endemism, the concentration of unique lineages in restricted ranges, may also be related to glacial-interglacial climate variability and needs more attention. We used a refined phylogeny of both angiosperms and gymnosperms to map phylogenetic diversity, clustering and endemism of North American trees in 100-km grid cells, and climate change velocity since the Last Glacial Maximum together with postglacial accessibility to recolonization to quantify glacial-interglacial climate variability. We found: (1) Current climate is the dominant factor explaining the overall patterns, with more clustered angiosperm assemblages toward lower temperature, consistent with tropical niche conservatism. (2) Long-term climate stability is associated with higher angiosperm endemism, while higher postglacial accessibility is linked to more phylogenetic clustering and endemism in gymnosperms. (3) Factors linked to glacial-interglacial climate change have stronger effects on gymnosperms than on angiosperms. These results suggest that paleoclimate legacies supplement current climate in shaping phylogenetic patterns in North American trees, and especially so for gymnosperms. PMID:27252830

  4. Molecular Dissection of the Basal Clades in the Human Y Chromosome Phylogenetic Tree

    PubMed Central

    Scozzari, Rosaria; Massaia, Andrea; D’Atanasio, Eugenia; Myres, Natalie M.; Perego, Ugo A.; Trombetta, Beniamino; Cruciani, Fulvio

    2012-01-01

    One hundred and forty-six previously detected mutations were more precisely positioned in the human Y chromosome phylogeny by the analysis of 51 representative Y chromosome haplogroups and the use of 59 mutations from literature. Twenty-two new mutations were also described and incorporated in the revised phylogeny. This analysis made it possible to identify new haplogroups and to resolve a deep trifurcation within haplogroup B2. Our data provide a highly resolved branching in the African-specific portion of the Y tree and support the hypothesis of an origin in the north-western quadrant of the African continent for the human MSY diversity. PMID:23145109

  5. Predicting MicroRNA Biomarkers for Cancer Using Phylogenetic Tree and Microarray Analysis

    PubMed Central

    Wang, Hsiuying

    2016-01-01

    MicroRNAs (miRNAs) are shown to be involved in the initiation and progression of cancers in the literature, and the expression of miRNAs is used as an important cancer prognostic tool. The aim of this study is to predict high-confidence miRNA biomarkers for cancer. We adopt a method that combines miRNA phylogenetic structure and miRNA microarray data analysis to discover high-confidence miRNA biomarkers for colon, prostate, pancreatic, lung, breast, bladder and kidney cancers. There are 53 miRNAs selected through this method that either have potential to involve a single cancer’s development or to involve several cancers’ development. These miRNAs can be used as high-confidence miRNA biomarkers of these seven investigated cancers for further experiment validation. miR-17, miR-20, miR-106a, miR-106b, miR-92, miR-25, miR-16, miR-195 and miR-143 are selected to involve a single cancer’s development in these seven cancers. They have the potential to be useful miRNA biomarkers when the result can be confirmed by experiments. PMID:27213352

  6. Optimization strategies for fast detection of positive selection on phylogenetic trees

    PubMed Central

    Valle, Mario; Schabauer, Hannes; Pacher, Christoph; Stockinger, Heinz; Stamatakis, Alexandros; Robinson-Rechavi, Marc; Salamin, Nicolas

    2014-01-01

    Motivation: The detection of positive selection is widely used to study gene and genome evolution, but its application remains limited by the high computational cost of existing implementations. We present a series of computational optimizations for more efficient estimation of the likelihood function on large-scale phylogenetic problems. We illustrate our approach using the branch-site model of codon evolution. Results: We introduce novel optimization techniques that substantially outperform both CodeML from the PAML package and our previously optimized sequential version SlimCodeML. These techniques can also be applied to other likelihood-based phylogeny software. Our implementation scales well for large numbers of codons and/or species. It can therefore analyse substantially larger datasets than CodeML. We evaluated FastCodeML on different platforms and measured average sequential speedups of FastCodeML (single-threaded) versus CodeML of up to 5.8, average speedups of FastCodeML (multi-threaded) versus CodeML on a single node (shared memory) of up to 36.9 for 12 CPU cores, and average speedups of the distributed FastCodeML versus CodeML of up to 170.9 on eight nodes (96 CPU cores in total). Availability and implementation: ftp://ftp.vital-it.ch/tools/FastCodeML/. Contact: selectome@unil.ch or nicolas.salamin@unil.ch PMID:24389654
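
    The speedup figures quoted above can be put on a per-core footing with some simple arithmetic. The sketch below uses only the numbers stated in the abstract; the function names are hypothetical. Note that each figure is an "up to" value measured against CodeML, and the abstract does not say whether they come from the same dataset, so the derived ratios are only indicative.

```python
# "Up to" average speedups over CodeML, as quoted in the abstract
SEQUENTIAL = 5.8      # FastCodeML, single-threaded
MULTITHREADED = 36.9  # FastCodeML, 12 CPU cores on one shared-memory node
DISTRIBUTED = 170.9   # FastCodeML, 96 CPU cores across eight nodes

def per_core(speedup, cores):
    """Speedup over CodeML divided by the number of cores used."""
    return speedup / cores

def parallel_efficiency(parallel_speedup, sequential_speedup, cores):
    """Gain over the single-threaded version, as a fraction of ideal scaling.

    Assumes the two speedups are comparable (same datasets), which the
    abstract does not confirm.
    """
    return parallel_speedup / sequential_speedup / cores

print(f"12 cores: {per_core(MULTITHREADED, 12):.2f}x CodeML per core")
print(f"96 cores: {per_core(DISTRIBUTED, 96):.2f}x CodeML per core")
print(f"12-core efficiency: {parallel_efficiency(MULTITHREADED, SEQUENTIAL, 12):.2f}")
print(f"96-core efficiency: {parallel_efficiency(DISTRIBUTED, SEQUENTIAL, 96):.2f}")
```

    Read this way, a substantial share of the reported gain comes from the sequential optimizations, with parallelism multiplying it further.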

  7. Predicting MicroRNA Biomarkers for Cancer Using Phylogenetic Tree and Microarray Analysis.

    PubMed

    Wang, Hsiuying

    2016-01-01

    MicroRNAs (miRNAs) are shown to be involved in the initiation and progression of cancers in the literature, and the expression of miRNAs is used as an important cancer prognostic tool. The aim of this study is to predict high-confidence miRNA biomarkers for cancer. We adopt a method that combines miRNA phylogenetic structure and miRNA microarray data analysis to discover high-confidence miRNA biomarkers for colon, prostate, pancreatic, lung, breast, bladder and kidney cancers. There are 53 miRNAs selected through this method that either have potential to involve a single cancer's development or to involve several cancers' development. These miRNAs can be used as high-confidence miRNA biomarkers of these seven investigated cancers for further experiment validation. miR-17, miR-20, miR-106a, miR-106b, miR-92, miR-25, miR-16, miR-195 and miR-143 are selected to involve a single cancer's development in these seven cancers. They have the potential to be useful miRNA biomarkers when the result can be confirmed by experiments. PMID:27213352

  8. A phylogenetic framework for evolutionary study of the nightshades (Solanaceae): a dated 1000-tip tree

    PubMed Central

    2013-01-01

    Background The Solanaceae is a plant family of great economic importance. Despite a wealth of phylogenetic work on individual clades and a deep knowledge of particular cultivated species such as tomato and potato, a robust evolutionary framework with a dated molecular phylogeny for the family is still lacking. Here we investigate molecular divergence times for Solanaceae using a densely-sampled species-level phylogeny. We also review the fossil record of the family to derive robust calibration points, and estimate a chronogram using an uncorrelated relaxed molecular clock. Results Our densely-sampled phylogeny shows strong support for all previously identified clades of Solanaceae and strongly supported relationships between the major clades, particularly within Solanum. The Tomato clade is shown to be sister to section Petota, and the Regmandra clade is the first branching member of the Potato clade. The minimum age estimates for major splits within the family provided here correspond well with results from previous studies, indicating splits between tomato & potato around 8 Million years ago (Ma) with a 95% highest posterior density (HPD) 7–10 Ma, Solanum & Capsicum c. 19 Ma (95% HPD 17–21), and Solanum & Nicotiana c. 24 Ma (95% HPD 23–26). Conclusions Our large time-calibrated phylogeny provides a significant step towards completing a fully sampled species-level phylogeny for Solanaceae, and provides age estimates for the whole family. The chronogram now includes 40% of known species and all but two monotypic genera, and is one of the best sampled angiosperm family phylogenies both in terms of taxon sampling and resolution published thus far. The increased resolution in the chronogram combined with the large increase in species sampling will provide much needed data for the examination of many biological questions using Solanaceae as a model system. PMID:24283922

  9. Probabilistic phylogenetic inference with insertions and deletions.

    PubMed

    Rivas, Elena; Eddy, Sean R

    2008-01-01

    A fundamental task in sequence analysis is to calculate the probability of a multiple alignment given a phylogenetic tree relating the sequences and an evolutionary model describing how sequences change over time. However, the most widely used phylogenetic models only account for residue substitution events. We describe a probabilistic model of a multiple sequence alignment that accounts for insertion and deletion events in addition to substitutions, given a phylogenetic tree, using a rate matrix augmented by the gap character. Starting from a continuous Markov process, we construct a non-reversible generative (birth-death) evolutionary model for insertions and deletions. The model assumes that insertion and deletion events occur one residue at a time. We apply this model to phylogenetic tree inference by extending the program dnaml in phylip. Using standard benchmarking methods on simulated data and a new "concordance test" benchmark on real ribosomal RNA alignments, we show that the extended program dnamlepsilon improves accuracy relative to the usual approach of ignoring gaps, while retaining the computational efficiency of the Felsenstein peeling algorithm. PMID:18787703
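
    The idea of treating the gap as an extra character inside the peeling recursion can be sketched as follows. This is a minimal illustration, not the dnamlepsilon model: it assumes a reversible equal-rates five-state model (A, C, G, T and the gap) with a closed-form transition probability and a uniform root distribution, whereas the abstract describes a non-reversible birth-death model. The tree encoding (nested tuples of child and branch-length pairs) and all function names are hypothetical.

```python
import math

STATES = "ACGT-"  # the gap is treated as a fifth character state


def transition_prob(i, j, t, rate=1.0):
    """Closed-form P_ij(t) for an equal-rates k-state model (an assumption)."""
    k = len(STATES)
    decay = math.exp(-k * rate * t)
    if i == j:
        return 1.0 / k + (k - 1) / k * decay
    return 1.0 / k - decay / k


def partial_likelihoods(node):
    """Peeling step: P(leaves below node | node is in state s), for each s."""
    if isinstance(node, str):  # a leaf holds the observed character at this site
        return [1.0 if s == node else 0.0 for s in STATES]
    result = [1.0] * len(STATES)
    for child, branch_length in node:  # internal node: (child, length) pairs
        child_partials = partial_likelihoods(child)
        for i in range(len(STATES)):
            result[i] *= sum(
                transition_prob(i, j, branch_length) * child_partials[j]
                for j in range(len(STATES))
            )
    return result


def site_likelihood(tree):
    """Likelihood of one alignment column, with a uniform root distribution."""
    return sum(p / len(STATES) for p in partial_likelihoods(tree))


# One column of a three-sequence alignment: A, A and a gap.
tree = (("A", 0.1), ((("A", 0.05), ("-", 0.05)), 0.1))
print(site_likelihood(tree))
```

    Each node is visited once and does a fixed amount of work per state pair, so the cost stays linear in the number of sequences, which is the efficiency property the abstract says is retained.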

  10. A classification of the Chloridoideae (Poaceae) based on multi-gene phylogenetic trees.

    PubMed

    Peterson, Paul M; Romaschenko, Konstantin; Johnson, Gabriel

    2010-05-01

    We conducted a molecular phylogenetic study of the subfamily Chloridoideae using six plastid DNA sequences (ndhA intron, ndhF, rps16-trnK, rps16 intron, rps3, and rpl32-trnL) and a single nuclear ITS DNA sequence. Our large original data set includes 246 species (17.3%) representing 95 genera (66%) of the grasses currently placed in the Chloridoideae. The maximum likelihood and Bayesian analysis of DNA sequences provides strong support for the monophyly of the Chloridoideae; followed by, in order of divergence: a Triraphideae clade with Neyraudia sister to Triraphis; an Eragrostideae clade with the Cotteinae (includes Cottea and Enneapogon) sister to the Uniolinae (includes Entoplocamia, Tetrachne, and Uniola), and a terminal Eragrostidinae clade of Ectrosia, Harpachne, and Psammagrostis embedded in a polyphyletic Eragrostis; a Zoysieae clade with Urochondra sister to a Zoysiinae (Zoysia) clade, and a terminal Sporobolinae clade that includes Spartina, Calamovilfa, Pogoneura, and Crypsis embedded in a polyphyletic Sporobolus; and a very large terminal Cynodonteae clade that includes 13 monophyletic subtribes. The Cynodonteae includes, in alphabetical order: Aeluropodinae (Aeluropus); Boutelouinae (Bouteloua); Eleusininae (includes Apochiton, Astrebla with Schoenefeldia embedded, Austrochloris, Brachyachne, Chloris, Cynodon with Brachyachne embedded in part, Eleusine, Enteropogon with Eustachys embedded in part, Eustachys, Chrysochloa, Coelachyrum, Leptochloa with Dinebra embedded, Lepturus, Lintonia, Microchloa, Saugetia, Schoenefeldia, Sclerodactylon, Tetrapogon, and Trichloris); Hilariinae (Hilaria); Monanthochloinae (includes Distichlis, Monanthochloe, and Reederochloa); Muhlenbergiinae (Muhlenbergia with Aegopogon, Bealia, Blepharoneuron, Chaboissaea, Lycurus, Pereilema, Redfieldia, Schaffnerella, and Schedonnardus all embedded); Orcuttiinae (includes Orcuttia and Tuctoria); Pappophorinae (includes Neesiochloa and Pappophorum); Scleropogoninae (includes

  11. Phylogenetic analysis of otospiralin protein

    PubMed Central

    Torktaz, Ibrahim; Behjati, Mohaddeseh; Rostami, Amin

    2016-01-01

    Background: The fibrocyte-specific protein otospiralin is a small protein widely expressed in the central nervous system, in neuronal cell bodies and glia. The increased expression of otospiralin in reactive astrocytes implicates its role in signaling pathways and reparative mechanisms subsequent to injury. Indeed, otospiralin is considered to be essential for the survival of fibrocytes of the mesenchymal nonsensory regions of the cochlea. It seems that other functions of this protein are not yet completely understood. Materials and Methods: Amino acid sequences of otospiralin from 12 vertebrates were retrieved from the National Center for Biotechnology Information database. Phylogenetic analysis and phylogeny estimation were performed using the MEGA 5.0.5 program, and a neighbor-joining tree was constructed with this software. Results: In this computational study, the phylogenetic tree of otospiralin was investigated and dendrograms of otospiralin were depicted. Alignment was performed with the MUSCLE method using the UPGMB algorithm. An entropy plot was also determined to better illustrate amino acid variations in this protein. Conclusion: In the present study, we used otospiralin sequences from 12 different species and, by constructing a phylogenetic tree, suggested an outgroup for some related species. PMID:27099854
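
    An entropy plot of the kind mentioned above is usually built from the Shannon entropy of each alignment column. The sketch below shows one way to compute that profile; the sequences are made-up toy data rather than otospiralin, gap characters are simply counted as another symbol, and the function names are illustrative.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of the residues in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def entropy_profile(alignment):
    """Entropy of every column of an alignment given as equal-length strings."""
    return [column_entropy(column) for column in zip(*alignment)]


# Toy alignment (hypothetical sequences): column 1 is conserved, 2 and 3 vary.
toy_alignment = ["MKV", "MRV", "MKL", "MRL"]
profile = entropy_profile(toy_alignment)
print(profile)
```

    Conserved columns score zero and the value rises with residue variability, so plotting the profile against column position highlights the variable regions of the protein.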

  12. A hybrid phylogenetic-phylogenomic approach for species tree estimation in African Agama lizards with applications to biogeography, character evolution, and diversification.

    PubMed

    Leaché, Adam D; Wagner, Philipp; Linkem, Charles W; Böhme, Wolfgang; Papenfuss, Theodore J; Chong, Rebecca A; Lavin, Brian R; Bauer, Aaron M; Nielsen, Stuart V; Greenbaum, Eli; Rödel, Mark-Oliver; Schmitz, Andreas; LeBreton, Matthew; Ineich, Ivan; Chirio, Laurent; Ofori-Boateng, Caleb; Eniang, Edem A; Baha El Din, Sherif; Lemmon, Alan R; Burbrink, Frank T

    2014-10-01

    Africa is renowned for its biodiversity and endemicity, yet little is known about the factors shaping them across the continent. African Agama lizards (45 species) have a pan-continental distribution, making them an ideal model for investigating biogeography. Many species have evolved conspicuous sexually dimorphic traits, including extravagant breeding coloration in adult males, large adult male body sizes, and variability in social systems among colorful versus drab species. We present a comprehensive time-calibrated species tree for Agama, and their close relatives, using a hybrid phylogenetic-phylogenomic approach that combines traditional Sanger sequence data from five loci for 57 species (146 samples) with anchored phylogenomic data from 215 nuclear genes for 23 species. The Sanger data are analyzed using coalescent-based species tree inference using (*)BEAST, and the resulting posterior distribution of species trees is attenuated using the phylogenomic tree as a backbone constraint. The result is a time-calibrated species tree for Agama that includes 95% of all species, multiple samples for most species, strong support for the major clades, and strong support for most of the initial divergence events. Diversification within Agama began approximately 23 million years ago (Ma), and separate radiations in Southern, East, West, and Northern Africa have been diversifying for >10Myr. A suite of traits (morphological, coloration, and sociality) are tightly correlated and show a strong signal of high morphological disparity within clades, whereby the subsequent evolution of convergent phenotypes has accompanied diversification into new biogeographic areas. PMID:24973715

  13. Phylogenetic position of the genus Perkinsus (Protista, Apicomplexa) based on small subunit ribosomal RNA.

    PubMed

    Goggin, C L; Barker, S C

    1993-07-01

    Parasites of the genus Perkinsus destroy marine molluscs worldwide. Their phylogenetic position within the kingdom Protista is controversial. Nucleotide sequence data (1792 bp) from the small subunit rRNA gene of Perkinsus sp. from Anadara trapezia (Mollusca: Bivalvia) from Moreton Bay, Queensland, was used to examine the phylogenetic affinities of this enigmatic genus. These data were aligned with nucleotide sequences from 6 apicomplexans, 3 ciliates, 3 flagellates, a dinoflagellate, 3 fungi, maize and human. Phylogenetic trees were constructed after analysis with maximum parsimony and distance matrix methods. Our analyses indicate that Perkinsus is phylogenetically closer to dinoflagellates and to coccidean and piroplasm apicomplexans than to fungi or flagellates. PMID:8366895

  14. Ultra-large alignments using phylogeny-aware profiles.

    PubMed

    Nguyen, Nam-Phuong D; Mirarab, Siavash; Kumar, Keerthana; Warnow, Tandy

    2015-01-01

    Many biological questions, including the estimation of deep evolutionary histories and the detection of remote homology between protein sequences, rely upon multiple sequence alignments and phylogenetic trees of large datasets. However, accurate large-scale multiple sequence alignment is very difficult, especially when the dataset contains fragmentary sequences. We present UPP, a multiple sequence alignment method that uses a new machine learning technique, the ensemble of hidden Markov models, which we propose here. UPP produces highly accurate alignments for both nucleotide and amino acid sequences, even on ultra-large datasets or datasets containing fragmentary sequences. UPP is available at https://github.com/smirarab/sepp . PMID:26076734

  15. Trees

    ERIC Educational Resources Information Center

    Al-Khaja, Nawal

    2007-01-01

    This is a thematic lesson plan for young learners about palm trees and the importance of taking care of them. The two part lesson teaches listening, reading and speaking skills. The lesson includes parts of a tree; the modal auxiliary, can; dialogues and a role play activity.

  16. Open Reading Frame Phylogenetic Analysis on the Cloud

    PubMed Central

    2013-01-01

    Phylogenetic analysis has become essential in researching the evolutionary relationships between viruses. These relationships are depicted on phylogenetic trees, in which viruses are grouped based on sequence similarity. Viral evolutionary relationships are identified from open reading frames rather than from complete sequences. Recently, cloud computing has become popular for developing internet-based bioinformatics tools. Biocloud is an efficient, scalable, and robust bioinformatics computing service. In this paper, we propose a cloud-based open reading frame phylogenetic analysis service. The proposed service integrates the Hadoop framework, virtualization technology, and phylogenetic analysis methods to provide a high-availability, large-scale bioservice. In a case study, we analyze the phylogenetic relationships among Norovirus. Evolutionary relationships are elucidated by aligning different open reading frame sequences. The proposed platform correctly identifies the evolutionary relationships between members of Norovirus. PMID:23671843

  17. Open reading frame phylogenetic analysis on the cloud.

    PubMed

    Hung, Che-Lun; Lin, Chun-Yuan

    2013-01-01

    Phylogenetic analysis has become essential in researching the evolutionary relationships between viruses. These relationships are depicted on phylogenetic trees, in which viruses are grouped based on sequence similarity. Viral evolutionary relationships are identified from open reading frames rather than from complete sequences. Recently, cloud computing has become popular for developing internet-based bioinformatics tools. Biocloud is an efficient, scalable, and robust bioinformatics computing service. In this paper, we propose a cloud-based open reading frame phylogenetic analysis service. The proposed service integrates the Hadoop framework, virtualization technology, and phylogenetic analysis methods to provide a high-availability, large-scale bioservice. In a case study, we analyze the phylogenetic relationships among Norovirus. Evolutionary relationships are elucidated by aligning different open reading frame sequences. The proposed platform correctly identifies the evolutionary relationships between members of Norovirus. PMID:23671843

  18. Explaining forest productivity using tree functional traits and phylogenetic information: two sides of the same coin over evolutionary scale?

    PubMed Central

    Paquette, Alain; Joly, Simon; Messier, Christian

    2015-01-01

    Given evidence that diverse ecosystems provide more services than depauperate ones, much attention has now turned toward finding meaningful and operational diversity indices. We ask two questions: (1) Does phylogenetic diversity contain additional information not explained by functional traits? And (2) What are the strength and nature of the correlation between phylogeny and functional traits according to the evolutionary scale considered? We used data from permanent forest plots of northeastern Canada for which these links have been demonstrated and important functional traits identified. We show that the nature of the relationship between traits and phylogeny varies dramatically among traits, but also according to the evolutionary distance considered. The demonstration that different characters show phylogenetic autocorrelation at different evolutionary depths suggests that phylogenetic content of traits may be too crude to determine whether phylogenies contain relevant information. However, our study provides support for the use of phylogenies to assess ecosystem functioning when key functional traits are unavailable. We also highlight a potentially important contribution of phylogenetics for conservation and the study of the impact of biodiversity loss on ecosystem functioning and the provision of services, given the accumulating evidence that mechanisms promoting diversity effects shift over time to involve different traits. PMID:26140194

  19. PHYLOGENETIC TREE OF 16S RIBOSOMAL RNA SEQUENCES FROM SULFATE-REDUCING BACTERIA IN A SANDY MARINE ENVIRONMENT

    EPA Science Inventory

    Phylogenetic divergence among sulfate-reducing bacteria in an estuarine sediment sample was investigated by PCR amplification and comparison of partial 16S rDNA sequences. Twenty unique 16S rDNA sequences were found, 12 from delta subclass bacteria based on overall sequence simila...

  20. TreeParser-Aided Klee Diagrams Display Taxonomic Clusters in DNA Barcode and Nuclear Gene Datasets

    PubMed Central

    Stoeckle, Mark Y.; Coffran, Cameron

    2013-01-01

    Indicator vector analysis of a nucleotide sequence alignment generates a compact heat map, called a Klee diagram, with potential insight into clustering patterns in evolution. However, so far this approach has examined only mitochondrial cytochrome c oxidase I (COI) DNA barcode sequences. To further explore, we developed TreeParser, a freely-available web-based program that sorts a sequence alignment according to a phylogenetic tree generated from the dataset. We applied TreeParser to nuclear gene and COI barcode alignments from birds and butterflies. Distinct blocks in the resulting Klee diagrams corresponded to species and higher-level taxonomic divisions in both groups, and this enabled graphic comparison of phylogenetic information in nuclear and mitochondrial genes. Our results demonstrate TreeParser-aided Klee diagrams objectively display taxonomic clusters in nucleotide sequence alignments. This approach may help establish taxonomy in poorly studied groups and investigate higher-level clustering which appears widespread but not well understood. PMID:24022383
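
    The core operation described here, reordering alignment rows to follow the leaf order of a tree, can be sketched as below. This is not TreeParser's code: it assumes a Newick string with unquoted leaf names and an alignment held in a dictionary, and the function names and toy data are hypothetical.

```python
import re


def leaf_order(newick):
    """Leaf names in left-to-right order (unquoted Newick names only)."""
    # A leaf name directly follows "(" or ","; internal labels follow ")".
    return re.findall(r"[(,]\s*([^(),:;\s]+)", newick)


def sort_alignment_by_tree(alignment, newick):
    """Return (name, sequence) pairs ordered as the leaves appear in the tree."""
    return [(name, alignment[name]) for name in leaf_order(newick) if name in alignment]


# Toy data (hypothetical names and sequences).
tree = "((sp_A:0.1,sp_B:0.2):0.05,(sp_C:0.1,sp_D:0.1):0.05);"
alignment = {"sp_D": "ACGA", "sp_B": "ACGT", "sp_A": "ACGT", "sp_C": "ACGA"}

for name, sequence in sort_alignment_by_tree(alignment, tree):
    print(name, sequence)
```

    After sorting, closely related sequences sit in adjacent rows, which is what lets a heat map of the alignment show clades as contiguous blocks.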

  1. Statistical significance of normalized global alignment.

    PubMed

    Peris, Guillermo; Marzal, Andrés

    2014-03-01

    The comparison of homologous proteins from different species is a first step toward a function assignment and a reconstruction of the species evolution. Though local alignment is mostly used for this purpose, global alignment is important for constructing multiple alignments or phylogenetic trees. However, statistical significance of global alignments is not completely clear, lacking a specific statistical model to describe alignments or depending on computationally expensive methods like Z-score. Recently we presented a normalized global alignment, defined as the best compromise between global alignment cost and length, and showed that this new technique led to better classification results than Z-score at a much lower computational cost. However, it is necessary to analyze the statistical significance of the normalized global alignment in order to be considered a completely functional algorithm for protein alignment. Experiments with unrelated proteins extracted from the SCOP ASTRAL database showed that normalized global alignment scores can be fitted to a log-normal distribution. This fact, obtained without any theoretical support, can be used to derive statistical significance of normalized global alignments. Results are summarized in a table with fitted parameters for different scoring schemes. PMID:24400820
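
    Once scores from unrelated proteins are known to follow a log-normal distribution, a significance value for a new score follows from the fitted parameters. The sketch below assumes strictly positive scores and an upper-tail test; whether large or small values indicate a better normalized alignment depends on the scoring scheme, so the tail direction is an assumption, as are the function names and the toy background scores.

```python
import math
from statistics import NormalDist, mean, stdev


def fit_lognormal(scores):
    """Estimate (mu, sigma) of log-scores from a background sample."""
    logs = [math.log(score) for score in scores]
    return mean(logs), stdev(logs)


def upper_tail_p_value(score, mu, sigma):
    """P(X >= score) for X log-normal with parameters mu and sigma."""
    return 1.0 - NormalDist(mu, sigma).cdf(math.log(score))


# Toy background sample (hypothetical values, not from SCOP ASTRAL).
background = [math.exp(x) for x in (-1.0, 0.0, 1.0)]
mu, sigma = fit_lognormal(background)
print(mu, sigma, upper_tail_p_value(math.exp(1.96), mu, sigma))
```

    If lower scores indicate better alignments under a given scheme, the lower tail, `NormalDist(mu, sigma).cdf(math.log(score))`, would be the relevant quantity instead.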

  2. On comparing two structured RNA multiple alignments.

    PubMed

    Patel, Vandanaben; Wang, Jason T L; Setia, Shefali; Verma, Anurag; Warden, Charles D; Zhang, Kaizhong

    2010-12-01

    We present a method, called BlockMatch, for aligning two blocks, where a block is an RNA multiple sequence alignment with the consensus secondary structure of the alignment in Stockholm format. The method employs a quadratic-time dynamic programming algorithm for aligning columns and column pairs of the multiple alignments in the blocks. Unlike many other tools that can perform pairwise alignment of either single sequences or structures only, BlockMatch takes into account the characteristics of all the sequences in the blocks along with their consensus structures during the alignment process, thus being able to achieve a high-quality alignment result. We apply BlockMatch to phylogeny reconstruction on a set of 5S rRNA sequences taken from fifteen bacteria species. Experimental results showed that the phylogenetic tree generated by our method is more accurate than the tree constructed based on the widely used ClustalW tool. The BlockMatch algorithm is implemented into a web server, accessible at http://bioinformatics.njit.edu/blockmatch. A jar file of the program is also available for download from the web server. PMID:21121021

  3. Internal Transcribed Spacer 2 (nu ITS2 rRNA) Sequence-Structure Phylogenetics: Towards an Automated Reconstruction of the Green Algal Tree of Life

    PubMed Central

    Buchheim, Mark A.; Keller, Alexander; Koetschan, Christian; Förster, Frank; Merget, Benjamin; Wolf, Matthias

    2011-01-01

    Background Chloroplast-encoded genes (matK and rbcL) have been formally proposed for use in DNA barcoding efforts targeting embryophytes. Extending such a protocol to chlorophytan green algae, though, is fraught with problems including non homology (matK) and heterogeneity that prevents the creation of a universal PCR toolkit (rbcL). Some have advocated the use of the nuclear-encoded, internal transcribed spacer two (ITS2) as an alternative to the traditional chloroplast markers. However, the ITS2 is broadly perceived to be insufficiently conserved or to be confounded by introgression or biparental inheritance patterns, precluding its broad use in phylogenetic reconstruction or as a DNA barcode. A growing body of evidence has shown that simultaneous analysis of nucleotide data with secondary structure information can overcome at least some of the limitations of ITS2. The goal of this investigation was to assess the feasibility of an automated, sequence-structure approach for analysis of ITS2 data from a large sampling of phylum Chlorophyta. Methodology/Principal Findings Sequences and secondary structures from 591 chlorophycean, 741 trebouxiophycean and 938 ulvophycean algae, all obtained from the ITS2 Database, were aligned using a sequence structure-specific scoring matrix. Phylogenetic relationships were reconstructed by Profile Neighbor-Joining coupled with a sequence structure-specific, general time reversible substitution model. Results from analyses of the ITS2 data were robust at multiple nodes and showed considerable congruence with results from published phylogenetic analyses. Conclusions/Significance Our observations on the power of automated, sequence-structure analyses of ITS2 to reconstruct phylum-level phylogenies of the green algae validate this approach to assessing diversity for large sets of chlorophytan taxa. Moreover, our results indicate that objections to the use of ITS2 for DNA barcoding should be weighed against the utility of an automated

  4. Evolutionary history of the Afro-Madagascan Ixora species (Rubiaceae): species diversification and distribution of key morphological traits inferred from dated molecular phylogenetic trees

    PubMed Central

    Tosh, J.; Dessein, S.; Buerki, S.; Groeninckx, I.; Mouly, A.; Bremer, B.; Smets, E. F.; De Block, P.

    2013-01-01

    Background and Aims Previous work on the pantropical genus Ixora has revealed an Afro-Madagascan clade, but as yet no study has focused in detail on the evolutionary history and morphological trends in this group. Here the evolutionary history of Afro-Madagascan Ixora spp. (a clade of approx. 80 taxa) is investigated and the phylogenetic trees compared with several key morphological traits in taxa occurring in Madagascar. Methods Phylogenetic relationships of Afro-Madagascan Ixora are assessed using sequence data from four plastid regions (petD, rps16, rpoB-trnC and trnL-trnF) and nuclear ribosomal external transcribed spacer (ETS) and internal transcribed spacer (ITS) regions. The phylogenetic distribution of key morphological characters is assessed. Bayesian inference (implemented in BEAST) is used to estimate the temporal origin of Ixora based on fossil evidence. Key Results Two separate lineages of Madagascan taxa are recovered, one of which is nested in a group of East African taxa. Divergence in Ixora is estimated to have commenced during the mid Miocene, with extensive cladogenesis occurring in the Afro-Madagascan clade during the Pliocene onwards. Conclusions Both lineages of Madagascan Ixora exhibit morphological innovations that are rare throughout the rest of the genus, including a trend towards pauciflorous inflorescences and a trend towards extreme corolla tube length, suggesting that the same ecological and selective pressures are acting upon taxa from both Madagascan lineages. Novel ecological opportunities resulting from climate-induced habitat fragmentation and corolla tube length diversification are likely to have facilitated species radiation on Madagascar. PMID:24142919

  5. Three Phylogenetic Groups of nodA and nifH Genes in Sinorhizobium and Mesorhizobium Isolates from Leguminous Trees Growing in Africa and Latin America

    PubMed Central

    Haukka, Kaisa; Lindström, Kristina; Young, J. Peter W.

    1998-01-01

    The diversity and phylogeny of nodA and nifH genes were studied by using 52 rhizobial isolates from Acacia senegal, Prosopis chilensis, and related leguminous trees growing in Africa and Latin America. All of the strains had similar host ranges and belonged to the genera Sinorhizobium and Mesorhizobium, as previously determined by 16S rRNA gene sequence analysis. The restriction patterns and a sequence analysis of the nodA and nifH genes divided the strains into the following three distinct groups: sinorhizobia from Africa, sinorhizobia from Latin America, and mesorhizobia from both regions. In a phylogenetic tree also containing previously published sequences, the nodA genes of our rhizobia formed a branch of their own, but within the branch no correlation between symbiotic genes and host trees was apparent. Within the large group of African sinorhizobia, similar symbiotic gene types were found in different chromosomal backgrounds, suggesting that transfer of symbiotic genes has occurred across species boundaries. Most strains had plasmids, and the presence of plasmid-borne nifH was demonstrated by hybridization for some examples. The nodA and nifH genes of Sinorhizobium teranga ORS1009T grouped with the nodA and nifH genes of the other African sinorhizobia, but Sinorhizobium saheli ORS609T had a totally different nodA sequence, although it was closely related based on the 16S rRNA gene and nifH data. This might be because this S. saheli strain was originally isolated from Sesbania sp., which belongs to a different cross-nodulation group than Acacia and Prosopis spp. The factors that appear to have influenced the evolution of rhizobial symbiotic genes vary in importance at different taxonomic levels. PMID:9464375

  6. The Tree versus the Forest: The Fungal Tree of Life and the Topological Diversity within the Yeast Phylome

    PubMed Central

    Marcet-Houben, Marina; Gabaldón, Toni

    2009-01-01

    A recurrent topic in phylogenomics is the combination of various sequence alignments to reconstruct a tree that describes the evolutionary relationships within a group of species. However, such an approach has been criticized for not being able to properly represent the topological diversity found among gene trees. To evaluate the representativeness of species trees based on concatenated alignments, we reconstruct several fungal species trees and compare them with the complete collection of phylogenies of genes encoded in the Saccharomyces cerevisiae genome. We found that, despite high levels of among-gene topological variation, the species trees do represent widely supported phylogenetic relationships. Most topological discrepancies between gene and species trees are concentrated in certain conflicting nodes. We propose to map such information on the species tree so that it accounts for the levels of congruence across the genome. We identified the lack of sufficient accuracy of current alignment and phylogenetic methods as an important source for the topological diversity encountered among gene trees. Finally, we discuss the implications of the high levels of topological variation for phylogeny-based orthology prediction strategies. PMID:19190756

  7. The Eukaryotic Tree of Life from a Global Phylogenomic Perspective

    PubMed Central

    Burki, Fabien

    2014-01-01

    Molecular phylogenetics has revolutionized our knowledge of the eukaryotic tree of life. With the advent of genomics, a new discipline of phylogenetics has emerged: phylogenomics. This method uses large alignments of tens to hundreds of genes to reconstruct evolutionary histories. This approach has led to the resolution of ancient and contentious relationships, notably between the building blocks of the tree (the supergroups), and allowed to place in the tree enigmatic yet important protist lineages for understanding eukaryote evolution. Here, I discuss the pros and cons of phylogenomics and review the eukaryotic supergroups in light of earlier work that laid the foundation for the current view of the tree, including the position of the root. I conclude by presenting a picture of eukaryote evolution, summarizing the most recent progress in assembling the global tree. PMID:24789819

  8. Improved description of the bipolar ciliate, Euplotes petzi, and definition of its basal position in the Euplotes phylogenetic tree.

    PubMed

    Di Giuseppe, Graziano; Erra, Fabrizio; Paolo Frontini, Francesco; Dini, Fernando; Vallesi, Adriana; Luporini, Pierangelo

    2014-08-01

    Data improving the characterization of the marine Euplotes species, E. petzi Wilbert and Song, 2008, were obtained from morphological, ecological and genetic analyses of Antarctic and Arctic wild-type strains. This species is identified by a minute (mean size, 46 μm × 32 μm) and ellipsoidal cell body which is dorsally decorated with an argyrome of the double-patella type, five dorsal kineties (of which the median one contains 8-10 dikinetids), five sharp-edged longitudinal ridges, and a right anterior spur. Ventrally, it bears 10 fronto-ventral, five transverse, two caudal and two marginal cirri, 30-35 adoral membranelles, and three inconspicuous ridges. Euplotes petzi grows well at 4 °C on green algae, does not produce cysts, undergoes mating under the genetic control of a multiple mating-type system, constitutively secretes water-borne pheromones, and behaves as a psychrophilic microorganism unable to survive at >15 °C. While the α-tubulin gene sequence determination did not provide useful information on the E. petzi molecular phylogeny, the small subunit rRNA (SSU rRNA) gene sequence determination provided solid evidence that E. petzi clusters with E. sinicus Jiang et al., 2010a, into a clade which represents the deepest branch at the base of the Euplotes phylogenetic tree. PMID:25051516

  9. Short Tree, Long Tree, Right Tree, Wrong Tree: New Acquisition Bias Corrections for Inferring SNP Phylogenies.

    PubMed

    Leaché, Adam D; Banbury, Barbara L; Felsenstein, Joseph; de Oca, Adrián Nieto-Montes; Stamatakis, Alexandros

    2015-11-01

    Single nucleotide polymorphisms (SNPs) are useful markers for phylogenetic studies owing in part to their ubiquity throughout the genome and ease of collection. Restriction site associated DNA sequencing (RADseq) methods are becoming increasingly popular for SNP data collection, but an assessment of the best practises for using these data in phylogenetics is lacking. We use computer simulations, and new double digest RADseq (ddRADseq) data for the lizard family Phrynosomatidae, to investigate the accuracy of RAD loci for phylogenetic inference. We compare the two primary ways RAD loci are used during phylogenetic analysis, including the analysis of full sequences (i.e., SNPs together with invariant sites), or the analysis of SNPs on their own after excluding invariant sites. We find that using full sequences rather than just SNPs is preferable from the perspectives of branch length and topological accuracy, but not of computational time. We introduce two new acquisition bias corrections for dealing with alignments composed exclusively of SNPs, a conditional likelihood method and a reconstituted DNA approach. The conditional likelihood method conditions on the presence of variable characters only (the number of invariant sites that are unsampled but known to exist is not considered), while the reconstituted DNA approach requires the user to specify the exact number of unsampled invariant sites prior to the analysis. Under simulation, branch length biases increase with the amount of missing data for both acquisition bias correction methods, but branch length accuracy is much improved in the reconstituted DNA approach compared to the conditional likelihood approach. 
Phylogenetic analyses of the empirical data using concatenation or a coalescent-based species tree approach provide strong support for many of the accepted relationships among phrynosomatid lizards, suggesting that RAD loci contain useful phylogenetic signal across a range of divergence times despite the
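
    The conditional likelihood correction described in this abstract can be illustrated with a minimal sketch. It assumes that the per-site log-likelihoods of the variable sites, and the log-probability that a site is invariant under the model, have already been computed elsewhere. The function name and the toy numbers are hypothetical and are not taken from the paper or from any particular software.

```python
import math
from typing import Sequence


def conditional_log_likelihood(site_log_liks: Sequence[float],
                               log_p_invariant: float) -> float:
    """Log-likelihood of variable sites, conditioned on each site being variable.

    Each site likelihood L_i is divided by (1 - P_inv), where P_inv is the
    model probability that a site is invariant. In log space this subtracts
    n * log(1 - P_inv) from the summed per-site log-likelihoods.
    """
    p_inv = math.exp(log_p_invariant)
    if not 0.0 <= p_inv < 1.0:
        raise ValueError("P(invariant) must lie in [0, 1)")
    log_p_variable = math.log1p(-p_inv)  # log(1 - P_inv)
    return sum(site_log_liks) - len(site_log_liks) * log_p_variable


# Toy input (hypothetical values): two variable sites, P(invariant) = 0.5.
toy_site_log_liks = [math.log(0.1), math.log(0.2)]
corrected = conditional_log_likelihood(toy_site_log_liks, math.log(0.5))
uncorrected = sum(toy_site_log_liks)
print(uncorrected, corrected)
```

    Because the divisor 1 - P_inv is below one, the corrected log-likelihood is never lower than the uncorrected sum, and the size of the gap depends on how likely invariant sites are under the current tree and branch lengths.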

  10. Short Tree, Long Tree, Right Tree, Wrong Tree: New Acquisition Bias Corrections for Inferring SNP Phylogenies

    PubMed Central

    Leaché, Adam D.; Banbury, Barbara L.; Felsenstein, Joseph; de Oca, Adrián Nieto-Montes; Stamatakis, Alexandros

    2015-01-01

    Single nucleotide polymorphisms (SNPs) are useful markers for phylogenetic studies owing in part to their ubiquity throughout the genome and ease of collection. Restriction site associated DNA sequencing (RADseq) methods are becoming increasingly popular for SNP data collection, but an assessment of the best practises for using these data in phylogenetics is lacking. We use computer simulations, and new double digest RADseq (ddRADseq) data for the lizard family Phrynosomatidae, to investigate the accuracy of RAD loci for phylogenetic inference. We compare the two primary ways RAD loci are used during phylogenetic analysis, including the analysis of full sequences (i.e., SNPs together with invariant sites), or the analysis of SNPs on their own after excluding invariant sites. We find that using full sequences rather than just SNPs is preferable from the perspectives of branch length and topological accuracy, but not of computational time. We introduce two new acquisition bias corrections for dealing with alignments composed exclusively of SNPs, a conditional likelihood method and a reconstituted DNA approach. The conditional likelihood method conditions on the presence of variable characters only (the number of invariant sites that are unsampled but known to exist is not considered), while the reconstituted DNA approach requires the user to specify the exact number of unsampled invariant sites prior to the analysis. Under simulation, branch length biases increase with the amount of missing data for both acquisition bias correction methods, but branch length accuracy is much improved in the reconstituted DNA approach compared to the conditional likelihood approach. 
Phylogenetic analyses of the empirical data using concatenation or a coalescent-based species tree approach provide strong support for many of the accepted relationships among phrynosomatid lizards, suggesting that RAD loci contain useful phylogenetic signal across a range of divergence times despite the

  11. Osiris: accessible and reproducible phylogenetic and phylogenomic analyses within the Galaxy workflow management system

    PubMed Central

    2014-01-01

    Background: Phylogenetic tools and ‘tree-thinking’ approaches increasingly permeate all biological research. At the same time, phylogenetic data sets are expanding at breakneck pace, facilitated by increasingly economical sequencing technologies. Therefore, there is an urgent need for accessible, modular, and sharable tools for phylogenetic analysis. Results: We developed a suite of wrappers for new and existing phylogenetics tools for the Galaxy workflow management system that we call Osiris. Osiris and Galaxy provide a sharable, standardized, modular user interface, and the ability to easily create complex workflows using a graphical interface. Osiris enables all aspects of phylogenetic analysis within Galaxy, including de novo assembly of high throughput sequencing reads, ortholog identification, multiple sequence alignment, concatenation, phylogenetic tree estimation, and post-tree comparative analysis. The open source files are available in the Bitbucket public repository and many of the tools are demonstrated on a public web server (http://galaxy-dev.cnsi.ucsb.edu/osiris/). Conclusions: Osiris can serve as a foundation for other phylogenomic and phylogenetic tool development within the Galaxy platform. PMID:24990571

  12. Phylogenetic information and experimental design in molecular systematics.

    PubMed Central

    Goldman, N

    1998-01-01

    Despite the widespread perception that evolutionary inference from molecular sequences is a statistical problem, there has been very little attention paid to questions of experimental design. Previous consideration of this topic has led to little more than an empirical folklore regarding the choice of suitable genes for analysis, and to dispute over the best choice of taxa for inclusion in data sets. I introduce what I believe are new methods that permit the quantification of phylogenetic information in a sequence alignment. The methods use likelihood calculations based on Markov-process models of nucleotide substitution allied with phylogenetic trees, and allow a general approach to optimal experimental design. Two examples are given, illustrating realistic problems in experimental design in molecular phylogenetics and suggesting more general conclusions about the choice of genomic regions, sequence lengths and taxa for evolutionary studies. PMID:9787470

  13. Phylogenetic studies in Ravenelia esculenta and related rust fungi.

    PubMed

    Gandhe, K R; Kuvalekar, Aniket

    2007-09-01

    Ravenelia esculenta Naras. and Thium. is a rust fungus, which infects mostly thorns, inflorescences, flowers and fruits of Acacia eburnea Willd. Aecial stages of the rust produce hypertrophy in infected parts. DNA of the rust fungus was isolated from aeciospores by the 'freeze thaw' method. 18S rDNA was amplified and sequenced with an automated DNA sequencer. A BLAST search of the sequence at NCBI retrieved 96 sequences producing significant alignments. Multiple sequence alignment of these sequences was done with ClustalW. Phylogenetic analysis was done using MEGA 3.1. A UPGMA Minimum Evolution tree with 1000 bootstrap replicates was constructed using these sequences. From the phylogenetic tree it is observed that Ravenelia esculenta and the genus Gymnosporangium share a common ancestry, though Ravenelia esculenta is autoecious on angiosperm and the genus Gymnosporangium is heteroecious with pycnia, aecia on angiosperm and uredia, telia on gymnosperm. Two major clades are recognized which are based on the nature of aecial host (gymnosperm or angiosperm). These clades also show a shift from pteridophytes to angiosperms as telial hosts. The tree can also be interpreted another way, in which 14 families of Uredinales are separated according to the nature of teliospores, nature of aeciospores and structure of pycnia. These studies determine the phylogenetic position of Ravenelia esculenta among other rust fungi besides broad separation of Uredinales into two clades. These studies also show that there is phylogenetic correlation between molecular and morphological data. This is the first report of DNA sequencing and phylogenetic positioning in genus Ravenelia from India. PMID:23100669

  14. Diversity of a ribonucleoprotein family in tobacco chloroplasts: two new chloroplast ribonucleoproteins and a phylogenetic tree of ten chloroplast RNA-binding domains.

    PubMed Central

    Ye, L H; Li, Y Q; Fukami-Kobayashi, K; Go, M; Konishi, T; Watanabe, A; Sugiura, M

    1991-01-01

    Two new ribonucleoproteins (RNPs) have been identified from a tobacco chloroplast lysate. These two proteins (cp29A and cp29B) are nuclear-encoded and have a lower affinity for single-stranded DNA than three other chloroplast RNPs (cp28, cp31 and cp33) previously isolated. DNA sequencing revealed that both contain two consensus sequence-type homologous RNA-binding domains (CS-RBDs) and a very acidic amino-terminal domain that is shorter than that of cp28, cp31 and cp33. Comparison of cp29A and cp29B showed a 19 amino acid insertion in the region separating the two CS-RBDs in cp29B. This insertion results in three tandem repeats of a glycine-rich sequence of 10 amino acids, which is a novel feature in RNPs. The two proteins are encoded by different single nuclear genes and no alternatively spliced transcripts could be identified. We constructed a phylogenetic tree for the ten chloroplast CS-RBDs. These results suggest that there is a sizable RNP family in chloroplasts and the diversity was mainly generated through a series of gene duplications rather than through alternative pre-mRNA splicing. The gene for cp29B contains three introns. The first and second introns interrupt the first CS-RBD and the third intron interrupts the second CS-RBD. The position of the first intron site is the same as that in the human hnRNP A1 protein gene. PMID:1721701

  15. Phylogenetic comparison of local length plasticity of the small subunit of nuclear rDNAs among all Hexapoda orders and the impact of hyper-length-variation on alignment.

    PubMed

    Xie, Qiang; Tian, Xiaoxuan; Qin, Yan; Bu, Wenjun

    2009-02-01

    The SSU nrDNA (18S) is one of the most frequently sequenced molecular markers in phylogenetic studies. However, the length hyper-variation at multiple positions of this gene can greatly affect the accuracy of alignment, and this length variation makes alignment across arthropod orders a serious problem. The analysis of Hexapoda phylogeny is such a case. A clearer recognition of the distribution of the length-variable regions is needed. In this study, the secondary structure of some length-variable regions in the SSU nrRNA of Arthropoda was adjusted by the principle of co-variation. We found that the extent of plasticity of some length-variable regions can be extraordinarily high, exceeding 600 bases in hexapods. The numbers of hyper-length-variable regions are largest in Strepsiptera and Sternorrhyncha (Hemiptera). Our study shows that some length-variable regions can serve as synapomorphies for some groups. The phylogenetic comparison also suggested that the expansion of a lateral bulge could be the origin of a helix. PMID:19027081

  16. Make Your Own Phylogenetic Tree

    ERIC Educational Resources Information Center

    Rau, Gerald

    2012-01-01

    Molecular similarity is one of the strongest lines of evidence for evolution--and one of the most difficult for students to grasp. That is because the underlying observations--that identical mutations are found in closely related species and the degree of similarity decreases with evolutionary distance--are not visible to the human eye. And it's…

  17. Phylogenetically resolving epidemiologic linkage

    PubMed Central

    Romero-Severson, Ethan O.; Bulla, Ingo; Leitner, Thomas

    2016-01-01

    Although the use of phylogenetic trees in epidemiological investigations has become commonplace, their epidemiological interpretation has not been systematically evaluated. Here, we use an HIV-1 within-host coalescent model to probabilistically evaluate transmission histories of two epidemiologically linked hosts. Previous critique of phylogenetic reconstruction has claimed that direction of transmission is difficult to infer, and that the existence of unsampled intermediary links or common sources can never be excluded. The phylogenetic relationship between the HIV populations of epidemiologically linked hosts can be classified into six types of trees, based on cladistic relationships and whether the reconstruction is consistent with the true transmission history or not. We show that the direction of transmission and whether unsampled intermediary links or common sources existed make very different predictions about expected phylogenetic relationships: (i) Direction of transmission can often be established when paraphyly exists, (ii) intermediary links can be excluded when multiple lineages were transmitted, and (iii) when the sampled individuals’ HIV populations both are monophyletic a common source was likely the origin. Inconsistent results, suggesting the wrong transmission direction, were generally rare. In addition, the expected tree topology also depends on the number of transmitted lineages, the sample size, the time of the sample relative to transmission, and how fast the diversity increases after infection. Typically, 20 or more sequences per subject give robust results. We confirm our theoretical evaluations with analyses of real transmission histories and discuss how our findings should aid in interpreting phylogenetic results. PMID:26903617
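
    The tree types in this abstract are defined by whether each host's sampled sequences form a monophyletic group. The sketch below shows one way that check could be written for a small rooted tree. It assumes trees are given as nested tuples of tip names and that each tip name starts with a host label; the representation, the function names, and the toy tree are hypothetical and are not the authors' software.

```python
from typing import List, Tuple, Union

# A rooted tree is either a tip name or a tuple of subtrees (assumed format).
Tree = Union[str, Tuple["Tree", ...]]


def clades(tree: Tree) -> List[List[str]]:
    """Return the tip list of every node; the last entry is the whole tree."""
    if isinstance(tree, str):
        return [[tree]]
    all_clades: List[List[str]] = []
    own_tips: List[str] = []
    for child in tree:
        child_clades = clades(child)
        all_clades.extend(child_clades)
        own_tips.extend(child_clades[-1])  # last entry holds the child's tips
    all_clades.append(own_tips)
    return all_clades


def is_monophyletic(tree: Tree, host_prefix: str) -> bool:
    """True if some clade contains exactly the tips sampled from this host."""
    every_clade = clades(tree)
    host_tips = {tip for tip in every_clade[-1] if tip.startswith(host_prefix)}
    return any(set(clade) == host_tips for clade in every_clade)


# Toy tree (hypothetical): host B's sequences are nested inside host A's.
toy_tree = ((("A1", "A2"), ("B1", "B2")), "A3")
a_mono = is_monophyletic(toy_tree, "A")
b_mono = is_monophyletic(toy_tree, "B")
print(a_mono, b_mono)
```

    In the toy tree, host B's sequences form a clade while host A's do not, which is the kind of paraphyletic arrangement that point (i) of the abstract refers to.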

  18. Phylogenetically resolving epidemiologic linkage

    DOE PAGESBeta

    Romero-Severson, Ethan O.; Bulla, Ingo; Leitner, Thomas

    2016-02-22

    The use of phylogenetic trees in epidemiological investigations has become commonplace, but their epidemiological interpretation has not been systematically evaluated. Here, we use an HIV-1 within-host coalescent model to probabilistically evaluate transmission histories of two epidemiologically linked hosts. Previous critique of phylogenetic reconstruction has claimed that direction of transmission is difficult to infer, and that the existence of unsampled intermediary links or common sources can never be excluded. The phylogenetic relationship between the HIV populations of epidemiologically linked hosts can be classified into six types of trees, based on cladistic relationships and whether the reconstruction is consistent with the true transmission history or not. We show that the direction of transmission and whether unsampled intermediary links or common sources existed make very different predictions about expected phylogenetic relationships: (i) Direction of transmission can often be established when paraphyly exists, (ii) intermediary links can be excluded when multiple lineages were transmitted, and (iii) when the sampled individuals’ HIV populations both are monophyletic a common source was likely the origin. Inconsistent results, suggesting the wrong transmission direction, were generally rare. In addition, the expected tree topology also depends on the number of transmitted lineages, the sample size, the time of the sample relative to transmission, and how fast the diversity increases after infection. Typically, 20 or more sequences per subject give robust results. Moreover, we confirm our theoretical evaluations with analyses of real transmission histories and discuss how our findings should aid in interpreting phylogenetic results.

  19. Phylogenetically resolving epidemiologic linkage.

    PubMed

    Romero-Severson, Ethan O; Bulla, Ingo; Leitner, Thomas

    2016-03-01

    Although the use of phylogenetic trees in epidemiological investigations has become commonplace, their epidemiological interpretation has not been systematically evaluated. Here, we use an HIV-1 within-host coalescent model to probabilistically evaluate transmission histories of two epidemiologically linked hosts. Previous critique of phylogenetic reconstruction has claimed that direction of transmission is difficult to infer, and that the existence of unsampled intermediary links or common sources can never be excluded. The phylogenetic relationship between the HIV populations of epidemiologically linked hosts can be classified into six types of trees, based on cladistic relationships and whether the reconstruction is consistent with the true transmission history or not. We show that the direction of transmission and whether unsampled intermediary links or common sources existed make very different predictions about expected phylogenetic relationships: (i) Direction of transmission can often be established when paraphyly exists, (ii) intermediary links can be excluded when multiple lineages were transmitted, and (iii) when the sampled individuals' HIV populations both are monophyletic a common source was likely the origin. Inconsistent results, suggesting the wrong transmission direction, were generally rare. In addition, the expected tree topology also depends on the number of transmitted lineages, the sample size, the time of the sample relative to transmission, and how fast the diversity increases after infection. Typically, 20 or more sequences per subject give robust results. We confirm our theoretical evaluations with analyses of real transmission histories and discuss how our findings should aid in interpreting phylogenetic results. PMID:26903617

  20. Biochemical and structural characterizations of two Dictyostelium cellobiohydrolases from the amoebozoa kingdom reveal a high level of conservation between distant phylogenetic trees of life

    DOE PAGESBeta

    Hobdey, Sarah E.; Knott, Brandon C.; Momeni, Majid Haddad; Taylor, II, Larry E.; Borisova, Anna S.; Podkaminer, Kara K.; VanderWall, Todd A.; Himmel, Michael E.; Decker, Stephen R.; Beckham, Gregg T.; et al

    2016-04-01

    Glycoside hydrolase family 7 (GH7) cellobiohydrolases (CBHs) are enzymes often employed in plant cell wall degradation across eukaryotic kingdoms of life, as they provide significant hydrolytic potential in cellulose turnover. To date, many fungal GH7 CBHs have been examined, yet many questions regarding structure-activity relationships in these important natural and commercial enzymes remain. Here, we present the crystal structures and a biochemical analysis of two GH7 CBHs from social amoeba: Dictyostelium discoideum Cel7A (DdiCel7A) and Dictyostelium purpureum Cel7A (DpuCel7A). DdiCel7A and DpuCel7A natively consist of a catalytic domain and do not exhibit a carbohydrate-binding module (CBM). The structures of DdiCel7A and DpuCel7A, resolved to 2.1 Å and 2.7 Å, respectively, are homologous to those of other GH7 CBHs with an enclosed active-site tunnel. Two primary differences between the Dictyostelium CBHs and the archetypal model GH7 CBH, Trichoderma reesei Cel7A (TreCel7A), occur near the hydrolytic active site and the product-binding sites. To compare the activities of these enzymes with the activity of TreCel7A, the family 1 TreCel7A CBM and linker were added to the C terminus of each of the Dictyostelium enzymes, creating DdiCel7ACBM and DpuCel7ACBM, which were recombinantly expressed in T. reesei. DdiCel7ACBM and DpuCel7ACBM hydrolyzed Avicel, pretreated corn stover, and phosphoric acid-swollen cellulose as efficiently as TreCel7A when hydrolysis was compared at their temperature optima. The Ki of cellobiose was significantly higher for DdiCel7ACBM and DpuCel7ACBM than for TreCel7A: 205, 130, and 29 μM, respectively. Finally, taken together, the present study highlights the remarkable degree of conservation of the activity of these key natural and industrial enzymes across quite distant phylogenetic trees of life.

  1. Biochemical and Structural Characterization of Two Dictyostelium Cellobiohydrolases from the Amoebozoa Kingdom Reveal a High Level of Conservation Between Distant Phylogenetic Trees of Life

    DOE PAGESBeta

    Hobdey, Sarah E.; Knott, Brandon C.; Momeni, Majid Haddad; Taylor, II, Larry E.; Borisova, Anna S.; Podkaminer, Kara K.; VanderWall, Todd A.; Himmel, Michael E.; Decker, Stephen R.; Beckham, Gregg T.; et al

    2016-06-01

    Glycoside Hydrolase Family 7 (GH7) cellobiohydrolases (CBHs) are commonly employed enzymes in plant cell wall degradation across eukaryotic kingdoms of life, as they provide significant hydrolytic potential in cellulose turnover. To date, many fungal GH7 CBHs have been examined, yet many questions remain regarding structure-activity relationships in these important natural and commercial enzymes. Here, we present crystal structures and biochemical analysis of two GH7 CBHs from social amoeba: Dictyostelium discoideum and Dictyostelium purpureum (DdiCel7A and DpuCel7A, respectively). DdiCel7A and DpuCel7A natively consist of a catalytic domain and do not exhibit a carbohydrate-binding module (CBM). The structures, resolved to 2.1 Å (DdiCel7A) and 2.7 Å (DpuCel7A), are homologous to other GH7 CBHs with an enclosed active site tunnel. Two primary differences between the Dictyostelium CBHs and the archetypal model GH7 CBH from Trichoderma reesei Cel7A (TreCel7A) occur near the hydrolytic active site and the product binding sites. To compare the activity of these enzymes with TreCel7A, the Family 1 TreCel7A CBM and linker was added to the C-terminus of the Dictyostelium enzymes, DdiCel7ACBM and DpuCel7ACBM, which were recombinantly expressed in T. reesei. DdiCel7ACBM and DpuCel7ACBM hydrolyze Avicel, pretreated corn stover, and phosphoric acid swollen cellulose as efficiently as TreCel7A when compared at their temperature optima. The Ki of cellobiose is significantly higher for DdiCel7ACBM and DpuCel7ACBM than for TreCel7A: 205, 130, and 29 μM, respectively. Taken together, the present study highlights the remarkable conservation in the activity of these key natural and industrial enzymes across quite distant phylogenetic trees of life.

  2. A scalable and flexible approach for investigating the genomic landscapes of phylogenetic incongruence.

    PubMed

    Prasad, Arjun B; Mullikin, James C; Green, Eric D

    2013-03-01

    Analyses of DNA sequence datasets have repeatedly revealed inconsistencies in phylogenetic trees derived with different data. This is termed phylogenetic incongruence, and may arise from a methodological failure of the inference process or from biological processes, such as horizontal gene transfer, incomplete lineage sorting, and introgression. To better understand patterns of incongruence, we developed a method (PartFinder) that uses likelihood ratios applied to sliding windows for visualizing tree-support changes across genome-sequence alignments, allowing the comparative examination of complex phylogenetic scenarios among many species. As a pilot, we used PartFinder to investigate incongruence in the Homo-Pan-Gorilla group as well as Platyrrhini using high-quality bacterial artificial chromosome (BAC)-derived sequences as well as assembled whole-genome shotgun sequences. Our simulations verified the sensitivity of PartFinder, and our results were comparable to other studies of the Homo-Pan-Gorilla group. Analyses of the whole-genome alignments reveal significant associations between support for the accepted species relationship and specific characteristics of the genomic regions, such as GC-content, alignment score, exon content, and conservation. Finally, we analyzed sequence data generated for five platyrrhine species, and found incongruence that suggests a polytomy within Cebidae, in particular. Together, these studies demonstrate the utility of PartFinder for investigating the patterns of phylogenetic incongruence. PMID:23247042
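
    The sliding-window likelihood-ratio idea described in this abstract can be sketched roughly as follows. This is not PartFinder's implementation. It assumes per-site log-likelihoods under two competing tree topologies have already been produced by an external tool, and the function name, window size, step size, and toy values are all hypothetical.

```python
from typing import List, Sequence, Tuple


def windowed_log_lr(site_lnl_a: Sequence[float],
                    site_lnl_b: Sequence[float],
                    window: int,
                    step: int) -> List[Tuple[int, float]]:
    """Return (window start, log-likelihood ratio of tree A over tree B).

    Positive values mean the window supports topology A, negative values
    mean it supports topology B.
    """
    if len(site_lnl_a) != len(site_lnl_b):
        raise ValueError("per-site arrays must have equal length")
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    profile: List[Tuple[int, float]] = []
    for start in range(0, len(site_lnl_a) - window + 1, step):
        end = start + window
        log_lr = sum(site_lnl_a[start:end]) - sum(site_lnl_b[start:end])
        profile.append((start, log_lr))
    return profile


# Toy per-site log-likelihoods (hypothetical): support flips halfway along.
lnl_tree_a = [-1.0, -1.0, -1.0, -2.0, -2.0, -2.0]
lnl_tree_b = [-2.0, -2.0, -2.0, -1.0, -1.0, -1.0]
profile = windowed_log_lr(lnl_tree_a, lnl_tree_b, window=3, step=3)
print(profile)
```

    Plotting such a profile along the alignment is one simple way to see where support changes from one topology to another.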

  3. k-merSNP discovery: Software for alignment- and reference-free scalable SNP discovery, phylogenetics, and annotation for hundreds of microbial genomes

    SciTech Connect

    2014-11-18

    With the flood of whole genome finished and draft microbial sequences, we need faster, more scalable bioinformatics tools for sequence comparison. An algorithm is described to find single nucleotide polymorphisms (SNPs) in whole genome data. It scales to hundreds of bacterial or viral genomes, and can be used for finished and/or draft genomes available as unassembled contigs or raw, unassembled reads. The method is fast to compute, finding SNPs and building a SNP phylogeny in minutes to hours, depending on the size and diversity of the input sequences. The SNP-based trees that result are consistent with known taxonomy and trees determined in other studies. The approach we describe can handle many gigabases of sequence in a single run. The algorithm is based on k-mer analysis.
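
    The abstract states only that the algorithm is based on k-mer analysis. One common way to find SNPs without alignment or a reference is to look for odd-length k-mers that match between genomes everywhere except the central base. The sketch below illustrates that idea under simplifying assumptions: it ignores reverse complements, skips k-mers containing ambiguous bases, and keeps everything in memory. It is not the described software, and the function name and toy genomes are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Set


def find_kmer_snps(genomes: Dict[str, str], k: int) -> Dict[str, Dict[str, str]]:
    """Map each SNP locus (k-mer with the centre masked as N) to per-genome alleles."""
    if k < 3 or k % 2 == 0:
        raise ValueError("k must be odd and at least 3")
    mid = k // 2
    # locus key -> genome name -> set of central bases seen in that genome
    alleles: Dict[str, Dict[str, Set[str]]] = defaultdict(lambda: defaultdict(set))
    for name, seq in genomes.items():
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) - set("ACGT"):
                continue  # skip k-mers with ambiguous characters
            key = kmer[:mid] + "N" + kmer[mid + 1:]
            alleles[key][name].add(kmer[mid])
    snps: Dict[str, Dict[str, str]] = {}
    for key, per_genome in alleles.items():
        # Discard loci where a genome shows several central bases (ambiguous).
        if any(len(bases) != 1 for bases in per_genome.values()):
            continue
        calls = {name: next(iter(bases)) for name, bases in per_genome.items()}
        if len(calls) >= 2 and len(set(calls.values())) > 1:
            snps[key] = calls
    return snps


# Toy genomes (hypothetical) differing by a single substitution.
toy_genomes = {"g1": "ACGTACGGA", "g2": "ACGTTCGGA"}
snps = find_kmer_snps(toy_genomes, k=5)
print(snps)
```

    The per-genome allele calls at each locus could then be concatenated into a SNP matrix for tree building with a separate phylogenetics tool.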

  4. k-merSNP discovery: Software for alignment- and reference-free scalable SNP discovery, phylogenetics, and annotation for hundreds of microbial genomes

    2014-11-18

    With the flood of whole genome finished and draft microbial sequences, we need faster, more scalable bioinformatics tools for sequence comparison. An algorithm is described to find single nucleotide polymorphisms (SNPs) in whole genome data. It scales to hundreds of bacterial or viral genomes, and can be used for finished and/or draft genomes available as unassembled contigs or raw, unassembled reads. The method is fast to compute, finding SNPs and building a SNP phylogeny in minutes to hours, depending on the size and diversity of the input sequences. The SNP-based trees that result are consistent with known taxonomy and trees determined in other studies. The approach we describe can handle many gigabases of sequence in a single run. The algorithm is based on k-mer analysis.

  5. Aligning Biomolecular Networks Using Modular Graph Kernels

    NASA Astrophysics Data System (ADS)

    Towfic, Fadi; Greenlee, M. Heather West; Honavar, Vasant

    Comparative analysis of biomolecular networks constructed using measurements from different conditions, tissues, and organisms offers a powerful approach to understanding the structure, function, dynamics, and evolution of complex biological systems. We explore a class of algorithms for aligning large biomolecular networks by breaking down such networks into subgraphs and computing the alignment of the networks based on the alignment of their subgraphs. The resulting subnetworks are compared using graph kernels as scoring functions. We provide implementations of the resulting algorithms as part of BiNA, an open source biomolecular network alignment toolkit. Our experiments using Drosophila melanogaster, Saccharomyces cerevisiae, Mus musculus and Homo sapiens protein-protein interaction networks extracted from the DIP repository of protein-protein interaction data demonstrate that the performance of the proposed algorithms (as measured by % GO term enrichment of subnetworks identified by the alignment) is competitive with some of the state-of-the-art algorithms for pair-wise alignment of large protein-protein interaction networks. Our results also show that the inter-species similarity scores computed based on graph kernels can be used to cluster the species into a species tree that is consistent with the known phylogenetic relationships among the species.

  6. Cloning and sequencing of the growth hormone gene of large yellow croaker and its phylogenetic significance.

    PubMed

    Chen, Yun; Wang, Yaping; He, Shunping; Zhu, Zuoyan

    2004-10-01

    Using conserved primers and the PCR reaction, the growth hormone (GH) gene and the 3'-UTR of the large yellow croaker (Pseudosciaena crocea) were amplified and sequenced. The gene structure was analyzed and compared to the GH genes of 5 other percoid fish downloaded from GenBank. Also the GH gene of the large yellow croaker and the genes from 14 Percoidei and 2 Labroidei species were aligned using Clustal X. A matrix of 564 bp was used to construct the phylogenetic tree using maximum parsimony and neighbor-joining methods. Phylogenetic trees by the two methods are identical in most of the clades with high bootstrap support. The results are also identical to those from morphological data. In general, this analysis does not support the monophyly of the families Centropomidae and Carangidae. But our GH gene tree indicates that the representative species of the families Sparidae and Sciaenidae are a monophyletic group. PMID:15524313

  7. MUST, a computer package of Management Utilities for Sequences and Trees.

    PubMed Central

    Philippe, H

    1993-01-01

    The MUST package is a phylogenetically oriented set of programs for data management and display, allowing one to handle both raw data (sequences) and results (trees, number of steps, bootstrap proportions). It is complementary to the main available software for phylogenetic analysis (PHYLIP, PAUP, HENNIG86, CLUSTAL) with which it is fully compatible. The first part of MUST consists of the acquisition of new sequences, their storage, modification, and checking of sequence integrity in files of aligned sequences. In order to improve alignment, an editor function for aligned sequences offers numerous options, such as selection of subsets of sequences, display of consensus sequences, and search for similarities over small sequence fragments. For phylogenetic reconstruction, the choice of species and portions of sequences to be analyzed is easy and very rapid, permitting fast testing of numerous combinations of sequences and taxa. The resulting files can be formatted for most programs of tree construction. An interactive tree-display program recovers the output of all these programs. Finally, various modules allow an in-depth analysis of results, such as comparison of distance matrices, variation of bootstrap proportions with respect to various parameters or comparison of the number of steps per position. All presently available complete sequences of 28S rRNA are furnished aligned in the package. MUST therefore allows the management of all the operations required for phylogenetic reconstructions. PMID:8255784

  8. Phylogeny Reconstruction with Alignment-Free Method That Corrects for Horizontal Gene Transfer.

    PubMed

    Bromberg, Raquel; Grishin, Nick V; Otwinowski, Zbyszek

    2016-06-01

    Advances in sequencing have generated a large number of complete genomes. Traditionally, phylogenetic analysis relies on alignments of orthologs, but defining orthologs and separating them from paralogs is a complex task that may not always be suited to the large datasets of the future. An alternative to traditional, alignment-based approaches are whole-genome, alignment-free methods. These methods are scalable and require minimal manual intervention. We developed SlopeTree, a new alignment-free method that estimates evolutionary distances by measuring the decay of exact substring matches as a function of match length. SlopeTree corrects for horizontal gene transfer, for composition variation and low complexity sequences, and for branch-length nonlinearity caused by multiple mutations at the same site. We tested SlopeTree on 495 bacteria, 73 archaea, and 72 strains of Escherichia coli and Shigella. We compared our trees to the NCBI taxonomy, to trees based on concatenated alignments, and to trees produced by other alignment-free methods. The results were consistent with current knowledge about prokaryotic evolution. We assessed differences in tree topology over different methods and settings and found that the majority of bacteria and archaea have a core set of proteins that evolves by descent. In trees built from complete genomes rather than sets of core genes, we observed some grouping by phenotype rather than phylogeny, for instance with a cluster of sulfur-reducing thermophilic bacteria coming together irrespective of their phyla. The source-code for SlopeTree is available at: http://prodata.swmed.edu/download/pub/slopetree_v1/slopetree.tar.gz. PMID:27336403
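
    The core idea named in this abstract, measuring how the number of exact substring matches decays as match length grows, can be illustrated with a deliberately simplified sketch. It counts distinct k-mers shared by two sequences at several lengths and fits a least-squares slope of log-count against length. It does not reproduce SlopeTree or any of its corrections; the function names, the chosen lengths, and the toy sequences are hypothetical.

```python
import math
from typing import Iterable, List, Set


def kmer_set(seq: str, k: int) -> Set[str]:
    """All distinct substrings of length k."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def match_decay_slope(seq_a: str, seq_b: str, lengths: Iterable[int]) -> float:
    """Least-squares slope of log(shared k-mer count) versus k.

    A more negative slope means exact matches disappear faster as k grows,
    which is what is expected for more divergent sequences.
    """
    xs: List[float] = []
    ys: List[float] = []
    for k in lengths:
        shared = len(kmer_set(seq_a, k) & kmer_set(seq_b, k))
        if shared > 0:  # log is undefined for zero matches
            xs.append(float(k))
            ys.append(math.log(shared))
    if len(xs) < 2:
        raise ValueError("need at least two lengths with shared k-mers")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Toy sequences (hypothetical): the second differs at two positions.
reference = "ACGTTGCAAGCTAGGATCCA"
mutated = "ACGTTGTAAGCTACGATCCA"
slope_same = match_decay_slope(reference, reference, lengths=(4, 5, 6))
slope_mut = match_decay_slope(reference, mutated, lengths=(4, 5, 6))
print(slope_same, slope_mut)
```

    In this toy case the mutated pair gives a steeper decay than the identical pair. Turning such slopes into calibrated evolutionary distances would need the additional corrections the abstract mentions.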

  9. Phylogeny Reconstruction with Alignment-Free Method That Corrects for Horizontal Gene Transfer

    PubMed Central

    Bromberg, Raquel; Grishin, Nick V.; Otwinowski, Zbyszek

    2016-01-01

    Advances in sequencing have generated a large number of complete genomes. Traditionally, phylogenetic analysis relies on alignments of orthologs, but defining orthologs and separating them from paralogs is a complex task that may not always be suited to the large datasets of the future. An alternative to traditional, alignment-based approaches are whole-genome, alignment-free methods. These methods are scalable and require minimal manual intervention. We developed SlopeTree, a new alignment-free method that estimates evolutionary distances by measuring the decay of exact substring matches as a function of match length. SlopeTree corrects for horizontal gene transfer, for composition variation and low complexity sequences, and for branch-length nonlinearity caused by multiple mutations at the same site. We tested SlopeTree on 495 bacteria, 73 archaea, and 72 strains of Escherichia coli and Shigella. We compared our trees to the NCBI taxonomy, to trees based on concatenated alignments, and to trees produced by other alignment-free methods. The results were consistent with current knowledge about prokaryotic evolution. We assessed differences in tree topology over different methods and settings and found that the majority of bacteria and archaea have a core set of proteins that evolves by descent. In trees built from complete genomes rather than sets of core genes, we observed some grouping by phenotype rather than phylogeny, for instance with a cluster of sulfur-reducing thermophilic bacteria coming together irrespective of their phyla. The source-code for SlopeTree is available at: http://prodata.swmed.edu/download/pub/slopetree_v1/slopetree.tar.gz. PMID:27336403

  10. An exploration of how to define and measure the evolution of behavior, learning, memory and mind across the full phylogenetic tree of life

    PubMed Central

    Eisenstein, E. M.; Eisenstein, D. L.; Sarma, J. S. M.

    2016-01-01

    There are probably few terms in evolutionary studies regarding neuroscience issues that are used more frequently than ‘behavior’, ‘learning’, ‘memory’, and ‘mind’. Yet there are probably as many different meanings of these terms as there are users of them. Further, investigators in such studies, while recognizing the full phylogenetic spectrum of life and the evolution of these phenomena, rarely go beyond mammals and other vertebrates in their investigations; invertebrates are sometimes included. What is rarely taken into consideration, though, is that to fully understand the evolution and significance for survival of these phenomena across phylogeny, it is essential that they be measured and compared in the same units of measurement across the full phylogenetic spectrum from aneural bacteria and protozoa to humans. This paper explores how these terms are generally used as well as how they might be operationally defined and measured to facilitate uniform examination and comparisons across the full phylogenetic spectrum of life. This paper has 2 goals: (1) to provide models for measuring the evolution of ‘behavior’ and its changes across the full phylogenetic spectrum, and (2) to explain why ‘mind phenomena’ cannot be measured scientifically at the present time. PMID:27489578

  11. An exploration of how to define and measure the evolution of behavior, learning, memory and mind across the full phylogenetic tree of life.

    PubMed

    Eisenstein, E M; Eisenstein, D L; Sarma, J S M

    2016-01-01

    There are probably few terms in evolutionary studies regarding neuroscience issues that are used more frequently than 'behavior', 'learning', 'memory', and 'mind'. Yet there are probably as many different meanings of these terms as there are users of them. Further, investigators in such studies, while recognizing the full phylogenetic spectrum of life and the evolution of these phenomena, rarely go beyond mammals and other vertebrates in their investigations; invertebrates are sometimes included. What is rarely taken into consideration, though, is that to fully understand the evolution and significance for survival of these phenomena across phylogeny, it is essential that they be measured and compared in the same units of measurement across the full phylogenetic spectrum from aneural bacteria and protozoa to humans. This paper explores how these terms are generally used as well as how they might be operationally defined and measured to facilitate uniform examination and comparisons across the full phylogenetic spectrum of life. This paper has 2 goals: (1) to provide models for measuring the evolution of 'behavior' and its changes across the full phylogenetic spectrum, and (2) to explain why 'mind phenomena' cannot be measured scientifically at the present time. PMID:27489578

  12. Canonical phylogenetic ordination.

    PubMed

    Giannini, Norberto P

    2003-10-01

    A phylogenetic comparative method is proposed for estimating historical effects on comparative data using the partitions that compose a cladogram, i.e., its monophyletic groups. Two basic matrices, Y and X, are defined in the context of an ordinary linear model. Y contains the comparative data measured over t taxa. X consists of an initial tree matrix that contains all the xj monophyletic groups (each coded separately as a binary indicator variable) of the phylogenetic tree available for those taxa. The method seeks to define the subset of groups, i.e., a reduced tree matrix, that best explains the patterns in Y. This definition is accomplished via regression or canonical ordination (depending on the dimensionality of Y) coupled with Monte Carlo permutations. It is argued here that unrestricted permutations (i.e., under an equiprobable model) are valid for testing this specific kind of groupwise hypothesis. Phylogeny is either partialled out or, more properly, incorporated into the analysis in the form of component variation. Direct extensions allow for testing ecomorphological data controlled by phylogeny in a variation partitioning approach. Currently available statistical techniques make this method applicable under most univariate/multivariate models and metrics; two-way phylogenetic effects can be estimated as well. The simplest case (univariate Y), tested with simulations, yielded acceptable type I error rates. Applications presented include examples from evolutionary ethology, ecology, and ecomorphology. Results showed that the new technique detected previously overlooked variation clearly associated with phylogeny and that many phylogenetic effects on comparative data may occur at particular groups rather than across the entire tree. PMID:14530135
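
    As a rough illustration of the tree-matrix idea in the abstract above, the sketch below codes every monophyletic group of a small cladogram as a binary indicator column and tests one group against a univariate Y using unrestricted permutations. The nested-tuple tree, the trait values, and all function names are hypothetical, and the squared correlation is used here as a simple stand-in test statistic; this is not the author's implementation.

```python
import random

def clades(tree):
    """Return (leaves, groups); groups holds the leaf set of every internal node."""
    if isinstance(tree, str):
        return [tree], []
    leaves, groups = [], []
    for child in tree:
        child_leaves, child_groups = clades(child)
        leaves += child_leaves
        groups += child_groups
    groups.append(frozenset(leaves))
    return leaves, groups

def tree_matrix(tree):
    """Binary indicator columns, one per monophyletic group (root excluded)."""
    taxa, groups = clades(tree)
    groups = [g for g in groups if len(g) < len(taxa)]  # the root column is constant
    return taxa, groups, [[1 if t in g else 0 for g in groups] for t in taxa]

def r_squared(x, y):
    """Squared Pearson correlation between an indicator column and the trait."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

def permutation_p(x, y, n_perm=999, seed=0):
    """Unrestricted (equiprobable) permutations of Y across taxa."""
    rng = random.Random(seed)
    observed = r_squared(x, y)
    y_perm, hits = list(y), 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if r_squared(x, y_perm) >= observed - 1e-12:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical six-taxon cladogram and trait values, for illustration only.
tree = ((("A", "B"), "C"), (("D", "E"), "F"))
taxa, groups, X = tree_matrix(tree)
y = [5.1, 4.9, 5.0, 1.2, 0.9, 1.1]

col = groups.index(frozenset("ABC"))
x = [row[col] for row in X]
r2, p = permutation_p(x, y)
print(f"group {sorted(groups[col])}: R^2 = {r2:.3f}, permutation p = {p:.3f}")
```

    With only six taxa the number of distinct permutations is small, so the attainable p-values are coarse; the method described in the abstract additionally selects a subset of groups, which this sketch does not attempt.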

  13. Y-chromosome Short Tandem Repeat Intermediate Variant Alleles DYS392.2, DYS449.2, and DYS385.2 Delineate New Phylogenetic Substructure in Human Y-chromosome Haplogroup Tree

    PubMed Central

    Myres, Natalie M.; Ritchie, Kathleen H.; Lin, Alice A.; Hughes, Robert H.; Woodward, Scott R.; Underhill, Peter A.

    2009-01-01

    Aim: To determine the human Y-chromosome haplogroup backgrounds of intermediate-sized variant alleles displayed by short tandem repeat (STR) loci DYS392, DYS449, and DYS385, and to evaluate the potential of each intermediate variant to elucidate new phylogenetic substructure within the human Y-chromosome haplogroup tree. Methods: Molecular characterization of lineages was achieved using a combination of Y-chromosome haplogroup defining binary polymorphisms and up to 37 short tandem repeat loci. DNA sequencing and median-joining network analyses were used to evaluate Y-chromosome lineages displaying intermediate variant alleles. Results: We show that DYS392.2 occurs on a single haplogroup background, specifically I1*-M253, and likely represents a new phylogenetic subdivision in this European haplogroup. Intermediate variants DYS449.2 and DYS385.2 both occur on multiple haplogroup backgrounds, and when evaluated within specific haplogroup contexts, delineate new phylogenetic substructure, with DYS449.2 being informative within haplogroup A-P97 and DYS385.2 in haplogroups D-M145, E1b1a-M2, and R1b*-M343. Sequence analysis of variant alleles observed within the various haplogroup backgrounds showed that the nature of the intermediate variant differed, confirming the mutations arose independently. Conclusions: Y-chromosome short tandem repeat intermediate variant alleles, while relatively rare, typically occur on multiple haplogroup backgrounds. This distribution indicates that such mutations arise at a rate generally intermediate to those of binary markers and Y-STR loci. As a result, intermediate-sized Y-STR variants can reveal phylogenetic substructure within the Y-chromosome phylogeny not currently detected by either binary or Y-STR markers alone, but only when such variants are evaluated within a haplogroup context. PMID:19480020

  14. Archaeal phylogeny: reexamination of the phylogenetic position of Archaeoglobus fulgidus in light of certain composition-induced artifacts

    NASA Technical Reports Server (NTRS)

    Woese, C. R.; Achenbach, L.; Rouviere, P.; Mandelco, L.

    1991-01-01

    A major and too little recognized source of artifact in phylogenetic analysis of molecular sequence data is compositional difference among sequences. The problem becomes particularly acute when alignments contain ribosomal RNAs from both mesophilic and thermophilic species. Among prokaryotes the latter are considerably higher in G + C content than the former, which often results in artificial clustering of thermophilic lineages and their being placed artificially deep in phylogenetic trees. In this communication we review archaeal phylogeny in the light of this consideration, focusing in particular on the phylogenetic position of the sulfate reducing species Archaeoglobus fulgidus, using both 16S rRNA and 23S rRNA sequences. The analysis shows clearly that the previously reported deep branching of the A. fulgidus lineage (very near the base of the euryarchaeal side of the archaeal tree) is incorrect, and that the lineage actually groups with a previously recognized unit that comprises the Methanomicrobiales and extreme halophiles.
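
    The compositional artifact described above can be screened for before any tree is built. The following minimal sketch computes the G + C fraction of each sequence in an alignment and reports the spread; the sequences, their names, and the function name are invented for illustration, and the choice of what spread counts as worrying is left to the reader.

```python
def gc_content(seq):
    """Fraction of G and C among unambiguous bases (gaps and N are ignored)."""
    seq = seq.upper().replace("T", "U")  # treat DNA and RNA spellings alike
    counted = [b for b in seq if b in "ACGU"]
    if not counted:
        return 0.0
    return sum(b in "GC" for b in counted) / len(counted)

# Hypothetical aligned rRNA fragments; not real data.
alignment = {
    "mesophile_1": "AUGCAUUAGCAUAUUA--AUGC",
    "mesophile_2": "AUGCAUUAGCUUAUUAGCAUGC",
    "thermophile_1": "GCGCGGCCGCGUGGCC--GCGC",
    "thermophile_2": "GCGCGGCCGCAUGGCCGGGCGC",
}

gc = {name: gc_content(seq) for name, seq in alignment.items()}
spread = max(gc.values()) - min(gc.values())

for name, value in sorted(gc.items(), key=lambda item: item[1]):
    print(f"{name:15s} G+C = {value:.2f}")
print(f"spread = {spread:.2f}")
```

    A large spread does not prove that a clustering is artifactual, but it marks alignments in which thermophilic lineages may group together for compositional rather than historical reasons.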

  15. Structure of the small ribosomal subunit RNA of the pulmonate snail, Limicolaria kambeul, and phylogenetic analysis of the Metazoa.

    PubMed

    Winnepennickx, B; Backeljau, T; van de Peer, Y; De Wachter, R

    1992-09-01

    The complete nucleotide sequence of the small ribosomal subunit RNA of the gastropod, Limicolaria kambeul, was determined and used to infer a secondary structure model. In order to clarify the phylogenetic position of the Mollusca among the Metazoa, an evolutionary tree was constructed by neighbor-joining, starting from an alignment of small ribosomal subunit RNA sequences. The Mollusca appear to be a monophyletic group, related to Arthropoda and Chordata in an unresolved trichotomy. PMID:1505675

  16. Phylogeny.fr: robust phylogenetic analysis for the non-specialist

    PubMed Central

    Dereeper, A.; Guignon, V.; Blanc, G.; Audic, S.; Buffet, S.; Chevenet, F.; Dufayard, J.-F.; Guindon, S.; Lefort, V.; Lescot, M.; Gascuel, O.

    2008-01-01

    Phylogenetic analyses are central to many research areas in biology and typically involve the identification of homologous sequences, their multiple alignment, the phylogenetic reconstruction and the graphical representation of the inferred tree. The Phylogeny.fr platform transparently chains programs to automatically perform these tasks. It is primarily designed for biologists with no experience in phylogeny, but can also meet the needs of specialists; the first ones will find up-to-date tools chained in a phylogeny pipeline to analyze their data in a simple and robust way, while the specialists will be able to easily build and run sophisticated analyses. Phylogeny.fr offers three main modes. The ‘One Click’ mode targets non-specialists and provides a ready-to-use pipeline chaining programs with recognized accuracy and speed: MUSCLE for multiple alignment, PhyML for tree building, and TreeDyn for tree rendering. All parameters are set up to suit most studies, and users only have to provide their input sequences to obtain a ready-to-print tree. The ‘Advanced’ mode uses the same pipeline but allows the parameters of each program to be customized by users. The ‘A la Carte’ mode offers more flexibility and sophistication, as users can build their own pipeline by selecting and setting up the required steps from a large choice of tools to suit their specific needs. Prior to phylogenetic analysis, users can also collect neighbors of a query sequence by running BLAST on general or specialized databases. A guide tree then helps to select neighbor sequences to be used as input for the phylogeny pipeline. Phylogeny.fr is available at: http://www.phylogeny.fr/ PMID:18424797

  17. The Phylogenetic Likelihood Library

    PubMed Central

    Flouri, T.; Izquierdo-Carrasco, F.; Darriba, D.; Aberer, A.J.; Nguyen, L.-T.; Minh, B.Q.; Von Haeseler, A.; Stamatakis, A.

    2015-01-01

    We introduce the Phylogenetic Likelihood Library (PLL), a highly optimized application programming interface for developing likelihood-based phylogenetic inference and postanalysis software. The PLL implements appropriate data structures and functions that allow users to quickly implement common, error-prone, and labor-intensive tasks, such as likelihood calculations, model parameter as well as branch length optimization, and tree space exploration. The highly optimized and parallelized implementation of the phylogenetic likelihood function and a thorough documentation provide a framework for rapid development of scalable parallel phylogenetic software. By example of two likelihood-based phylogenetic codes we show that the PLL improves the sequential performance of current software by a factor of 2–10 while requiring only 1 month of programming time for integration. We show that, when numerical scaling for preventing floating point underflow is enabled, the double precision likelihood calculations in the PLL are up to 1.9 times faster than those in BEAGLE. On an empirical DNA dataset with 2000 taxa the AVX version of PLL is 4 times faster than BEAGLE (scaling enabled and required). The PLL is available at http://www.libpll.org under the GNU General Public License (GPL). PMID:25358969

  18. The phylogenetic likelihood library.

    PubMed

    Flouri, T; Izquierdo-Carrasco, F; Darriba, D; Aberer, A J; Nguyen, L-T; Minh, B Q; Von Haeseler, A; Stamatakis, A

    2015-03-01

    We introduce the Phylogenetic Likelihood Library (PLL), a highly optimized application programming interface for developing likelihood-based phylogenetic inference and postanalysis software. The PLL implements appropriate data structures and functions that allow users to quickly implement common, error-prone, and labor-intensive tasks, such as likelihood calculations, model parameter as well as branch length optimization, and tree space exploration. The highly optimized and parallelized implementation of the phylogenetic likelihood function and a thorough documentation provide a framework for rapid development of scalable parallel phylogenetic software. By example of two likelihood-based phylogenetic codes we show that the PLL improves the sequential performance of current software by a factor of 2-10 while requiring only 1 month of programming time for integration. We show that, when numerical scaling for preventing floating point underflow is enabled, the double precision likelihood calculations in the PLL are up to 1.9 times faster than those in BEAGLE. On an empirical DNA dataset with 2000 taxa the AVX version of PLL is 4 times faster than BEAGLE (scaling enabled and required). The PLL is available at http://www.libpll.org under the GNU General Public License (GPL). PMID:25358969
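
    To make the phrase "numerical scaling for preventing floating point underflow" concrete, here is a small plain-Python sketch of a per-site likelihood computed by Felsenstein's pruning recursion, with the conditional likelihood vector rescaled at every inner node and the scaling factors accumulated in log space. It does not use the PLL or BEAGLE APIs; the Jukes-Cantor model, the tree encoding, the taxon names, and the function names are all assumptions chosen for brevity.

```python
import math

STATES = "ACGT"

def jc69(t):
    """JC69 transition probabilities for a branch of length t (substitutions/site)."""
    e = math.exp(-4.0 * t / 3.0)
    same, diff = 0.25 + 0.75 * e, 0.25 - 0.25 * e
    return [[same if i == j else diff for j in range(4)] for i in range(4)]

def partial(node, site):
    """Return (rescaled conditional likelihoods, accumulated log scaling factor).

    A node is either a leaf name (str) or a list of (child, branch_length) pairs.
    Branch lengths are assumed to be strictly positive.
    """
    if isinstance(node, str):
        return [1.0 if s == site[node] else 0.0 for s in STATES], 0.0
    vec, log_scale = [1.0] * 4, 0.0
    for child, t in node:
        child_vec, child_log = partial(child, site)
        p = jc69(t)
        vec = [vec[i] * sum(p[i][j] * child_vec[j] for j in range(4))
               for i in range(4)]
        log_scale += child_log
    biggest = max(vec)
    # Rescale so the largest entry is 1 and remember the factor in log space.
    return [v / biggest for v in vec], log_scale + math.log(biggest)

def site_log_likelihood(tree, site):
    """Log-likelihood of one alignment column with uniform base frequencies."""
    vec, log_scale = partial(tree, site)
    return math.log(sum(0.25 * v for v in vec)) + log_scale

# Hypothetical three-taxon tree and one alignment column.
tree = [([("human", 0.1), ("chimp", 0.1)], 0.05), ("gorilla", 0.2)]
site = {"human": "A", "chimp": "A", "gorilla": "G"}
log_l = site_log_likelihood(tree, site)
print(f"site log-likelihood: {log_l:.6f}")
```

    On trees with thousands of taxa the unscaled product of conditional likelihoods can fall below the smallest representable double, which is why the abstract treats scaling as required for the 2000-taxon dataset; production libraries typically rescale only when entries drop below a threshold rather than at every node.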

  19. Multigene phylogenetics reveals temporal diversification of major African malaria vectors.

    PubMed

    Kamali, Maryam; Marek, Paul E; Peery, Ashley; Antonio-Nkondjio, Christophe; Ndo, Cyrille; Tu, Zhijian; Simard, Frederic; Sharakhov, Igor V

    2014-01-01

    The major vectors of malaria in sub-Saharan Africa belong to subgenus Cellia. Yet, phylogenetic relationships and temporal diversification among African mosquito species have not been unambiguously determined. Knowledge about vector evolutionary history is crucial for correct interpretation of genetic changes identified through comparative genomics analyses. In this study, we estimated a molecular phylogeny using 49 gene sequences for the African malaria vectors An. gambiae, An. funestus, An. nili, the Asian malaria mosquito An. stephensi, and the outgroup species Culex quinquefasciatus and Aedes aegypti. To infer the phylogeny, we identified orthologous sequences uniformly distributed approximately every 5 Mb in the five chromosomal arms. The sequences were aligned and the phylogenetic trees were inferred using maximum likelihood and neighbor-joining methods. Bayesian molecular dating using a relaxed log normal model was used to infer divergence times. Trees from individual genes agreed with each other, placing An. nili as a basal clade that diversified from the studied malaria mosquito species 47.6 million years ago (mya). Other African malaria vectors originated more recently, and independently acquired traits related to vectorial capacity. The lineage leading to An. gambiae diverged 30.4 mya, while the African vector An. funestus and the Asian vector An. stephensi were the most closely related sister taxa that split 20.8 mya. These results were supported by consistently high bootstrap values in concatenated phylogenetic trees generated individually for each chromosomal arm. Genome-wide multigene phylogenetic analysis is a useful approach for discerning historic relationships among malaria vectors, providing a framework for the correct interpretation of genomic changes across species, and comprehending the evolutionary origins of this ubiquitous and deadly insect-borne disease. PMID:24705448

  20. Data on phylogenetic analyses of gazelles (genus Gazella) based on mitochondrial and nuclear intron markers.

    PubMed

    Lerp, Hannes; Klaus, Sebastian; Allgöwer, Stefanie; Wronski, Torsten; Pfenninger, Markus; Plath, Martin

    2016-06-01

    The data provided is related to the article "Phylogenetic analyses of gazelles reveal repeated transitions of key ecological traits and provide novel insights into the origin of the genus Gazella" [1]. The data is based on 48 tissue samples of all nine extant species of the genus Gazella, namely Gazella gazella, Gazella arabica, Gazella bennettii, Gazella cuvieri, Gazella dorcas, Gazella leptoceros, Gazella marica, Gazella spekei, and Gazella subgutturosa and four related taxa (Saiga tatarica, Antidorcas marsupialis, Antilope cervicapra and Eudorcas rufifrons). It comprises alignments of sequences of a cytochrome b data set and of six nuclear intron markers. For the latter new primers were designed based on cattle and sheep genomes. Based on these alignments phylogenetic trees were inferred using Bayesian Inference and Maximum Likelihood methods. Furthermore, ancestral character states (inferred with BayesTraits 1.0) and ancestral ranges based on a Dispersal-Extinction-Cladogenesis model were estimated and result files were stored within this article. PMID:27054158

  1. Data on phylogenetic analyses of gazelles (genus Gazella) based on mitochondrial and nuclear intron markers

    PubMed Central

    Lerp, Hannes; Klaus, Sebastian; Allgöwer, Stefanie; Wronski, Torsten; Pfenninger, Markus; Plath, Martin

    2016-01-01

    The data provided is related to the article “Phylogenetic analyses of gazelles reveal repeated transitions of key ecological traits and provide novel insights into the origin of the genus Gazella” [1]. The data is based on 48 tissue samples of all nine extant species of the genus Gazella, namely Gazella gazella, Gazella arabica, Gazella bennettii, Gazella cuvieri, Gazella dorcas, Gazella leptoceros, Gazella marica, Gazella spekei, and Gazella subgutturosa and four related taxa (Saiga tatarica, Antidorcas marsupialis, Antilope cervicapra and Eudorcas rufifrons). It comprises alignments of sequences of a cytochrome b data set and of six nuclear intron markers. For the latter new primers were designed based on cattle and sheep genomes. Based on these alignments phylogenetic trees were inferred using Bayesian Inference and Maximum Likelihood methods. Furthermore, ancestral character states (inferred with BayesTraits 1.0) and ancestral ranges based on a Dispersal-Extinction-Cladogenesis model were estimated and result files were stored within this article. PMID:27054158

  2. Molecular identification and phylogenetic study of Demodex caprae.

    PubMed

    Zhao, Ya-E; Cheng, Juan; Hu, Li; Ma, Jun-Xian

    2014-10-01

    The DNA barcode has been widely used in species identification and phylogenetic analysis since 2003, but there have been no reports in Demodex. In this study, to obtain an appropriate DNA barcode for Demodex, molecular identification of Demodex caprae based on mitochondrial cox1 was conducted. First, individual adults and eggs of D. caprae were obtained for genomic DNA (gDNA) extraction; second, a mitochondrial cox1 fragment was amplified, cloned, and sequenced; third, cox1 fragments of D. caprae were aligned with those of other Demodex retrieved from GenBank; finally, the intra- and interspecific divergences were computed and the phylogenetic trees were reconstructed to analyze phylogenetic relationships in Demodex. Results obtained from seven 429-bp fragments of D. caprae showed that sequence identities were above 99.1% among three adults and four eggs. The intraspecific divergences in D. caprae, Demodex folliculorum, Demodex brevis, and Demodex canis were 0.0-0.9, 0.5-0.9, 0.0-0.2, and 0.0-0.5%, respectively, while the interspecific divergences between D. caprae and D. folliculorum, D. canis, and D. brevis were 20.3-20.9, 21.8-23.0, and 25.0-25.3%, respectively. The interspecific divergences were 10 times higher than intraspecific ones, indicating a considerable barcoding gap. Furthermore, the phylogenetic trees showed that four Demodex species gathered separately, representing independent species; and Demodex folliculorum gathered with canine Demodex, D. caprae, and D. brevis in sequence. In conclusion, the selected 429-bp mitochondrial cox1 gene is an appropriate DNA barcode for molecular classification, identification, and phylogenetic analysis of Demodex. D. caprae is an independent species and D. folliculorum is closer to D. canis than to D. caprae or D. brevis. PMID:25132566
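
    The barcoding-gap comparison reported above boils down to pairwise divergences sorted into within-species and between-species pairs. The sketch below shows that bookkeeping on toy aligned sequences. The uncorrected p-distance is an assumption (the abstract does not say which distance model was used), and the species labels, sequences, and function names are hypothetical.

```python
from itertools import combinations

def p_distance(a, b):
    """Proportion of differing sites, skipping columns gapped in either sequence."""
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)

def divergence_ranges(seqs):
    """seqs maps (species, sample_id) to an aligned sequence.

    Returns ((min, max) intraspecific, (min, max) interspecific) divergences.
    """
    intra, inter = [], []
    for (key1, seq1), (key2, seq2) in combinations(seqs.items(), 2):
        bucket = intra if key1[0] == key2[0] else inter
        bucket.append(p_distance(seq1, seq2))
    return (min(intra), max(intra)), (min(inter), max(inter))

# Toy alignment; real barcodes would be the 429-bp cox1 fragments.
seqs = {
    ("species_1", "adult"): "ACGTACGTAC",
    ("species_1", "egg"): "ACGTACGTAT",
    ("species_2", "adult"): "TGCAACGTAC",
    ("species_2", "egg"): "TGCAACGTAC",
}

intra, inter = divergence_ranges(seqs)
has_gap = inter[0] > intra[1]  # smallest between-species > largest within-species
print(f"intraspecific {intra}, interspecific {inter}, barcoding gap: {has_gap}")
```

    A barcoding gap exists, in this simple sense, when the smallest interspecific divergence exceeds the largest intraspecific one.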

  3. Evaluating the phylogenetic signal limit from mitogenomes, slow evolving nuclear genes, and the concatenation approach. New insights into the Lacertini radiation using fast evolving nuclear genes and species trees.

    PubMed

    Mendes, Joana; Harris, D James; Carranza, Salvador; Salvi, Daniele

    2016-07-01

    Estimating the phylogeny of lacertid lizards, and particularly the tribe Lacertini has been challenging, possibly due to the fast radiation of this group resulting in a hard polytomy. However this is still an open question, as concatenated data primarily from mitochondrial markers have been used so far whereas in a recent phylogeny based on a compilation of these data within a squamate supermatrix the basal polytomy seems to be resolved. In this study, we estimate phylogenetic relationships between all Lacertini genera using for the first time DNA sequences from five fast evolving nuclear genes (acm4, mc1r, pdc, βfib and reln) and two mitochondrial genes (nd4 and 12S). We generated a total of 529 sequences from 88 species and used Maximum Likelihood and Bayesian Inference methods based on concatenated multilocus dataset as well as a coalescent-based species tree approach with the aim of (i) shedding light on the basal relationships of Lacertini (ii) assessing the monophyly of genera which were previously questioned, and (iii) discussing differences between estimates from this and previous studies based on different markers, and phylogenetic methods. Results uncovered (i) a new phylogenetic clade formed by the monotypic genera Archaeolacerta, Zootoca, Teira and Scelarcis; and (ii) support for the monophyly of the Algyroides clade, with two sister species pairs represented by western (A. marchi and A. fitzingeri) and eastern (A. nigropunctatus and A. moreoticus) lineages. In both cases the members of these groups show peculiar morphology and very different geographical distributions, suggesting that they are relictual groups that were once diverse and widespread. They probably originated about 11-13 million years ago during early events of speciation in the tribe, and the split between their members is estimated to be only slightly older. This scenario may explain why mitochondrial markers (possibly saturated at higher divergence levels) or slower nuclear markers

  4. Quartets and unrooted phylogenetic networks.

    PubMed

    Gambette, Philippe; Berry, Vincent; Paul, Christophe

    2012-08-01

    Phylogenetic networks were introduced to describe evolution in the presence of exchanges of genetic material between coexisting species or individuals. Split networks in particular were introduced as a special kind of abstract network to visualize conflicts between phylogenetic trees which may correspond to such exchanges. More recently, methods were designed to reconstruct explicit phylogenetic networks (whose vertices can be interpreted as biological events) from triplet data. In this article, we link abstract and explicit networks through their combinatorial properties, by introducing the unrooted analog of level-k networks. In particular, we give an equivalence theorem between circular split systems and unrooted level-1 networks. We also show how to adapt to quartets some existing results on triplets, in order to reconstruct unrooted level-k phylogenetic networks. These results give an interesting perspective on the combinatorics of phylogenetic networks and also raise algorithmic and combinatorial questions. PMID:22809417

  5. ALFRED: A Practical Method for Alignment-Free Distance Computation.

    PubMed

    Thankachan, Sharma V; Chockalingam, Sriram P; Liu, Yongchao; Apostolico, Alberto; Aluru, Srinivas

    2016-06-01

    Alignment-free approaches are attracting sustained interest in many sequence analysis applications such as phylogenetic inference and metagenomic classification/clustering, especially for large-scale sequence datasets. Besides the widely used k-mer methods, the average common substring (ACS) approach has emerged as one of the well-known alignment-free approaches. Two recent works further generalize this ACS approach by allowing a bounded number k of mismatches in the common substrings, relying on approximation (linear time) and exact computation, respectively. Albeit having a good worst-case time complexity [Formula: see text], the exact approach is complex and unlikely to be efficient in practice. Herein, we present ALFRED, an alignment-free distance computation method, which solves the generalized common substring search problem via exact computation. Compared to the theoretical approach, our algorithm is easier to implement and more practical to use, while still providing highly competitive theoretical performance with an expected run-time of [Formula: see text]. By applying our program to phylogenetic inference as a case study, we find that our program exactly reconstructs the topology of the reference phylogenetic tree for a set of 27 primate mitochondrial genomes, at reasonably acceptable speed. ALFRED is implemented in the C++ programming language and the source code is freely available online. PMID:27138275
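
    For readers unfamiliar with the average common substring idea that ALFRED generalizes, the sketch below computes it naively for the special case of zero mismatches (k = 0). The distance formula follows the commonly cited ACS formulation and should be read as an assumption here; the quadratic-time search, the toy sequences, and the function names are for illustration only and have nothing to do with ALFRED's actual algorithm or code.

```python
import math

def longest_match_lengths(a, b):
    """For each start i in a, the length of the longest prefix of a[i:] found in b."""
    lengths = []
    for i in range(len(a)):
        k = 0
        # Extend while the next-longer prefix still occurs somewhere in b.
        while i + k < len(a) and a[i:i + k + 1] in b:
            k += 1
        lengths.append(k)
    return lengths

def acs_directed(a, b):
    """Directed ACS dissimilarity with the usual self-comparison correction term."""
    avg = sum(longest_match_lengths(a, b)) / len(a)
    if avg == 0:
        return math.inf  # the sequences share no characters at all
    return math.log(len(b)) / avg - 2.0 * math.log(len(a)) / len(a)

def acs_distance(a, b):
    """Symmetrized ACS dissimilarity (mean of the two directed values)."""
    return 0.5 * (acs_directed(a, b) + acs_directed(b, a))

# Toy sequences; real inputs would be whole genomes.
s = "ACGTACGGTCA"
t = "ACGTTCGGACA"
u = "TTGACCGATAG"
print(acs_distance(s, t), acs_distance(s, u))
```

    Allowing k mismatches replaces the exact-match test in the inner loop with a bounded-mismatch match, which is the harder search problem the abstract refers to.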

  6. Divergent ancestral lineages of newfound hantaviruses harbored by phylogenetically related crocidurine shrew species in Korea

    PubMed Central

    Arai, Satoru; Gu, Se Hun; Baek, Luck Ju; Tabara, Kenji; Bennett, Shannon; Oh, Hong-Shik; Takada, Nobuhiro; Kang, Hae Ji; Tanaka-Taya, Keiko; Morikawa, Shigeru; Okabe, Nobuhiko; Yanagihara, Richard; Song, Jin-Won

    2012-01-01

    Spurred by the recent isolation of a novel hantavirus, named Imjin virus (MJNV), from the Ussuri white-toothed shrew (Crocidura lasiura), targeted trapping was conducted for the phylogenetically related Asian lesser white-toothed shrew (Crocidura shantungensis). Pair-wise alignment and comparison of the S, M and L segments of a newfound hantavirus, designated Jeju virus (JJUV), indicated remarkably low nucleotide and amino acid sequence similarity with MJNV. Phylogenetic analyses, using maximum likelihood and Bayesian methods, showed divergent ancestral lineages for JJUV and MJNV, despite the close phylogenetic relationship of their reservoir soricid hosts. Also, no evidence of host switching was apparent in tanglegrams, generated by TreeMap 2.0β. PMID:22230701

  7. Calculation of Evolutionary Correlation between Individual Genes and Full-Length Genome: A Method Useful for Choosing Phylogenetic Markers for Molecular Epidemiology

    PubMed Central

    Wang, Shuai; Luo, Xuenong; Wei, Wei; Zheng, Yadong; Dou, Yongxi; Cai, Xuepeng

    2013-01-01

    Individual genes or regions are still commonly used to estimate the phylogenetic relationships among viral isolates. The genomic regions that can faithfully provide assessments consistent with those predicted with full-length genome sequences would be preferable to serve as good candidates of the phylogenetic markers for molecular epidemiological studies of many viruses. Here we employed a statistical method to evaluate the evolutionary relationships between individual viral genes and full-length genomes without tree construction as a way to determine which gene can match the genome well in phylogenetic analyses. This method was performed by calculation of linear correlations between the genetic distance matrices of aligned individual gene sequences and aligned genome sequences. We applied this method to the phylogenetic analyses of porcine circovirus 2 (PCV2), measles virus (MV), hepatitis E virus (HEV) and Japanese encephalitis virus (JEV). Phylogenetic trees were constructed for comparisons and the possible factors affecting the method accuracy were also discussed in the calculations. The results revealed that this method could produce results consistent with those of previous studies about the proper consensus sequences that could be successfully used as phylogenetic markers. And our results also suggested that these evolutionary correlations could provide useful information for identifying genes that could be used effectively to infer the genetic relationships. PMID:24312527
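
    The correlation step described above can be sketched in a few lines: compute pairwise distances for a candidate gene and for the full genome over the same isolates, then correlate the two lists. In the sketch below the uncorrected p-distance, the Pearson coefficient over the upper triangle of the matrices, the toy sequences, and the function names are all assumptions; the paper may use different distance models and statistics.

```python
import math
from itertools import combinations

def pairwise_distances(seqs):
    """Uncorrected p-distances for all pairs of equal-length aligned sequences."""
    return [sum(x != y for x, y in zip(seqs[i], seqs[j])) / len(seqs[i])
            for i, j in combinations(range(len(seqs)), 2)]

def pearson(u, v):
    """Pearson correlation between two equal-length lists of distances."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)

# Four hypothetical aligned "genomes"; columns 0-5 are gene A, 6-11 are gene B.
genomes = [
    "AAAAAAAAAAAA",
    "AAAAACAAAAAA",
    "CCCAACAAAAAC",
    "CCCCAACAAAAA",
]
gene_a = [g[:6] for g in genomes]
gene_b = [g[6:] for g in genomes]

genome_d = pairwise_distances(genomes)
r_a = pearson(pairwise_distances(gene_a), genome_d)
r_b = pearson(pairwise_distances(gene_b), genome_d)
print(f"gene A vs genome: r = {r_a:.3f}; gene B vs genome: r = {r_b:.3f}")
```

    Under this reading, the gene whose distances track the genome-wide distances most closely would be the better candidate marker; note that pairwise distances are not independent observations, so the coefficient is a ranking aid rather than a formal test.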

  8. Fast statistical alignment.

    PubMed

    Bradley, Robert K; Roberts, Adam; Smoot, Michael; Juvekar, Sudeep; Do, Jaeyoung; Dewey, Colin; Holmes, Ian; Pachter, Lior

    2009-05-01

    We describe a new program for the alignment of multiple biological sequences that is both statistically motivated and fast enough for problem sizes that arise in practice. Our Fast Statistical Alignment program is based on pair hidden Markov models which approximate an insertion/deletion process on a tree and uses a sequence annealing algorithm to combine the posterior probabilities estimated from these models into a multiple alignment. FSA uses its explicit statistical model to produce multiple alignments which are accompanied by estimates of the alignment accuracy and uncertainty for every column and character of the alignment--previously available only with alignment programs which use computationally-expensive Markov Chain Monte Carlo approaches--yet can align thousands of long sequences. Moreover, FSA utilizes an unsupervised query-specific learning procedure for parameter estimation which leads to improved accuracy on benchmark reference alignments in comparison to existing programs. The centroid alignment approach taken by FSA, in combination with its learning procedure, drastically reduces the amount of false-positive alignment on biological data in comparison to that given by other methods. The FSA program and a companion visualization tool for exploring uncertainty in alignments can be used via a web interface at http://orangutan.math.berkeley.edu/fsa/, and the source code is available at http://fsa.sourceforge.net/. PMID:19478997

  9. Phylogenics & Tree-Thinking

    ERIC Educational Resources Information Center

    Baum, David A.; Offner, Susan

    2008-01-01

    Phylogenetic trees, which are depictions of the inferred evolutionary relationships among a set of species, now permeate almost all branches of biology and are appearing in increasing numbers in biology textbooks. While few state standards explicitly require knowledge of phylogenetics, most require some knowledge of evolutionary biology, and many…

  10. On Determining if Tree-based Networks Contain Fixed Trees.

    PubMed

    Anaya, Maria; Anipchenko-Ulaj, Olga; Ashfaq, Aisha; Chiu, Joyce; Kaiser, Mahedi; Ohsawa, Max Shoji; Owen, Megan; Pavlechko, Ella; St John, Katherine; Suleria, Shivam; Thompson, Keith; Yap, Corrine

    2016-05-01

    We address an open question of Francis and Steel about phylogenetic networks and trees. They give a polynomial time algorithm to decide if a phylogenetic network, N, is tree-based and pose the problem: given a fixed tree T and network N, is N based on T? We show that it is [Formula: see text]-hard to decide, by reduction from 3-Dimensional Matching (3DM) and further that the problem is fixed-parameter tractable. PMID:27125655

  11. Insights into the phylogenetic positions of photosynthetic bacteria obtained from 5S rRNA and 16S rRNA sequence data

    NASA Technical Reports Server (NTRS)

    Fox, G. E.

    1985-01-01

    Comparisons of complete 16S ribosomal ribonucleic acid (rRNA) sequences established that the secondary structure of these molecules is highly conserved. Earlier work with 5S rRNA secondary structure revealed that when structural conservation exists the alignment of sequences is straightforward. The constancy of structure implies minimal functional change. Under these conditions a uniform evolutionary rate can be expected so that conditions are favorable for phylogenetic tree construction.

  12. IVisTMSA: Interactive Visual Tools for Multiple Sequence Alignments.

    PubMed

    Pervez, Muhammad Tariq; Babar, Masroor Ellahi; Nadeem, Asif; Aslam, Naeem; Naveed, Nasir; Ahmad, Sarfraz; Muhammad, Shah; Qadri, Salman; Shahid, Muhammad; Hussain, Tanveer; Javed, Maryam

    2015-01-01

    IVisTMSA is a software package of seven graphical tools for multiple sequence alignments. MSApad is an editing and analysis tool. It can load 409% more data than Jalview, STRAP, CINEMA, and Base-by-Base. MSA comparator allows the user to visualize consistent and inconsistent regions of reference and test alignments of more than 21-MB size in less than 12 seconds. MSA comparator is 5,200% efficient and more than 40% efficient as compared to BALiBASE c program and FastSP, respectively. MSA reconstruction tool provides graphical user interfaces for four popular aligners and allows the user to load several sequence files at a time. FASTA generator converts seven formats of alignments of unlimited size into FASTA format in a few seconds. MSA ID calculator calculates identity matrix of more than 11,000 sequences with a sequence length of 2,696 base pairs in less than 100 seconds. Tree and Distance Matrix calculation tools generate phylogenetic tree and distance matrix, respectively, using neighbor joining with % identity and BLOSUM 62 matrix. PMID:25861209
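
    As a rough illustration of the identity-matrix step described above, the following Python sketch computes pairwise percent identity from sequences that are already aligned. It is not IVisTMSA's code: the function names are hypothetical, and skipping columns where either sequence has a gap is an assumption, since the abstract does not say which denominator the tool uses.

```python
from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Percent identity over columns where neither sequence has a gap.

    Assumption: gap columns are excluded from the denominator.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue  # skip columns with a gap in either sequence
        compared += 1
        matches += x == y
    return 100.0 * matches / compared if compared else 0.0


def identity_matrix(seqs: list[str]) -> list[list[float]]:
    """Symmetric matrix of pairwise percent identities (diagonal set to 100)."""
    n = len(seqs)
    matrix = [[100.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        matrix[i][j] = matrix[j][i] = percent_identity(seqs[i], seqs[j])
    return matrix
```

    Other common definitions divide by the full alignment length or by the shorter sequence length, so values from different tools are not always directly comparable.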

  13. IVisTMSA: Interactive Visual Tools for Multiple Sequence Alignments

    PubMed Central

    Pervez, Muhammad Tariq; Babar, Masroor Ellahi; Nadeem, Asif; Aslam, Naeem; Naveed, Nasir; Ahmad, Sarfraz; Muhammad, Shah; Qadri, Salman; Shahid, Muhammad; Hussain, Tanveer; Javed, Maryam

    2015-01-01

    IVisTMSA is a software package of seven graphical tools for multiple sequence alignments. MSApad is an editing and analysis tool. It can load 409% more data than Jalview, STRAP, CINEMA, and Base-by-Base. MSA comparator allows the user to visualize consistent and inconsistent regions of reference and test alignments of more than 21-MB size in less than 12 seconds. MSA comparator is 5,200% efficient and more than 40% efficient as compared to BALiBASE c program and FastSP, respectively. MSA reconstruction tool provides graphical user interfaces for four popular aligners and allows the user to load several sequence files at a time. FASTA generator converts seven formats of alignments of unlimited size into FASTA format in a few seconds. MSA ID calculator calculates identity matrix of more than 11,000 sequences with a sequence length of 2,696 base pairs in less than 100 seconds. Tree and Distance Matrix calculation tools generate phylogenetic tree and distance matrix, respectively, using neighbor joining with % identity and BLOSUM 62 matrix. PMID:25861209

  14. Global Alignment System for Large Genomic Sequencing

    2002-03-01

    AVID is a global alignment system tailored for the alignment of large genomic sequences up to megabases in length. Features include the possibility of one sequence being in draft form, fast alignment, robustness and accuracy. The method is an anchor based alignment using maximal matches derived from suffix trees.
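
    The anchoring idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not AVID's implementation: a quadratic dynamic-programming search for the longest exact match stands in for the suffix-tree step, the recursive "longest match first" selection is a simplification, and the function names and the min_len parameter are hypothetical.

```python
def longest_exact_match(a: str, b: str) -> tuple[int, int, int]:
    """Return (start_a, start_b, length) of the longest common substring.

    Simple O(len(a) * len(b)) dynamic programming; a suffix tree would
    find the same match far faster on megabase-scale input.
    """
    best = (0, 0, 0)
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                cur[j] = prev[j - 1] + 1  # extend the match ending here
                if cur[j] > best[2]:
                    best = (i - cur[j], j - cur[j], cur[j])
        prev = cur
    return best


def anchors(a: str, b: str, min_len: int = 3,
            off_a: int = 0, off_b: int = 0) -> list[tuple[int, int, int]]:
    """Pick collinear exact-match anchors: take the longest match,
    then recurse on the regions to its left and right."""
    start_a, start_b, n = longest_exact_match(a, b)
    if n < min_len:
        return []
    left = anchors(a[:start_a], b[:start_b], min_len, off_a, off_b)
    right = anchors(a[start_a + n:], b[start_b + n:], min_len,
                    off_a + start_a + n, off_b + start_b + n)
    return left + [(off_a + start_a, off_b + start_b, n)] + right
```

    The regions between consecutive anchors are short enough to be aligned with a standard global method; the anchors themselves only fix the coarse layout.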

  15. Detecting the limits of regulatory element conservation and divergence estimation using pairwise and multiple alignments

    SciTech Connect

    Pollard, Daniel A.; Moses, Alan M.; Iyer, Venky N.; Eisen, Michael B.

    2006-08-14

    Background: Molecular evolutionary studies of noncoding sequences rely on multiple alignments. Yet how multiple alignment accuracy varies across sequence types, tree topologies, divergences and tools, and further how this variation impacts specific inferences, remains unclear. Results: Here we develop a molecular evolution simulation platform, CisEvolver, with models of background noncoding and transcription factor binding site evolution, and use simulated alignments to systematically examine multiple alignment accuracy and its impact on two key molecular evolutionary inferences: transcription factor binding site conservation and divergence estimation. We find that the accuracy of multiple alignments is determined almost exclusively by the pairwise divergence distance of the two most diverged species and that additional species have a negligible influence on alignment accuracy. Conserved transcription factor binding sites align better than surrounding noncoding DNA yet are often found to be misaligned at relatively short divergence distances, such that studies of binding site gain and loss could easily be confounded by alignment error. Divergence estimates from multiple alignments tend to be overestimated at short divergence distances but reach a tool specific divergence at which they cease to increase, leading to underestimation at long divergences. Our most striking finding was that overall alignment accuracy, binding site alignment accuracy and divergence estimation accuracy vary greatly across branches in a tree and are most accurate for terminal branches connecting sister taxa and least accurate for internal branches connecting sub-alignments. Conclusions: Our results suggest that variation in alignment accuracy can lead to errors in molecular evolutionary inferences that could be construed as biological variation. These findings have implications for which species to choose for analyses, what kind of errors would be expected for a given set of species and how multiple alignment tools and

  16. Phylogenetic support values are not necessarily informative: the case of the Serialia hypothesis (a mollusk phylogeny)

    PubMed Central

    Wägele, J Wolfgang; Letsch, Harald; Klussmann-Kolb, Annette; Mayer, Christoph; Misof, Bernhard; Wägele, Heike

    2009-01-01

    Background Molecular phylogenies are being published increasingly and many biologists rely on the most recent topologies. However, different phylogenetic trees often contain conflicting results and contradict significant background data. Not knowing how reliable traditional knowledge is, a crucial question concerns the quality of newly produced molecular data. The information content of DNA alignments is rarely discussed, as quality statements are mostly restricted to the statistical support of clades. Here we present a case study of a recently published mollusk phylogeny that contains surprising groupings, based on five genes and 108 species, and we apply new or rarely used tools for the analysis of the information content of alignments and for the filtering of noise (masking of random-like alignment regions, split decomposition, phylogenetic networks, quartet mapping). Results The data are very fragmentary and contain contaminations. We show that signal-like patterns in the data set are conflicting and partly not distinct and that the reported strong support for a "rather surprising result" (monoplacophorans and chitons form a monophylum Serialia) does not exist at the level of primary homologies. Split-decomposition, quartet mapping and neighbornet analyses reveal conflicting nucleotide patterns and lack of distinct phylogenetic signal for the deeper phylogeny of mollusks. Conclusion Even though currently a majority of molecular phylogenies are being justified with reference to the 'statistical' support of clades in tree topologies, this confidence seems to be unfounded. Contradictions between phylogenies based on different analyses are already a strong indication of unnoticed pitfalls. The use of tree-independent tools for exploratory analyses of data quality is highly recommended. Concerning the new mollusk phylogeny more convincing evidence is needed. PMID:19555513

  17. Phylogenetic study of Class Armophorea (Alveolata, Ciliophora) based on 18S-rDNA data

    PubMed Central

    da Silva Paiva, Thiago; do Nascimento Borges, Bárbara; da Silva-Neto, Inácio Domingos

    2013-01-01

    The 18S rDNA phylogeny of Class Armophorea, a group of anaerobic ciliates, is proposed based on an analysis of 44 sequences (out of 195) retrieved from the NCBI/GenBank database. Emphasis was placed on the use of two nucleotide alignment criteria that involved variation in the gap-opening and gap-extension parameters and the use of rRNA secondary structure to orientate multiple-alignment. A sensitivity analysis of 76 data sets was run to assess the effect of variations in indel parameters on tree topologies. Bayesian inference, maximum likelihood and maximum parsimony phylogenetic analyses were used to explore how different analytic frameworks influenced the resulting hypotheses. A sensitivity analysis revealed that the relationships among higher taxa of the Intramacronucleata were dependent upon how indels were determined during multiple-alignment of nucleotides. The phylogenetic analyses rejected the monophyly of the Armophorea most of the time and consistently indicated that the Metopidae and Nyctotheridae were related to the Litostomatea. There was no consensus on the placement of the Caenomorphidae, which could be a sister group of the Metopidae + Nyctotheridae, or could have diverged at the base of the Spirotrichea branch or the Intramacronucleata tree. PMID:24385862

  18. Sequence exploration reveals information bias among molecular markers used in phylogenetic reconstruction for Colletotrichum species.

    PubMed

    Rampersad, Sephra N; Hosein, Fazeeda N; Carrington, Christine Vf

    2014-01-01

    The Colletotrichum gloeosporioides species complex is among the most destructive fungal plant pathogens in the world; however, identification of isolates of quarantine importance to the intra-specific level is confounded by a number of factors that affect phylogenetic reconstruction. Information bias and quality parameters were investigated to determine whether nucleotide sequence alignments and phylogenetic trees accurately reflect the genetic diversity and phylogenetic relatedness of individuals. Sequence exploration of GAPDH, ACT, TUB2 and ITS markers indicated that the query sequences had different patterns of nucleotide substitution but were without evidence of base substitution saturation. Regions of high entropy were much more dispersed in the ACT and GAPDH marker alignments than for the ITS and TUB2 markers. A discernible bimodal gap in the genetic distance frequency histograms was produced for the ACT and GAPDH markers which indicated successful separation of intra- and inter-specific sequences in the data set. Overall, analyses indicated clear differences in the ability of these markers to phylogenetically separate individuals to the intra-specific level which coincided with information bias. PMID:25392785

  19. Simultaneous Bayesian Estimation of Alignment and Phylogeny under a Joint Model of Protein Sequence and Structure

    PubMed Central

    Herman, Joseph L.; Challis, Christopher J.; Novák, Ádám; Hein, Jotun; Schmidler, Scott C.

    2014-01-01

    For sequences that are highly divergent, there is often insufficient information to infer accurate alignments, and phylogenetic uncertainty may be high. One way to address this issue is to make use of protein structural information, since structures generally diverge more slowly than sequences. In this work, we extend a recently developed stochastic model of pairwise structural evolution to multiple structures on a tree, analytically integrating over ancestral structures to permit efficient likelihood computations under the resulting joint sequence–structure model. We observe that the inclusion of structural information significantly reduces alignment and topology uncertainty, and reduces the number of topology and alignment errors in cases where the true trees and alignments are known. In some cases, the inclusion of structure results in changes to the consensus topology, indicating that structure may contain additional information beyond that which can be obtained from sequences. We use the model to investigate the order of divergence of cytoglobins, myoglobins, and hemoglobins and observe a stabilization of phylogenetic inference: although a sequence-based inference assigns significant posterior probability to several different topologies, the structural model strongly favors one of these over the others and is more robust to the choice of data set. PMID:24899668

  20. Consistency and inconsistency of consensus methods for inferring species trees from gene trees in the presence of ancestral population structure.

    PubMed

    DeGiorgio, Michael; Rosenberg, Noah A

    2016-08-01

    In the last few years, several statistically consistent consensus methods for species tree inference have been devised that are robust to the gene tree discordance caused by incomplete lineage sorting in unstructured ancestral populations. One source of gene tree discordance that has only recently been identified as a potential obstacle for phylogenetic inference is ancestral population structure. In this article, we describe a general model of ancestral population structure, and by relying on a single carefully constructed example scenario, we show that the consensus methods Democratic Vote, STEAC, STAR, R(∗) Consensus, Rooted Triple Consensus, Minimize Deep Coalescences, and Majority-Rule Consensus are statistically inconsistent under the model. We find that among the consensus methods evaluated, the only method that is statistically consistent in the presence of ancestral population structure is GLASS/Maximum Tree. We use simulations to evaluate the behavior of the various consensus methods in a model with ancestral population structure, showing that as the number of gene trees increases, estimates on the basis of GLASS/Maximum Tree approach the true species tree topology irrespective of the level of population structure, whereas estimates based on the remaining methods only approach the true species tree topology if the level of structure is low. However, through simulations using species trees both with and without ancestral population structure, we show that GLASS/Maximum Tree performs unusually poorly on gene trees inferred from alignments with little information. This practical limitation of GLASS/Maximum Tree together with the inconsistency of other methods prompts the need for both further testing of additional existing methods and development of novel methods under conditions that incorporate ancestral population structure. PMID:27086043

  1. Phylogenetics, classification, and biogeography of the treefrogs (Amphibia: Anura: Arboranae).

    PubMed

    Duellman, William E; Marion, Angela B; Hedges, S Blair

    2016-01-01

    A phylogenetic analysis of sequences from 503 species of hylid frogs and four outgroup taxa resulted in 16,128 aligned sites of 19 genes. The molecular data were subjected to a maximum likelihood analysis that resulted in a new phylogenetic tree of treefrogs. A conservative new classification based on the tree has (1) three families composing an unranked taxon, Arboranae, (2) nine subfamilies (five resurrected, one new), and (3) six resurrected generic names and five new generic names. Using the results of a maximum likelihood timetree, times of divergence were determined. For the most part these times of divergence correlated well with historical geologic events. The arboranan frogs originated in South America in the Late Mesozoic or Early Cenozoic. The family Pelodryadidae diverged from its South American relative, Phyllomedusidae, in the Eocene and invaded Australia via Antarctica. There were two dispersals from South America to North America in the Paleogene. One lineage was the ancestral stock of Acris and its relatives, whereas the other lineage, subfamily Hylinae, differentiated into a myriad of genera in Middle America. PMID:27394762

  2. Entanglement, Invariants, and Phylogenetics

    NASA Astrophysics Data System (ADS)

    Sumner, J. G.

    2007-10-01

    This thesis develops and expands upon known techniques of mathematical physics relevant to the analysis of the popular Markov model of phylogenetic trees required in biology to reconstruct the evolutionary relationships of taxonomic units from biomolecular sequence data. The techniques of mathematical physics are plethora and have been developed for some time. The Markov model of phylogenetics and its analysis is a relatively new technique where most progress to date has been achieved by using discrete mathematics. This thesis takes a group theoretical approach to the problem by beginning with a remarkable mathematical parallel to the process of scattering in particle physics. This is shown to equate to branching events in the evolutionary history of molecular units. The major technical result of this thesis is the derivation of existence proofs and computational techniques for calculating polynomial group invariant functions on a multi-linear space where the group action is that relevant to a Markovian time evolution. The practical results of this thesis are an extended analysis of the use of invariant functions in distance based methods and the presentation of a new reconstruction technique for quartet trees which is consistent with the most general Markov model of sequence evolution.

  3. The dawn of open access to phylogenetic data.

    PubMed

    Magee, Andrew F; May, Michael R; Moore, Brian R

    2014-01-01

    The scientific enterprise depends critically on the preservation of and open access to published data. This basic tenet applies acutely to phylogenies (estimates of evolutionary relationships among species). Increasingly, phylogenies are estimated from increasingly large, genome-scale datasets using increasingly complex statistical methods that require increasing levels of expertise and computational investment. Moreover, the resulting phylogenetic data provide an explicit historical perspective that critically informs research in a vast and growing number of scientific disciplines. One such use is the study of changes in rates of lineage diversification (speciation--extinction) through time. As part of a meta-analysis in this area, we sought to collect phylogenetic data (comprising nucleotide sequence alignment and tree files) from 217 studies published in 46 journals over a 13-year period. We document our attempts to procure those data (from online archives and by direct request to corresponding authors), and report results of analyses (using Bayesian logistic regression) to assess the impact of various factors on the success of our efforts. Overall, complete phylogenetic data for [Formula: see text] of these studies are effectively lost to science. Our study indicates that phylogenetic data are more likely to be deposited in online archives and/or shared upon request when: (1) the publishing journal has a strong data-sharing policy; (2) the publishing journal has a higher impact factor, and; (3) the data are requested from faculty rather than students. Importantly, our survey spans recent policy initiatives and infrastructural changes; our analyses indicate that the positive impact of these community initiatives has been both dramatic and immediate. Although the results of our study indicate that the situation is dire, our findings also reveal tremendous recent progress in the sharing and preservation of phylogenetic data. PMID:25343725

  4. Phylogenetic analysis of the genus Sorghum based on combined sequence data from cpDNA regions and ITS generate well-supported trees with two major lineages

    PubMed Central

    Ng'uni, Dickson; Geleta, Mulatu; Fatih, Moneim; Bryngelsson, Tomas

    2010-01-01

    Background and Aims Wild Sorghum species provide novel traits for both biotic and abiotic stress resistance and yield for the improvement of cultivated sorghum. A better understanding of the phylogeny in the genus Sorghum will enhance use of the valuable agronomic traits found in wild sorghum. Methods Four regions of chloroplast DNA (cpDNA; psbZ-trnG, trnY-trnD, trnY-psbM and trnT-trnL) and the internal transcribed spacer (ITS) of nuclear ribosomal DNA were used to analyse the phylogeny of sorghum based on maximum-parsimony analyses. Key Results Parsimony analyses of the ITS and cpDNA regions as separate or combined sequence datasets formed trees with strong bootstrap support with two lineages: the Eu-sorghum species S. laxiflorum and S. macrospermum in one and Stiposorghum and Para-sorghum in the other. Within Eu-sorghum, S. bicolor-3, -11 and -14 originating from southern Africa form a distinct clade. S. bicolor-2, originally from Yemen, is distantly related to other S. bicolor accessions. Conclusions Eu-sorghum species are more closely related to S. macrospermum and S. laxiflorum than to any other Australian wild Sorghum species. S. macrospermum and S. laxiflorum are so closely related that it is inappropriate to classify them in separate sections. S. almum is closely associated with S. bicolor, suggesting that the latter is the maternal parent of the former given that cpDNA is maternally inherited in angiosperms. S. bicolor-3, -11 and -14, from southern Africa, are closely related to each other but distantly related to S. bicolor-2. PMID:20061309

  5. Refuting phylogenetic relationships

    PubMed Central

    Bucknam, James; Boucher, Yan; Bapteste, Eric

    2006-01-01

    Background Phylogenetic methods are philosophically grounded, and so can be philosophically biased in ways that limit explanatory power. This constitutes an important methodologic dimension not often taken into account. Here we address this dimension in the context of concatenation approaches to phylogeny. Results We discuss some of the limits of a methodology restricted to verificationism, the philosophy on which gene concatenation practices generally rely. As an alternative, we describe a software which identifies and focuses on impossible or refuted relationships, through a simple analysis of bootstrap bipartitions, followed by multivariate statistical analyses. We show how refuting phylogenetic relationships could in principle facilitate systematics. We also apply our method to the study of two complex phylogenies: the phylogeny of the archaea and the phylogeny of the core of genes shared by all life forms. While many groups are rejected, our results left open a possible proximity of N. equitans and the Methanopyrales, of the Archaea and the Cyanobacteria, and as well the possible grouping of the Methanobacteriales/Methanococcales and Thermoplasmatales, of the Spirochaetes and the Actinobacteria and of the Proteobacteria and Firmicutes. Conclusion It is sometimes easier (and preferable) to decide which species do not group together than which ones do. When possible topologies are limited, identifying local relationships that are rejected may be a useful alternative to classical concatenation approaches aiming to find a globally resolved tree on the basis of weak phylogenetic markers. Reviewers This article was reviewed by Mark Ragan, Eugene V Koonin and J Peter Gogarten. PMID:16956399

  6. Phylogenetic analysis of Maverick/Polinton giant transposons across organisms.

    PubMed

    Haapa-Paananen, Saija; Wahlberg, Niklas; Savilahti, Harri

    2014-09-01

    Polintons are a recently discovered group of large transposable elements (<40Kb in size) encoding up to 10 different proteins. The increasing number of genome sequencing projects has led to the discovery of these elements in genomes of protists, fungi, and animals, but not in plants. The RepBase database of eukaryotic repetitive elements currently contains consensus sequences and information of 70 Polinton elements from 28 organisms. Previous phylogenetic analyses have shown the relationship of Polintons to linear plasmids, bacteriophages, and retroviruses. However, a comprehensive phylogenetic analysis of all known Polintons has been lacking. We retrieved the Polinton consensus sequences from the most recent version of RepBase, and compiled amino acid sequences for the two most common Polinton-specific genes, the DNA polymerase-B and retroviral-like integrase. Open reading frame predictions and homology comparisons revealed partial or full sequences for 54 polymerases and 55 Polinton integrases. Multiple sequence alignments portrayed conservation in several functional motifs of these proteins. Phylogenetic analyses based on Bayesian inference using single- and combined-gene datasets revealed seven distinct lineages of Polintons that broadly follow the tree of life. Two of the seven lineages are found within the same species, indicating that ancient divergences have been retained to this day. PMID:24882428

  7. Diversity Measures in Environmental Sequences Are Highly Dependent on Alignment Quality—Data from ITS and New LSU Primers Targeting Basidiomycetes

    PubMed Central

    Fischer, Christiane; Daniel, Rolf; Wubet, Tesfaye

    2012-01-01

    The ribosomal DNA comprised of the ITS1-5.8S-ITS2 regions is widely used as a fungal marker in molecular ecology and systematics but cannot be aligned with confidence across genetically distant taxa. In order to study the diversity of Agaricomycotina in forest soils, we designed primers targeting the more alignable 28S (LSU) gene, which should be more useful for phylogenetic analyses of the detected taxa. This paper compares the performance of the established ITS1F/4B primer pair, which targets basidiomycetes, to that of two new pairs. Key factors in the comparison were the diversity covered, off-target amplification, rarefaction at different Operational Taxonomic Unit (OTU) cutoff levels, sensitivity of the method used to process the alignment to missing data and insecure positional homology, and the congruence of monophyletic clades with OTU assignments and BLAST-derived OTU names. The ITS primer pair yielded no off-target amplification but also exhibited the least fidelity to the expected phylogenetic groups. The LSU primers give complementary pictures of diversity, but were more sensitive to modifications of the alignment such as the removal of difficult-to-align stretches. The LSU primers also yielded greater numbers of singletons but also had a greater tendency to produce OTUs containing sequences from a wider variety of species as judged by BLAST similarity. We introduced some new parameters to describe alignment heterogeneity based on Shannon entropy and the extent and contents of the OTUs in a phylogenetic tree space. Our results suggest that ITS should not be used when calculating phylogenetic trees from genetically distant sequences obtained from environmental DNA extractions and that it is inadvisable to define OTUs on the basis of very heterogeneous alignments. PMID:22363808
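
    To make the entropy-based view of alignment heterogeneity concrete, here is a minimal Python sketch of per-column Shannon entropy for a nucleotide alignment. The paper's own parameters are not defined in the abstract, so this shows only the standard column-wise measure; the function names and the decision to ignore gaps, N and ? are assumptions.

```python
import math
from collections import Counter


def column_entropy(column: str, ignore: str = "-N?") -> float:
    """Shannon entropy (bits) of one alignment column.

    Assumption: gap and ambiguity symbols listed in `ignore` are dropped
    before counting, which suits nucleotide data only.
    """
    symbols = [c for c in column.upper() if c not in ignore]
    if not symbols:
        return 0.0
    total = len(symbols)
    counts = Counter(symbols)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def alignment_entropy(seqs: list[str]) -> list[float]:
    """Entropy of every column of an alignment given as equal-length rows."""
    if len({len(s) for s in seqs}) > 1:
        raise ValueError("aligned sequences must have equal length")
    return [column_entropy("".join(col)) for col in zip(*seqs)]
```

    A fully conserved column scores 0 bits and a column with all four bases in equal proportion scores 2 bits, so the mean or spread of these values gives one simple summary of how heterogeneous an alignment is.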

  8. Graphics processing unit-based alignment of protein interaction networks.

    PubMed

    Xie, Jiang; Zhou, Zhonghua; Ma, Jin; Xiang, Chaojuan; Nie, Qing; Zhang, Wu

    2015-08-01

    Network alignment is an important bridge to understanding human protein-protein interactions (PPIs) and functions through model organisms. However, the underlying subgraph isomorphism problem complicates and increases the time required to align protein interaction networks (PINs). Parallel computing technology is an effective solution to the challenge of aligning large-scale networks via sequential computing. In this study, the typical Hungarian-Greedy Algorithm (HGA) is used as an example for PIN alignment. The authors propose a HGA with 2-nearest neighbours (HGA-2N) and implement its graphics processing unit (GPU) acceleration. Numerical experiments demonstrate that HGA-2N can find alignments that are close to those found by HGA while dramatically reducing computing time. The GPU implementation of HGA-2N optimises the parallel pattern, computing mode and storage mode and it improves the computing time ratio between the CPU and GPU compared with HGA when large-scale networks are considered. By using HGA-2N in GPUs, conserved PPIs can be observed, and potential PPIs can be predicted. Among the predictions based on 25 common Gene Ontology terms, 42.8% can be found in the Human Protein Reference Database. Furthermore, a new method of reconstructing phylogenetic trees is introduced, which shows the same relationships among five herpes viruses that are obtained using other methods. PMID:26243827

  9. Using the Multiple Analysis Approach to Reconstruct Phylogenetic Relationships among Planktonic Foraminifera from Highly Divergent and Length-polymorphic SSU rDNA Sequences

    PubMed Central

    Aurahs, Ralf; Göker, Markus; Grimm, Guido W.; Hemleben, Vera; Hemleben, Christoph; Schiebel, Ralf; Kučera, Michal

    2009-01-01

    The high sequence divergence within the small subunit ribosomal RNA gene (SSU rDNA) of foraminifera makes it difficult to establish the homology of individual nucleotides across taxa. Alignment-based approaches so far relied on time-consuming manual alignments and discarded up to 50% of the sequenced nucleotides prior to phylogenetic inference. Here, we investigate the potential of the multiple analysis approach to infer a molecular phylogeny of all modern planktonic foraminiferal taxa by using a matrix of 146 new and 153 previously published SSU rDNA sequences. Our multiple analysis approach is based on eleven different automated alignments, analysed separately under the maximum likelihood criterion. The high degree of congruence between the phylogenies derived from our novel approach, traditional manually homologized culled alignments and the fossil record indicates that poorly resolved nucleotide homology does not represent the most significant obstacle when exploring the phylogenetic structure of the SSU rDNA in planktonic foraminifera. We show that approaches designed to extract phylogenetically valuable signals from complete sequences show more promise to resolve the backbone of the planktonic foraminifer tree than attempts to establish strictly homologous base calls in a manual alignment. PMID:20140067

  10. A Metric on the Space of Partly Reduced Phylogenetic Networks

    PubMed Central

    2016-01-01

    Phylogenetic networks are a generalization of phylogenetic trees that allow for the representation of evolutionary events acting at the population level, such as recombination between genes, hybridization between lineages, and horizontal gene transfer. The researchers have designed several measures for computing the dissimilarity between two phylogenetic networks, and each measure has been proven to be a metric on a special kind of phylogenetic networks. However, none of the existing measures is a metric on the space of partly reduced phylogenetic networks. In this paper, we provide a metric, de-distance, on the space of partly reduced phylogenetic networks, which is polynomial-time computable. PMID:27419137

  11. [Phylogenetic analysis of Pleurotus species].

    PubMed

    Shnyreva, A A; Shnyreva, A V

    2015-02-01

    We performed phylogenetic analysis for ten Pleurotus species, based on internal transcribed spacer (ITS) sequences of rDNA. A phylogenetic tree was constructed on the basis of 31 oyster fungi strains of different origin and 10 reference sequences from GenBank. Our analysis demonstrates that the tested Pleurotus species are of monophyletic origin. We evaluated the evolutionary distances between these species. Classic genetic analysis of sexual compatibility based on monocaryon (mon)-mon crosses showed no reproductive barriers within the P. cornucopiae-P. euosmus species complex. Thus, despite the divergence (subclustering) between commercial strains and natural isolates of P. ostreatus revealed by phylogenetic analysis, there is no reproductive isolation between these groups. A common allele of the matB locus was identified for the commercial strains Sommer and L/4, supporting the common origin of these strains. PMID:25966583

  12. Parsimony and model-based analyses of indels in avian nuclear genes reveal congruent and incongruent phylogenetic signals.

    PubMed

    Yuri, Tamaki; Kimball, Rebecca T; Harshman, John; Bowie, Rauri C K; Braun, Michael J; Chojnowski, Jena L; Han, Kin-Lan; Hackett, Shannon J; Huddleston, Christopher J; Moore, William S; Reddy, Sushma; Sheldon, Frederick H; Steadman, David W; Witt, Christopher C; Braun, Edward L

    2013-01-01

    Insertion/deletion (indel) mutations, which are represented by gaps in multiple sequence alignments, have been used to examine phylogenetic hypotheses for some time. However, most analyses combine gap data with the nucleotide sequences in which they are embedded, probably because most phylogenetic datasets include few gap characters. Here, we report analyses of 12,030 gap characters from an alignment of avian nuclear genes using maximum parsimony (MP) and a simple maximum likelihood (ML) framework. Both trees were similar, and they exhibited almost all of the strongly supported relationships in the nucleotide tree, although neither gap tree supported many relationships that have proven difficult to recover in previous studies. Moreover, independent lines of evidence typically corroborated the nucleotide topology instead of the gap topology when they disagreed, although the number of conflicting nodes with high bootstrap support was limited. Filtering to remove short indels did not substantially reduce homoplasy or reduce conflict. Combined analyses of nucleotides and gaps resulted in the nucleotide topology, but with increased support, suggesting that gap data may prove most useful when analyzed in combination with nucleotide substitutions. PMID:24832669

  13. The augmentation algorithm and molecular phylogenetic trees

    NASA Technical Reports Server (NTRS)

    Holmquist, R.

    1978-01-01

    Moore's (1977) augmentation procedure is discussed, and it is concluded that the procedure is valid for obtaining estimates of the total number of fixed nucleotide substitutions both theoretically and in practice, for both simulated and real data, and in agreement, for experimentally dense data sets, with stochastic estimates of the divergence, provided the restrictions on codon mutability resulting from natural selection are explicitly allowed for. Tateno and Nei's (1978) critique that the augmentation procedure has a systematic bias toward overestimation of the total number of nucleotide replacements is disputed, and a data analysis suggests that ancestral sequences inferred by the method of parsimony contain a large number of incorrectly assigned nucleotides.

  14. Genome trees constructed using five different approaches suggest new major bacterial clades

    PubMed Central

    Wolf, Yuri I; Rogozin, Igor B; Grishin, Nick V; Tatusov, Roman L; Koonin, Eugene V

    2001-01-01

    Background The availability of multiple complete genome sequences from diverse taxa prompts the development of new phylogenetic approaches, which attempt to incorporate information derived from comparative analysis of complete gene sets or large subsets thereof. Such attempts are particularly relevant because of the major role of horizontal gene transfer and lineage-specific gene loss, at least in the evolution of prokaryotes. Results Five largely independent approaches were employed to construct trees for completely sequenced bacterial and archaeal genomes: i) presence-absence of genomes in clusters of orthologous genes; ii) conservation of local gene order (gene pairs) among prokaryotic genomes; iii) parameters of identity distribution for probable orthologs; iv) analysis of concatenated alignments of ribosomal proteins; v) comparison of trees constructed for multiple protein families. All constructed trees support the separation of the two primary prokaryotic domains, bacteria and archaea, as well as some terminal bifurcations within the bacterial and archaeal domains. Beyond these obvious groupings, the trees made with different methods appeared to differ substantially in terms of the relative contributions of phylogenetic relationships and similarities in gene repertoires caused by similar life styles and horizontal gene transfer to the tree topology. The trees based on presence-absence of genomes in orthologous clusters and the trees based on conserved gene pairs appear to be strongly affected by gene loss and horizontal gene transfer. The trees based on identity distributions for orthologs and particularly the tree made of concatenated ribosomal protein sequences seemed to carry a stronger phylogenetic signal. The latter tree supported three potential high-level bacterial clades: i) Chlamydia-Spirochetes, ii) Thermotogales-Aquificales (bacterial hyperthermophiles), and iii) Actinomycetes-Deinococcales-Cyanobacteria. 
The latter group also appeared to join the

  15. Molecular systematics of terraranas (Anura: Brachycephaloidea) with an assessment of the effects of alignment and optimality criteria.

    PubMed

    Padial, José M; Grant, Taran; Frost, Darrel R

    2014-01-01

    Brachycephaloidea is a monophyletic group of frogs with more than 1000 species distributed throughout the New World tropics, subtropics, and Andean regions. Recently, the group has been the target of multiple molecular phylogenetic analyses, resulting in extensive changes in its taxonomy. Here, we test previous hypotheses of phylogenetic relationships for the group by combining available molecular evidence (sequences of 22 genes representing 431 ingroup and 25 outgroup terminals) and performing a tree-alignment analysis under the parsimony optimality criterion using the program POY. To elucidate the effects of alignment and optimality criterion on phylogenetic inferences, we also used the program MAFFT to obtain a similarity-alignment for analysis under both parsimony and maximum likelihood using the programs TNT and GARLI, respectively. Although all three analytical approaches agreed on numerous points, there was also extensive disagreement. Tree-alignment under parsimony supported the monophyly of the ingroup and the sister group relationship of the monophyletic marsupial frogs (Hemiphractidae), while maximum likelihood and parsimony analyses of the MAFFT similarity-alignment did not. All three methods differed with respect to the position of Ceuthomantis smaragdinus (Ceuthomantidae), with tree-alignment using parsimony recovering this species as the sister of Pristimantis + Yunganastes. All analyses rejected the monophyly of Strabomantidae and Strabomantinae as originally defined, and the tree-alignment analysis under parsimony further rejected the recently redefined Craugastoridae and Pristimantinae. Despite the greater emphasis in the systematics literature placed on the choice of optimality criterion for evaluating trees than on the choice of method for aligning DNA sequences, we found that the topological differences attributable to the alignment method were as great as those caused by the optimality criterion. 
Further, the optimal tree-alignment indicates

  16. An alternative construction of internodons: the emergence of a multi-level tree of life.

    PubMed

    Alexander, Samuel A; de Bruin, Arie; Kornet, D J

    2015-01-01

    Internodons are a formalization of Hennig's concept of species. We present an alternative construction of internodons imposing a tree structure on the genealogical network. We prove that the segments (trivial unary trees) from this tree structure are precisely the internodons. We obtain the following spin-offs. First, the generated tree turns out to be an organismal tree of life. Second, this organismal tree is homeomorphic to the phylogenetic Hennigian species tree of life, implying the discovery of a multi-level tree of life: this phylogenetic tree can be obtained by zooming out from the organismal tree, or conversely, the organismal tree of life can be generated by expanding the phylogenetic nodes into unary trees. Finally, the definition of the organismal tree allows an efficient algorithmic transformation of a given genealogical network into its corresponding phylogenetic species tree of life. The latter will be presented in a separate paper. PMID:25515028

  17. [Foundations of the new phylogenetics].

    PubMed

    Pavlinov, I Ia

    2004-01-01

    phylistics (Rasnitsyn's term; close to Simpsonian evolutionary taxonomy) belonging rather to the classical realm, and Hennigian cladistics that attends exclusively to the origin of monophyletic taxa. In the early 20th century, the microevolutionary doctrine became predominant in evolutionary studies. Its core is population thinking, accompanied by phenetic thinking based on equating kinship with overall similarity. They were connected to positivist philosophy and hence were characterized by reductionism at both ontological and epistemological levels. This led to the fall of classical phylogenetics but created the prerequisites for the new phylogenetics, which also appeared to be full of reductionism. The new rise of phylogenetic (rather than tree) thinking during the last third of the 20th century was caused by the loss of explanatory power of population thinking and by the development of a new worldview and new epistemological premises. That new worldview is based on the synergetic (Prigoginian) model of development of non-equilibrium systems: evolution of the biota, a part of which is phylogeny, is considered as such a development. At the epistemological level, the principal premise was the fall of positivism, which was replaced by post-positivist argumentation schemes. The contribution of cladistics to the new phylogenetics is twofold. On the one hand, it reduced phylogeny to cladistic history lacking any adaptivist interpretation and presuming a minimal evolution model. From this followed the reduction of the kinship relation to the sister-group relation, lacking any reference to a real time scale or to the ancestor-descendant relation. On the other hand, cladistics elaborated a methodology of phylogenetic reconstruction based on the synapomorphy principle, of which the outgroup concept became a part. Both contributions served as premises for the incorporation of both numerical techniques and molecular data into phylogenetic reconstruction. 
Numerical phyletics provided the new phylogenetics with easily manipulated algorithms

  18. Phylogenetic-based propagation of functional annotations within the Gene Ontology consortium.

    PubMed

    Gaudet, Pascale; Livstone, Michael S; Lewis, Suzanna E; Thomas, Paul D

    2011-09-01

    The goal of the Gene Ontology (GO) project is to provide a uniform way to describe the functions of gene products from organisms across all kingdoms of life and thereby enable analysis of genomic data. Protein annotations are either based on experiments or predicted from protein sequences. Since most sequences have not been experimentally characterized, most available annotations need to be based on predictions. To make as accurate inferences as possible, the GO Consortium's Reference Genome Project is using an explicit evolutionary framework to infer annotations of proteins from a broad set of genomes from experimental annotations in a semi-automated manner. Most components in the pipeline, such as selection of sequences, building multiple sequence alignments and phylogenetic trees, retrieving experimental annotations and depositing inferred annotations, are fully automated. However, the most crucial step in our pipeline relies on software-assisted curation by an expert biologist. This curation tool, Phylogenetic Annotation and INference Tool (PAINT) helps curators to infer annotations among members of a protein family. PAINT allows curators to make precise assertions as to when functions were gained and lost during evolution and record the evidence (e.g. experimentally supported GO annotations and phylogenetic information including orthology) for those assertions. In this article, we describe how we use PAINT to infer protein function in a phylogenetic context with emphasis on its strengths, limitations and guidelines. We also discuss specific examples showing how PAINT annotations compare with those generated by other highly used homology-based methods. PMID:21873635

  19. Phylogenetic Analysis of Selected Menthol-Producing Species Belonging to the Lamiaceae Family.

    PubMed

    Mirzaei, Motahareh; Mirzaei, Hamed; Sahebkar, Amirhossein; Bagherian, Ali; Masoud Khoi, Mohammad Jaber; Reza Mirzaei, Hamid; Salehi, Rasoul; Reza Jaafari, Mahmoud; Kazemi Oskuee, Reza

    2015-01-01

    Menthol is an organic compound with diverse medicinal and commercial applications, and is made either synthetically or through extraction from mint oils. The aim of the present study was to investigate menthol levels in selected menthol-producing species belonging to the Lamiaceae family, and to determine phylogenetic relationships of the menthol dehydrogenase gene sequence among these species. Three genera of Lamiaceae, namely Mentha, Salvia, and Micromeria, were selected for phytochemical and phylogenetic analyses. After identification of each species based on the menthol dehydrogenase gene in NCBI, BLAST software was used for the sequence alignment. MEGA4 software was used to draw a phylogenetic tree for the various species. Phytochemical analysis revealed that the highest and lowest amounts of both essential oil and menthol belonged to Mentha spicata and Micromeria hyssopifolia, respectively. The species Mentha spicata and Mentha piperita, which were assigned to one cluster in the dendrogram, contained the highest amounts of essential oil and menthol, while the Micromeria species, which fell in a distinct cluster at a greater evolutionary distance, contained the lowest amounts of essential oil and menthol. Phylogenetic and phytochemical analyses showed that the essential oil and menthol contents of menthol-producing species are associated with the menthol dehydrogenase gene sequence. PMID:26252633

  20. Phylogenetic analysis to uncover organellar origins of nuclear-encoded genes.

    PubMed

    Foth, Bernardo J

    2007-01-01

    Most proteins that are located in mitochondria or plastids are encoded by the nuclear genome, because the organellar genomes have undergone severe reduction during evolution. In many cases, although not all, the nuclear genes encoding organelle-targeted proteins actually originated from the respective organellar genome and thus carry the phylogenetic fingerprint that still bespeaks their evolutionary origin. Phylogenetic analysis is a powerful in silico method that can yield important insights into the evolutionary history or molecular kinship of any gene or protein and that can thus also be used more specifically in the context of organellar targeting as one means to recognize protein candidates (e.g., from genome data) that may be targeted to mitochondria or plastids. This chapter provides protocols for creating multiple sequence alignments and carrying out phylogenetic analysis with the robust and comprehensive software packages Clustal and PHYLIP, which are both available free of charge for multiple computer platforms. Besides presenting step-by-step instructions on how to run these computer programs, this chapter also covers topics such as data collection and presentation of phylogenetic trees. PMID:17951706

  1. Phylogenetic relationships among Agamid lizards of the Laudakia caucasia species group: testing hypotheses of biogeographic fragmentation and an area cladogram for the Iranian Plateau.

    PubMed

    Macey, J R; Schulte, J A; Ananjeva, N B; Larson, A; Rastegar-Pouyani, N; Shammakov, S M; Papenfuss, T J

    1998-08-01

    Phylogenetic relationships within the Laudakia caucasia species group on the Iranian Plateau were investigated using 1708 aligned bases of mitochondrial DNA sequence from the genes encoding ND1 (subunit one of NADH dehydrogenase), tRNAGln, tRNAIle, tRNAMet, ND2, tRNATrp, tRNAAla, tRNAAsn, tRNACys, tRNATyr, and COI (subunit I of cytochrome c oxidase). The aligned sequences contain 207 phylogenetically informative characters. Three hypotheses for historical fragmentation of Laudakia populations on the Iranian Plateau were tested. In two hypotheses, fragmentation of populations is suggested to have proceeded along continuous mountain belts that surround the Iranian Plateau. In another hypothesis, fragmentation is suggested to have resulted from a north-south split caused by uplifting of the Zagros Mountains in the late Miocene or early Pliocene [5-10 MYBP (million years before present)]. The shortest tree supports the latter hypothesis, and statistical tests reject the other two hypotheses. The phylogenetic tree is exceptional in that every branch is well supported. Geologic history provides dates for most branches of the tree. A plot of DNA substitutions against dates from geologic history refines the date for the north-south split across the Iranian Plateau to 9 MYBP (late Miocene). The rate of evolution for this segment of mtDNA is 0.65% (0.61-0.70%) change per lineage per million years. A hypothesis of area relationships for the biota of the Iranian Plateau is generated from the phylogenetic tree. PMID:9751922

  2. A phylogenetic analysis of Aquifex pyrophilus

    NASA Technical Reports Server (NTRS)

    Burggraf, S.; Olsen, G. J.; Stetter, K. O.; Woese, C. R.

    1992-01-01

    The 16S rRNA of the bacterium Aquifex pyrophilus, a microaerophilic, oxygen-reducing hyperthermophile, has been sequenced directly from the PCR-amplified gene. Phylogenetic analyses show the Aq. pyrophilus lineage to be probably the deepest (earliest) in the (eu)bacterial tree. The addition of this deep branching to the bacterial tree further supports the argument that the Bacteria are of thermophilic ancestry.

  3. Phylogenetic relationships within cation transporter families of Arabidopsis.

    PubMed

    Mäser, P; Thomine, S; Schroeder, J I; Ward, J M; Hirschi, K; Sze, H; Talke, I N; Amtmann, A; Maathuis, F J; Sanders, D; Harper, J F; Tchieu, J; Gribskov, M; Persans, M W; Salt, D E; Kim, S A; Guerinot, M L

    2001-08-01

    Uptake and translocation of cationic nutrients play essential roles in physiological processes including plant growth, nutrition, signal transduction, and development. Approximately 5% of the Arabidopsis genome appears to encode membrane transport proteins. These proteins are classified in 46 unique families containing approximately 880 members. In addition, several hundred putative transporters have not yet been assigned to families. In this paper, we have analyzed the phylogenetic relationships of over 150 cation transport proteins. This analysis has focused on cation transporter gene families for which initial characterizations have been achieved for individual members, including potassium transporters and channels, sodium transporters, calcium antiporters, cyclic nucleotide-gated channels, cation diffusion facilitator proteins, natural resistance-associated macrophage proteins (NRAMP), and Zn-regulated transporter Fe-regulated transporter-like proteins. Phylogenetic trees of each family define the evolutionary relationships of the members to each other. These families contain numerous members, indicating diverse functions in vivo. Closely related isoforms and separate subfamilies exist within many of these gene families, indicating possible redundancies and specialized functions. To facilitate their further study, the PlantsT database (http://plantst.sdsc.edu) has been created that includes alignments of the analyzed cation transporters and their chromosomal locations. PMID:11500563

  4. Do tree split probabilities determine the branch lengths?

    PubMed

    Chor, Benny; Steel, Mike

    2015-06-01

    The evolution of aligned DNA sequence sites is generally modeled by a Markov process operating along the edges of a phylogenetic tree. It is well known that the probability distribution on the site patterns at the tips of the tree determines the tree topology, and its branch lengths. However, the number of patterns is typically much larger than the number of edges, suggesting considerable redundancy in the branch length estimation. In this paper we ask whether the probabilities of just the 'edge-specific' patterns (the ones that correspond to a change of state on a single edge) suffice to recover the branch lengths of the tree, under a symmetric 2-state Markov process. We first show that this holds provided the branch lengths are sufficiently short, by applying the inverse function theorem. We then consider whether this restriction to short branch lengths is necessary. We show that for trees with up to four leaves it can be lifted. This leaves open the interesting question of whether this holds in general. Our results also extend to certain Markov processes on more than 2-states, such as the Jukes-Cantor model. PMID:25843219
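
    The single-edge relationship behind this question can be sketched numerically. The block below assumes the standard parametrization of the symmetric 2-state model, in which an edge of length t (expected substitutions per site) has change probability p = (1 - exp(-2t)) / 2; the function names are illustrative and do not come from the paper. It covers only one isolated edge, not the recovery of branch lengths from edge-specific pattern probabilities on a whole tree.

```python
import math


def change_probability(t: float) -> float:
    """Probability that the two ends of an edge of length t differ
    under the symmetric 2-state model (assumed parametrization)."""
    return 0.5 * (1.0 - math.exp(-2.0 * t))


def branch_length(p: float) -> float:
    """Invert change_probability; only defined for 0 <= p < 0.5."""
    if not 0.0 <= p < 0.5:
        raise ValueError("p must lie in [0, 0.5)")
    return -0.5 * math.log(1.0 - 2.0 * p)


if __name__ == "__main__":
    p = change_probability(0.3)
    print(p, branch_length(p))  # round trip recovers t = 0.3 up to rounding
```

    As t grows, p approaches 0.5 and the inverse becomes numerically unstable, which is one informal way to see why long branches are harder to estimate.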

  5. Accurate reconstruction of insertion-deletion histories by statistical phylogenetics.

    PubMed

    Westesson, Oscar; Lunter, Gerton; Paten, Benedict; Holmes, Ian

    2012-01-01

    The Multiple Sequence Alignment (MSA) is a computational abstraction that represents a partial summary either of indel history, or of structural similarity. Taking the former view (indel history), it is possible to use formal automata theory to generalize the phylogenetic likelihood framework for finite substitution models (Dayhoff's probability matrices and Felsenstein's pruning algorithm) to arbitrary-length sequences. In this paper, we report results of a simulation-based benchmark of several methods for reconstruction of indel history. The methods tested include a relatively new algorithm for statistical marginalization of MSAs that sums over a stochastically-sampled ensemble of the most probable evolutionary histories. For mammalian evolutionary parameters on several different trees, the single most likely history sampled by our algorithm appears less biased than histories reconstructed by other MSA methods. The algorithm can also be used for alignment-free inference, where the MSA is explicitly summed out of the analysis. As an illustration of our method, we discuss reconstruction of the evolutionary histories of human protein-coding genes. PMID:22536326

  6. Correlated mutations in protein sequences: Phylogenetic and structural effects

    SciTech Connect

    Lapedes, A.S.; Giraud, B.G.; Stormo, G.D.

    1998-12-01

    Covariation analysis of sets of aligned sequences for RNA molecules is relatively successful in elucidating RNA secondary structure, as well as some aspects of tertiary structure. Covariation analysis of sets of aligned sequences for protein molecules is successful in certain instances in elucidating certain structural and functional links, but in general, pairs of sites displaying highly covarying mutations in protein sequences do not necessarily correspond to sites that are spatially close in the protein structure. In this paper the authors identify two reasons why naive use of covariation analysis for protein sequences fails to reliably indicate sequence positions that are spatially proximate. The first reason involves the bias introduced in calculation of covariation measures due to the fact that biological sequences are generally related by a non-trivial phylogenetic tree. The authors present a null-model approach to solve this problem. The second reason involves linked chains of covariation which can result in pairs of sites displaying significant covariation even though they are not spatially proximate. They present a maximum entropy solution to this classic problem of causation versus correlation. The methodologies are validated in simulation.
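
    For orientation, a naive covariation score of the kind the abstract cautions against can be sketched as follows. The abstract does not name a specific measure; mutual information between two alignment columns is assumed here as a common choice, and the function name is illustrative. This is not the authors' null-model or maximum entropy method, and it makes no correction for phylogeny.

```python
import math
from collections import Counter


def mutual_information(col_i: str, col_j: str) -> float:
    """Mutual information (in bits) between two alignment columns,
    each given as a string with one residue per sequence."""
    n = len(col_i)
    if n == 0 or n != len(col_j):
        raise ValueError("columns must be non-empty and of equal length")
    count_i = Counter(col_i)
    count_j = Counter(col_j)
    count_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (a, b), c in count_ij.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (count_i[a] * count_j[b]))
    return mi


if __name__ == "__main__":
    print(mutual_information("AAGG", "CCTT"))  # perfectly covarying columns
    print(mutual_information("AAGG", "CTCT"))  # independent columns
```

    Two columns can score highly here simply because the sequences share ancestry, which is the first source of bias the abstract describes.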

  7. Phylogenetic and microscopic studies in the genus Lactifluus (Basidiomycota, Russulales) in West Africa, including the description of four new species.

    PubMed

    Maba, Dao Lamèga; Guelly, Atsu K; Yorou, Nourou S; Verbeken, Annemieke; Agerer, Reinhard

    2015-06-01

    Despite the crucial ecological role of lactarioid taxa (Lactifluus, Lactarius) as common ectomycorrhiza formers in tropical African seasonal forests, their current diversity is not yet adequately assessed. During the last few years, numerous lactarioid specimens have been sampled in various ecosystems from Togo (West Africa). We generated 48 ITS sequences and aligned them against lactarioid taxa from other tropical African ecozones (Guineo-Congolean evergreen forests, Zambezian miombo). A Maximum Likelihood phylogenetic tree was inferred from a dataset of 109 sequences. The phylogenetic placement of the specimens, combined with morpho-anatomical data, supported the description of four new species from Togo within the monophyletic genus Lactifluus: within subgen. Lactifluus (L. flavellus), subgen. Russulopsis (L. longibasidius and L. pectinatus), and subgen. Edules (L. melleus). This demonstrates that the current species richness of the genus is considerably higher than hitherto estimated for African species and indicates a need to redefine the subgenera and sections within it. PMID:26203413

  8. Morphological and molecular convergences in mammalian phylogenetics.

    PubMed

    Zou, Zhengting; Zhang, Jianzhi

    2016-01-01

    Phylogenetic trees reconstructed from molecular sequences are often considered more reliable than those reconstructed from morphological characters, in part because convergent evolution, which confounds phylogenetic reconstruction, is believed to be rarer for molecular sequences than for morphologies. However, neither the validity of this belief nor its underlying cause is known. Here, comparing thousands of characters of each type that have been used for inferring the phylogeny of mammals, we find that on average morphological characters indeed experience many more convergences than amino acid sites, but this disparity is explained by fewer states per character rather than an intrinsically higher susceptibility to convergence for morphologies than sequences. We show by computer simulation and actual data analysis that a simple method for identifying and removing convergence-prone characters improves phylogenetic accuracy, potentially enabling, when necessary, the inclusion of morphologies and hence fossils for reliable tree inference. PMID:27585543

  9. Phylogenetic informativeness reconciles ray-finned fish molecular divergence times

    PubMed Central

    2014-01-01

    Background Discordance among individual molecular age estimates, or between molecular age estimates and the fossil record, is observed in many clades across the Tree of Life. This discordance is attributed to a variety of variables including calibration age uncertainty, calibration placement, nucleotide substitution rate heterogeneity, or the specified molecular clock model. However, the impact of changes in phylogenetic informativeness of individual genes over time on phylogenetic inferences is rarely analyzed. Using nuclear and mitochondrial sequence data for ray-finned fishes (Actinopterygii) as an example, we extend the utility of phylogenetic informativeness profiles to predict the time intervals when nucleotide substitution saturation results in discordance among molecular ages estimated. Results We demonstrate that even with identical calibration regimes and molecular clock methods, mitochondrial based molecular age estimates are systematically older than those estimated from nuclear sequences. This discordance is most severe for highly nested nodes corresponding to more recent (i.e., Jurassic-Recent) divergences. By removing data deemed saturated, we reconcile the competing age estimates and highlight that the older mtDNA based ages were driven by nucleotide saturation. Conclusions Homoplasious site patterns in a DNA sequence alignment can systematically bias molecular divergence time estimates. Our study demonstrates that PI profiles can provide a non-arbitrary criterion for data exclusion to mitigate the influence of homoplasy on time calibrated branch length estimates. Analyses of actinopterygian molecular clocks demonstrate that scrutiny of the time scale on which sequence data is informative is a fundamental, but generally overlooked, step in molecular divergence time estimation. PMID:25103329

  10. QueTAL: a suite of tools to classify and compare TAL effectors functionally and phylogenetically

    PubMed Central

    Pérez-Quintero, Alvaro L.; Lamy, Léo; Gordon, Jonathan L.; Escalon, Aline; Cunnac, Sébastien; Szurek, Boris; Gagnevin, Lionel

    2015-01-01

    Transcription Activator-Like (TAL) effectors from Xanthomonas plant pathogenic bacteria can bind to the promoter region of plant genes and induce their expression. DNA-binding specificity is governed by a central domain made of nearly identical repeats, each determining the recognition of one base pair via two amino acid residues (a.k.a. Repeat Variable Di-residue, or RVD). Knowing how TAL effectors differ from each other within and between strains would be useful to infer functional and evolutionary relationships, but their repetitive nature precludes reliable use of traditional alignment methods. The suite QueTAL was therefore developed to offer tailored tools for comparison of TAL effector genes. The program DisTAL considers each repeat as a unit, transforms a TAL effector sequence into a sequence of coded repeats and makes pair-wise alignments between these coded sequences to construct trees. The program FuncTAL is aimed at finding TAL effectors with similar DNA-binding capabilities. It calculates correlations between position weight matrices of potential target DNA sequence predicted from the RVD sequence, and builds trees based on these correlations. The programs accurately represented phylogenetic and functional relationships between TAL effectors using either simulated or literature-curated data. When using the programs on a large set of TAL effector sequences, the DisTAL tree largely reflected the expected species phylogeny. In contrast, FuncTAL showed that TAL effectors with similar binding capabilities can be found between phylogenetically distant taxa. This suite will help users to rapidly analyse any TAL effector genes of interest and compare them to other available TAL genes and should improve our understanding of TAL effectors evolution. It is available at http://bioinfo-web.mpl.ird.fr/cgi-bin2/quetal/quetal.cgi. PMID:26284082
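
    The idea of treating each repeat as a unit can be illustrated with a small sketch. It assumes each effector is already split into a list of repeat strings, assigns an integer code to each distinct repeat, and uses a unit-cost edit distance as a stand-in for the pair-wise alignment step. The function names, the toy repeat labels, and the scoring are all hypothetical and are not taken from DisTAL.

```python
from typing import Dict, List, Sequence


def encode_repeats(effectors: Sequence[Sequence[str]]) -> List[List[int]]:
    """Replace every distinct repeat string with an integer code,
    assigned in order of first appearance across all effectors."""
    codes: Dict[str, int] = {}
    coded = []
    for repeats in effectors:
        coded.append([codes.setdefault(r, len(codes)) for r in repeats])
    return coded


def edit_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Unit-cost edit distance between two coded repeat sequences
    (a simple placeholder for a scored pair-wise alignment)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # drop a repeat from a
                           cur[j - 1] + 1,           # insert a repeat from b
                           prev[j - 1] + (x != y)))  # match or substitute
        prev = cur
    return prev[-1]


if __name__ == "__main__":
    # Toy placeholder labels; real input would be full repeat sequences.
    coded = encode_repeats([["r1", "r2", "r3"], ["r1", "r3"]])
    print(coded, edit_distance(coded[0], coded[1]))
```

    A matrix of such pairwise distances could then be passed to any distance-based tree-building method.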

  11. Phylogenetic Analysis of Poliovirus Sequences.

    PubMed

    Jorba, Jaume

    2016-01-01

    Comparative genomic sequencing is a major surveillance tool in the Polio Laboratory Network. Due to the rapid evolution of polioviruses (~1% per year), pathways of virus transmission can be reconstructed from the pathways of genomic evolution. Here, we describe three main phylogenetic methods: estimation of genetic distances, reconstruction of a maximum-likelihood (ML) tree, and estimation of substitution rates using Bayesian Markov chain Monte Carlo (MCMC). The data set used consists of complete capsid sequences from a survey of poliovirus sequences available in GenBank. PMID:26983737
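
    The first of these steps, estimating a genetic distance, can be sketched as follows. The abstract does not say which distance is used; the block assumes the uncorrected p-distance and the Jukes-Cantor (JC69) correction d = -(3/4) ln(1 - 4p/3) as a textbook illustration, with hypothetical function names.

```python
import math


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences,
    ignoring sites where either sequence is not A, C, G or T."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x in "ACGT" and y in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def jc69_distance(p: float) -> float:
    """Jukes-Cantor corrected distance; undefined for p >= 0.75."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75)")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


if __name__ == "__main__":
    p = p_distance("ACGTACGT", "ACGTACGA")
    print(p, jc69_distance(p))
```

    The corrected distance always exceeds the raw proportion for p > 0, because it accounts for multiple substitutions at the same site.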

  12. A Consistent Phylogenetic Backbone for the Fungi

    PubMed Central

    Ebersberger, Ingo; de Matos Simoes, Ricardo; Kupczok, Anne; Gube, Matthias; Kothe, Erika; Voigt, Kerstin; von Haeseler, Arndt

    2012-01-01

    The kingdom of fungi provides model organisms for biotechnology, cell biology, genetics, and life sciences in general. Only when their phylogenetic relationships are stably resolved, can individual results from fungal research be integrated into a holistic picture of biology. However, and despite recent progress, many deep relationships within the fungi remain unclear. Here, we present the first phylogenomic study of an entire eukaryotic kingdom that uses a consistency criterion to strengthen phylogenetic conclusions. We reason that branches (splits) recovered with independent data and different tree reconstruction methods are likely to reflect true evolutionary relationships. Two complementary phylogenomic data sets based on 99 fungal genomes and 109 fungal expressed sequence tag (EST) sets analyzed with four different tree reconstruction methods shed light from different angles on the fungal tree of life. Eleven additional data sets address specifically the phylogenetic position of Blastocladiomycota, Ustilaginomycotina, and Dothideomycetes, respectively. The combined evidence from the resulting trees supports the deep-level stability of the fungal groups toward a comprehensive natural system of the fungi. In addition, our analysis reveals methodologically interesting aspects. Enrichment for EST encoded data—a common practice in phylogenomic analyses—introduces a strong bias toward slowly evolving and functionally correlated genes. Consequently, the generalization of phylogenomic data sets as collections of randomly selected genes cannot be taken for granted. A thorough characterization of the data to assess possible influences on the tree reconstruction should therefore become a standard in phylogenomic analyses. PMID:22114356

  13. A consistent phylogenetic backbone for the fungi.

    PubMed

    Ebersberger, Ingo; de Matos Simoes, Ricardo; Kupczok, Anne; Gube, Matthias; Kothe, Erika; Voigt, Kerstin; von Haeseler, Arndt

    2012-05-01

    The kingdom of fungi provides model organisms for biotechnology, cell biology, genetics, and life sciences in general. Only when their phylogenetic relationships are stably resolved, can individual results from fungal research be integrated into a holistic picture of biology. However, and despite recent progress, many deep relationships within the fungi remain unclear. Here, we present the first phylogenomic study of an entire eukaryotic kingdom that uses a consistency criterion to strengthen phylogenetic conclusions. We reason that branches (splits) recovered with independent data and different tree reconstruction methods are likely to reflect true evolutionary relationships. Two complementary phylogenomic data sets based on 99 fungal genomes and 109 fungal expressed sequence tag (EST) sets analyzed with four different tree reconstruction methods shed light from different angles on the fungal tree of life. Eleven additional data sets address specifically the phylogenetic position of Blastocladiomycota, Ustilaginomycotina, and Dothideomycetes, respectively. The combined evidence from the resulting trees supports the deep-level stability of the fungal groups toward a comprehensive natural system of the fungi. In addition, our analysis reveals methodologically interesting aspects. Enrichment for EST encoded data, a common practice in phylogenomic analyses, introduces a strong bias toward slowly evolving and functionally correlated genes. Consequently, the generalization of phylogenomic data sets as collections of randomly selected genes cannot be taken for granted. A thorough characterization of the data to assess possible influences on the tree reconstruction should therefore become a standard in phylogenomic analyses. PMID:22114356

  14. Probabilistic Graphical Model Representation in Phylogenetics

    PubMed Central

    Höhna, Sebastian; Heath, Tracy A.; Boussau, Bastien; Landis, Michael J.; Ronquist, Fredrik; Huelsenbeck, John P.

    2014-01-01

    Recent years have seen a rapid expansion of the model space explored in statistical phylogenetics, emphasizing the need for new approaches to statistical model representation and software development. Clear communication and representation of the chosen model is crucial for: (i) reproducibility of an analysis, (ii) model development, and (iii) software design. Moreover, a unified, clear and understandable framework for model representation lowers the barrier for beginners and nonspecialists to grasp complex phylogenetic models, including their assumptions and parameter/variable dependencies. Graphical modeling is a unifying framework that has gained in popularity in the statistical literature in recent years. The core idea is to break complex models into conditionally independent distributions. The strength lies in the comprehensibility, flexibility, and adaptability of this formalism, and the large body of computational work based on it. Graphical models are well-suited to teach statistical models, to facilitate communication among phylogeneticists and in the development of generic software for simulation and statistical inference. Here, we provide an introduction to graphical models for phylogeneticists and extend the standard graphical model representation to the realm of phylogenetics. We introduce a new graphical model component, tree plates, to capture the changing structure of the subgraph corresponding to a phylogenetic tree. We describe a range of phylogenetic models using the graphical model framework and introduce modules to simplify the representation of standard components in large and complex models. Phylogenetic model graphs can be readily used in simulation, maximum likelihood inference, and Bayesian inference using, for example, Metropolis–Hastings or Gibbs sampling of the posterior distribution. [Computation; graphical models; inference; modularization; statistical phylogenetics; tree plate.] PMID:24951559

  15. Phylogenetics and the origin of species

    PubMed Central

    Avise, John C.; Wollenberg, Kurt

    1997-01-01

    A recent criticism that the biological species concept (BSC) unduly neglects phylogeny is examined under a novel modification of coalescent theory that considers multiple, sex-defined genealogical pathways through sexual organismal pedigrees. A competing phylogenetic species concept (PSC) also is evaluated from this vantage. Two analytical approaches are employed to capture the composite phylogenetic information contained within the braided assemblages of hereditary pathways of a pedigree: (i) consensus phylogenetic trees across allelic transmission routes and (ii) composite phenograms from quantitative values of organismal coancestry. Outcomes from both approaches demonstrate that the supposed sharp distinction between biological and phylogenetic species concepts is illusory. Historical descent and reproductive ties are related aspects of phylogeny and jointly illuminate biotic discontinuity. PMID:9223259

  16. Phylogenetic identification of lateral genetic transfer events

    PubMed Central

    Beiko, Robert G; Hamilton, Nicholas

    2006-01-01

    Background: Lateral genetic transfer can lead to disagreements among phylogenetic trees comprising sequences from the same set of taxa. Where topological discordance is thought to have arisen through genetic transfer events, tree comparisons can be used to identify the lineages that may have shared genetic information. An 'edit path' of one or more transfer events can be represented with a series of subtree prune and regraft (SPR) operations, but finding the optimal such set of operations is NP-hard for comparisons between rooted trees, and may be so for unrooted trees as well. Results: Efficient Evaluation of Edit Paths (EEEP) is a new tree comparison algorithm that uses evolutionarily reasonable constraints to identify and eliminate many unproductive search avenues, reducing the time required to solve many edit path problems. The performance of EEEP compares favourably to that of other algorithms when applied to strictly bifurcating trees with specified numbers of SPR operations. We also used EEEP to recover edit paths from over 19 000 unrooted, incompletely resolved protein trees containing up to 144 taxa as part of a large phylogenomic study. While inferred protein trees were far more similar to a reference supertree than random trees were to each other, the phylogenetic distance spanned by random versus inferred transfer events was similar, suggesting that real transfer events occur most frequently between closely related organisms, but can span large phylogenetic distances as well. While most of the protein trees examined here were very similar to the reference supertree, requiring zero or one edit operations for reconciliation, some trees implied up to 40 transfer events within a single orthologous set of proteins. Conclusion: Since sequence trees typically have no implied root and may contain unresolved or multifurcating nodes, the strategy implemented in EEEP is the most appropriate for phylogenomic analyses. The high degree of consistency among inferred

  17. Understanding phylogenetic incongruence: lessons from phyllostomid bats

    PubMed Central

    Dávalos, Liliana M; Cirranello, Andrea L; Geisler, Jonathan H; Simmons, Nancy B

    2012-01-01

    All characters and trait systems in an organism share a common evolutionary history that can be estimated using phylogenetic methods. However, differential rates of change and the evolutionary mechanisms driving those rates result in pervasive phylogenetic conflict. These drivers need to be uncovered because mismatches between evolutionary processes and phylogenetic models can lead to high confidence in incorrect hypotheses. Incongruence between phylogenies derived from morphological versus molecular analyses, and between trees based on different subsets of molecular sequences has become pervasive as datasets have expanded rapidly in both characters and species. For more than a decade, evolutionary relationships among members of the New World bat family Phyllostomidae inferred from morphological and molecular data have been in conflict. Here, we develop and apply methods to minimize systematic biases, uncover the biological mechanisms underlying phylogenetic conflict, and outline data requirements for future phylogenomic and morphological data collection. We introduce new morphological data for phyllostomids and outgroups and expand previous molecular analyses to eliminate methodological sources of phylogenetic conflict such as taxonomic sampling, sparse character sampling, or use of different algorithms to estimate the phylogeny. We also evaluate the impact of biological sources of conflict: saturation in morphological changes and molecular substitutions, and other processes that result in incongruent trees, including convergent morphological and molecular evolution. Methodological sources of incongruence play some role in generating phylogenetic conflict, and are relatively easy to eliminate by matching taxa, collecting more characters, and applying the same algorithms to optimize phylogeny. The evolutionary patterns uncovered are consistent with multiple biological sources of conflict, including saturation in morphological and molecular changes, adaptive

  18. Alignment validation

    SciTech Connect

    ALICE; ATLAS; CMS; LHCb; Golling, Tobias

    2008-09-06

    The four experiments, ALICE, ATLAS, CMS and LHCb, are currently under construction at CERN. They will study the products of proton-proton collisions at the Large Hadron Collider. All experiments are equipped with sophisticated tracking systems, unprecedented in size and complexity. Full exploitation of both the inner detector and the muon system requires an accurate alignment of all detector elements. Alignment information is deduced from dedicated hardware alignment systems and the reconstruction of charged particles. However, the system is degenerate, which means the data is insufficient to constrain all alignment degrees of freedom, so the techniques are prone to converging on wrong geometries. This deficiency necessitates validation and monitoring of the alignment. An exhaustive discussion of means of validation is the subject of this document, including examples and plans from all four LHC experiments, as well as other high energy experiments.

  19. Molecular characterization and phylogenetic analysis of deformed wing viruses isolated from South Korea.

    PubMed

    Reddy, Kondreddy Eswar; Noh, Jin Hyeong; Yoo, Mi-Sun; Kim, Young-Ha; Kim, Nam-Hee; Doan, Huong Thi Thanh; Ramya, Mummadireddy; Jung, Suk-Chan; Van Quyen, Dong; Kang, Seung-Won

    2013-12-27

    Deformed wing virus (DWV) is one of the most common viral infections in honeybees. Phylogenetic trees were constructed for 16 partial nucleotide sequences of the structural polyprotein region and the RNA helicase region of South Korean DWVs. The sequences were compared with 10 previously reported DWV sequences from different countries and the sequences of two closely related viruses, Kakugo virus (KGV) and Varroa destructor virus-1 (VDV-1). In the phylogeny based on these two regions, the Korean DWV genomes were highly conserved with 95-100% identity, while they also shared 93-97% similarity with genotypes from other countries, although they formed a separate cluster. To investigate this phenomenon in more detail, the complete DWV genome sequences of Korea-1 and Korea-2 were determined and aligned with six previously reported complete DWV genome sequences from different countries, as well as KGV and VDV-1, and a phylogenetic tree was constructed. The two Korean DWVs shared 96.4% similarity. Interestingly, the Korea-2 genome was more similar to the USA genome (96.5%) than to the Korea-1 genome. The Korean genotypes were highly conserved with the USA genome (96%) but showed low similarity with the United Kingdom3 (UK3) genome (89%). The end of the 5' untranslated region (UTR), the start of the open reading frame (ORF) region, and the 3' UTR were variable and contained several substitutions/transitions. This phenomenon may be explained by intramolecular recombination between the Korean and other DWV genotypes. PMID:24035266

  20. Investigation of the protein osteocalcin of Camelops hesternus: Sequence, structure and phylogenetic implications

    NASA Astrophysics Data System (ADS)

    Humpula, James F.; Ostrom, Peggy H.; Gandhi, Hasand; Strahler, John R.; Walker, Angela K.; Stafford, Thomas W.; Smith, James J.; Voorhies, Michael R.; George Corner, R.; Andrews, Phillip C.

    2007-12-01

    Ancient DNA sequences offer an extraordinary opportunity to unravel the evolutionary history of ancient organisms. Protein sequences offer another reservoir of genetic information that has recently become tractable through the application of mass spectrometric techniques. The extent to which ancient protein sequences resolve phylogenetic relationships, however, has not been explored. We determined the osteocalcin amino acid sequence from the bone of an extinct camelid (21 ka, Camelops hesternus) excavated from Isleta Cave, New Mexico and three bones of extant camelids: bactrian camel (Camelus bactrianus), dromedary camel (Camelus dromedarius) and guanaco (Llama guanacoe) for a diagenetic and phylogenetic assessment. There was no difference in sequence among the four taxa. Structural attributes observed in both modern and ancient osteocalcin include a post-translational modification, Hyp 9, deamidation of Gln 35 and Gln 39, and oxidation of Met 36. Carbamylation of the N-terminus in ancient osteocalcin may result in blockage and explain previous difficulties in sequencing ancient proteins via Edman degradation. A phylogenetic analysis using osteocalcin sequences of 25 vertebrate taxa was conducted to explore osteocalcin protein evolution and the utility of osteocalcin sequences for delineating phylogenetic relationships. The maximum likelihood tree closely reflected generally recognized taxonomic relationships. For example, maximum likelihood analysis recovered rodents, birds and, within hominins, the Homo-Pan-Gorilla trichotomy. Within Artiodactyla, character state analysis showed that a substitution of Pro 4 for His 4 defines the Capra-Ovis clade. Homoplasy in our analysis indicated that osteocalcin evolution is not a perfect indicator of species evolution. Limited sequence availability prevented assigning functional significance to sequence changes. Our preliminary analysis of osteocalcin evolution represents an initial step towards a

  1. The phylogenetic utility and functional constraint of microRNA flanking sequences

    PubMed Central

    Kenny, Nathan J.; Sin, Yung Wa; Hayward, Alexander; Paps, Jordi; Chu, Ka Hou; Hui, Jerome H. L.

    2015-01-01

    MicroRNAs (miRNAs) have recently risen to prominence as novel factors responsible for post-transcriptional regulation of gene expression. miRNA genes have been posited as highly conserved in the clades in which they exist. Consequently, miRNAs have been used as rare genome change characters to estimate phylogeny by tracking their gain and loss. However, their short length (21–23 bp) has limited their perceived utility in sequence-based phylogenetic inference. Here, using reference taxa with established phylogenetic relationships, we demonstrate that miRNA sequences are of high utility in quantitative, rather than in qualitative, phylogenetic analysis. The clear orthology among miRNA genes from different species makes it straightforward to identify and align these sequences from even fragmentary datasets. We also identify significant sequence conservation in the regions directly flanking miRNA genes, and show that this too is of utility in phylogenetic analysis, as well as highlighting conserved regions that will be of interest to other fields. Employing miRNA sequences from 12 sequenced drosophilid genomes, together with a Tribolium castaneum outgroup, we demonstrate that this approach is robust using Bayesian and maximum-likelihood methods. The utility of these characters is further demonstrated in the rhabditid nematodes and primates. As next-generation sequencing makes it more cost-effective to sequence genomes and small RNA libraries, this methodology provides an alternative data source for phylogenetic analysis. The approach allows rapid resolution of relationships between both closely related and rapidly evolving species, and provides an additional tool for investigation of relationships within the tree of life. PMID:25694624

  2. Alignments of RNA structures.

    PubMed

    Blin, Guillaume; Denise, Alain; Dulucq, Serge; Herrbach, Claire; Touzet, Hélène

    2010-01-01

    We describe a theoretical unifying framework to express the comparison of RNA structures, which we call alignment hierarchy. This framework relies on the definition of common supersequences for arc-annotated sequences and encompasses the main existing models for RNA structure comparison based on trees and arc-annotated sequences with a variety of edit operations. It also gives rise to edit models that have not been studied yet. We provide a thorough analysis of the alignment hierarchy, including a new polynomial-time algorithm and an NP-completeness proof. The polynomial-time algorithm involves biologically relevant edit operations such as pairing or unpairing nucleotides. It has been implemented in a software, called gardenia, which is available at the Web server http://bioinfo.lifl.fr/RNA/gardenia. PMID:20431150

  3. Phylogenetic analysis of adenovirus sequences.

    PubMed

    Harrach, Balázs; Benko, Mária

    2007-01-01

    Members of the family Adenoviridae have been isolated from a large variety of hosts, including representatives from every major vertebrate class from fish to mammals. The high prevalence, together with the fairly conserved organization of the central part of their genomes, make the adenoviruses one of (if not the) best models for studying viral evolution on a larger time scale. Phylogenetic calculation can infer the evolutionary distance among adenovirus strains on serotype, species, and genus levels, thus helping the establishment of a correct taxonomy on the one hand, and speeding up the process of typing new isolates on the other. Initially, four major lineages corresponding to four genera were recognized. Later, the demarcation criteria of lower taxon levels, such as species or types, could also be defined with phylogenetic calculations. A limited number of possible host switches have been hypothesized and convincingly supported. Application of the web-based BLAST and MultAlin programs and the freely available PHYLIP package, along with the TreeView program, enables everyone to make correct calculations. In addition to step-by-step instruction on how to perform phylogenetic analysis, critical points where typical mistakes or misinterpretation of the results might occur will be identified and hints for their avoidance will be provided. PMID:17656792

  4. Prioritizing Populations for Conservation Using Phylogenetic Networks

    PubMed Central

    Volkmann, Logan; Martyn, Iain; Moulton, Vincent; Spillner, Andreas; Mooers, Arne O.

    2014-01-01

    In the face of inevitable future losses to biodiversity, ranking species by conservation priority seems more than prudent. Setting conservation priorities within species (i.e., at the population level) may be critical as species ranges become fragmented and connectivity declines. However, existing approaches to prioritization (e.g., scoring organisms by their expected genetic contribution) are based on phylogenetic trees, which may be poor representations of differentiation below the species level. In this paper we extend evolutionary isolation indices used in conservation planning from phylogenetic trees to phylogenetic networks. Such networks better represent population differentiation, and our extension allows populations to be ranked in order of their expected contribution to the set. We illustrate the approach using data from two imperiled species: the spotted owl Strix occidentalis in North America and the mountain pygmy-possum Burramys parvus in Australia. Using previously published mitochondrial and microsatellite data, we construct phylogenetic networks and score each population by its relative genetic distinctiveness. In both cases, our phylogenetic networks capture the geographic structure of each species: geographically peripheral populations harbor less-redundant genetic information, increasing their conservation rankings. We note that our approach can be used with all conservation-relevant distances (e.g., those based on whole-genome, ecological, or adaptive variation) and suggest it be added to the assortment of tools available to wildlife managers for allocating effort among threatened populations. PMID:24586451

  5. Sequence comparison via polar coordinates representation and curve tree.

    PubMed

    Dai, Qi; Guo, Xiaodong; Li, Lihua

    2012-01-01

    Sequence comparison has become an essential tool in bioinformatics research, serving as evidence of structural and functional conservation, as well as of evolutionary relations among sequences. Existing graphical representation methods have achieved promising results in sequence comparison, but there are some design challenges with the graphical representations and feature-based measures. We report here a new method for sequence comparison. It considers the whole distribution of dual bases and employs the polar coordinates method to map a biological sequence into a closed curve. A curve tree is then constructed to numerically characterize the closed curve, and biological sequences are compared by evaluating the distance between the curve tree of the query sequence and the corresponding curve tree of the template sequence. The proposed method was tested by phylogenetic analysis, and its performance was further compared with alignment-based methods. The results demonstrate that using polar coordinates representation and curve tree to compare sequences is more efficient. PMID:22001081

  6. Coalescent Histories for Lodgepole Species Trees.

    PubMed

    Disanto, Filippo; Rosenberg, Noah A

    2015-10-01

    Coalescent histories are combinatorial structures that describe for a given gene tree and species tree the possible lists of branches of the species tree on which the gene tree coalescences take place. Properties of the number of coalescent histories for gene trees and species trees affect a variety of probabilistic calculations in mathematical phylogenetics. Exact and asymptotic evaluations of the number of coalescent histories, however, are known only in a limited number of cases. Here we introduce a particular family of species trees, the lodgepole species trees (λ_n)_{n ≥ 0}, in which tree λ_n has m = 2n + 1 taxa. We determine the number of coalescent histories for the lodgepole species trees, in the case that the gene tree matches the species tree, showing that this number grows with m!! in the number of taxa m. This computation demonstrates the existence of tree families in which the growth in the number of coalescent histories is faster than exponential. Further, it provides a substantial improvement on the lower bound for the ratio of the largest number of matching coalescent histories to the smallest number of matching coalescent histories for trees with m taxa, increasing a previous bound of [Formula: see text] to [Formula: see text]. We discuss the implications of our enumerative results for phylogenetic computations. PMID:25973633
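
    To give a feel for what "grows with m!!" means, the short sketch below tabulates the double factorial m!! for the taxon counts m = 2n + 1 of the first few lodgepole trees, next to the exponential 2^m. It only illustrates the growth rate named in the record; it does not compute the number of coalescent histories itself, and the function names are hypothetical.

```python
def double_factorial(m):
    """m!! = m * (m - 2) * (m - 4) * ... down to 1 or 2."""
    result = 1
    while m > 1:
        result *= m
        m -= 2
    return result


def lodgepole_taxa(n):
    """Number of taxa m = 2n + 1 in the lodgepole tree with index n."""
    return 2 * n + 1


if __name__ == "__main__":
    print("n  m  m!!  2^m")
    for n in range(8):
        m = lodgepole_taxa(n)
        print(n, m, double_factorial(m), 2 ** m)
```

    For odd m, each step from m to m + 2 multiplies m!! by m + 2, a factor that keeps increasing, whereas any exponential c^m is multiplied by the fixed factor c^2. This is why m!! eventually outgrows every exponential.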

  7. Mitochondrial DNA control region of three mackerels, genus Rastrelliger: structure, molecular diversity and phylogenetic relationship.

    PubMed

    Jondeung, Amnuay; Karinthanyakit, Wirangrong

    2016-07-01

    The complete mitochondrial control regions (CR) of three mackerels (Rastrelliger spp.) were examined and analyzed. The CR contained three domains, in which three termination-associated sequences (TAS-I, TAS-II and TAS-III), two central conserved sequence blocks (CSB-E, CSB-D), three conserved sequence blocks (CSB-I, CSB-II, and CSB-III) and a putative promoter were detected. Molecular indices analyses of the aligned complete CR sequences showed high levels of haplotype diversity and genetic divergence among the three species. The intraspecific divergence among species of this genus ranged from 0.25% to 1.62% and interspecific divergence from 1.90% to 4.30%. The phylogenetic tree shows monophyly with R. brachysoma as a basal species of Rastrelliger. Applying the average divergence rate for fish control regions, the results suggest that the separation among Rastrelliger species could have occurred in the middle Pleistocene. PMID:26119119

  8. Phylogenetic Approaches to Natural Product Structure Prediction

    PubMed Central

    Ziemert, Nadine; Jensen, Paul R.

    2015-01-01

    Phylogenetics is the study of the evolutionary relatedness among groups of organisms. Molecular phylogenetics uses sequence data to infer these relationships for both organisms and the genes they maintain. With the large amount of publicly available sequence data, phylogenetic inference has become increasingly important in all fields of biology. In the case of natural product research, phylogenetic relationships are proving to be highly informative in terms of delineating the architecture and function of the genes involved in secondary metabolite biosynthesis. Polyketide synthases and nonribosomal peptide synthetases provide model examples in which individual domain phylogenies display different predictive capacities, resolving features ranging from substrate specificity to structural motifs associated with the final metabolic product. This chapter provides examples in which phylogeny has proven effective in terms of predicting functional or structural aspects of secondary metabolism. The basics of how to build a reliable phylogenetic tree are explained along with information about programs and tools that can be used for this purpose. Furthermore, it introduces the Natural Product Domain Seeker, a recently developed Web tool that employs phylogenetic logic to classify ketosynthase and condensation domains based on established enzyme architecture and biochemical function. PMID:23084938

  9. Alignment fixture

    DOEpatents

    Bell, Grover C.; Gibson, O. Theodore

    1980-01-01

    A part alignment fixture is provided which may be used for precise variable lateral and tilt alignment relative to the fixture base of various shaped parts. The fixture may be used as a part holder for machining or inspection of parts or alignment of parts during assembly and the like. The fixture includes a precisely machined diameter disc-shaped hub adapted to receive the part to be aligned. The hub is nested in a guide plate which is adapted to carry two oppositely disposed pairs of positioning wedges so that the wedges may be reciprocatively positioned by means of respective micrometer screws. The sloping faces of the wedges contact the hub at respective quadrants of the hub periphery. The lateral position of the hub relative to the guide plate is adjusted by positioning the wedges with the associated micrometer screws. The tilt of the part is adjusted relative to a base plate, to which the guide plate is pivotally connected by means of a holding plate. Two pairs of oppositely disposed wedges are mounted for reciprocative lateral positioning by means of separate micrometer screws between flanges of the guide plate and the base plate. Once the wedges are positioned to achieve the proper tilt of the part or hub on which the part is mounted relative to the base plate, the fixture may be bolted to a machining, inspection, or assembly device.

  10. Application of 16S rRNA, cytochrome b and control region sequences for understanding the phylogenetic relationships in Oryx species.

    PubMed

    Khan, H A; Arif, I A; Al Homaidan, A A; Al Farhan, A H

    2008-01-01

    The present study reports the application of mitochondrial markers for the molecular phylogeny of Oryx species, including the Arabian oryx (AO), scimitar-horned oryx (SHO) and plains oryx (PO), using the Addax as an outgroup. Sequences of three molecular markers, 16S rRNA, cytochrome b and a control region, for the above four taxa were aligned and the topologies of respective phylogenetic trees were compared. All these markers clearly differentiated the genus Addax from Oryx. However, for species-level grouping, while 16S rRNA and cytochrome b produced similar phylogeny (SHO grouped with PO), the control region grouped SHO with AO. Further studies are warranted to generate more sequencing data, apply multiple bioinformatics tools and to include relevant nuclear markers for phylogenetic analysis of Oryx species. PMID:19224456

  11. Molecular phylogenetic studies on filarial parasites based on 5S ribosomal spacer sequences.

    PubMed

    Xie, H; Bain, O; Williams, S A

    1994-06-01

    This paper is the first large-scale molecular phylogenetic study on filarial parasites (family Onchocercidae) which includes 16 species of 6 genera: Brugia beaveri Ash et Little, 1962; B. buckleyi Dissanaike et Paramananthan, 1961; B. malayi (Brug, 1927) Buckley, 1960; B. pahangi (Buckley et Edeson, 1956) Buckley, 1960; B. patei (Buckley, Nelson et Heisch, 1958) Buckley, 1960; B. timori Partono et al, 1977; Wuchereria bancrofti (Cobbold, 1877) Seurat, 1921; W. kalimantani Palmieri, Purnomo, Dennis and Marwoto, 1980; Mansonella perstans (Manson, 1891) Eberhard et Orihel, 1984; Loa loa Stiles, 1905; Onchocerca volvulus (Leuckart, 1893) Railliet et Henry, 1910; O. ochengi Bwangamoi, 1969; O. gutturosa Neumann, 1910; Dirofilaria immitis (Leidy, 1856) Railliet et Henry, 1911; Acanthocheilonema viteae (Krepkogorskaya, 1933) Bain, Baker et Chabaud, 1982 and Litomosoides sigmodontis Chandler, 1931. 5S rRNA gene spacer region sequence data were collected by PCR, cloning and dideoxy sequencing. The 5S rRNA gene spacer region sequences were aligned and analyzed by maximum parsimony algorithms, distance methods and maximum likelihood methods to construct phylogenetic trees. Bootstrap analysis was used to test the robustness of the different phylogenetic reconstructions. The data indicated that 5S spacer region sequences are highly conserved within species yet differ significantly between species. Spliced leader sequences were observed in all of the 5S rDNA spacers with no sequence variation, although flanking region sequence and length heterogeneity was observed even within species. All of the various tree-building methods gave very similar results. This study identified four clades which are strongly supported by bootstrap analysis: the Brugia clade, the Wuchereria clade, the Brugia-Wuchereria clade, and the Onchocerca clade. The analyses indicated that L. sigmodontis and A. viteae may be the most primitive among the 16 species studied. The data did not show any close

  12. Phylogenetic analysis of Demodex caprae based on mitochondrial 16S rDNA sequence.

    PubMed

    Zhao, Ya-E; Hu, Li; Ma, Jun-Xian

    2013-11-01

    Demodex caprae infests the hair follicles and sebaceous glands of goats worldwide, which not only seriously impairs goat farming but also causes substantial economic loss. However, there are few reports on D. caprae at the DNA level. To reveal the taxonomic position of D. caprae within the genus Demodex, the present study conducted phylogenetic analysis of D. caprae based on mt16S rDNA sequence data. D. caprae adults and eggs were obtained from a skin nodule of a goat suffering from demodicidosis. The mt16S rDNA sequences of individual mites were amplified using specific primers, and then cloned, sequenced, and aligned. The sequence divergence, genetic distance, and transition/transversion rate were computed, and the phylogenetic trees in Demodex were reconstructed. Partial sequences of 339 bp were obtained from six D. caprae isolates, and the sequence identity was 100% among isolates. The pairwise divergences between D. caprae and Demodex canis, Demodex folliculorum, and Demodex brevis were 22.2-24.0%, 24.0-24.9%, and 22.9-23.2%, respectively. The corresponding average genetic distances were 2.840, 2.926, and 2.665, and the average transition/transversion rates were 0.70, 0.55, and 0.54, respectively. The divergences, genetic distances, and transition/transversion rates of D. caprae versus the other three species all reached interspecies level. The five phylogenetic trees all showed that D. caprae clustered with D. brevis first, and then with D. canis, D. folliculorum, and Demodex injai in sequence. In conclusion, D. caprae is an independent species, and it is closer to D. brevis than to D. canis, D. folliculorum, or D. injai. PMID:23996126
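
    The transition/transversion rate reported above can be illustrated with a minimal sketch, assuming a simple count over two aligned DNA sequences: a transition is a change within purines (A, G) or within pyrimidines (C, T), and any other change is a transversion. The record does not say how its rates were estimated, so this is not necessarily the authors' procedure. The function names and toy sequences are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def ts_tv_counts(seq_a, seq_b):
    """Count transitions and transversions between two aligned sequences.

    Sites with gaps or ambiguous characters are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = 0
    transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions


def ts_tv_ratio(seq_a, seq_b):
    """Ratio of transitions to transversions; None if no transversions."""
    transitions, transversions = ts_tv_counts(seq_a, seq_b)
    if transversions == 0:
        return None
    return transitions / transversions


if __name__ == "__main__":
    # Toy sequences (hypothetical, not Demodex data).
    print(ts_tv_counts("AACCGGTT", "GACTGCTA"))
    print(ts_tv_ratio("AACCGGTT", "GACTGCTA"))
```

    In the toy pair, A/G and C/T are transitions while G/C and T/A are transversions.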

  13. Less is more in mammalian phylogenomics: AT-rich genes minimize tree conflicts and unravel the root of placental mammals.

    PubMed

    Romiguier, Jonathan; Ranwez, Vincent; Delsuc, Frédéric; Galtier, Nicolas; Douzery, Emmanuel J P

    2013-09-01

    Despite the rapid increase of size in phylogenomic data sets, a number of important nodes on animal phylogeny are still unresolved. Among these, the rooting of the placental mammal tree is still a controversial issue. One difficulty lies in the pervasive phylogenetic conflicts among genes, with each one telling its own story, which may be reliable or not. Here, we identified a simple criterion, that is, the GC content, which substantially helps in determining which gene trees best reflect the species tree. We assessed the ability of 13,111 coding sequence alignments to correctly reconstruct the placental phylogeny. We found that GC-rich genes induced a higher amount of conflict among gene trees and performed worse than AT-rich genes in retrieving well-supported, consensual nodes on the placental tree. We interpret this GC effect mainly as a consequence of genome-wide variations in recombination rate. Indeed, recombination is known to drive GC-content evolution through GC-biased gene conversion and might be problematic for phylogenetic reconstruction, for instance, in an incomplete lineage sorting context. When we focused on the AT-richest fraction of the data set, the resolution level of the placental phylogeny was greatly increased, and a strong support was obtained in favor of an Afrotheria rooting, that is, Afrotheria as the sister group of all other placentals. We show that in mammals most conflicts among gene trees, which have so far hampered the resolution of the placental tree, are concentrated in the GC-rich regions of the genome. We argue that the GC content-because it is a reliable indicator of the long-term recombination rate-is an informative criterion that could help in identifying the most reliable molecular markers for species tree inference. PMID:23813978
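
    The GC-content criterion described above can be illustrated with a minimal sketch. It assumes overall GC content per alignment (the abstract does not say which GC measure was used), and the function names, gene names, and toy sequences are hypothetical, not taken from the study.

```python
def gc_content(seq):
    """Fraction of unambiguous bases (A, C, G, T) that are G or C."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at  # gaps and ambiguity codes are ignored
    return gc / total if total else 0.0


def at_richest(alignments, fraction):
    """Names of the AT-richest `fraction` of alignments, lowest mean GC first.

    Assumption: `alignments` maps a gene name to its list of aligned sequences.
    """
    mean_gc = {
        name: sum(gc_content(s) for s in seqs) / len(seqs)
        for name, seqs in alignments.items()
    }
    ranked = sorted(mean_gc, key=mean_gc.get)
    keep = max(1, round(len(ranked) * fraction))
    return ranked[:keep]


# Toy, made-up alignments (hypothetical gene names and sequences).
toy = {
    "geneA": ["ATATATGC", "ATATATGA"],
    "geneB": ["GCGCGCAT", "GCGCGCGC"],
    "geneC": ["ATGCATGC", "ATGCATGA"],
}
print(at_richest(toy, 0.34))  # ['geneA']
```

    In practice the retained subset would then be passed to a tree-building program; that step is outside this sketch.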

  14. DNA barcoding and phylogenetic relationships in Timaliidae.

    PubMed

    Huang, Z H; Ke, D H

    2015-01-01

    The Timaliidae, a diverse family of oscine passerine birds, has long been a subject of debate regarding its phylogeny. The mitochondrial cytochrome c oxidase subunit I (COI) gene has been used as a powerful marker for identification and phylogenetic studies of animal species. In the present study, we analyzed the COI barcodes of 71 species from 21 genera belonging to the family Timaliidae. Every bird species possessed a barcode distinct from that of other bird species. Kimura two-parameter (K2P) distances were calculated between barcodes. The average genetic distance between species was 18 times higher than the average genetic distance within species. The neighbor-joining method was used to construct a phylogenetic tree and all the species could be discriminated by their distinct clades within the phylogenetic tree. The results indicate that some currently recognized babbler genera might not be monophyletic, with the COI gene data supporting the hypothesis of polyphyly for Garrulax, Alcippe, and Minla. Thus, DNA barcoding is an effective molecular tool for Timaliidae species identification and phylogenetic inference. PMID:26125793
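
    As an illustration of the Kimura two-parameter distance mentioned above, the sketch below uses the standard K2P formula, d = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)), where P and Q are the proportions of transitions and transversions. The function name and the handling of gaps (skipped) are assumptions for illustration and may differ from the software the authors used.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences.

    Assumption: sites with gaps or ambiguity codes in either sequence are skipped.
    Raises ValueError from math.log if the sequences are too divergent
    for the formula to be defined.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1  # purine<->pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy example: one transition (A<->G) in ten sites.
print(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"))
```

    Averaging such pairwise distances within and between species gives the kind of between/within ratio reported in the abstract.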

  15. Vicariant patterns of fragmentation among gekkonid lizards of the genus Teratoscincus produced by the Indian collision: A molecular phylogenetic perspective and an area cladogram for Central Asia.

    PubMed

    Macey, J R; Wang, Y; Ananjeva, N B; Larson, A; Papenfuss, T J

    1999-08-01

    A well-supported phylogenetic hypothesis is presented for gekkonid lizards of the genus Teratoscincus. Phylogenetic relationships of four of the five species are investigated using 1733 aligned bases of mitochondrial DNA sequence from the genes encoding ND1 (subunit one of NADH dehydrogenase), tRNA(Ile), tRNA(Gln), tRNA(Met), ND2, tRNA(Trp), tRNA(Ala), tRNA(Asn), tRNA(Cys), tRNA(Tyr), and COI (subunit I of cytochrome c oxidase). A single most parsimonious tree depicts T. przewalskii and T. roborowskii as a monophyletic group, with T. scincus as their sister taxon and T. microlepis as the sister taxon to the clade containing the first three species. The aligned sequences contain 341 phylogenetically informative characters. Each node is supported by a bootstrap value of 100% and the shortest suboptimal tree requires 29 additional steps. Allozymic variation is presented for proteins encoded by 19 loci but these data are largely uninformative phylogenetically. Teratoscincus species occur on tectonic plates of Gondwanan origin that were compressed by the impinging Indian Subcontinent, resulting in massive montane uplifting along plate boundaries. Taxa occurring in China (Tarim Block) form a monophyletic group showing vicariant separation from taxa in former Soviet Central Asia and northern Afghanistan (Farah Block); alternative biogeographic hypotheses are statistically rejected. This vicariant event involved the rise of the Tien Shan-Pamir and is well dated to 10 million years before present. Using this date for separation of taxa occurring on opposite sides of the Tien Shan-Pamir, an evolutionary rate of 0.57% divergence per lineage per million years is calculated. This rate is similar to estimates derived from fish, bufonid frogs, and agamid lizards for the same region of the mitochondrial genome (approximately 0.65% divergence per lineage per million years). Evolutionary divergence of the mitochondrial genome has a surprisingly stable rate across vertebrates.
PMID

  16. High-resolution SAR11 ecotype dynamics at the Bermuda Atlantic Time-series Study site by phylogenetic placement of pyrosequences.

    PubMed

    Vergin, Kevin L; Beszteri, Bánk; Monier, Adam; Thrash, J Cameron; Temperton, Ben; Treusch, Alexander H; Kilpert, Fabian; Worden, Alexandra Z; Giovannoni, Stephen J

    2013-07-01

    Advances in next-generation sequencing technologies are providing longer nucleotide sequence reads that contain more information about phylogenetic relationships. We sought to use this information to understand the evolution and ecology of bacterioplankton at our long-term study site in the Western Sargasso Sea. A bioinformatics pipeline called PhyloAssigner was developed to align pyrosequencing reads to a reference multiple sequence alignment of 16S ribosomal RNA (rRNA) genes and assign them phylogenetic positions in a reference tree using a maximum likelihood algorithm. Here, we used this pipeline to investigate the ecologically important SAR11 clade of Alphaproteobacteria. A combined set of 2.7 million pyrosequencing reads from the 16S rRNA V1-V2 regions, representing 9 years at the Bermuda Atlantic Time-series Study (BATS) site, was quality checked and parsed into a comprehensive bacterial tree, yielding 929 036 Alphaproteobacteria reads. Phylogenetic structure within the SAR11 clade was linked to seasonally recurring spatiotemporal patterns. This analysis resolved four new SAR11 ecotypes in addition to five others that had been described previously at BATS. The data support a conclusion reached previously that the SAR11 clade diversified by subdivision of niche space in the ocean water column, but the new data reveal a more complex pattern in which deep branches of the clade diversified repeatedly across depth strata and seasonal regimes. The new data also revealed the presence of an unrecognized clade of Alphaproteobacteria, here named SMA-1 (Sargasso Mesopelagic Alphaproteobacteria, group 1), in the upper mesopelagic zone. The high-resolution phylogenetic analyses performed herein highlight significant, previously unknown, patterns of evolutionary diversification, within perhaps the most widely distributed heterotrophic marine bacterial clade, and strongly links to ecosystem regimes. PMID:23466704

  17. High-resolution SAR11 ecotype dynamics at the Bermuda Atlantic Time-series Study site by phylogenetic placement of pyrosequences

    PubMed Central

    Vergin, Kevin L; Beszteri, Bánk; Monier, Adam; Cameron Thrash, J; Temperton, Ben; Treusch, Alexander H; Kilpert, Fabian; Worden, Alexandra Z; Giovannoni, Stephen J

    2013-01-01

    Advances in next-generation sequencing technologies are providing longer nucleotide sequence reads that contain more information about phylogenetic relationships. We sought to use this information to understand the evolution and ecology of bacterioplankton at our long-term study site in the Western Sargasso Sea. A bioinformatics pipeline called PhyloAssigner was developed to align pyrosequencing reads to a reference multiple sequence alignment of 16S ribosomal RNA (rRNA) genes and assign them phylogenetic positions in a reference tree using a maximum likelihood algorithm. Here, we used this pipeline to investigate the ecologically important SAR11 clade of Alphaproteobacteria. A combined set of 2.7 million pyrosequencing reads from the 16S rRNA V1–V2 regions, representing 9 years at the Bermuda Atlantic Time-series Study (BATS) site, was quality checked and parsed into a comprehensive bacterial tree, yielding 929 036 Alphaproteobacteria reads. Phylogenetic structure within the SAR11 clade was linked to seasonally recurring spatiotemporal patterns. This analysis resolved four new SAR11 ecotypes in addition to five others that had been described previously at BATS. The data support a conclusion reached previously that the SAR11 clade diversified by subdivision of niche space in the ocean water column, but the new data reveal a more complex pattern in which deep branches of the clade diversified repeatedly across depth strata and seasonal regimes. The new data also revealed the presence of an unrecognized clade of Alphaproteobacteria, here named SMA-1 (Sargasso Mesopelagic Alphaproteobacteria, group 1), in the upper mesopelagic zone. The high-resolution phylogenetic analyses performed herein highlight significant, previously unknown, patterns of evolutionary diversification, within perhaps the most widely distributed heterotrophic marine bacterial clade, and strongly links to ecosystem regimes. PMID:23466704

  18. A deliberate practice approach to teaching phylogenetic analysis.

    PubMed

    Hobbs, F Collin; Johnson, Daniel J; Kearns, Katherine D

    2013-01-01

    One goal of postsecondary education is to assist students in developing expert-level understanding. Previous attempts to encourage expert-level understanding of phylogenetic analysis in college science classrooms have largely focused on isolated, or "one-shot," in-class activities. Using a deliberate practice instructional approach, we designed a set of five assignments for a 300-level plant systematics course that incrementally introduces the concepts and skills used in phylogenetic analysis. In our assignments, students learned the process of constructing phylogenetic trees through a series of increasingly difficult tasks; thus, skill development served as a framework for building content knowledge. We present results from 5 yr of final exam scores, pre- and postconcept assessments, and student surveys to assess the impact of our new pedagogical materials on student performance related to constructing and interpreting phylogenetic trees. Students improved in their ability to interpret relationships within trees and improved in several aspects related to between-tree comparisons and tree construction skills. Student feedback indicated that most students believed our approach prepared them to engage in tree construction and gave them confidence in their abilities. Overall, our data confirm that instructional approaches implementing deliberate practice address student misconceptions, improve student experiences, and foster deeper understanding of difficult scientific concepts. PMID:24297294

  19. A Deliberate Practice Approach to Teaching Phylogenetic Analysis

    PubMed Central

    Hobbs, F. Collin; Johnson, Daniel J.; Kearns, Katherine D.

    2013-01-01

    One goal of postsecondary education is to assist students in developing expert-level understanding. Previous attempts to encourage expert-level understanding of phylogenetic analysis in college science classrooms have largely focused on isolated, or “one-shot,” in-class activities. Using a deliberate practice instructional approach, we designed a set of five assignments for a 300-level plant systematics course that incrementally introduces the concepts and skills used in phylogenetic analysis. In our assignments, students learned the process of constructing phylogenetic trees through a series of increasingly difficult tasks; thus, skill development served as a framework for building content knowledge. We present results from 5 yr of final exam scores, pre- and postconcept assessments, and student surveys to assess the impact of our new pedagogical materials on student performance related to constructing and interpreting phylogenetic trees. Students improved in their ability to interpret relationships within trees and improved in several aspects related to between-tree comparisons and tree construction skills. Student feedback indicated that most students believed our approach prepared them to engage in tree construction and gave them confidence in their abilities. Overall, our data confirm that instructional approaches implementing deliberate practice address student misconceptions, improve student experiences, and foster deeper understanding of difficult scientific concepts. PMID:24297294

  20. Improving phylogenetic regression under complex evolutionary models.

    PubMed

    Mazel, Florent; Davies, T Jonathan; Georges, Damien; Lavergne, Sébastien; Thuiller, Wilfried; Peres-Neto, Pedro R

    2016-02-01

    Phylogenetic Generalized Least Square (PGLS) is the tool of choice among phylogenetic comparative methods to measure the correlation between species features such as morphological and life-history traits or niche characteristics. In its usual form, it assumes that the residual variation follows a homogenous model of evolution across the branches of the phylogenetic tree. Since a homogenous model of evolution is unlikely to be realistic in nature, we explored the robustness of the phylogenetic regression when this assumption is violated. We did so by simulating a set of traits under various heterogeneous models of evolution, and evaluating the statistical performance (type I error [the percentage of tests based on samples that incorrectly rejected a true null hypothesis] and power [the percentage of tests that correctly rejected a false null hypothesis]) of classical phylogenetic regression. We found that PGLS has good power but unacceptable type I error rates. This finding is important since this method has been increasingly used in comparative analyses over the last decade. To address this issue, we propose a simple solution based on transforming the underlying variance-covariance matrix to adjust for model heterogeneity within PGLS. We suggest that heterogeneous rates of evolution might be particularly prevalent in large phylogenetic trees, while most current approaches assume a homogenous rate of evolution. Our analysis demonstrates that overlooking rate heterogeneity can result in inflated type I errors, thus misleading comparative analyses. We show that it is possible to correct for this bias even when the underlying model of evolution is not known a priori. PMID:27145604

  1. Probabilistic graphical model representation in phylogenetics.

    PubMed

    Höhna, Sebastian; Heath, Tracy A; Boussau, Bastien; Landis, Michael J; Ronquist, Fredrik; Huelsenbeck, John P

    2014-09-01

    Recent years have seen a rapid expansion of the model space explored in statistical phylogenetics, emphasizing the need for new approaches to statistical model representation and software development. Clear communication and representation of the chosen model is crucial for: (i) reproducibility of an analysis, (ii) model development, and (iii) software design. Moreover, a unified, clear and understandable framework for model representation lowers the barrier for beginners and nonspecialists to grasp complex phylogenetic models, including their assumptions and parameter/variable dependencies. Graphical modeling is a unifying framework that has gained in popularity in the statistical literature in recent years. The core idea is to break complex models into conditionally independent distributions. The strength lies in the comprehensibility, flexibility, and adaptability of this formalism, and the large body of computational work based on it. Graphical models are well-suited to teach statistical models, to facilitate communication among phylogeneticists and in the development of generic software for simulation and statistical inference. Here, we provide an introduction to graphical models for phylogeneticists and extend the standard graphical model representation to the realm of phylogenetics. We introduce a new graphical model component, tree plates, to capture the changing structure of the subgraph corresponding to a phylogenetic tree. We describe a range of phylogenetic models using the graphical model framework and introduce modules to simplify the representation of standard components in large and complex models. Phylogenetic model graphs can be readily used in simulation, maximum likelihood inference, and Bayesian inference using, for example, Metropolis-Hastings or Gibbs sampling of the posterior distribution. PMID:24951559

  2. GUIDANCE2: accurate detection of unreliable alignment regions accounting for the uncertainty of multiple parameters

    PubMed Central

    Sela, Itamar; Ashkenazy, Haim; Katoh, Kazutaka; Pupko, Tal

    2015-01-01

    Inference of multiple sequence alignments (MSAs) is a critical part of phylogenetic and comparative genomics studies. However, from the same set of sequences different MSAs are often inferred, depending on the methodologies used and the assumed parameters. Much effort has recently been devoted to improving the ability to identify unreliable alignment regions. Detecting such unreliable regions was previously shown to be important for downstream analyses relying on MSAs, such as the detection of positive selection. Here we developed GUIDANCE2, a new integrative methodology that accounts for: (i) uncertainty in the process of indel formation, (ii) uncertainty in the assumed guide tree and (iii) co-optimal solutions in the pairwise alignments, used as building blocks in progressive alignment algorithms. We compared GUIDANCE2 with seven methodologies to detect unreliable MSA regions using extensive simulations and empirical benchmarks. We show that GUIDANCE2 outperforms all previously developed methodologies. Furthermore, GUIDANCE2 also provides a set of alternative MSAs which can be useful for downstream analyses. The novel algorithm is implemented as a web-server, available at: http://guidance.tau.ac.il. PMID:25883146

  3. ALIGNING JIG

    DOEpatents

    Culver, J.S.; Tunnell, W.C.

    1958-08-01

    A jig or device is described for setting or aligning an opening in one member relative to another member or structure, with a predetermined offset, or it may be used for measuring the amount of offset with which the parts have previously been set. This jig comprises two blocks rabbeted to each other, with means for securing the upper block to the lower block. The upper block has fingers for contacting one of the members to be aligned, the lower block is designed to ride in grooves within the reference member, and calibration marks are provided to determine the amount of offset. This jig is specially designed to align the collimating slits of a mass spectrometer.

  4. Isolation and phylogenetic characterization of Canine distemper virus from India.

    PubMed

    Swati; Deka, Dipak; Uppal, Sanjeev Kumar; Verma, Ramneek

    2015-09-01

    Canine distemper (CD), caused by canine distemper virus (CDV), is a highly contagious disease that infects a variety of carnivores. Sequence analysis of CDVs from different geographical areas has shown a lot of variation in the genome of the virus, especially in the haemagglutinin gene, which might be one of the causes of vaccine failure. In this study, we isolated the virus (place: Ludhiana, Punjab; year: 2014) and further cloned, sequenced and analyzed partial haemagglutinin (H) gene and full length genes for fusion protein (F), phosphoprotein (P) and matrix protein (M) from an Indian wild-type CDV. Higher sequence homology was observed with the strains from Switzerland, Hungary, Germany; and lower with the vaccine strains like Onderstepoort, CDV3, Convac for all the genes. The multiple sequence alignment showed more variation in partial H (45 nucleotide and 5 amino acid substitutions) and complete F (79 nucleotide and 30 amino acid substitutions) than in complete P (44 nucleotide and 22 amino acid substitutions) and complete M (22 nucleotide and 4 amino acid substitutions) gene/protein. Predicted potential N-linked glycosylation sites in H, F, M and P proteins were similar to the previously known wild-type CDVs but different from the vaccine strains. The Indian CDV formed a distinct clade in the phylogenetic tree clearly separated from the previously known wild-type and vaccine strains. PMID:26396979

  5. The phylogenetic analysis of variable-length sequence data: elongation factor-1alpha introns in European populations of the parasitoid wasp genus Pauesia (Hymenoptera: Braconidae: Aphidiinae).

    PubMed

    Sanchis, A; Michelena, J M; Latorre, A; Quicke, D L; Gärdenfors, U; Belshaw, R

    2001-06-01

    Elongation factor-1alpha (EF-1alpha) is a highly conserved nuclear coding gene that can be used to investigate recent divergences due to the presence of rapidly evolving introns. However, a universal feature of intron sequences is that even closely related species exhibit insertion and deletion events, which cause variation in the lengths of the sequences. Indels are frequently rich in evolutionary information, but most investigators ignore sites that fall within these variable regions, largely because the analytical tools and theory are not well developed. We examined this problem in the taxonomically problematic parasitoid wasp genus Pauesia (Hymenoptera: Braconidae: Aphidiinae) using congruence as a criterion for assessing a range of methods for aligning such variable-length EF-1alpha intron sequences. These methods included distance- and parsimony-based multiple-alignment programs (CLUSTAL W and MALIGN), direct optimization (POY), and two "by eye" alignment strategies. Furthermore, with one method (CLUSTAL W) we explored in detail the robustness of results to changes in the gap cost parameters. Phenetic-based alignments ("by eye" and CLUSTAL W) appeared, under our criterion, to perform as well as more readily defensible, but computationally more demanding, methods. In general, all of our alignment and tree-building strategies recovered the same basic topological structure, which means that an underlying phylogenetic signal remained regardless of the strategy chosen. However, several relationships between clades were sensitive both to alignment and to tree-building protocol. Further alignments, considering only sequences belonging to the same group, allowed us to infer a range of phylogenetic relationships that were highly robust to tree-building protocol. By comparing these topologies with those obtained by varying the CLUSTAL parameters, we generated the distribution area of congruence and taxonomic compatibility. 
Finally, we present the first robust estimate

  6. Image alignment

    DOEpatents

    Dowell, Larry Jonathan

    2014-04-22

    Disclosed is a method and device for aligning at least two digital images. An embodiment may use frequency-domain transforms of small tiles created from each image to identify substantially similar, "distinguishing" features within each of the images, and then align the images together based on the location of the distinguishing features. To accomplish this, an embodiment may create equal sized tile sub-images for each image. A "key" for each tile may be created by performing a frequency-domain transform calculation on each tile. An information-distance difference between each possible pair of tiles on each image may be calculated to identify distinguishing features. From analysis of the information-distance differences of the pairs of tiles, a subset of tiles with high discrimination metrics in relation to other tiles may be located for each image. The subset of distinguishing tiles for each image may then be compared to locate tiles with substantially similar keys and/or information-distance metrics to other tiles of other images. Once similar tiles are located for each image, the images may be aligned in relation to the identified similar tiles.

  7. The Inference of Gene Trees with Species Trees

    PubMed Central

    Szöllősi, Gergely J.; Tannier, Eric; Daubin, Vincent; Boussau, Bastien

    2015-01-01

    This article reviews the various models that have been used to describe the relationships between gene trees and species trees. Molecular phylogeny has focused mainly on improving models for the reconstruction of gene trees based on sequence alignments. Yet, most phylogeneticists seek to reveal the history of species. Although the histories of genes and species are tightly linked, they are seldom identical, because genes duplicate, are lost or horizontally transferred, and because alleles can coexist in populations for periods that may span several speciation events. Building models describing the relationship between gene and species trees can thus improve the reconstruction of gene trees when a species tree is known, and vice versa. Several approaches have been proposed to solve the problem in one direction or the other, but in general neither gene trees nor species trees are known. Only a few studies have attempted to jointly infer gene trees and species trees. These models account for gene duplication and loss, transfer or incomplete lineage sorting. Some of them consider several types of events together, but none exists currently that considers the full repertoire of processes that generate gene trees along the species tree. Simulations as well as empirical studies on genomic data show that combining gene tree–species tree models with models of sequence evolution improves gene tree reconstruction. In turn, these better gene trees provide a more reliable basis for studying genome evolution or reconstructing ancestral chromosomes and ancestral gene sequences. We predict that gene tree–species tree methods that can deal with genomic data sets will be instrumental to advancing our understanding of genomic evolution. PMID:25070970

  8. Exact solutions for species tree inference from discordant gene trees.

    PubMed

    Chang, Wen-Chieh; Górecki, Paweł; Eulenstein, Oliver

    2013-10-01

    Phylogenetic analysis has to overcome the grand challenge of inferring accurate species trees from evolutionary histories of gene families (gene trees) that are discordant with the species tree along whose branches they have evolved. Two well studied approaches to cope with this challenge are to solve either biologically informed gene tree parsimony (GTP) problems under gene duplication, gene loss, and deep coalescence, or the classic RF supertree problem that does not rely on any biological model. Despite the potential of these problems to infer credible species trees, they are NP-hard. Therefore, these problems are addressed by heuristics that typically lack any provable accuracy and precision. We describe fast dynamic programming algorithms that solve the GTP problems and the RF supertree problem exactly, and demonstrate that our algorithms can solve instances with data sets consisting of as many as 22 taxa. Extensions of our algorithms can also report the number of all optimal species trees, as well as the trees themselves. To better assess the quality of the resulting species trees that best fit the given gene trees, we also compute the worst case species trees, their numbers, and optimization score for each of the computational problems. Finally, we demonstrate the performance of our exact algorithms using empirical and simulated data sets, and analyze the quality of heuristic solutions for the studied problems by contrasting them with our exact solutions. PMID:24131054
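
    The objective behind the RF supertree problem can be sketched with the Robinson-Foulds (RF) distance, which counts the bipartitions found in exactly one of two trees. This is only an illustration of the scoring function, not the authors' dynamic programming algorithm. It assumes trees are given as nested tuples of leaf names and that all trees share the same leaf set (the general supertree setting allows differing leaf sets); all function names are hypothetical.

```python
def leaf_sets(tree, out):
    """Return the frozenset of leaves under `tree`; record each internal clade in `out`."""
    if isinstance(tree, str):
        return frozenset([tree])
    leaves = frozenset().union(*(leaf_sets(child, out) for child in tree))
    out.add(leaves)
    return leaves


def splits(tree):
    """Non-trivial bipartitions of a tree, each stored as the side lacking the anchor taxon."""
    clades = set()
    all_leaves = leaf_sets(tree, clades)
    anchor = min(all_leaves)  # fixed reference taxon makes the representation canonical
    result = set()
    for clade in clades:
        side = all_leaves - clade if anchor in clade else clade
        if 1 < len(side) < len(all_leaves) - 1:  # drop trivial splits
            result.add(side)
    return result


def rf_distance(tree1, tree2):
    """Robinson-Foulds distance between two trees on the same leaf set."""
    return len(splits(tree1) ^ splits(tree2))


def rf_score(candidate, gene_trees):
    """Sum of RF distances from a candidate species tree to each gene tree."""
    return sum(rf_distance(candidate, g) for g in gene_trees)


# Toy example with made-up trees.
t1 = ((("A", "B"), "C"), ("D", "E"))
t2 = ((("A", "C"), "B"), ("D", "E"))
print(rf_distance(t1, t2))  # 2
```

    An exact method would search for the candidate minimizing `rf_score` over all species trees; the sketch only evaluates a given candidate.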

  9. Exploration of phylogenetic data using a global sequence analysis method

    PubMed Central

    Chapus, Charles; Dufraigne, Christine; Edwards, Scott; Giron, Alain; Fertil, Bernard; Deschavanne, Patrick

    2005-01-01

    Background: Molecular phylogenetic methods are based on alignments of nucleic or peptidic sequences. The tremendous increase in molecular data permits phylogenetic analyses of very long sequences and of many species, but also requires methods to help manage large datasets. Results: Here we explore the phylogenetic signal present in molecular data by genomic signatures, defined as the set of frequencies of short oligonucleotides present in DNA sequences. Although violating many of the standard assumptions of traditional phylogenetic analyses – in particular explicit statements of homology inherent in character matrices – the use of the signature does permit the analysis of very long sequences, even those that are unalignable, and is therefore most useful in cases where alignment is questionable. We compare the results obtained by traditional phylogenetic methods to those inferred by the signature method for two genes: RAG1, which is easily alignable, and 18S RNA, where alignments are often ambiguous for some regions. We also apply this method to a multigene data set of 33 genes for 9 bacteria and one archaeal species as well as to the whole genome of a set of 16 γ-proteobacteria. In addition to delivering phylogenetic results comparable to traditional methods, the comparison of signatures for the sequences involved in the bacterial example identified putative candidates for horizontal gene transfers. Conclusion: The signature method is therefore a fast tool for exploring phylogenetic data, providing not only a pretreatment for discovering new sequence relationships, but also for identifying cases of sequence evolution that could confound traditional phylogenetic analysis. PMID:16280081
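
    A genomic signature as defined above (the set of frequencies of short oligonucleotides) can be computed without any alignment. The sketch below is a minimal illustration: the word length k = 4, the single-strand counting, and the Euclidean distance between signatures are assumptions made here, not details confirmed by the abstract, and the function names are hypothetical.

```python
from collections import Counter
from itertools import product
import math


def signature(seq, k=4):
    """Frequencies of all 4**k DNA words of length k, in lexicographic order.

    Assumption: words containing anything other than A, C, G, T are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    words = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[w] for w in words)
    return [counts[w] / total if total else 0.0 for w in words]


def euclidean(sig1, sig2):
    """Euclidean distance between two signatures of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(sig1, sig2)))


# Toy, made-up sequences of different lengths; no alignment is needed.
s1 = signature("ACGTACGTACGTTTGACA", k=2)
s2 = signature("ACGTACGAACGTT", k=2)
print(euclidean(s1, s2))
```

    A matrix of such pairwise distances could then be fed to a distance-based tree method such as neighbor joining.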

  10. Relaxed Phylogenetics and Dating with Confidence

    PubMed Central

    Ho, Simon Y. W; Phillips, Matthew J

    2006-01-01

    In phylogenetics, the unrooted model of phylogeny and the strict molecular clock model are two extremes of a continuum. Despite their dominance in phylogenetic inference, it is evident that both are biologically unrealistic and that the real evolutionary process lies between these two extremes. Fortunately, intermediate models employing relaxed molecular clocks have been described. These models open the gate to a new field of “relaxed phylogenetics.” Here we introduce a new approach to performing relaxed phylogenetic analysis. We describe how it can be used to estimate phylogenies and divergence times in the face of uncertainty in evolutionary rates and calibration times. Our approach also provides a means for measuring the clocklikeness of datasets and comparing this measure between different genes and phylogenies. We find no significant rate autocorrelation among branches in three large datasets, suggesting that autocorrelated models are not necessarily suitable for these data. In addition, we place these datasets on the continuum of clocklikeness between a strict molecular clock and the alternative unrooted extreme. Finally, we present analyses of 102 bacterial, 106 yeast, 61 plant, 99 metazoan, and 500 primate alignments. From these we conclude that our method is phylogenetically more accurate and precise than the traditional unrooted model while adding the ability to infer a timescale to evolution. PMID:16683862

  11. The evolution of HPV by means of a phylogenetic study.

    PubMed

    Isea, Raúl; Chaves, Juan L; Montes, Esther; Rubio-Montero, Antonio J; Mayo, Rafael

    2009-01-01

    In this work we demonstrate the adequacy of revising classification systems based on molecular phylogenetic calculations that allow an arbitrary number of taxa and take advantage of high-performance computing platforms, using Human papillomavirus (HPV) as the case study. To do so, we have analysed several phylogenetic trees calculated with the PhyloGrid tool, a workflow developed in the framework of the EELA-2 Project. PMID:19593062

  12. Genomic Repeat Abundances Contain Phylogenetic Signal

    PubMed Central

    Dodsworth, Steven; Chase, Mark W.; Kelly, Laura J.; Leitch, Ilia J.; Macas, Jiří; Novák, Petr; Piednoël, Mathieu; Weiss-Schneeweiss, Hanna; Leitch, Andrew R.

    2015-01-01

    A large proportion of genomic information, particularly repetitive elements, is usually ignored when researchers are using next-generation sequencing. Here we demonstrate the usefulness of this repetitive fraction in phylogenetic analyses, utilizing comparative graph-based clustering of next-generation sequence reads, which results in abundance estimates of different classes of genomic repeats. Phylogenetic trees are then inferred based on the genome-wide abundance of different repeat types treated as continuously varying characters; such repeats are scattered across chromosomes and in angiosperms can constitute a majority of nuclear genomic DNA. In six diverse examples, five angiosperms and one insect, this method provides generally well-supported relationships at interspecific and intergeneric levels that agree with results from more standard phylogenetic analyses of commonly used markers. We propose that this methodology may prove especially useful in groups where there is little genetic differentiation in standard phylogenetic markers. At the same time as providing data for phylogenetic inference, this method additionally yields a wealth of data for comparative studies of genome evolution. PMID:25261464

  13. Reasoning over Taxonomic Change: Exploring Alignments for the Perelleschus Use Case

    PubMed Central

    Franz, Nico M.; Chen, Mingmin; Yu, Shizhuo; Kianmajd, Parisa; Bowers, Shawn; Ludäscher, Bertram

    2015-01-01

    Classifications and phylogenetic inferences of organismal groups change in light of new insights. Over time these changes can result in an imperfect tracking of taxonomic perspectives through the re-/use of Code-compliant or informal names. To mitigate these limitations, we introduce a novel approach for aligning taxonomies through the interaction of human experts and logic reasoners. We explore the performance of this approach with the Perelleschus use case of Franz & Cardona-Duque (2013). The use case includes six taxonomies published from 1936 to 2013, 54 taxonomic concepts (i.e., circumscriptions of names individuated according to their respective source publications), and 75 expert-asserted Region Connection Calculus articulations (e.g., congruence, proper inclusion, overlap, or exclusion). An Open Source reasoning toolkit is used to analyze 13 paired Perelleschus taxonomy alignments under heterogeneous constraints and interpretations. The reasoning workflow optimizes the logical consistency and expressiveness of the input and infers the set of maximally informative relations among the entailed taxonomic concepts. The latter are then used to produce merge visualizations that represent all congruent and non-congruent taxonomic elements among the aligned input trees. In this small use case with 6-53 input concepts per alignment, the information gained through the reasoning process is on average one order of magnitude greater than in the input. The approach offers scalable solutions for tracking provenance among succeeding taxonomic perspectives that may have differential biases in naming conventions, phylogenetic resolution, ingroup and outgroup sampling, or ostensive (member-referencing) versus intensional (property-referencing) concepts and articulations. PMID:25700173

  14. Measuring community similarity with phylogenetic networks.

    PubMed

    Parks, Donovan H; Beiko, Robert G

    2012-12-01

    Environmental drivers of biodiversity can be identified by relating patterns of community similarity to ecological factors. Community variation has traditionally been assessed by considering changes in species composition and more recently by incorporating phylogenetic information to account for the relative similarity of taxa. Here, we describe how an important class of measures including Bray-Curtis, Canberra, and UniFrac can be extended to allow community variation to be computed on a phylogenetic network. We focus on phylogenetic split systems, networks that are produced by the widely used median network and neighbor-net methods, which can represent incongruence in the evolutionary history of a set of taxa. Calculating β diversity over a split system provides a measure of community similarity averaged over uncertainty or conflict in the available phylogenetic signal. Our freely available software, Network Diversity, provides 11 qualitative (presence-absence, unweighted) and 14 quantitative (weighted) network-based measures of community similarity that model different aspects of community richness and evenness. We demonstrate the broad applicability of network-based diversity approaches by applying them to three distinct data sets: pneumococcal isolates from distinct geographic regions, human mitochondrial DNA data from the Indonesian island of Nias, and proteorhodopsin sequences from the Sargasso and Mediterranean Seas. Our results show that major expected patterns of variation for these data sets are recovered using network-based measures, which indicates that these patterns are robust to phylogenetic uncertainty and conflict. Nonetheless, network-based measures of community similarity can differ substantially from measures ignoring phylogenetic relationships or from tree-based measures when incongruent signals are present in the underlying data. Network-based measures provide a methodology for assessing the robustness of β-diversity results in light of

  15. In Silico Phylogenetic Analysis and Molecular Modelling Study of 2-Haloalkanoic Acid Dehalogenase Enzymes from Bacterial and Fungal Origin

    PubMed Central

    Satpathy, Raghunath; Konkimalla, V. B.; Ratha, Jagnyeswar

    2016-01-01

    2-Haloalkanoic acid dehalogenase enzymes, which are widely distributed in fungi and bacteria, have a broad range of applications, from bioremediation to the chemical synthesis of useful compounds. In the present study, a total of 81 full-length protein sequences of 2-haloalkanoic acid dehalogenase from bacteria and fungi were retrieved from the NCBI database. Sequence analyses such as multiple sequence alignment (MSA), conserved motif identification, computation of amino acid composition, and phylogenetic tree construction were performed on these primary sequences. From the MSA analysis, it was observed that the sequences share conserved lysine (K) and aspartate (D) residues. Also, the phylogenetic tree indicated a subcluster comprising both fungal and bacterial species. Because no experimental 3D structure for fungal 2-haloalkanoic acid dehalogenase is available in the PDB, a molecular modelling study was performed for both the fungal and bacterial enzymes present in the subcluster. Further structural analysis revealed a common evolutionary topology shared between the fungal and bacterial enzymes. Studies on the buried amino acids showed highly conserved Leu and Ser in the core, despite variation in their amino acid percentage. Additionally, a surface-exposed tryptophan was conserved in all of these selected models. PMID:26880911

  16. TREEFINDER: a powerful graphical analysis environment for molecular phylogenetics

    PubMed Central

    Jobb, Gangolf; von Haeseler, Arndt; Strimmer, Korbinian

    2004-01-01

    Background Most analysis programs for inferring molecular phylogenies are difficult to use, in particular for researchers with little programming experience. Results TREEFINDER is an easy-to-use integrative platform-independent analysis environment for molecular phylogenetics. In this paper the main features of TREEFINDER (version of April 2004) are described. TREEFINDER is written in ANSI C and Java and implements powerful statistical approaches for inferring gene trees and for related analyses. In addition, it provides a user-friendly graphical interface and a phylogenetic programming language. Conclusions TREEFINDER is a versatile framework for analyzing phylogenetic data across different platforms that is suited to both exploratory and advanced studies. PMID:15222900

  17. PROMALS web server for accurate multiple protein sequence alignments.

    PubMed

    Pei, Jimin; Kim, Bong-Hyun; Tang, Ming; Grishin, Nick V

    2007-07-01

    Multiple sequence alignments are essential in homology inference, structure modeling, functional prediction and phylogenetic analysis. We developed a web server that constructs multiple protein sequence alignments using PROMALS, a progressive method that improves alignment quality by using additional homologs from PSI-BLAST searches and secondary structure predictions from PSIPRED. PROMALS shows higher alignment accuracy than other advanced methods, such as MUMMALS, ProbCons, MAFFT and SPEM. The PROMALS web server takes FASTA format protein sequences as input. The output includes a colored alignment augmented with information about sequence grouping, predicted secondary structures and positional conservation. The PROMALS web server is available at: http://prodata.swmed.edu/promals/ PMID:17452345

  18. Identification of Tunisian Leishmania spp. by PCR amplification of cysteine proteinase B (cpb) genes and phylogenetic analysis.

    PubMed

    Chaouch, Melek; Fathallah-Mili, Akila; Driss, Mehdi; Lahmadi, Ramzi; Ayari, Chiraz; Guizani, Ikram; Ben Said, Moncef; Benabderrazak, Souha

    2013-03-01

    Discrimination of the Old World Leishmania parasites is important for diagnosis and epidemiological studies of leishmaniasis. We have developed PCR assays that allow discrimination between the Tunisian species Leishmania major, Leishmania tropica and Leishmania infantum. The identification was performed by a simple PCR targeting cysteine protease B (cpb) gene copies. These PCR assays can serve as routine molecular biology tools for discriminating Leishmania spp. from different geographical origins and different clinical forms. Our assays can also be an informative resource for studying the cpb gene in drug, diagnostics and vaccine research. The PCR products of the cpb gene and the N-acetylglucosamine-1-phosphate transferase (nagt) Leishmania gene were sequenced and aligned. Phylogenetic trees of Leishmania based on cpb and nagt sequences are close in topology and present the classic distribution of Leishmania in the Old World. The phylogenetic analysis enabled the characterization and identification of different strains, using both multicopy (cpb) and single-copy (nagt) genes. Indeed, the cpb phylogenetic analysis allowed us to identify the Tunisian Leishmania killicki species and a group that gathers the least evolved isolates of the Leishmania donovani complex, which originated from East Africa. This clustering confirms the African origin of the visceralizing species of the L. donovani complex. PMID:23228525

  19. IUS prerelease alignment

    NASA Technical Reports Server (NTRS)

    Evans, F. A.

    1978-01-01

    Space shuttle orbiter/IUS alignment transfer was evaluated. Although the orbiter alignment accuracy was originally believed to be the major contributor to the overall alignment transfer error, it was shown that orbiter alignment accuracy is not a factor affecting IUS alignment accuracy, if certain procedures are followed. Results are reported of alignment transfer accuracy analysis.

  20. Impacts of Terraces on Phylogenetic Inference.

    PubMed

    Sanderson, Michael J; McMahon, Michelle M; Stamatakis, Alexandros; Zwickl, Derrick J; Steel, Mike

    2015-09-01

    Terraces are sets of trees with precisely the same likelihood or parsimony score, which can be induced by missing sequences in partitioned multi-locus phylogenetic data matrices. The potentially large set of trees on a terrace can be characterized by enumeration algorithms or consensus methods that exploit the pattern of partial taxon coverage in the data, independent of the sequence data themselves. Terraces can add ambiguity and complexity to phylogenetic inference, particularly in settings where inference is already challenging: data sets with many taxa and relatively few loci. In this article we present five new findings about terraces and their impacts on phylogenetic inference. First, we clarify assumptions about partitioning scheme model parameters that are necessary for the existence of terraces. Second, we explore the dependence of terrace size on partitioning scheme and indicate how to find the partitioning scheme associated with the largest terrace containing a given tree. Third, we highlight the impact of terrace size on bootstrap estimates of confidence limits in clades, and characterize the surprising result that the bootstrap proportion for a clade, as it is usually calculated, can be entirely determined by the frequency of bipartitions on a terrace, with some bipartitions receiving high support even when incorrect. Fourth, we dissect some effects of prior distributions of edge lengths on the computed posterior probabilities of clades on terraces, to understand an example in which long edges "attract" each other in Bayesian inference. Fifth, we describe how assuming relationships between edge-lengths of different loci, as an attempt to avoid terraces, can also be problematic when taxon coverage is partial, specifically when heterotachy is present. Finally, we discuss strategies for remediation of some of these problems. One promising approach finds a minimal set of taxa which, when deleted from the data matrix, reduces the size of a terrace to a

  1. Evolutionary relationships of the Critically Endangered frog Ericabatrachus baleensis Largen, 1991 with notes on incorporating previously unsampled taxa into large-scale phylogenetic analyses

    PubMed Central

    2014-01-01

    Background The phylogenetic relationships of many taxa remain poorly known because of a lack of appropriate data and/or analyses. Despite substantial recent advances, amphibian phylogeny remains poorly resolved in many instances. The phylogenetic relationships of the Ethiopian endemic monotypic genus Ericabatrachus have been addressed thus far only with phenotypic data and remain contentious. Results We obtained fresh samples of the now rare and Critically Endangered Ericabatrachus baleensis and generated DNA sequences for two mitochondrial and four nuclear genes. Analyses of these new data using de novo and constrained-tree phylogenetic reconstructions strongly support a close relationship between Ericabatrachus and Petropedetes, and allow us to reject previously proposed alternative hypotheses of a close relationship with cacosternines or Phrynobatrachus. Conclusions We discuss the implications of our results for the taxonomy, biogeography and conservation of E. baleensis, and suggest a two-tiered approach to the inclusion and analyses of new data in order to assess the phylogenetic relationships of previously unsampled taxa. Such approaches will be important in the future given the increasing availability of relevant mega-alignments and potential framework phylogenies. PMID:24612655

  2. Phylogenetic lineages in Entomophthoromycota

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Entomophthoromycota Humber is one of five major phylogenetic lineages among the former phylum Zygomycota. These early terrestrial fungi share evolutionarily ancestral characters such as coenocytic mycelium and gametangiogamy as a sexual process resulting in zygospore formation. Previous molecular st...

  3. PSAR: measuring multiple sequence alignment reliability by probabilistic sampling

    PubMed Central

    Kim, Jaebum; Ma, Jian

    2011-01-01

    Multiple sequence alignment, which is of fundamental importance for comparative genomics, is a difficult and error-prone problem. Therefore, it is essential to measure the reliability of the alignments and incorporate it into downstream analyses. We propose a new probabilistic sampling-based alignment reliability (PSAR) score. Instead of relying on heuristic assumptions, such as the correlation between alignment quality and guide tree uncertainty in progressive alignment methods, we directly generate suboptimal alignments from an input multiple sequence alignment by a probabilistic sampling method, and compute the agreement of the input alignment with the suboptimal alignments as the alignment reliability score. We construct the suboptimal alignments by an approximate method that is based on pairwise comparisons between each single sequence and the sub-alignment of the input alignment where the chosen sequence is left out. Using simulation-based benchmarks, we find that our approach is superior to existing ones, which supports the view that the suboptimal alignments are a highly informative source for assessing alignment reliability. We apply the PSAR method to the alignments in the UCSC Genome Browser to measure the reliability of alignments in different types of regions, such as coding exons and conserved non-coding regions, and use it to guide cross-species conservation study. PMID:21576232

  4. ALVIS: interactive non-aggregative visualization and explorative analysis of multiple sequence alignments.

    PubMed

    Schwarz, Roland F; Tamuri, Asif U; Kultys, Marek; King, James; Godwin, James; Florescu, Ana M; Schultz, Jörg; Goldman, Nick

    2016-05-01

    Sequence Logos and its variants are the most commonly used method for visualization of multiple sequence alignments (MSAs) and sequence motifs. They provide consensus-based summaries of the sequences in the alignment. Consequently, individual sequences cannot be identified in the visualization and covariant sites are not easily discernible. We recently proposed Sequence Bundles, a motif visualization technique that maintains a one-to-one relationship between sequences and their graphical representation and visualizes covariant sites. We here present Alvis, an open-source platform for the joint explorative analysis of MSAs and phylogenetic trees, employing Sequence Bundles as its main visualization method. Alvis combines the power of the visualization method with an interactive toolkit allowing detection of covariant sites, annotation of trees with synapomorphies and homoplasies, and motif detection. It also offers numerical analysis functionality, such as dimension reduction and classification. Alvis is user-friendly, highly customizable and can export results in publication-quality figures. It is available as a full-featured standalone version (http://www.bitbucket.org/rfs/alvis) and its Sequence Bundles visualization module is further available as a web application (http://science-practice.com/projects/sequence-bundles). PMID:26819408

  5. ALVIS: interactive non-aggregative visualization and explorative analysis of multiple sequence alignments

    PubMed Central

    Schwarz, Roland F.; Tamuri, Asif U.; Kultys, Marek; King, James; Godwin, James; Florescu, Ana M.; Schultz, Jörg; Goldman, Nick

    2016-01-01

    Sequence Logos and its variants are the most commonly used method for visualization of multiple sequence alignments (MSAs) and sequence motifs. They provide consensus-based summaries of the sequences in the alignment. Consequently, individual sequences cannot be identified in the visualization and covariant sites are not easily discernible. We recently proposed Sequence Bundles, a motif visualization technique that maintains a one-to-one relationship between sequences and their graphical representation and visualizes covariant sites. We here present Alvis, an open-source platform for the joint explorative analysis of MSAs and phylogenetic trees, employing Sequence Bundles as its main visualization method. Alvis combines the power of the visualization method with an interactive toolkit allowing detection of covariant sites, annotation of trees with synapomorphies and homoplasies, and motif detection. It also offers numerical analysis functionality, such as dimension reduction and classification. Alvis is user-friendly, highly customizable and can export results in publication-quality figures. It is available as a full-featured standalone version (http://www.bitbucket.org/rfs/alvis) and its Sequence Bundles visualization module is further available as a web application (http://science-practice.com/projects/sequence-bundles). PMID:26819408

  6. Phylogenetic Relationships and Species Delimitation in Pinus Section Trifoliae Inferred from Plastid DNA

    PubMed Central

    Hernández-León, Sergio; Gernandt, David S.; Pérez de la Rosa, Jorge A.; Jardón-Barbolla, Lev

    2013-01-01

    Recent diversification followed by secondary contact and hybridization may explain complex patterns of intra- and interspecific morphological and genetic variation in the North American hard pines (Pinus section Trifoliae), a group of approximately 49 tree species distributed in North and Central America and the Caribbean islands. We concatenated five plastid DNA markers for an average of 3.9 individuals per putative species and assessed the suitability of the five regions as DNA bar codes for species identification, species delimitation, and phylogenetic reconstruction. The ycf1 gene accounted for the greatest proportion of the alignment (46.9%), the greatest proportion of variable sites (74.9%), and the most unique sequences (75 haplotypes). Phylogenetic analysis recovered clades corresponding to subsections Australes, Contortae, and Ponderosae. Sequences for 23 of the 49 species were monophyletic and sequences for another 9 species were paraphyletic. Morphologically similar species within subsections usually grouped together, but there were exceptions consistent with incomplete lineage sorting or introgression. Bayesian relaxed molecular clock analyses indicated that all three subsections diversified relatively recently during the Miocene. The general mixed Yule-coalescent method gave a mixed model estimate of only 22 or 23 evolutionary entities for the plastid sequences, which corresponds to less than half the 49 species recognized based on morphological species assignments. Including more unique haplotypes per species may result in higher estimates, but low mutation rates, recent diversification, and large effective population sizes may limit the effectiveness of this method to detect evolutionary entities. PMID:23936218

  7. The cation/Ca(2+) exchanger superfamily: phylogenetic analysis and structural implications.

    PubMed

    Cai, Xinjiang; Lytton, Jonathan

    2004-09-01

    Cation/Ca(2+) exchangers are an essential component of Ca(2+) signaling pathways and function to transport cytosolic Ca(2+) across membranes against its electrochemical gradient by utilizing the downhill gradients of other cation species such as H(+), Na(+), or K(+). The cation/Ca(2+) exchanger superfamily is composed of H(+)/Ca(2+) exchangers and Na(+)/Ca(2+) exchangers, which have been investigated extensively in both plant cells and animal cells. Recently, information from completely sequenced genomes of bacteria, archaea, and eukaryotes has revealed the presence of genes that encode homologues of cation/Ca(2+) exchangers in many organisms in which the role of these exchangers has not been clearly demonstrated. In this study, we report a comprehensive sequence alignment and the first phylogenetic analysis of the cation/Ca(2+) exchanger superfamily of 147 sequences. The results present a framework for structure-function relationships of cation/Ca(2+) exchangers, suggesting unique signature motifs of conserved residues that may underlie divergent functional properties. Construction of a phylogenetic tree with inclusion of cation/Ca(2+) exchangers with known functional properties defines five protein families and the evolutionary relationships between the members. Based on this analysis, the cation/Ca(2+) exchanger superfamily is classified into the YRBG, CAX, NCX, and NCKX families and a newly recognized family, designated CCX. These findings will provide guides for future studies concerning structures, functions, and evolutionary origins of the cation/Ca(2+) exchangers. PMID:15163769

  8. Phylogenetic analysis of chloroplast matK gene from Zingiberaceae for plant DNA barcoding.

    PubMed

    Selvaraj, Dhivya; Sarma, Rajeev Kumar; Sathishkumar, Ramalingam

    2008-01-01

    The chloroplast maturase K gene (matK), which is involved in Group II intron splicing, is highly conserved and is used in plant systematics. The gene is 1500 bp long and is located within the intron of trnK. In the present study, the matK gene from Zingiberaceae was analysed for variants, parsimony sites, patterns, transition/transversion rates and phylogeny. The family Zingiberaceae comprises 47 genera with medicinal value. The matK gene sequences were obtained from GenBank and used for the analysis. Sequence alignments were performed with Clustal X, transition/transversion rates were estimated with MEGA, and phylogenetic analyses were carried out with the PHYLIP package. The results indicate that the Zingiberaceae genera Aframomum, Alpinia, Globba, Curcuma and Zingiber show polyphyly. The overall variation between the species is 24% and the transition/transversion rate is 1.54. A phylogenetic tree was constructed to identify the ideal regions that could be used for defining inter- and intra-generic relationships. From this study it can be concluded that the matK gene is a good candidate for DNA barcoding of the plant family Zingiberaceae. PMID:19052662
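
    The transition/transversion rate reported in this abstract can be illustrated with a raw pairwise count over two aligned sequences. This is only a sketch: it is not the model-based estimate that a package such as MEGA reports, and the function name and the rule of skipping gaps and ambiguous bases are assumptions made for the example.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_ts_tv(seq1, seq2):
    """Count transitions and transversions between two aligned DNA sequences.

    Transitions are purine<->purine or pyrimidine<->pyrimidine changes;
    every other mismatch between unambiguous bases is a transversion.
    Columns with gaps or ambiguous characters are ignored.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    ts = tv = 0
    for x, y in zip(seq1.upper(), seq2.upper()):
        if x == y or x not in "ACGT" or y not in "ACGT":
            continue
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            ts += 1
        else:
            tv += 1
    return ts, tv


# Example: A/G and C/T are transitions, G/T is a transversion.
ts, tv = count_ts_tv("AACCGT", "GATCTT")
ratio = ts / tv if tv else float("inf")
```

    A raw ratio of this kind tends to underestimate the true rate for divergent sequences because multiple substitutions at the same site go uncounted, which is one reason model-based estimators are normally preferred.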

  9. A genome-wide SNP-based phylogenetic analysis distinguishes different biovars of Brucella suis.

    PubMed

    Sankarasubramanian, Jagadesan; Vishnu, Udayakumar S; Gunasekaran, Paramasamy; Rajendhran, Jeyaprakash

    2016-07-01

    Brucellosis is an important zoonotic disease caused by Brucella spp. Brucella suis is the etiological agent of porcine brucellosis. B. suis is the most genetically diverged species within the genus Brucella. We present the first large-scale B. suis phylogenetic analysis based on an alignment-free k-mer approach of gathering polymorphic sites from whole genome sequences. A genome-wide core-SNP-based phylogenetic tree clearly differentiated the B. suis biovars and the vaccine strain into different clades. A total of 16,756 SNPs were identified from the genome sequences of 54 B. suis strains. Also, biovar-specific SNPs were identified. The vaccine strain B. suis S2-30, which is extensively used in China, was discriminated from all biovars by its accumulation of the highest number of SNPs. We have also identified the SNPs between B. suis vaccine strain S2-30 and its closest homolog, B. suis biovar 513UK. The highest number of mutations (22) was observed in the phosphomannomutase (pmm) gene, which is essential for the synthesis of O-antigen. Also, mutations were identified in several virulence genes, including genes coding for the type IV secretion system and the effector proteins, which could be responsible for the attenuated virulence of B. suis S2-30. PMID:27085292

  10. Tree Scanning

    PubMed Central

    Templeton, Alan R.; Maxwell, Taylor; Posada, David; Stengård, Jari H.; Boerwinkle, Eric; Sing, Charles F.

    2005-01-01

    We use evolutionary trees of haplotypes to study phenotypic associations by exhaustively examining all possible biallelic partitions of the tree, a technique we call tree scanning. If the first scan detects significant associations, additional rounds of tree scanning are used to partition the tree into three or more allelic classes. Two worked examples are presented. The first is a reanalysis of associations between haplotypes at the Alcohol Dehydrogenase locus in Drosophila melanogaster that was previously analyzed using a nested clade analysis, a more complicated technique for using haplotype trees to detect phenotypic associations. Tree scanning and the nested clade analysis yield the same inferences when permutation testing is used with both approaches. The second example is an analysis of associations between variation in various lipid traits and genetic variation at the Apolipoprotein E (APOE) gene in three human populations. Tree scanning successfully identified phenotypic associations expected from previous analyses. Tree scanning for the most part detected more associations and provided a better biological interpretative framework than single SNP analyses. We also show how prior information can be incorporated into the tree scan by starting with the traditional three electrophoretic alleles at APOE. Tree scanning detected genetically determined phenotypic heterogeneity within all three electrophoretic allelic classes. Overall, tree scanning is a simple, powerful, and flexible method for using haplotype trees to detect phenotype/genotype associations at candidate loci. PMID:15371364
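
    The first round of a tree scan, as described in this abstract, enumerates the biallelic partitions of a haplotype tree. The sketch below assumes that each such partition is obtained by cutting one branch of an unrooted tree given as an edge list; the function name and the tree representation are hypothetical, and the association test applied to each partition is not shown.

```python
def biallelic_partitions(edges):
    """Yield one (side_a, side_b) pair of frozensets per branch of the tree.

    edges is a list of (node, node) pairs describing an unrooted tree whose
    nodes are haplotypes. Cutting a branch splits the nodes into two groups,
    which are then treated as two allelic classes.
    """
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    all_nodes = frozenset(adjacency)
    for u, v in edges:
        # Collect every node reachable from u without crossing the cut branch.
        seen = {u}
        stack = [u]
        while stack:
            node = stack.pop()
            for nxt in adjacency[node]:
                if node == u and nxt == v:
                    continue
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        yield frozenset(seen), all_nodes - seen


# A small star-like haplotype tree with H2 at the centre.
tree = [("H1", "H2"), ("H2", "H3"), ("H2", "H4")]
partitions = list(biallelic_partitions(tree))
```

    A tree with n nodes has n - 1 branches, so the scan examines n - 1 partitions per round, which is why permutation testing or another multiple-testing correction is needed when interpreting the results.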

  11. RFLP analysis of mtDNA from six platyrrhine genera: phylogenetic inferences.

    PubMed

    Ruiz-García, M; Alvarez, D

    2003-01-01

    This study investigates the phylogenetic relationships of 10 species of platyrrhine primates using RFLP analysis of mtDNA. Three restriction enzymes were used to determine the restriction site haplotypes for a total of 276 individuals. Phylogenetic analysis using maximum parsimony was employed to construct phylogenetic trees. We found close phylogenetic relationships between Alouatta, Lagothrix and Ateles. We also found a close relationship between Cebus and Aotus, with Saimiri clustering with the atelines. Haplotype diversity was found in four of the species studied, in Cebus albifrons, Saimiri sciureus, Lagothrix lagotricha and Ateles fusciceps. These data provide additional information concerning the phylogenetic relationships between these platyrrhine genera and species. PMID:12759493

  12. Phylogenetic Stochastic Mapping Without Matrix Exponentiation

    PubMed Central

    Irvahn, Jan; Minin, Vladimir N.

    2014-01-01

    Phylogenetic stochastic mapping is a method for reconstructing the history of trait changes on a phylogenetic tree relating the species or organisms carrying the trait. State-of-the-art methods assume that the trait evolves according to a continuous-time Markov chain (CTMC) and work well for small state spaces. The computations slow down considerably for larger state spaces (e.g., the space of codons), because current methodology relies on exponentiating CTMC infinitesimal rate matrices, an operation whose computational complexity grows as the size of the CTMC state space cubed. In this work, we introduce a new approach, based on a CTMC technique called uniformization, which does not use matrix exponentiation for phylogenetic stochastic mapping. Our method is based on a new Markov chain Monte Carlo (MCMC) algorithm that targets the distribution of trait histories conditional on the trait data observed at the tips of the tree. The computational complexity of our MCMC method grows as the size of the CTMC state space squared. Moreover, in contrast to competing matrix exponentiation methods, if the rate matrix is sparse, we can leverage this sparsity and increase the computational efficiency of our algorithm further. Using simulated data, we illustrate advantages of our MCMC algorithm and investigate how large the state space needs to be for our method to outperform matrix exponentiation approaches. We show that even on the moderately large state space of codons our MCMC method can be significantly faster than currently used matrix exponentiation methods. PMID:24918812
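
    The uniformization technique named in this abstract rests on a standard identity: with a rate mu at least as large as every exit rate and B = I + Q/mu, the transition matrix is P(t) = sum over k of Poisson(k; mu*t) * B^k. The sketch below only illustrates that identity with a truncated series. It is not the authors' MCMC sampler, and the function names, tolerance, and truncation rule are assumptions made for the example.

```python
from math import exp


def identity(n):
    """Return the n-by-n identity matrix as nested lists."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def mat_mul(a, b):
    """Multiply two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transition_probs(Q, t, tol=1e-12, max_terms=10_000):
    """Approximate P(t) for rate matrix Q by a truncated uniformization series."""
    n = len(Q)
    mu = max(-Q[i][i] for i in range(n))  # uniformization rate
    if mu == 0.0 or t == 0.0:
        return identity(n)
    # B is the transition matrix of the embedded discrete-time chain.
    B = [[(1.0 if i == j else 0.0) + Q[i][j] / mu for j in range(n)]
         for i in range(n)]
    weight = exp(-mu * t)  # Poisson probability of zero events
    power = identity(n)    # B**0
    P = [[weight * x for x in row] for row in power]
    covered = weight       # Poisson mass accumulated so far
    for k in range(1, max_terms + 1):
        if 1.0 - covered <= tol:
            break
        power = mat_mul(power, B)
        weight *= mu * t / k
        covered += weight
        for i in range(n):
            for j in range(n):
                P[i][j] += weight * power[i][j]
    return P
```

    This direct form underflows when mu*t is very large, because exp(-mu*t) rounds to zero, so a practical implementation would need a more careful evaluation of the Poisson weights. In the sampling setting described in the abstract, the same decomposition lets a history be built from Poisson-distributed virtual jumps of the discrete chain B rather than from matrix exponentials.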

  13. Distance-Based Phylogenetic Methods Around a Polytomy.

    PubMed

    Davidson, Ruth; Sullivant, Seth

    2014-01-01

    Distance-based phylogenetic algorithms attempt to solve the NP-hard least-squares phylogeny problem by mapping an arbitrary dissimilarity map representing biological data to a tree metric. The set of all dissimilarity maps is a Euclidean space properly containing the space of all tree metrics as a polyhedral fan. Outputs of distance-based tree reconstruction algorithms such as UPGMA and neighbor-joining are points in the maximal cones in the fan. Tree metrics with polytomies lie at the intersections of maximal cones. A phylogenetic algorithm divides the space of all dissimilarity maps into regions based upon which combinatorial tree is reconstructed by the algorithm. Comparison of phylogenetic methods can be done by comparing the geometry of these regions. We use polyhedral geometry to compare the local nature of the subdivisions induced by least-squares phylogeny, UPGMA, and neighbor-joining when the true tree has a single polytomy with exactly four neighbors. Our results suggest that in some circumstances, UPGMA and neighbor-joining poorly match least-squares phylogeny. PMID:26355780
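
    As a small illustration of how tree metrics and polytomies relate, the classical four-point condition can be checked on a single quartet of taxa: a dissimilarity map on four taxa is a tree metric when the two largest of the three pairwise sums are equal, and the quartet is an unresolved star (a polytomy) when all three sums are equal. The sketch below is not the polyhedral analysis of the paper; the helper names and the toy distances are assumptions made for the example.

```python
def make_map(pairs):
    """Build a symmetric nested dict d[a][b] from {(a, b): distance}."""
    d = {}
    for (a, b), value in pairs.items():
        d.setdefault(a, {})[b] = value
        d.setdefault(b, {})[a] = value
    return d

def classify_quartet(d, i, j, k, l, tol=1e-9):
    """Classify four taxa using the four-point condition."""
    sums = sorted([d[i][j] + d[k][l], d[i][k] + d[j][l], d[i][l] + d[j][k]])
    if sums[2] - sums[1] > tol:
        return "not a tree metric"   # the two largest sums differ
    if sums[1] - sums[0] <= tol:
        return "polytomy"            # all three sums equal: star tree
    return "resolved"                # smallest sum identifies the split

# Toy data (hypothetical): ((A,B),(C,D)) with all branch lengths equal to 1.
resolved = make_map({("A", "B"): 2, ("C", "D"): 2, ("A", "C"): 3,
                     ("A", "D"): 3, ("B", "C"): 3, ("B", "D"): 3})
# Star tree with four pendant branches of length 1.
star = make_map({("A", "B"): 2, ("C", "D"): 2, ("A", "C"): 2,
                 ("A", "D"): 2, ("B", "C"): 2, ("B", "D"): 2})
# A dissimilarity map that violates the four-point condition.
noisy = make_map({("A", "B"): 1, ("C", "D"): 1, ("A", "C"): 2,
                  ("A", "D"): 3, ("B", "C"): 3, ("B", "D"): 2})
```

    Real data rarely satisfy the condition exactly, which is why reconstruction algorithms such as UPGMA and neighbor-joining must map arbitrary dissimilarity maps to some tree.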

  14. Complete mitochondrial genome of Cervus elaphus songaricus (Cetartiodactyla: Cervinae) and a phylogenetic analysis with related species.

    PubMed

    Li, Yiqing; Ba, Hengxing; Yang, Fuhe

    2016-01-01

    The complete mitochondrial genome of Tianshan wapiti, Cervus elaphus songaricus, is 16,419 bp in length and contains 13 protein-coding genes, 2 rRNA genes, 22 tRNA genes and 1 control region. The phylogenetic trees were reconstructed with the concatenated nucleotide sequences of the 13 protein-coding genes using maximum parsimony (MP) and Bayesian inference (BI) methods. The MP and BI phylogenetic trees showed an identical topology. The monophyly of red deer, wapiti and sika deer was well supported, and wapiti was found to share a closer relationship with sika deer. Tianshan wapiti shared a closer relationship with xanthopygus than with yarkandensis. Rusa unicolor and Rucervus eldi were given a basal phylogenetic position. Our phylogenetic analysis provided a robust phylogenetic resolution spanning the entire evolutionary relationship of the subfamily Cervinae. PMID:24725059

  15. Spaces of phylogenetic networks from generalized nearest-neighbor interchange operations.

    PubMed

    Huber, Katharina T; Linz, Simone; Moulton, Vincent; Wu, Taoyang

    2016-02-01

    Phylogenetic networks are a generalization of evolutionary or phylogenetic trees that are used to represent the evolution of species which have undergone reticulate evolution. In this paper we consider spaces of such networks defined by some novel local operations that we introduce for converting one phylogenetic network into another. These operations are modeled on the well-studied nearest-neighbor interchange operations on phylogenetic trees, and lead to natural generalizations of the tree spaces that have been previously associated to such operations. We present several results on spaces of some relatively simple networks, called level-1 networks, including the size of the neighborhood of a fixed network, and bounds on the diameter of the metric defined by taking the smallest number of operations required to convert one network into another. We expect that our results will be useful in the development of methods for systematically searching for optimal phylogenetic networks using, for example, likelihood and Bayesian approaches. PMID:26037483

  16. The molecular symplesiomorphies shared by the stem groups of metazoan evolution: can sites as few as 1% have a significant impact on recognizing the phylogenetic position of myzostomida?

    PubMed

    Wang, Yanhui; Xie, Qiang

    2014-08-01

    Although it is clear that taxon sampling, alignments, gene sampling, tree reconstruction methods and the total length of the sequences used are critical to the reconstruction of evolutionary history, weakly supported or misleading nodes exist in phylogenetic studies with no obvious flaw in those aspects. The phylogenetic studies focusing on the basal part of bilaterian evolution are such a case. During the past decade, Myzostomida has appeared in the basal part of Bilateria in several phylogenetic studies of Metazoa. However, most researchers have entertained only two competing hypotheses about the position of Myzostomida: an affinity with Annelida and an affinity with Platyhelminthes. In this study, dozens of symplesiomorphies were discovered by means of ancestral state reconstruction in the complete 18S and 28S rDNAs shared by the stem groups of Metazoa. By contrastive analysis on the datasets with or without such symplesiomorphic sites, we discovered that Myzostomida and other basal groups are basal lineages of Bilateria due to the corresponding symplesiomorphies shared with earlier lineages. As such, symplesiomorphies that account for approximately 1-2% of the whole dataset have an essential impact on phylogenetic inference, and this study reminds molecular systematists of the importance of carrying out ancestral state reconstruction at each site in sequence-based phylogenetic studies. In addition, reasons should be explored for the low support of the hypothesis that Myzostomida belongs to Annelida in the results of phylogenomic studies. Future phylogenetic studies concerning Myzostomida should include all of the basal lineages of Bilateria to avoid directly neglecting the stand-alone basal position of Myzostomida as a potential hypothesis. PMID:25128981

  17. Tree Lifecycle.

    ERIC Educational Resources Information Center

    Nature Study, 1998

    1998-01-01

    Presents a Project Learning Tree (PLT) activity that has students investigate and compare the lifecycle of a tree to other living things and the tree's role in the ecosystem. Includes background material as well as step-by-step instructions, variation and enrichment ideas, assessment opportunities, and student worksheets. (SJR)

  18. Evolution of glutamate dehydrogenase genes: evidence for two paralogous protein families and unusual branching patterns of the archaebacteria in the universal tree of life.

    PubMed

    Benachenhou-Lahfa, N; Forterre, P; Labedan, B

    1993-04-01

    The existence of two families of genes coding for hexameric glutamate dehydrogenases has been deduced from the alignment of 21 primary sequences and the determination of the percentages of similarity between each pair of proteins. Each family could also be characterized by specific motifs. One family (Family I) was composed of gdh genes from six eubacteria and six lower eukaryotes (the primitive protozoan Giardia lamblia, the green alga Chlorella sorokiniana, and several fungi and yeasts). The other one (Family II) was composed of gdh genes from two eubacteria, two archaebacteria, and five higher eukaryotes (vertebrates). Reconstruction of phylogenetic trees using several parsimony and distance methods confirmed the existence of these two families. Therefore, these results reinforced our previously proposed hypothesis that two close but already different gdh genes were present in the last common ancestor to the three Ur-kingdoms (eubacteria, archaebacteria, and eukaryotes). The branching order of the different species of Family I was found to be the same whatever the method of tree reconstruction, although it varied slightly according to the region analyzed. Similarly, the topological positions of eubacteria and eukaryotes of Family II were independent of the method used. However, the branching of the two archaebacteria in Family II appeared to be unexpected: (1) the thermoacidophilic Sulfolobus solfataricus was found clustered with the two eubacteria of this family both in parsimony and distance trees, a situation not predicted by either one of the contradictory trees recently proposed; and (2) the branching of the halophilic Halobacterium salinarium varied according to the method of tree construction: it was closer to the eubacteria in the maximum parsimony tree and to eukaryotes in distance trees. Therefore, whatever the actual position of the halophilic species, archaebacteria did not appear to be monophyletic in these gdh gene trees. This result questions the

  19. A Distance Measure for Genome Phylogenetic Analysis

    NASA Astrophysics Data System (ADS)

    Cao, Minh Duc; Allison, Lloyd; Dix, Trevor

    Phylogenetic analyses of species based on single genes or parts of the genomes are often inconsistent because of factors such as variable rates of evolution and horizontal gene transfer. The availability of more and more sequenced genomes allows phylogeny construction from complete genomes that is less sensitive to such inconsistency. For such long sequences, construction methods like maximum parsimony and maximum likelihood are often not possible due to their intensive computational requirements. Another class of tree construction methods, namely distance-based methods, requires a measure of distance between any two genomes. Some measures, such as evolutionary edit distance of gene order and gene content, are computationally expensive or do not perform well when the gene content of the organisms is similar. This study presents an information theoretic measure of genetic distances between genomes based on the biological compression algorithm expert model. We demonstrate that our distance measure can be applied to reconstruct the consensus phylogenetic tree of a number of Plasmodium parasites from their genomes, the statistical bias of which would mislead conventional analysis methods. Our approach is also used to successfully construct a plausible evolutionary tree for the γ-Proteobacteria group whose genomes are known to contain many horizontally transferred genes.
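
    The general idea of a compression-based distance can be sketched with a generic compressor. The example below uses the normalized compression distance (NCD) with zlib as a stand-in; it is not the expert model compressor used in the study, and the helper names and random toy sequences are assumptions made for the example.

```python
import random
import zlib

def compressed_size(seq):
    """Size in bytes of the zlib-compressed sequence (stand-in compressor)."""
    return len(zlib.compress(seq.encode("ascii"), 9))

def ncd(x, y):
    """Normalized compression distance between two sequences."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)

def random_dna(length, seed):
    """Hypothetical toy data: a reproducible random nucleotide string."""
    rng = random.Random(seed)
    return "".join(rng.choice("ACGT") for _ in range(length))

seq_a = random_dna(2000, seed=1)
seq_b = random_dna(2000, seed=2)
d_same = ncd(seq_a, seq_a)   # sequence compared with itself
d_diff = ncd(seq_a, seq_b)   # two unrelated random sequences
```

    A matrix of such pairwise distances could then be passed to any distance-based tree construction method. A general-purpose compressor like zlib has a limited window and no model of DNA, so it is a much weaker measure than a dedicated biological compressor.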

  20. Phycas: software for Bayesian phylogenetic analysis.

    PubMed

    Lewis, Paul O; Holder, Mark T; Swofford, David L

    2015-05-01

    Phycas is open source, freely available Bayesian phylogenetics software written primarily in C++ but with a Python interface. Phycas specializes in Bayesian model selection for nucleotide sequence data, particularly the estimation of marginal likelihoods, central to computing Bayes Factors. Marginal likelihoods can be estimated using newer methods (Thermodynamic Integration and Generalized Steppingstone) that are more accurate than the widely used Harmonic Mean estimator. In addition, Phycas supports two posterior predictive approaches to model selection: Gelfand-Ghosh and Conditional Predictive Ordinates. The General Time Reversible family of substitution models, as well as a codon model, are available, and data can be partitioned with all parameters unlinked except tree topology and edge lengths. Phycas provides for analyses in which the prior on tree topologies allows polytomous trees as well as fully resolved trees, and provides for several choices for edge length priors, including a hierarchical model as well as the recently described compound Dirichlet prior, which helps avoid overly informative induced priors on tree length. PMID:25577605

  1. On the analysis of phylogenetically paired designs

    PubMed Central

    Funk, Jennifer L; Rakovski, Cyril S; Macpherson, J Michael

    2015-01-01

    As phylogenetically controlled experimental designs become increasingly common in ecology, the need arises for a standardized statistical treatment of these datasets. Phylogenetically paired designs circumvent the need for resolved phylogenies and have been used to compare species groups, particularly in the areas of invasion biology and adaptation. Despite the widespread use of this approach, the statistical analysis of paired designs has not been critically evaluated. We propose a mixed model approach that includes random effects for pair and species. These random effects introduce a “two-layer” compound symmetry variance structure that captures both the correlations between observations on related species within a pair as well as the correlations between the repeated measurements within species. We conducted a simulation study to assess the effect of model misspecification on Type I and II error rates. We also provide an illustrative example with data containing taxonomically similar species and several outcome variables of interest. We found that a mixed model with species and pair as random effects performed better in these phylogenetically explicit simulations than two commonly used reference models (no or single random effect) by optimizing Type I error rates and power. The proposed mixed model produces acceptable Type I and II error rates despite the absence of a phylogenetic tree. This design can be generalized to a variety of datasets to analyze repeated measurements in clusters of related subjects/species. PMID:25750719

  2. Molecular and phylogenetic analysis of pyridoxal phosphate-dependent acyltransferase of Exiguobacterium acetylicum.

    PubMed

    Rajendran, Narayanan; Smith, Colby; Mazhawidza, Williard

    2009-01-01

    The pyridoxal-5'-phosphate (PLP)-dependent family of enzymes is a very diverse group of proteins that metabolize small molecules like amino acids and sugars, and synthesize cofactors for other metabolic pathways through transamination, decarboxylation, racemization, and substitution reactions. In this study we employed degenerate primer-based PCR amplification, using genomic DNA isolated from the soil bacterium Exiguobacterium acetylicum strain SN as template. We revealed the presence of a PLP-dependent family of enzymes, such as PLP-dependent acyltransferase, and similarity to 8-amino-7-oxononanoate synthase. Sequencing analysis and multiple alignment of the thymidine-adenine-cloned PCR amplicon revealed PLP-dependent family enzymes with specific confering codes and consensus amino acid residues specific to this group of functional proteins. Amino acid residues common to the majority of PLP-dependent enzymes were also revealed by the Lasergene MegAlign software. A phylogenetic tree was constructed. Its analysis revealed a close relationship of E. acetylicum to other bacteria isolated from extreme environments, suggesting similarities in anabolic adaptability and evolutionary development. PMID:20158163

  3. Phylogenetic diversity (PD) and biodiversity conservation: some bioinformatics challenges

    PubMed Central

    Faith, Daniel P.; Baker, Andrew M.

    2007-01-01

    Biodiversity conservation addresses information challenges through estimations encapsulated in measures of diversity. A quantitative measure of phylogenetic diversity, “PD”, has been defined as the minimum total length of all the phylogenetic branches required to span a given set of taxa on the phylogenetic tree (Faith 1992a). While a recent paper incorrectly characterizes PD as not including information about deeper phylogenetic branches, PD applications over the past decade document the proper incorporation of shared deep branches when assessing the total PD of a set of taxa. Current PD applications to macroinvertebrate taxa in streams of New South Wales, Australia illustrate the practical importance of this definition. Phylogenetic lineages, often corresponding to new, “cryptic”, taxa, are restricted to a small number of stream localities. A recent case of human impact causing loss of taxa in one locality implies a higher PD value for another locality, because it now uniquely represents a deeper branch. This molecular-based phylogenetic pattern supports the use of DNA barcoding programs for biodiversity conservation planning. Here, PD assessments side-step the contentious use of barcoding-based “species” designations. Bio-informatics challenges include combining different phylogenetic evidence, optimization problems for conservation planning, and effective integration of phylogenetic information with environmental and socio-economic data. PMID:19455206
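
    The PD definition quoted in this abstract (the minimum total branch length needed to span a set of taxa) can be sketched on a small rooted tree. The tree encoding, the function name, and the toy branch lengths are assumptions made for the example; some PD applications additionally count the path to the root, which this sketch does not.

```python
def phylogenetic_diversity(parent, taxa):
    """Total length of the minimal subtree spanning `taxa`.

    `parent` maps each non-root node to (parent_node, branch_length).
    """
    taxa = set(taxa)
    below = {}  # node -> number of selected taxa at or below that node
    for taxon in taxa:
        node = taxon
        while node in parent:  # walk up until the root is reached
            below[node] = below.get(node, 0) + 1
            node = parent[node][0]
    # The branch above a node is needed only if it separates selected taxa,
    # i.e. some, but not all, of them lie below it.
    return sum(parent[node][1] for node, count in below.items()
               if count < len(taxa))

# Hypothetical tree: ((A:1, B:2)X:3, C:4)R
toy_tree = {"A": ("X", 1.0), "B": ("X", 2.0), "X": ("R", 3.0), "C": ("R", 4.0)}
pd_ab = phylogenetic_diversity(toy_tree, ["A", "B"])        # 1 + 2
pd_ac = phylogenetic_diversity(toy_tree, ["A", "C"])        # 1 + 3 + 4
pd_abc = phylogenetic_diversity(toy_tree, ["A", "B", "C"])  # all branches
```

    In this toy tree, the deep branch of length 3 contributes to the PD of {A, C} but not to the PD of {A, B}, which illustrates how shared deep branches enter the total.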

  4. 11. GAS STATION AND OLD ROAD ALIGNMENT, FACING S. VISITOR ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    11. GAS STATION AND OLD ROAD ALIGNMENT, FACING S. VISITOR CENTER BEHIND TREES. SAME CAMERA POSITION AS AZ-45-10. - South Entrance Road, Between South park boundary & Village Loop Road, Grand Canyon, Coconino County, AZ

  5. Molecular phylogenetics of mastodon and Tyrannosaurus rex.

    PubMed

    Organ, Chris L; Schweitzer, Mary H; Zheng, Wenxia; Freimark, Lisa M; Cantley, Lewis C; Asara, John M

    2008-04-25

    We report a molecular phylogeny for a nonavian dinosaur, extending our knowledge of trait evolution within nonavian dinosaurs into the macromolecular level of biological organization. Fragments of collagen alpha1(I) and alpha2(I) proteins extracted from fossil bones of Tyrannosaurus rex and Mammut americanum (mastodon) were analyzed with a variety of phylogenetic methods. Despite missing sequence data, the mastodon groups with elephant and the T. rex groups with birds, consistent with predictions based on genetic and morphological data for mastodon and on morphological data for T. rex. Our findings suggest that molecular data from long-extinct organisms may have the potential for resolving relationships at critical areas in the vertebrate evolutionary tree that have, so far, been phylogenetically intractable. PMID:18436782

  6. Determining phylogenetic networks from inter-taxa distances.

    PubMed

    Bordewich, Magnus; Semple, Charles

    2016-08-01

    We consider the problem of determining the topological structure of a phylogenetic network given only information about the path-length distances between taxa. In particular, one of the main results of the paper shows that binary tree-child networks are essentially determined by such information. PMID:26666756

  7. Cloning, in Vitro expression, and novel phylogenetic classification of a channel catfish estrogen receptor

    USGS Publications Warehouse

    Xia, Z.; Patino, R.; Gale, W.L.; Maule, A.G.; Densmore, L.D.

    1999-01-01

    We obtained two channel catfish estrogen receptor (ccER) cDNA from liver of female fish using RT–PCR. The two fragments were identical in sequence except that the smaller one had an out-of-frame deletion in the E domain, suggesting the existence of ccER splice variants. The larger fragment was used to screen a cDNA library from liver of a prepubescent female. A cDNA was obtained that encoded a 581-amino-acid ER with a deduced molecular weight of 63.8 kDa. Extracts of COS-7 cells transfected with ccER cDNA bound estrogen with high affinity (Kd = 4.7 nM) and specificity. Maximum parsimony and Neighbor Joining analyses were used to generate a phylogenetic classification of ccER on the basis of 18 full-length ER sequences. The tree suggested the existence of two major ER branches. One branch contained two clearly divergent clades which included all piscine ER (except Japanese eel ER) and all tetrapod ERα, respectively. The second major branch contained the eel ER and the mammalian ERβ. The high degree of divergence between the eel ER and mammalian ERβ suggested that they also represent distinct piscine and tetrapod ER. These data suggest that ERα and ERβ are present throughout vertebrates and that these two major ER types evolved by duplication of an ancestral ER gene. Sequence alignments with other members of the nuclear hormone receptor superfamily indicated the presence of 8 amino acids in the E domain that align exclusively among ER. Four of these amino acids have not received prior research attention and their function is unknown. The novel finding of putative ER splice variants in a nonmammalian vertebrate and the novel phylogenetic classification of ER offer new perspectives in understanding the diversification and function of ER.

  8. Phylogenetic analysis of the triterpene cyclase protein family in prokaryotes and eukaryotes suggests bidirectional lateral gene transfer.

    PubMed

    Frickey, Tancred; Kannenberg, Elmar

    2009-05-01

    Functional constraints to modifications in triterpene cyclase amino acid sequences make them good candidates for evolutionary studies on the phylogenetic relatedness of these enzymes in prokaryotes as well as in eukaryotes. In this study, we used a set of identified triterpene cyclases, a group of mainly bacterial squalene cyclases and a group of predominantly eukaryotic oxidosqualene cyclases, as seed sequences to identify 5288 putative triterpene cyclase homologues in publicly available databases. The Cluster Analysis of Sequences software was used to detect groups of sequences with increased pairwise sequence similarity. The sequences fall into two main clusters, one bacterial and one eukaryotic. The conserved, informative regions of a multiple sequence alignment of the family were used to construct a neighbour-joining phylogenetic tree with the AsaturA software and a maximum likelihood phylogenetic tree with the PhyML software. Both analyses showed that most of the triterpene cyclase sequences were grouped in agreement with the accepted taxonomic relationships of the organisms the sequences originated from, supporting the idea of vertical transfer of cyclase genes from parent to offspring as the main evolutionary driving force in this protein family. However, a small group of sequences from three bacterial species (Stigmatella, Gemmata and Methylococcus) grouped with an otherwise purely eukaryotic cluster of oxidosqualene cyclases, while a small group of sequences from seven fungal species and a sequence from the fern Adiantum grouped consistently with a cluster of otherwise purely bacterial squalene cyclases. This suggests that lateral gene transfer may have taken place, entailing a transfer of oxidosqualene cyclases from eukaryotes to bacteria and a transfer of squalene cyclase from bacteria to an ancestor of the group of Pezizomycotina fungi. PMID:19207562

  9. A taxonomic and phylogenetic re-appraisal of the genus Curvularia

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Species of Curvularia are important plant and human pathogens worldwide. In this study, the genus Curvularia is re-assessed based on molecular phylogenetic analysis and morphological observations of available isolates and specimens. A multi-gene phylogenetic tree inferred from ITS, TEF and GPDH gene...

  10. Phylogenetic analysis of the spirochetes.

    PubMed Central

    Paster, B J; Dewhirst, F E; Weisburg, W G; Tordoff, L A; Fraser, G J; Hespell, R B; Stanton, T B; Zablen, L; Mandelco, L; Woese, C R

    1991-01-01

    The 16S rRNA sequences were determined for species of Spirochaeta, Treponema, Borrelia, Leptospira, Leptonema, and Serpula, using a modified Sanger method of direct RNA sequencing. Analysis of aligned 16S rRNA sequences indicated that the spirochetes form a coherent taxon composed of six major clusters or groups. The first group, termed the treponemes, was divided into two subgroups. The first treponeme subgroup consisted of Treponema pallidum, Treponema phagedenis, Treponema denticola, a thermophilic spirochete strain, and two species of Spirochaeta, Spirochaeta zuelzerae and Spirochaeta stenostrepta, with an average interspecies similarity of 89.9%. The second treponeme subgroup contained Treponema bryantii, Treponema pectinovorum, Treponema saccharophilum, Treponema succinifaciens, and rumen strain CA, with an average interspecies similarity of 86.2%. The average interspecies similarity between the two treponeme subgroups was 84.2%. The division of the treponemes into two subgroups was verified by single-base signature analysis. The second spirochete group contained Spirochaeta aurantia, Spirochaeta halophila, Spirochaeta bajacaliforniensis, Spirochaeta litoralis, and Spirochaeta isovalerica, with an average similarity of 87.4%. The Spirochaeta group was related to the treponeme group, with an average similarity of 81.9%. The third spirochete group contained borrelias, including Borrelia burgdorferi, Borrelia anserina, Borrelia hermsii, and a rabbit tick strain. The borrelias formed a tight phylogenetic cluster, with average similarity of 97%. The borrelia group shared a common branch with the Spirochaeta group and was closer to this group than to the treponemes. A single spirochete strain isolated from the shrew constituted the fourth group. The fifth group was composed of strains of Serpula (Treponema) hyodysenteriae and Serpula (Treponema) innocens. The two species of this group were closely related, with a similarity of greater than 99%. Leptonema illini

  11. Phylogenetic and evolutionary analysis of Chinese Leishmania isolates based on multilocus sequence typing.

    PubMed

    Zhang, Chun-Ying; Lu, Xiao-Jun; Du, Xiao-Qing; Jian, Jun; Shu, Ling; Ma, Ying

    2013-01-01

    Leishmaniasis is a debilitating infectious disease that has a variety of clinical forms. In China, visceral leishmaniasis (VL) is the most common symptom, and L. donovani and/or L. infantum are the likely pathogens. In this study, multilocus sequence typing (MLST) of five enzyme-coding genes (fh, g6pdh, icd, mpi, pgd) and two conserved genes (hsp70, lack) was used to investigate the phylogenetic relationships of Chinese Leishmania strains. Concatenated alignment of the nucleotide sequences of the seven genes was analyzed and phylogenetic trees were constructed using neighbor-joining and maximum parsimony models. A set of additional sequences from 25 strains (24 strains belong to the L. donovani complex and one strain belongs to L. gerbilli) were retrieved from GenBank to infer the molecular evolutionary history of Leishmania from China and other endemic areas worldwide. Phylogenetic analyses consolidated Chinese Leishmania into four groups: (i) one clade A population comprised 13 isolates from different foci in China, which were pathogenic to humans and canines. This population was subdivided into two subclades, clade A1 and clade A2, which comprised sister organisms to the remaining members of the worldwide L. donovani complex; (ii) a population in clade B consisted of one reference strain of L. turanica and five Chinese strains from Xinjiang; (iii) clade C (SELF-7 and EJNI-154) formed a population that was closely related to clade B, and both isolates were identified as L. gerbilli; and (iv) the final group, clade D, included Sauroleishmania (LIZRD and KXG-E) and was distinct from the other strains. We hypothesize that the phylogeny of Chinese Leishmania is associated with the geographical origins rather than with the clinical forms (VL or CL) of leishmaniasis. To conclude, this study provides further molecular information on Chinese Leishmania isolates and the Chinese isolates appear to have a more complex evolutionary history than previously thought. PMID

  12. Phylogenetic analysis of the Australian rosella parrots (Platycercus) reveals discordance among molecules and plumage.

    PubMed

    Shipham, Ashlee; Schmidt, Daniel J; Joseph, Leo; Hughes, Jane M

    2015-10-01

    Relationships and species limits among the colourful Australian parrots known as rosellas (Platycercus) are contentious because of poorly understood patterns of parapatry, sympatry and hybridization as well as complex patterns of geographical replacement of phenotypic forms. Two subgenera are, however, conventionally recognised: Platycercus comprises the blue-cheeked crimson rosella complex (Crimson Rosella P. elegans and Green Rosella P. caledonicus), and Violania contains the remaining four currently recognised species (Pale-headed Rosella P. adscitus, Eastern Rosella P. eximius, Northern Rosella P. venustus, and Western Rosella P. icterotis). We used phylogenetic analysis of ten loci (one mitochondrial, eight autosomal and one z-linked) and several individuals per nominal species primarily to examine relationships within the subgenera, especially the relationships and species limits within Violania. Of these, P. adscitus and P. eximius have long been considered sister species or conspecific due to a morphology-based hybrid zone and an early phylogenetic analysis of mitochondrial DNA restriction fragment length polymorphisms. The multilocus phylogenetic analysis presented here supports an alternative hypothesis aligning P. adscitus and P. venustus as sister species. Using divergence rates published in other avian studies, we estimated the divergence between P. venustus and P. adscitus at 0.0148-0.6124 MYA and that between the P. adscitus/P. venustus ancestor and P. eximius earlier at 0.1617-1.0816 MYA, both within the Pleistocene. Discordant topologies among gene and species trees are discussed and proposed to be the result of historical gene flow and/or incomplete lineage sorting (ILS). In particular, we suggest that discordance between mitochondrial and nuclear data may be the result of asymmetrical mitochondrial introgression from P. adscitus into P. eximius. The biogeographical implications of our findings are discussed relative to similarly distributed groups

  13. Does Gene Tree Discordance Explain the Mismatch between Macroevolutionary Models and Empirical Patterns of Tree Shape and Branching Times?

    PubMed Central

    Stadler, Tanja; Degnan, James H.; Rosenberg, Noah A.

    2016-01-01

    Classic null models for speciation and extinction give rise to phylogenies that differ in distribution from empirical phylogenies. In particular, empirical phylogenies are less balanced and have branching times closer to the root compared to phylogenies predicted by common null models. This difference might be due to null models of the speciation and extinction process being too simplistic, or due to the empirical datasets not being representative of random phylogenies. A third possibility arises because phylogenetic reconstruction methods often infer gene trees rather than species trees, producing an incongruity between models that predict species tree patterns and empirical analyses that consider gene trees. We investigate the extent to which the difference between gene trees and species trees under a combined birth–death and multispecies coalescent model can explain the difference in empirical trees and birth–death species trees. We simulate gene trees embedded in simulated species trees and investigate their difference with respect to tree balance and branching times. We observe that the gene trees are less balanced and typically have branching times closer to the root than the species trees. Empirical trees from TreeBase are also less balanced than our simulated species trees, and model gene trees can explain an imbalance increase of up to 8% compared to species trees. However, we see a much larger imbalance increase in empirical trees, about 100%, meaning that additional features must also be causing imbalance in empirical trees. This simulation study highlights the necessity of revisiting the assumptions made in phylogenetic analyses, as these assumptions, such as equating the gene tree with the species tree, might lead to a biased conclusion. PMID:26968785
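
    Tree balance of the kind compared in this abstract can be quantified with a simple statistic. The sketch below computes the Colless index (the sum, over internal nodes, of the absolute difference in leaf counts between the two child subtrees) for binary trees written as nested tuples. The abstract does not state which balance statistic the authors used, so the choice of the Colless index, the tree encoding, and the toy trees are assumptions.

```python
def colless_index(tree):
    """Colless imbalance of a binary tree given as nested 2-tuples of leaf names."""
    def walk(node):
        # Returns (number of leaves below node, accumulated imbalance).
        if not isinstance(node, tuple):
            return 1, 0
        left, right = node
        n_left, imb_left = walk(left)
        n_right, imb_right = walk(right)
        return n_left + n_right, imb_left + imb_right + abs(n_left - n_right)
    return walk(tree)[1]

# Hypothetical four-taxon trees.
balanced = (("A", "B"), ("C", "D"))      # fully symmetric
caterpillar = ((("A", "B"), "C"), "D")   # maximally unbalanced
```

    A fully symmetric tree scores 0, and the score grows as the tree becomes more ladder-like, so larger values correspond to the "less balanced" trees described above.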

  14. Does Gene Tree Discordance Explain the Mismatch between Macroevolutionary Models and Empirical Patterns of Tree Shape and Branching Times?

    PubMed

    Stadler, Tanja; Degnan, James H; Rosenberg, Noah A

    2016-07-01

    Classic null models for speciation and extinction give rise to phylogenies that differ in distribution from empirical phylogenies. In particular, empirical phylogenies are less balanced and have branching times closer to the root compared to phylogenies predicted by common null models. This difference might be due to null models of the speciation and extinction process being too simplistic, or due to the empirical datasets not being representative of random phylogenies. A third possibility arises because phylogenetic reconstruction methods often infer gene trees rather than species trees, producing an incongruity between models that predict species tree patterns and empirical analyses that consider gene trees. We investigate the extent to which the difference between gene trees and species trees under a combined birth-death and multispecies coalescent model can explain the difference in empirical trees and birth-death species trees. We simulate gene trees embedded in simulated species trees and investigate their difference with respect to tree balance and branching times. We observe that the gene trees are less balanced and typically have branching times closer to the root than the species trees. Empirical trees from TreeBase are also less balanced than our simulated species trees, and model gene trees can explain an imbalance increase of up to 8% compared to species trees. However, we see a much larger imbalance increase in empirical trees, about 100%, meaning that additional features must also be causing imbalance in empirical trees. This simulation study highlights the necessity of revisiting the assumptions made in phylogenetic analyses, as these assumptions, such as equating the gene tree with the species tree, might lead to a biased conclusion. PMID:26968785

  15. SimPhy: Phylogenomic Simulation of Gene, Locus, and Species Trees

    PubMed Central

    Mallo, Diego; De Oliveira Martins, Leonardo; Posada, David

    2016-01-01

    We present a fast and flexible software package—SimPhy—for the simulation of multiple gene families evolving under incomplete lineage sorting, gene duplication and loss, horizontal gene transfer—all three potentially leading to species tree/gene tree discordance—and gene conversion. SimPhy implements a hierarchical phylogenetic model in which the evolution of species, locus, and gene trees is governed by global and local parameters (e.g., genome-wide, species-specific, locus-specific), that can be fixed or be sampled from a priori statistical distributions. SimPhy also incorporates comprehensive models of substitution rate variation among lineages (uncorrelated relaxed clocks) and the capability of simulating partitioned nucleotide, codon, and protein multilocus sequence alignments under a plethora of substitution models using the program INDELible. We validate SimPhy's output using theoretical expectations and other programs, and show that it scales extremely well with complex models and/or large trees, being an order of magnitude faster than the most similar program (DLCoal-Sim). In addition, we demonstrate how SimPhy can be useful to understand interactions among different evolutionary processes, conducting a simulation study to characterize the systematic overestimation of the duplication time when using standard reconciliation methods. SimPhy is available at https://github.com/adamallo/SimPhy, where users can find the source code, precompiled executables, a detailed manual and example cases. PMID:26526427

  16. SimPhy: Phylogenomic Simulation of Gene, Locus, and Species Trees.

    PubMed

    Mallo, Diego; De Oliveira Martins, Leonardo; Posada, David

    2016-03-01

    We present a fast and flexible software package--SimPhy--for the simulation of multiple gene families evolving under incomplete lineage sorting, gene duplication and loss, horizontal gene transfer--all three potentially leading to species tree/gene tree discordance--and gene conversion. SimPhy implements a hierarchical phylogenetic model in which the evolution of species, locus, and gene trees is governed by global and local parameters (e.g., genome-wide, species-specific, locus-specific), that can be fixed or be sampled from a priori statistical distributions. SimPhy also incorporates comprehensive models of substitution rate variation among lineages (uncorrelated relaxed clocks) and the capability of simulating partitioned nucleotide, codon, and protein multilocus sequence alignments under a plethora of substitution models using the program INDELible. We validate SimPhy's output using theoretical expectations and other programs, and show that it scales extremely well with complex models and/or large trees, being an order of magnitude faster than the most similar program (DLCoal-Sim). In addition, we demonstrate how SimPhy can be useful to understand interactions among different evolutionary processes, conducting a simulation study to characterize the systematic overestimation of the duplication time when using standard reconciliation methods. SimPhy is available at https://github.com/adamallo/SimPhy, where users can find the source code, precompiled executables, a detailed manual and example cases. PMID:26526427

  17. DNA Align Editor: DNA Alignment Editor Tool

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The SNPAlignEditor is a DNA sequence alignment editor that runs on Windows platforms. The purpose of the program is to provide an intuitive, user-friendly tool for manual editing of multiple sequence alignments by providing functions for input, editing, and output of nucleotide sequence alignments....

  18. CSA: An efficient algorithm to improve circular DNA multiple alignment

    PubMed Central

    Fernandes, Francisco; Pereira, Luísa; Freitas, Ana T

    2009-01-01

    Background: The comparison of homologous sequences from different species is an essential approach to reconstruct the evolutionary history of species and of the genes they harbour in their genomes. Several complete mitochondrial and nuclear genomes are now available, increasing the importance of using multiple sequence alignment algorithms in comparative genomics. MtDNA has long been used in phylogenetic analysis and errors in the alignments can lead to errors in the interpretation of evolutionary information. Although a large number of multiple sequence alignment algorithms have been proposed to date, they all deal with linear DNA and cannot directly handle circular DNA. Researchers interested in aligning circular DNA sequences must first rotate them to the "right" place using an essentially manual process, before they can use multiple sequence alignment tools. Results: In this paper we propose an efficient algorithm that identifies the most interesting region to cut circular genomes in order to improve phylogenetic analysis when using standard multiple sequence alignment algorithms. This algorithm identifies the largest chain of non-repeated longest subsequences common to a set of circular mitochondrial DNA sequences. All the sequences are then rotated and made linear for multiple alignment purposes. To evaluate the effectiveness of this new tool, three different sets of mitochondrial DNA sequences were considered. Other tests considering randomly rotated sequences were also performed. The software package Arlequin was used to evaluate the standard genetic measures of the alignments obtained with and without the use of the CSA algorithm with two well known multiple alignment algorithms, the CLUSTALW and the MAVID tools, and also the visualization tool SinicView. Conclusion: The results show that a circularization and rotation pre-processing step significantly improves the efficiency of publicly available multiple sequence alignment algorithms when used in the
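    To illustrate why rotation matters for circular sequences, here is a minimal sketch of the rotate-then-linearize step only. It is not the CSA algorithm: CSA finds the cut point automatically from shared subsequences, whereas this sketch assumes a shared anchor substring is already known. The function name `rotate_to_anchor` and the toy sequences are hypothetical.

```python
def rotate_to_anchor(seq, anchor):
    """Rotate a circular sequence so that its linear form starts with `anchor`."""
    doubled = seq + seq  # doubling lets the anchor be found across the wrap-around
    pos = doubled.find(anchor)
    if pos == -1 or pos >= len(seq):
        raise ValueError("anchor not found in circular sequence")
    return doubled[pos:pos + len(seq)]


# Three linearizations of the same toy circular molecule, cut at different points.
records = ["GATTACAGG", "CAGGGATTA", "TTACAGGGA"]
rotated = [rotate_to_anchor(s, "GATT") for s in records]
print(rotated)  # ['GATTACAGG', 'GATTACAGG', 'GATTACAGG']
```

    Once every sequence starts at the same position, a standard linear multiple alignment tool no longer has to open large terminal gaps to compensate for arbitrary cut points.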

  19. Synthesis of phylogeny and taxonomy into a comprehensive tree of life

    PubMed Central

    Hinchliff, Cody E.; Smith, Stephen A.; Allman, James F.; Burleigh, J. Gordon; Chaudhary, Ruchi; Coghill, Lyndon M.; Crandall, Keith A.; Deng, Jiabin; Drew, Bryan T.; Gazis, Romina; Gude, Karl; Hibbett, David S.; Katz, Laura A.; Laughinghouse, H. Dail; McTavish, Emily Jane; Midford, Peter E.; Owen, Christopher L.; Ree, Richard H.; Rees, Jonathan A.; Soltis, Douglas E.; Williams, Tiffani; Cranston, Karen A.

    2015-01-01

    Reconstructing the phylogenetic relationships that unite all lineages (the tree of life) is a grand challenge. The paucity of homologous character data across disparately related lineages currently renders direct phylogenetic inference untenable. To reconstruct a comprehensive tree of life, we therefore synthesized published phylogenies, together with taxonomic classifications for taxa never incorporated into a phylogeny. We present a draft tree containing 2.3 million tips—the Open Tree of Life. Realization of this tree required the assembly of two additional community resources: (i) a comprehensive global reference taxonomy and (ii) a database of published phylogenetic trees mapped to this taxonomy. Our open source framework facilitates community comment and contribution, enabling the tree to be continuously updated when new phylogenetic and taxonomic data become digitally available. Although data coverage and phylogenetic conflict across the Open Tree of Life illuminate gaps in both the underlying data available for phylogenetic reconstruction and the publication of trees as digital objects, the tree provides a compelling starting point for community contribution. This comprehensive tree will fuel fundamental research on the nature of biological diversity, ultimately providing up-to-date phylogenies for downstream applications in comparative biology, ecology, conservation biology, climate change, agriculture, and genomics. PMID:26385966

  20. Synthesis of phylogeny and taxonomy into a comprehensive tree of life.

    PubMed

    Hinchliff, Cody E; Smith, Stephen A; Allman, James F; Burleigh, J Gordon; Chaudhary, Ruchi; Coghill, Lyndon M; Crandall, Keith A; Deng, Jiabin; Drew, Bryan T; Gazis, Romina; Gude, Karl; Hibbett, David S; Katz, Laura A; Laughinghouse, H Dail; McTavish, Emily Jane; Midford, Peter E; Owen, Christopher L; Ree, Richard H; Rees, Jonathan A; Soltis, Douglas E; Williams, Tiffani; Cranston, Karen A

    2015-10-13

    Reconstructing the phylogenetic relationships that unite all lineages (the tree of life) is a grand challenge. The paucity of homologous character data across disparately related lineages currently renders direct phylogenetic inference untenable. To reconstruct a comprehensive tree of life, we therefore synthesized published phylogenies, together with taxonomic classifications for taxa never incorporated into a phylogeny. We present a draft tree containing 2.3 million tips--the Open Tree of Life. Realization of this tree required the assembly of two additional community resources: (i) a comprehensive global reference taxonomy and (ii) a database of published phylogenetic trees mapped to this taxonomy. Our open source framework facilitates community comment and contribution, enabling the tree to be continuously updated when new phylogenetic and taxonomic data become digitally available. Although data coverage and phylogenetic conflict across the Open Tree of Life illuminate gaps in both the underlying data available for phylogenetic reconstruction and the publication of trees as digital objects, the tree provides a compelling starting point for community contribution. This comprehensive tree will fuel fundamental research on the nature of biological diversity, ultimately providing up-to-date phylogenies for downstream applications in comparative biology, ecology, conservation biology, climate change, agriculture, and genomics. PMID:26385966

  1. Investigating how students communicate tree-thinking

    NASA Astrophysics Data System (ADS)

    Boyce, Carrie Jo

    Learning is often an active endeavor that requires students to work at building conceptual understandings of complex topics. Personal experiences, ideas, and communication all play large roles in developing knowledge of and understanding complex topics. Sometimes these experiences can promote formation of scientifically inaccurate or incomplete ideas. Representations are tools used to help individuals understand complex topics. In biology, one way that educators help people understand evolutionary histories of organisms is by using representations called phylogenetic trees. In order to understand phylogenetic trees, individuals need to understand the conventions associated with phylogenies. My dissertation, supported by the Tree-Thinking Representational Competence and Word Association frameworks, is a mixed-methods study investigating the changes in students' tree-reading, representational competence and mental association of phylogenetic terminology after participation in varied instruction. Participants included 128 introductory biology majors from a mid-sized southern research university. Participants were enrolled in either Introductory Biology I, where they were not taught phylogenetics, or Introductory Biology II, where they were explicitly taught phylogenetics. I collected data using a pre- and post-assessment consisting of a word association task and tree-thinking diagnostic (n=128). Additionally, I recruited a subset of students from both courses (n=37) to complete a computer simulation designed to teach students about phylogenetic trees. I then conducted semi-structured interviews consisting of a word association exercise with card sort task, a retrospective pre-assessment discussion, a post-assessment discussion, and interview questions. I found that students who received explicit lecture instruction had a significantly higher increase in scores on a tree-thinking diagnostic than students who did not receive lecture instruction. Students who received both

  2. Genome-Wide Analysis of Oleosin Gene Family in 22 Tree Species: An Accelerator for Metabolic Engineering of BioFuel Crops and Agrigenomics Industrial Applications?

    PubMed Central

    2015-01-01

    Trees contribute to enormous plant oil reserves because many trees contain 50%–80% of oil (triacylglycerols, TAGs) in the fruits and kernels. TAGs accumulate in subcellular structures called oil bodies/droplets, in which TAGs are covered by low-molecular-mass hydrophobic proteins called oleosins (OLEs). The OLEs/TAGs ratio determines the size and shape of intracellular oil bodies. There is a lack of comprehensive sequence analysis and structural information of OLEs among diverse trees. The objectives of this study were to identify OLEs from 22 tree species (e.g., tung tree, tea-oil tree, castor bean), perform genome-wide analysis of OLEs, classify OLEs, identify conserved sequence motifs and amino acid residues, and predict secondary and three-dimensional structures in tree OLEs and OLE subfamilies. Data mining identified 65 OLEs with perfect conservation of the “proline knot” motif (PX5SPX3P) from 19 trees. These OLEs contained >40% hydrophobic amino acid residues. They displayed similar properties and amino acid composition. Genome-wide phylogenetic analysis and multiple sequence alignment demonstrated that these proteins could be classified into five OLE subfamilies. There were distinct patterns of sequence conservation among the OLE subfamilies and within individual tree species. Computational modeling indicated that OLEs were composed of at least three α-helixes connected with short coils without any β-strand and that they exhibited distinct 3D structures and ligand binding sites. These analyses provide fundamental information in the similarity and specificity of diverse OLE isoforms within the same subfamily and among the different species, which should facilitate studying the structure-function relationship and identify critical amino acid residues in OLEs for metabolic engineering of tree TAGs. PMID:26258573

  3. Genome-Wide Analysis of Oleosin Gene Family in 22 Tree Species: An Accelerator for Metabolic Engineering of BioFuel Crops and Agrigenomics Industrial Applications?

    PubMed

    Cao, Heping

    2015-09-01

    Trees contribute to enormous plant oil reserves because many trees contain 50%-80% of oil (triacylglycerols, TAGs) in the fruits and kernels. TAGs accumulate in subcellular structures called oil bodies/droplets, in which TAGs are covered by low-molecular-mass hydrophobic proteins called oleosins (OLEs). The OLEs/TAGs ratio determines the size and shape of intracellular oil bodies. There is a lack of comprehensive sequence analysis and structural information of OLEs among diverse trees. The objectives of this study were to identify OLEs from 22 tree species (e.g., tung tree, tea-oil tree, castor bean), perform genome-wide analysis of OLEs, classify OLEs, identify conserved sequence motifs and amino acid residues, and predict secondary and three-dimensional structures in tree OLEs and OLE subfamilies. Data mining identified 65 OLEs with perfect conservation of the "proline knot" motif (PX5SPX3P) from 19 trees. These OLEs contained >40% hydrophobic amino acid residues. They displayed similar properties and amino acid composition. Genome-wide phylogenetic analysis and multiple sequence alignment demonstrated that these proteins could be classified into five OLE subfamilies. There were distinct patterns of sequence conservation among the OLE subfamilies and within individual tree species. Computational modeling indicated that OLEs were composed of at least three α-helixes connected with short coils without any β-strand and that they exhibited distinct 3D structures and ligand binding sites. These analyses provide fundamental information in the similarity and specificity of diverse OLE isoforms within the same subfamily and among the different species, which should facilitate studying the structure-function relationship and identify critical amino acid residues in OLEs for metabolic engineering of tree TAGs. PMID:26258573

  4. Phyloclimatic modeling: combining phylogenetics and bioclimatic modeling.

    PubMed

    Yesson, C; Culham, A

    2006-10-01

    We investigate the impact of past climates on plant diversification by tracking the "footprint" of climate change on a phylogenetic tree. Diversity within the cosmopolitan carnivorous plant genus Drosera (Droseraceae) is focused within Mediterranean climate regions. We explore whether this diversity is temporally linked to Mediterranean-type climatic shifts of the mid-Miocene and whether climate preferences are conservative over phylogenetic timescales. Phyloclimatic modeling combines environmental niche (bioclimatic) modeling with phylogenetics in order to study evolutionary patterns in relation to climate change. We present the largest and most complete such example to date using Drosera. The bioclimatic models of extant species demonstrate clear phylogenetic patterns; this is particularly evident for the tuberous sundews from southwestern Australia (subgenus Ergaleium). We employ a method for establishing confidence intervals of node ages on a phylogeny using replicates from a Bayesian phylogenetic analysis. This chronogram shows that many clades, including subgenus Ergaleium and section Bryastrum, diversified during the establishment of the Mediterranean-type climate. Ancestral reconstructions of bioclimatic models demonstrate a pattern of preference for this climate type within these groups. Ancestral bioclimatic models are projected into palaeo-climate reconstructions for the time periods indicated by the chronogram. We present two such examples that each generate plausible estimates of ancestral lineage distribution, which are similar to their current distributions. This is the first study to attempt bioclimatic projections on evolutionary time scales. The sundews appear to have diversified in response to local climate development. Some groups are specialized for Mediterranean climates, others show wide-ranging generalism. This demonstrates that Phyloclimatic modeling could be repeated for other plant groups and is fundamental to the understanding of

  5. MixtureTree annotator: a program for automatic colorization and visual annotation of MixtureTree.

    PubMed

    Chen, Shu-Chuan; Ogata, Aaron

    2015-01-01

    The MixtureTree Annotator, written in JAVA, allows the user to automatically color any phylogenetic tree in Newick format generated from any phylogeny reconstruction program and output the Nexus file. By providing the ability to automatically color the tree by sequence name, the MixtureTree Annotator provides a unique advantage over any other programs which perform a similar function. In addition, the MixtureTree Annotator is the only package that can efficiently annotate the output produced by MixtureTree with mutation information and coalescent time information. In order to visualize the resulting output file, a modified version of FigTree is used. Certain popular methods, which lack good built-in visualization tools, for example, MEGA, Mesquite, PHY-FI, TreeView, treeGraph and Geneious, may give results with human errors due to either manually adding colors to each node or with other limitations, for example only using color based on a number, such as branch length, or by taxonomy. In addition to allowing the user to automatically color any given Newick tree by sequence name, the MixtureTree Annotator is the only method that allows the user to automatically annotate the resulting tree created by the MixtureTree program. The MixtureTree Annotator is fast and easy-to-use, while still allowing the user full control over the coloring and annotating process. PMID:25826378
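    The idea of coloring a Newick tree by sequence name and writing a Nexus file can be sketched as follows. This is not the MixtureTree Annotator's code. The function names, the toy tree, and the prefix-based palette are hypothetical, and the `[&!color=#rrggbb]` annotation on taxon labels is assumed to follow the convention FigTree uses in the files it saves. The tip-name regular expression handles only simple, unquoted Newick labels.

```python
import re


def tip_names(newick):
    """Extract tip labels: names that directly follow '(' or ',' in the Newick string."""
    return re.findall(r"(?<=[(,])\s*([^(),:;\s]+)", newick)


def to_colored_nexus(newick, color_for):
    """Build a NEXUS document whose taxon labels carry color annotations.

    `color_for` maps a tip name to a hex color string, or None for no color.
    """
    names = tip_names(newick)
    lines = ["#NEXUS", "begin taxa;",
             f"\tdimensions ntax={len(names)};", "\ttaxlabels"]
    for name in names:
        color = color_for(name)
        tag = f"[&!color={color}]" if color else ""  # assumed FigTree-style annotation
        lines.append(f"\t{name}{tag}")
    lines += ["\t;", "end;", "", "begin trees;",
              f"\ttree tree_1 = [&R] {newick}", "end;"]
    return "\n".join(lines)


# Hypothetical rule: color each tip by the prefix before the underscore.
palette = {"human": "#ff0000", "chimp": "#0000ff"}
newick = "((human_1:0.1,human_2:0.1):0.2,chimp_1:0.3);"
nexus = to_colored_nexus(newick, lambda name: palette.get(name.split("_")[0]))
print(nexus)
```

    Deriving the color from the name by rule, rather than assigning it tip by tip, is what removes the manual step the abstract identifies as a source of human error.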

  6. Phylogenetic community ecology of soil biodiversity using mitochondrial metagenomics.

    PubMed

    Andújar, Carmelo; Arribas, Paula; Ruzicka, Filip; Crampton-Platt, Alex; Timmermans, Martijn J T N; Vogler, Alfried P

    2015-07-01

    High-throughput DNA methods hold great promise for the study of taxonomically intractable mesofauna of the soil. Here, we assess species diversity and community structure in a phylogenetic framework, by sequencing total DNA from bulk specimen samples and assembly of mitochondrial genomes. The combination of mitochondrial metagenomics and DNA barcode sequencing of 1494 specimens in 69 soil samples from three geographic regions in southern Iberia revealed >300 species of soil Coleoptera (beetles) from a broad spectrum of phylogenetic lineages. A set of 214 mitochondrial sequences longer than 3000 bp was generated and used to estimate a well-supported phylogenetic tree of the order Coleoptera. Shorter sequences, including cox1 barcodes, were placed on this mitogenomic tree. Raw Illumina reads were mapped against all available sequences to test for species present in local samples. This approach simultaneously established the species richness, phylogenetic composition and community turnover at species and phylogenetic levels. We find a strong signature of vertical structuring in soil fauna that shows high local community differentiation between deep soil and superficial horizons at phylogenetic levels. Within the two vertical layers, turnover among regions was primarily at the tip (species) level and was stronger in the deep soil than leaf litter communities, pointing to layer-mediated drivers determining species diversification, spatial structure and evolutionary assembly of soil communities. This integrated phylogenetic framework opens the application of phylogenetic community ecology to the mesofauna of the soil, among the most diverse and least well-understood ecosystems, and will propel both theoretical and applied soil science. PMID:25865150

  7. FastTree: Computing Large Minimum-Evolution Trees with Profiles instead of a Distance Matrix

    SciTech Connect

    Price, Morgan N.; Dehal, Paramvir S.; Arkin, Adam P.

    2009-07-31

    Gene families are growing rapidly, but standard methods for inferring phylogenies do not scale to alignments with over 10,000 sequences. We present FastTree, a method for constructing large phylogenies and for estimating their reliability. Instead of storing a distance matrix, FastTree stores sequence profiles of internal nodes in the tree. FastTree uses these profiles to implement neighbor-joining and uses heuristics to quickly identify candidate joins. FastTree then uses nearest-neighbor interchanges to reduce the length of the tree. For an alignment with N sequences, L sites, and a different characters (where a is the alphabet size), a distance matrix requires O(N^2) space and O(N^2 L) time, but FastTree requires just O( NLa + N sqrt(N) ) memory and O( N sqrt(N) log(N) L a ) time. To estimate the tree's reliability, FastTree uses local bootstrapping, which gives another 100-fold speedup over a distance matrix. For example, FastTree computed a tree and support values for 158,022 distinct 16S ribosomal RNAs in 17 hours and 2.4 gigabytes of memory. Just computing pairwise Jukes-Cantor distances and storing them, without inferring a tree or bootstrapping, would require 17 hours and 50 gigabytes of memory. In simulations, FastTree was slightly more accurate than neighbor joining, BIONJ, or FastME; on genuine alignments, FastTree's topologies had higher likelihoods. FastTree is available at http://microbesonline.org/fasttree.
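    The 50-gigabyte figure for the distance matrix can be checked with a short calculation, sketched below together with the standard Jukes-Cantor distance formula d = -(3/4) ln(1 - 4p/3), where p is the fraction of differing sites. The storage layout is an assumption: one 4-byte value per unordered pair of sequences. The abstract does not state how the authors arrived at their figure, and the function names are hypothetical.

```python
import math


def distance_matrix_bytes(n_seqs, bytes_per_entry=4):
    """Bytes needed to store one value per unordered pair of sequences.

    Assumes 4-byte entries and storage of only one triangle of the matrix.
    """
    return n_seqs * (n_seqs - 1) // 2 * bytes_per_entry


def jukes_cantor(seq_a, seq_b):
    """Jukes-Cantor distance between two aligned nucleotide sequences.

    Sites with a gap ('-') in either sequence are skipped. The formula is
    undefined when the fraction of differing sites reaches 0.75.
    """
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    p = sum(a != b for a, b in pairs) / len(pairs)
    return -0.75 * math.log(1 - 4 * p / 3)


gigabytes = distance_matrix_bytes(158_022) / 1e9
print(f"{gigabytes:.1f} GB")  # about 49.9 GB under these assumptions

print(round(jukes_cantor("ACGTACGT", "ACGTACGA"), 4))  # 0.1367
```

    Under these assumptions the quadratic term alone accounts for the quoted memory requirement, which is the cost that storing profiles instead of a full matrix avoids.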

  8. 6. Aerial view of turnpike alignment running from lower left ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    6. Aerial view of turnpike alignment running from lower left diagonally up to right along row of trees. Migel Estate and Farm buildings (HABS No. NY-6356) located at lower right of photograph. W.K. Smith house (HABS No. NY-6356-A) located within clump of trees at lower center, with poultry houses (HABS No. NY-6356-F and G) visible left of the clump of trees. View looking south. - Orange Turnpike, Parallel to new Orange Turnpike, Monroe, Orange County, NY

  9. Talking Trees

    ERIC Educational Resources Information Center

    Tolman, Marvin

    2005-01-01

    Students love outdoor activities and will love them even more when they build confidence in their tree identification and measurement skills. Through these activities, students will learn to identify the major characteristics of trees and discover how the pace--a nonstandard measuring unit--can be used to estimate not only distances but also the…

  10. Tree Amigos.

    ERIC Educational Resources Information Center

    Center for Environmental Study, Grand Rapids, MI.

    Tree Amigos is a special cross-cultural program that uses trees as a common bond to bring the people of the Americas together in unique partnerships to preserve and protect the shared global environment. It is a tangible program that embodies the philosophy that individuals, acting together, can make a difference. This resource book contains…

  11. Evaluating Phylogenetic Informativeness as a Predictor of Phylogenetic Signal for Metazoan, Fungal, and Mammalian Phylogenomic Data Sets

    PubMed Central

    López-Giráldez, Francesc; Moeller, Andrew H.; Townsend, Jeffrey P.

    2013-01-01

    Phylogenetic research is often stymied by selection of a marker that leads to poor phylogenetic resolution despite considerable cost and effort. Profiles of phylogenetic informativeness provide a quantitative measure for prioritizing gene sampling to resolve branching order in a particular epoch. To evaluate the utility of these profiles, we analyzed phylogenomic data sets from metazoans, fungi, and mammals, thus encompassing diverse time scales and taxonomic groups. We also evaluated the utility of profiles created based on simulated data sets. We found that genes selected via their informativeness dramatically outperformed haphazard sampling of markers. Furthermore, our analyses demonstrate that the original phylogenetic informativeness method can be extended to trees with more than four taxa. Thus, although the method currently predicts phylogenetic signal without specifically accounting for the misleading effects of stochastic noise, it is robust to the effects of homoplasy. The phylogenetic informativeness rankings obtained will allow other researchers to select advantageous genes for future studies within these clades, maximizing return on effort and investment. Genes identified might also yield efficient experimental designs for phylogenetic inference for many sister clades and outgroup taxa that are closely related to the diverse groups of organisms analyzed. PMID:23878813

  12. A phylogenetic blueprint for a modern whale.

    PubMed

    Gatesy, John; Geisler, Jonathan H; Chang, Joseph; Buell, Carl; Berta, Annalisa; Meredith, Robert W; Springer, Mark S; McGowen, Michael R

    2013-02-01

    The emergence of Cetacea in the Paleogene represents one of the most profound macroevolutionary transitions within Mammalia. The move from a terrestrial habitat to a committed aquatic lifestyle engendered wholesale changes in anatomy, physiology, and behavior. The results of this remarkable transformation are extant whales that include the largest, biggest brained, fastest swimming, loudest, deepest diving mammals, some of which can detect prey with a sophisticated echolocation system (Odontoceti - toothed whales), and others that batch feed using racks of baleen (Mysticeti - baleen whales). A broad-scale reconstruction of the evolutionary remodeling that culminated in extant cetaceans has not yet been based on integration of genomic and paleontological information. Here, we first place Cetacea relative to extant mammalian diversity, and assess the distribution of support among molecular datasets for relationships within Artiodactyla (even-toed ungulates, including Cetacea). We then merge trees derived from three large concatenations of molecular and fossil data to yield a composite hypothesis that encompasses many critical events in the evolutionary history of Cetacea. By combining diverse evidence, we infer a phylogenetic blueprint that outlines the stepwise evolutionary development of modern whales. This hypothesis represents a starting point for more detailed, comprehensive phylogenetic reconstructions in the future, and also highlights the synergistic interaction between modern (genomic) and traditional (morphological+paleontological) approaches that ultimately must be exploited to provide a rich understanding of evolutionary history across the entire tree of Life. PMID:23103570

  13. Phylogenetic relationships of marine bacteria, mainly members of the family Vibrionaceae, determined on the basis of 16S rRNA sequences.

    PubMed

    Kita-Tsukamoto, K; Oyaizu, H; Nanba, K; Simidu, U

    1993-01-01

    The phylogenetic relationships of 50 reference strains, mostly marine bacteria which require Na+ for growth, were determined on the basis of 600 16S rRNA nucleotides by using reverse transcriptase sequencing. Strains belonging to 10 genera were included (four genera of the family Vibrionaceae, the genus Aeromonas of the family Aeromonadaceae, and the genera Alteromonas, Marinomonas, Shewanella, Pseudomonas, and Deleya). The sequences were aligned, the similarity values and evolutionary distance values were determined, and a phylogenetic tree was constructed by using the neighbor-joining method. On the basis of our results, the family Vibrionaceae was separated into at least seven groups (genera and families). Vibrio marinus clearly was on a line of descent that was remote from other vibrios. As determined by the similarity and evolutionary distance values, V. marinus is more distantly related to the family Vibrionaceae than the members of the Aeromonadaceae are. Also, Vibrio cholerae strains formed a separate group with Vibrio mimicus at the genus level. Of 30 species of the Vibrionaceae, 17 formed a large phylogenetic cluster. The genus Listonella was found to be a heterogeneous group, and the species were distributed in various subgroups of the Vibrionaceae. The separation of the family Aeromonadaceae from the family Vibrionaceae and the separation of the genera Marinomonas and Shewanella from the genus Alteromonas were confirmed in this phylogenetic study. However, a marine Pseudomonas species, Pseudomonas nautica, was clearly separated from two terrestrial Pseudomonas species. Each group that was separated by the phylogenetic analysis had characteristic 16S rRNA sequence patterns that were common only to species in that group. Therefore, the characteristic sequences described in this paper may be useful for identification purposes. PMID:8427811

  14. The gene tree delusion.

    PubMed

    Springer, Mark S; Gatesy, John

    2016-01-01

    Higher-level relationships among placental mammals are mostly resolved, but several polytomies remain contentious. Song et al. (2012) claimed to have resolved three of these using shortcut coalescence methods (MP-EST, STAR) and further concluded that these methods, which assume no within-locus recombination, are required to unravel deep-level phylogenetic problems that have stymied concatenation. Here, we reanalyze Song et al.'s (2012) data and leverage these re-analyses to explore key issues in systematics including the recombination ratchet, gene tree stoichiometry, the proportion of gene tree incongruence that results from deep coalescence versus other factors, and simulations that compare the performance of coalescence and concatenation methods in species tree estimation. Song et al. (2012) reported an average locus length of 3.1 kb for the 447 protein-coding genes in their phylogenomic dataset, but the true mean length of these loci (start codon to stop codon) is 139.6 kb. Empirical estimates of recombination breakpoints in primates, coupled with consideration of the recombination ratchet, suggest that individual coalescence genes (c-genes) approach ∼12 bp or less for Song et al.'s (2012) dataset, three to four orders of magnitude shorter than the c-genes reported by these authors. This result has general implications for the application of coalescence methods in species tree estimation. We contend that it is illogical to apply coalescence methods to complete protein-coding sequences. Such analyses amalgamate c-genes with different evolutionary histories (i.e., exons separated by >100,000 bp), distort true gene tree stoichiometry that is required for accurate species tree inference, and contradict the central rationale for applying coalescence methods to difficult phylogenetic problems. In addition, Song et al.'s (2012) dataset of 447 genes includes 21 loci with switched taxonomic names, eight duplicated loci, 26 loci with non-homologous sequences that are

  15. Partial gene sequences for the A subunit of methyl-coenzyme M reductase (mcrI) as a phylogenetic tool for the family Methanosarcinaceae

    NASA Technical Reports Server (NTRS)

    Springer, E.; Sachs, M. S.; Woese, C. R.; Boone, D. R.

    1995-01-01

    Representatives of the family Methanosarcinaceae were analyzed phylogenetically by comparing partial sequences of their methyl-coenzyme M reductase (mcrI) genes. A 490-bp fragment from the A subunit of the gene was selected, amplified by the PCR, cloned, and sequenced for each of 25 strains belonging to the Methanosarcinaceae. The sequences obtained were aligned with the corresponding portions of five previously published sequences, and all of the sequences were compared to determine phylogenetic distances by Fitch distance matrix methods. We prepared analogous trees based on 16S rRNA sequences; these trees corresponded closely to the mcrI trees, although the mcrI sequences of pairs of organisms had 3.01 +/- 0.541 times more changes than the respective pairs of 16S rRNA sequences, suggesting that the mcrI fragment evolved about three times more rapidly than the 16S rRNA gene. The qualitative similarity of the mcrI and 16S rRNA trees suggests that transfer of genetic information between dissimilar organisms has not significantly affected these sequences, although we found inconsistencies between some mcrI distances that we measured and previously published DNA reassociation data. It is unlikely that multiple mcrI isogenes were present in the organisms that we examined, because we found no major discrepancies in multiple determinations of mcrI sequences from the same organism. Our primers for the PCR also match analogous sites in the previously published mcrII sequences, but all of the sequences that we obtained from members of the Methanosarcinaceae were more closely related to mcrI sequences than to mcrII sequences, suggesting that members of the Methanosarcinaceae do not have distinct mcrII genes.

  16. Quantitative developmental data in a phylogenetic framework.

    PubMed

    Giannini, Norberto Pedro

    2014-12-01

    Following the embryonic period of organogenesis, most development is allometric growth, which is thought to produce most of the evolutionary morphological divergence between related species. Bivariate or multivariate coefficients of allometry are used to describe quantitative developmental data and are comparable across taxa; as such, these coefficients are amenable to direct treatment in a phylogenetic framework. Mapping of actual allometric coefficients onto phylogenetic trees is supported on the basis of the evolving nature of growth programs and the type of character (continuous) that they represent. This procedure depicts evolutionary allometry accurately and allows for the generation of reliable reconstructions of ancestral allometry, as shown here with a previously published case study on rodent cranial ontogeny. Results reconstructed the signature allometric patterns of rodents to the root of the phylogeny, which could be traced back into a (minimum) Paleocene age. Both character and statistical dependence need to be addressed, so this approach can be integrated with phylogenetic comparative methods that deal with those issues. It is shown that, in this particular sample of rodents, common ancestry explains little allometric variation given the level of divergence present within, and convergence between, major rodent lineages. Furthermore, all that variation is independent of body mass. Thus, from an evolutionary perspective, allometry appears to have a strong functional and likely adaptive basis. PMID:25130201

  17. Testing and quantifying phylogenetic signals and homoplasy in morphometric data.

    PubMed

    Klingenberg, Christian Peter; Gidaszewski, Nelly A

    2010-05-01

    The relationship between morphometrics and phylogenetic analysis has long been controversial. Here we propose an approach that is based on mapping morphometric traits onto phylogenies derived from other data and thus avoids the pitfalls encountered by previous studies. This method treats shape as a single, multidimensional character. We propose a test for the presence of a phylogenetic signal in morphometric data, which simulates the null hypothesis of the complete absence of phylogenetic structure by permutation of the shape data among the terminal taxa. We also propose 2 measures of the fit of morphometric data to the phylogeny that are direct extensions of the consistency index and retention index used in traditional cladistics. We apply these methods to a small study of the evolution of wing shape in the Drosophila melanogaster subgroup, for which a very strongly supported phylogeny is available. This case study reveals a significant phylogenetic signal and a relatively low degree of homoplasy. Despite the low homoplasy, the shortest tree computed from landmark data on wing shape is inconsistent with the well-supported phylogenetic tree from molecular data, underscoring that morphometric data may not provide reliable information for inferring phylogeny. PMID:20525633

  18. Phylogenetic analysis of anaerobic thermophilic bacteria: aid for their reclassification.

    PubMed Central

    Rainey, F A; Ward, N L; Morgan, H W; Toalster, R; Stackebrandt, E

    1993-01-01

    Small subunit rDNA sequences were determined for 20 species of the genera Acetogenium, Clostridium, Thermoanaerobacter, Thermoanaerobacterium, Thermoanaerobium, and Thermobacteroides, 3 non-validly described species, and 5 isolates of anaerobic thermophilic bacteria, providing a basis for a phylogenetic analysis of these organisms. Several species contain a version of the molecule significantly longer than that of Escherichia coli because of the presence of inserts. On the basis of normal evolutionary distances, the phylogenetic tree indicates that all bacteria investigated in this study with a maximum growth temperature above 65 degrees C form a supercluster within the subphylum of gram-positive bacteria that also contains Clostridium thermosaccharolyticum and Clostridium thermoaceticum, which have been previously sequenced. This supercluster appears to be equivalent in its phylogenetic depth to the supercluster of mesophilic clostridia and their nonspore-forming relatives. Several phylogenetically and phenotypically coherent clusters that are defined by sets of signature nucleotides emerge within the supercluster of thermophiles. Clostridium thermobutyricum and Clostridium thermopalmarium are members of Clostridium group I. A phylogenetic tree derived from transversion distances demonstrated the artificial clustering of some organisms with high rDNA G+C moles percent, i.e., Clostridium fervidus and the thermophilic, cellulolytic members of the genus Clostridium. The results of this study can be used as an aid for future taxonomic restructuring of anaerobic sporogenous and asporogenous thermophilic, gram-positive bacteria. PMID:7687600

  19. Reconstructible Phylogenetic Networks: Do Not Distinguish the Indistinguishable

    PubMed Central

    Pardi, Fabio; Scornavacca, Celine

    2015-01-01

    Phylogenetic networks represent the evolution of organisms that have undergone reticulate events, such as recombination, hybrid speciation or lateral gene transfer. An important way to interpret a phylogenetic network is in terms of the trees it displays, which represent all the possible histories of the characters carried by the organisms in the network. Interestingly, however, different networks may display exactly the same set of trees, an observation that poses a problem for network reconstruction: from the perspective of many inference methods such networks are indistinguishable. This is true for all methods that evaluate a phylogenetic network solely on the basis of how well the displayed trees fit the available data, including all methods based on input data consisting of clades, triples, quartets, or trees with any number of taxa, and also sequence-based approaches such as popular formalisations of maximum parsimony and maximum likelihood for networks. This identifiability problem is partially solved by accounting for branch lengths, although this merely reduces the frequency of the problem. Here we propose that network inference methods should only attempt to reconstruct what they can uniquely identify. To this end, we introduce a novel definition of what constitutes a uniquely reconstructible network. For any given set of indistinguishable networks, we define a canonical network that, under mild assumptions, is unique and thus representative of the entire set. Given data that underwent reticulate evolution, only the canonical form of the underlying phylogenetic network can be uniquely reconstructed. While on the methodological side this will imply a drastic reduction of the solution space in network inference, for the study of reticulate evolution this is a fundamental limitation that will require an important change of perspective when interpreting phylogenetic networks. PMID:25849429

  20. Reconstructible phylogenetic networks: do not distinguish the indistinguishable.

    PubMed

    Pardi, Fabio; Scornavacca, Celine

    2015-04-01

    Phylogenetic networks represent the evolution of organisms that have undergone reticulate events, such as recombination, hybrid speciation or lateral gene transfer. An important way to interpret a phylogenetic network is in terms of the trees it displays, which represent all the possible histories of the characters carried by the organisms in the network. Interestingly, however, different networks may display exactly the same set of trees, an observation that poses a problem for network reconstruction: from the perspective of many inference methods such networks are "indistinguishable". This is true for all methods that evaluate a phylogenetic network solely on the basis of how well the displayed trees fit the available data, including all methods based on input data consisting of clades, triples, quartets, or trees with any number of taxa, and also sequence-based approaches such as popular formalisations of maximum parsimony and maximum likelihood for networks. This identifiability problem is partially solved by accounting for branch lengths, although this merely reduces the frequency of the problem. Here we propose that network inference methods should only attempt to reconstruct what they can uniquely identify. To this end, we introduce a novel definition of what constitutes a uniquely reconstructible network. For any given set of indistinguishable networks, we define a canonical network that, under mild assumptions, is unique and thus representative of the entire set. Given data that underwent reticulate evolution, only the canonical form of the underlying phylogenetic network can be uniquely reconstructed. While on the methodological side this will imply a drastic reduction of the solution space in network inference, for the study of reticulate evolution this is a fundamental limitation that will require an important change of perspective when interpreting phylogenetic networks. PMID:25849429

  1. Revealing pancrustacean relationships: Phylogenetic analysis of ribosomal protein genes places Collembola (springtails) in a monophyletic Hexapoda and reinforces the discrepancy between mitochondrial and nuclear DNA markers

    PubMed Central

    2008-01-01

    Background In recent years, several new hypotheses on phylogenetic relations among arthropods have been proposed on the basis of DNA sequences. One of the challenged hypotheses is the monophyly of hexapods. This discussion originated from analyses based on mitochondrial DNA datasets that, due to an unusual positioning of Collembola, suggested that the hexapod body plan evolved at least twice. Here, we re-evaluate the position of Collembola using ribosomal protein gene sequences. Results In total 48 ribosomal proteins were obtained for the collembolan Folsomia candida. These 48 sequences were aligned with sequence data on 35 other ecdysozoans. Each ribosomal protein gene was available for 25% to 86% of the taxa. However, the total sequence information was unequally distributed over the taxa and ranged between 4% and 100%. A concatenated dataset was constructed (5034 inferred amino acids in length), of which ~66% of the positions were filled. Phylogenetic tree reconstructions, using Maximum Likelihood, Maximum Parsimony, and Bayesian methods, resulted in a topology that supports monophyly of Hexapoda. Conclusion Although ribosomal proteins in general may not evolve independently, they once more appear highly valuable for phylogenetic reconstruction. Our analyses clearly suggest that Hexapoda is monophyletic. This underpins the inconsistency between nuclear and mitochondrial datasets when analyzing pancrustacean relationships. Caution is needed when applying mitochondrial markers in deep phylogeny. PMID:18366624

  2. Mammalian phylogenetic diversity-area relationships at a continental scale

    PubMed Central

    Mazel, Florent; Renaud, Julien; Guilhaumon, François; Mouillot, David; Gravel, Dominique; Thuiller, Wilfried

    2015-01-01

    In analogy to the species-area relationship (SAR), one of the few laws in Ecology, the phylogenetic diversity-area relationship (PDAR) describes the tendency of phylogenetic diversity (PD) to increase with area. Although investigating PDAR has the potential to unravel the underlying processes shaping assemblages across spatial scales and to predict PD loss through habitat reduction, it has been little investigated so far. Focusing on PD has noticeable advantages compared to species richness (SR) since PD also gives insights on processes such as speciation/extinction, assembly rules and ecosystem functioning. Here we investigate the universality and pervasiveness of the PDAR at continental scale using terrestrial mammals as a study case. We define the relative robustness of PD (compared to SR) to habitat loss as the area between the standardized PDAR and standardized SAR (i.e. standardized by the diversity of the largest spatial window) divided by the area under the standardized SAR only. This metric quantifies the relative increase of PD robustness compared to SR robustness. We show that PD robustness is higher than SR robustness but that it varies among continents. We further use a null model approach to disentangle the relative effect of phylogenetic tree shape and nonrandom spatial distribution of evolutionary history on the PDAR. We find that for most spatial scales and for all continents except Eurasia, PDARs are not different from expected by a model using only the observed SAR and the shape of the phylogenetic tree at continental scale. Interestingly, we detect a strong phylogenetic structure of the Eurasian PDAR that can be predicted by a model that specifically accounts for a finer biogeographical delineation of this continent. In conclusion, the relative robustness of PD to habitat loss compared to species richness is determined by the phylogenetic tree shape but also depends on the spatial structure of PD. PMID:26649401
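
    The relative-robustness metric defined in this abstract can be illustrated with a minimal sketch. It assumes the PDAR and SAR are sampled at the same window areas, that the last sample is the largest spatial window, and that areas under the curves are taken with the trapezoid rule; the function names and the toy numbers are hypothetical and are not taken from the paper.

```python
def trapezoid_area(x, y):
    """Area under the piecewise-linear curve through the points (x[i], y[i])."""
    return sum((x[i + 1] - x[i]) * (y[i] + y[i + 1]) / 2 for i in range(len(x) - 1))


def relative_robustness(areas, pd_values, sr_values):
    """(area between standardized PDAR and SAR) / (area under standardized SAR).

    Assumption: the final entry of each list belongs to the largest window,
    so dividing by it standardizes each curve to end at 1.
    """
    pd_std = [v / pd_values[-1] for v in pd_values]
    sr_std = [v / sr_values[-1] for v in sr_values]
    area_sar = trapezoid_area(areas, sr_std)
    area_pdar = trapezoid_area(areas, pd_std)
    return (area_pdar - area_sar) / area_sar


# Toy data (hypothetical): PD saturates faster than SR as area grows.
areas = [0, 1, 2]
sr = [0, 1, 2]
pd = [0, 3, 4]
print(relative_robustness(areas, pd, sr))
```

    A positive value means the standardized PDAR lies above the standardized SAR, that is, proportionally more PD than SR is retained in smaller windows.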

  3. Mammalian phylogenetic diversity-area relationships at a continental scale.

    PubMed

    Mazel, Florent; Renaud, Julien; Guilhaumon, François; Mouillot, David; Gravel, Dominique; Thuiller, Wilfried

    2015-10-01

    In analogy to the species-area relationship (SAR), one of the few laws in ecology, the phylogenetic diversity-area relationship (PDAR) describes the tendency of phylogenetic diversity (PD) to increase with area. Although investigating PDAR has the potential to unravel the underlying processes shaping assemblages across spatial scales and to predict PD loss through habitat reduction, it has been little investigated so far. Focusing on PD has noticeable advantages compared to species richness (SR), since PD also gives insights on processes such as speciation/extinction, assembly rules and ecosystem functioning. Here we investigate the universality and pervasiveness of the PDAR at continental scale using terrestrial mammals as a study case. We define the relative robustness of PD (compared to SR) to habitat loss as the area between the standardized PDAR and standardized SAR (i.e., standardized by the diversity of the largest spatial window) divided by the area under the standardized SAR only. This metric quantifies the relative increase of PD robustness compared to SR robustness. We show that PD robustness is higher than SR robustness but that it varies among continents. We further use a null model approach to disentangle the relative effect of phylogenetic tree shape and nonrandom spatial distribution of evolutionary history on the PDAR. We find that, for most spatial scales and for all continents except Eurasia, PDARs are not different from expected by a model using only the observed SAR and the shape of the phylogenetic tree at continental scale. Interestingly, we detect a strong phylogenetic structure of the Eurasian PDAR that can be predicted by a model that specifically accounts for a finer biogeographical delineation of this continent. In conclusion, the relative robustness of PD to habitat loss compared to species richness is determined by the phylogenetic tree shape but also depends on the spatial structure of PD. PMID:26649401

  4. Inferring polyploid phylogenies from multiply-labeled gene trees

    PubMed Central

    Lott, Martin; Spillner, Andreas; Huber, Katharina T; Petri, Anna; Oxelman, Bengt; Moulton, Vincent

    2009-01-01

    Background Gene trees that arise in the context of reconstructing the evolutionary history of polyploid species are often multiply-labeled, that is, the same leaf label can occur several times in a single tree. This property considerably complicates the task of forming a consensus of a collection of such trees compared to usual phylogenetic trees. Results We present a method for computing a consensus tree of multiply-labeled trees. As with the well-known greedy consensus tree approach for phylogenetic trees, our method first breaks the given collection of gene trees into a set of clusters. It then aims to insert these clusters one at a time into a tree, starting with the clusters that are supported by most of the gene trees. As the problem to decide whether a cluster can be inserted into a multiply-labeled tree is computationally hard, we have developed a heuristic method for solving this problem. Conclusion We illustrate the applicability of our method using two collections of trees for plants of the genus Silene, that involve several allopolyploids at different levels. PMID:19715596
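
    For orientation, the well-known greedy consensus approach that this abstract builds on can be sketched for ordinary (singly-labeled) rooted trees. This is not the authors' multiply-labeled heuristic: it assumes each input tree is given as a set of clusters (frozensets of leaf labels), and it uses the standard rule that two clusters fit in one rooted tree when they are disjoint or nested. All names and the example data are hypothetical.

```python
from collections import Counter


def compatible(a, b):
    """Two clusters can coexist in a rooted tree iff they are disjoint or nested."""
    return a.isdisjoint(b) or a <= b or b <= a


def greedy_consensus(trees):
    """Return the clusters kept by greedy consensus, most supported first.

    `trees` is a list of sets of frozensets; each frozenset is one cluster.
    """
    # Count how many input trees contain each cluster.
    support = Counter(cluster for tree in trees for cluster in tree)
    # Highest support first; ties broken by sorted labels for determinism.
    ranked = sorted(support, key=lambda c: (-support[c], sorted(c)))
    accepted = []
    for cluster in ranked:
        # Insert the cluster only if it conflicts with nothing accepted so far.
        if all(compatible(cluster, other) for other in accepted):
            accepted.append(cluster)
    return accepted


# Three toy gene trees on leaves A-D (hypothetical data).
AB, ABC = frozenset("AB"), frozenset("ABC")
ABD, BC = frozenset("ABD"), frozenset("BC")
trees = [{AB, ABC}, {AB, ABD}, {BC, ABC}]
print(greedy_consensus(trees))
```

    In the multiply-labeled setting described above, the compatibility check is the computationally hard step, which is why the authors replace it with a heuristic.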

  5. Phylogenetic position of the acariform mites: sensitivity to homology assessment under total evidence

    PubMed Central

    2010-01-01

    Background Mites (Acari) have traditionally been treated as monophyletic, albeit composed of two major lineages: Acariformes and Parasitiformes. Yet recent studies based on morphology, molecular data, or combinations thereof, have increasingly drawn their monophyly into question. Furthermore, the usually basal (molecular) position of one or both mite lineages among the chelicerates is in conflict with their morphology, and with the widely accepted view that mites are close relatives of Ricinulei. Results The phylogenetic position of the acariform mites is examined through employing SSU, partial LSU sequences, and morphology from 91 chelicerate extant terminals (forty Acariformes). In a static homology framework, molecular sequences were aligned using their secondary structure as guide, whereby regions of ambiguous alignment were discarded, and pre-aligned sequences analyzed under parsimony and different mixed models in a Bayesian inference. Parsimony and Bayesian analyses led to trees largely congruent concerning infra-ordinal, well-supported branches, but with low support for inter-ordinal relationships. An exception is Solifugae + Acariformes (P. P = 100%, J. = 0.91). In a dynamic homology framework, two analyses were run: a standard POY analysis and an analysis constrained by secondary structure. Both analyses led to largely congruent trees, supporting a (Palpigradi (Solifugae Acariformes)) clade and Ricinulei as sister group of Tetrapulmonata with the topology (Ricinulei (Amblypygi (Uropygi Araneae))). Combined analyses with two different morphological data matrices were run in order to evaluate the impact of constraining the analysis on the recovered topology when employing secondary structure as a guide for homology establishment. The constrained combined analysis yielded two topologies similar to the exclusively molecular analysis for both morphological matrices, except for the recovery of Pedipalpi instead of the (Uropygi Araneae) clade. The standard (direct

  6. Molecular phylogenetic relationships between prostanoid-containing Okinawan soft coral (Clavularia viridis) and nonprostanoid-containing Clavularia species based on ribosomal ITS sequence.

    PubMed

    Fujiwara, Shoko; Yasui, Kazuyuki; Watanabe, Kinzo; Wakabayashi, Takako; Tsuzuki, Mikio; Iguchi, Kazuo

    2003-01-01

    To study phylogenetic relationships among Okinawan soft corals of the genus Clavularia, the ribosomal internal transcribed spacer sequences of host corals and the 18S rDNA sequences of symbiotic algae were analyzed. The molecular phylogenetic trees of hosts showed that a prostanoid-containing species, Clavularia viridis, is deeply diverged from other species of Clavularia which do not biosynthesize the prostanoids as the main secondary metabolites. Comparison of their trees suggested poor phylogenetic concordance between hosts and symbionts. PMID:14719169

  7. Molecular evolution of rDNA in early diverging Metazoa: First comparative analysis and phylogenetic application of complete SSU rRNA secondary structures in Porifera

    PubMed Central

    2008-01-01

    Background The cytoplasmic ribosomal small subunit (SSU, 18S) ribosomal RNA (rRNA) is the most frequently-used gene for molecular phylogenetic studies. However, information regarding its secondary structure is neglected in most phylogenetic analyses. Incorporation of this information is essential in order to apply specific rRNA evolutionary models to overcome the problem of co-evolution of paired sites, which violates the basic assumption of the independent evolution of sites made by most phylogenetic methods. Information about secondary structure also supports the process of aligning rRNA sequences across taxa. Both aspects have been shown to increase the accuracy of phylogenetic reconstructions within various taxa. Here, we explore SSU rRNA secondary structures from the three extant classes of Phylum Porifera (Grant, 1836), a pivotal, but largely unresolved taxon of early branching Metazoa. This is the first phylogenetic study of poriferan SSU rRNA data to date that includes detailed comparative secondary structure information for all three sponge classes. Results We found base compositional and structural differences in SSU rRNA among Demospongiae, Hexactinellida (glass sponges) and Calcarea (calcareous sponges). We showed that analyses of primary rRNA sequences, including secondary structure-specific evolutionary models, in combination with reconstruction of the evolution of unusual structural features, reveal a substantial amount of additional information. Of special note was the finding that the gene tree topologies of marine haplosclerid demosponges, which are inconsistent with the current morphology-based classification, are supported by our reconstructed evolution of secondary structure features. Therefore, these features can provide alternative support for sequence-based topologies and give insights into the evolution of the molecule itself. 
    To encourage and facilitate the application of rRNA models in phylogenetics of early metazoans, we present 52 SSU r

  8. Effects of memory on the shapes of simple outbreak trees

    PubMed Central

    Plazzotta, Giacomo; Kwan, Christopher; Boyd, Michael; Colijn, Caroline

    2016-01-01

    Genomic tools, including phylogenetic trees derived from sequence data, are increasingly used to understand outbreaks of infectious diseases. One challenge is to link phylogenetic trees to patterns of transmission. Particularly in bacteria that cause chronic infections, this inference is affected by variable infectious periods and infectivity over time. It is known that non-exponential infectious periods can have substantial effects on pathogens’ transmission dynamics. Here we ask how this non-Markovian nature of an outbreak process affects the branching trees describing that process, with particular focus on tree shapes. We simulate Crump-Mode-Jagers branching processes and compare different patterns of infectivity over time. We find that memory (non-Markovian-ness) in the process can have a pronounced effect on the shapes of the outbreak’s branching pattern. However, memory also has a pronounced effect on the sizes of the trees, even when the duration of the simulation is fixed. When the sizes of the trees are constrained to a constant value, memory in our processes has little direct effect on tree shapes, but can bias inference of the birth rate from trees. We compare simulated branching trees to phylogenetic trees from an outbreak of tuberculosis in Canada, and discuss the relevance of memory to this dataset. PMID:26888437

  9. Effects of memory on the shapes of simple outbreak trees.

    PubMed

    Plazzotta, Giacomo; Kwan, Christopher; Boyd, Michael; Colijn, Caroline

    2016-01-01

    Genomic tools, including phylogenetic trees derived from sequence data, are increasingly used to understand outbreaks of infectious diseases. One challenge is to link phylogenetic trees to patterns of transmission. Particularly in bacteria that cause chronic infections, this inference is affected by variable infectious periods and infectivity over time. It is known that non-exponential infectious periods can have substantial effects on pathogens' transmission dynamics. Here we ask how this non-Markovian nature of an outbreak process affects the branching trees describing that process, with particular focus on tree shapes. We simulate Crump-Mode-Jagers branching processes and compare different patterns of infectivity over time. We find that memory (non-Markovian-ness) in the process can have a pronounced effect on the shapes of the outbreak's branching pattern. However, memory also has a pronounced effect on the sizes of the trees, even when the duration of the simulation is fixed. When the sizes of the trees are constrained to a constant value, memory in our processes has little direct effect on tree shapes, but can bias inference of the birth rate from trees. We compare simulated branching trees to phylogenetic trees from an outbreak of tuberculosis in Canada, and discuss the relevance of memory to this dataset. PMID:26888437

  10. TreeRipper web application: towards a fully automated optical tree recognition software

    PubMed Central

    2011-01-01

    Background Relationships between species, genes and genomes have been printed as trees for over a century. Whilst this may have been the best format for exchanging and sharing phylogenetic hypotheses during the 20th century, the worldwide web now provides faster and automated ways of transferring and sharing phylogenetic knowledge. However, novel software is needed to defrost these published phylogenies for the 21st century. Results TreeRipper is a simple website for the fully-automated recognition of multifurcating phylogenetic trees (http://linnaeus.zoology.gla.ac.uk/~jhughes/treeripper/). The program accepts a range of input image formats (PNG, JPG/JPEG or GIF). The underlying command-line C++ program follows a number of cleaning steps to detect lines, remove node labels, patch up broken lines and corners and detect line edges. The edge contour is then determined to detect the branch length, tip label positions and the topology of the tree. Optical Character Recognition (OCR) is used to convert the tip labels into text with the freely available tesseract-ocr software. Of the images meeting the prerequisites for TreeRipper, 32% were successfully recognised; the largest tree had 115 leaves. Conclusions Although the diversity of ways in which phylogenies have been illustrated makes the design of fully automated tree recognition software difficult, TreeRipper is a step towards automating the digitization of past phylogenies. We also provide a dataset of 100 tree images and associated tree files for training and/or benchmarking future software. TreeRipper is an open source project licensed under the GNU General Public Licence v3. PMID:21599881

  11. Nuclear Ribosomal ITS Functional Paralogs Resolve the Phylogenetic Relationships of a Late-Miocene Radiation Cycad Cycas (Cycadaceae)

    PubMed Central

    Xiao, Long-Qian; Möller, Michael

    2015-01-01

    Cycas is the most widespread and diverse genus among the ancient cycads, but the extant species could be the product of late Miocene rapid radiations. Taxonomic treatments to date for this genus are quite controversial, which makes it difficult to elucidate its evolutionary history. We cloned 161 genomic ITS sequences from 31 species representing all sections of Cycas. The divergent ITS paralogs were examined within each species and identified as putative pseudogenes, recombinants and functional paralogs. Functional paralogs were used to reconstruct phylogenetic relationships with pseudogene sequences as molecular outgroups, since an unambiguous ITS sequence alignment with their closest relatives, the Zamiaceae, is unachievable. A fully resolved and highly supported tree topology was obtained at the section level, with two major clades including six minor clades. The results fully supported the classification scheme proposed by Hill (2004) at the section level, with the minor clades representing his six sections. The two major clades could be recognised as two subgenera. The obtained pattern of phylogenetic relationships, combined with the different seed dispersal capabilities and paleogeography, allowed us to propose a late Miocene rapid radiation of Cycas that might have been promoted by vicariant events associated with the complex topography and orogeny of South China and adjacent regions. In contrast, transoceanic dispersals might have played an important role in the rapid diversification of sect. Cycas, whose members have evolved a spongy layer in their seeds aiding water dispersals. PMID:25635842

  12. A phylogenetic mixture model for detecting pattern-heterogeneity in gene sequence or character-state data.

    PubMed

    Pagel, Mark; Meade, Andrew

    2004-08-01

    We describe a general likelihood-based 'mixture model' for inferring phylogenetic trees from gene-sequence or other character-state data. The model accommodates cases in which different sites in the alignment evolve in qualitatively distinct ways, but does not require prior knowledge of these patterns or partitioning of the data. We call this qualitative variability in the pattern of evolution across sites "pattern-heterogeneity" to distinguish it from both a homogenous process of evolution and from one characterized principally by differences in rates of evolution. We present studies to show that the model correctly retrieves the signals of pattern-heterogeneity from simulated gene-sequence data, and we apply the method to protein-coding genes and to a ribosomal 12S data set. The mixture model outperforms conventional partitioning in both these data sets. We implement the mixture model such that it can simultaneously detect rate- and pattern-heterogeneity. The model simplifies to a homogeneous model or a rate-variability model as special cases, and therefore always performs at least as well as these two approaches, and often considerably improves upon them. We make the model available within a Bayesian Markov-chain Monte Carlo framework for phylogenetic inference, as an easy-to-use computer program. PMID:15371247

  13. nWayComp: a genome-wide sequence comparison tool for multiple strains/species of phylogenetically related microorganisms.

    PubMed

    Yao, Jiqiang; Lin, Hong; Doddapaneni, Harshavardhan; Civerolo, Edwin L

    2007-01-01

    The increasing number of whole genomic sequences of microorganisms has led to the complexity of genome-wide annotation and gene sequence comparison among multiple microorganisms. To address this problem, we have developed nWayComp software that compares DNA and protein sequences of phylogenetically-related microorganisms. This package integrates a series of bioinformatics tools such as BLAST, ClustalW, ALIGN, PHYLIP and PRIMER3 for sequence comparison. It searches for homologous sequences among multiple organisms and identifies genes that are unique to a particular organism. The homologous gene sets are then ranked in the descending order of the sequence similarity. For each set of homologous sequences, a table of sequence identity among homologous genes along with sequence variations such as SNPs and INDELS is developed, and a phylogenetic tree is constructed. In addition, a common set of primers that can amplify all the homologous sequences are generated. The nWayComp package provides users with a quick and convenient tool to compare genomic sequences among multiple organisms at the whole-genome level. PMID:17688445

  14. On the existence of infinitely many universal tree-based networks.

    PubMed

    Hayamizu, Momoko

    2016-05-01

    A tree-based network on a set X of n leaves is said to be universal if any rooted binary phylogenetic tree on X can be its base tree. Francis and Steel showed that there is a universal tree-based network on X in the case of n = 3, and asked whether such a network exists in general. We settle this problem by proving that there are infinitely many universal tree-based networks for any n>1. PMID:26921465

  15. Phylogenetic relationships among arecoid palms (Arecaceae: Arecoideae)

    PubMed Central

    Baker, William J.; Norup, Maria V.; Clarkson, James J.; Couvreur, Thomas L. P.; Dowe, John L.; Lewis, Carl E.; Pintaud, Jean-Christophe; Savolainen, Vincent; Wilmot, Tomas; Chase, Mark W.

    2011-01-01

    Background and Aims The Arecoideae is the largest and most diverse of the five subfamilies of palms (Arecaceae/Palmae), containing >50 % of the species in the family. Despite its importance, phylogenetic relationships among Arecoideae are poorly understood. Here the most densely sampled phylogenetic analysis of Arecoideae available to date is presented. The results are used to test the current classification of the subfamily and to identify priority areas for future research. Methods DNA sequence data for the low-copy nuclear genes PRK and RPB2 were collected from 190 palm species, covering 103 (96 %) genera of Arecoideae. The data were analysed using the parsimony ratchet, maximum likelihood, and both likelihood and parsimony bootstrapping. Key Results and Conclusions Despite the recovery of paralogues and pseudogenes in a small number of taxa, PRK and RPB2 were both highly informative, producing well-resolved phylogenetic trees with many nodes well supported by bootstrap analyses. Simultaneous analyses of the combined data sets provided additional resolution and support. Two areas of incongruence between PRK and RPB2 were strongly supported by the bootstrap relating to the placement of tribes Chamaedoreeae, Iriarteeae and Reinhardtieae; the causes of this incongruence remain uncertain. The current classification within Arecoideae was strongly supported by the present data. Of the 14 tribes and 14 sub-tribes in the classification, only five sub-tribes from tribe Areceae (Basseliniinae, Linospadicinae, Oncospermatinae, Rhopalostylidinae and Verschaffeltiinae) failed to receive support. Three major higher level clades were strongly supported: (1) the RRC clade (Roystoneeae, Reinhardtieae and Cocoseae), (2) the POS clade (Podococceae, Oranieae and Sclerospermeae) and (3) the core arecoid clade (Areceae, Euterpeae, Geonomateae, Leopoldinieae, Manicarieae and Pelagodoxeae). However, new data sources are required to elucidate ambiguities that remain in phylogenetic

  16. Bayesian phylogenetic estimation of fossil ages.

    PubMed

    Drummond, Alexei J; Stadler, Tanja

    2016-07-19

    Recent advances have allowed for both morphological fossil evidence and molecular sequences to be integrated into a single combined inference of divergence dates under the rule of Bayesian probability. In particular, the fossilized birth-death tree prior and the Lewis-Mk model of discrete morphological evolution allow for the estimation of both divergence times and phylogenetic relationships between fossil and extant taxa. We exploit this statistical framework to investigate the internal consistency of these models by producing phylogenetic estimates of the age of each fossil in turn, within two rich and well-characterized datasets of fossil and extant species (penguins and canids). We find that the estimation accuracy of fossil ages is generally high with credible intervals seldom excluding the true age and median relative error in the two datasets of 5.7% and 13.2%, respectively. The median relative standard error (RSD) was 9.2% and 7.2%, respectively, suggesting good precision, although with some outliers. In fact, in the two datasets we analyse, the phylogenetic estimate of fossil age is on average less than 2 Myr from the mid-point age of the geological strata from which it was excavated. The high level of internal consistency found in our analyses suggests that the Bayesian statistical model employed is an adequate fit for both the geological and morphological data, and provides evidence from real data that the framework used can accurately model the evolution of discrete morphological traits coded from fossil and extant taxa. We anticipate that this approach will have diverse applications beyond divergence time dating, including dating fossils that are temporally unconstrained, testing of the 'morphological clock', and for uncovering potential model misspecification and/or data errors when controversial phylogenetic hypotheses are obtained based on combined divergence dating analyses. This article is part of the themed issue 'Dating species divergences using

  17. Bayesian phylogenetic estimation of fossil ages

    PubMed Central

    Drummond, Alexei J.; Stadler, Tanja

    2016-01-01

    Recent advances have allowed for both morphological fossil evidence and molecular sequences to be integrated into a single combined inference of divergence dates under the rule of Bayesian probability. In particular, the fossilized birth–death tree prior and the Lewis-Mk model of discrete morphological evolution allow for the estimation of both divergence times and phylogenetic relationships between fossil and extant taxa. We exploit this statistical framework to investigate the internal consistency of these models by producing phylogenetic estimates of the age of each fossil in turn, within two rich and well-characterized datasets of fossil and extant species (penguins and canids). We find that the estimation accuracy of fossil ages is generally high with credible intervals seldom excluding the true age and median relative error in the two datasets of 5.7% and 13.2%, respectively. The median relative standard error (RSD) was 9.2% and 7.2%, respectively, suggesting good precision, although with some outliers. In fact, in the two datasets we analyse, the phylogenetic estimate of fossil age is on average less than 2 Myr from the mid-point age of the geological strata from which it was excavated. The high level of internal consistency found in our analyses suggests that the Bayesian statistical model employed is an adequate fit for both the geological and morphological data, and provides evidence from real data that the framework used can accurately model the evolution of discrete morphological traits coded from fossil and extant taxa. We anticipate that this approach will have diverse applications beyond divergence time dating, including dating fossils that are temporally unconstrained, testing of the ‘morphological clock’, and for uncovering potential model misspecification and/or data errors when controversial phylogenetic hypotheses are obtained based on combined divergence dating analyses. This article is part of the themed issue ‘Dating species divergences

  18. Phylogenetic Comparative Assembly

    NASA Astrophysics Data System (ADS)

    Husemann, Peter; Stoye, Jens

    Recent high throughput sequencing technologies are capable of generating a huge amount of data for bacterial genome sequencing projects. Although current sequence assemblers successfully merge the overlapping reads, often several contigs remain which cannot be assembled any further. It is still costly and time consuming to close all the gaps in order to acquire the whole genomic sequence. Here we propose an algorithm that takes several related genomes and their phylogenetic relationships into account to create a contig adjacency graph. From this a layout graph can be computed which indicates putative adjacencies of the contigs in order to aid biologists in finishing the complete genomic sequence.

  19. Molecular phylogenetic analysis among bryophytes and tracheophytes based on combined data of plastid coded genes and the 18S rRNA gene.

    PubMed

    Nishiyama, T; Kato, M

    1999-08-01

    The basal relationship of bryophytes and tracheophytes is problematic in land plant phylogeny. In addition to cladistic analyses of morphological data, molecular phylogenetic analyses of the nuclear small-subunit ribosomal RNA gene and the plastid gene rbcL have been performed, but no confident conclusions have been reached. Using the maximum-likelihood (ML) method, we analyzed 4,563 bp of aligned sequences from plastid protein-coding genes and 1,680 bp from the nuclear 18S rRNA gene. In the ML tree of deduced amino acid sequences of the plastid genes, hornworts were basal among the land plants, while mosses and liverworts each formed a clade and were sister to each other. Total-evidence evaluation of rRNA data and plastid protein-coding genes by TOTALML had an almost identical result. PMID:10474899

  20. The probability of topological concordance of gene trees and species trees.

    PubMed

    Rosenberg, Noah A

    2002-03-01

    The concordance of gene trees and species trees is reconsidered in detail, allowing for samples of arbitrary size to be taken from the species. A sense of concordance for gene tree and species tree topologies is clarified, such that if the "collapsed gene tree" produced by a gene tree has the same topology as the species tree, the gene tree is said to be topologically concordant with the species tree. The term speciodendric is introduced to refer to genes whose trees are topologically concordant with species trees. For a given three-species topology, probabilities of each of the three possible collapsed gene tree topologies are given, as are probabilities of monophyletic concordance and concordance in the sense of N. Takahata (1989), Genetics 122, 957-966. Increasing the sample size is found to increase the probability of topological concordance, but a limit exists on how much the topological concordance probability can be increased. Suggested sample sizes beyond which this probability can be increased only minimally are given. The results are discussed in terms of implications for molecular studies of phylogenetics and speciation. PMID:11969392
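
    For the simplest case that this abstract generalizes, a single gene copy sampled from each of three species, the standard coalescent result is that the gene tree matches the species tree with probability 1 - (2/3)e^(-T), where T is the length of the internal species-tree branch in coalescent units. The sketch below covers only that textbook one-lineage-per-species case, not the larger samples and collapsed gene trees treated in the paper; the function names are illustrative.

```python
import math


def concordance_probability(t_internal):
    """P(gene tree topology == species tree topology), three species,
    one sampled lineage per species; t_internal is in coalescent units."""
    if t_internal < 0:
        raise ValueError("branch length must be non-negative")
    return 1.0 - (2.0 / 3.0) * math.exp(-t_internal)


def discordance_probability_each(t_internal):
    """Probability of each of the two discordant gene tree topologies."""
    if t_internal < 0:
        raise ValueError("branch length must be non-negative")
    return (1.0 / 3.0) * math.exp(-t_internal)


# With a zero-length internal branch all three topologies are equally likely.
p_at_zero = concordance_probability(0.0)
```

    The probability rises from 1/3 toward 1 as the internal branch lengthens, which is the baseline against which the effect of larger sample sizes is judged.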

  1. Molecular phylogenetic perspectives for character classification and convergence: Framing some issues with nematode vulval appendages and telotylenchid tail termini

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Characters flagged as convergent based on newer molecular phylogenetic trees inform both practical identification and more esoteric classification. Nematode morphological characters such as lateral lines, bullae and laciniae are quite independent structures from those similarly named in other organi...

  2. Breakpoint Distance and PQ-Trees

    NASA Astrophysics Data System (ADS)

    Jiang, Haitao; Chauve, Cedric; Zhu, Binhai

    The PQ-tree is a fundamental data structure that can encode large sets of permutations. It has recently been used in comparative genomics to model ancestral genomes with some uncertainty: given a phylogeny for some species, extant genomes are represented by permutations on the leaves of the tree, and each internal node in the phylogenetic tree represents an extinct ancestral genome, represented by a PQ-tree. An open problem related to this approach is then to quantify the evolution between genomes represented by PQ-trees. In this paper we present results for two problems of PQ-tree comparison motivated by this application. First, we show that the problem of comparing two PQ-trees by computing the minimum breakpoint distance among all pairs of permutations generated respectively by the two considered PQ-trees is NP-complete for unsigned permutations. Next, we consider a generalization of the classical Breakpoint Median problem, where an ancestral genome is represented by a PQ-tree and p permutations are given, with p ≥ 1, and we want to compute a permutation generated by the PQ-tree that minimizes the sum of the breakpoint distances to the p permutations. We show that this problem is Fixed-Parameter Tractable with respect to the breakpoint distance value. This last result applies both on signed and unsigned permutations, and to uni-chromosomal and multi-chromosomal permutations.
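
    To make the underlying quantity concrete, the following minimal sketch computes a breakpoint distance between two plain unsigned linear permutations. It assumes one common convention (count the adjacencies of the first permutation that are absent from the second, with no framing elements added at the ends); it does not implement PQ-trees or the median problem discussed in the abstract, and the function name is hypothetical.

```python
def breakpoint_distance(perm_a, perm_b):
    """Number of adjacencies of perm_a that are not adjacencies of perm_b.

    Both arguments are sequences over the same set of distinct elements.
    Adjacencies are unordered pairs because the permutations are unsigned.
    """
    if sorted(perm_a) != sorted(perm_b):
        raise ValueError("permutations must contain the same elements")
    adjacencies_b = {frozenset(pair) for pair in zip(perm_b, perm_b[1:])}
    return sum(
        1
        for pair in zip(perm_a, perm_a[1:])
        if frozenset(pair) not in adjacencies_b
    )


identity = [1, 2, 3, 4, 5]
swapped = [1, 3, 2, 4, 5]
distance = breakpoint_distance(swapped, identity)  # {1,3} and {2,4} are breakpoints
```

    Under this unframed, unsigned convention a permutation and its full reversal are at distance 0, which is one reason signed and framed variants are often preferred in practice.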

  3. Molecular cloning and expression of a novel MYB transcription factor gene in rubber tree.

    PubMed

    Qin, Bi; Zhang, Yu; Wang, Meng

    2014-12-01

    MYB family proteins regulate a variety of cellular processes in plants. Tapping panel dryness (TPD) in rubber tree (Hevea brasiliensis Muell. Arg.) affects latex biosynthesis and causes serious losses to rubber producers. In this study, a novel SANT/MYB transcription factor gene down-regulated in TPD rubber tree, named as HbSM1, was isolated from rubber tree. The complete HbSM1 open reading frame (ORF) was 948 bp in length. The deduced HbSM1 protein is 315 amino acids. HbSM1 belonged to 1RMYB subfamily with a single SANT domain. Sequence alignment revealed that HbSM1 had high homology with MYB members from Ricinus communis and Manihot esculenta, with 72 and 78 % identity, respectively. Moreover, HbSM1 shared 56 % identity with Glycine max GmMYB176. Phylogenetic analysis revealed that HbSM1, GmMYB176, rice OsMYBS2, and OsMYBS3 fell into the same cluster with 93 % bootstrap support value. Comparing expression among different tissues demonstrated that HbSM1 was ubiquitously expressed in all tissues, but it appeared to be preferentially expressed in leaf and latex. Furthermore, HbSM1 transcripts were significantly induced by various phytohormones (including gibberellic acid, ethephon, methyl jasmonate, salicylic acid, and abscisic acid) and wounding treatments. These results suggested that HbSM1 might play multiple roles in plant development via different phytohormones signaling pathways. PMID:25195053

  4. Nearest Alignment Space Termination

    2006-07-13

    Nearest Alignment Space Termination (NAST) is the Greengenes algorithm that matches up submitted sequences with the Greengenes database to look for similarities and align the submitted sequences based on those similarities.

  5. The Phylogenetic Diversity of Metagenomes

    PubMed Central

    Kembel, Steven W.; Eisen, Jonathan A.; Pollard, Katherine S.; Green, Jessica L.

    2011-01-01

    Phylogenetic diversity—patterns of phylogenetic relatedness among organisms in ecological communities—provides important insights into the mechanisms underlying community assembly. Studies that measure phylogenetic diversity in microbial communities have primarily been limited to a single marker gene approach, using the small subunit of the rRNA gene (SSU-rRNA) to quantify phylogenetic relationships among microbial taxa. In this study, we present an approach for inferring phylogenetic relationships among microorganisms based on the random metagenomic sequencing of DNA fragments. To overcome challenges caused by the fragmentary nature of metagenomic data, we leveraged fully sequenced bacterial genomes as a scaffold to enable inference of phylogenetic relationships among metagenomic sequences from multiple phylogenetic marker gene families. The resulting metagenomic phylogeny can be used to quantify the phylogenetic diversity of microbial communities based on metagenomic data sets. We applied this method to understand patterns of microbial phylogenetic diversity and community assembly along an oceanic depth gradient, and compared our findings to previous studies of this gradient using SSU-rRNA gene and metagenomic analyses. Bacterial phylogenetic diversity was highest at intermediate depths beneath the ocean surface, whereas taxonomic diversity (diversity measured by binning sequences into taxonomically similar groups) showed no relationship with depth. Phylogenetic diversity estimates based on the SSU-rRNA gene and the multi-gene metagenomic phylogeny were broadly concordant, suggesting that our approach will be applicable to other metagenomic data sets for which corresponding SSU-rRNA gene sequences are unavailable. Our approach opens up the possibility of using metagenomic data to study microbial diversity in a phylogenetic context. PMID:21912589

  6. [Partial sequence homology of FtsZ in phylogenetics analysis of lactic acid bacteria].

    PubMed

    Zhang, Bin; Dong, Xiu-zhu

    2005-10-01

    FtsZ is a structurally conserved protein, which is universal among the prokaryotes. It plays a key role in prokaryote cell division. A partial fragment of the ftsZ gene about 800 bp in length was amplified and sequenced, and a partial FtsZ protein phylogenetic tree for the lactic acid bacteria was constructed. Comparing the FtsZ phylogenetic tree with the 16S rDNA tree showed that the two trees were similar in topology. Both trees revealed that Pediococcus spp. were closely related to the L. casei group of Lactobacillus spp., but less closely related to other lactic acid cocci such as Enterococcus and Streptococcus. The results also showed that the discriminative power of FtsZ was higher than that of 16S rDNA at both the inter-species and inter-genus levels, and that FtsZ could be a very useful tool in species identification of lactic acid bacteria. PMID:16342751

  7. Shiva automatic pinhole alignment

    SciTech Connect

    Suski, G.J.

    1980-09-05

    This paper describes a computer controlled closed loop alignment subsystem for Shiva, which represents the first use of video sensors for large laser alignment at LLNL. The techniques used on this now operational subsystem are serving as the basis for all closed loop alignment on Nova, the 200 terawatt successor to Shiva.

  8. Constructing Phylogenetic Networks Based on the Isomorphism of Datasets

    PubMed Central

    Wang, Juan; Zhang, Zhibin; Li, Yanjuan

    2016-01-01

    Constructing rooted phylogenetic networks from rooted phylogenetic trees has become an important problem in molecular evolution. So far, many methods have been presented in this area, in which most efficient methods are based on the incompatible graph, such as the CASS, the LNETWORK, and the BIMLR. This paper will research the commonness of the methods based on the incompatible graph, the relationship between incompatible graph and the phylogenetic network, and the topologies of incompatible graphs. We can find out all the simplest datasets for a topology G and construct a network for every dataset. For any one dataset 𝒞, we can compute a network from the network representing the simplest dataset which is isomorphic to 𝒞. This process will save more time for the algorithms when constructing networks. PMID:27547759

  9. Constructing Phylogenetic Networks Based on the Isomorphism of Datasets.

    PubMed

    Wang, Juan; Zhang, Zhibin; Li, Yanjuan

    2016-01-01

    Constructing rooted phylogenetic networks from rooted phylogenetic trees has become an important problem in molecular evolution. So far, many methods have been presented in this area, in which most efficient methods are based on the incompatible graph, such as the CASS, the LNETWORK, and the BIMLR. This paper will research the commonness of the methods based on the incompatible graph, the relationship between incompatible graph and the phylogenetic network, and the topologies of incompatible graphs. We can find out all the simplest datasets for a topology G and construct a network for every dataset. For any one dataset 𝒞, we can compute a network from the network representing the simplest dataset which is isomorphic to 𝒞. This process will save more time for the algorithms when constructing networks. PMID:27547759

  10. Identification, Classification and Differential Expression of Oleosin Genes in Tung Tree (Vernicia fordii)

    PubMed Central

    Cao, Heping; Zhang, Lin; Tan, Xiaofeng; Long, Hongxu; Shockey, Jay M.

    2014-01-01

    Triacylglycerols (TAG) are the major molecules of energy storage in eukaryotes. TAG are packed in subcellular structures called oil bodies or lipid droplets. Oleosins (OLE) are the major proteins in plant oil bodies. Multiple isoforms of OLE are present in plants such as tung tree (Vernicia fordii), whose seeds are rich in novel TAG with a wide range of industrial applications. The objectives of this study were to identify OLE genes, classify OLE proteins and analyze OLE gene expression in tung trees. We identified five tung tree OLE genes coding for small hydrophobic proteins. Genome-wide phylogenetic analysis and multiple sequence alignment demonstrated that the five tung OLE genes represented the five OLE subfamilies and all contained the “proline knot” motif (PX5SPX3P) shared among 65 OLE from 19 tree species, including the sequenced genomes of Prunus persica (peach), Populus trichocarpa (poplar), Ricinus communis (castor bean), Theobroma cacao (cacao) and Vitis vinifera (grapevine). Tung OLE1, OLE2 and OLE3 belong to the S type and OLE4 and OLE5 belong to the SM type of Arabidopsis OLE. TaqMan and SYBR Green qPCR methods were used to study the differential expression of OLE genes in tung tree tissues. Expression results demonstrated that 1) All five OLE genes were expressed in developing tung seeds, leaves and flowers; 2) OLE mRNA levels were much higher in seeds than leaves or flowers; 3) OLE1, OLE2 and OLE3 genes were expressed in tung seeds at much higher levels than OLE4 and OLE5 genes; 4) OLE mRNA levels rapidly increased during seed development; and 5) OLE gene expression was well-coordinated with tung oil accumulation in the seeds. These results suggest that tung OLE genes 1–3 probably play major roles in tung oil accumulation and/or oil body development. Therefore, they might be preferred targets for tung oil engineering in transgenic plants. PMID:24516650

  11. Phylogenetic relationships and new genetic tools for the detection and discrimination of the three feline Demodex mites.

    PubMed

    Silbermayr, Katja; Horvath-Ungerboeck, Christa; Eigner, Barbara; Joachim, Anja; Ferrer, Lluis

    2015-02-01

    Two feline Demodex mite species had been described as causative agents of feline demodicosis until a third species was recently detected. We provide an updated analysis of the phylogenetic relationships of Demodex mites. In addition, we present the first qPCR assay for the detection and differentiation of all three feline mite species in a single reaction. Specimens of Demodex cati, Demodex gatoi, and the recently discovered third species were collected from skin scrapings and fecal flotation for DNA extraction, conventional PCR, sequencing, and alignment. A total of 24 sequences of the partial 16S rRNA gene were used to estimate the evolutionary divergence in a p-distance model and a maximum likelihood phylogenetic tree. For the qPCR assay, new primers and fluorescent probes for the simultaneous detection of all three feline Demodex mites were designed. A consensus fragment of 351 bp was phylogenetically analyzed. The third species sequence of our study shares 98.6 % similarity to the available sequence in GenBank®. It is most similar to D. gatoi (82.41 %) and most distant to the canine Demodex injai (78.28 %). In contrast, D. gatoi is most similar to human Demodex brevis (87.01 %). The multiplex qPCR detected and discriminated the three different mite species in one reaction. The detection limit is ≤1.4 ng of mite DNA. The three feline Demodex species have distinct genotypes and did not cluster in one genetic clade. The species differentiation and assessment of evolutionary relationships will ultimately support correct diagnostics and treatment approaches. PMID:25468382
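
    The p-distance mentioned in this abstract is simply the proportion of compared alignment sites at which two sequences differ, and percent similarity is its complement. The sketch below is a minimal version that assumes the two sequences are already aligned to equal length and that sites containing a gap or ambiguity character in either sequence are skipped; the function name, the skip set and the toy sequences are illustrative and are not the study's data or software.

```python
def p_distance(seq_a, seq_b, skip="-?N"):
    """Proportion of compared sites at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in skip or b in skip:
            continue  # ignore sites with gaps or ambiguous bases
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Toy sequences (made up): one mismatch in eight compared sites.
toy_distance = p_distance("ACGTACGT", "ACGTACGA")
toy_similarity_percent = 100.0 * (1.0 - toy_distance)
```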

  12. Phylogenetic Analysis of the Bifidobacterium Genus Using Glycolysis Enzyme Sequences

    PubMed Central

    Brandt, Katelyn; Barrangou, Rodolphe

    2016-01-01

    Bifidobacteria are important members of the human gastrointestinal tract that promote the establishment of a healthy microbial consortium in the gut of infants. Recent studies have established that the Bifidobacterium genus is a polymorphic phylogenetic clade, which encompasses a diversity of species and subspecies that encode a broad range of proteins implicated in complex and non-digestible carbohydrate uptake and catabolism, ranging from human breast milk oligosaccharides, to plant fibers. Recent genomic studies have created a need to properly place Bifidobacterium species in a phylogenetic tree. Current approaches, based on core-genome analyses come at the cost of intensive sequencing and demanding analytical processes. Here, we propose a typing method based on sequences of glycolysis genes and the proteins they encode, to provide insights into diversity, typing, and phylogeny in this complex and broad genus. We show that glycolysis genes occur broadly in these genomes, to encode the machinery necessary for the biochemical spine of the cell, and provide a robust phylogenetic marker. Furthermore, glycolytic sequences-based trees are congruent with both the classical 16S rRNA phylogeny, and core genome-based strain clustering. Furthermore, these glycolysis markers can also be used to provide insights into the adaptive evolution of this genus, especially with regards to trends toward a high GC content. This streamlined method may open new avenues for phylogenetic studies on a broad scale, given the widespread occurrence of the glycolysis pathway in bacteria, and the diversity of the sequences they encode. PMID:27242688

  13. Phylogenetic Analysis of the Bifidobacterium Genus Using Glycolysis Enzyme Sequences.

    PubMed

    Brandt, Katelyn; Barrangou, Rodolphe

    2016-01-01

    Bifidobacteria are important members of the human gastrointestinal tract that promote the establishment of a healthy microbial consortium in the gut of infants. Recent studies have established that the Bifidobacterium genus is a polymorphic phylogenetic clade, which encompasses a diversity of species and subspecies that encode a broad range of proteins implicated in complex and non-digestible carbohydrate uptake and catabolism, ranging from human breast milk oligosaccharides, to plant fibers. Recent genomic studies have created a need to properly place Bifidobacterium species in a phylogenetic tree. Current approaches, based on core-genome analyses come at the cost of intensive sequencing and demanding analytical processes. Here, we propose a typing method based on sequences of glycolysis genes and the proteins they encode, to provide insights into diversity, typing, and phylogeny in this complex and broad genus. We show that glycolysis genes occur broadly in these genomes, to encode the machinery necessary for the biochemical spine of the cell, and provide a robust phylogenetic marker. Furthermore, glycolytic sequences-based trees are congruent with both the classical 16S rRNA phylogeny, and core genome-based strain clustering. Furthermore, these glycolysis markers can also be used to provide insights into the adaptive evolution of this genus, especially with regards to trends toward a high GC content. This streamlined method may open new avenues for phylogenetic studies on a broad scale, given the widespread occurrence of the glycolysis pathway in bacteria, and the diversity of the sequences they encode. PMID:27242688

  14. Phylogenetic plant community structure along elevation is lineage specific

    PubMed Central

    Ndiribe, Charlotte; Pellissier, Loïc; Antonelli, Silvia; Dubuis, Anne; Pottier, Julien; Vittoz, Pascal; Guisan, Antoine; Salamin, Nicolas

    2013-01-01

    The trend of closely related taxa to retain similar environmental preferences mediated by inherited traits suggests that several patterns observed at the community scale originate from longer evolutionary processes. While the effects of phylogenetic relatedness have been previously studied within a single genus or family, lineage-specific effects on the ecological processes governing community assembly have rarely been studied for entire communities or flora. Here, we measured how community phylogenetic structure varies across a wide elevation gradient for plant lineages represented by 35 families, using a co-occurrence index and net relatedness index (NRI). We propose a framework that analyses each lineage separately and reveals the trend of ecological assembly at tree nodes. We found prevailing phylogenetic clustering for more ancient nodes and overdispersion in more recent tree nodes. Closely related species may thus rapidly evolve new environmental tolerances to radiate into distinct communities, while older lineages likely retain inherent environmental tolerances to occupy communities in similar environments, either through efficient dispersal mechanisms or the exclusion of older lineages with more divergent environmental tolerances. Our study illustrates the importance of disentangling the patterns of community assembly among lineages to better interpret the ecological role of traits. It also sheds light on studies reporting absence of phylogenetic signal, and opens new perspectives on the analysis of niche and trait conservatism across lineages. PMID:24455126

  15. Systematic Conservation Planning for Groundwater Ecosystems Using Phylogenetic Diversity

    PubMed Central

    Asmyhr, Maria G.; Linke, Simon; Hose, Grant; Nipperess, David A.

    2014-01-01

    Aquifer ecosystems provide a range of important services including clean drinking water. These ecosystems, which are largely inaccessible to humans, comprise a distinct invertebrate fauna (stygofauna), which is characterized by narrow distributions, high levels of endemism and cryptic species. Although being under enormous anthropogenic pressure, aquifers have rarely been included in conservation planning because of the general lack of knowledge of species diversity and distribution. Here we use molecular sequence data and phylogenetic diversity as surrogates for stygofauna diversity in aquifers of New South Wales, Australia. We demonstrate how to incorporate these data as conservation features in the systematic conservation planning software Marxan. We designated each branch of the phylogenetic tree as a conservation feature, with the branch length as a surrogate for the number of distinct characters represented by each branch. Two molecular markers (nuclear 18S ribosomal DNA and mitochondrial cytochrome oxidase subunit I) were used to evaluate how marker variability and the resulting tree topology affected the site-selection process. We found that the sites containing the deepest phylogenetic branches were deemed the most irreplaceable by Marxan. By integrating phylogenetic data, we provide a method for including taxonomically undescribed groundwater fauna in systematic conservation planning. PMID:25514422
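
    The idea of treating each branch as a conservation feature can be sketched as follows: a site "contains" a branch if at least one taxon descended from that branch occurs there, and the site's phylogenetic diversity is the summed length of the branches it contains. The tree representation, function names and toy tree below are hypothetical; this is not the Marxan input format or the authors' code.

```python
def branch_features_at_site(branches, taxa_at_site):
    """Return {branch_id: length} for branches represented at a site.

    branches maps branch_id -> (length, set of descendant tip names).
    """
    present = set(taxa_at_site)
    return {
        branch_id: length
        for branch_id, (length, tips) in branches.items()
        if present & set(tips)  # at least one descendant occurs at the site
    }


def phylogenetic_diversity(branches, taxa_at_site):
    """Summed length of all branches represented at the site."""
    return sum(branch_features_at_site(branches, taxa_at_site).values())


# Toy tree ((A:1,B:1):2,C:3), written as branch -> (length, descendants).
toy_branches = {
    "A": (1.0, {"A"}),
    "B": (1.0, {"B"}),
    "AB": (2.0, {"A", "B"}),
    "C": (3.0, {"C"}),
}
pd_site_with_a = phylogenetic_diversity(toy_branches, ["A"])
```

    In this toy tree a site holding only taxon C scores as high as a site holding only taxon A, because C sits on a single long branch; this is the sense in which sites with deep branches become hard to replace.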

  16. Inferring Phylogenetic Networks with Maximum Pseudolikelihood under Incomplete Lineage Sorting

    PubMed Central

    Solís-Lemus, Claudia; Ané, Cécile

    2016-01-01

    Phylogenetic networks are necessary to represent the tree of life expanded by edges to represent events such as horizontal gene transfers, hybridizations or gene flow. Not all species follow the paradigm of vertical inheritance of their genetic material. While a great deal of research has flourished into the inference of phylogenetic trees, statistical methods to infer phylogenetic networks are still limited and under development. The main disadvantage of existing methods is a lack of scalability. Here, we present a statistical method to infer phylogenetic networks from multi-locus genetic data in a pseudolikelihood framework. Our model accounts for incomplete lineage sorting through the coalescent model, and for horizontal inheritance of genes through reticulation nodes in the network. Computation of the pseudolikelihood is fast and simple, and it avoids the burdensome calculation of the full likelihood which can be intractable with many species. Moreover, estimation at the quartet-level has the added computational benefit that it is easily parallelizable. Simulation studies comparing our method to a full likelihood approach show that our pseudolikelihood approach is much faster without compromising accuracy. We applied our method to reconstruct the evolutionary relationships among swordtails and platyfishes (Xiphophorus: Poeciliidae), which is characterized by widespread hybridizations. PMID:26950302
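
    A quartet-level pseudolikelihood can be illustrated in its simplest setting, a tree without reticulations. Under the multispecies coalescent, a four-taxon tree with internal branch length t (in coalescent units) gives the matching quartet topology probability 1 - (2/3)exp(-t) and each of the other two (1/3)exp(-t). The sketch below multiplies such quartet likelihoods as if they were independent. It is a tree-only simplification with hypothetical gene-tree counts and function names, not the network model described in the abstract.

```python
import math

def quartet_cfs(t):
    """Expected concordance factors (major, minor, minor) for a quartet
    whose internal branch has length t in coalescent units."""
    minor = math.exp(-t) / 3.0
    return (1.0 - 2.0 * minor, minor, minor)

def log_pseudolikelihood(quartets):
    """Sum of multinomial log-likelihood terms over quartets.

    `quartets` is a list of (t, counts) pairs, where counts holds the
    number of gene trees showing the major and the two minor resolutions.
    Quartets are treated as independent, which is what makes this a
    pseudolikelihood rather than a full likelihood.
    """
    total = 0.0
    for t, counts in quartets:
        for cf, n in zip(quartet_cfs(t), counts):
            total += n * math.log(cf)
    return total

# Hypothetical data: 100 gene trees for one quartet, 80 matching the tree.
counts = (80, 10, 10)
candidates = [0.1, 1.2, 3.0]
scores = {t: log_pseudolikelihood([(t, counts)]) for t in candidates}
best_t = max(scores, key=scores.get)
print(best_t, scores)
```

    Because each quartet contributes an independent term, the sum splits naturally across workers, which is consistent with the parallelization benefit the abstract mentions.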

  17. Greenhouse trees

    SciTech Connect

    Hanover, J.W.; Hart, J.W.

    1980-05-09

    Michigan State University has been conducting research on growth control of woody plants with emphasis on commercial plantations. The objective was to develop the optimum levels for the major factors that affect tree seedling growth and development so that high quality plants can be produced for a specific use. This article describes the accelerated-optimal-growth (AOG) concept, describes precautions to take in its application, and shows ways to maximize the potential of AOG for producing ornamental trees. Factors considered were container growing system; protective culture including light, temperature, mineral nutrients, water, carbon dioxide, growth regulators, mycorrhizae, growing media, competition, and pests; size of seedlings; and acclimation. 1 table. (DP)

  18. Girder Alignment Plan

    SciTech Connect

    Wolf, Zackary; Ruland, Robert; LeCocq, Catherine; Lundahl, Eric; Levashov, Yurii; Reese, Ed; Rago, Carl; Poling, Ben; Schafer, Donald; Nuhn, Heinz-Dieter; Wienands, Uli; /SLAC

    2010-11-18

    The girders for the LCLS undulator system contain components which must be aligned with high accuracy relative to each other. The alignment is one of the last steps before the girders go into the tunnel, so the alignment must be done efficiently, on a tight schedule. This note documents the alignment plan which includes efficiency and high accuracy. The motivation for girder alignment involves the following considerations. Using beam based alignment, the girder position will be adjusted until the beam goes through the center of the quadrupole and beam finder wire. For the machine to work properly, the undulator axis must be on this line and the center of the undulator beam pipe must be on this line. The physics reasons for the undulator axis and undulator beam pipe axis to be centered on the beam are different, but the alignment tolerance for both are similar. In addition, the beam position monitor must be centered on the beam to preserve its calibration. Thus, the undulator, undulator beam pipe, quadrupole, beam finder wire, and beam position monitor axes must all be aligned to a common line. All relative alignments are equally important, not just, for example, between quadrupole and undulator. We begin by making the common axis the nominal beam axis in the girder coordinate system. All components will be initially aligned to this axis. A more accurate alignment will then position the components relative to each other, without incorporating the girder itself.

  19. Adelaide River virus nucleoprotein gene: analysis of phylogenetic relationships of ephemeroviruses and other rhabdoviruses.

    PubMed

    Wang, Y; Cowley, J A; Walker, P J

    1995-04-01

    The nucleotide sequence of the Adelaide River virus (ARV) genome was determined from the 3' terminus to the end of the nucleoprotein (N) gene. The 3' leader sequence comprises 50 nucleotides and shares a common terminal trinucleotide (3' UGC-), a conserved U-rich domain and a variable AU-rich domain with other animal rhabdoviruses. The N gene comprises 1355 nucleotides from the transcription start sequence (AACAGG) to the poly(A) sequence [CATG(A)7] and encodes a polypeptide of 429 amino acids. The N protein has a calculated molecular mass of 49429 Da and a pI of 5.4 and, like the bovine ephemeral fever virus (BEFV) N protein, features a highly acidic C-terminal domain. Analysis of amino acid sequence relationships between all available rhabdovirus N proteins indicated that ARV and BEFV are closely related viruses (48.3% similarity) which share higher sequence similarity to vesiculoviruses than to lyssaviruses. Phylogenetic trees based on a multiple sequence alignment of all available rhabdovirus N protein sequences demonstrated clustering of viruses according to genome organization, host range and established taxonomic relationships. PMID:9049348

  20. Audubon Tree Study Program.

    ERIC Educational Resources Information Center

    National Audubon Society, New York, NY.

    Included are an illustrated student reader, "The Story of Trees," a leaders' guide, and a large tree chart with 37 colored pictures. The student reader reviews several aspects of trees: a definition of a tree; where and how trees grow; flowers, pollination and seed production; how trees make their food; how to recognize trees; seasonal changes;…

  1. Combinatorics of distance-based tree inference.

    PubMed

    Pardi, Fabio; Gascuel, Olivier

    2012-10-01

    Several popular methods for phylogenetic inference (or hierarchical clustering) are based on a matrix of pairwise distances between taxa (or any kind of objects): The objective is to construct a tree with branch lengths so that the distances between the leaves in that tree are as close as possible to the input distances. If we hold the structure (topology) of the tree fixed, in some relevant cases (e.g., ordinary least squares) the optimal values for the branch lengths can be expressed using simple combinatorial formulae. Here we define a general form for these formulae and show that they all have two desirable properties: First, the common tree reconstruction approaches (least squares, minimum evolution), when used in combination with these formulae, are guaranteed to infer the correct tree when given enough data (consistency); second, the branch lengths of all the simple (nearest neighbor interchange) rearrangements of a tree can be calculated, optimally, in quadratic time in the size of the tree, thus allowing the efficient application of hill climbing heuristics. The study presented here is a continuation of that by Mihaescu and Pachter on branch length estimation [Mihaescu R, Pachter L (2008) Proc Natl Acad Sci USA 105:13206-13211]. The focus here is on the inference of the tree itself and on providing a basis for novel algorithms to reconstruct trees from distances. PMID:23012403
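
    To make the fixed-topology setting concrete, the sketch below fits branch lengths by ordinary least squares for the four-taxon tree ((A,B),(C,D)). It uses a generic numerical least-squares solver rather than the closed-form combinatorial formulae the abstract refers to, and the tree, the branch ordering, and the distances are hypothetical.

```python
import numpy as np

# Rows: leaf pairs AB, AC, AD, BC, BD, CD.
# Columns: pendant branches a, b, c, d and the internal branch e.
# An entry is 1 when the branch lies on the path between the two leaves.
design = np.array([
    [1, 1, 0, 0, 0],  # A-B: a + b
    [1, 0, 1, 0, 1],  # A-C: a + e + c
    [1, 0, 0, 1, 1],  # A-D: a + e + d
    [0, 1, 1, 0, 1],  # B-C: b + e + c
    [0, 1, 0, 1, 1],  # B-D: b + e + d
    [0, 0, 1, 1, 0],  # C-D: c + d
], dtype=float)

def ols_branch_lengths(distances):
    """Branch lengths minimizing the squared difference between the
    tree's path lengths and the input pairwise distances."""
    lengths, *_ = np.linalg.lstsq(design, distances, rcond=None)
    return lengths

# Distances generated exactly from known lengths should be recovered.
true_lengths = np.array([1.0, 2.0, 3.0, 4.0, 0.5])
observed = design @ true_lengths
estimated = ols_branch_lengths(observed)
print(estimated)
```

    With noisy distances the same call returns the least-squares compromise. A hill-climbing search would repeat such a fit for each rearranged topology, which is where fast closed-form updates pay off.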

  2. Combinatorics of distance-based tree inference

    PubMed Central

    Pardi, Fabio; Gascuel, Olivier

    2012-01-01

    Several popular methods for phylogenetic inference (or hierarchical clustering) are based on a matrix of pairwise distances between taxa (or any kind of objects): The objective is to construct a tree with branch lengths so that the distances between the leaves in that tree are as close as possible to the input distances. If we hold the structure (topology) of the tree fixed, in some relevant cases (e.g., ordinary least squares) the optimal values for the branch lengths can be expressed using simple combinatorial formulae. Here we define a general form for these formulae and show that they all have two desirable properties: First, the common tree reconstruction approaches (least squares, minimum evolution), when used in combination with these formulae, are guaranteed to infer the correct tree when given enough data (consistency); second, the branch lengths of all the simple (nearest neighbor interchange) rearrangements of a tree can be calculated, optimally, in quadratic time in the size of the tree, thus allowing the efficient application of hill climbing heuristics. The study presented here is a continuation of that by Mihaescu and Pachter on branch length estimation [Mihaescu R, Pachter L (2008) Proc Natl Acad Sci USA 105:13206–13211]. The focus here is on the inference of the tree itself and on providing a basis for novel algorithms to reconstruct trees from distances. PMID:23012403

  3. Dating human cultural capacity using phylogenetic principles

    PubMed Central

    Lind, J.; Lindenfors, P.; Ghirlanda, S.; Lidén, K.; Enquist, M.

    2013-01-01

    Humans have genetically based unique abilities making complex culture possible; an assemblage of traits which we term “cultural capacity”. The age of this capacity has long been subject to controversy. We apply phylogenetic principles to date this capacity, integrating evidence from archaeology, genetics, paleoanthropology, and linguistics. We show that cultural capacity is older than the first split in the modern human lineage, and at least 170,000 years old, based on data on hyoid bone morphology, FOXP2 alleles, agreement between genetic and language trees, fire use, burials, and the early appearance of tools comparable to those of modern hunter-gatherers. We cannot exclude that Neanderthals had cultural capacity some 500,000 years ago. A capacity for complex culture, therefore, must have existed before complex culture itself. It may even have originated long before. This seeming paradox is resolved by theoretical models suggesting that cultural evolution is exceedingly slow in its initial stages. PMID:23648831

  4. Dating human cultural capacity using phylogenetic principles.

    PubMed

    Lind, J; Lindenfors, P; Ghirlanda, S; Lidén, K; Enquist, M

    2013-01-01

    Humans have genetically based unique abilities making complex culture possible; an assemblage of traits which we term "cultural capacity". The age of this capacity has long been subject to controversy. We apply phylogenetic principles to date this capacity, integrating evidence from archaeology, genetics, paleoanthropology, and linguistics. We show that cultural capacity is older than the first split in the modern human lineage, and at least 170,000 years old, based on data on hyoid bone morphology, FOXP2 alleles, agreement between genetic and language trees, fire use, burials, and the early appearance of tools comparable to those of modern hunter-gatherers. We cannot exclude that Neanderthals had cultural capacity some 500,000 years ago. A capacity for complex culture, therefore, must have existed before complex culture itself. It may even have originated long before. This seeming paradox is resolved by theoretical models suggesting that cultural evolution is exceedingly slow in its initial stages. PMID:23648831

  5. Implied alignment: a synapomorphy-based multiple-sequence alignment method and its use in cladogram search

    NASA Technical Reports Server (NTRS)

    Wheeler, Ward C.

    2003-01-01

    A method to align sequence data based on parsimonious synapomorphy schemes generated by direct optimization (DO; earlier termed optimization alignment) is proposed. DO directly diagnoses sequence data on cladograms without an intervening multiple-alignment step, thereby creating topology-specific, dynamic homology statements. Hence, no multiple-alignment is required to generate cladograms. Unlike general and globally optimal multiple-alignment procedures, the method described here, implied alignment (IA), takes these dynamic homologies and traces them back through a single cladogram, linking the unaligned sequence positions in the terminal taxa via DO transformation series. These "lines of correspondence" link ancestor-descendent states and, when displayed as linearly arrayed columns without hypothetical ancestors, are largely indistinguishable from standard multiple alignment. Since this method is based on synapomorphy, the treatment of certain classes of insertion-deletion (indel) events may be different from that of other alignment procedures. As with all alignment methods, results are dependent on parameter assumptions such as indel cost and transversion:transition ratios. Such an IA could be used as a basis for phylogenetic search, but this would be questionable since the homologies derived from the implied alignment depend on its natal cladogram and any variance, between DO and IA + Search, due to heuristic approach. The utility of this procedure in heuristic cladogram searches using DO and the improvement of heuristic cladogram cost calculations are discussed. © 2003 The Willi Hennig Society. Published by Elsevier Science (USA). All rights reserved.

  6. Implied alignment: a synapomorphy-based multiple-sequence alignment method and its use in cladogram search.

    PubMed

    Wheeler, Ward C

    2003-06-01

    A method to align sequence data based on parsimonious synapomorphy schemes generated by direct optimization (DO; earlier termed optimization alignment) is proposed. DO directly diagnoses sequence data on cladograms without an intervening multiple-alignment step, thereby creating topology-specific, dynamic homology statements. Hence, no multiple-alignment is required to generate cladograms. Unlike general and globally optimal multiple-alignment procedures, the method described here, implied alignment (IA), takes these dynamic homologies and traces them back through a single cladogram, linking the unaligned sequence positions in the terminal taxa via DO transformation series. These "lines of correspondence" link ancestor-descendent states and, when displayed as linearly arrayed columns without hypothetical ancestors, are largely indistinguishable from standard multiple alignment. Since this method is based on synapomorphy, the treatment of certain classes of insertion-deletion (indel) events may be different from that of other alignment procedures. As with all alignment methods, results are dependent on parameter assumptions such as indel cost and transversion:transition ratios. Such an IA could be used as a basis for phylogenetic search, but this would be questionable since the homologies derived from the implied alignment depend on its natal cladogram and any variance, between DO and IA + Search, due to heuristic approach. The utility of this procedure in heuristic cladogram searches using DO and the improvement of heuristic cladogram cost calculations are discussed. PMID:12901383

  7. Alignment-free analysis of barcode sequences by means of compression-based methods

    PubMed Central

    2013-01-01

    Background: The key idea of the DNA barcode initiative is to identify, for each group of species belonging to different kingdoms of life, a short DNA sequence that can act as a true taxon barcode. DNA barcodes represent a valuable type of information that can be integrated with ecological, genetic, and morphological data in order to obtain a more consistent taxonomy. Recent studies have shown that, for the animal kingdom, the mitochondrial gene cytochrome c oxidase I (COI), about 650 bp long, can be used as a barcode sequence for the identification and taxonomy of animals. In the present work we aim to introduce an alignment-free approach to the taxonomic analysis of barcode sequences. Our approach is based on two compression-based versions of the non-computable Universal Similarity Metric (USM) class of distances. Our purpose is to justify the use of USM for the analysis of short DNA barcode sequences as well, showing that USM can correctly extract taxonomic information from this kind of sequence. Results: We downloaded 30 datasets of barcode sequences belonging to different animal species from the Barcode of Life Data System (BOLD) database. We built phylogenetic trees for every dataset, according to compression-based and classic evolutionary methods, and compared them in terms of topology preservation. In the experimental tests, the similarity between evolutionary and compression-based trees was between 80% and 100% for most of the datasets (94%). Moreover, we carried out experimental tests using simulated barcode datasets composed of 100, 150, 200 and 500 sequences, with each simulation replicated 25-fold. In this case, mean similarity scores between evolutionary and compression-based trees ranged between 83% and 99% for all simulated datasets. Conclusions: In the present work we aim to introduce an alignment-free approach to the taxonomic analysis of barcode sequences. Our
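
    One widely used computable stand-in for the Universal Similarity Metric is the normalized compression distance (NCD), which replaces the non-computable complexity term with the output size of a real compressor. The sketch below shows that idea with zlib on simulated sequences of barcode length. Whether NCD with zlib matches either of the two variants used in the study is not stated in the abstract, so the compressor, the sequences, and the helper names are all assumptions.

```python
import random
import zlib

def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input."""
    return len(zlib.compress(data, 9))

def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)

def random_dna(length, rng):
    """Simulated sequence with uniformly random bases."""
    return "".join(rng.choice("ACGT") for _ in range(length))

def mutate(seq, positions):
    """Substitute a different base at each listed position."""
    bases = "ACGT"
    out = list(seq)
    for p in positions:
        out[p] = bases[(bases.index(out[p]) + 1) % 4]
    return "".join(out)

rng = random.Random(1)
seq_a = random_dna(650, rng)                # about the length of a COI barcode
seq_b = mutate(seq_a, range(30, 650, 60))   # close relative: 11 substitutions
seq_c = random_dna(650, rng)                # unrelated sequence

d_ab = ncd(seq_a.encode(), seq_b.encode())
d_ac = ncd(seq_a.encode(), seq_c.encode())
print(d_ab, d_ac)
```

    A matrix of such pairwise distances can then be passed to any distance-based tree builder, with no alignment step involved.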

  8. Molecular Evolution of Alternative Oxidase Proteins: A Phylogenetic and Structure Modeling Approach.

    PubMed

    Pennisi, Rosa; Salvi, Daniele; Brandi, Valentina; Angelini, Riccardo; Ascenzi, Paolo; Polticelli, Fabio

    2016-05-01

    Alternative oxidases (AOXs) are mitochondrial cyanide-resistant membrane-bound metallo-proteins catalyzing the oxidation of ubiquinol and the reduction of oxygen to water bypassing two sites of proton pumping, thus dissipating a major part of redox energy into heat. Here, the structure of Arabidopsis thaliana AOX 1A has been modeled using the crystal structure of Trypanosoma brucei AOX as a template. Analysis of this model and multiple sequence alignment of members of the AOX family from all kingdoms of Life indicate that AOXs display a high degree of conservation of the catalytic core, which is formed by a four-α-helix bundle, hosting the di-iron catalytic site, and is flanked by two additional α-helices anchoring the protein to the membrane. Plant AOXs display a peculiar covalent dimerization mode due to the conservation in the N-terminal region of a Cys residue forming the inter-monomer disulfide bond. The multiple sequence alignment has also been used to infer a phylogenetic tree of AOXs whose analysis shows a polyphyletic origin for the AOXs found in Fungi and a monophyletic origin of the AOXs of Eubacteria, Mycetozoa, Euglenozoa, Metazoa, and Land Plants. This suggests that AOXs evolved from a common ancestral protein in each of these kingdoms. Within the Plant AOX clade, the AOXs of monocotyledon plants form two distinct clades which have unresolved relationships relative to the monophyletic clade of the AOXs of dicotyledonous plants. This reflects the sequence divergence of the N-terminal region, probably due to a low selective pressure for sequence conservation linked to the covalent homo-dimerization mode. PMID:27090422

  9. Continental monophyly of cichlid fishes and the phylogenetic position of Heterochromis multidens.

    PubMed

    Keck, Benjamin P; Hulsey, C Darrin

    2014-04-01

    The incredibly species-rich cichlid fish faunas of both the Neotropics and Africa are generally thought to be reciprocally monophyletic. However, the phylogenetic affinity of the African cichlid Heterochromis multidens is ambiguous, and this distinct lineage could make African cichlids paraphyletic. In past studies, Heterochromis has been variously suggested to be one of the earliest diverging lineages within either the Neotropical or the African cichlid radiations, and it has even been hypothesized to be the sister lineage to a clade containing all Neotropical and African cichlids. We examined the phylogenetic relationships among a representative sample of cichlids with a dataset of 29 nuclear loci to assess the support for the different hypotheses of the phylogenetic position of Heterochromis. Although individual gene trees in some instances supported alternative relationships, a majority of gene trees, integration of genes into species trees, and hypothesis testing of putative topologies all supported Heterochromis as belonging to the clade of African cichlids. PMID:24472673

  10. Phylogenetic analysis of honey bee behavioral evolution.

    PubMed

    Raffiudin, Rika; Crozier, Ross H

    2007-05-01

    DNA sequences from three mitochondrial (rrnL, cox2, nad2) and one nuclear gene (itpr) from all 9 known honey bee species (Apis), a 10th possible species, Apis dorsata binghami, and three outgroup species (Bombus terrestris, Melipona bicolor and Trigona fimbriata) were used to infer Apis phylogenetic relationships using Bayesian analysis. The dwarf honey bees were confirmed as basal, and the giant and cavity-nesting species to be monophyletic. All nodes were strongly supported except that grouping Apis cerana with A. nigrocincta. Two thousand post-burnin trees from the phylogenetic analysis were used in a Bayesian comparative analysis to explore the evolution of dance type, nest structure, comb structure and dance sound within Apis. The ancestral honey bee species was inferred with high support to have nested in the open, and to have more likely than not had a silent vertical waggle dance and a single comb. The common ancestor of the giant and cavity-dwelling bees is strongly inferred to have had a buzzing vertical directional dance. All pairwise combinations of characters showed strong association, but the multiple comparisons problem reduces the ability to infer associations between states between characters. Nevertheless, a buzzing dance is significantly associated with cavity-nesting, several vertical combs, and dancing vertically, a horizontal dance is significantly associated with a nest with a single comb wrapped around the support, and open nesting with a single pendant comb and a silent waggle dance. PMID:17123837

  11. Phylesystem: a git-based data store for community-curated phylogenetic estimates

    PubMed Central

    McTavish, Emily Jane; Hinchliff, Cody E.; Allman, James F.; Brown, Joseph W.; Cranston, Karen A.; Rees, Jonathan A.; Smith, Stephen A.

    2015-01-01

    Motivation: Phylogenetic estimates from published studies can be archived using general platforms like Dryad (Vision, 2010) or TreeBASE (Sanderson et al., 1994). Such services fulfill a crucial role in ensuring transparency and reproducibility in phylogenetic research. However, digital tree data files often require some editing (e.g. rerooting) to improve the accuracy and reusability of the phylogenetic statements. Furthermore, establishing the mapping between tip labels used in a tree and taxa in a single common taxonomy dramatically improves the ability of other researchers to reuse phylogenetic estimates. As the process of curating a published phylogenetic estimate is not error-free, retaining a full record of the provenance of edits to a tree is crucial for openness, allowing editors to receive credit for their work and making errors introduced during curation easier to correct. Results: Here, we report the development of software infrastructure to support the open curation of phylogenetic data by the community of biologists. The backend of the system provides an interface for the standard database operations of creating, reading, updating and deleting records by making commits to a git repository. The record of the history of edits to a tree is preserved by git’s version control features. Hosting this data store on GitHub (http://github.com/) provides open access to the data store using tools familiar to many developers. We have deployed a server running the ‘phylesystem-api’, which wraps the interactions with git and GitHub. The Open Tree of Life project has also developed and deployed a JavaScript application that uses the phylesystem-api and other web services to enable input and curation of published phylogenetic statements. Availability and implementation: Source code for the web service layer is available at https://github.com/OpenTreeOfLife/phylesystem-api. The data store can be cloned from: https://github.com/OpenTreeOfLife/phylesystem. A web

  12. TreeQ-VISTA: An Interactive Tree Visualization Tool with Functional Annotation Query Capabilities

    SciTech Connect

    Gu, Shengyin; Anderson, Iain; Kunin, Victor; Cipriano, Michael; Minovitsky, Simon; Weber, Gunther; Amenta, Nina; Hamann, Bernd; Dubchak, Inna

    2007-05-07

    Summary: We describe a general multiplatform exploratory tool called TreeQ-Vista, designed for presenting functional annotations in a phylogenetic context. Traits, such as phenotypic and genomic properties, are interactively queried from a relational database with a user-friendly interface which provides a set of tools for users with or without SQL knowledge. The query results are projected onto a phylogenetic tree and can be displayed in multiple color groups. A rich set of browsing, grouping and query tools are provided to facilitate trait exploration, comparison and analysis. Availability: The program, detailed tutorial and examples are available online at http://genome-test.lbl.gov/vista/TreeQVista.

  13. [Phylogenetic relations between frit fly groups from the genus Meromyza based on genetic and morphological analysis].

    PubMed

    Triseleva, T A; Akent'eva, N A; Safonkin, A F

    2014-01-01

    Phylogenetic relations between groups of frit fly species from the genus Meromyza were studied on the mtDNA COI locus and on the morphology of the male reproductive apparatus. Branching of the phylogenetic tree constructed by the Neighbor-Joining method unites sequences of samples from species of the genus Meromyza in five clusters with high support. It was demonstrated that joining of species in a certain cluster corresponds to uniformity of morphological traits of male parameres. PMID:25731030

  14. SUMAC: Constructing Phylogenetic Supermatrices and Assessing Partially Decisive Taxon Coverage

    PubMed Central

    Freyman, William A.

    2015-01-01

    The amount of phylogenetically informative sequence data in GenBank is growing at an exponential rate, and large phylogenetic trees are increasingly used in research. Tools are needed to construct phylogenetic sequence matrices from GenBank data and evaluate the effect of missing data. Supermatrix Constructor (SUMAC) is a tool to data-mine GenBank, construct phylogenetic supermatrices, and assess the phylogenetic decisiveness of a matrix given the pattern of missing sequence data. SUMAC calculates a novel metric, Missing Sequence Decisiveness Scores (MSDS), which measures how much each individual missing sequence contributes to the decisiveness of the matrix. MSDS can be used to compare supermatrices and prioritize the acquisition of new sequence data. SUMAC constructs supermatrices either through an exploratory clustering of all GenBank sequences within a taxonomic group or by using guide sequences to build homologous clusters in a more targeted manner. SUMAC assembles supermatrices for any taxonomic group recognized in GenBank and is optimized to run on multicore computer systems by parallelizing multiple stages of operation. SUMAC is implemented as a Python package that can run as a stand-alone command-line program, or its modules and objects can be incorporated within other programs. SUMAC is released under the open source GPLv3 license and is available at https://github.com/wf8/sumac. PMID:26648681

  15. SUMAC: Constructing Phylogenetic Supermatrices and Assessing Partially Decisive Taxon Coverage.

    PubMed

    Freyman, William A

    2015-01-01

    The amount of phylogenetically informative sequence data in GenBank is growing at an exponential rate, and large phylogenetic trees are increasingly used in research. Tools are needed to construct phylogenetic sequence matrices from GenBank data and evaluate the effect of missing data. Supermatrix Constructor (SUMAC) is a tool to data-mine GenBank, construct phylogenetic supermatrices, and assess the phylogenetic decisiveness of a matrix given the pattern of missing sequence data. SUMAC calculates a novel metric, Missing Sequence Decisiveness Scores (MSDS), which measures how much each individual missing sequence contributes to the decisiveness of the matrix. MSDS can be used to compare supermatrices and prioritize the acquisition of new sequence data. SUMAC constructs supermatrices either through an exploratory clustering of all GenBank sequences within a taxonomic group or by using guide sequences to build homologous clusters in a more targeted manner. SUMAC assembles supermatrices for any taxonomic group recognized in GenBank and is optimized to run on multicore computer systems by parallelizing multiple stages of operation. SUMAC is implemented as a Python package that can run as a stand-alone command-line program, or its modules and objects can be incorporated within other programs. SUMAC is released under the open source GPLv3 license and is available at https://github.com/wf8/sumac. PMID:26648681

  16. Charles Darwin, beetles and phylogenetics.

    PubMed

    Beutel, Rolf G; Friedrich, Frank; Leschen, Richard A B

    2009-11-01

    Here, we review Charles Darwin's relation to beetles and developments in coleopteran systematics in the last two centuries. Darwin was an enthusiastic beetle collector. He used beetles to illustrate different evolutionary phenomena in his major works, and astonishingly, an entire sub-chapter is dedicated to beetles in "The Descent of Man". During his voyage on the Beagle, Darwin was impressed by the high diversity of beetles in the tropics, and he remarked that, to his surprise, the majority of species were small and inconspicuous. However, despite his obvious interest in the group, he did not get involved in beetle taxonomy, and his theoretical work had little immediate impact on beetle classification. The development of taxonomy and classification in the late nineteenth and earlier twentieth century was mainly characterised by the exploration of new character systems (e.g. larval features and wing venation). In the mid-twentieth century, Hennig's new methodology to group lineages by derived characters revolutionised systematics of Coleoptera and other organisms. As envisioned by Darwin and Ernst Haeckel, the new Hennigian approach enabled systematists to establish classifications truly reflecting evolution. Roy A. Crowson and Howard E. Hinton, who both made tremendous contributions to coleopterology, had an ambivalent attitude towards the Hennigian ideas. The Mickoleit school combined detailed anatomical work with a classical Hennigian character evaluation, with stepwise tree building, comparatively few characters and a priori polarity assessment without explicit use of the outgroup comparison method. The rise of cladistic methods in the 1970s had a strong impact on beetle systematics. Cladistic computer programs facilitated parsimony analyses of large data matrices, mostly morphological characters not requiring detailed anatomical investigations. Molecular studies on beetle phylogeny started in the 1990s with modest taxon sampling and limited DNA data. This has

  17. Charles Darwin, beetles and phylogenetics

    NASA Astrophysics Data System (ADS)

    Beutel, Rolf G.; Friedrich, Frank; Leschen, Richard A. B.

    2009-11-01

    Here, we review Charles Darwin’s relation to beetles and developments in coleopteran systematics in the last two centuries. Darwin was an enthusiastic beetle collector. He used beetles to illustrate different evolutionary phenomena in his major works, and astonishingly, an entire sub-chapter is dedicated to beetles in “The Descent of Man”. During his voyage on the Beagle, Darwin was impressed by the high diversity of beetles in the tropics, and he remarked that, to his surprise, the majority of species were small and inconspicuous. However, despite his obvious interest in the group, he did not get involved in beetle taxonomy, and his theoretical work had little immediate impact on beetle classification. The development of taxonomy and classification in the late nineteenth and earlier twentieth century was mainly characterised by the exploration of new character systems (e.g. larval features and wing venation). In the mid-twentieth century, Hennig’s new methodology to group lineages by derived characters revolutionised systematics of Coleoptera and other organisms. As envisioned by Darwin and Ernst Haeckel, the new Hennigian approach enabled systematists to establish classifications truly reflecting evolution. Roy A. Crowson and Howard E. Hinton, who both made tremendous contributions to coleopterology, had an ambivalent attitude towards the Hennigian ideas. The Mickoleit school combined detailed anatomical work with a classical Hennigian character evaluation, with stepwise tree building, comparatively few characters and a priori polarity assessment without explicit use of the outgroup comparison method. The rise of cladistic methods in the 1970s had a strong impact on beetle systematics. Cladistic computer programs facilitated parsimony analyses of large data matrices, mostly morphological characters not requiring detailed anatomical investigations. Molecular studies on beetle phylogeny started in the 1990s with modest taxon sampling and limited DNA data

  18. Diversity Dynamics in Nymphalidae Butterflies: Effect of Phylogenetic Uncertainty on Diversification Rate Shift Estimates

    PubMed Central

    Peña, Carlos; Espeland, Marianne

    2015-01-01

    The species rich butterfly family Nymphalidae has been used to study evolutionary interactions between plants and insects. Theories of insect-hostplant dynamics predict accelerated diversification due to key innovations. In evolutionary biology, analysis of maximum credibility trees in the software MEDUSA (modelling evolutionary diversity using stepwise AIC) is a popular method for estimation of shifts in diversification rates. We investigated whether phylogenetic uncertainty can produce different results by extending the method across a random sample of trees from the posterior distribution of a Bayesian run. Using the MultiMEDUSA approach, we found that phylogenetic uncertainty greatly affects diversification rate estimates. Different trees produced diversification rates ranging from high values to almost zero for the same clade, and both significant rate increase and decrease in some clades. Only four out of 18 significant shifts found on the maximum clade credibility tree were consistent across most of the sampled trees. Among these, we found accelerated diversification for Ithomiini butterflies. We used the binary speciation and extinction model (BiSSE) and found that a hostplant shift to Solanaceae is correlated with increased net diversification rates in Ithomiini, congruent with the diffuse cospeciation hypothesis. Our results show that taking phylogenetic uncertainty into account when estimating net diversification rate shifts is of great importance, as very different results can be obtained when using the maximum clade credibility tree and other trees from the posterior distribution. PMID:25830910

  19. Diversity dynamics in Nymphalidae butterflies: effect of phylogenetic uncertainty on diversification rate shift estimates.

    PubMed

    Peña, Carlos; Espeland, Marianne

    2015-01-01

    The species rich butterfly family Nymphalidae has been used to study evolutionary interactions between plants and insects. Theories of insect-hostplant dynamics predict accelerated diversification due to key innovations. In evolutionary biology, analysis of maximum credibility trees in the software MEDUSA (modelling evolutionary diversity using stepwise AIC) is a popular method for estimation of shifts in diversification rates. We investigated whether phylogenetic uncertainty can produce different results by extending the method across a random sample of trees from the posterior distribution of a Bayesian run. Using the MultiMEDUSA approach, we found that phylogenetic uncertainty greatly affects diversification rate estimates. Different trees produced diversification rates ranging from high values to almost zero for the same clade, and both significant rate increase and decrease in some clades. Only four out of 18 significant shifts found on the maximum clade credibility tree were consistent across most of the sampled trees. Among these, we found accelerated diversification for Ithomiini butterflies. We used the binary speciation and extinction model (BiSSE) and found that a hostplant shift to Solanaceae is correlated with increased net diversification rates in Ithomiini, congruent with the diffuse cospeciation hypothesis. Our results show that taking phylogenetic uncertainty into account when estimating net diversification rate shifts is of great importance, as very different results can be obtained when using the maximum clade credibility tree and other trees from the posterior distribution. PMID:25830910
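
The abstract above contrasts shifts detected on a single maximum clade credibility tree with shifts that recur across trees sampled from the posterior. The following is a minimal Python sketch of that bookkeeping step only, not the MultiMEDUSA implementation: it assumes the shift detection has already been run on each sampled tree and that the results are available as one set of clade labels per tree. The function names, the 0.5 threshold, and the clade labels are hypothetical.

```python
from collections import Counter


def shift_frequencies(shifts_per_tree):
    """Return, for each clade, the fraction of sampled trees with a shift there.

    shifts_per_tree: list with one iterable of clade labels per sampled tree.
    """
    n_trees = len(shifts_per_tree)
    if n_trees == 0:
        raise ValueError("need at least one sampled tree")
    # set() guards against a clade being listed twice for the same tree.
    counts = Counter(clade for shifts in shifts_per_tree for clade in set(shifts))
    return {clade: count / n_trees for clade, count in counts.items()}


def consistent_shifts(shifts_per_tree, threshold=0.5):
    """Clades whose shift appears in more than `threshold` of the trees."""
    freqs = shift_frequencies(shifts_per_tree)
    return sorted(clade for clade, freq in freqs.items() if freq > threshold)


# Made-up input: shifts detected on four sampled trees (labels are placeholders).
sampled = [
    {"clade_X", "clade_Y"},
    {"clade_X"},
    {"clade_X", "clade_Z"},
    {"clade_Y"},
]
print(shift_frequencies(sampled))
print(consistent_shifts(sampled))
```

A strict "more than half" threshold is one possible reading of "consistent across most of the sampled trees"; the abstract does not state the criterion the authors used.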

  20. RBT-GA: a novel metaheuristic for solving the multiple sequence alignment problem

    PubMed Central

    Taheri, Javid; Zomaya, Albert Y

    2009-01-01

    Background: Multiple Sequence Alignment (MSA) has always been an active area of research in Bioinformatics. MSA is mainly focused on discovering biologically meaningful relationships among different sequences or proteins in order to investigate the underlying main characteristics/functions. This information is also used to generate phylogenetic trees. Results: This paper presents a novel approach, namely RBT-GA, to solve the MSA problem using a hybrid solution methodology combining the Rubber Band Technique (RBT) and the Genetic Algorithm (GA) metaheuristic. RBT is inspired by the behavior of an elastic Rubber Band (RB) on a plate with several poles, which is analogous to locations in the input sequences that could potentially be biologically related. A GA attempts to mimic the evolutionary processes of life in order to locate optimal solutions in an often very complex landscape. RBT-GA is a population-based optimization algorithm designed to find the optimal alignment for a set of input protein sequences. In this novel technique, each alignment answer is modeled as a chromosome consisting of several poles in the RBT framework. These poles resemble locations in the input sequences that are most likely to be correlated and/or biologically related. A GA-based optimization process improves these chromosomes gradually, yielding a set of mostly optimal answers for the MSA problem. Conclusion: RBT-GA is tested with one of the well-known benchmark suites (BALiBASE 2.0) in this area. The obtained results show the superiority of the proposed technique, even in the case of formidable sequences. PMID:19594869
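
A genetic algorithm for MSA needs a fitness function to rank candidate alignments. The abstract does not say which objective RBT-GA optimizes, so the sketch below uses the sum-of-pairs score, a commonly used MSA objective, purely as an illustration. The match, mismatch, and gap values are arbitrary assumptions; practical protein aligners would typically use a substitution matrix and affine gap penalties instead.

```python
from itertools import combinations


def sum_of_pairs(alignment, match=1, mismatch=-1, gap=-2):
    """Sum-of-pairs score of an alignment given as equal-length gapped strings.

    Every pair of rows is scored column by column; a column pair made of
    two gaps contributes nothing.
    """
    if not alignment:
        return 0
    width = len(alignment[0])
    if any(len(row) != width for row in alignment):
        raise ValueError("all rows of an alignment must have the same length")
    score = 0
    for column in zip(*alignment):
        for a, b in combinations(column, 2):
            if a == "-" and b == "-":
                continue  # gap aligned with gap: neutral
            if a == "-" or b == "-":
                score += gap
            elif a == b:
                score += match
            else:
                score += mismatch
    return score


# Toy candidate alignments (made-up sequences); a GA would keep the fitter one.
candidate_a = ["ACGT", "AC-T"]
candidate_b = ["ACGT", "A-CT"]
print(sum_of_pairs(candidate_a), sum_of_pairs(candidate_b))
```

In a GA setting, each chromosome would be decoded into an alignment and scored this way, with selection favoring higher scores.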

  1. Phylogenetic relationship of Wolverine Gulo gulo in Mustelidae revealed by complete mitochondrial genome.

    PubMed

    Zhu, Shibing; Gao, Yingying; Liu, Hui; Zhang, Shifang; Bai, Xiaojie; Zhang, Minghai

    2016-07-01

    The wolverine (Gulo gulo) is an endangered species in China. We obtained a blood sample, extracted its DNA, and sequenced the whole mtDNA genome of a wolverine from Northeast China. We built a phylogenetic tree of the wolverine and 10 other closely related Mustelidae species. The wolverine's complete mitogenome is 16 575 bp in length and includes 13 protein-coding genes, 22 tRNA genes, 2 rRNA genes and one control region. The phylogenetic tree indicates that the wolverine is most closely related to the genus Martes. PMID:26702734

  2. Phylogenetic relationship of Eurasian lynx (Lynx lynx) revealed by complete mitochondrial genome.

    PubMed

    Ning, Yao; Liu, Hui; Jiang, Guangshun; Ma, Jianzhang

    2016-09-01

    The Eurasian lynx (Lynx lynx) is an endangered species in northeast China. We obtained a muscle sample, extracted its DNA, and sequenced the whole mtDNA genome of a lynx from northeast China. We reconstructed the phylogenetic tree of the Eurasian lynx and 10 other closely related Felidae species. This lynx's complete mitogenome is 17 054 bp in length and includes 13 protein-coding genes, 22 tRNA genes, 2 rRNA genes and one control region. The phylogenetic tree confirmed previous research results. PMID:26195214

  3. Horizontal carbon nanotube alignment.

    PubMed

    Cole, Matthew T; Cientanni, Vito; Milne, William I

    2016-09-21

    The production of horizontally aligned carbon nanotubes offers a rapid means of realizing a myriad of self-assembled near-atom-scale technologies - from novel photonic crystals to nanoscale transistors. The ability to reproducibly align anisotropic nanostructures has huge technological value. Here we review the present state-of-the-art in horizontal carbon nanotube alignment. For both in and ex situ approaches, we quantitatively assess the reported linear packing densities alongside the degree of alignment possible for each of these core methodologies. PMID:27546174

  4. Orthodontics and Aligners

    MedlinePlus

    ... Straighten teeth for a healthier smile. Orthodontics: When consumers think about orthodontics, braces are the ...

  5. Alignability of Optical Interconnects

    NASA Astrophysics Data System (ADS)

    Beech, Russell Scott

    With the continuing drive towards higher speed, density, and functionality in electronics, electrical interconnects become inadequate. Due to optics' high speed and bandwidth, freedom from capacitive loading effects, and freedom from crosstalk, optical interconnects can meet more stringent interconnect requirements. However, an optical interconnect requires additional components, such as an optical source and detector, lenses, holographic elements, etc. Fabrication and assembly of an optical interconnect requires precise alignment of these components. The successful development and deployment of optical interconnects depend on how easily the interconnect components can be aligned and/or how tolerant the interconnect is to misalignments. In this thesis, a method of quantitatively specifying the relative difficulty of properly aligning an optical interconnect is described. Ways of using this theory of alignment to obtain design and packaging guidelines for optical interconnects are examined. The measure of the ease with which an optical interconnect can be aligned, called the alignability, uses the efficiency of power transfer as a measure of alignment quality. The alignability is related to interconnect package design through the overall cost measure, which depends upon various physical parameters of the interconnect, such as the cost of the components and the time required for fabrication and alignment. Through a mutual dependence on detector size, the relationship between an interconnect's alignability and its bandwidth, signal-to-noise ratio, and bit-error rate is examined. The results indicate that a range of device sizes exists for which given performance threshold values are satisfied. Next, the alignability of integrated planar-optic backplanes is analyzed in detail. The resulting data show that the alignability can be optimized by varying the substrate thickness or the angle of reflection. By including the effects of crosstalk, in a multi-channel backplane, the

  6. Tidal alignment of galaxies

    NASA Astrophysics Data System (ADS)

    Blazek, Jonathan; Vlah, Zvonimir; Seljak, Uroš

    2015-08-01

    We develop an analytic model for galaxy intrinsic alignments (IA) based on the theory of tidal alignment. We calculate all relevant nonlinear corrections at one-loop order, including effects from nonlinear density evolution, galaxy biasing, and source density weighting. Contributions from density weighting are found to be particularly important and lead to bias dependence of the IA amplitude, even on large scales. This effect may be responsible for much of the luminosity dependence in IA observations. The increase in IA amplitude for more highly biased galaxies reflects their locations in regions with large tidal fields. We also consider the impact of smoothing the tidal field on halo scales. We compare the performance of this consistent nonlinear model in describing the observed alignment of luminous red galaxies with the linear model as well as the frequently used "nonlinear alignment model," finding a significant improvement on small and intermediate scales. We also show that the cross-correlation between density and IA (the "GI" term) can be effectively separated into source alignment and source clustering, and we accurately model the observed alignment down to the one-halo regime using the tidal field from the fully nonlinear halo-matter cross correlation. Inside the one-halo regime, the average alignment of galaxies with density tracers no longer follows the tidal alignment prediction, likely reflecting nonlinear processes that must be considered when modeling IA on these scales. Finally, we discuss tidal alignment in the context of cosmic shear measurements.

  7. Taxonomic review and phylogenetic analysis of Enchodontoidei.

    PubMed

    Silva, Hilda M A; Gallo, Valéria

    2011-06-01

    Enchodontoidei are extinct marine teleost fishes with a long temporal range and a wide geographic distribution. As there has been no comprehensive phylogenetic study of this taxon, we performed a parsimony analysis using a data matrix with 87 characters, 31 terminal taxa for ingroup, and three taxa for outgroup. The analysis produced 93 equally parsimonious trees (L = 437 steps; CI = 0.24; RI = 0.49). The topology of the majority rule consensus tree was: (Sardinioides + Hemisaurida + (Nardorex + (Atolvorator + (Protostomias + Yabrudichthys) + (Apateopholis + (Serrilepis + (Halec + Phylactocephalus) + (Cimolichthys + (Prionolepis + ((Eurypholis + Saurorhamphus) + (Enchodus + (Paleolycus + Parenchodus))))))) + ((Ichthyotringa + Apateodus) + (Rharbichthys + (Trachinocephalus + ((Apuliadercetis + Brazilodercetis) + (Benthesikyme + (Cyranichthys + Robertichthys) + (Dercetis + Ophidercetis)) + (Caudadercetis + (Pelargorhynchus + (Nardodercetis + (Rhynchodercetis + (Dercetoides + Hastichthys)))))). The group Enchodontoidei is not monophyletic. Dercetidae form a clade supported by the presence of very reduced neural spines and possess a new composition. Enchodontidae are monophyletic by the presence of middorsal scutes, and Rharbichthys was excluded. Halecidae possess a new composition, with the exclusion of Hemisaurida. This taxon and Nardorex are Aulopiformes incertae sedis. PMID:21670874
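
The abstract reports tree length (L), consistency index (CI), and retention index (RI) without defining them. As a reminder of the standard ensemble definitions, CI = M / S and RI = (G - S) / (G - M), where S is the observed number of steps on the tree (the tree length), M is the minimum conceivable number of steps, and G is the maximum conceivable number of steps, each summed over all characters. The Python sketch below simply encodes those two formulas; the function names are illustrative, and the numbers in the usage lines are made up rather than taken from the study.

```python
def consistency_index(min_steps, observed_steps):
    """Ensemble consistency index: CI = M / S."""
    if observed_steps <= 0 or min_steps > observed_steps:
        raise ValueError("require 0 < observed_steps and min_steps <= observed_steps")
    return min_steps / observed_steps


def retention_index(min_steps, max_steps, observed_steps):
    """Ensemble retention index: RI = (G - S) / (G - M)."""
    if max_steps <= min_steps:
        raise ValueError("require max_steps > min_steps")
    return (max_steps - observed_steps) / (max_steps - min_steps)


# Toy values (not from the abstract): M = 10, S = 20, G = 30.
print(consistency_index(10, 20))
print(retention_index(10, 30, 20))
```

Both indices equal 1 when the data fit the tree without homoplasy; lower values indicate more homoplasy.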

  8. Phylogenetics and the Human Microbiome

    PubMed Central

    Matsen, Frederick A.

    2015-01-01

    The human microbiome is the ensemble of genes in the microbes that live inside and on the surface of humans. Because microbial sequencing information is now much easier to come by than phenotypic information, there has been an explosion of sequencing and genetic analysis of microbiome samples. Much of the analytical work for these sequences involves phylogenetics, at least indirectly, but methodology has developed in a somewhat different direction than for other applications of phylogenetics. In this article, I review the field and its methods from the perspective of a phylogeneticist, as well as describing current challenges for phylogenetics coming from this type of work. PMID:25102857

  9. A comprehensive aligned nifH gene database: a multipurpose tool for studies of nitrogen-fixing bacteria

    PubMed Central

    Gaby, John Christian; Buckley, Daniel H.

    2014-01-01

    We describe a nitrogenase gene sequence database that facilitates analysis of the evolution and ecology of nitrogen-fixing organisms. The database contains 32 954 aligned nitrogenase nifH sequences linked to phylogenetic trees and associated sequence metadata. The database includes 185 linked multigene entries including full-length nifH, nifD, nifK and 16S ribosomal RNA (rRNA) gene sequences. Evolutionary analyses enabled by the multigene entries support an ancient horizontal transfer of nitrogenase genes between Archaea and Bacteria and provide evidence that nifH has a different history of horizontal gene transfer from the nifDK enzyme core. Further analyses show that lineages in nitrogenase cluster I and cluster III have different rates of substitution within nifD, suggesting that nifD is under different selection pressure in these two lineages. Finally, we find that the genetic divergence of nifH and 16S rRNA genes does not correlate well at sequence dissimilarity values used commonly to define microbial species, as strains having <3% sequence dissimilarity in their 16S rRNA genes can have up to 23% dissimilarity in nifH. The nifH database has a number of uses including phylogenetic and evolutionary analyses, the design and assessment of primers/probes and the evaluation of nitrogenase sequence diversity. Database URL: http://www.css.cornell.edu/faculty/buckley/nifh.htm PMID:24501396
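
The comparison of 16S rRNA and nifH divergence rests on pairwise sequence dissimilarity between aligned sequences. The abstract does not specify how dissimilarity was computed, so the following Python sketch shows one simple convention as an assumption: the proportion of differing sites among the columns where neither sequence has a gap. The function name and the toy sequences are hypothetical.

```python
def dissimilarity(seq_a, seq_b, gap="-"):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap are skipped (an assumption;
    other conventions count gaps as differences).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return mismatches / compared


# Toy aligned fragments (made up): one difference over ten compared sites.
print(dissimilarity("ACGTACGTAC", "ACGTACGTAA"))
```

Applied to two genes from the same pair of strains, a function like this would yield the kind of paired values the abstract compares (for example, a low 16S rRNA dissimilarity alongside a much higher nifH dissimilarity).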

  10. Another origin of coloniality in volvocaleans: the phylogenetic position of Pyrobotrys arnoldi (Spondylomoraceae, Volvocales).

    PubMed

    Nakada, Takashi; Nozaki, Hisayoshi; Tomita, Masaru

    2010-01-01

    Colonial volvocaleans (Chlorophyceae) are used as a standard model of multicellular evolution. However, the phylogenetic position of the colonial volvocalean family Spondylomoraceae has yet to be resolved. To examine this, the molecular phylogenies of Pyrobotrys stellata and Pyrobotrys squarrosa were analyzed using combined 18S rRNA, RUBISCO large subunit, and P700 chl a-apoprotein A2 gene sequences. In the phylogenetic trees, Pyrobotrys belonged to the clade Caudivolvoxa and was not closely related to other colonial volvocalean flagellates. The results indicate that colony formation of Spondylomoraceae independently evolved from unicellular volvocaleans. The phylogenetic position of problematic "Pascherina tetras" SAG 159-1 was also analyzed. PMID:20553352

  11. The more, the better: the use of multiple landmark configurations to solve the phylogenetic relationships in musteloids.

    PubMed

    Catalano, Santiago A; Ercoli, Marcos D; Prevosti, Francisco J

    2015-03-01

    Although the use of landmark data to study shape changes along a phylogenetic tree has become a common practice in evolutionary studies, the role of this sort of data for the inference of phylogenetic relationships remains under debate. Theoretical issues aside, the very existence of historical information in landmark data has been challenged, since phylogenetic analyses have often shown little congruence with alternative sources of evidence. However, most analyses conducted in the past were based upon a single landmark configuration, leaving it unsettled whether the incorporation of multiple configurations may improve the rather poor performance of this data source in most previous phylogenetic analyses. In the present study, we present a phylogenetic analysis of landmark data that combines information derived from several skeletal structures to derive a phylogenetic tree for musteloids. The analysis includes nine configurations representing different skeletal structures for 24 species. The resulting tree presents several notable concordances with phylogenetic hypotheses derived from molecular data. In particular, Mephitidae, Procyonidae, and Lutrinae plus the genera Martes, Mustela, Galictis, and Procyon were retrieved as monophyletic. In addition, other groupings were in agreement with molecular phylogenies or presented only minor discordances. Complementary analyses have also indicated that the results improve substantially when an increasing number of landmark configurations are included in the analysis. The results presented here thus highlight the importance of combining information from multiple structures to derive phylogenetic hypotheses from landmark data. PMID:25516268

  12. Technical Tree Climbing.

    ERIC Educational Resources Information Center

    Jenkins, Peter

    Tree climbing offers a safe, inexpensive adventure sport that can be performed almost anywhere. Using standard procedures practiced in tree surgery or rock climbing, almost any tree can be climbed. Tree climbing provides challenge and adventure as well as a vigorous upper-body workout. Tree Climbers International classifies trees using a system…

  13. Symbiosis between hydra and chlorella: molecular phylogenetic analysis and experimental study provide insight into its origin and evolution.

    PubMed

    Kawaida, Hitomi; Ohba, Kohki; Koutake, Yuhki; Shimizu, Hiroshi; Tachida, Hidenori; Kobayakawa, Yoshitaka

    2013-03-01

    Although many physiological studies have been reported on the symbiosis between hydra and green algae, very little information from a molecular phylogenetic aspect of symbiosis is available. In order to understand the origin and evolution of symbiosis between the two organisms, we compared the phylogenetic relationships among symbiotic green algae with the phylogenetic relationships among host hydra strains. To do so, we reconstructed molecular phylogenetic trees of several strains of symbiotic chlorella harbored in the endodermal epithelial cells of viridissima group hydra strains and investigated their congruence with the molecular phylogenetic trees of the host hydra strains. To examine the species specificity between the host and the symbiont with respect to the genetic distance, we also tried to introduce chlorella strains into two aposymbiotic strains of viridissima group hydra in which symbiotic chlorella had been eliminated in advance. We discussed the origin and history of symbiosis between hydra and green algae based on the analysis. PMID:23219706

  14. Phylogenetic stratigraphy in the Guerrero Negro hypersaline microbial mat

    PubMed Central

    Kirk Harris, J; Gregory Caporaso, J; Walker, Jeffrey J; Spear, John R; Gold, Nicholas J; Robertson, Charles E; Hugenholtz, Philip; Goodrich, Julia; McDonald, Daniel; Knights, Dan; Marshall, Paul; Tufo, Henry; Knight, Rob; Pace, Norman R

    2013-01-01

    The microbial mats of Guerrero Negro (GN), Baja California Sur, Mexico historically were considered a simple environment, dominated by cyanobacteria and sulfate-reducing bacteria. Culture-independent rRNA community profiling instead revealed these microbial mats as among the most phylogenetically diverse environments known. A preliminary molecular survey of the GN mat based on only ∼1500 small subunit rRNA gene sequences discovered several new phylum-level groups in the bacterial phylogenetic domain and many previously undetected lower-level taxa. We determined an additional ∼119 000 nearly full-length sequences and 28 000 >200 nucleotide 454 reads from a 10-layer depth profile of the GN mat. With this unprecedented coverage of long sequences from one environment, we confirm the mat is phylogenetically stratified, presumably corresponding to light and geochemical gradients throughout the depth of the mat. Previous shotgun metagenomic data from the same depth profile show the same stratified pattern and suggest that metagenome properties may be predictable from rRNA gene sequences. We verify previously identified novel lineages and identify new phylogenetic diversity at lower taxonomic levels, for example, thousands of operational taxonomic units at the family-genus levels differ considerably from known sequences. The new sequences populate parts of the bacterial phylogenetic tree that previously were poorly described, but indicate that any comprehensive survey of GN diversity has only begun. Finally, we show that taxonomic conclusions are generally congruent between Sanger and 454 sequencing technologies, with the taxonomic resolution achieved dependent on the abundance of reference sequences in the relevant region of the rRNA tree of life. PMID:22832344

  15. Phylogenetic analysis based on full-length large subunit ribosomal RNA gene sequence comparison reveals that Neospora caninum is more closely related to Hammondia heydorni than to Toxoplasma gondii.

    PubMed

    Mugridge, N B; Morrison, D A; Heckeroth, A R; Johnson, A M; Tenter, A M

    1999-10-01

    Since its first description in the late 1980s, Neospora caninum has been recognised as a prominent tissue cyst-forming parasite due to its ability to induce congenital disease and abortion in animals, especially cattle. It is found worldwide and is a cause of significant economic losses for the livestock industry. However, its place within the family Sarcocystidae, like that of several other taxa, remains unresolved. Neospora caninum shares several morphological and life cycle characters with Hammondia heydorni, although it is most commonly thought of as being a close relative of Toxoplasma gondii. This study presents information regarding the phylogenetic relationship of N. caninum to species currently classified into the genus Hammondia, as well as to two strains (RH and ME49) of T. gondii based on the full-length large subunit ribosomal RNA gene. Phylogenetic analyses using two alignment strategies and three different tree-building methods showed that the two species in the genus Hammondia are paraphyletic. Neospora caninum was shown to form a monophyletic clade with H. heydorni instead of T. gondii, which in turn was shown to be most closely related to H. hammondi. The finding that N. caninum and H. heydorni are closely related phylogenetically may aid the elucidation of currently unknown aspects of their biology and epidemiology, and suggests that H. heydorni should be considered in the differential diagnosis of N. caninum from other apicomplexan parasites. PMID:10608441

  16. Snake mitochondrial genomes: phylogenetic relationships and implications of extended taxon sampling for interpretations of mitogenomic evolution

    PubMed Central

    2010-01-01

    Background Snake mitochondrial genomes are of great interest in understanding mitogenomic evolution because of gene duplications and rearrangements and the fast evolutionary rate of their genes compared to other vertebrates. Mitochondrial gene sequences have also played an important role in attempts to resolve the contentious phylogenetic relationships of especially the early divergences among alethinophidian snakes. Two recent innovative studies found dramatic gene- and branch-specific relative acceleration in snake protein-coding gene evolution, particularly along internal branches leading to Serpentes and Alethinophidia. It has been hypothesized that some of these rate shifts are temporally (and possibly causally) associated with control region duplication and/or major changes in ecology and anatomy. Results The near-complete mitochondrial (mt) genomes of three henophidian snakes were sequenced: Anilius scytale, Rhinophis philippinus, and Charina trivirgata. All three genomes share a duplicated control region and translocated tRNALEU, derived features found in all alethinophidian snakes studied to date. The new sequence data were aligned with mt genome data for 21 other species of snakes and used in phylogenetic analyses. Phylogenetic results agreed with many other studies in recovering several robust clades, including Colubroidea, Caenophidia, and Cylindrophiidae+Uropeltidae. Nodes within Henophidia that have been difficult to resolve robustly in previous analyses remained uncompellingly resolved here. Comparisons of relative rates of evolution of rRNA vs. protein-coding genes were conducted by estimating branch lengths across the tree. Our expanded sampling revealed dramatic acceleration along the branch leading to Typhlopidae, particularly long rRNA terminal branches within Scolecophidia, and that most of the dramatic acceleration in protein-coding gene rate along Serpentes and Alethinophidia branches occurred before Anilius diverged from other

  17. Gene trees versus species trees: reassessing life-history evolution in a freshwater fish radiation.

    PubMed

    Waters, Jonathan M; Rowe, Diane L; Burridge, Christopher P; Wallis, Graham P

    2010-10-01

    Mechanisms of speciation are best understood in the context of phylogenetic relationships and as such have often been inferred from single gene trees, typically those derived from mitochondrial DNA (mtDNA) markers. Recent studies, however, have noted the potential for phylogenetic discordance between gene trees and underlying species trees (e.g., due to stochastic lineage sorting, introgression, or selection). Here, we employ a variety of nuclear DNA loci to reassess evolutionary relationships within a recent freshwater fish radiation to reappraise modes of speciation. New Zealand's freshwater-limited Galaxias vulgaris complex is thought to have evolved from G. brevipinnis, a widespread migratory species that retains a plesiomorphic marine juvenile phase. A well-resolved tree, based on four mtDNA regions, previously suggested that marine migratory ability has been lost on 3 independent occasions in the evolution of this species flock (assuming that loss of diadromy is irreversible). Here, we use pseudogene (galaxiid Numt: 1801 bp), intron (S: 903 bp), and exon (RAG-1: 1427 bp) markers, together with mtDNA, to reevaluate this hypothesis of parallel evolution. Interestingly, partitioned Bayesian analysis of concatenated nuclear sequences (3141 bp) and concatenated nuclear and mtDNA (4770 bp) both recover phylogenies implying a single loss of diadromy, not three parallel losses as previously inferred from mtDNA alone. This phylogenetic result is reinforced by a multilocus analysis performed using Bayesian estimation of species trees (BEST) software that estimates the posterior distribution of species trees under a coalescent model. We discuss factors that might explain the apparently misleading phylogenetic inferences generated by mtDNA. PMID:20603441

  18. Phylogenetic relationships of Russian far eastern flatfish (Pleuronectiformes, Pleuronectidae) based on two mitochondrial gene sequences, Co-1 and Cyt-b, with inferences in order phylogeny using complete mitogenome data.

    PubMed

    Kartavtsev, Yuri Phedorovich; Sharina, Svetlana N; Saitoh, Kenji; Imoto, Junichi M; Hanzawa, Naoto; Redin, Alexander D

    2016-01-01

    The systematics and phylogeny of flatfish are investigated using the complete nucleotide sequences of the cytochrome c oxidase subunit 1 (Co-1) and cytochrome b (Cyt-b) genes. In total, 17 species from our collection and some species from GenBank were analyzed. Four types of trees were built: Bayesian (BA), maximum likelihood (ML), maximum parsimony (MP), and neighbor joining (NJ). These trees showed similar topology. Two separate clusters on the trees support the subdivision into subfamilies Hippoglossoidinae and Hippoglossinae and the monophyletic status of these taxa. The subfamily Pleuronectinae can also be considered monophyletic if the tribe Microstomini is excluded from it and the genus Lepidopsetta is moved into the tribe Pleuronectini. Mitogenomes represented by 25 complete sequences from NCBI GenBank were analyzed. After alignment, two sets of nucleotide sequences were formed and investigated independently. One set included 13 structural genes (14,886 bp); the second set comprised the mtDNA without the ND6 gene (10,457 bp). Both data sets give a congruent phylogenetic signal that agrees with conventional views on the taxonomy of the order Pleuronectiformes; however, the first set gives better topology. In the BA gene tree there are two well-supported nodes, which include the representatives of suborders Pleuronectoidei and Psettoidei. Within Pleuronectoidei, two superfamilies, Pleuronectoidea and Soleidea, are highly supported in BA, but across all four kinds of gene trees (BA, ML, MP and NJ) only the superfamily Pleuronectoidea is well supported. At the top of the hierarchy, all flatfishes belonging to the order Pleuronectiformes also form a monophyletic clade in our data, with a support level of 100%, but in the BA tree only. The monophyly of the family Pleuronectidae is well supported both by single gene data and by complete mtDNA sequences. PMID:24841433

  19. SPEAR3 Construction Alignment

    SciTech Connect

    LeCocq, Catherine; Banuelos, Cristobal; Fuss, Brian; Gaudreault, Francis; Gaydosh, Michael; Griffin, Levirt; Imfeld, Hans; McDougal, John; Perry, Michael; Rogers, Michael; /SLAC

    2005-08-17

    An ambitious seven-month shutdown of the existing SPEAR2 synchrotron radiation facility was successfully completed in March 2004 when the first synchrotron light was observed in the new SPEAR3 ring. SPEAR3 completely replaced SPEAR2 with new components aligned on a new highly-flat concrete floor. Devices such as magnets and vacuum chambers had to be fiducialized and later aligned on girder rafts that were then placed into the ring over pre-aligned support plates. Key to the success of aligning this new ring was to ensure that the new beam orbit matched the old SPEAR2 orbit so that existing experimental beamlines would not have to be reoriented. In this presentation a pictorial summary of the Alignment Engineering Group's surveying tasks for the construction of the SPEAR3 ring is provided. Details on the networking and analysis of various surveys throughout the project can be found in the accompanying paper.