Sample records for accurate phylogenetic classification

  1. Accurate phylogenetic classification of DNA fragments based on sequence composition

    SciTech Connect

    McHardy, Alice C.; Garcia Martin, Hector; Tsirigos, Aristotelis; Hugenholtz, Philip; Rigoutsos, Isidore

    2006-05-01

    Metagenome studies have retrieved vast amounts of sequence out of a variety of environments, leading to novel discoveries and great insights into the uncultured microbial world. Except for very simple communities, diversity makes sequence assembly and analysis a very challenging problem. To understand the structure and function of microbial communities, a taxonomic characterization of the obtained sequence fragments is highly desirable, yet currently limited mostly to those sequences that contain phylogenetic marker genes. We show that for clades at the rank of domain down to genus, sequence composition allows the very accurate phylogenetic characterization of genomic sequence. We developed a composition-based classifier, PhyloPythia, for de novo phylogenetic sequence characterization and have trained it on a data set of 340 genomes. By extensive evaluation experiments we show that the method is accurate across all taxonomic ranks considered, even for sequences that originate from novel organisms and are as short as 1 kb. Application to two metagenome datasets obtained from samples of phosphorus-removing sludge showed that the method allows the accurate classification at genus level of most sequence fragments from the dominant populations, while at the same time correctly characterizing even larger parts of the samples at higher taxonomic levels.
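
    To make "sequence composition" concrete, the sketch below turns a DNA fragment into a vector of normalized k-mer frequencies, the kind of feature a composition-based classifier could be trained on. It is only an illustration: the function name `kmer_composition`, the default `k=4`, and the choice to skip windows containing non-ACGT characters are assumptions, not details taken from the PhyloPythia abstract.

```python
from collections import Counter
from itertools import product


def kmer_composition(sequence, k=4):
    """Return normalized k-mer frequencies of a DNA fragment (hypothetical helper).

    Windows containing characters other than A, C, G, T are skipped.
    """
    alphabet = "ACGT"
    seq = sequence.upper()
    # Count every overlapping window of length k that uses only A, C, G, T.
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= set(alphabet)
    )
    total = sum(counts.values())
    # Fixed ordering over all 4**k possible k-mers gives a comparable vector.
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    if total == 0:
        return {kmer: 0.0 for kmer in kmers}
    return {kmer: counts[kmer] / total for kmer in kmers}


if __name__ == "__main__":
    composition = kmer_composition("ACGTACGTTTGACCA", k=2)
    # Show only the k-mers that actually occur in the fragment.
    print({kmer: round(freq, 3) for kmer, freq in composition.items() if freq > 0})
```

    Vectors like this, computed for fragments of known origin, could then be fed to any standard supervised classifier; which classifier and which k the authors used is not stated in the abstract.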

  2. A Higher-Level Phylogenetic Classification of the Fungi

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A comprehensive phylogenetic classification of the Fungi is proposed, with reference to recent molecular phylogenetic analyses, and with input from members of the fungal taxonomic community. The classification includes 196 taxa, down to the level of order, of which 23 are described or are validated ...

  3. Phylogenetics and classification of the pantropical fern family Lindsaeaceae

    E-print Network

    Phylogenetics and classification of the pantropical fern family Lindsaeaceae SAMULI LEHTONEN1 for publication 2 June 2010 The classification and generic definition in the tropical-subtropical fern family, and c. 73% of the currently accepted species. The phylogenetic relationships of the lindsaeoid ferns

  4. A higher-level phylogenetic classification of the Fungi

    Microsoft Academic Search

    David S. Hibbett; Manfred Binder; Joseph F. Bischoff; Meredith Blackwell; Paul F. Cannon; Ove E. Eriksson; Sabine Huhndorf; Timothy James; Paul M. Kirk; Robert Lücking; H. Thorsten Lumbsch; François Lutzoni; P. Brandon Matheny; David J. McLaughlin; Martha J. Powell; Scott Redhead; Conrad L. Schoch; Joseph W. Spatafora; Joost A. Stalpers; Rytas Vilgalys; M. Catherine Aime; André Aptroot; Robert Bauer; Dominik Begerow; Gerald L. Benny; Lisa A. Castlebury; Pedro W. Crous; Yu-Cheng Dai; Walter Gams; David M. Geiser; Gareth W. Griffith; Cécile Gueidan; David L. Hawksworth; Geir Hestmark; Kentaro Hosaka; Richard A. Humber; Kevin D. Hyde; Joseph E. Ironside; Urmas Kõljalg; Cletus P. Kurtzman; Karl-Henrik Larsson; Robert Lichtwardt; Jolanta Miadlikowska; Andrew Miller; Jean-Marc Moncalvo; Sharon Mozley-Standridge; Franz Oberwinkler; Erast Parmasto; Valérie Reeb; Jack D. Rogers; Claude Roux; Leif Ryvarden; José Paulo Sampaio; Arthur Schüßler; Junta Sugiyama; R. Greg Thorn; Leif Tibell; Wendy A. Untereiner; Christopher Walker; Zheng Wang; Alex Weir; Michael Weiss; Merlin M. White; Katarina Winka; Yi-Jian Yao; Ning Zhang

    2007-01-01

    A comprehensive phylogenetic classification of the kingdom Fungi is proposed, with reference to recent molecular phylogenetic analyses, and with input from diverse members of the fungal taxonomic community. The classification includes 195 taxa, down to the level of order, of which 16 are described or validated here: Dikarya subkingdom nov.; Chytridiomycota, Neocallimastigomycota phyla nov.; Monoblepharidomycetes, Neocallimastigomycetes class. nov.; Eurotiomycetidae, Lecanoromycetidae,

  5. Accurate Phylogenetic Tree Reconstruction from Quartets: A Heuristic Approach

    PubMed Central

    Reaz, Rezwana; Bayzid, Md. Shamsuzzoha; Rahman, M. Sohel

    2014-01-01

    Supertree methods construct trees on a set of taxa (species) by combining many smaller trees on overlapping subsets of the entire set of taxa. A 'quartet' is an unrooted tree over four taxa, hence the quartet-based supertree methods combine many 4-taxon unrooted trees into a single and coherent tree over the complete set of taxa. Quartet-based phylogeny reconstruction methods have been receiving considerable attention in recent years. An accurate and efficient quartet-based method might be competitive with the current best phylogenetic tree reconstruction methods (such as maximum likelihood or Bayesian MCMC analyses), without being as computationally intensive. In this paper, we present a novel and highly accurate quartet-based phylogenetic tree reconstruction method. We performed an extensive experimental study to evaluate the accuracy and scalability of our approach on both simulated and biological datasets. PMID:25117474
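
    As a small illustration of what a quartet is, the sketch below infers the unrooted topology of four taxa from pairwise distances using the classical four-point condition: of the three possible splits ab|cd, ac|bd and ad|bc, the one with the smallest sum of within-pair distances is chosen. This is a textbook technique swapped in for illustration, not the heuristic proposed in the paper; the function name `quartet_topology` and the dictionary-based distance format are assumptions.

```python
def quartet_topology(dist, a, b, c, d):
    """Pick the quartet split with the smallest within-pair distance sum.

    `dist` maps unordered taxon pairs, stored as tuples in either order,
    to non-negative distances (hypothetical input format).
    """
    def get(x, y):
        # Accept the pair in whichever order it was stored.
        return dist[(x, y)] if (x, y) in dist else dist[(y, x)]

    # The three possible unrooted topologies on four taxa.
    candidates = {
        ((a, b), (c, d)): get(a, b) + get(c, d),
        ((a, c), (b, d)): get(a, c) + get(b, d),
        ((a, d), (b, c)): get(a, d) + get(b, c),
    }
    return min(candidates, key=candidates.get)


if __name__ == "__main__":
    # Distances from a tree ab|cd with leaf branches of length 1
    # and an internal branch of length 2.
    distances = {
        ("a", "b"): 2, ("c", "d"): 2,
        ("a", "c"): 4, ("a", "d"): 4,
        ("b", "c"): 4, ("b", "d"): 4,
    }
    print(quartet_topology(distances, "a", "b", "c", "d"))
```

    A set of n taxa contains n(n-1)(n-2)(n-3)/24 such quartets, which is why quartet-based methods need an efficient way to amalgamate them into one tree.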

  6. Tumor classification using phylogenetic methods on expression data.

    PubMed

    Desper, Richard; Khan, Javed; Schäffer, Alejandro A

    2004-06-21

    Tumor classification is a well-studied problem in the field of bioinformatics. Developments in the field of DNA chip design have now made it possible to measure the expression levels of thousands of genes in sample tissue from healthy cell lines or tumors. A number of studies have examined the problems of tumor classification: class discovery, the problem of defining a number of classes of tumors using the data from a DNA chip, and class prediction, the problem of accurately classifying an unknown tumor, given expression data from the unknown tumor and from a learning set. The current work has applied phylogenetic methods to both problems. To solve the class discovery problem, we impose a metric on a set of tumors as a function of their gene expression levels, and impose a tree structure on this metric, using standard tree fitting methods borrowed from the field of phylogenetics. Phylogenetic methods provide a simple way of imposing a clear hierarchical relationship on the data, with branch lengths in the classification tree representing the degree of separation witnessed. We tested our method for class discovery on two data sets: a data set of 87 tissues, comprised mostly of small, round, blue-cell tumors (SRBCTs), and a data set of 22 breast tumors. We fit the 87 samples of the first set to a classification tree, which neatly separated into four major clusters corresponding exactly to the four groups of tumors, namely neuroblastomas, rhabdomyosarcomas, Burkitt's lymphomas, and the Ewing's family of tumors. The classification tree built using the breast cancer data separated tumors with BRCA1 mutations from those with BRCA2 mutations, with sporadic tumors separated from both groups and from each other. We also demonstrate the flexibility of the class discovery method with regard to standard resampling methodology such as jackknifing and noise perturbation. 
    To solve the class prediction problem, we built a classification tree on the learning set, and then sought the optimal placement of each test sample within the classification tree. We tested this method on the SRBCT data set, and classified each tumor successfully. PMID:15178197

  7. Scaling up accurate phylogenetic reconstruction from gene-order data

    Microsoft Academic Search

    Jijun Tang; Bernard M. E. Moret

    2003-01-01

    Motivation: Phylogenetic reconstruction from gene-order data has attracted increasing attention from both biologists and computer scientists over the last few years. Methods used in reconstruction include distance-based methods (such as neighbor-joining), parsimony methods using sequence-based encodings, Bayesian approaches, and direct optimization. The latter, pioneered by Sankoff and extended by us with the software suite GRAPPA, is the most accurate approach,

  8. A Functional-Phylogenetic Classification System for Transmembrane Solute Transporters

    PubMed Central

    Saier, Milton H.

    2000-01-01

    A comprehensive classification system for transmembrane molecular transporters has been developed and recently approved by the transport panel of the nomenclature committee of the International Union of Biochemistry and Molecular Biology. This system is based on (i) transporter class and subclass (mode of transport and energy coupling mechanism), (ii) protein phylogenetic family and subfamily, and (iii) substrate specificity. Almost all of the more than 250 identified families of transporters include members that function exclusively in transport. Channels (115 families), secondary active transporters (uniporters, symporters, and antiporters) (78 families), primary active transporters (23 families), group translocators (6 families), and transport proteins of ill-defined function or of unknown mechanism (51 families) constitute distinct categories. Transport mode and energy coupling prove to be relatively immutable characteristics and therefore provide primary bases for classification. Phylogenetic grouping reflects structure, function, mechanism, and often substrate specificity and therefore provides a reliable secondary basis for classification. Substrate specificity and polarity of transport prove to be more readily altered during evolutionary history and therefore provide a tertiary basis for classification. With very few exceptions, a phylogenetic family of transporters includes members that function by a single transport mode and energy coupling mechanism, although a variety of substrates may be transported, sometimes with either inwardly or outwardly directed polarity. In this review, I provide cross-referencing of well-characterized constituent transporters according to (i) transport mode, (ii) energy coupling mechanism, (iii) phylogenetic grouping, and (iv) substrates transported. The structural features and distribution of recognized family members throughout the living world are also evaluated. 
    The tabulations should facilitate familial and functional assignments of newly sequenced transport proteins that will result from future genome sequencing projects. PMID:10839820

  9. Accurate Reconstruction of Insertion-Deletion Histories by Statistical Phylogenetics

    Microsoft Academic Search

    Oscar Westesson; Gerton Lunter; Benedict Paten; Ian Holmes

    2012-01-01

    The Multiple Sequence Alignment (MSA) is a computational abstraction that represents a partial summary either of indel history, or of structural similarity. Taking the former view (indel history), it is possible to use formal automata theory to generalize the phylogenetic likelihood framework for finite substitution models (Dayhoff's probability matrices and Felsenstein's pruning algorithm) to arbitrary-length sequences. In this paper, we

  10. Robust and Accurate Cancer Classification with Gene Expression Profiling

    Microsoft Academic Search

    Haifeng Li; Keshu Zhang; Tao Jiang

    2005-01-01

    Robust and accurate cancer classification is critical in cancer treatment. Gene expression profiling is expected to enable us to diagnose tumors precisely and systematically. However, the classification task in this context is very challenging because of the curse of dimensionality and the small sample size problem. In this paper, we propose a novel method to solve these two problems.

  11. A higher-level phylogenetic classification of the Fungi.

    PubMed

    Hibbett, David S; Binder, Manfred; Bischoff, Joseph F; Blackwell, Meredith; Cannon, Paul F; Eriksson, Ove E; Huhndorf, Sabine; James, Timothy; Kirk, Paul M; Lücking, Robert; Thorsten Lumbsch, H; Lutzoni, François; Matheny, P Brandon; McLaughlin, David J; Powell, Martha J; Redhead, Scott; Schoch, Conrad L; Spatafora, Joseph W; Stalpers, Joost A; Vilgalys, Rytas; Aime, M Catherine; Aptroot, André; Bauer, Robert; Begerow, Dominik; Benny, Gerald L; Castlebury, Lisa A; Crous, Pedro W; Dai, Yu-Cheng; Gams, Walter; Geiser, David M; Griffith, Gareth W; Gueidan, Cécile; Hawksworth, David L; Hestmark, Geir; Hosaka, Kentaro; Humber, Richard A; Hyde, Kevin D; Ironside, Joseph E; Kõljalg, Urmas; Kurtzman, Cletus P; Larsson, Karl-Henrik; Lichtwardt, Robert; Longcore, Joyce; Miadlikowska, Jolanta; Miller, Andrew; Moncalvo, Jean-Marc; Mozley-Standridge, Sharon; Oberwinkler, Franz; Parmasto, Erast; Reeb, Valérie; Rogers, Jack D; Roux, Claude; Ryvarden, Leif; Sampaio, José Paulo; Schüssler, Arthur; Sugiyama, Junta; Thorn, R Greg; Tibell, Leif; Untereiner, Wendy A; Walker, Christopher; Wang, Zheng; Weir, Alex; Weiss, Michael; White, Merlin M; Winka, Katarina; Yao, Yi-Jian; Zhang, Ning

    2007-05-01

    A comprehensive phylogenetic classification of the kingdom Fungi is proposed, with reference to recent molecular phylogenetic analyses, and with input from diverse members of the fungal taxonomic community. The classification includes 195 taxa, down to the level of order, of which 16 are described or validated here: Dikarya subkingdom nov.; Chytridiomycota, Neocallimastigomycota phyla nov.; Monoblepharidomycetes, Neocallimastigomycetes class. nov.; Eurotiomycetidae, Lecanoromycetidae, Mycocaliciomycetidae subclass. nov.; Acarosporales, Corticiales, Baeomycetales, Candelariales, Gloeophyllales, Melanosporales, Trechisporales, Umbilicariales ords. nov. The clade containing Ascomycota and Basidiomycota is classified as subkingdom Dikarya, reflecting the putative synapomorphy of dikaryotic hyphae. The most dramatic shifts in the classification relative to previous works concern the groups that have traditionally been included in the Chytridiomycota and Zygomycota. The Chytridiomycota is retained in a restricted sense, with Blastocladiomycota and Neocallimastigomycota representing segregate phyla of flagellated Fungi. Taxa traditionally placed in Zygomycota are distributed among Glomeromycota and several subphyla incertae sedis, including Mucoromycotina, Entomophthoromycotina, Kickxellomycotina, and Zoopagomycotina. Microsporidia are included in the Fungi, but no further subdivision of the group is proposed. Several genera of 'basal' Fungi of uncertain position are not placed in any higher taxa, including Basidiobolus, Caulochytrium, Olpidium, and Rozella. PMID:17572334

  12. Accurate Software Performance Estimation Using Domain Classification and Neural Networks

    E-print Network

    Wagner, Flávio Rech

    Accurate Software Performance Estimation Using Domain Classification and Neural Networks M to evaluate. In order to cope with this problem, this paper presents a neural network based approach for high software, neural networks 1. INTRODUCTION The existence of various architectures presenting different trade

  13. An Evolutionary Model-Based Algorithm for Accurate Phylogenetic Breakpoint Mapping and Subtype Prediction in HIV-1

    PubMed Central

    Kosakovsky Pond, Sergei L.; Posada, David; Stawiski, Eric; Chappey, Colombe; Poon, Art F.Y.; Hughes, Gareth; Fearnhill, Esther; Gravenor, Mike B.; Leigh Brown, Andrew J.; Frost, Simon D.W.

    2009-01-01

    Genetically diverse pathogens (such as Human Immunodeficiency virus type 1, HIV-1) are frequently stratified into phylogenetically or immunologically defined subtypes for classification purposes. Computational identification of such subtypes is helpful in surveillance, epidemiological analysis and detection of novel variants, e.g., circulating recombinant forms in HIV-1. A number of conceptually and technically different techniques have been proposed for determining the subtype of a query sequence, but there is not a universally optimal approach. We present a model-based phylogenetic method for automatically subtyping an HIV-1 (or other viral or bacterial) sequence, mapping the location of breakpoints and assigning parental sequences in recombinant strains as well as computing confidence levels for the inferred quantities. Our Subtype Classification Using Evolutionary ALgorithms (SCUEAL) procedure is shown to perform very well in a variety of simulation scenarios, runs in parallel when multiple sequences are being screened, and matches or exceeds the performance of existing approaches on typical empirical cases. We applied SCUEAL to all available polymerase (pol) sequences from two large databases, the Stanford Drug Resistance database and the UK HIV Drug Resistance Database. Comparing with subtypes which had previously been assigned revealed that a minor but substantial (~5%) fraction of pure subtype sequences may in fact be within- or inter-subtype recombinants. A free implementation of SCUEAL is provided as a module for the HyPhy package and the Datamonkey web server. Our method is especially useful when an accurate automatic classification of an unknown strain is desired, and is positioned to complement and extend faster but less accurate methods. Given the increasingly frequent use of HIV subtype information in studies focusing on the effect of subtype on treatment, clinical outcome, pathogenicity and vaccine design, the importance of accurate, robust and extensible subtyping procedures is clear. PMID:19956739

  14. Optimal selection of mother wavelet for accurate infant cry classification.

    PubMed

    Saraswathy, J; Hariharan, M; Nadarajaw, Thiyagar; Khairunizam, Wan; Yaacob, Sazali

    2014-06-01

    Wavelet theory is emerging as one of the prevalent tools in signal and image processing applications. However, the most suitable mother wavelet for these applications is still an open question among researchers. Selection of the best mother wavelet through parameterization leads to better findings for the analysis in comparison to random selection. The objective of this article is to compare the performance of the existing members of mother wavelets and to select the most suitable mother wavelet for accurate infant cry classification. The optimal wavelet is found using three different criteria, namely the degree of similarity of mother wavelets, regularity of mother wavelets, and accuracy of correct recognition during classification processes. Recorded normal and pathological infant cry signals are decomposed into five levels using wavelet packet transform. Energy and entropy features are extracted at different sub bands of cry signals and their effectiveness is tested with four supervised neural network architectures. Findings of this study indicate that the finite impulse response based approximation of Meyer is the best wavelet candidate for accurate infant cry classification analysis. PMID:24691930

  15. SATe-II: very fast and accurate simultaneous estimation of multiple sequence alignments and phylogenetic trees.

    PubMed

    Liu, Kevin; Warnow, Tandy J; Holder, Mark T; Nelesen, Serita M; Yu, Jiaye; Stamatakis, Alexandros P; Linder, C Randal

    2012-01-01

    Highly accurate estimation of phylogenetic trees for large data sets is difficult, in part because multiple sequence alignments must be accurate for phylogeny estimation methods to be accurate. Coestimation of alignments and trees has been attempted but currently only SATé estimates reasonably accurate trees and alignments for large data sets in practical time frames (Liu K., Raghavan S., Nelesen S., Linder C.R., Warnow T. 2009b. Rapid and accurate large-scale coestimation of sequence alignments and phylogenetic trees. Science. 324:1561-1564). Here, we present a modification to the original SATé algorithm that improves upon SATé (which we now call SATé-I) in terms of speed and of phylogenetic and alignment accuracy. SATé-II uses a different divide-and-conquer strategy than SATé-I and so produces smaller more closely related subsets than SATé-I; as a result, SATé-II produces more accurate alignments and trees, can analyze larger data sets, and runs more efficiently than SATé-I. Generally, SATé is a metamethod that takes an existing multiple sequence alignment method as an input parameter and boosts the quality of that alignment method. SATé-II-boosted alignment methods are significantly more accurate than their unboosted versions, and trees based upon these improved alignments are more accurate than trees based upon the original alignments. Because SATé-I used maximum likelihood (ML) methods that treat gaps as missing data to estimate trees and because we found a correlation between the quality of tree/alignment pairs and ML scores, we explored the degree to which SATé's performance depends on using ML with gaps treated as missing data to determine the best tree/alignment pair. We present two lines of evidence that using ML with gaps treated as missing data to optimize the alignment and tree produces very poor results. 
    First, we show that the optimization problem where a set of unaligned DNA sequences is given and the output is the tree and alignment of those sequences that maximize likelihood under the Jukes-Cantor model is uninformative in the worst possible sense. For all inputs, all trees optimize the likelihood score. Second, we show that a greedy heuristic that uses GTR+Gamma ML to optimize the alignment and the tree can produce very poor alignments and trees. Therefore, the excellent performance of SATé-II and SATé-I is not because ML is used as an optimization criterion for choosing the best tree/alignment pair but rather due to the particular divide-and-conquer realignment techniques employed. PMID:22139466

  16. Rapid and accurate large-scale coestimation of sequence alignments and phylogenetic trees.

    PubMed

    Liu, Kevin; Raghavan, Sindhu; Nelesen, Serita; Linder, C Randal; Warnow, Tandy

    2009-06-19

    Inferring an accurate evolutionary tree of life requires high-quality alignments of molecular sequence data sets from large numbers of species. However, this task is often difficult, slow, and idiosyncratic, especially when the sequences are highly diverged or include high rates of insertions and deletions (collectively known as indels). We present SATé (simultaneous alignment and tree estimation), an automated method to quickly and accurately estimate both DNA alignments and trees with the maximum likelihood criterion. In our study, it improved tree and alignment accuracy compared to the best two-phase methods currently available for data sets of up to 1000 sequences, showing that coestimation can be both rapid and accurate in phylogenetic studies. PMID:19541996

  17. Tumor classification using phylogenetic methods on expression data

    Microsoft Academic Search

    Richard Desper; Javed Khan; Alejandro A. Schäffer

    2004-01-01

    Tumor classification is a well-studied problem in the field of bioinformatics. Developments in the field of DNA chip design have now made it possible to measure the expression levels of thousands of genes in sample tissue from healthy cell lines or tumors. A number of studies have examined the problems of tumor classification: class discovery, the problem of defining a

  18. Evolving accurate and compact classification rules with gene expression programming

    Microsoft Academic Search

    Chi Zhou; Weimin Xiao; Thomas M. Tirpak; Peter C. Nelson

    2003-01-01

    Abstract Classification is one of the fundamental tasks of data mining Most rule induction and decision tree algorithms perform local, greedy search to generate classification rules that are often more complex than necessary Evolutionary algorithms for pattern classification have recently received increased attention because they can perform global searches In this paper, we propose a new method for discovering high

  19. Phylogenetic Classification of Prokaryotic and Eukaryotic Sir2-like Proteins

    Microsoft Academic Search

    Roy A. Frye

    2000-01-01

    Sirtuins (Sir2-like proteins) are present in prokaryotes and eukaryotes. Here, two new human sirtuins (SIRT6 and SIRT7) are found to be similar to a particular subset of insect, nematode, plant, and protozoan sirtuins. Molecular phylogenetic analysis of 60 sirtuin conserved core domain sequences from a diverse array of organisms (including archaeans, bacteria, yeasts, plants, protozoans, and metazoans) shows that eukaryotic

  20. SVM and MRF-Based Method for Accurate Classification of Hyperspectral Images

    Microsoft Academic Search

    Yuliya Tarabalka; Mathieu Fauvel; Jocelyn Chanussot; Jón Atli Benediktsson

    2010-01-01

    The high number of spectral bands acquired by hyperspectral sensors increases the capability to distinguish physical materials and objects, presenting new challenges to image analysis and classification. This letter presents a novel method for accurate spectral-spatial classification of hyperspectral images. The proposed technique consists of two steps. In the first step, a probabilistic support vector machine pixelwise classification of the

  1. CPM: A Graph Pattern Matching Kernel with Diffusion for Accurate Graph Classification

    E-print Network

    Kansas, University of

    patterns from a graph database. We then map subgraphs to graphs in the graph database and use a process we databases search algorithms [17, 32, 40, 42], graph classification aims to construct accurate predictive CPM: A Graph Pattern Matching Kernel with Diffusion for Accurate Graph Classification Aaron Smalter

  2. A phylogenetic analysis of the mycoplasmas: basis for their classification.

    PubMed Central

    Weisburg, W G; Tully, J G; Rose, D L; Petzel, J P; Oyaizu, H; Yang, D; Mandelco, L; Sechrest, J; Lawrence, T G; Van Etten, J

    1989-01-01

    Small-subunit rRNA sequences were determined for almost 50 species of mycoplasmas and their walled relatives, providing the basis for a phylogenetic systematic analysis of these organisms. Five groups of mycoplasmas per se were recognized (provisional names are given): the hominis group (which included species such as Mycoplasma hominis, Mycoplasma lipophilum, Mycoplasma pulmonis, and Mycoplasma neurolyticum), the pneumoniae group (which included species such as Mycoplasma pneumoniae and Mycoplasma muris), the spiroplasma group (which included species such as Mycoplasma mycoides, Spiroplasma citri, and Spiroplasma apis), the anaeroplasma group (which encompassed the anaeroplasmas and acholeplasmas), and a group known to contain only the isolated species Asteroleplasma anaerobium. In addition to these five mycoplasma groups, a sixth group of variously named gram-positive, walled organisms (which included lactobacilli, clostridia, and other organisms) was also included in the overall phylogenetic unit. In each of these six primary groups, subgroups were readily recognized and defined. Although the phylogenetic units identified by rRNA comparisons are difficult to recognize on the basis of mutually exclusive phenotypic characters alone, phenotypic justification can be given a posteriori for a number of them. PMID:2592342

  3. Phylogenetic classification of Aureobasidium pullulans strains for production of pullulan and xylanase

    Technology Transfer Automated Retrieval System (TEKTRAN)

    This study tests the hypothesis that phylogenetic classification can predict whether A. pullulans strains will produce useful levels of the commercial polysaccharide, pullulan, or the valuable enzyme, xylanase. To test this hypothesis, 19 strains of A. pullulans with previously described phenotypes...

  4. Phylogenetics

    NSDL National Science Digital Library

    2014-05-28

    This activity lets learners participate in the process of reconstructing a phylogenetic tree and introduces them to several core bioinformatics concepts, particularly in relation to evolution. Groups of learners (at least 10) repeat a secret message (five to seven similar-sounding words) like the game "Telephone". In this version of the game, however, learners write and then code what they hear, creating a model of a phylogenetic tree and using a species distance matrix. This resource includes background information about phylogenetic trees, maximum parsimony, and matrix theory (see page 6-7 of PDF).
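
    The distance-matrix step of such an activity can be sketched in a few lines: each group's final message is treated as a "species", and the pairwise edit distance between messages stands in for evolutionary distance. The choice of Levenshtein edit distance and the function names `edit_distance` and `distance_matrix` are assumptions made for illustration; the resource itself may code and compare the messages differently.

```python
def edit_distance(s, t):
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    # prev[j] holds the distance between the processed prefix of s and t[:j].
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i]
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1,          # delete from s
                            curr[j - 1] + 1,      # insert into s
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def distance_matrix(messages):
    """Symmetric matrix of pairwise edit distances between messages."""
    return [[edit_distance(a, b) for b in messages] for a in messages]


if __name__ == "__main__":
    # Hypothetical end-of-chain messages from three groups of learners.
    heard = ["cat", "cot", "coat"]
    for label, row in zip(heard, distance_matrix(heard)):
        print(label.ljust(5), row)
```

    Messages separated by small distances would be placed close together in the reconstructed tree, mirroring how closely related species cluster.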

  5. The COG database: new developments in phylogenetic classification of proteins from complete genomes

    Microsoft Academic Search

    Roman L. Tatusov; Darren A. Natale; Igor V. Garkavtsev; Tatiana A. Tatusova; Uma T. Shankavaram; Bachoti S. Rao; Boris Kiryutin; M. Y. Galperin; Natalie D. Fedorova; Eugene V. Koonin

    2001-01-01

    The database of Clusters of Orthologous Groups of proteins (COGs), which represents an attempt on a phylogenetic classification of the proteins encoded in complete genomes, currently consists of 2791 COGs including 45 350 proteins from 30 genomes of bacteria, archaea and the yeast Saccharomyces cerevisiae (http://www.ncbi.nlm.nih.gov/COG). In addition, a supplement to the COGs is available, in which proteins encoded

  6. Molecular phylogenetic perspectives for character classification and convergence: Framing some issues with nematode vulval appendages and telotylenchid tail termini

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Characters flagged as convergent based on newer molecular phylogenetic trees inform both practical identification and more esoteric classification. Nematode morphological characters such as lateral lines, bullae and laciniae are quite independent structures from those similarly named in other organi...

  7. Abacus: Accurate behavioral classification of P2P-TV traffic

    E-print Network

    Abacus: Accurate behavioral classification of P2P-TV traffic. Paola Bermolen, Marco Mellia ... Traffic classification, Support Vector Machine, P2P live-streaming. Abstract: Peer-to-Peer streaming (P2P-TV) applications offer the capability to watch real time video over the Internet at low cost

  8. Enhanced assessment of the wound-healing process by accurate multi-view tissue classification

    E-print Network

    Boyer, Edmond

    Enhanced assessment of the wound-healing process by accurate multi-view tissue classification of digital cameras, free-hand wound imaging has become common practice in clinical settings. There is however still a demand for a practical tool for accurate wound healing assessment, combining dimensional

  9. Accurate, Fine-Grained Classification of P2P-TV Applications by Simply Counting Packets

    E-print Network

    Accurate, Fine-Grained Classification of P2P-TV Applications by Simply Counting Packets Silvio a novel methodology to accurately classify the traffic generated by P2P-TV applications, relying only implementation aspects of a P2P-TV application - such as network discovery and signaling activities, video

  10. The phagotrophic origin of eukaryotes and phylogenetic classification of Protozoa.

    PubMed

    Cavalier-Smith, T

    2002-03-01

    Eukaryotes and archaebacteria form the clade neomura and are sisters, as shown decisively by genes fragmented only in archaebacteria and by many sequence trees. This sisterhood refutes all theories that eukaryotes originated by merging an archaebacterium and an alpha-proteobacterium, which also fail to account for numerous features shared specifically by eukaryotes and actinobacteria. I revise the phagotrophy theory of eukaryote origins by arguing that the essentially autogenous origins of most eukaryotic cell properties (phagotrophy, endomembrane system including peroxisomes, cytoskeleton, nucleus, mitosis and sex) partially overlapped and were synergistic with the symbiogenetic origin of mitochondria from an alpha-proteobacterium. These radical innovations occurred in a derivative of the neomuran common ancestor, which itself had evolved immediately prior to the divergence of eukaryotes and archaebacteria by drastic alterations to its eubacterial ancestor, an actinobacterial posibacterium able to make sterols, by replacing murein peptidoglycan by N-linked glycoproteins and a multitude of other shared neomuran novelties. The conversion of the rigid neomuran wall into a flexible surface coat and the associated origin of phagotrophy were instrumental in the evolution of the endomembrane system, cytoskeleton, nuclear organization and division and sexual life-cycles. Cilia evolved not by symbiogenesis but by autogenous specialization of the cytoskeleton. I argue that the ancestral eukaryote was uniciliate with a single centriole (unikont) and a simple centrosomal cone of microtubules, as in the aerobic amoebozoan zooflagellate Phalansterium. I infer the root of the eukaryote tree at the divergence between opisthokonts (animals, Choanozoa, fungi) with a single posterior cilium and all other eukaryotes, designated 'anterokonts' because of the ancestral presence of an anterior cilium. 
Anterokonts comprise the Amoebozoa, which may be ancestrally unikont, and a vast ancestrally biciliate clade, named 'bikonts'. The apparently conflicting rRNA and protein trees can be reconciled with each other and this ultrastructural interpretation if long-branch distortions, some mechanistically explicable, are allowed for. Bikonts comprise two groups: corticoflagellates, with a younger anterior cilium, no centrosomal cone and ancestrally a semi-rigid cell cortex with a microtubular band on either side of the posterior mature centriole; and Rhizaria [a new infrakingdom comprising Cercozoa (now including Ascetosporea classis nov.), Retaria phylum nov., Heliozoa and Apusozoa phylum nov.], having a centrosomal cone or radiating microtubules and two microtubular roots and a soft surface, frequently with reticulopodia. Corticoflagellates comprise photokaryotes (Plantae and chromalveolates, both ancestrally with cortical alveoli) and Excavata (a new protozoan infrakingdom comprising Loukozoa, Discicristata and Archezoa, ancestrally with three microtubular roots). All basal eukaryotic radiations were of mitochondrial aerobes; hydrogenosomes evolved polyphyletically from mitochondria long afterwards, the persistence of their double envelope long after their genomes disappeared being a striking instance of membrane heredity. I discuss the relationship between the 13 protozoan phyla recognized here and revise higher protozoan classification by updating as subkingdoms Lankester's 1878 division of Protozoa into Corticata (Excavata, Alveolata; with prominent cortical microtubules and ancestrally localized cytostome--the Parabasalia probably secondarily internalized the cytoskeleton) and Gymnomyxa [infrakingdoms Sarcomastigota (Choanozoa, Amoebozoa) and Rhizaria; both ancestrally with a non-cortical cytoskeleton of radiating singlet microtubules and a relatively soft cell surface with diffused feeding]. 
As the eukaryote root almost certainly lies within Gymnomyxa, probably among the Sarcomastigota, Corticata are derived. Following the single symbiogenetic origin of chloroplasts in a corticoflagellate host with cortical alveoli, this ancestral plant radiated

  11. Using ESTs for phylogenomics: Can one accurately infer a phylogenetic tree from a gappy alignment?

    PubMed Central

    2008-01-01

    Background While full genome sequences are still only available for a handful of taxa, large collections of partial gene sequences are available for many more. The alignment of partial gene sequences results in a multiple sequence alignment containing large gaps that are arranged in a staggered pattern. The consequences of this pattern of missing data on the accuracy of phylogenetic analysis are not well understood. We conducted a simulation study to determine the accuracy of phylogenetic trees obtained from gappy alignments using three commonly used phylogenetic reconstruction methods (Neighbor Joining, Maximum Parsimony, and Maximum Likelihood) and studied ways to improve the accuracy of trees obtained from such datasets. Results We found that the pattern of gappiness in multiple sequence alignments derived from partial gene sequences substantially compromised phylogenetic accuracy even in the absence of alignment error. The decline in accuracy was beyond what would be expected based on the amount of missing data. The decline was particularly dramatic for Neighbor Joining and Maximum Parsimony, where the majority of gappy alignments contained 25% to 40% incorrect quartets. To improve the accuracy of the trees obtained from a gappy multiple sequence alignment, we examined two approaches. In the first approach, alignment masking, potentially problematic columns and input sequences are excluded from the dataset. Even in the absence of alignment error, masking improved phylogenetic accuracy up to 100-fold. However, masking retained, on average, only 83% of the input sequences. In the second approach, alignment subdivision, the missing data is statistically modelled in order to retain as many sequences as possible in the phylogenetic analysis. Subdivision resulted in more modest improvements to alignment accuracy, but succeeded in including almost all of the input sequences. 
Conclusion These results demonstrate that partial gene sequences and gappy multiple sequence alignments can pose a major problem for phylogenetic analysis. The concern will be greatest for high-throughput phylogenomic analyses, in which Neighbor Joining is often the preferred method due to its computational efficiency. Both approaches can be used to increase the accuracy of phylogenetic inference from a gappy alignment. The choice between the two approaches will depend upon how robust the application is to the loss of sequences from the input set, with alignment masking generally giving a much greater improvement in accuracy but at the cost of discarding a larger number of the input sequences. PMID:18366758
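
    The masking idea described in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function name `mask_alignment`, the gap-fraction thresholds, and the toy alignment are all assumptions chosen for illustration. The sketch first drops sequences that are mostly gaps, then drops columns whose gap fraction among the remaining sequences is too high.

```python
def mask_alignment(alignment, max_seq_gap=0.5, max_col_gap=0.5, gap="-"):
    """Return a masked copy of `alignment` (dict: name -> aligned string).

    Thresholds are illustrative assumptions, not values from the paper.
    """
    if not alignment:
        return {}
    length = len(next(iter(alignment.values())))
    if any(len(seq) != length for seq in alignment.values()):
        raise ValueError("all aligned sequences must have the same length")
    if length == 0:
        return dict(alignment)

    # Step 1: discard sequences whose gap fraction exceeds max_seq_gap.
    kept = {name: seq for name, seq in alignment.items()
            if seq.count(gap) / length <= max_seq_gap}
    if not kept:
        return {}

    # Step 2: keep only columns whose gap fraction (among kept
    # sequences) does not exceed max_col_gap.
    columns = [i for i in range(length)
               if sum(seq[i] == gap for seq in kept.values()) / len(kept)
               <= max_col_gap]
    return {name: "".join(seq[i] for i in columns)
            for name, seq in kept.items()}


# Hypothetical toy alignment with a staggered gap pattern.
example = {
    "taxonA": "ACGT--AC",
    "taxonB": "ACGTTTAC",
    "taxonC": "------AC",
    "taxonD": "AC--TTAC",
}
masked = mask_alignment(example, max_col_gap=0.3)
```

    With these assumed thresholds, `taxonC` is discarded and only the gap-free columns survive, which mirrors the trade-off the abstract reports: masking cleans the alignment at the cost of losing input sequences.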

  12. Restriction data from chloroplast DNA for phylogenetic reconstruction: Is there only one accurate way of scoring?

    Microsoft Academic Search

    Birgitta Bremer

    1991-01-01

    Information from the same restriction analysis of chloroplast DNA of 33 taxa of Rubiaceae was scored in four different ways, two of which were based on fragments, and two on restriction sites, and they were subsequently analysed with Wagner parsimony. The methods resulted in different phylogenetic trees. The inherent differences between the methods relate to the amount of non-homologous characters and

  13. Molecular Phylogenetic Evaluation of Classification and Scenarios of Character Evolution in Calcareous Sponges (Porifera, Class Calcarea)

    PubMed Central

    Voigt, Oliver; Wülfing, Eilika; Wörheide, Gert

    2012-01-01

    Calcareous sponges (Phylum Porifera, Class Calcarea) are known to be taxonomically difficult. Previous molecular studies have revealed many discrepancies between classically recognized taxa and the observed relationships at the order, family and genus levels; these inconsistencies question underlying hypotheses regarding the evolution of certain morphological characters. Therefore, we extended the available taxa and character set by sequencing the complete small subunit (SSU) rDNA and the almost complete large subunit (LSU) rDNA of additional key species and complemented this dataset by substantially increasing the length of available LSU sequences. Phylogenetic analyses provided new hypotheses about the relationships of Calcarea and about the evolution of certain morphological characters. We tested our phylogeny against competing phylogenetic hypotheses presented by previous classification systems. Our data reject the current order-level classification by again finding non-monophyletic Leucosolenida, Clathrinida and Murrayonida. In the subclass Calcinea, we recovered a clade that includes all species with a cortex, which is largely consistent with the previously proposed order Leucettida. Other orders that had been rejected in the current system were not found, but could not be rejected in our tests either. We found several additional families and genera polyphyletic: the families Leucascidae and Leucaltidae and the genus Leucetta in Calcinea, and in Calcaronea the family Amphoriscidae and the genus Ute. Our phylogeny also provided support for the vaguely suspected close relationship of several members of Grantiidae with giant cortical diactines to members of Heteropiidae. Similarly, our analyses revealed several unexpected affinities, such as a sister group relationship between Leucettusa (Leucaltidae) and Leucettidae and between Leucascandra (Jenkinidae) and Sycon carteri (Sycettidae). 
According to our results, the taxonomy of Calcarea is in desperate need of a thorough revision, which cannot be achieved by considering morphology alone or relying on a taxon sampling based on the current classification below the subclass level. PMID:22479395

  14. Rapid phylogenetic and functional classification of short genomic fragments with signature peptides

    PubMed Central

    2012-01-01

    Background Classification is difficult for shotgun metagenomics data from environments such as soils, where the diversity of sequences is high and where reference sequences from close relatives may not exist. Approaches based on sequence-similarity scores must deal with the confounding effects that inheritance and functional pressures exert on the relation between scores and phylogenetic distance, while approaches based on sequence alignment and tree-building are typically limited to a small fraction of gene families. We describe an approach based on finding one or more exact matches between a read and a precomputed set of peptide 10-mers. Results At even the largest phylogenetic distances, thousands of 10-mer peptide exact matches can be found between pairs of bacterial genomes. Genes that share one or more peptide 10-mers typically have high reciprocal BLAST scores. Among a set of 403 representative bacterial genomes, some 20 million 10-mer peptides were found to be shared. We assign each of these peptides as a signature of a particular node in a phylogenetic reference tree based on the RNA polymerase genes. We classify the phylogeny of a genomic fragment (e.g., read) at the most specific node on the reference tree that is consistent with the phylogeny of observed signature peptides it contains. Using both synthetic data from four newly-sequenced soil-bacterium genomes and ten real soil metagenomics data sets, we demonstrate a sensitivity and specificity comparable to that of the MEGAN metagenomics analysis package using BLASTX against the NR database. Phylogenetic and functional similarity metrics applied to real metagenomics data indicate a signal-to-noise ratio of approximately 400 for distinguishing among environments. Our method assigns ~6.6 Gbp/hr on a single CPU, compared with 25 kbp/hr for methods based on BLASTX against the NR database. 
Conclusions Classification by exact matching against a precomputed list of signature peptides provides comparable results to existing techniques for reads longer than about 300 bp and does not degrade severely with shorter reads. Orders of magnitude faster than existing methods, the approach is suitable now for inclusion in analysis pipelines and appears to be extensible in several different directions. PMID:22925230
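
    The exact-matching step described in the abstract above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tree, the signature table, and the peptide strings are invented toy data, the function names are hypothetical, and the rule for "most specific consistent node" is one plausible reading (take the deepest hit if all hits lie on a single root-to-leaf path, otherwise fall back to the deepest common ancestor). The published method may resolve conflicts differently.

```python
def path_to_root(node, parent):
    """List of nodes from `node` up to the root (inclusive)."""
    path = [node]
    while parent.get(node) is not None:
        node = parent[node]
        path.append(node)
    return path


def classify_peptide(peptide, signatures, parent, k=10):
    """Assign a translated read to a node of the reference tree.

    `signatures` maps each signature k-mer to a tree node; `parent`
    maps each node to its parent (None for the root). Both are
    hypothetical inputs for this sketch.
    """
    # Collect the nodes of every signature k-mer found in the peptide.
    hits = set()
    for i in range(len(peptide) - k + 1):
        node = signatures.get(peptide[i:i + k])
        if node is not None:
            hits.add(node)
    if not hits:
        return None

    paths = {node: path_to_root(node, parent) for node in hits}
    deepest = max(hits, key=lambda node: len(paths[node]))
    # Consistent case: every hit is the deepest node or one of its ancestors.
    if all(node in paths[deepest] for node in hits):
        return deepest
    # Conflicting case: fall back to the deepest ancestor shared by all hits.
    common = set.intersection(*(set(path) for path in paths.values()))
    if not common:
        return None
    return max(common, key=lambda node: len(path_to_root(node, parent)))


# Toy reference tree and signature table (illustrative only).
parent = {
    "Bacteria": None,
    "Proteobacteria": "Bacteria",
    "Firmicutes": "Bacteria",
    "Escherichia": "Proteobacteria",
}
signatures = {
    "MKTAYIAKQR": "Proteobacteria",
    "AYIAKQRQIS": "Escherichia",
    "GGGGGGGGGG": "Firmicutes",
}

consistent = classify_peptide("MKTAYIAKQRQIS", signatures, parent)
conflicting = classify_peptide("MKTAYIAKQRGGGGGGGGGG", signatures, parent)
```

    Because each lookup is a dictionary access on a fixed-length substring, the cost per read grows linearly with read length, which is consistent with the speed advantage over alignment-based search that the abstract reports.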

  15. Accurate crop classification using hierarchical genetic fuzzy rule-based systems

    NASA Astrophysics Data System (ADS)

    Topaloglou, Charalampos A.; Mylonas, Stelios K.; Stavrakoudis, Dimitris G.; Mastorocostas, Paris A.; Theocharis, John B.

    2014-10-01

    This paper investigates the effectiveness of an advanced classification system for accurate crop classification using very high resolution (VHR) satellite imagery. Specifically, a recently proposed genetic fuzzy rule-based classification system (GFRBCS) is employed, namely, the Hierarchical Rule-based Linguistic Classifier (HiRLiC). HiRLiC's model comprises a small set of simple IF-THEN fuzzy rules, easily interpretable by humans. One of its most important attributes is that its learning algorithm requires minimum user interaction, since the most important learning parameters affecting the classification accuracy are determined by the learning algorithm automatically. HiRLiC is applied in a challenging crop classification task, using a SPOT5 satellite image over an intensively cultivated area in a lake-wetland ecosystem in northern Greece. A rich set of higher-order spectral and textural features is derived from the initial bands of the (pan-sharpened) image, resulting in an input space comprising 119 features. The experimental analysis proves that HiRLiC compares favorably to other interpretable classifiers of the literature, both in terms of structural complexity and classification accuracy. Its testing accuracy was very close to that obtained by complex state-of-the-art classification systems, such as the support vector machines (SVM) and random forest (RF) classifiers. Nevertheless, visual inspection of the derived classification maps shows that HiRLiC is characterized by higher generalization properties, providing more homogeneous classifications than the competitors. Moreover, the runtime requirements for producing the thematic map were orders of magnitude lower than those of the competitors.

  16. Genetic diversity and phylogenetic classification of viral hemorrhagic septicemia virus (VHSV).

    PubMed

    Basurco, B; Vende, P; Monnier, A F; Winton, J R; de Kinkelin, P; Benmansour, A

    1995-01-01

    The present study was undertaken to determine the genetic diversity of viral hemorrhagic septicemia virus (VHSV) and to gain insight into the molecular epidemiology of this fish rhabdovirus. The sequences of the nonstructural (NV) protein and the transmembrane (G) protein of sequential North American and European isolates of VHSV were determined and used to compute phylogenetic trees. According to the percentage of nucleotide or amino acid similarities, North American and European isolates formed 2 clearly distant genetic groups. While North American isolates clustered into a highly homogeneous genetic group, European isolates exhibited a higher genetic variability. Subgrouping based on this variability could be correlated with both the geographic origin and the serological classification. PMID:8581023

  17. Phylogenetic Classification of Protozoa Based on the Structure of the Linker Domain in the Bifunctional Enzyme, Dihydrofolate

    E-print Network

    Donald, Bruce Randall

    Phylogenetic Classification of Protozoa Based on the Structure of the Linker Domain a number of protozoa in two distinct and dissimilar structural families corresponding to two evolutionary- sification and evolutionary progression of the protozoa (2-5). It is thought that many eukaryotic taxa arose

  18. Towards a formal genealogical classification of the lezgian languages (north caucasus): testing various phylogenetic methods on lexical data.

    PubMed

    Kassian, Alexei

    2015-01-01

    A lexicostatistical classification is proposed for 20 languages and dialects of the Lezgian group of the North Caucasian family, based on meticulously compiled 110-item wordlists, published as part of the Global Lexicostatistical Database project. The lexical data have been subsequently analyzed with the aid of the principal phylogenetic methods, both distance-based and character-based: Starling neighbor joining (StarlingNJ), Neighbor joining (NJ), Unweighted pair group method with arithmetic mean (UPGMA), Bayesian Markov chain Monte Carlo (MCMC), Unweighted maximum parsimony (UMP). Cognation indexes within the input matrix were marked by two different algorithms: traditional etymological approach and phonetic similarity, i.e., the automatic method of consonant classes (Levenshtein distances). Due to certain reasons (first of all, high lexicographic quality of the wordlists and a consensus about the Lezgian phylogeny among Caucasologists), the Lezgian database is a perfect testing area for appraisal of phylogenetic methods. For the etymology-based input matrix, all the phylogenetic methods, with the possible exception of UMP, have yielded trees that are sufficiently compatible with each other to generate a consensus phylogenetic tree of the Lezgian lects. The obtained consensus tree agrees with the traditional expert classification as well as some of the previously proposed formal classifications of this linguistic group. Contrary to theoretical expectations, the UMP method has suggested the least plausible tree of all. In the case of the phonetic similarity-based input matrix, the distance-based methods (StarlingNJ, NJ, UPGMA) have produced the trees that are rather close to the consensus etymology-based tree and the traditional expert classification, whereas the character-based methods (Bayesian MCMC, UMP) have yielded less likely topologies. PMID:25719456
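
    The abstract above mentions Levenshtein distances as the basis of the automatic phonetic-similarity comparison. A minimal sketch of the plain edit distance follows. It does not reproduce the paper's consonant-class method: the function names and the length normalisation are assumptions added for illustration, and the word pair is a generic example rather than Lezgian data.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn string `a` into string `b`."""
    # Keep the shorter string in the inner loop to limit memory use.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def normalized_distance(a, b):
    """Edit distance scaled to [0, 1] by the longer string's length
    (an assumed normalisation, not taken from the paper)."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


distance = levenshtein("kitten", "sitting")
```

    Pairwise distances of this kind, computed over aligned wordlist entries, are the sort of input that distance-based tree methods such as NJ and UPGMA operate on.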

  19. Towards a Formal Genealogical Classification of the Lezgian Languages (North Caucasus): Testing Various Phylogenetic Methods on Lexical Data

    PubMed Central

    Kassian, Alexei

    2015-01-01

    A lexicostatistical classification is proposed for 20 languages and dialects of the Lezgian group of the North Caucasian family, based on meticulously compiled 110-item wordlists, published as part of the Global Lexicostatistical Database project. The lexical data have been subsequently analyzed with the aid of the principal phylogenetic methods, both distance-based and character-based: Starling neighbor joining (StarlingNJ), Neighbor joining (NJ), Unweighted pair group method with arithmetic mean (UPGMA), Bayesian Markov chain Monte Carlo (MCMC), Unweighted maximum parsimony (UMP). Cognation indexes within the input matrix were marked by two different algorithms: traditional etymological approach and phonetic similarity, i.e., the automatic method of consonant classes (Levenshtein distances). Due to certain reasons (first of all, high lexicographic quality of the wordlists and a consensus about the Lezgian phylogeny among Caucasologists), the Lezgian database is a perfect testing area for appraisal of phylogenetic methods. For the etymology-based input matrix, all the phylogenetic methods, with the possible exception of UMP, have yielded trees that are sufficiently compatible with each other to generate a consensus phylogenetic tree of the Lezgian lects. The obtained consensus tree agrees with the traditional expert classification as well as some of the previously proposed formal classifications of this linguistic group. Contrary to theoretical expectations, the UMP method has suggested the least plausible tree of all. In the case of the phonetic similarity-based input matrix, the distance-based methods (StarlingNJ, NJ, UPGMA) have produced the trees that are rather close to the consensus etymology-based tree and the traditional expert classification, whereas the character-based methods (Bayesian MCMC, UMP) have yielded less likely topologies. PMID:25719456

  20. Determining suitable image resolutions for accurate supervised crop classification using remote sensing data

    NASA Astrophysics Data System (ADS)

    Löw, Fabian; Duveiller, Grégory

    2013-10-01

    Mapping the spatial distribution of crops has become a fundamental input for agricultural production monitoring using remote sensing. However, the multi-temporality that is often necessary to accurately identify crops and to monitor crop growth generally comes at the expense of coarser observation supports, and can lead to increasingly erroneous class allocations caused by mixed pixels. For a given application like crop classification, the spatial resolution requirement (e.g. in terms of a maximum tolerable pixel size) differs considerably over different landscapes. To analyse the spatial resolution requirements for accurate crop identification via image classification, this study builds upon and extends a conceptual framework established in a previous work¹. This framework allows defining quantitatively the spatial resolution requirements for crop monitoring based on simulating how agricultural landscapes, and more specifically the fields covered by a crop of interest, are seen by instruments with increasingly coarser resolving power. The concept of crop specific pixel purity, defined as the degree of homogeneity of the signal encoded in a pixel with respect to the target crop type, is used to analyse how mixed the pixels can be (as they become coarser), without undermining their capacity to describe the desired surface properties. In this case, this framework has been steered towards answering the question: "What is the spatial resolution requirement for crop identification via supervised image classification, in particular minimum and coarsest acceptable pixel sizes, and how do these requirements change over different landscapes?" The framework is applied over four contrasting agro-ecological landscapes in Middle Asia. Inputs to the experiment were eight multi-temporal images from the RapidEye sensor; the simulated pixel sizes ranged from 6.5 m to 396.5 m. 
Constraining parameters for crop identification were defined by setting thresholds for classification accuracy and uncertainty. Different types of crops display marked individuality regarding the pixel size requirements, depending on the spatial structures and cropping pattern in the sites. The coarsest acceptable pixel sizes and corresponding purities for the same type of crop were found to vary from site to site, and some crops could not be identified using pixels coarser than 200 m.

  1. Fast and Accurate Phylogenetic Reconstruction from High-Resolution Whole-Genome Data and a Novel Robustness Estimator

    NASA Astrophysics Data System (ADS)

    Lin, Yu; Rajan, Vaibhav; Moret, Bernard M. E.

    The rapid accumulation of whole-genome data has renewed interest in the study of genomic rearrangements. Comparative genomics, evolutionary biology, and cancer research all require models and algorithms to elucidate the mechanisms, history, and consequences of these rearrangements. However, even simple models lead to NP-hard problems, particularly in the area of phylogenetic analysis. Current approaches are limited to small collections of genomes and low-resolution data (typically a few hundred syntenic blocks). Moreover, whereas phylogenetic analyses from sequence data are deemed incomplete unless bootstrapping scores (a measure of confidence) are given for each tree edge, no equivalent to bootstrapping exists for rearrangement-based phylogenetic analysis.

  2. A classification of the Chloridoideae (Poaceae) based on multi-gene phylogenetic trees.

    PubMed

    Peterson, Paul M; Romaschenko, Konstantin; Johnson, Gabriel

    2010-05-01

    We conducted a molecular phylogenetic study of the subfamily Chloridoideae using six plastid DNA sequences (ndhA intron, ndhF, rps16-trnK, rps16 intron, rps3, and rpl32-trnL) and a single nuclear ITS DNA sequence. Our large original data set includes 246 species (17.3%) representing 95 genera (66%) of the grasses currently placed in the Chloridoideae. The maximum likelihood and Bayesian analysis of DNA sequences provides strong support for the monophyly of the Chloridoideae; followed by, in order of divergence: a Triraphideae clade with Neyraudia sister to Triraphis; an Eragrostideae clade with the Cotteinae (includes Cottea and Enneapogon) sister to the Uniolinae (includes Entoplocamia, Tetrachne, and Uniola), and a terminal Eragrostidinae clade of Ectrosia, Harpachne, and Psammagrostis embedded in a polyphyletic Eragrostis; a Zoysieae clade with Urochondra sister to a Zoysiinae (Zoysia) clade, and a terminal Sporobolinae clade that includes Spartina, Calamovilfa, Pogoneura, and Crypsis embedded in a polyphyletic Sporobolus; and a very large terminal Cynodonteae clade that includes 13 monophyletic subtribes. 
The Cynodonteae includes, in alphabetical order: Aeluropodinae (Aeluropus); Boutelouinae (Bouteloua); Eleusininae (includes Apochiton, Astrebla with Schoenefeldia embedded, Austrochloris, Brachyachne, Chloris, Cynodon with Brachyachne embedded in part, Eleusine, Enteropogon with Eustachys embedded in part, Eustachys, Chrysochloa, Coelachyrum, Leptochloa with Dinebra embedded, Lepturus, Lintonia, Microchloa, Saugetia, Schoenefeldia, Sclerodactylon, Tetrapogon, and Trichloris); Hilariinae (Hilaria); Monanthochloinae (includes Distichlis, Monanthochloe, and Reederochloa); Muhlenbergiinae (Muhlenbergia with Aegopogon, Bealia, Blepharoneuron, Chaboissaea, Lycurus, Pereilema, Redfieldia, Schaffnerella, and Schedonnardus all embedded); Orcuttiinae (includes Orcuttia and Tuctoria); Pappophorinae (includes Neesiochloa and Pappophorum); Scleropogoninae (includes Blepharidachne, Dasyochloa, Erioneuron, Munroa, Scleropogon, and Swallenia); Traginae (Tragus with Monelytrum, Polevansia, and Willkommia all embedded); Tridentinae (includes Gouinia, Tridens, Triplasis, and Vaseyochloa); Triodiinae (Triodia); and the Tripogoninae (Melanocenchris and Tripogon with Eragrostiella embedded). In our study the Cynodonteae still include 19 genera and the Zoysieae include a single genus that are not yet placed in a subtribe. The tribe Triraphideae and the subtribe Aeluropodinae are newly treated at that rank. We propose a new tribal and subtribal classification for all known genera in the Chloridoideae. The subfamily might have originated in Africa and/or Asia since the basal lineage, the Triraphideae, includes species with African and Asian distribution. PMID:20096795

  3. Photometric brown-dwarf classification. I. A method to identify and accurately classify large samples of brown dwarfs without spectroscopy

    NASA Astrophysics Data System (ADS)

    Skrzypek, N.; Warren, S. J.; Faherty, J. K.; Mortlock, D. J.; Burgasser, A. J.; Hewett, P. C.

    2015-02-01

    Aims: We present a method, named photo-type, to identify and accurately classify L and T dwarfs onto the standard spectral classification system using photometry alone. This enables the creation of large and deep homogeneous samples of these objects efficiently, without the need for spectroscopy. Methods: We created a catalogue of point sources with photometry in 8 bands, ranging from 0.75 to 4.6 μm, selected from an area of 3344 deg², by combining SDSS, UKIDSS LAS, and WISE data. Sources with 13.0 < J < 17.5, and Y - J > 0.8, were then classified by comparison against template colours of quasars, stars, and brown dwarfs. The L and T templates, spectral types L0 to T8, were created by identifying previously known sources with spectroscopic classifications, and fitting polynomial relations between colour and spectral type. Results: Of the 192 known L and T dwarfs with reliable photometry in the surveyed area and magnitude range, 189 are recovered by our selection and classification method. We have quantified the accuracy of the classification method both externally, with spectroscopy, and internally, by creating synthetic catalogues and accounting for the uncertainties. We find that, brighter than J = 17.5, photo-type classifications are accurate to one spectral sub-type, and are therefore competitive with spectroscopic classifications. The resultant catalogue of 1157 L and T dwarfs will be presented in a companion paper.

  4. More accurate diagnosis in electric power apparatus conditions using ensemble classification methods

    Microsoft Academic Search

    Hideo Hirose; Faisal Zaman

    2011-01-01

    Classification research has recently accelerated, especially in machine learning. Although the decision tree is still recommended as a classification tool for diagnosing electric power apparatus because its if-then rules are visible, recent developments in classification methods, especially ensemble methods, suggest applying these methods to condition diagnosis. In this

  5. Genetic diversity and phylogenetic classification of viral hemorrhagic septicemia virus (VHSV)

    E-print Network

    Paris-Sud XI, Université de

    of neutralizing antibodies. Four serotypes based on neutralization tests have been described. Improving our phylogenetic trees. According to the percentage of nucleotide or amino acid similarities, North American based on this variability could be correlated with both the geographic origin and the serological

  6. Highly Accurate Classification of Watson-Crick Basepairs on Termini of Single DNA Molecules

    Microsoft Academic Search

    Stephen Winters-Hilt; Wenonah Vercoutere; Veronica S. DeGuzman; David Deamer; Mark Akeson; David Haussler

    2003-01-01

    We introduce a computational method for classification of individual DNA molecules measured by an α-hemolysin channel detector. We show classification with better than 99% accuracy for DNA hairpin molecules that differ only in their terminal Watson-Crick basepairs. Signal classification was done in silico to establish performance metrics (i.e., where train and test data were of known type, via single-species data

  7. Beyond classification: gene-family phylogenies from shotgun metagenomic reads enable accurate community analysis

    PubMed Central

    2013-01-01

    Background Sequence-based phylogenetic trees are a well-established tool for characterizing diversity of both macroorganisms and microorganisms. Phylogenetic methods have recently been applied to shotgun metagenomic data from microbial communities, particularly with the aim of classifying reads. But the accuracy of gene-family phylogenies that characterize evolutionary relationships among short, non-overlapping sequencing reads has not been thoroughly evaluated. Results To quantify errors in metagenomic read trees, we developed MetaPASSAGE, a software pipeline to generate in silico bacterial communities, simulate a sample of shotgun reads from a gene family represented in the community, orient or translate reads, and produce a profile-based alignment of the reads from which a gene-family phylogenetic tree can be built. We applied MetaPASSAGE to a variety of RNA and protein-coding gene families, built trees using a range of different phylogenetic methods, and compared the resulting trees using topological and branch-length error metrics. We identified read length as one of the major sources of error. Because phylogenetic methods use a reference database of full-length sequences from the gene family to guide construction of alignments and trees, we found that error can also be substantially reduced through increasing the size and diversity of the reference database. Finally, UniFrac analysis, which compares metagenomic samples based on a summary statistic computed over all branches in a read tree, is very robust to the level of error we observe. Conclusions Bacterial community diversity can be quantified using phylogenetic approaches applied to shotgun metagenomic data. As sequencing reads get longer and more genomes across the bacterial tree of life are sequenced, the accuracy of this approach will continue to improve, opening the door to more applications. PMID:23799973

  8. Phylogenetic classification of the frog pathogen Amphibiothecum (Dermosporidium) penneri based on small ribosomal subunit sequencing

    USGS Publications Warehouse

    Feldman, S.H.; Wimsatt, J.H.; Green, D.E.

    2005-01-01

    We determined 1,600 base pairs of DNA sequence in the 18S small ribosomal subunit from two geographically distinct isolates of Dermosporidium penneri. Maximum likelihood and parsimony analysis of these sequences place D. penneri in the order Dermocystida of the class Mesomycetozoea. The 18S rRNA sequences from these two isolates only differ within a single region of 16 contiguous nucleotides. Based on the distant phylogenetic relationship of these organisms to Amphibiocystidium ranae and similarity to Sphaerothecum destruens we propose the organism be renamed Amphibiothecum penneri.

  9. Toward a Phylogenetic Classification of Primates Based on DNA Evidence Complemented by Fossil Evidence

    Microsoft Academic Search

    Morris Goodman; Calvin A. Porter; John Czelusniak; Scott L. Page; Horacio Schneider; Jeheskel Shoshani; Gregg Gunnell; Colin P. Groves

    1998-01-01

    A highly resolved primate cladogram based on DNA evidence is congruent with extant and fossil osteological evidence. A provisional primate classification based on this cladogram and the time scale provided by fossils and the model of local molecular clocks has all named taxa represent clades and assigns the same taxonomic rank to those clades of roughly equivalent age. Order Primates

  10. Archaeal--Eubacterial Mergers in the Origin of Eukarya: Phylogenetic Classification of Life

    Microsoft Academic Search

    Lynn Margulis

    1996-01-01

    A symbiosis-based phylogeny leads to a consistent, useful classification system for all life. "Kingdoms" and "Domains" are replaced by biological names for the most inclusive taxa: Prokarya (bacteria) and Eukarya (symbiosis-derived nucleated organisms). The earliest Eukarya, anaerobic mastigotes, hypothetically originated from permanent whole-cell fusion between members of Archaea (e.g., Thermoplasma-like organisms) and of Eubacteria (e.g., Spirochaeta-like organisms). Molecular biology, life-history,

  11. Isolation and phylogenetic classification of culturable psychrophilic prokaryotes from the Collins glacier in the Antarctica

    Microsoft Academic Search

    S. A. García-Echauri; M. Gidekel; A. Gutiérrez-Moraga; L. Santos; A. De León-Rodríguez

    Culturable psychrophilic prokaryotes were obtained from samples of glacier sediment, seaside mud, glacier melted ice, and Deschampsia antarctica rhizosphere from Collins glacier, Antarctica. The taxonomic classification was done by a culture-dependent molecular approach involving the Amplified Ribosomal DNA Restriction Analysis. Two hundred sixty colonies were successfully isolated and sub-cultivated under laboratory conditions. The analysis showed a bacterial profile dominated by

  12. Assignment of Calibration Information to Deeper Phylogenetic Nodes is More Effective in Obtaining Precise and Accurate Divergence Time Estimates

    PubMed Central

    Mello, Beatriz; Schrago, Carlos G

    2014-01-01

    Divergence time estimation has become an essential tool for understanding macroevolutionary events. Molecular dating aims to obtain reliable inferences, which, within a statistical framework, means jointly increasing the accuracy and precision of estimates. Bayesian dating methods exhibit the property of a linear relationship between uncertainty and estimated divergence dates. This relationship occurs even if the number of sites approaches infinity and places a limit on the maximum precision of node ages. However, how the placement of calibration information may affect the precision of divergence time estimates remains an open question. In this study, relying on simulated and empirical data, we investigated how the location of calibration within a phylogeny affects the accuracy and precision of time estimates. We found that calibration priors set at median and deep phylogenetic nodes were associated with higher precision values compared to analyses involving calibration at the shallowest node. The results were independent of the tree symmetry. An empirical mammalian dataset produced results that were consistent with those generated by the simulated sequences. Assigning time information to the deeper nodes of a tree is crucial to guarantee the accuracy and precision of divergence times. This finding highlights the importance of the appropriate choice of outgroups in molecular dating. PMID:24855333

  13. Assignment of Calibration Information to Deeper Phylogenetic Nodes is More Effective in Obtaining Precise and Accurate Divergence Time Estimates.

    PubMed

    Mello, Beatriz; Schrago, Carlos G

    2014-01-01

    Divergence time estimation has become an essential tool for understanding macroevolutionary events. Molecular dating aims to obtain reliable inferences, which, within a statistical framework, means jointly increasing the accuracy and precision of estimates. Bayesian dating methods exhibit the property of a linear relationship between uncertainty and estimated divergence dates. This relationship occurs even if the number of sites approaches infinity and places a limit on the maximum precision of node ages. However, how the placement of calibration information may affect the precision of divergence time estimates remains an open question. In this study, relying on simulated and empirical data, we investigated how the location of calibration within a phylogeny affects the accuracy and precision of time estimates. We found that calibration priors set at median and deep phylogenetic nodes were associated with higher precision values compared to analyses involving calibration at the shallowest node. The results were independent of the tree symmetry. An empirical mammalian dataset produced results that were consistent with those generated by the simulated sequences. Assigning time information to the deeper nodes of a tree is crucial to guarantee the accuracy and precision of divergence times. This finding highlights the importance of the appropriate choice of outgroups in molecular dating. PMID:24855333

  14. Archaeal-eubacterial mergers in the origin of Eukarya: phylogenetic classification of life.

    PubMed

    Margulis, L

    1996-02-01

    A symbiosis-based phylogeny leads to a consistent, useful classification system for all life. "Kingdoms" and "Domains" are replaced by biological names for the most inclusive taxa: Prokarya (bacteria) and Eukarya (symbiosis-derived nucleated organisms). The earliest Eukarya, anaerobic mastigotes, hypothetically originated from permanent whole-cell fusion between members of Archaea (e.g., Thermoplasma-like organisms) and of Eubacteria (e.g., Spirochaeta-like organisms). Molecular biology, life-history, and fossil record evidence support the reunification of bacteria as Prokarya while subdividing Eukarya into uniquely defined subtaxa: Protoctista, Animalia, Fungi, and Plantae. PMID:8577716

  15. Archaeal-eubacterial mergers in the origin of Eukarya: phylogenetic classification of life.

    PubMed Central

    Margulis, L

    1996-01-01

    A symbiosis-based phylogeny leads to a consistent, useful classification system for all life. "Kingdoms" and "Domains" are replaced by biological names for the most inclusive taxa: Prokarya (bacteria) and Eukarya (symbiosis-derived nucleated organisms). The earliest Eukarya, anaerobic mastigotes, hypothetically originated from permanent whole-cell fusion between members of Archaea (e.g., Thermoplasma-like organisms) and of Eubacteria (e.g., Spirochaeta-like organisms). Molecular biology, life-history, and fossil record evidence support the reunification of bacteria as Prokarya while subdividing Eukarya into uniquely defined subtaxa: Protoctista, Animalia, Fungi, and Plantae. Images Fig. 1 PMID:8577716

  16. Archaeal-eubacterial mergers in the origin of Eukarya: phylogenetic classification of life

    NASA Technical Reports Server (NTRS)

    Margulis, L.

    1996-01-01

    A symbiosis-based phylogeny leads to a consistent, useful classification system for all life. "Kingdoms" and "Domains" are replaced by biological names for the most inclusive taxa: Prokarya (bacteria) and Eukarya (symbiosis-derived nucleated organisms). The earliest Eukarya, anaerobic mastigotes, hypothetically originated from permanent whole-cell fusion between members of Archaea (e.g., Thermoplasma-like organisms) and of Eubacteria (e.g., Spirochaeta-like organisms). Molecular biology, life-history, and fossil record evidence support the reunification of bacteria as Prokarya while subdividing Eukarya into uniquely defined subtaxa: Protoctista, Animalia, Fungi, and Plantae.

  17. Automatic phylogenetic classification of bacterial beta-lactamase sequences including structural and antibiotic substrate preference information.

    PubMed

    Ma, Jianmin; Eisenhaber, Frank; Maurer-Stroh, Sebastian

    2013-12-01

    Beta lactams comprise the largest and still most effective group of antibiotics, but bacteria can gain resistance through different beta lactamases that can degrade these antibiotics. We developed a user friendly tree building web server that allows users to assign beta lactamase sequences to their respective molecular classes and subclasses. Further clinically relevant information includes if the gene is typically chromosomal or transferable through plasmids as well as listing the antibiotics which the most closely related reference sequences are known to target and cause resistance against. This web server can automatically build three phylogenetic trees: the first tree with closely related sequences from a Tachyon search against the NCBI nr database, the second tree with curated reference beta lactamase sequences, and the third tree built specifically from substrate binding pocket residues of the curated reference beta lactamase sequences. We show that the latter is better suited to recover antibiotic substrate assignments through nearest neighbor annotation transfer. The users can also choose to build a structural model for the query sequence and view the binding pocket residues of their query relative to other beta lactamases in the sequence alignment as well as in the 3D structure relative to bound antibiotics. This web server is freely available at http://blac.bii.a-star.edu.sg/. PMID:24372040

  18. Expression analysis of LIM gene family in poplar, toward an updated phylogenetic classification

    PubMed Central

    2012-01-01

    Background: Plant LIM domain proteins may act as transcriptional activators of lignin biosynthesis and/or as actin binding and bundling proteins. Plant LIM genes have evolved in phylogenetic subgroups differing in their expression profiles: in the whole plant or specifically in pollen. However, several poplar PtLIM genes belong to uncharacterized monophyletic subgroups and the expression patterns of the LIM gene family in a woody plant have not been studied. Findings: In this work, the expression pattern of the twelve duplicated poplar PtLIM genes has been investigated by semi quantitative RT-PCR in different vegetative and reproductive tissues. As in other plant species, poplar PtLIM genes were widely expressed in the tree or in particular tissues. In particular, PtXLIM1a, PtXLIM1b and PtWLIM1b genes were preferentially expressed in the secondary xylem, suggesting a specific function in wood formation. Moreover, the expression of these genes and of the PtPLIM2a gene was increased in tension wood. Western-blot analysis confirmed the preferential expression of PtXLIM1a protein during xylem differentiation and tension wood formation. Genes classified within the pollen specific PLIM2 and PLIM2-like subgroups were all strongly expressed in pollen but also in cottony hairs. Interestingly, pairs of duplicated PtLIM genes exhibited different expression patterns indicating subfunctionalisations in specific tissues. Conclusions: The strong expression of several LIM genes in cottony hairs and germinating pollen, as well as in xylem fibers suggests an involvement of plant LIM domain proteins in the control of cell expansion. Comparisons of expression profiles of poplar LIM genes with the published functions of closely related plant LIM genes suggest conserved functions in the areas of lignin biosynthesis, pollen tube growth and mechanical stress response. Based on these results, we propose a novel nomenclature of poplar LIM domain proteins. PMID:22339987

  19. Deceptive Desmas: Molecular Phylogenetics Suggests a New Classification and Uncovers Convergent Evolution of Lithistid Demosponges

    PubMed Central

    Schuster, Astrid; Erpenbeck, Dirk; Pisera, Andrzej; Hooper, John; Bryce, Monika; Fromont, Jane; Wörheide, Gert

    2015-01-01

    Reconciling the fossil record with molecular phylogenies to enhance the understanding of animal evolution is a challenging task, especially for taxa with a mostly poor fossil record, such as sponges (Porifera). ‘Lithistida’, a polyphyletic group of recent and fossil sponges, are an exception as they provide the richest fossil record among demosponges. Lithistids, currently encompassing 13 families, 41 genera and >300 recent species, are defined by the common possession of peculiar siliceous spicules (desmas) that characteristically form rigid articulated skeletons. Their phylogenetic relationships are to a large extent unresolved and there has been no (taxonomically) comprehensive analysis to formally reallocate lithistid taxa to their closest relatives. This study, based on the most comprehensive molecular and morphological investigation of ‘lithistid’ demosponges to date, corroborates some previous weakly-supported hypotheses, and provides novel insights into the evolutionary relationships of the previous ‘order Lithistida’. Based on molecular data (partial mtDNA CO1 and 28S rDNA sequences), we show that 8 out of 13 ‘Lithistida’ families belong to the order Astrophorida, whereas Scleritodermidae and Siphonidiidae form a separate monophyletic clade within Tetractinellida. Most lithistid astrophorids are dispersed between different clades of the Astrophorida and we propose to formally reallocate them, respectively. Corallistidae, Theonellidae and Phymatellidae are monophyletic, whereas the families Pleromidae and Scleritodermidae are polyphyletic. Family Desmanthidae is polyphyletic and groups within Halichondriidae – we formally propose a reallocation. The sister group relationship of the family Vetulinidae to Spongillida is confirmed and we propose here for the first time to include Vetulina into a new Order Sphaerocladina. 
Megascleres and microscleres possibly evolved and/or were lost several times independently in different ‘lithistid’ taxa, and microscleres might be at least four times more likely to be lost than megascleres. Desma spicules may occasionally have undergone secondary losses too. Our study provides a framework for further detailed investigations of this important demosponge group. PMID:25565279

  20. Phylogenetic analysis, genomic diversity and classification of M class gene segments of turkey reoviruses.

    PubMed

    Mor, Sunil K; Marthaler, Douglas; Verma, Harsha; Sharafeldin, Tamer A; Jindal, Naresh; Porter, Robert E; Goyal, Sagar M

    2015-03-23

    From 2011 to 2014, 13 turkey arthritis reoviruses (TARVs) were isolated from cases of swollen hock joints in 2-18-week-old turkeys. In addition, two isolates from similar cases of turkey arthritis were received from another laboratory. Eight turkey enteric reoviruses (TERVs) isolated from fecal samples of turkeys were also used for comparison. The aims of this study were to characterize turkey reovirus (TRV) based on complete M class genome segments and to determine genetic diversity within TARVs in comparison to TERVs and chicken reoviruses (CRVs). Nucleotide (nt) cut off values of 84%, 83% and 85% for the M1, M2 and M3 gene segments were proposed and used for genotype classification, generating 5, 7, and 3 genotypes, respectively. Using these nt cut off values, we propose M class genotype constellations (GCs) for avian reoviruses. Of the seven GCs, GC1 and GC3 were shared between the TARVs and TERVs, indicating possible reassortment between turkey and chicken reoviruses. The TARVs and TERVs were divided into three GCs, and GC2 was unique to TARVs and TERVs. The proposed new GC approach should be useful in identifying reassortant viruses, which may ultimately be used in the design of a universal vaccine against both chicken and turkey reoviruses. PMID:25655814

  1. A Comprehensive Guide for the Accurate Classification of Murine Hair Follicles in Distinct Hair Cycle Stages

    Microsoft Academic Search

    Sven Müller-Röver; Bori Handjiski; Carina van der Veen; Stefan Eichmüller; Kerstin Foitzik; Ian A. McKay; Kurt S. Stenn; Ralf Paus

    2001-01-01

    Numerous strains of mice with defined mutations display pronounced abnormalities of hair follicle cycling, even in the absence of overt alterations of the skin and hair phenotype; however, in order to recognize even subtle, hair cycle-related abnormalities, it is critically important to be able to determine accurately and classify the major stages of the normal murine hair cycle. In this

  2. GPD: A Graph Pattern Diffusion Kernel for Accurate Graph Classification with Applications in Cheminformatics

    Microsoft Academic Search

    Aaron Smalter; Jun Huan; Yi Jia; Gerald Lushington

    2010-01-01

    Graph data mining is an active research area. Graphs are general modeling tools to organize information from heterogeneous sources and have been applied in many scientific, engineering, and business fields. With the fast accumulation of graph data, building highly accurate predictive models for graph data emerges as a new challenge that has not been fully explored in the data mining

  3. Photometric brown-dwarf classification. I. A method to identify and accurately classify large samples of brown dwarfs without spectroscopy

    E-print Network

    Skrzypek, Nathalie; Faherty, Jacqueline K; Mortlock, Daniel J; Burgasser, Adam J; Hewett, Paul C

    2014-01-01

    Aims. We present a method, named photo-type, to identify and accurately classify L and T dwarfs onto the standard spectral classification system using photometry alone. This enables the creation of large and deep homogeneous samples of these objects efficiently, without the need for spectroscopy. Methods. We created a catalogue of point sources with photometry in 8 bands, ranging from 0.75 to 4.6 microns, selected from an area of 3344 deg^2, by combining SDSS, UKIDSS LAS, and WISE data. Sources with 13.0 < J < 17.5, and Y - J > 0.8, were then classified by comparison against template colours of quasars, stars, and brown dwarfs. The L and T templates, spectral types L0 to T8, were created by identifying previously known sources with spectroscopic classifications, and fitting polynomial relations between colour and spectral type. Results. Of the 192 known L and T dwarfs with reliable photometry in the surveyed area and magnitude range, 189 are recovered by our selection and classification method. We have quantified the accuracy of th...

  4. Classification algorithms with multi-modal data fusion could accurately distinguish neuromyelitis optica from multiple sclerosis.

    PubMed

    Eshaghi, Arman; Riyahi-Alam, Sadjad; Saeedi, Roghayyeh; Roostaei, Tina; Nazeri, Arash; Aghsaei, Aida; Doosti, Rozita; Ganjgahi, Habib; Bodini, Benedetta; Shakourirad, Ali; Pakravan, Manijeh; Ghana'ati, Hossein; Firouznia, Kavous; Zarei, Mojtaba; Azimi, Amir Reza; Sahraian, Mohammad Ali

    2015-01-01

    Neuromyelitis optica (NMO) exhibits substantial similarities to multiple sclerosis (MS) in clinical manifestations and imaging results and has long been considered a variant of MS. With the advent of a specific biomarker in NMO, known as anti-aquaporin 4, this assumption has changed; however, the differential diagnosis remains challenging and it is still not clear whether a combination of neuroimaging and clinical data could be used to aid clinical decision-making. Computer-aided diagnosis is a rapidly evolving process that holds great promise to facilitate objective differential diagnoses of disorders that show similar presentations. In this study, we aimed to use a powerful method for multi-modal data fusion, known as multi-kernel learning, to perform automatic diagnosis of subjects. We included 30 patients with NMO, 25 patients with MS and 35 healthy volunteers and performed multi-modal imaging with T1-weighted high resolution scans, diffusion tensor imaging (DTI) and resting-state functional MRI (fMRI). In addition, subjects underwent clinical examinations and cognitive assessments. We included 18 a priori predictors from neuroimaging, clinical and cognitive measures in the initial model. We used 10-fold cross-validation to learn the importance of each modality, train and finally test the model performance. The mean accuracy in differentiating between MS and NMO was 88%, where visible white matter lesion load, normal appearing white matter (DTI) and functional connectivity had the most important contributions to the final classification. In a multi-class classification problem we distinguished between all 3 groups (MS, NMO and healthy controls) with an average accuracy of 84%. In this classification, visible white matter lesion load, functional connectivity, and cognitive scores were the 3 most important modalities. 
Our work provides preliminary evidence that computational tools can be used to help make an objective differential diagnosis of NMO and MS. PMID:25610795

  5. Use B-factor related features for accurate classification between protein binding interfaces and crystal packing contacts

    PubMed Central

    2014-01-01

    Background: Distinction between true protein interactions and crystal packing contacts is important for structural bioinformatics studies to respond to the need for accurate classification of the rapidly increasing number of protein structures. There are many unannotated crystal contacts and there also exist false annotations in this rapidly expanding volume of data. Previous tools have been proposed to address this problem. However, challenging issues still remain, such as low performance when the training and test data contain mixed interfaces having diverse sizes of contact areas. Methods and results: B factor is a measure to quantify the vibrational motion of an atom, a more relevant feature than interface size to characterize protein binding. We propose to use three features related to B factor for the classification between biological interfaces and crystal packing contacts. The first feature is the sum of the normalized B factors of the interfacial atoms in the contact area, the second is the average of the interfacial B factor per residue in the chain, and the third is the average number of interfacial atoms with a negative normalized B factor per residue in the chain. We investigate the distribution properties of these basic features and a compound feature on four datasets of biological binding and crystal packing, and on a protein binding-only dataset with known binding affinity. We also compare the cross-dataset classification performance of these features with existing methods and with interface area, a widely used and the most effective existing feature. The results demonstrate that our features remarkably outperform the interface area approach and the existing prediction methods for many tests on all of these datasets. Conclusions: The proposed B factor related features are more effective than interface area to distinguish crystal packing from biological binding interfaces. 
Our computational methods have a potential for large-scale and accurate identification of biological interactions from the experimentally determined structural data stored at PDB which may have diverse interface sizes. PMID:25522196
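
    The three B-factor features described in this abstract can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not confirm: the "normalized B factor" is taken to be a z-score over all atoms of the chain, "per residue in the chain" is read as division by the number of distinct residues in the chain, and the names `Atom` and `b_factor_features` are hypothetical. It is not the authors' implementation.

```python
from statistics import mean, pstdev
from typing import List, NamedTuple, Tuple


class Atom(NamedTuple):
    """Hypothetical minimal atom record (not a real PDB parser type)."""
    residue_id: int
    b_factor: float
    interfacial: bool  # True if the atom lies in the contact area


def b_factor_features(chain: List[Atom]) -> Tuple[float, float, float]:
    """Return three B-factor features for one chain (assumed definitions)."""
    if not chain:
        raise ValueError("chain must contain at least one atom")
    b_values = [a.b_factor for a in chain]
    mu, sigma = mean(b_values), pstdev(b_values)
    if sigma == 0:
        raise ValueError("all B factors are identical; cannot normalize")
    # Assumption: normalization is a z-score over the atoms of the chain.
    z = [(a.b_factor - mu) / sigma for a in chain]
    n_residues = len({a.residue_id for a in chain})
    interface_z = [zi for a, zi in zip(chain, z) if a.interfacial]
    # Feature 1: sum of normalized B factors of the interfacial atoms.
    f1 = sum(interface_z)
    # Feature 2: interfacial normalized B factor averaged per residue.
    f2 = f1 / n_residues
    # Feature 3: interfacial atoms with negative normalized B, per residue.
    f3 = sum(1 for zi in interface_z if zi < 0) / n_residues
    return f1, f2, f3
```

    Under these assumptions, rigid interfacial atoms (low B factor) push the first two features negative and the third upward, which matches the abstract's premise that B factor measures vibrational motion.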

  6. Phylogenetic Classification and Species Identification of Dermatophyte Strains Based on DNA Sequences of Nuclear Ribosomal Internal Transcribed Spacer 1 Regions

    Microsoft Academic Search

    KOICHI MAKIMURA; YOSHIKO TAMURA; TAKASHI MOCHIZUKI; ATSUHIKO HASEGAWA; YOSHITO TAJIRI; RYO HANAZAWA; KATSUHISA UCHIDA; HIUGA SAITO; HIDEYO YAMAGUCHI

    1999-01-01

    The mutual phylogenetic relationships of dermatophytes of the genera Trichophyton, Microsporum, and Epidermophyton were demonstrated by using internal transcribed spacer 1 (ITS1) region ribosomal DNA sequences. Trichophyton spp. and Microsporum spp. form a cluster in the phylogenetic tree with Epidermophyton floccosum as an outgroup, and within this cluster, all Trichophyton spp. except Trichophyton terrestre form a nested cluster (100% bootstrap

  7. The genus Spiroplasma and its non-helical descendants: phylogenetic classification, correlation with phenotype and roots of the Mycoplasma mycoides clade.

    PubMed

    Gasparich, Gail E; Whitcomb, Robert F; Dodge, Deborah; French, Frank E; Glass, John; Williamson, David L

    2004-05-01

    The genus Spiroplasma (helical mollicutes: Bacteria: Firmicutes: Mollicutes: Entomoplasmatales: Spiroplasmataceae) is associated primarily with insects. The Mycoplasma mycoides cluster (sensu Weisburg et al. 1989 and Johansson and Pettersson 2002) is a group of mollicutes that includes the type species - Mycoplasma mycoides - of Mycoplasmatales, Mycoplasmataceae and Mycoplasma. This cluster, associated solely with ruminants, contains five other species and subspecies. Earlier phylogenetic reconstructions based on partial 16S rDNA sequences and a limited sample of Spiroplasma and Mycoplasma sequences suggested that the genus Mycoplasma was polyphyletic, as the M. mycoides cluster and the grouping that consisted of the hominis and pneumoniae groups of Mycoplasma species were widely separated phylogenetically and the M. mycoides cluster was allied with Spiroplasma. It is shown here that the M. mycoides cluster arose from Spiroplasma through an intermediate group of non-helical spiroplasmal descendants - the Entomoplasmataceae. As this conclusion has profound implications in the taxonomy of Mollicutes, a detailed phylogenetic study of Spiroplasma and its non-helical descendants was undertaken. These analyses, done with maximum-parsimony, provide cladistic status; a new nomenclature is introduced here, based on 'bottom-up' rather than 'top-down' clade classification. The order Entomoplasmatales consists of four major clades: (i) the Mycoides-Entomoplasmataceae clade, which contains M. mycoides and its allies and Entomoplasma and Mesoplasma species and is a sister lineage to (ii) the Apis clade of Spiroplasma. Spiroplasma and the Entomoplasmataceae are paraphyletic, but this status does not diminish their phylogenetic usefulness. Five species that were previously unclassified phylogenetically are basal to the Apis clade sensu stricto and to the Mycoides clade. One of these species, Spiroplasma sp. TIUS-1, has very poor helicity and a very small genome (840 kbp); this putative species can be envisioned as a 'missing link' in the evolution of the Mycoides-Entomoplasmataceae clade. The other two Spiroplasma clades are: (iii) the Citri-Chrysopicola-Mirum clade (serogroups I, II, V and VIII) and (iv) the ixodetis clade (serogroup VI). As Mesoplasma lactucae represents a basal divergence within the Mycoides-Entomoplasmataceae clade, and as Entomoplasma freundtii is basal to the Mycoides clade, M. mycoides and its allies must have arisen from an ancestor in the Entomoplasmataceae. The paraphyletic grouping that consists of the Hominis and Pneumoniae groups (sensu Johansson & Pettersson 2002) of Mycoplasma species contains the ancestral roots of Ureaplasma spp. and haemoplasmas. This clade is a sister lineage to the Entomoplasmatales clade. Serological classifications of spiroplasma are very highly supported by the trees presented. Genome size and G+C content of micro-organismal DNA were moderately conserved, but there have been frequent and polyphyletically distributed genome reductions. Sterol requirements were polyphyletic, as was the ability to grow in the presence of polyoxyethylene sorbitan-supplemented, but not serum-supplemented, media. As this character is not phylogenetically distributed, Mesoplasma and Entomoplasma should be combined into a single genus. The phylogenetic trees presented here confirm previous reports of polyphyly of the genus Mycoplasma. As both clades of Mycoplasma contain several species of great practical importance, a change of the genus name for species in either clade would have immense practical implications. In addition, a change of the genus name for M. mycoides would have to be approved by the Judicial Commission. For these reasons, the Linnaean and phylogenetic classifications of Mycoplasma must for now be discrepant. PMID:15143041

  8. A comprehensive multilocus phylogeny of the Neotropical cotingas (Cotingidae, Aves) with a comparative evolutionary analysis of breeding system and plumage dimorphism and a revised phylogenetic classification.

    PubMed

    Berv, Jacob S; Prum, Richard O

    2014-12-01

    The Neotropical cotingas (Cotingidae: Aves) are a group of passerine birds that are characterized by extreme diversity in morphology, ecology, breeding system, and behavior. Here, we present a comprehensive phylogeny of the Neotropical cotingas based on six nuclear and mitochondrial loci (~7500 bp) for a sample of 61 cotinga species in all 25 genera, and 22 species of suboscine outgroups. Our taxon sample more than doubles the number of cotinga species studied in previous analyses, and allows us to test the monophyly of the cotingas as well as their intrageneric relationships with high resolution. We analyze our genetic data using a Bayesian species tree method, and concatenated Bayesian and maximum likelihood methods, and present a highly supported phylogenetic hypothesis. We confirm the monophyly of the cotingas, and present the first phylogenetic evidence for the relationships of Phibalura flavirostris as the sister group to Ampelion and Doliornis, and the paraphyly of Lipaugus with respect to Tijuca. In addition, we resolve the diverse radiations within the Cotinga, Lipaugus, Pipreola, and Procnias genera. We find no support for Darwin's (1871) hypothesis that the increase in sexual selection associated with polygynous breeding systems drives the evolution of color dimorphism in the cotingas, at least when analyzed at a broad categorical scale. Finally, we present a new comprehensive phylogenetic classification of all cotinga species. PMID:25234241

  9. Phylogenetics, ancestral state reconstruction, and a new infrafamilial classification of the pantropical Ochnaceae (Medusagynaceae, Ochnaceae s.str., Quiinaceae) based on five DNA regions.

    PubMed

    Schneider, Julio V; Bissiengou, Pulcherie; Amaral, Maria do Carmo E; Tahir, Ali; Fay, Michael F; Thines, Marco; Sosef, Marc S M; Zizka, Georg; Chatrou, Lars W

    2014-09-01

    Ochnaceae s.str. (Malpighiales) are a pantropical family of about 500 species and 27 genera of almost exclusively woody plants. Infrafamilial classification and relationships have been controversial partially due to the lack of a robust phylogenetic framework. Including all genera except Indosinia and Perissocarpa and DNA sequence data for five DNA regions (ITS, matK, ndhF, rbcL, trnL-F), we provide for the first time a nearly complete molecular phylogenetic analysis of Ochnaceae s.l. resolving most of the phylogenetic backbone of the family. Based on this, we present a new classification of Ochnaceae s.l., with Medusagynoideae and Quiinoideae included as subfamilies and the former subfamilies Ochnoideae and Sauvagesioideae recognized at the rank of tribe. Our data support a monophyletic Ochneae, but Sauvagesieae in the traditional circumscription is paraphyletic because Testulea emerges as sister to the rest of Ochnoideae, and the next clade shows Luxemburgia+Philacra as sister group to the remaining Ochnoideae. To avoid paraphyly, we classify Luxemburgieae and Testuleeae as new tribes. The African genus Lophira, which has switched between subfamilies (here tribes) in past classifications, emerges as sister to all other Ochneae. Thus, endosperm-free seeds and ovules with partly to completely united integuments (resulting in an apparently single integument) are characters that unite all members of that tribe. The relationships within its largest clade, Ochnineae (former Ochneae), are poorly resolved, but former Ochninae (Brackenridgea, Ochna) are polyphyletic. Within Sauvagesieae, the genus Sauvagesia in its broad circumscription is polyphyletic as Sauvagesia serrata is sister to a clade of Adenarake, Sauvagesia spp., and three other genera. Within Quiinoideae, in contrast to former phylogenetic hypotheses, Lacunaria and Touroulia form a clade that is sister to Quiina. 
Bayesian ancestral state reconstructions showed that zygomorphic flowers with adaptations to buzz-pollination (poricidal anthers), a syncarpous gynoecium (a near-apocarpous gynoecium evolved independently in Quiinoideae and Ochninae), numerous ovules, septicidal capsules, and winged seeds with endosperm are the ancestral condition in Ochnoideae. Although in some lineages poricidal anthers were lost secondarily, the evolution of poricidal superstructures secured the maintenance of buzz-pollination in some of these genera, indicating a strong selective pressure on keeping that specialized pollination system. PMID:24862223

  10. Use of Extended Phylogenetic Profiles with E-Values and Support Vector Machines for Protein Family Classification

    E-print Network

    Liao, Li

    Such extension allows for direct use of E-Values, instead of imposing an ad hoc cut-off to derive binary profiles, which are commonly used in previous methods. A scoring scheme is adopted for measuring the similarity

  11. Classification

    NASA Astrophysics Data System (ADS)

    Oza, Nikunj

    2012-03-01

    A supervised learning task involves constructing a mapping from input data (normally described by several features) to the appropriate outputs. A set of training examples— examples with known output values—is used by a learning algorithm to generate a model. This model is intended to approximate the mapping between the inputs and outputs. This model can be used to generate predicted outputs for inputs that have not been seen before. Within supervised learning, one type of task is a classification learning task, in which each output is one or more classes to which the input belongs. For example, we may have data consisting of observations of sunspots. In a classification learning task, our goal may be to learn to classify sunspots into one of several types. Each example may correspond to one candidate sunspot with various measurements or just an image. A learning algorithm would use the supplied examples to generate a model that approximates the mapping between each supplied set of measurements and the type of sunspot. This model can then be used to classify previously unseen sunspots based on the candidate’s measurements. The generalization performance of a learned model (how closely the target outputs and the model’s predicted outputs agree for patterns that have not been presented to the learning algorithm) would provide an indication of how well the model has learned the desired mapping. More formally, a classification learning algorithm L takes a training set T as its input. The training set consists of |T| examples or instances. It is assumed that there is a probability distribution D from which all training examples are drawn independently—that is, all the training examples are independently and identically distributed (i.i.d.). 
The ith training example is of the form (x_i, y_i), where x_i is a vector of values of several features and y_i represents the class to be predicted. In the sunspot classification example given above, each training example would represent one sunspot’s classification (y_i) and the corresponding set of measurements (x_i). The output of a supervised learning algorithm is a model h that approximates the unknown mapping from the inputs to the outputs. In our example, h would map from the sunspot measurements to the type of sunspot. We may have a test set S, a set of examples not used in training, that we use to test how well the model h predicts the outputs on new examples. Just as with the examples in T, the examples in S are assumed to be independent and identically distributed (i.i.d.) draws from the distribution D. We measure the error of h on the test set as the proportion of test cases that h misclassifies: $\frac{1}{|S|} \sum_{(x,y) \in S} I(h(x) \neq y)$, where I(v) is the indicator function: it returns 1 if v is true and 0 otherwise. In our sunspot classification example, we would identify additional examples of sunspots that were not used in generating the model, and use these to determine how accurate the model is, that is, the fraction of the test samples that the model classifies correctly. An example of a classification model is the decision tree shown in Figure 23.1. We will discuss the decision tree learning algorithm in more detail later; for now, we assume that, given a training set with examples of sunspots, this decision tree is derived. This can be used to classify previously unseen examples of sunspots. For example, if a new sunspot’s inputs indicate that its "Group Length" is in the range 10-15, then the decision tree would classify the sunspot as being of type "E," whereas if the "Group Length" is "NULL," the "Magnetic Type" is "bipolar," and the "Penumbra" is "rudimentary," then it would be classified as type "C." 
In this chapter, we will add to the above description of classification problems. We will discuss decision trees and several other classification models. In particular, we will discuss the learning algorithms that generate these classification models, how to use them to classify new

  12. Increasing the data size to accurately reconstruct the phylogenetic relationships between nine subgroups of the Drosophila melanogaster species group (Drosophilidae, Diptera).

    PubMed

    Yang, Yong; Hou, Zhuo-Cheng; Qian, Yuan-Huai; Kang, Han; Zeng, Qing-Tao

    2012-01-01

    Previous phylogenetic analyses of the melanogaster species group have led to conflicting hypotheses concerning their relationship; therefore the addition of new sequence data is necessary to discover the phylogeny of this species group. Here we present new data derived from 17 genes and representing 48 species to reconstruct the phylogeny of the melanogaster group. A variety of statistical tests, as well as maximum likelihood mapping analysis, were performed to estimate data quality, suggesting that all genes contributed substantially to resolving the phylogeny. Each locus was analyzed individually using maximum likelihood (ML), and the concatenated dataset (12,988 bp) was analyzed using partitioned ML and Bayesian analyses. The separate analyses produced various phylogenetic relationships; however, the phylogenetic topologies from ML and Bayesian analysis of the concatenated dataset were, at the subgroup level, completely identical to each other with high levels of support. Our results recovered three major clades: the ananassae subgroup, followed by the montium subgroup, and a third monophyletic clade formed by the melanogaster subgroup and the oriental subgroups, in which melanogaster (takahashii, suzukii) forms one subclade and ficusphila [eugracilis (elegans, rhopaloa)] forms another. However, more data are necessary to determine the phylogenetic position of Drosophila lucipennis which proved difficult to place. PMID:21985965

  13. Comprehensive Phylogenetic Reconstructions of African Swine Fever Virus: Proposal for a New Classification and Molecular Dating of the Virus

    PubMed Central

    Michaud, Vincent; Randriamparany, Tantely; Albina, Emmanuel

    2013-01-01

    African swine fever (ASF) is a highly lethal disease of domestic pigs caused by the only known DNA arbovirus. It was first described in Kenya in 1921 and since then many isolates have been collected worldwide. However, although several phylogenetic studies have been carried out to understand the relationships between the isolates, no molecular dating analyses have been achieved so far. In this paper, comprehensive phylogenetic reconstructions were made using newly generated, publicly available sequences of hundreds of ASFV isolates from the past 70 years. Analyses focused on B646L, CP204L, and E183L genes from 356, 251, and 123 isolates, respectively. Phylogenetic analyses were achieved using maximum likelihood and Bayesian coalescence methods. A new lineage-based nomenclature is proposed to designate 35 different clusters. In addition, dating of ASFV origin was carried out from the molecular data sets. To avoid bias, diversity due to positive selection or recombination events was neutralized. The molecular clock analyses revealed that ASFV strains currently circulating have evolved over 300 years, with a time to the most recent common ancestor (TMRCA) in the early 18th century. PMID:23936068

  14. Analysis of genetic diversity in banana cultivars (Musa cvs.) from the South of Oman using AFLP markers and classification by phylogenetic, hierarchical clustering and principal component analyses

    PubMed Central

    Opara, Umezuruike Linus; Jacobson, Dan; Al-Saady, Nadiya Abubakar

    2010-01-01

    Banana is an important crop grown in Oman and there is a dearth of information on its genetic diversity to assist in crop breeding and improvement programs. This study employed amplified fragment length polymorphism (AFLP) to investigate the genetic variation in local banana cultivars from the southern region of Oman. Using 12 primer combinations, a total of 1094 bands were scored, of which 1012 were polymorphic. Eighty-two unique markers were identified, which revealed the distinct separation of the seven cultivars. The results obtained show that AFLP can be used to differentiate the banana cultivars. Further classification by phylogenetic, hierarchical clustering and principal component analyses showed significant differences between the clusters found with molecular markers and those clusters created by previous studies using morphological analysis. Based on the analytical results, a consensus dendrogram of the banana cultivars is presented. PMID:20443211

  15. DEFLATE Compression Algorithm Corrects for Overestimation of Phylogenetic Diversity by Grantham Approach to Single-Nucleotide Polymorphism Classification

    PubMed Central

    Schlosberg, Arran; Lam, Brian Y. H.; Yeo, Giles S. H.; Clifton-Bligh, Roderick J.

    2014-01-01

    Improvements in speed and cost of genome sequencing are resulting in increasing numbers of novel non-synonymous single nucleotide polymorphisms (nsSNPs) in genes known to be associated with disease. The large number of nsSNPs makes laboratory-based classification infeasible and familial co-segregation with disease is not always possible. In-silico methods for classification or triage are thus utilised. A popular tool based on multiple-species sequence alignments (MSAs) and work by Grantham, Align-GVGD, has been shown to underestimate deleterious effects, particularly as sequence numbers increase. We utilised the DEFLATE compression algorithm to account for expected variation across a number of species. With the adjusted Grantham measure we derived a means of quantitatively clustering known neutral and deleterious nsSNPs from the same gene; this was then used to assign novel variants to the most appropriate cluster as a means of binary classification. Scaling of clusters allows for inter-gene comparison of variants through a single pathogenicity score. The approach improves upon the classification accuracy of Align-GVGD while correcting for sensitivity to large MSAs. Open-source code and a web server are made available at https://github.com/aschlosberg/CompressGV. PMID:24828207

  16. Classification

    ERIC Educational Resources Information Center

    Clary, Renee; Wandersee, James

    2013-01-01

    In this article, Renee Clary and James Wandersee describe the beginnings of "Classification," which lies at the very heart of science and depends upon pattern recognition. Clary and Wandersee approach patterns by first telling the story of the "Linnaean classification system," introduced by Carl Linnaeus (1707-1778), who is…

  17. Two new species of Geodiscelis Michener & Rozen (Hymenoptera: Apoidea: Colletidae) with a phylogenetic analysis and subgeneric classification of the genus.

    PubMed

    Packer, Laurence; Dumesh, Sheila

    2014-01-01

    Two new species of the genus Geodiscelis are described: Geodiscelis nazcalinea Packer & Dumesh, sp. nov. from Peru (the first record of the genus from that country) and G. phisquiri Packer & Dumesh, sp. nov. from northern Chile. The new species are most closely related to G. longiceps, but differ primarily in having somewhat less elongate heads and in details of the male terminalia. A key to the five known species of the genus is provided as are the results of a phylogenetic analysis based upon 68 characters, and the genus is formally divided into three subgenera: Geodiscelis s. str. Michener and Rozen, Geodiscelis (Nazcoediscelis) Packer and Dumesh, subgenus nov. and Geodiscelis (Thaumoediscelis) Packer and Dumesh, subgenus nov. The two new species described herein belong to subgenus Geodiscelis (Nazcoediscelis). Figures of the most important characters are provided. Tiquilia sp. (Boraginaceae) is the probable floral host of both new species and it is suggested that all species are ground-nesters. Sexual dimorphism in an unusual character is recorded for G. thaumaskelos Packer.  PMID:25283109

  18. Phylogenetic inference rejects sporophyte based classification of the Funariaceae (Bryophyta): rapid radiation suggests rampant homoplasy in sporophyte evolution.

    PubMed

    Liu, Yang; Budke, Jessica M; Goffinet, Bernard

    2012-01-01

    The moss family Funariaceae, which includes the model systems Funaria hygrometrica and Physcomitrella patens, comprises 15 genera, of which three accommodate approximately 95% of the 250-400 species. Generic concepts are drawn primarily from patterns in the diversity of morphological complexity of the sporophyte. Phylogenetic inferences from ten loci sampled across the three genomic compartments yield a hypothesis that is incompatible with the current circumscription of two of the speciose genera of the Funariaceae. The single clade, comprising exemplars of Funaria with a compound annulus, is congruent with the systematic concept proposed by Fife (1985). By contrast, Entosthodon and Physcomitrium are resolved as polyphyletic entities, and even the three species of Physcomitrella are confirmed to have diverged from distinct ancestors. Although the backbone relationships within the core clade of the Funariaceae remain unresolved, the polyphyly of these genera withstands alternative hypothesis testing. Consequently, the sporophytic characters that define these lineages are clearly homoplasious suggesting that selective pressures (or their relaxation) are in fact driving the diversification rather than the conservation of sporophytic architecture in the Funariaceae. PMID:21971055

  19. A species independent universal bio-detection microarray for pathogen forensics and phylogenetic classification of unknown microorganisms

    Microsoft Academic Search

    Shamira J Shallom; Cristi L Galindo; Lauren McIver; Zhaohui Sun; John McCormick; L Garry Adams; Harold R Garner

    2011-01-01

    Background  The ability to differentiate a bioterrorist attack or an accidental release of a research pathogen from a naturally occurring pandemic or disease event is crucial to the safety and security of this nation by enabling an appropriate and rapid response. It is critical in samples from an infected patient, the environment, or a laboratory to quickly and accurately identify the

  20. CLASSIFICATION

    NSDL National Science Digital Library

    Mrs. Ballew

    2010-10-17

    Project Overview: Classification is grouping similar objects together. When you go into a grocery store, you see fresh fruits and vegetables, frozen food, cereal, and pet supplies in different aisles. Imagine how difficult life would be if you went into a store, and the aisles were not labeled to tell you where to find the items! You don't have to be a scientist to use classification! You use classification when you group your iPod music into different genres and when you divide your dark colored clothing from light colors to do laundry. You might even use it to sort Halloween candy into 4 groups: chocolate candy, hard candy, chewy candy, and gum. The science of classification is called taxonomy. Taxonomy classifies organisms based on evolutionary relationships and describes and names organisms with a two-part name: genus and species. Scientists use taxonomy to identify unknown organisms by using books called field guides or by using taxonomic keys (also called dichotomous keys). Project Objective: As a class, you will be previewing and answering some questions about some classification resources to learn how to use a dichotomous key, how to key a specimen, and to help you write your own dichotomous key for school items. Project: Get a sheet of notebook paper and pencil and refer to the websites to find the answers to the questions. One way to classify objects is to create a "tree" to group similar objects together. Open hierarchical classification of objects to the second page and find the diagram of common household objects. See how all the ...

  1. 16S Classifier: A Tool for Fast and Accurate Taxonomic Classification of 16S rRNA Hypervariable Regions in Metagenomic Datasets

    PubMed Central

    Chaudhary, Nikhil; Sharma, Ashok K.; Agarwal, Piyush; Gupta, Ankit; Sharma, Vineet K.

    2015-01-01

    The diversity of microbial species in a metagenomic study is commonly assessed using 16S rRNA gene sequencing. With the rapid developments in genome sequencing technologies, the focus has shifted towards the sequencing of hypervariable regions of 16S rRNA gene instead of full length gene sequencing. Therefore, 16S Classifier is developed using a machine learning method, Random Forest, for faster and accurate taxonomic classification of short hypervariable regions of 16S rRNA sequence. It displayed precision values of up to 0.91 on training datasets and the precision values of up to 0.98 on the test dataset. On real metagenomic datasets, it showed up to 99.7% accuracy at the phylum level and up to 99.0% accuracy at the genus level. 16S Classifier is available freely at http://metagenomics.iiserb.ac.in/16Sclassifier and http://metabiosys.iiserb.ac.in/16Sclassifier. PMID:25646627

  2. Classification

    NASA Technical Reports Server (NTRS)

    Oza, Nikunj C.

    2011-01-01

    A supervised learning task involves constructing a mapping from input data (normally described by several features) to the appropriate outputs. Within supervised learning, one type of task is a classification learning task, in which each output is one or more classes to which the input belongs. In supervised learning, a set of training examples---examples with known output values---is used by a learning algorithm to generate a model. This model is intended to approximate the mapping between the inputs and outputs. This model can be used to generate predicted outputs for inputs that have not been seen before. For example, we may have data consisting of observations of sunspots. In a classification learning task, our goal may be to learn to classify sunspots into one of several types. Each example may correspond to one candidate sunspot with various measurements or just an image. A learning algorithm would use the supplied examples to generate a model that approximates the mapping between each supplied set of measurements and the type of sunspot. This model can then be used to classify previously unseen sunspots based on the candidate's measurements. This chapter discusses methods to perform machine learning, with examples involving astronomy.

  3. Phylogenetics problems

    NSDL National Science Digital Library

    Sarah Deel

    Students receive information about cladistics and apply this phylogenetic approach to two problems, collecting data, determining whether traits are ancestral or derived, and using this information to select the most parsimonious tree.

  4. Geometric morphometric character suites as phylogenetic data: extracting phylogenetic signal from gastropod shells.

    PubMed

    Smith, Ursula E; Hendricks, Jonathan R

    2013-05-01

    Despite being the objects of numerous macroevolutionary studies, many of the best represented constituents of the fossil record (including diverse examples such as foraminifera, brachiopods, and mollusks) have mineralized skeletons with limited discrete characteristics, making morphological phylogenies difficult to construct. In contrast to their paucity of phylogenetic characters, the mineralized structures (tests and shells) of these fossil groups frequently have distinctive shapes that have long proved useful for their classification. The recent introduction of methodologies for including continuous data directly in a phylogenetic analysis has increased the number of available characters, making it possible to produce phylogenies based, in whole or part, on continuous character data collected from such taxa. Geometric morphometric methods provide tools for accurately characterizing shape variation and can produce quantitative data that can therefore now be included in a phylogenetic matrix in a nonarbitrary manner. Here, the marine gastropod genus Conus is used to evaluate the ability of continuous characters (generated from a geometric morphometric analysis of shell shape) to contribute to a total evidence phylogenetic hypothesis constructed using molecular and morphological data. Furthermore, the ability of continuous characters derived from geometric morphometric analyses to place fossil taxa with limited discrete characters into a phylogeny with their extant relatives was tested by simulating the inclusion of fossil taxa. This was done by removing the molecular partition of individual extant species to produce a "cladistic pseudofossil" with only the geometric morphometric derived characters coded. 
The phylogenetic position of each cladistic pseudofossil taxon was then compared with its placement in the total evidence tree and a symmetric resampling tree to evaluate the degree to which morphometric characters alone can correctly place simulated fossil species. In 33-45% of the test cases (depending upon the approach used for measuring success), it was possible to place the pseudofossil taxon into the correct regions of the phylogeny using only the morphometric characters. This suggests that the incorporation of extinct Conus taxa into phylogenetic hypotheses will be possible, permitting a wide range of macroevolutionary questions to be addressed within this genus. This methodology also has potential to contribute to phylogenetic reconstructions for other major components of the fossil record that lack numerous discrete characters. PMID:23325808

  5. Phylogenetic classification of Pleurothecium and Pleurotheciella gen. nov. and its dactylaria-like anamorph (Sordariomycetes) based on nuclear ribosomal and protein-coding genes.

    PubMed

    Réblová, Martina; Seifert, Keith A; Fournier, Jacques; Stepánek, Václav

    2012-01-01

    Two strains of an unidentified perithecial ascomycete with a dactylaria-like anamorph and another morphologically similar strain of a dactylaria-like fungus were collected on decaying wood submerged in freshwater. To study their phylogenetic relationships we (i) combined sequence data from the nuclear small and large subunits ribosomal DNA (nc18S and nc28S) and the second largest subunit of RNA polymerase II (RPB2) for a multigene phylogenetic analysis and (ii) used sequences of the internal transcribed spacer region (ITS) of the rRNA operon for a species-level analysis. The new genus Pleurotheciella is described for two new species, Pla. rivularia and Pla. centenaria, with nonstromatic perithecia, unitunicate asci, persistent paraphyses and hyaline, septate ascospores and dactylaria-like anamorphs characterized by holoblastic, denticulate conidiogenesis, subhyaline conidiophores and hyaline, septate conidia. Based on morphological and molecular data, Pleurotheciella is closely related to the genera Pleurothecium and Sterigmatobotrys. A key to the three genera and the known species is provided. In the three-gene inferred phylogeny, these genera grouped as a sister clade to the Savoryellales within a robust clade of uncertain higher rank affiliation. Phylogenetic study of the 12 strains that represent Pleurothecium recurvatum revealed four that grouped apart from the core of the species. Two of these strains, which form a monophyletic well supported clade in both phylogenies and share similar morphological characteristics, are described as a new species, Pleurothecium semifecundum. PMID:22684295

  6. Polytomy identification in microbial phylogenetic reconstruction

    PubMed Central

    2011-01-01

    Background A phylogenetic tree, showing ancestral relations among organisms, is commonly represented as a rooted tree with sets of bifurcating branches (dichotomies) for simplicity, although polytomies (multifurcating branches) may reflect more accurate evolutionary relationships. To represent the true evolutionary relationships, it is important to systematically identify the polytomies from a bifurcating tree and generate a taxonomy-compatible multifurcating tree. For this purpose we propose a novel approach, "PolyPhy", which would classify a set of bifurcating branches of a phylogenetic tree into a set of branches with dichotomies and polytomies by considering genome distances among genomes and tree topological properties. Results PolyPhy employs a machine learning technique, BLR (Bayesian logistic regression) classifier, to identify possible bifurcating subtrees as polytomies from the trees resulting from ComPhy. Other than considering genome-scale distances between all pairs of species, PolyPhy also takes into account different properties of tree topology between dichotomy and polytomy, such as long-branch retraction and short-branch contraction, and quantifies these properties into comparable rates among different sub-branches. We extract three tree topological features, 'LR' (Leaf rate), 'IntraR' (Intra-subset branch rate) and 'InterR' (Inter-subset branch rate), all of which are calculated from bifurcating tree branch sets for classification. We have achieved F-measure (balanced measure between precision and recall) of 81% with about 0.9 area under the curve (AUC) of ROC. Conclusions PolyPhy is a fast and robust method to identify polytomies from phylogenetic trees based on genome-wide inference of evolutionary relationships among genomes. The software package and test data can be downloaded from http://digbio.missouri.edu/ComPhy/phyloTreeBiNonBi-1.0.zip. PMID:22784621
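
The abstract above reports an F-measure, described as a balanced measure between precision and recall. As a rough illustration, the sketch below assumes the common F1 form (the harmonic mean of precision and recall), which the abstract does not state explicitly. The function name `f_measure` and the confusion counts are hypothetical and are not results from PolyPhy.

```python
def f_measure(tp: int, fp: int, fn: int) -> float:
    """F1 score from true-positive, false-positive and false-negative counts."""
    if tp + fp == 0 or tp + fn == 0:
        raise ValueError("precision or recall is undefined for these counts")
    precision = tp / (tp + fp)   # fraction of predicted polytomies that are real
    recall = tp / (tp + fn)      # fraction of real polytomies that were found
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts chosen only so that precision = recall = 0.81.
score = f_measure(tp=81, fp=19, fn=19)
print(f"F-measure = {score:.2f}")
```

Because the harmonic mean is dominated by the smaller of its two inputs, a classifier cannot reach a high F-measure by trading away either precision or recall.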

  7. GB Virus C/Hepatitis G Virus Groups and Subgroups: Classification by a Restriction Fragment Length Polymorphism Method Based on Phylogenetic Analysis of the 5′ Untranslated Region

    PubMed Central

    Quarleri, J. F.; Mathet, V. L.; Feld, M.; Ferrario, D.; della Latta, M. P.; Verdun, R.; Sánchez, D. O.; Oubiña, J. R.

    1999-01-01

    A phylogenetic tree based on 150 5′ untranslated region sequences deposited in GenBank database allowed segregation of the sequences into three major groups, including two subgroups, i.e., 1, 2a, 2b, and 3, supported by bootstrap analysis. Restriction site analysis of these sequences predicted that HinfI and either AatII or AciI could be used for genomic typing with 99.4% accuracy. cDNA sequencing and subsequent alignment of 21 Argentine GB virus C/hepatitis G virus strains confirmed restriction fragment length polymorphism patterns theoretically predicted. This method may be useful for a rapid screening of samples when either epidemiological or transmission studies of this agent are carried out. PMID:10203483

  8. Stratification of co-evolving genomic groups using ranked phylogenetic profiles

    PubMed Central

    Freilich, Shiri; Goldovsky, Leon; Gottlieb, Assaf; Blanc, Eric; Tsoka, Sophia; Ouzounis, Christos A

    2009-01-01

    Background Previous methods of detecting the taxonomic origins of arbitrary sequence collections, with a significant impact to genome analysis and in particular metagenomics, have primarily focused on compositional features of genomes. The evolutionary patterns of phylogenetic distribution of genes or proteins, represented by phylogenetic profiles, provide an alternative approach for the detection of taxonomic origins, but typically suffer from low accuracy. Herein, we present rank-BLAST, a novel approach for the assignment of protein sequences into genomic groups of the same taxonomic origin, based on the ranking order of phylogenetic profiles of target genes or proteins across the reference database. Results The rank-BLAST approach is validated by computing the phylogenetic profiles of all sequences for five distinct microbial species of varying degrees of phylogenetic proximity, against a reference database of 243 fully sequenced genomes. The approach - a combination of sequence searches, statistical estimation and clustering - analyses the degree of sequence divergence between sets of protein sequences and allows the classification of protein sequences according to the species of origin with high accuracy, allowing taxonomic classification of 64% of the proteins studied. In most cases, a main cluster is detected, representing the corresponding species. Secondary, functionally distinct and species-specific clusters exhibit different patterns of phylogenetic distribution, thus flagging gene groups of interest. Detailed analyses of such cases are provided as examples. Conclusion Our results indicate that the rank-BLAST approach can capture the taxonomic origins of sequence collections in an accurate and efficient manner. The approach can be useful both for the analysis of genome evolution and the detection of species groups in metagenomics samples. PMID:19860884

  9. Endometriosis fertility index score maybe more accurate for predicting the outcomes of in vitro fertilisation than r-AFS classification in women with endometriosis

    PubMed Central

    2013-01-01

    Background Endometriosis is a common disease. The most widely used staging system of endometriosis is the revised American Fertility Society classification (r-AFS classification) which has limited predictive ability for pregnancy after surgery. The endometriosis fertility index (EFI) is used to predict fecundity after endometriosis surgery. This diagnostic accuracy study was designed to compare the predictive value of the EFI with that of the r-AFS classification for IVF outcomes in patients with endometriosis. Methods 199 women with endometriosis receiving IVF treatment after surgery were analysed. The EFI score and r-AFS classification were compared in their ability to predict these IVF outcomes in the same population. ROC curves were used to analyse the predictive values of the EFI and r-AFS indices for clinical pregnancy, and their accuracies were evaluated by sensitivity, specificity, and the Youden’s index. Results The Area Under the Curve (AUC) of the EFI score (AUC = 0.641, Standard Error (SE) = 0.039, P = 0.001, 95% CI = 0.564-0.717, cut-off score = 6) was significantly larger than that of the r-AFS classification (AUC = 0.445, SE = 0.041, P = 0.184, and 95% CI = 0.364-0.526). The antral follicle count, oestradiol level on day of hCG, number of oocytes retrieved, number of oocytes fertilised, and number of cleaved embryos in the greater than or equal to 6 EFI score group were greater than those of the lower than or equal to 5 EFI score group, and the dose of gonadotropin of the greater than or equal to 6 EFI score group was less than that in the lower than or equal to 5 EFI score group. Implantation rate, clinical pregnancy rate, and cumulative pregnancy rate in the greater than or equal to 6 EFI score group were higher than in the lower than or equal to 5 EFI score group. Conclusions It suggests that the EFI has more predictive power for IVF outcomes in endometriosis patients than the r-AFS classification. 
The clinical pregnancy rate was higher in patients with EFI greater than or equal to 6 score than with EFI lower than or equal to 5 score. PMID:24330552
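
The abstract above evaluates cut-off scores by sensitivity, specificity, and Youden's index. A minimal sketch of how such a cut-off can be chosen follows, using the standard definition J = sensitivity + specificity - 1. The function names, the scores and the outcomes are invented for illustration and are not data from the study. A score at or above the cut-off is treated as a positive prediction.

```python
from typing import List, Tuple


def youden_index(scores: List[int], outcomes: List[bool], cutoff: int) -> float:
    """J = sensitivity + specificity - 1 for the rule 'score >= cutoff'."""
    pairs = list(zip(scores, outcomes))
    tp = sum(1 for s, o in pairs if s >= cutoff and o)
    fn = sum(1 for s, o in pairs if s < cutoff and o)
    tn = sum(1 for s, o in pairs if s < cutoff and not o)
    fp = sum(1 for s, o in pairs if s >= cutoff and not o)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative outcome")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1


def best_cutoff(scores: List[int], outcomes: List[bool]) -> Tuple[int, float]:
    """Return the observed score that maximises Youden's index, and that index."""
    candidates = sorted(set(scores))
    best = max(candidates, key=lambda c: youden_index(scores, outcomes, c))
    return best, youden_index(scores, outcomes, best)


# Hypothetical index scores and pregnancy outcomes (True = clinical pregnancy).
scores = [3, 4, 5, 5, 4, 5, 6, 7, 8, 9]
outcomes = [False, False, False, False, False, True, True, True, True, True]

cutoff, j = best_cutoff(scores, outcomes)
print(f"best cut-off = {cutoff}, Youden's J = {j:.2f}")
```

Youden's index weights sensitivity and specificity equally; a study that cares more about one kind of error than the other would need a different criterion.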

  10. A classification for extant ferns

    Microsoft Academic Search

    Alan R. Smith; Kathleen M. Pryer; Eric Schuettpelz; Petra Korall; Harald Schneider; Paul G. Wolf

    2006-01-01

    We present a revised classification for extant ferns, with emphasis on ordinal and familial ranks, and a synopsis of included genera. Our classification reflects recently published phylogenetic hypotheses based on both morphological and molecular data. Within our new classification, we recognize four monophyletic classes, 11 monophyletic orders, and 37 families, 32 of which are strongly supported as monophyletic. One

  11. Phylogenetic classification of Escherichia coli O157:H7 strains of human and bovine origin using a novel set of nucleotide polymorphisms

    PubMed Central

    Clawson, Michael L; Keen, James E; Smith, Timothy PL; Durso, Lisa M; McDaneld, Tara G; Mandrell, Robert E; Davis, Margaret A; Bono, James L

    2009-01-01

    Background Cattle are a reservoir of Shiga toxin-producing Escherichia coli O157:H7 (STEC O157), and are known to harbor subtypes not typically found in clinically ill humans. Consequently, nucleotide polymorphisms previously discovered via strains originating from human outbreaks may be restricted in their ability to distinguish STEC O157 genetic subtypes present in cattle. The objectives of this study were firstly to identify nucleotide polymorphisms in a diverse sampling of human and bovine STEC O157 strains, secondly to classify strains of either bovine or human origin by polymorphism-derived genotypes, and finally to compare the genotype diversity with pulsed-field gel electrophoresis (PFGE), a method currently used for assessing STEC O157 diversity. Results High-throughput 454 sequencing of pooled STEC O157 strain DNAs from human clinical cases (n = 91) and cattle (n = 102) identified 16,218 putative polymorphisms. From those, 178 were selected primarily within genomic regions conserved across E. coli serotypes and genotyped in 261 STEC O157 strains. Forty-two unique genotypes were observed that are tagged by a minimal set of 32 polymorphisms. Phylogenetic trees of the genotypes are divided into clades that represent strains of cattle origin, or cattle and human origin. Although PFGE diversity surpassed genotype diversity overall, ten PFGE patterns each occurred with multiple strains having different genotypes. Conclusions Deep sequencing of pooled STEC O157 DNAs proved highly effective in polymorphism discovery. A polymorphism set has been identified that characterizes genetic diversity within STEC O157 strains of bovine origin, and a subset observed in human strains. The set may complement current techniques used to classify strains implicated in disease outbreaks. PMID:19463166

  12. Molecular Classifications

    Microsoft Academic Search

    Gregory N. Fuller

    The field of glioma classification is currently entering a new era with the introduction of paradigms based on molecular information. Rather than supplanting traditional morphology-based classification schemes, it is anticipated that emerging molecular biologic, genomic, transcriptomic, and proteomic data will complement and augment existing morphologic and immunophenotypic data, providing for a more accurate and refined stratification of glioma patients for

  13. A preliminary phylogenetic analysis of the New World Helopini (Coleoptera, Tenebrionidae, Tenebrioninae) indicates the need for profound rearrangements of the classification.

    PubMed

    Cifuentes-Ruiz, Paulina; Zaragoza-Caballero, Santiago; Ochoterena-Booth, Helga; Morón, Miguel Ángel

    2014-01-01

    Helopini is a diverse tribe in the subfamily Tenebrioninae with a worldwide distribution. The New World helopine species have not been reviewed recently and several doubts emerge regarding their generic assignment as well as the naturalness of the tribe and subordinate taxa. To assess these questions, a preliminary cladistic analysis was conducted with emphasis on sampling the genera distributed in the New World, but including representatives from other regions. The parsimony analysis includes 30 ingroup species from America, Europe and Asia of the subtribes Helopina and Cylindrinotina, plus three outgroups, and 67 morphological characters. Construction of the matrix resulted in the discovery of morphological character states not previously reported for the tribe, particularly from the genitalia of New World species. A consensus of the 12 most parsimonious trees supports the monophyly of the tribe based on a unique combination of characters, including one synapomorphy. None of the subtribes or the genera of the New World represented by more than one species (Helops Fabricius, Nautes Pascoe and Tarpela Bates) were recovered as monophyletic. Helopina was recovered as paraphyletic in relation to Cylindrinotina. One Nearctic species of Helops and one Palearctic species of Tarpela (subtribe Helopina) were more closely related to species of Cylindrinotina. A relatively derived clade, mainly composed by Neotropical species, was found; it includes seven species of Tarpela, seven species of Nautes, and three species of Helops, two Nearctic and one Neotropical. Our results reveal the need to deeply re-evaluate the current classification of the tribe and subordinated taxa, but a broader taxon sampling and further character exploration is needed in order to fully recognize monophyletic groups at different taxonomic levels (from subtribes to genera). PMID:25009428

  14. A preliminary phylogenetic analysis of the New World Helopini (Coleoptera, Tenebrionidae, Tenebrioninae) indicates the need for profound rearrangements of the classification

    PubMed Central

    Cifuentes-Ruiz, Paulina; Zaragoza-Caballero, Santiago; Ochoterena-Booth, Helga; Morón, Miguel Ángel

    2014-01-01

    Abstract Helopini is a diverse tribe in the subfamily Tenebrioninae with a worldwide distribution. The New World helopine species have not been reviewed recently and several doubts emerge regarding their generic assignment as well as the naturalness of the tribe and subordinate taxa. To assess these questions, a preliminary cladistic analysis was conducted with emphasis on sampling the genera distributed in the New World, but including representatives from other regions. The parsimony analysis includes 30 ingroup species from America, Europe and Asia of the subtribes Helopina and Cylindrinotina, plus three outgroups, and 67 morphological characters. Construction of the matrix resulted in the discovery of morphological character states not previously reported for the tribe, particularly from the genitalia of New World species. A consensus of the 12 most parsimonious trees supports the monophyly of the tribe based on a unique combination of characters, including one synapomorphy. None of the subtribes or the genera of the New World represented by more than one species (Helops Fabricius, Nautes Pascoe and Tarpela Bates) were recovered as monophyletic. Helopina was recovered as paraphyletic in relation to Cylindrinotina. One Nearctic species of Helops and one Palearctic species of Tarpela (subtribe Helopina) were more closely related to species of Cylindrinotina. A relatively derived clade, mainly composed by Neotropical species, was found; it includes seven species of Tarpela, seven species of Nautes, and three species of Helops, two Nearctic and one Neotropical. Our results reveal the need to deeply re-evaluate the current classification of the tribe and subordinated taxa, but a broader taxon sampling and further character exploration is needed in order to fully recognize monophyletic groups at different taxonomic levels (from subtribes to genera). PMID:25009428

  15. Accurate age classification of 6 and 12 month-old infants based on resting-state functional connectivity magnetic resonance imaging data.

    PubMed

    Pruett, John R; Kandala, Sridhar; Hoertel, Sarah; Snyder, Abraham Z; Elison, Jed T; Nishino, Tomoyuki; Feczko, Eric; Dosenbach, Nico U F; Nardos, Binyam; Power, Jonathan D; Adeyemo, Babatunde; Botteron, Kelly N; McKinstry, Robert C; Evans, Alan C; Hazlett, Heather C; Dager, Stephen R; Paterson, Sarah; Schultz, Robert T; Collins, D Louis; Fonov, Vladimir S; Styner, Martin; Gerig, Guido; Das, Samir; Kostopoulos, Penelope; Constantino, John N; Estes, Annette M; Petersen, Steven E; Schlaggar, Bradley L; Piven, Joseph

    2015-04-01

    Human large-scale functional brain networks are hypothesized to undergo significant changes over development. Little is known about these functional architectural changes, particularly during the second half of the first year of life. We used multivariate pattern classification of resting-state functional connectivity magnetic resonance imaging (fcMRI) data obtained in an on-going, multi-site, longitudinal study of brain and behavioral development to explore whether fcMRI data contained information sufficient to classify infant age. Analyses carefully account for the effects of fcMRI motion artifact. Support vector machines (SVMs) classified 6 versus 12 month-old infants (128 datasets) above chance based on fcMRI data alone. Results demonstrate significant changes in measures of brain functional organization that coincide with a special period of dramatic change in infant motor, cognitive, and social development. Explorations of the most different correlations used for SVM lead to two different interpretations about functional connections that support 6 versus 12-month age categorization. PMID:25704288

  16. Phylogenetic Relationships of the Acanthocephala Inferred from 18S Ribosomal DNA Sequences

    Microsoft Academic Search

    James R. Garey; Steven A. Nadler

    1998-01-01

    Phylogenetic relationships within the Acanthocephala have remained unresolved. Past systematic efforts have focused on creating classifications with little consideration of phylogenetic methods. The Acanthocephala are currently divided into three major taxonomic groups: Archiacanthocephala, Palaeacanthocephala, and Eoacanthocephala. These groups are characterized by structural features in addition to the taxonomy and habitat of hosts parasitized. In this study the phylogenetic relationships of

  17. Phylogenetic Models: Algebra and Evolution

    E-print Network

    Allman, Elizabeth S.

    Phylogenetic Models: Algebra and Evolution Elizabeth S. Allman Dept. of Mathematics and Statistics evolutionary tree 2. sequence evolution probabilistic models on trees 3. phylogenetic ideals and varieties history. IMA -- Phylogenetic Models: Algebra and Evolution Slide 1 For phylogenetic inference

  18. Complete chloroplast genome of the genus Cymbidium: lights into the species identification, phylogenetic implications and population genetic analyses

    PubMed Central

    2013-01-01

    Background Cymbidium orchids, comprising some 50 species, are well-known flowers with high commercial value in the floricultural industry. Furthermore, the value of different orchids varies greatly. However, species identification is very difficult. To a certain degree, chloroplast DNA sequence data are a versatile tool for species identification and phylogenetic inference in plants. Different chloroplast loci have been utilized for evaluating phylogenetic relationships at each classification level among plant species, including at the interspecies and intraspecies levels. However, there is no evidence that a short sequence can distinguish all plant species from each other in order to infer phylogenetic relationships. Molecular markers derived from the complete chloroplast genome can provide effective tools for species identification and phylogenetic resolution. Results The complete chloroplast (cp) genome sequences of eight individuals from five Cymbidium species were determined by Illumina sequencing of total DNA, using a combination of de novo and reference-guided assembly. The length of the Cymbidium cp genome is about 155 kb. The cp genomes contain 123 unique genes, and the IR regions contain 24 duplicates. Although the genomes are similar to those of other orchids in genome structure, gene order and orientation, they are not evolutionarily conserved. The cp genome of Cymbidium evolved moderately, with more than 3% sequence divergence, which could provide enough information for phylogeny. Rapidly evolving chloroplast genome regions were identified, and 11 new divergence hotspot regions were disclosed for further phylogenetic study and species identification in Orchidaceae. Conclusions Phylogenomic analyses were conducted using 10 complete chloroplast genomes from seven orchid species. These data accurately identified the individuals and established the phylogenetic relationships between the species. The results reveal that phylogenomics based on organelle genome sequencing sheds light on species identification by providing organelle-scale “barcodes”, and is also an effective approach for studying whole populations and phylogenetic characteristics of Cymbidium. PMID:23597078

  19. Directional biases in phylogenetic structure quantification: a Mediterranean case study.

    PubMed

    Molina-Venegas, Rafael; Roquet, Cristina

    2014-06-01

    Recent years have seen an increasing effort to incorporate phylogenetic hypotheses into the study of community assembly processes. The incorporation of such evolutionary information has been eased by the emergence of specialized software for the automatic estimation of partially resolved supertrees based on published phylogenies. Despite this growing interest in the use of phylogenies in ecological research, very few studies have attempted to quantify the potential biases related to the use of partially resolved phylogenies and to branch length accuracy, and no work has examined how tree shape may affect inference of community phylogenetic metrics. In this study, using a large plant community and elevational dataset, we tested the influence of phylogenetic resolution and branch length information on the quantification of phylogenetic structure, and also explored the impact of tree shape (stemminess) on the loss of accuracy in phylogenetic structure quantification due to phylogenetic resolution. For this purpose, we used 9 sets of phylogenetic hypotheses of varying resolution and branch lengths to calculate three indices of phylogenetic structure: the mean phylogenetic distance (NRI), the mean nearest taxon distance (NTI) and phylogenetic diversity (stdPD) metrics. The NRI metric was the least sensitive to phylogenetic resolution, stdPD showed an intermediate sensitivity, and NTI was the most sensitive one; NRI was also less sensitive to branch length accuracy than NTI and stdPD, the degree of sensitivity being strongly dependent on the dating method and the sample size. Directional biases were generally towards type II errors. Interestingly, we detected that tree shape influenced the accuracy loss derived from the lack of phylogenetic resolution, particularly for NRI and stdPD. We conclude that well-resolved molecular phylogenies with accurate branch length information are needed to identify the underlying phylogenetic structure of communities, and also that sensitivity of phylogenetic structure measures to low phylogenetic resolution can strongly differ depending on phylogenetic tree shape. PMID:25076812
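
    The NRI and NTI indices named in this abstract are usually computed as standardized effect sizes of the mean pairwise distance (MPD) and the mean nearest taxon distance (MNTD) against a null model. The sketch below illustrates that general recipe only; it is not the authors' pipeline. The toy distance matrix, the function names, the random-draw null model and the number of null draws are all assumptions made for illustration.

```python
import itertools
import random
import statistics

def mpd(dist, taxa):
    """Mean pairwise phylogenetic distance among the taxa of one community."""
    pairs = list(itertools.combinations(taxa, 2))
    return sum(dist[i][j] for i, j in pairs) / len(pairs)

def mntd(dist, taxa):
    """Mean distance from each taxon to its nearest co-occurring relative."""
    return statistics.mean(min(dist[i][j] for j in taxa if j != i) for i in taxa)

def standardized_index(metric, dist, taxa, n_null=999, seed=0):
    """-(observed - null mean) / null sd; gives NRI for mpd, NTI for mntd.

    Null communities are random draws of the same size from the whole pool
    (an assumed null model; other null models exist).
    """
    rng = random.Random(seed)
    pool = range(len(dist))
    observed = metric(dist, taxa)
    null = [metric(dist, rng.sample(pool, len(taxa))) for _ in range(n_null)]
    sd = statistics.stdev(null)
    if sd == 0:
        return 0.0
    return -(observed - statistics.mean(null)) / sd

# Hypothetical pool of six taxa forming two tight clades (0-2 and 3-5).
dist = [
    [0, 2, 2, 10, 10, 10],
    [2, 0, 2, 10, 10, 10],
    [2, 2, 0, 10, 10, 10],
    [10, 10, 10, 0, 2, 2],
    [10, 10, 10, 2, 0, 2],
    [10, 10, 10, 2, 2, 0],
]
community = [0, 1, 2]  # all members drawn from one clade
nri = standardized_index(mpd, dist, community)
nti = standardized_index(mntd, dist, community)
```

    Under this sign convention, positive values indicate phylogenetic clustering, so a community drawn from a single clade, as here, should score above zero on both indices.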

  20. Directional biases in phylogenetic structure quantification: a Mediterranean case study

    PubMed Central

    Molina-Venegas, Rafael; Roquet, Cristina

    2014-01-01

    Recent years have seen an increasing effort to incorporate phylogenetic hypotheses into the study of community assembly processes. The incorporation of such evolutionary information has been eased by the emergence of specialized software for the automatic estimation of partially resolved supertrees based on published phylogenies. Despite this growing interest in the use of phylogenies in ecological research, very few studies have attempted to quantify the potential biases related to the use of partially resolved phylogenies and to branch length accuracy, and no work has examined how tree shape may affect inference of community phylogenetic metrics. In this study, using a large plant community and elevational dataset, we tested the influence of phylogenetic resolution and branch length information on the quantification of phylogenetic structure, and also explored the impact of tree shape (stemminess) on the loss of accuracy in phylogenetic structure quantification due to phylogenetic resolution. For this purpose, we used 9 sets of phylogenetic hypotheses of varying resolution and branch lengths to calculate three indices of phylogenetic structure: the mean phylogenetic distance (NRI), the mean nearest taxon distance (NTI) and phylogenetic diversity (stdPD) metrics. The NRI metric was the least sensitive to phylogenetic resolution, stdPD showed an intermediate sensitivity, and NTI was the most sensitive one; NRI was also less sensitive to branch length accuracy than NTI and stdPD, the degree of sensitivity being strongly dependent on the dating method and the sample size. Directional biases were generally towards type II errors. Interestingly, we detected that tree shape influenced the accuracy loss derived from the lack of phylogenetic resolution, particularly for NRI and stdPD. We conclude that well-resolved molecular phylogenies with accurate branch length information are needed to identify the underlying phylogenetic structure of communities, and also that sensitivity of phylogenetic structure measures to low phylogenetic resolution can strongly differ depending on phylogenetic tree shape. PMID:25076812

  1. Phylogenetic system and zoogeography of the Plecoptera.

    PubMed

    Zwick, P

    2000-01-01

    Information about the phylogenetic relationships of Plecoptera is summarized. The few characters supporting monophyly of the order are outlined. Several characters of possible significance for the search for the closest relatives of the stoneflies are discussed, but the sister-group of the order remains unknown. Numerous characters supporting the presently recognized phylogenetic system of Plecoptera are presented, alternative classifications are discussed, and suggestions for future studies are made. Notes on zoogeography are appended. The order as such is old (Permian fossils), but phylogenetic relationships and global distribution patterns suggest that evolution of the extant suborders started with the breakup of Pangaea. There is evidence of extensive recent speciation in all parts of the world. PMID:10761594

  2. CREST – Classification Resources for Environmental Sequence Tags

    PubMed Central

    Lanzén, Anders; Jørgensen, Steffen L.; Huson, Daniel H.; Gorfer, Markus; Grindhaug, Svenn Helge; Jonassen, Inge; Øvreås, Lise; Urich, Tim

    2012-01-01

    Sequencing of taxonomic or phylogenetic markers is becoming a fast and efficient method for studying environmental microbial communities. This has resulted in a steadily growing collection of marker sequences, most notably of the small-subunit (SSU) ribosomal RNA gene, and an increased understanding of microbial phylogeny, diversity and community composition patterns. However, to utilize these large datasets together with new sequencing technologies, a reliable and flexible system for taxonomic classification is critical. We developed CREST (Classification Resources for Environmental Sequence Tags), a set of resources and tools for generating and utilizing custom taxonomies and reference datasets for classification of environmental sequences. CREST uses an alignment-based classification method with the lowest common ancestor algorithm. It also uses explicit rank similarity criteria to reduce false positives and identify novel taxa. We implemented this method in a web server, a command line tool and the graphical user interfaced program MEGAN. Further, we provide the SSU rRNA reference database and taxonomy SilvaMod, derived from the publicly available SILVA SSURef, for classification of sequences from bacteria, archaea and eukaryotes. Using cross-validation and environmental datasets, we compared the performance of CREST and SilvaMod to the RDP Classifier. We also utilized Greengenes as a reference database, both with CREST and the RDP Classifier. These analyses indicate that CREST performs better than alignment-free methods with higher recall rate (sensitivity) as well as precision, and with the ability to accurately identify most sequences from novel taxa. Classification using SilvaMod performed better than with Greengenes, particularly when applied to environmental sequences. CREST is freely available under a GNU General Public License (v3) from http://apps.cbu.uib.no/crest and http://lcaclassifier.googlecode.com. PMID:23145153
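
    The lowest common ancestor step mentioned in this abstract can be sketched in a few lines. This is a minimal illustration of the general idea, not CREST's implementation: the function name and the example lineages are hypothetical, and the rank similarity criteria described in the abstract are not modeled.

```python
from typing import List, Sequence

def lowest_common_ancestor(lineages: Sequence[Sequence[str]]) -> List[str]:
    """Return the longest taxonomy prefix shared by every lineage."""
    if not lineages:
        return []
    shared = []
    # zip stops at the shortest lineage, so the result never exceeds it.
    for ranks in zip(*lineages):
        if all(name == ranks[0] for name in ranks):
            shared.append(ranks[0])
        else:
            break
    return shared

# Hypothetical taxonomy paths of the reference sequences hit by one read.
hits = [
    ["Bacteria", "Proteobacteria", "Gammaproteobacteria", "Enterobacterales"],
    ["Bacteria", "Proteobacteria", "Gammaproteobacteria", "Pseudomonadales"],
    ["Bacteria", "Proteobacteria", "Alphaproteobacteria"],
]
assignment = lowest_common_ancestor(hits)
```

    Here the read would be assigned at phylum level, because the hits disagree from the class rank downward. Assigning only as deep as the hits agree is what keeps conflicting matches from producing an over-specific label.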

  3. Phylogenetic Inference From Conserved Sites Alignments

    SciTech Connect

    Grundy, W.N.; Naylor, G.J.P.

    1999-08-15

    Molecular sequences provide a rich source of data for inferring the phylogenetic relationships among species. However, recent work indicates that even an accurate multiple alignment of a large sequence set may yield an incorrect phylogeny and that the quality of the phylogenetic tree improves when the input consists only of the highly conserved, motif regions of the alignment. This work introduces two methods of producing multiple alignments that include only the conserved regions of the initial alignment. The first method retains conserved motifs, whereas the second retains individual conserved sites in the initial alignment. Using parsimony analysis on a mitochondrial data set containing 19 species among which the phylogenetic relationships are widely accepted, both conserved alignment methods produce better phylogenetic trees than the complete alignment. Unlike any of the 19 inference methods used before to analyze this data, both methods produce trees that are completely consistent with the known phylogeny. The motif-based method employs far fewer alignment sites for comparable error rates. For a larger data set containing mitochondrial sequences from 39 species, the site-based method produces a phylogenetic tree that is largely consistent with known phylogenetic relationships and suggests several novel placements.
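
    The site-based idea in this abstract, keeping only individually conserved columns of an initial alignment, can be illustrated as follows. This is a rough sketch under stated assumptions: the conservation criterion (fraction of sequences sharing the most common residue), the 0.75 threshold, the function name and the toy alignment are all hypothetical and are not taken from the paper.

```python
from collections import Counter

def conserved_sites(alignment, min_fraction=0.75):
    """Keep alignment columns whose most common residue is not a gap and
    is shared by at least min_fraction of the sequences."""
    keep = []
    for col in range(len(alignment[0])):
        column = [seq[col] for seq in alignment]
        residue, count = Counter(column).most_common(1)[0]
        if residue != "-" and count / len(column) >= min_fraction:
            keep.append(col)
    return ["".join(seq[c] for c in keep) for seq in alignment]

# Toy alignment of four sequences; column 2 (0-based) is poorly conserved.
alignment = ["ACGTA", "ACGAA", "ACCTA", "AC-TA"]
filtered = conserved_sites(alignment)
```

    The filtered alignment drops the variable third column and keeps the other four, and it could then be passed to any tree-building method in place of the full alignment.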

  4. Accurate Model Selection of Relaxed Molecular Clocks in Bayesian Phylogenetics

    PubMed Central

    Baele, Guy; Li, Wai Lok Sibon; Drummond, Alexei J.; Suchard, Marc A.; Lemey, Philippe

    2013-01-01

    Recent implementations of path sampling (PS) and stepping-stone sampling (SS) have been shown to outperform the harmonic mean estimator (HME) and a posterior simulation-based analog of Akaike’s information criterion through Markov chain Monte Carlo (AICM), in Bayesian model selection of demographic and molecular clock models. Almost simultaneously, a Bayesian model averaging approach was developed that avoids conditioning on a single model but averages over a set of relaxed clock models. This approach returns estimates of the posterior probability of each clock model through which one can estimate the Bayes factor in favor of the maximum a posteriori (MAP) clock model; however, this Bayes factor estimate may suffer when the posterior probability of the MAP model approaches 1. Here, we compare these two recent developments with the HME, stabilized/smoothed HME (sHME), and AICM, using both synthetic and empirical data. Our comparison shows reassuringly that MAP identification and its Bayes factor provide similar performance to PS and SS and that these approaches considerably outperform HME, sHME, and AICM in selecting the correct underlying clock model. We also illustrate the importance of using proper priors on a large set of empirical data sets. PMID:23090976
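
    The caveat about the Bayes factor for the MAP clock model follows from the standard identity that a Bayes factor is the ratio of posterior odds to prior odds. The sketch below works through that arithmetic; the function name and the numbers (four candidate clock models with equal prior weight) are assumptions for illustration and do not come from the paper.

```python
import math

def bayes_factor(posterior: float, prior: float) -> float:
    """Bayes factor in favor of a model, as posterior odds / prior odds."""
    if not 0 < prior < 1:
        raise ValueError("prior must lie strictly between 0 and 1")
    if not 0 <= posterior <= 1:
        raise ValueError("posterior must lie between 0 and 1")
    if posterior == 1.0:
        # Every sample visited this model: the estimate is unbounded.
        return math.inf
    return (posterior / (1 - posterior)) / (prior / (1 - prior))

# Hypothetical case: four clock models with equal prior probability 0.25,
# and the MAP model visited in 90% of the posterior samples.
bf_map = bayes_factor(posterior=0.9, prior=0.25)

# When the chain never leaves the MAP model, the estimate degenerates.
bf_degenerate = bayes_factor(posterior=1.0, prior=0.25)
```

    With these numbers the posterior odds are 9 and the prior odds are 1/3, giving a Bayes factor of about 27. As the estimated posterior probability approaches 1, the denominator of the posterior odds approaches 0, so small sampling errors translate into very large swings in the estimate.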

  5. Accurate model selection of relaxed molecular clocks in bayesian phylogenetics.

    PubMed

    Baele, Guy; Li, Wai Lok Sibon; Drummond, Alexei J; Suchard, Marc A; Lemey, Philippe

    2013-02-01

    Recent implementations of path sampling (PS) and stepping-stone sampling (SS) have been shown to outperform the harmonic mean estimator (HME) and a posterior simulation-based analog of Akaike's information criterion through Markov chain Monte Carlo (AICM), in bayesian model selection of demographic and molecular clock models. Almost simultaneously, a bayesian model averaging approach was developed that avoids conditioning on a single model but averages over a set of relaxed clock models. This approach returns estimates of the posterior probability of each clock model through which one can estimate the Bayes factor in favor of the maximum a posteriori (MAP) clock model; however, this Bayes factor estimate may suffer when the posterior probability of the MAP model approaches 1. Here, we compare these two recent developments with the HME, stabilized/smoothed HME (sHME), and AICM, using both synthetic and empirical data. Our comparison shows reassuringly that MAP identification and its Bayes factor provide similar performance to PS and SS and that these approaches considerably outperform HME, sHME, and AICM in selecting the correct underlying clock model. We also illustrate the importance of using proper priors on a large set of empirical data sets. PMID:23090976

  6. Phylogenetics Todd Scheetz

    E-print Network

    Casavant, Tom

    in the tree. human mouse fly Trees Rooted vs. unrooted trees Example GAATC GAGTT GA(A/G)T(C/T) Inherent are homologous 3. Each position within the alignment is homologous 4. Each sequence has a common phylogenetic;2 General Process The basic process of phylogenetic analysis is 1. Alignment 2. Determining the substitution

  7. Phylogenetic relationships among arecoid palms (Arecaceae: Arecoideae)

    PubMed Central

    Baker, William J.; Norup, Maria V.; Clarkson, James J.; Couvreur, Thomas L. P.; Dowe, John L.; Lewis, Carl E.; Pintaud, Jean-Christophe; Savolainen, Vincent; Wilmot, Tomas; Chase, Mark W.

    2011-01-01

    Background and Aims The Arecoideae is the largest and most diverse of the five subfamilies of palms (Arecaceae/Palmae), containing >50 % of the species in the family. Despite its importance, phylogenetic relationships among Arecoideae are poorly understood. Here the most densely sampled phylogenetic analysis of Arecoideae available to date is presented. The results are used to test the current classification of the subfamily and to identify priority areas for future research. Methods DNA sequence data for the low-copy nuclear genes PRK and RPB2 were collected from 190 palm species, covering 103 (96 %) genera of Arecoideae. The data were analysed using the parsimony ratchet, maximum likelihood, and both likelihood and parsimony bootstrapping. Key Results and Conclusions Despite the recovery of paralogues and pseudogenes in a small number of taxa, PRK and RPB2 were both highly informative, producing well-resolved phylogenetic trees with many nodes well supported by bootstrap analyses. Simultaneous analyses of the combined data sets provided additional resolution and support. Two areas of incongruence between PRK and RPB2 were strongly supported by the bootstrap relating to the placement of tribes Chamaedoreeae, Iriarteeae and Reinhardtieae; the causes of this incongruence remain uncertain. The current classification within Arecoideae was strongly supported by the present data. Of the 14 tribes and 14 sub-tribes in the classification, only five sub-tribes from tribe Areceae (Basseliniinae, Linospadicinae, Oncospermatinae, Rhopalostylidinae and Verschaffeltiinae) failed to receive support. Three major higher level clades were strongly supported: (1) the RRC clade (Roystoneeae, Reinhardtieae and Cocoseae), (2) the POS clade (Podococceae, Oranieae and Sclerospermeae) and (3) the core arecoid clade (Areceae, Euterpeae, Geonomateae, Leopoldinieae, Manicarieae and Pelagodoxeae). However, new data sources are required to elucidate ambiguities that remain in phylogenetic relationships among and within the major groups of Arecoideae, as well as within the Areceae, the largest tribe in the palm family. PMID:21325340

  8. On Exploring Genome Rearrangement Phylogenetic Patterns

    NASA Astrophysics Data System (ADS)

    Xu, Andrew Wei

    The study of genome rearrangement is much harder than the corresponding problems on DNA and protein sequences, because of the occurrences of numerous combinatorial structures. By explicitly exploring these combinatorial structures, the recently developed adequate subgraph theory shows that a family of these structures, adequate subgraphs, are informative in finding the optimal solutions to the rearrangement median problem. Its extension gives rise to the tree scoring method GASTS, which provides quick and accurate estimation of the number of rearrangement events, for any given topology. With a similar motivation, this paper discusses and provides solid but somewhat initial results, on combinatorial structures that are informative in phylogenetic inference. These structures, called rearrangement phylogenetic patterns, provide more insights than algorithmic approaches, and may provide statistical significance for inferred phylogenies and lead to efficient and robust phylogenetic inference methods on large sets of taxa.

  9. Phylogenetic relationships of some filamentous cyanoprokaryotic species.

    PubMed

    Stoyanov, Plamen; Moten, Dzhemal; Mladenov, Rumen; Dzhambazov, Balik; Teneva, Ivanka

    2014-01-01

    The polyphasic approach is the most progressive system that has been suggested for distinguishing and phylogenetically classifying Cyanoprokaryota (Cyanobacteria/Cyanophyta). Several oscillatorialean genera (Lyngbya, Phormidium, Plectonema, and Leptolyngbya) have problematic phylogenetic position and taxonomic state because of their heterogeneity and polyphyletic nature. To accurately resolve the phylogenetic relationship of some filamentous species (Nodosilinea bijugata, Phormidium molle, Phormidium papyraceum), we have performed phylogenetic analyses based on 16S rRNA gene and the phycocyanin operon (PC-IGS) by using maximum-likelihood (ML) tree inference methods. These analyses were combined with morphological re-evaluation. Our phylogenetic analyses support the taxonomic separation of genus Nodosilinea from the polyphyletic genus Leptolyngbya. Investigated Nodosilinea strains always formed a coherent genetic cluster supported with a high bootstrap value. The molecular phylogeny confirmed also the monophyly of the Wilmottia group. In addition, data reveal that although P. papyraceum is morphologically similar to Wilmottia murrayi, this species is genetically distinct. Strains from the newly formed genus Phormidesmis and some Phormidium priestleyi strains were clustered in a separate clade different from the typical Phormidium species, but without strong bootstrap support. PMID:24596450

  10. Phylogenetic Relationships of Some Filamentous Cyanoprokaryotic Species

    PubMed Central

    Stoyanov, Plamen; Moten, Dzhemal; Mladenov, Rumen; Dzhambazov, Balik; Teneva, Ivanka

    2014-01-01

    The polyphasic approach is the most progressive system that has been suggested for distinguishing and phylogenetically classifying Cyanoprokaryota (Cyanobacteria/Cyanophyta). Several oscillatorialean genera (Lyngbya, Phormidium, Plectonema, and Leptolyngbya) have problematic phylogenetic position and taxonomic state because of their heterogeneity and polyphyletic nature. To accurately resolve the phylogenetic relationship of some filamentous species (Nodosilinea bijugata, Phormidium molle, Phormidium papyraceum), we have performed phylogenetic analyses based on 16S rRNA gene and the phycocyanin operon (PC-IGS) by using maximum-likelihood (ML) tree inference methods. These analyses were combined with morphological re-evaluation. Our phylogenetic analyses support the taxonomic separation of genus Nodosilinea from the polyphyletic genus Leptolyngbya. Investigated Nodosilinea strains always formed a coherent genetic cluster supported with a high bootstrap value. The molecular phylogeny confirmed also the monophyly of the Wilmottia group. In addition, data reveal that although P. papyraceum is morphologically similar to Wilmottia murrayi, this species is genetically distinct. Strains from the newly formed genus Phormidesmis and some Phormidium priestleyi strains were clustered in a separate clade different from the typical Phormidium species, but without strong bootstrap support. PMID:24596450

  11. Phylogenetic lineages in Entomophthoromycota

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Entomophthoromycota Humber is one of five major phylogenetic lineages among the former phylum Zygomycota. These early terrestrial fungi share evolutionarily ancestral characters such as coenocytic mycelium and gametangiogamy as a sexual process resulting in zygospore formation. Previous molecular st...

  12. Phylogenetic relationships and infraspecific variation in Canadian Arctic Poa based on chloroplast DNA restriction site data

    Microsoft Academic Search

    Lynn J. Gillespie; Ruben Boles

    2001-01-01

    Infraspecific variation and phylogenetic relationships of Canadian Arctic species of the genus Poa were studied based on chloroplast DNA (cpDNA) variation. Restriction site analysis of polymerase chain reaction amplified cpDNA was used to reexamine the status of infraspecific taxa, reconstruct phylogenetic relationships, and reexamine previous classification systems and hypotheses of relationships. Infraspecific variation was detected in three species, but

  13. Explanations for bird species range size: ecological correlates and phylogenetic

    E-print Network

    Carrascal, Luis M.

    of the primary variables determining the endangered status of species (IUCN Red List classification, 1 Department ORIGINAL ARTICLE Explanations for bird species range size: ecological correlates and phylogenetic INTRODUCTION Understanding why species are more or less broadly distributed within their geographical limits

  14. Host specificity and phylogenetic relationships of chicken and turkey parvoviruses

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Previous reports indicate that the newly discovered chicken parvoviruses (ChPV) and turkey parvoviruses (TuPV) are very similar to each other, yet they represent different species within a new genus of Parvoviridae. Currently, strain classification is based on the phylogenetic analysis of a 561 bas...

  15. Flying Insect Classification with Inexpensive Sensors

    E-print Network

    Zordan, Victor

    Flying Insect Classification with Inexpensive Sensors Yanping Chen Department of Computer Science, noninvasive sensors to accurately classify flying insects would have significant implications and extrinsic to the insect's flight behavior, and that a Bayesian classification approach allows us

  16. The Phylogenetic Likelihood Library

    PubMed Central

    Flouri, T.; Izquierdo-Carrasco, F.; Darriba, D.; Aberer, A.J.; Nguyen, L.-T.; Minh, B.Q.; Von Haeseler, A.; Stamatakis, A.

    2015-01-01

    We introduce the Phylogenetic Likelihood Library (PLL), a highly optimized application programming interface for developing likelihood-based phylogenetic inference and postanalysis software. The PLL implements appropriate data structures and functions that allow users to quickly implement common, error-prone, and labor-intensive tasks, such as likelihood calculations, model parameter as well as branch length optimization, and tree space exploration. The highly optimized and parallelized implementation of the phylogenetic likelihood function and a thorough documentation provide a framework for rapid development of scalable parallel phylogenetic software. By example of two likelihood-based phylogenetic codes we show that the PLL improves the sequential performance of current software by a factor of 2–10 while requiring only 1 month of programming time for integration. We show that, when numerical scaling for preventing floating point underflow is enabled, the double precision likelihood calculations in the PLL are up to 1.9 times faster than those in BEAGLE. On an empirical DNA dataset with 2000 taxa the AVX version of PLL is 4 times faster than BEAGLE (scaling enabled and required). The PLL is available at http://www.libpll.org under the GNU General Public License (GPL). PMID:25358969
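
    The abstract mentions numerical scaling to prevent floating point underflow in likelihood calculations. The following is a minimal, generic sketch of that idea and does not use the PLL or BEAGLE APIs: the function name, the threshold, and the scaling factor are all illustrative assumptions. A running product of small probabilities is multiplied by a large constant whenever it drops below a threshold, and the number of rescalings is subtracted back out on the log scale.

```python
import math

def scaled_log_product(values, threshold=2.0**-256, factor=2.0**256):
    """Return log(prod(values)) for positive values without underflow.

    Hypothetical helper for illustration; threshold and factor are
    assumed values, not taken from any particular library.
    """
    prod = 1.0
    n_scalings = 0
    for v in values:
        prod *= v
        if prod < threshold:
            # Rescale by a power of two and remember that we did so.
            prod *= factor
            n_scalings += 1
    # Undo the rescalings on the log scale.
    return math.log(prod) - n_scalings * math.log(factor)

# 2000 per-site likelihoods of 0.001 each (made-up numbers).
probs = [1e-3] * 2000

naive = 1.0
for p in probs:
    naive *= p  # underflows to 0.0 in double precision

log_like = scaled_log_product(probs)
```

    Here `naive` ends up as 0.0, so its logarithm is undefined, while `log_like` stays finite and close to 2000 * log(0.001). Real likelihood libraries typically apply such scaling to per-node conditional likelihood vectors rather than to a single scalar product.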

  17. The phylogenetic likelihood library.

    PubMed

    Flouri, T; Izquierdo-Carrasco, F; Darriba, D; Aberer, A J; Nguyen, L-T; Minh, B Q; Von Haeseler, A; Stamatakis, A

    2015-03-01

    We introduce the Phylogenetic Likelihood Library (PLL), a highly optimized application programming interface for developing likelihood-based phylogenetic inference and postanalysis software. The PLL implements appropriate data structures and functions that allow users to quickly implement common, error-prone, and labor-intensive tasks, such as likelihood calculations, model parameter as well as branch length optimization, and tree space exploration. The highly optimized and parallelized implementation of the phylogenetic likelihood function and a thorough documentation provide a framework for rapid development of scalable parallel phylogenetic software. By example of two likelihood-based phylogenetic codes we show that the PLL improves the sequential performance of current software by a factor of 2-10 while requiring only 1 month of programming time for integration. We show that, when numerical scaling for preventing floating point underflow is enabled, the double precision likelihood calculations in the PLL are up to 1.9 times faster than those in BEAGLE. On an empirical DNA dataset with 2000 taxa the AVX version of PLL is 4 times faster than BEAGLE (scaling enabled and required). The PLL is available at http://www.libpll.org under the GNU General Public License (GPL). PMID:25358969

  18. ClassyFlu: Classification of Influenza A Viruses with Discriminatively Trained Profile-HMMs

    PubMed Central

    Van der Auwera, Sandra; Bulla, Ingo; Ziller, Mario; Pohlmann, Anne; Harder, Timm; Stanke, Mario

    2014-01-01

    Accurate and rapid characterization of influenza A virus (IAV) hemagglutinin (HA) and neuraminidase (NA) sequences with respect to subtype and clade is at the basis of extended diagnostic services and implicit to molecular epidemiologic studies. ClassyFlu is a new tool and web service for the classification of IAV sequences of the HA and NA gene into subtypes and phylogenetic clades using discriminatively trained profile hidden Markov models (HMMs), one for each subtype or clade. ClassyFlu merely requires as input unaligned, full-length or partial HA or NA DNA sequences. It enables rapid and highly accurate assignment of HA sequences to subtypes H1–H17 but particularly focusses on the finer grained assignment of sequences of highly pathogenic avian influenza viruses of subtype H5N1 according to the cladistics proposed by the H5N1 Evolution Working Group. NA sequences are classified into subtypes N1–N10. ClassyFlu was compared to semiautomatic classification approaches using BLAST and phylogenetics and additionally for H5 sequences to the new “Highly Pathogenic H5N1 Clade Classification Tool” (IRD-CT) proposed by the Influenza Research Database. Our results show that both web tools (ClassyFlu and IRD-CT), although based on different methods, are nearly equivalent in performance and both are more accurate and faster than semiautomatic classification. A retraining of ClassyFlu to altered cladistics as well as an extension of ClassyFlu to other IAV genome segments or fragments thereof is undemanding. This is exemplified by unambiguous assignment to a distinct cluster within subtype H7 of sequences of H7N9 viruses which emerged in China early in 2013 and caused more than 130 human infections. http://bioinf.uni-greifswald.de/ClassyFlu is a free web service. For local execution, the ClassyFlu source code in PERL is freely available. PMID:24404173

  19. Journey into Phylogenetic Systematics

    NSDL National Science Digital Library

    This straightforward and informative site from the Museum of Paleontology (UCMP) at the University of California at Berkeley offers an excellent introduction to Phylogenetic Systematics, the reconstruction of "the pattern of events that have led to the distribution and diversity of life." The site is organized into several sections, addressing "the philosophy, methodology, and implications of cladistic analysis." Descriptive summaries are made more useful with links to the UCMP Glossary of Phylogenetic Terms, and interested users may seek greater depth by linking directly to the UCMP's additional resources.

  20. The revised classification of eukaryotes

    PubMed Central

    Adl, Sina M.; Simpson, Alastair. G.; Lane, Christopher E.; Lukeš, Julius; Bass, David; Bowser, Samuel S.; Brown, Matt; Burki, Fabien; Dunthorn, Micah; Hampl, Vladimir; Heiss, Aaron; Hoppenrath, Mona; Lara, Enrique; leGall, Line; Lynn, Denis H.; McManus, Hilary; Mitchell, Edward A. D.; Mozley-Stanridge, Sharon E.; Parfrey, Laura Wegener; Pawlowski, Jan; Rueckert, Sonja; Shadwick, Laura; Schoch, Conrad; Smirnov, Alexey; Spiegel, Frederick W.

    2012-01-01

    This revision of the classification of eukaryotes, which updates that of Adl et al. (2005), retains an emphasis on the protists and incorporates changes since 2005 that have resolved nodes and branches in phylogenetic trees. Whereas the previous revision was successful in re-introducing name stability to the classification, this revision provides a classification for lineages that were then still unresolved. The supergroups have withstood phylogenetic hypothesis testing with some modifications, but despite some progress, problematic nodes at the base of the eukaryotic tree still remain to be statistically resolved. Looking forward, subsequent transformations to our understanding of the diversity of life will be from the discovery of novel lineages in previously under-sampled areas and from environmental genomic information. PMID:23020233

  1. A Universal Phylogenetic Tree.

    ERIC Educational Resources Information Center

    Offner, Susan

    2001-01-01

    Presents a universal phylogenetic tree suitable for use in high school and college-level biology classrooms. Illustrates the antiquity of life and that all life is related, even if it dates back 3.5 billion years. Reflects important evolutionary relationships and provides an exciting way to learn about the history of life. (SAH)

  2. Molecular phylogenetics before sequences

    PubMed Central

    Ragan, Mark A; Bernard, Guillaume; Chan, Cheong Xin

    2014-01-01

    From 1971 to 1985, Carl Woese and colleagues generated oligonucleotide catalogs of 16S/18S rRNAs from more than 400 organisms. Using these incomplete and imperfect data, Carl and his colleagues developed unprecedented insights into the structure, function, and evolution of the large RNA components of the translational apparatus. They recognized a third domain of life, revealed the phylogenetic backbone of bacteria (and its limitations), delineated taxa, and explored the tempo and mode of microbial evolution. For these discoveries to have stood the test of time, oligonucleotide catalogs must carry significant phylogenetic signal; they thus bear re-examination in view of the current interest in alignment-free phylogenetics based on k-mers. Here we consider the aims, successes, and limitations of this early phase of molecular phylogenetics. We computationally generate oligonucleotide sets (e-catalogs) from 16S/18S rRNA sequences, calculate pairwise distances between them based on D2 statistics, compute distance trees, and compare their performance against alignment-based and k-mer trees. Although the catalogs themselves were superseded by full-length sequences, this stage in the development of computational molecular biology remains instructive for us today. PMID:24572375
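
    As a rough illustration of the alignment-free comparison mentioned above, the sketch below computes the basic D2 statistic, taken here to be the inner product of the k-mer count vectors of two sequences. The function names are hypothetical, and the abstract does not say how D2 values were converted into pairwise distances, so that step is left out.

```python
from collections import Counter

def kmer_counts(seq, k):
    """Count every overlapping k-mer in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def d2(x, y, k):
    """Basic D2 statistic: sum over k-mers of count_x(w) * count_y(w)."""
    cx, cy = kmer_counts(x, k), kmer_counts(y, k)
    # Counter returns 0 for k-mers that are absent from y.
    return sum(n * cy[w] for w, n in cx.items())

# Toy sequences, not real rRNA data.
score = d2("ACGTACGT", "ACGA", 2)  # shared 2-mers AC and CG give 2*1 + 2*1 = 4
```

    A larger D2 value means more shared k-mers, so D2 is a similarity score; any tree-building step would first need it normalized and turned into a distance.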

  3. Phylogenetics Todd Scheetz

    E-print Network

    Casavant, Tom

    to each other in the tree. human mouse fly Trees Example GAATC GAGTT GA(A/G)T(C/T) Rooted vs. unrooted. The sequences are homologous 3. Each position within the alignment is homologous 4. Each sequence has a common General Process The basic process of phylogenetic analysis is 1. Alignment 2. Determining the substitution
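
    The example strings in this snippet (GAATC and GAGTT summarized as GA(A/G)T(C/T)) can be reproduced with a column-wise summary of aligned sequences. This is only a sketch under the assumption that the slide shows a per-position union of observed residues; the function name is hypothetical and the sequences are assumed to be pre-aligned and of equal length.

```python
def ambiguity_summary(seqs):
    """Summarize aligned sequences column by column.

    Columns where all sequences agree keep the single residue;
    columns that differ list the alternatives as (X/Y).
    """
    out = []
    for column in zip(*seqs):
        residues = sorted(set(column))
        if len(residues) == 1:
            out.append(residues[0])
        else:
            out.append("(" + "/".join(residues) + ")")
    return "".join(out)

summary = ambiguity_summary(["GAATC", "GAGTT"])  # "GA(A/G)T(C/T)"
```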

  4. Insights into the evolution of sorbitol metabolism: phylogenetic analysis of SDR196C family

    PubMed Central

    2012-01-01

    Background Short chain dehydrogenases/reductases (SDR) are NAD(P)(H)-dependent oxidoreductases with a highly conserved 3D structure and of an early origin, which has allowed them to diverge into several families and enzymatic activities. The SDR196C family (http://www.sdr-enzymes.org) groups bacterial sorbitol dehydrogenases (SDH), which are of great industrial interest. In this study, we examine the phylogenetic relationship between the members of this family, and based on the findings and some sequence conserved blocks, a new and a more accurate classification is proposed. Results The distribution of the 66 bacterial SDH species analyzed was limited to Gram-negative bacteria. Six different bacterial families were found, encompassing ?-, ?- and ?-proteobacteria. This broad distribution in terms of bacteria and niches agrees with that of SDR, which are found in all forms of life. A cluster analysis of sorbitol dehydrogenase revealed different types of gene organization, although with a common pattern in which the SDH gene is surrounded by sugar ABC transporter proteins, another SDR, a kinase, and several gene regulators. According to the obtained trees, six different lineages and three sublineages can be discerned. The phylogenetic analysis also suggested two different origins for SDH in ?-proteobacteria and four origins for ?-proteobacteria. Finally, this subdivision was further confirmed by the differences observed in the sequence of the conserved blocks described for SDR and some specific blocks of SDH, and by a functional divergence analysis, which made it possible to establish new consensus sequences and specific fingerprints for the lineages and sub lineages. Conclusion SDH distribution agrees with that observed for SDR, indicating the importance of the polyol metabolism, as an alternative source of carbon and energy. 
The phylogenetic analysis pointed to six clearly defined lineages and three sublineages, and great variability in the origin of this gene, despite its well-conserved 3D structure. This suggests that SDH are very old and emerged early during evolution. This study also opens up a new and more accurate classification of the SDR196C family, introducing two numbers at the end of the family name, which indicate the lineage and the sublineage of each member, e.g., SDR196C6.3. PMID:22899811

  5. Charles Darwin, beetles and phylogenetics.

    PubMed

    Beutel, Rolf G; Friedrich, Frank; Leschen, Richard A B

    2009-11-01

    Here, we review Charles Darwin's relation to beetles and developments in coleopteran systematics in the last two centuries. Darwin was an enthusiastic beetle collector. He used beetles to illustrate different evolutionary phenomena in his major works, and astonishingly, an entire sub-chapter is dedicated to beetles in "The Descent of Man". During his voyage on the Beagle, Darwin was impressed by the high diversity of beetles in the tropics, and he remarked that, to his surprise, the majority of species were small and inconspicuous. However, despite his obvious interest in the group, he did not get involved in beetle taxonomy, and his theoretical work had little immediate impact on beetle classification. The development of taxonomy and classification in the late nineteenth and earlier twentieth century was mainly characterised by the exploration of new character systems (e.g. larval features and wing venation). In the mid-twentieth century, Hennig's new methodology to group lineages by derived characters revolutionised systematics of Coleoptera and other organisms. As envisioned by Darwin and Ernst Haeckel, the new Hennigian approach enabled systematists to establish classifications truly reflecting evolution. Roy A. Crowson and Howard E. Hinton, who both made tremendous contributions to coleopterology, had an ambivalent attitude towards the Hennigian ideas. The Mickoleit school combined detailed anatomical work with a classical Hennigian character evaluation, with stepwise tree building, comparatively few characters and a priori polarity assessment without explicit use of the outgroup comparison method. The rise of cladistic methods in the 1970s had a strong impact on beetle systematics. Cladistic computer programs facilitated parsimony analyses of large data matrices, mostly morphological characters not requiring detailed anatomical investigations. Molecular studies on beetle phylogeny started in the 1990s with modest taxon sampling and limited DNA data. 
This has changed dramatically. With very large data sets and high-throughput sampling, phylogenetic questions can be addressed without prior knowledge of morphological characters. Nevertheless, molecular studies have not yet led to the great breakthrough in beetle systematics. In particular, the phylogeny of the extremely species-rich suborder Polyphaga remains incompletely resolved. Coordinated efforts of molecular workers and of morphologists using innovative techniques may lead to more profound insights in the near future. The final aim is to develop a well-founded phylogeny, which truly reflects the evolution of this immensely species-rich group of organisms. PMID:19760277

  6. Charles Darwin, beetles and phylogenetics

    NASA Astrophysics Data System (ADS)

    Beutel, Rolf G.; Friedrich, Frank; Leschen, Richard A. B.

    2009-11-01

    Here, we review Charles Darwin’s relation to beetles and developments in coleopteran systematics in the last two centuries. Darwin was an enthusiastic beetle collector. He used beetles to illustrate different evolutionary phenomena in his major works, and astonishingly, an entire sub-chapter is dedicated to beetles in “The Descent of Man”. During his voyage on the Beagle, Darwin was impressed by the high diversity of beetles in the tropics, and he remarked that, to his surprise, the majority of species were small and inconspicuous. However, despite his obvious interest in the group, he did not get involved in beetle taxonomy, and his theoretical work had little immediate impact on beetle classification. The development of taxonomy and classification in the late nineteenth and earlier twentieth century was mainly characterised by the exploration of new character systems (e.g. larval features and wing venation). In the mid-twentieth century, Hennig’s new methodology to group lineages by derived characters revolutionised systematics of Coleoptera and other organisms. As envisioned by Darwin and Ernst Haeckel, the new Hennigian approach enabled systematists to establish classifications truly reflecting evolution. Roy A. Crowson and Howard E. Hinton, who both made tremendous contributions to coleopterology, had an ambivalent attitude towards the Hennigian ideas. The Mickoleit school combined detailed anatomical work with a classical Hennigian character evaluation, with stepwise tree building, comparatively few characters and a priori polarity assessment without explicit use of the outgroup comparison method. The rise of cladistic methods in the 1970s had a strong impact on beetle systematics. Cladistic computer programs facilitated parsimony analyses of large data matrices, mostly morphological characters not requiring detailed anatomical investigations. Molecular studies on beetle phylogeny started in the 1990s with modest taxon sampling and limited DNA data. 
This has changed dramatically. With very large data sets and high-throughput sampling, phylogenetic questions can be addressed without prior knowledge of morphological characters. Nevertheless, molecular studies have not yet led to the great breakthrough in beetle systematics. In particular, the phylogeny of the extremely species-rich suborder Polyphaga remains incompletely resolved. Coordinated efforts of molecular workers and of morphologists using innovative techniques may lead to more profound insights in the near future. The final aim is to develop a well-founded phylogeny, which truly reflects the evolution of this immensely species-rich group of organisms.

  7. Classification 1: Classification Scheme

    NSDL National Science Digital Library

    Science Netlinks

    2001-10-20

    This Science NetLinks lesson, the first of a two-part series, shows students that many kinds of living things can be sorted into groups in many ways using various features to decide which things belong to which group, and that classification schemes will vary with purpose. Context: This lesson is the first of a two-part series on classification. It is intended to supplement students' direct investigations by using the Internet to expose students to a variety of living organisms, as well as encourage them to start developing classification schemes of their own.

  8. PAL: Phylogenetic Analysis Library

    NSDL National Science Digital Library

    Created by researchers at the Universities of Auckland (New Zealand) and Oxford (UK), PAL (Phylogenetic Analysis Library) is a Java library intended for use in molecular evolution and phylogenetics. PAL currently consists of thirteen packages with "ready-to-use objects" for reading/writing sequence alignments, distance matrices, and trees; substitution models for nucleotides and amino acids; efficient maximum-likelihood estimation of pairwise distances and tree branch lengths; numerous statistical tests; and options for constructing neighbor-joining and UPGMA trees; among many other features. The program may be downloaded as a .tar.gz or .hqx file, but users who are only interested in accessing the library and not in programming may want simply to download a user front end; conditions for use and instructions are included on-site.

  9. Refuting phylogenetic relationships

    PubMed Central

    Bucknam, James; Boucher, Yan; Bapteste, Eric

    2006-01-01

    Background Phylogenetic methods are philosophically grounded, and so can be philosophically biased in ways that limit explanatory power. This constitutes an important methodologic dimension not often taken into account. Here we address this dimension in the context of concatenation approaches to phylogeny. Results We discuss some of the limits of a methodology restricted to verificationism, the philosophy on which gene concatenation practices generally rely. As an alternative, we describe a software which identifies and focuses on impossible or refuted relationships, through a simple analysis of bootstrap bipartitions, followed by multivariate statistical analyses. We show how refuting phylogenetic relationships could in principle facilitate systematics. We also apply our method to the study of two complex phylogenies: the phylogeny of the archaea and the phylogeny of the core of genes shared by all life forms. While many groups are rejected, our results left open a possible proximity of N. equitans and the Methanopyrales, of the Archaea and the Cyanobacteria, and as well the possible grouping of the Methanobacteriales/Methanoccocales and Thermosplasmatales, of the Spirochaetes and the Actinobacteria and of the Proteobacteria and firmicutes. Conclusion It is sometimes easier (and preferable) to decide which species do not group together than which ones do. When possible topologies are limited, identifying local relationships that are rejected may be a useful alternative to classical concatenation approaches aiming to find a globally resolved tree on the basis of weak phylogenetic markers. Reviewers This article was reviewed by Mark Ragan, Eugene V Koonin and J Peter Gogarten. PMID:16956399

  10. Caminalcule Phylogenetic Exercise

    NSDL National Science Digital Library

    Bill Ausich

    This exercise prepares students to think about the data, assumptions, and interpretations that are part of a phylogenetic analysis. It comes in five parts. The first part is all of the data: all specimens and age dates for all specimens. This simulates the impossible, a complete fossil record. The second part has 10% of the specimens randomly removed (an imperfect fossil record), but all age information is provided for the 90% given. Similarly, the third and fourth parts have 20% (different 20%s) of the data randomly removed, and all information is provided for the 80% of remaining specimens (a more imperfect fossil record). The fifth part has dates only for the modern forms; all other dates are removed. This simulates the situation for a group lacking a fossil record or a situation where the fossil record is ignored. Depending on the class size, students either individually or in groups develop a phylogeny from their data prior to class time. In class we lay everything out on tables and compare and contrast the various phylogenies and in the process discuss many of the basic assumptions, practices, biases, etc. of phylogenetic reconstruction. You could make this more complex and have students code things into MacClade, Paup, etc.; however, I use this for the concepts of phylogenetic reconstruction only.

  11. Novel Phylogenetic Studies of Genomic Sequence Fragments Derived from Uncultured Microbe Mixtures in Environmental and Clinical Samples

    Microsoft Academic Search

    Takashi Abe; Hideaki Sugawara; Makoto Kinouchi; Shigehiko Kanaya; Toshimichi Ikemura

    2005-01-01

    A self-organizing map (SOM) was developed as a novel bioinformatics strategy for phylogenetic classification of sequence fragments obtained from pooled genome samples of uncultured microbes in environmental and clinical samples. This phylogenetic classification was possible without either orthologous sequence sets or sequence alignments. We first constructed SOMs for tetranucleotide frequencies in 210 000 5 kb sequence fragments obtained

  12. Learning Weighted Naive Bayes with Accurate Ranking Harry Zhang

    E-print Network

    Zhang, Huajie "Harry"

    Learning Weighted Naive Bayes with Accurate Ranking Harry Zhang Faculty of Computer Science Bayes is one of most effective classification algorithms. In many applications, however, a ranking of examples are more desirable than just classification. How to extend naive Bayes to improve its

  13. A Multichannel Approach to Fingerprint Classification

    Microsoft Academic Search

    Anil K. Jain; Salil Prabhakar; Lin Hong

    1999-01-01

    Fingerprint classification provides an important indexing mechanism in a fingerprint database. An accurate and consistent classification can greatly reduce fingerprint matching time for a large database. We present a fingerprint classification algorithm which is able to achieve an accuracy better than previously reported in the literature. We classify fingerprints into five categories: whorl, right loop, left loop, arch, and tented

  14. Classification 1: Classification Scheme

    NSDL National Science Digital Library

    This lesson shows students that living things can be sorted into groups in many ways using various features to decide which things belong to which group and that classification schemes will vary with purpose. It is the first of a two-part series on classification. At this grade level, students should have the opportunity to learn about an increasing variety of living organisms, both the familiar and the exotic, and should become more precise in identifying similarities and differences among them. Firsthand observation of the living environment is essential for students to gain an understanding of the differences among organisms. This lesson is intended to supplement direct investigations made by students by using the Internet to expose them to a variety of living organisms, as well as encourage them to start developing classification schemes of their own.

  15. Phylogenetic Comparative Assembly

    NASA Astrophysics Data System (ADS)

    Husemann, Peter; Stoye, Jens

    Recent high throughput sequencing technologies are capable of generating a huge amount of data for bacterial genome sequencing projects. Although current sequence assemblers successfully merge the overlapping reads, often several contigs remain which cannot be assembled any further. It is still costly and time consuming to close all the gaps in order to acquire the whole genomic sequence. Here we propose an algorithm that takes several related genomes and their phylogenetic relationships into account to create a contig adjacency graph. From this a layout graph can be computed which indicates putative adjacencies of the contigs in order to aid biologists in finishing the complete genomic sequence.

  16. The Phylogenetic Diversity of Metagenomes

    PubMed Central

    Kembel, Steven W.; Eisen, Jonathan A.; Pollard, Katherine S.; Green, Jessica L.

    2011-01-01

    Phylogenetic diversity—patterns of phylogenetic relatedness among organisms in ecological communities—provides important insights into the mechanisms underlying community assembly. Studies that measure phylogenetic diversity in microbial communities have primarily been limited to a single marker gene approach, using the small subunit of the rRNA gene (SSU-rRNA) to quantify phylogenetic relationships among microbial taxa. In this study, we present an approach for inferring phylogenetic relationships among microorganisms based on the random metagenomic sequencing of DNA fragments. To overcome challenges caused by the fragmentary nature of metagenomic data, we leveraged fully sequenced bacterial genomes as a scaffold to enable inference of phylogenetic relationships among metagenomic sequences from multiple phylogenetic marker gene families. The resulting metagenomic phylogeny can be used to quantify the phylogenetic diversity of microbial communities based on metagenomic data sets. We applied this method to understand patterns of microbial phylogenetic diversity and community assembly along an oceanic depth gradient, and compared our findings to previous studies of this gradient using SSU-rRNA gene and metagenomic analyses. Bacterial phylogenetic diversity was highest at intermediate depths beneath the ocean surface, whereas taxonomic diversity (diversity measured by binning sequences into taxonomically similar groups) showed no relationship with depth. Phylogenetic diversity estimates based on the SSU-rRNA gene and the multi-gene metagenomic phylogeny were broadly concordant, suggesting that our approach will be applicable to other metagenomic data sets for which corresponding SSU-rRNA gene sequences are unavailable. Our approach opens up the possibility of using metagenomic data to study microbial diversity in a phylogenetic context. PMID:21912589
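
    The abstract does not state which diversity metric was computed. As one common possibility, the sketch below implements Faith's phylogenetic diversity: the total length of all branches on the paths from a set of taxa to the root. The tree encoding (a child-to-parent dictionary plus a dictionary of branch lengths) and the function name are assumptions made for this illustration.

```python
def faith_pd(parent, length, taxa):
    """Sum the lengths of all branches linking the given taxa to the root.

    parent: maps each non-root node to its parent node.
    length: maps each non-root node to the length of the branch above it.
    """
    seen = set()
    for taxon in taxa:
        node = taxon
        # Walk up until the root, or until a branch already counted;
        # once a node is counted, all of its ancestors are too.
        while node in parent and node not in seen:
            seen.add(node)
            node = parent[node]
    return sum(length[node] for node in seen)

# Toy tree: ((A:1, B:2)X:3, C:4)root
parent = {"A": "X", "B": "X", "X": "root", "C": "root"}
length = {"A": 1.0, "B": 2.0, "X": 3.0, "C": 4.0}

pd_ab = faith_pd(parent, length, ["A", "B"])  # 1 + 2 + 3 = 6.0
pd_ac = faith_pd(parent, length, ["A", "C"])  # 1 + 3 + 4 = 8.0
```

    A and C are more distantly related than A and B, which is why the pair A, C covers more branch length even though both samples contain two taxa.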

  17. Phylogenetic trees in bioinformatics

    SciTech Connect

    Burr, Tom L [Los Alamos National Laboratory

    2008-01-01

    Genetic data is often used to infer evolutionary relationships among a collection of viruses, bacteria, animal or plant species, or other operational taxonomic units (OTU). A phylogenetic tree depicts such relationships and provides a visual representation of the estimated branching order of the OTUs. Tree estimation is unique for several reasons, including: the types of data used to represent each OTU; the use of probabilistic nucleotide substitution models; the inference goals involving both tree topology and branch length, and the huge number of possible trees for a given sample of a very modest number of OTUs, which implies that finding the best tree(s) to describe the genetic data for each OTU is computationally demanding. Bioinformatics is too large a field to review here. We focus on that aspect of bioinformatics that includes study of similarities in genetic data from multiple OTUs. Although research questions are diverse, a common underlying challenge is to estimate the evolutionary history of the OTUs. Therefore, this paper reviews the role of phylogenetic tree estimation in bioinformatics, available methods and software, and identifies areas for additional research and development.
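
    To give a sense of the "huge number of possible trees" mentioned above, the sketch below uses the standard counting results for fully resolved (binary) trees on n labeled OTUs: (2n-5)!! unrooted topologies for n >= 3 and (2n-3)!! rooted topologies for n >= 2. The function names are illustrative and do not come from the paper.

```python
def double_factorial_odd(m):
    """Product 1 * 3 * 5 * ... * m for odd m >= 1."""
    result = 1
    for k in range(3, m + 1, 2):
        result *= k
    return result

def num_unrooted_trees(n):
    """Number of unrooted binary tree topologies on n labeled OTUs."""
    if n < 3:
        raise ValueError("need at least 3 OTUs")
    return double_factorial_odd(2 * n - 5)

def num_rooted_trees(n):
    """Number of rooted binary tree topologies on n labeled OTUs."""
    if n < 2:
        raise ValueError("need at least 2 OTUs")
    return double_factorial_odd(2 * n - 3)

unrooted_10 = num_unrooted_trees(10)  # 2027025
rooted_10 = num_rooted_trees(10)      # 34459425
```

    Even ten OTUs already allow about two million unrooted topologies, which is why exhaustive search is replaced by heuristics for all but the smallest data sets.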

  18. Phylogenetic Quantification of Intra-tumour Heterogeneity

    PubMed Central

    Schwarz, Roland F.; Trinh, Anne; Sipos, Botond; Brenton, James D.; Goldman, Nick; Markowetz, Florian

    2014-01-01

    Intra-tumour genetic heterogeneity is the result of ongoing evolutionary change within each cancer. The expansion of genetically distinct sub-clonal populations may explain the emergence of drug resistance, and if so, would have prognostic and predictive utility. However, methods for objectively quantifying tumour heterogeneity have been missing and are particularly difficult to establish in cancers where predominant copy number variation prevents accurate phylogenetic reconstruction owing to horizontal dependencies caused by long and cascading genomic rearrangements. To address these challenges, we present MEDICC, a method for phylogenetic reconstruction and heterogeneity quantification based on a Minimum Event Distance for Intra-tumour Copy-number Comparisons. Using a transducer-based pairwise comparison function, we determine optimal phasing of major and minor alleles, as well as evolutionary distances between samples, and are able to reconstruct ancestral genomes. Rigorous simulations and an extensive clinical study show the power of our method, which outperforms state-of-the-art competitors in reconstruction accuracy, and additionally allows unbiased numerical quantification of tumour heterogeneity. Accurate quantification and evolutionary inference are essential to understand the functional consequences of tumour heterogeneity. The MEDICC algorithms are independent of the experimental techniques used and are applicable to both next-generation sequencing and array CGH data. PMID:24743184

  19. An Optimization-Based Sampling Scheme for Phylogenetic Trees

    NASA Astrophysics Data System (ADS)

    Misra, Navodit; Blelloch, Guy; Ravi, R.; Schwartz, Russell

    Much modern work in phylogenetics depends on statistical sampling approaches to phylogeny construction to estimate probability distributions of possible trees for any given input data set. Our theoretical understanding of sampling approaches to phylogenetics remains far less developed than that for optimization approaches, however, particularly with regard to the number of sampling steps needed to produce accurate samples of tree partition functions. Despite the many advantages in principle of being able to sample trees from sophisticated probabilistic models, we have little theoretical basis for concluding that the prevailing sampling approaches do in fact yield accurate samples from those models within realistic numbers of steps. We propose a novel approach to phylogenetic sampling intended to be both efficient in practice and more amenable to theoretical analysis than the prevailing methods. The method depends on replacing the standard tree rearrangement moves with an alternative Markov model in which one solves a theoretically hard but practically tractable optimization problem on each step of sampling. The resulting method can be applied to a broad range of standard probability models, yielding practical algorithms for efficient sampling and rigorous proofs of accurate sampling for some important special cases. We demonstrate the efficiency and versatility of the method in an analysis of uncertainty in tree inference over varying input sizes. In addition to providing a new practical method for phylogenetic sampling, the technique is likely to prove applicable to many similar problems involving sampling over combinatorial objects weighted by a likelihood model.

  20. Phylogenetic Toric Varieties on Graphs

    E-print Network

    Buczynska, Weronika J.

    2010-10-12

    We define the phylogenetic model of a trivalent graph as a generalization of a binary symmetric model of a trivalent phylogenetic tree. If the underlying graph is a tree, the model has a parametrization that can be expressed in terms of the tree...

  1. Construct a phylogenetic tree

    NSDL National Science Digital Library

    Brian White

    2012-06-28

    This web page will construct a phylogenetic tree of the creatures you select below. It will use the protein sequences of the protein cytochrome c from each of these organisms to construct the tree. Select the desired creatures from the lists below. To select more than one in the same list, hold down the apple key (on Macs); the control key (on PCs); on the Suns, you just click. If you want to clear your selections and start over, click the "Clear all selections" button. You must also choose one and only one outgroup organism so that your tree will have a root. This is especially important for the parsimony analysis. The outgroup organism should not be closely related to the other organisms. When you have made the selections you want, click the "calculate tree" button. Your request will then be processed. This may take a while, so please be patient.

  2. Phylogenetic and Biogeographic Analysis of Sphaerexochine Trilobites

    PubMed Central

    Congreve, Curtis R.; Lieberman, Bruce S.

    2011-01-01

    Background: Sphaerexochinae is a speciose and widely distributed group of cheirurid trilobites. Their temporal range extends from the earliest Ordovician through the Silurian, and they survived the end Ordovician mass extinction event (the second largest mass extinction in Earth history). Prior to this study, the individual evolutionary relationships within the group had yet to be determined utilizing rigorous phylogenetic methods. Understanding these evolutionary relationships is important for producing a stable classification of the group, and will be useful in elucidating the effects the end Ordovician mass extinction had on the evolutionary and biogeographic history of the group. Methodology/Principal Findings: Cladistic parsimony analysis of cheirurid trilobites assigned to the subfamily Sphaerexochinae was conducted to evaluate phylogenetic patterns and produce a hypothesis of relationship for the group. This study utilized the program TNT, and the analysis included thirty-one taxa and thirty-nine characters. The results of this analysis were then used in a Lieberman-modified Brooks Parsimony Analysis to analyze biogeographic patterns during the Ordovician-Silurian. Conclusions/Significance: The genus Sphaerexochus was found to be monophyletic, consisting of two smaller clades (one composed entirely of Ordovician species and another composed of Silurian and Ordovician species). By contrast, the genus Kawina was found to be paraphyletic. It is a basal grade that also contains taxa formerly assigned to Cydonocephalus. Phylogenetic patterns suggest Sphaerexochinae is a relatively distinctive trilobite clade because it appears to have been largely unaffected by the end Ordovician mass extinction. Finally, the biogeographic analysis yields two major conclusions about Sphaerexochus biogeography: Bohemia and Avalonia were close enough during the Silurian to exchange taxa; and during the Ordovician there was dispersal between Eastern Laurentia and the Yangtze block (South China) and between Eastern Laurentia and Avalonia. PMID:21738632

  3. Editor's Note: Classification Matters

    NSDL National Science Digital Library

    Chris Ohana

    2009-03-01

    Classification skills, so foundational to science, must be taught. While children have a passion and drive to organize and categorize their experiences, sometimes the way they organize them doesn't lead to a worthwhile or accurate scientific understanding. Just as putting a pencil in the hands of a child doesn't automatically teach them to write, having a child sort rocks won't lead to an understanding of classification. The articles in this issue aim to help you teach students how to classify successfully and with purpose.

  4. Contextual classification of multispectral image data: Approximate algorithm

    NASA Technical Reports Server (NTRS)

    Tilton, J. C. (principal investigator)

    1980-01-01

    An approximation to a classification algorithm incorporating spatial context information in a general, statistical manner is presented which is computationally less intensive. Classifications that are nearly as accurate are produced.

  5. Norovirus classification and proposed strain nomenclature

    Microsoft Academic Search

    Du-Ping Zheng; Tamie Ando; Rebecca L. Fankhauser; R. Suzanne Beard; Roger I. Glass; Stephan S. Monroe

    2006-01-01

    Without a virus culture system, genetic analysis becomes the principal method to classify norovirus (NoV) strains. Currently, classification of NoV strains beneath the species level has been based on sequences from different regions of the viral genome. As a result, the phylogenetic insights of some virus were not appropriately interpreted, and no consensus has been reached to establish a uniform

  6. Spectral classification

    NASA Astrophysics Data System (ADS)

    Jaschek, C.

    Taxonomic classification of astronomically observed stellar objects is described in terms of spectral properties. Stars receive a classification containing a letter, number, and a Roman numeral, which relates the star to other stars of higher or lower Roman numerals. The citation indicates the stellar chromatic emission in relation to the wavelengths of other stars. Standards are chosen from the available objects detected. Various classification schemes such as the MK, HD, and the Barbier-Chalonge-Divan systems are defined, including examples of indexing differences. Details delineating the separations between classifications are discussed with reference to the information content in spectral and in photometric classification schemes. The parameters usually used for classification include the temperature, luminosity, reddening, binarity, rotation, magnetic field, and elemental abundance or composition. The inclusion of recently discovered extended wavelength characteristics in nominal classifications is outlined, together with techniques involved in automated classification.

  7. Molecular identification of hepatitis B virus genotypes/subgenotypes: Revised classification hurdles and updated resolutions

    PubMed Central

    Pourkarim, Mahmoud Reza; Amini-Bavil-Olyaee, Samad; Kurbanov, Fuat; Van Ranst, Marc; Tacke, Frank

    2014-01-01

    The clinical course of infections with the hepatitis B virus (HBV) substantially varies between individuals, as a consequence of a complex interplay between viral, host, environmental and other factors. Due to the high genetic variability of HBV, the virus can be categorized into different HBV genotypes and subgenotypes, which considerably differ with respect to geographical distribution, transmission routes, disease progression, responses to antiviral therapy or vaccination, and clinical outcome measures such as cirrhosis or hepatocellular carcinoma. However, HBV (sub)genotyping has caused some controversies in the past due to misclassifications and incorrect interpretations of different genotyping methods. Thus, an accurate, holistic and dynamic classification system is essential. In this review article, we aimed at highlighting potential pitfalls in genetic and phylogenetic analyses of HBV and suggest novel terms for HBV classification. Analyzing full-length genome sequences when classifying genotypes and subgenotypes is the foremost prerequisite of this classification system. Careful attention must be paid to all aspects of phylogenetic analysis, such as bootstrapping values and meeting the necessary thresholds for (sub)genotyping. Quasi-subgenotype refers to subgenotypes that were incorrectly suggested to be novel. As many of these strains were misclassified due to genetic differences resulting from recombination, we propose the term “recombino-subgenotype”. Moreover, immigration is an important confounding facet of global HBV distribution and substantially changes the geographic pattern of HBV (sub)genotypes. We therefore suggest the term “immigro-subgenotype” to distinguish exotic (sub)genotypes from native ones. We are strongly convinced that applying these two proposed terms in HBV classification will help harmonize this rapidly progressing field and allow for improved prophylaxis, diagnosis and treatment. PMID:24966586

  8. Evaluating Support for the Current Classification of Eukaryotic Diversity

    Microsoft Academic Search

    Laura Wegener Parfrey; Erika Barbero; Elyse Lasser; Micah Saul Dunthorn; Debashish Bhattacharya; David J. Patterson; Laura A. Katz

    2005-01-01

    Perspectives on the classification of eukaryotic diversity have changed rapidly in recent years, as the four eukaryotic groups within the five-kingdom classification (plants, animals, fungi, and protists) have been transformed through numerous permutations into the current system of six "supergroups." The intent of the supergroup classification system is to unite microbial and macroscopic eukaryotes based on phylogenetic inference. This supergroup approach is

  9. Concordance analysis in mitogenomic phylogenetics.

    PubMed

    Weisrock, David W

    2012-10-01

    Here I advocate the utility of Bayesian concordance analysis as a mechanism for exploring the magnitude and source of phylogenetic signal in concatenated mitogenomic phylogenetic studies. While typically applied to the study of independently evolving gene trees, Bayesian concordance analysis can also be applied to linked, but individually analyzed, gene regions using a prior probability that reflects the expectation of similar phylogenetic reconstructions. For true branches in the mitogenomic tree, concordance factors should represent the number of gene regions that contain phylogenetic signal for a particular clade. As a demonstration of the application of Bayesian concordance analysis to empirical data, I analyzed two different salamander (Hynobiidae and Plethodontidae) mitogenomic data sets using a gene-based partitioning strategy. The results revealed many strongly supported clades in the concatenated trees that have high concordance factors, permitting the inference that these are robustly resolved through phylogenetic signal distributed across the mitogenome. In contrast, a number of strongly supported clades in the concatenated tree received low concordance factors, indicating that their reconstruction is either driven primarily by phylogenetic signal in a small number of gene regions, or that they are inconsistent reconstructions influenced by properties of the data that can produce inaccurate trees (e.g., compositional bias, selection, etc.). Exploration of the Bayesian joint posterior distribution of trees highlighted partitions that contribute phylogenetic information to similar clade reconstructions. This approach was particularly insightful in the hynobiid data, where different combinations of genes were identified that support alternative tree reconstructions. Concatenated analysis of these different subsets of genes highlighted through Bayesian concordance analysis produced strongly supported and contrasting trees, demonstrating the potential for inconsistency in concatenated mitogenomic phylogenetics. The overall results presented here suggest that Bayesian concordance analysis can serve as an effective exploration of the influence of different gene regions in mitogenomic (and other organellar genomic) phylogenetic studies. PMID:22705824
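
    The idea of a concordance factor in the abstract above can be illustrated with a deliberately simplified sketch. It assumes each gene region has already been reduced to a single tree, represented as a set of clades; a real Bayesian concordance analysis works from posterior distributions of trees, which this sketch does not attempt. The function name and data layout are hypothetical.

```python
def concordance_factor(clade, gene_tree_clades):
    """Return (count, proportion) of gene trees that contain the clade.

    clade            -- iterable of taxon names
    gene_tree_clades -- list with one set of frozenset clades per gene region
    Simplification: one point-estimate tree per gene, no posterior weighting.
    """
    target = frozenset(clade)
    count = sum(1 for clades in gene_tree_clades if target in clades)
    return count, count / len(gene_tree_clades)


# Three hypothetical gene regions; two of them recover the clade {A, B}.
gene_trees = [
    {frozenset({"A", "B"}), frozenset({"A", "B", "C"})},
    {frozenset({"A", "B"}), frozenset({"A", "B", "C"})},
    {frozenset({"B", "C"}), frozenset({"A", "B", "C"})},
]
count, proportion = concordance_factor({"A", "B"}, gene_trees)
```

    A clade that is strongly supported in a concatenated tree but found in only a small share of the per-gene trees would be flagged by a low proportion here, which is the pattern the abstract describes.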

  10. Phylogenetics and the Human Microbiome

    PubMed Central

    Matsen, Frederick A.

    2015-01-01

    The human microbiome is the ensemble of genes in the microbes that live inside and on the surface of humans. Because microbial sequencing information is now much easier to come by than phenotypic information, there has been an explosion of sequencing and genetic analysis of microbiome samples. Much of the analytical work for these sequences involves phylogenetics, at least indirectly, but methodology has developed in a somewhat different direction than for other applications of phylogenetics. In this article, I review the field and its methods from the perspective of a phylogeneticist, as well as describing current challenges for phylogenetics coming from this type of work. PMID:25102857

  11. Absolute Pitch in Boreal Chickadees and Humans: Exceptions that Test a Phylogenetic Rule

    ERIC Educational Resources Information Center

    Weisman, Ronald G.; Balkwill, Laura-Lee; Hoeschele, Marisa; Moscicki, Michele K.; Bloomfield, Laurie L.; Sturdy, Christopher B.

    2010-01-01

    This research examined generality of the phylogenetic rule that birds discriminate frequency ranges more accurately than mammals. Human absolute pitch chroma possessors accurately tracked transitions between frequency ranges. Independent tests showed that they used note naming (pitch chroma) to remap the tones into ranges; neither possessors nor…

  12. Phylogenetic Analysis of a Spontaneous Cocoa Bean Fermentation Metagenome Reveals New Insights into Its Bacterial and Fungal Community Diversity

    PubMed Central

    Illeghems, Koen; De Vuyst, Luc; Papalexandratou, Zoi; Weckx, Stefan

    2012-01-01

    This is the first report on the phylogenetic analysis of the community diversity of a single spontaneous cocoa bean box fermentation sample through a metagenomic approach involving 454 pyrosequencing. Several sequence-based and composition-based taxonomic profiling tools were used and evaluated to avoid software-dependent results and their outcome was validated by comparison with previously obtained culture-dependent and culture-independent data. Overall, this approach revealed a wider bacterial (mainly γ-Proteobacteria) and fungal diversity than previously found. Further, the use of a combination of different classification methods, in a software-independent way, helped to understand the actual composition of the microbial ecosystem under study. In addition, bacteriophage-related sequences were found. The bacterial diversity depended partially on the methods used, as composition-based methods predicted a wider diversity than sequence-based methods, and as classification methods based solely on phylogenetic marker genes predicted a more restricted diversity compared with methods that took all reads into account. The metagenomic sequencing analysis identified Hanseniaspora uvarum, Hanseniaspora opuntiae, Saccharomyces cerevisiae, Lactobacillus fermentum, and Acetobacter pasteurianus as the prevailing species. Also, the presence of occasional members of the cocoa bean fermentation process was revealed (such as Erwinia tasmaniensis, Lactobacillus brevis, Lactobacillus casei, Lactobacillus rhamnosus, Lactococcus lactis, Leuconostoc mesenteroides, and Oenococcus oeni). Furthermore, the sequence reads associated with viral communities were of a restricted diversity, dominated by Myoviridae and Siphoviridae, and reflecting Lactobacillus as the dominant host. To conclude, an accurate overview of all members of a cocoa bean fermentation process sample was revealed, indicating the superiority of metagenomic sequencing over previously used techniques. PMID:22666442

  13. Phylogenetic analysis of a spontaneous cocoa bean fermentation metagenome reveals new insights into its bacterial and fungal community diversity.

    PubMed

    Illeghems, Koen; De Vuyst, Luc; Papalexandratou, Zoi; Weckx, Stefan

    2012-01-01

    This is the first report on the phylogenetic analysis of the community diversity of a single spontaneous cocoa bean box fermentation sample through a metagenomic approach involving 454 pyrosequencing. Several sequence-based and composition-based taxonomic profiling tools were used and evaluated to avoid software-dependent results and their outcome was validated by comparison with previously obtained culture-dependent and culture-independent data. Overall, this approach revealed a wider bacterial (mainly γ-Proteobacteria) and fungal diversity than previously found. Further, the use of a combination of different classification methods, in a software-independent way, helped to understand the actual composition of the microbial ecosystem under study. In addition, bacteriophage-related sequences were found. The bacterial diversity depended partially on the methods used, as composition-based methods predicted a wider diversity than sequence-based methods, and as classification methods based solely on phylogenetic marker genes predicted a more restricted diversity compared with methods that took all reads into account. The metagenomic sequencing analysis identified Hanseniaspora uvarum, Hanseniaspora opuntiae, Saccharomyces cerevisiae, Lactobacillus fermentum, and Acetobacter pasteurianus as the prevailing species. Also, the presence of occasional members of the cocoa bean fermentation process was revealed (such as Erwinia tasmaniensis, Lactobacillus brevis, Lactobacillus casei, Lactobacillus rhamnosus, Lactococcus lactis, Leuconostoc mesenteroides, and Oenococcus oeni). Furthermore, the sequence reads associated with viral communities were of a restricted diversity, dominated by Myoviridae and Siphoviridae, and reflecting Lactobacillus as the dominant host. To conclude, an accurate overview of all members of a cocoa bean fermentation process sample was revealed, indicating the superiority of metagenomic sequencing over previously used techniques. PMID:22666442

  14. Rational disagreements in phylogenetics.

    PubMed

    Mc Manus, Fabrizzio Guerrero

    2009-06-01

    This paper addresses the general problem of how to rationally choose an algorithm for phylogenetic inference. Specifically, the controversy between maximum likelihood (ML) and maximum parsimony (MP) perspectives is reframed within the philosophical issue of theory choice. A Kuhnian approach in which rationality is bounded and value-laden is offered and construed through the notion of a Style of Modeling. A Style is divided into four stages: collecting remnant models, constructing models of taxonomical identity, implementing modeling algorithms, and finally inferring and confirming evolutionary trees or cladograms. The identification and investigation of styles is useful for exploring sociological and epistemological issues such as individuating scientific communities and assessing the rationality of algorithm choice. Regarding the last point, this paper suggests that the values motivating ML and MP perspectives are justified but only contextually; these algorithms also have normative force because they can be therapeutic by allowing us to rationally choose among several competing trees, nonetheless this force is limited and cannot be used in order to decide the controversy tout court. PMID:19229637

  15. Robustness of Ancestral Sequence Reconstruction to Phylogenetic Uncertainty

    PubMed Central

    Hanson-Smith, Victor; Kolaczkowski, Bryan; Thornton, Joseph W.

    2010-01-01

    Ancestral sequence reconstruction (ASR) is widely used to formulate and test hypotheses about the sequences, functions, and structures of ancient genes. Ancestral sequences are usually inferred from an alignment of extant sequences using a maximum likelihood (ML) phylogenetic algorithm, which calculates the most likely ancestral sequence assuming a probabilistic model of sequence evolution and a specific phylogeny—typically the tree with the ML. The true phylogeny is seldom known with certainty, however. ML methods ignore this uncertainty, whereas Bayesian methods incorporate it by integrating the likelihood of each ancestral state over a distribution of possible trees. It is not known whether Bayesian approaches to phylogenetic uncertainty improve the accuracy of inferred ancestral sequences. Here, we use simulation-based experiments under both simplified and empirically derived conditions to compare the accuracy of ASR carried out using ML and Bayesian approaches. We show that incorporating phylogenetic uncertainty by integrating over topologies very rarely changes the inferred ancestral state and does not improve the accuracy of the reconstructed ancestral sequence. Ancestral state reconstructions are robust to uncertainty about the underlying tree because the conditions that produce phylogenetic uncertainty also make the ancestral state identical across plausible trees; conversely, the conditions under which different phylogenies yield different inferred ancestral states produce little or no ambiguity about the true phylogeny. Our results suggest that ML can produce accurate ASRs, even in the face of phylogenetic uncertainty. Using Bayesian integration to incorporate this uncertainty is neither necessary nor beneficial. PMID:20368266
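
    The contrast drawn above between using one tree and integrating over a distribution of trees can be made concrete with a small sketch. It assumes the per-tree state probabilities and the tree posterior probabilities are already available; all names and numbers are hypothetical and do not come from the study.

```python
def integrate_over_trees(tree_posteriors, state_probs):
    """Average ancestral-state probabilities over a distribution of trees.

    tree_posteriors -- {tree_id: P(tree)}, assumed to sum to 1
    state_probs     -- {tree_id: {state: P(state | tree)}}
    Returns {state: sum over trees of P(tree) * P(state | tree)}.
    """
    combined = {}
    for tree, p_tree in tree_posteriors.items():
        for state, p_state in state_probs[tree].items():
            combined[state] = combined.get(state, 0.0) + p_tree * p_state
    return combined


def best_state(probs):
    """State with the highest probability."""
    return max(probs, key=probs.get)


# Hypothetical values: T1 is the best single tree, T2 a plausible alternative.
tree_posteriors = {"T1": 0.6, "T2": 0.4}
state_probs = {
    "T1": {"A": 0.9, "G": 0.1},
    "T2": {"A": 0.7, "G": 0.3},
}
integrated = integrate_over_trees(tree_posteriors, state_probs)
single_tree_call = best_state(state_probs["T1"])
integrated_call = best_state(integrated)
```

    In this toy case both approaches pick the same state, which mirrors the kind of agreement the abstract reports; the sketch shows only the arithmetic, not why the agreement tends to hold.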

  16. Quantum Simulation of Phylogenetic Trees

    E-print Network

    Demosthenes Ellinas; Peter Jarvis

    2011-05-09

    Quantum simulations constructing probability tensors of biological multi-taxa in phylogenetic trees are proposed, in terms of positive trace preserving maps, describing evolving systems of quantum walks with multiple walkers. Basic phylogenetic models applying on trees of various topologies are simulated following appropriate decoherent quantum circuits. Quantum simulations of statistical inference for aligned sequences of biological characters are provided in terms of a quantum pruning map operating on likelihood operator observables, utilizing state-observable duality and measurement theory.

  17. Cyber-infrastructure for Fusarium (CiF): Three integrated platforms supporting strain identification, phylogenetics, comparative genomics, and knowledge sharing

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The fungal genus Fusarium includes many plant and/or animal pathogenic species and produces diverse toxins. Although accurate identification is critical for managing such threats, it is difficult to identify Fusarium morphologically. Fortunately, extensive molecular phylogenetic studies, founded on ...

  18. Phylogenetic lineages in Pseudocercospora

    PubMed Central

    Crous, P.W.; Braun, U.; Hunter, G.C.; Wingfield, M.J.; Verkley, G.J.M.; Shin, H.-D.; Nakashima, C.; Groenewald, J.Z.

    2013-01-01

    Pseudocercospora is a large cosmopolitan genus of plant pathogenic fungi that are commonly associated with leaf and fruit spots as well as blights on a wide range of plant hosts. They occur in arid as well as wet environments and in a wide range of climates including cool temperate, sub-tropical and tropical regions. Pseudocercospora is now treated as a genus in its own right, although formerly recognised as either an anamorphic state of Mycosphaerella or having mycosphaerella-like teleomorphs. The aim of this study was to sequence the partial 28S nuclear ribosomal RNA gene of a selected set of isolates to resolve phylogenetic generic limits within the Pseudocercospora complex. From these data, 14 clades are recognised, six of which cluster in Mycosphaerellaceae. Pseudocercospora s. str. represents a distinct clade, sister to Passalora eucalypti, and a clade representing the genera Scolecostigmina, Trochophora and Pallidocercospora gen. nov., taxa formerly accommodated in the Mycosphaerella heimii complex and characterised by smooth, pale brown conidia, as well as the formation of red crystals in agar media. Other clades in Mycosphaerellaceae include Sonderhenia, Microcyclosporella, and Paracercospora. Pseudocercosporella resides in a large clade along with Phloeospora, Miuraea, Cercospora and Septoria. Additional clades represent Dissoconiaceae, Teratosphaeriaceae, Cladosporiaceae, and the genera Xenostigmina, Strelitziana, Cyphellophora and Thedgonia. The genus Phaeomycocentrospora is introduced to accommodate Mycocentrospora cantuariensis, primarily distinguished from Pseudocercospora based on its hyaline hyphae, broad conidiogenous loci and hila. Host specificity was considered for 146 species of Pseudocercospora occurring on 115 host genera from 33 countries. Partial nucleotide sequence data for three gene loci, ITS, EF-1α, and ACT suggest that the majority of these species are host specific. Species identified on the basis of host, symptomatology and general morphology, within the same geographic region, frequently differed phylogenetically, indicating that the application of European and American names to Asian taxa, and vice versa, was often not warranted. Taxonomic novelties: New genera - Pallidocercospora Crous, Phaeomycocentrospora Crous, H.D. Shin & U. Braun; New species - Cercospora eucommiae Crous, U. Braun & H.D. Shin, Microcyclospora quercina Crous & Verkley, Pseudocercospora ampelopsis Crous, U. Braun & H.D. Shin, Pseudocercospora cercidicola Crous, U. Braun & C. Nakash., Pseudocercospora crispans G.C. Hunter & Crous, Pseudocercospora crocea Crous, U. Braun, G.C. Hunter & H.D. Shin, Pseudocercospora haiweiensis Crous & X. Zhou, Pseudocercospora humulicola Crous, U. Braun & H.D. Shin, Pseudocercospora marginalis G.C. Hunter, Crous, U. Braun & H.D. Shin, Pseudocercospora ocimi-basilici Crous, M.E. Palm & U. Braun, Pseudocercospora plectranthi G.C. Hunter, Crous, U. Braun & H.D. Shin, Pseudocercospora proteae Crous, Pseudocercospora pseudostigmina-platani Crous, U. Braun & H.D. Shin, Pseudocercospora pyracanthigena Crous, U. Braun & H.D. Shin, Pseudocercospora ravenalicola G.C. Hunter & Crous, Pseudocercospora rhamnellae G.C. Hunter, H.D. Shin, U. Braun & Crous, Pseudocercospora rhododendri-indici Crous, U. Braun & H.D. Shin, Pseudocercospora tibouchinigena Crous & U. Braun, Pseudocercospora xanthocercidis Crous, U. Braun & A. Wood, Pseudocercosporella koreana Crous, U. Braun & H.D. Shin; New combinations - Pallidocercospora acaciigena (Crous & M.J. Wingf.) Crous & M.J. Wingf., Pallidocercospora crystallina (Crous & M.J. Wingf.) Crous & M.J. Wingf., Pallidocercospora heimii (Crous) Crous, Pallidocercospora heimioides (Crous & M.J. Wingf.) Crous & M.J. Wingf., Pallidocercospora holualoana (Crous, Joanne E. Taylor & M.E. Palm) Crous, Pallidocercospora konae (Crous, Joanne E. Taylor & M.E. Palm) Crous, Pallidocercospora irregulariramosa (Crous & M.J. Wingf.) Crous & M.J. Wingf., Phaeomycocentrospora cantuariensis (E.S. Salmon & Wormald) Crous, H.D. Shin & U. Braun, Pseudocercospora hakeae (U. Braun & Cr

  19. Phylogenetic inference with weighted codon evolutionary distances.

    PubMed

    Criscuolo, Alexis; Michel, Christian J

    2009-04-01

    We develop a new approach to estimate a matrix of pairwise evolutionary distances from a codon-based alignment based on a codon evolutionary model. The method first computes a standard distance matrix for each of the three codon positions. Then these three distance matrices are weighted according to an estimate of the global evolutionary rate of each codon position and averaged into a unique distance matrix. Using a large set of both real and simulated codon-based alignments of nucleotide sequences, we show that this approach leads to distance matrices that have a significantly better treelikeness compared to those obtained by standard nucleotide evolutionary distances. We also propose an alternative weighting to eliminate the part of the noise often associated with some codon positions, particularly the third position, which is known to induce a fast evolutionary rate. Simulation results show that fast distance-based tree reconstruction algorithms on distance matrices based on this codon position weighting can lead to phylogenetic trees that are at least as accurate as, if not better than, those inferred by maximum likelihood. Finally, a well-known multigene dataset composed of eight yeast species and 106 codon-based alignments is reanalyzed and shows that our codon evolutionary distances allow building a phylogenetic tree which is similar to those obtained by non-distance-based methods (e.g., maximum parsimony and maximum likelihood) and also significantly improved compared to standard nucleotide evolutionary distance estimates. PMID:19308635
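
    The three steps described above (one distance matrix per codon position, a weight per position, a weighted average) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' estimator: the uncorrected p-distance stands in for the "standard distance", and the weights are passed in by the caller because the abstract does not say how they are derived from the rate estimates. All function names are hypothetical.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two equal-length sequences."""
    return sum(1 for x, y in zip(a, b) if x != y) / len(a)


def position_matrix(seqs, pos):
    """Pairwise p-distances using only codon position pos (0, 1 or 2)."""
    names = sorted(seqs)
    sites = {name: seqs[name][pos::3] for name in names}
    return {(i, j): p_distance(sites[i], sites[j])
            for i, j in combinations(names, 2)}


def weighted_codon_distance(seqs, weights):
    """Weighted average of the three per-position distance matrices.

    weights -- three non-negative numbers, one per codon position
               (assumed to be supplied by some external rate estimate).
    """
    matrices = [position_matrix(seqs, pos) for pos in range(3)]
    total = sum(weights)
    return {pair: sum(w * m[pair] for w, m in zip(weights, matrices)) / total
            for pair in matrices[0]}


# Toy in-frame alignment of two codons per sequence (hypothetical data).
seqs = {"A": "ATGAAA", "B": "ATGAAG", "C": "CTGACG"}
equal = weighted_codon_distance(seqs, (1, 1, 1))
no_third = weighted_codon_distance(seqs, (1, 1, 0))
```

    Setting the third weight to zero, as in `no_third`, is one simple way to mimic the alternative weighting mentioned in the abstract, which discounts noise from the third codon position; here it removes the only difference between sequences A and B.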

  20. Accurately determining inflationary perturbations

    E-print Network

    Andrew R Liddle; Ian J Grivell

    1997-01-08

    Cosmic microwave anisotropy satellites promise extremely accurate measures of the amplitude of perturbations in the universe. We use a numerical code to test the accuracy of existing approximate expressions for the amplitude of perturbations produced by single-field inflation models. We find that the second-order Stewart-Lyth calculation gives extremely accurate results, typically better than one percent. We use our code to carry out an expansion about the general power-law inflation solution, providing a fitting function giving results of even higher accuracy.

  1. Phylogenetic analysis of North American Elymus and the monogenomic Triticeae (Poaceae) using

    E-print Network

    Mason-Gamer, Roberta J.

    Phylogenetic analysis of North American Elymus and the monogenomic Triticeae (Poaceae) using three species of Elymus, which, under the current genomic system of classification, are almost all and previously published chloroplast DNA data from Elymus and from most of the monogenomic genera

  2. Phylogenetic Relationships Among East African Haplochromine Fish as Revealed by Short Interspersed Elements (SINEs)

    Microsoft Academic Search

    Yohey Terai; Naoko Takezaki; Werner E. Mayer; Herbert Tichy; Naoyuki Takahata; Jan Klein; Norihiro Okada

    2004-01-01

    Genomic DNA libraries were prepared from two endemic species of Lake Victoria haplochromine (cichlid) fish and used to isolate and characterize a set of short interspersed elements (SINEs). The distribution and sequences of the SINEs were used to infer phylogenetic relationships among East African haplochromines. The SINE-based classification divides the fish into four groups, which, in order of their divergence

  3. A phylogenetic analysis of the myxobacteria: basis for their classification

    NASA Technical Reports Server (NTRS)

    Shimkets, L.; Woese, C. R.

    1992-01-01

    The primary sequence and secondary structural features of the 16S rRNA were compared for 12 different myxobacteria representing all the known cultivated genera. Analysis of these data show the myxobacteria to form a monophyletic grouping consisting of three distinct families, which lies within the delta subdivision of the purple bacterial phylum. The composition of the families is consistent with differences in cell and spore morphology, cell behavior, and pigment and secondary metabolite production but is not correlated with the morphological complexity of the fruiting bodies. The Nannocystis exedens lineage has evolved at an unusually rapid pace and its rRNA shows numerous primary and secondary structural idiosyncrasies.

  4. Protein Classification Using Transductive Learning On Phylogenetic Profiles

    E-print Network

    Liao, Li

    ancestral genomes, are scored in a way to reflect their chances of developing divergence in the descendants. The scoring scheme also incorporates the likelihood of proteins presence in genomes as weighting factors vector machine. In a transductive learning scheme, when the SVM is used for classifying test data

  5. Leaf Classification

    NSDL National Science Digital Library

    2012-08-03

    In this activity, students learn about hierarchical classification systems and create their own classification system for leaves they collect in the field. They learn about the key characteristics for classification systems and see that there are multiple ways to classify objects. This is a learning activity associated with the GLOBE land cover/biology investigations and is supported by the Land Cover/Biology chapter of the GLOBE Teacher's Guide.

  6. A fast and accurate "shoebox" room acoustics simulator

    Microsoft Academic Search

    Steven M. Schimmel; Martin F. Müller; Norbert Dillier

    2009-01-01

    We present a new "shoebox" room acoustics simulator that is designed to support research into signal processing algorithms that are robust to reverberation. It is an improvement over existing room acoustics simulators because it is computationally fast, portable to many kinds of research environments, and flexible to use. The proposed simulator is also perceptually accurate because it models both specular

  7. Integrating Classification and Association Rule Mining

    Microsoft Academic Search

    Bing Liu; Wynne Hsu; Yiming Ma

    1998-01-01

    Classification rule mining aims to discover a small set of rules in the database that forms an accurate classifier. Association rule mining finds all the rules existing in the database that satisfy some minimum support and minimum confidence constraints. For association rule mining, the target of discovery is not pre-determined, while for classification rule mining there is one and only

  8. Classification of Instructional Programs: 2000 Edition.

    ERIC Educational Resources Information Center

    Morgan, Robert L.; Hunt, E. Stephen

    This third revision of the Classification of Instructional Programs (CIP) updates and modifies education program classifications, providing a taxonomic scheme that supports the accurate tracking, assessment, and reporting of field of study and program completions activity. This edition has also been adopted as the standard field of study taxonomy…

  9. Fingerprint classification

    Microsoft Academic Search

    Kalle Karu; Anil K. Jain

    1996-01-01

    A fingerprint classification algorithm is presented in this paper. Fingerprints are classified into five categories: arch, tented arch, left loop, right loop and whorl. The algorithm extracts singular points (cores and deltas) in a fingerprint image and performs classification based on the number and locations of the detected singular points. The classifier is invariant to rotation, translation and small amounts

  10. Hubble Classification

    Microsoft Academic Search

    P. Murdin

    2000-01-01

    A classification scheme for galaxies, devised in its original form in 1925 by Edwin P Hubble (1889-1953), and still widely used today. The Hubble classification recognizes four principal types of galaxy---elliptical, spiral, barred spiral and irregular---and arranges these in a sequence that is called the tuning-fork diagram....

  11. Spectral classification

    Microsoft Academic Search

    C. Jaschek

    1982-01-01

    Taxonomic classification of astronomically observed stellar objects is described in terms of spectral properties. Stars receive a classification containing a letter, number, and a Roman numeral, which relates the star to other stars of higher or lower Roman numerals. The citation indicates the stellar chromatic emission in relation to the wavelengths of other stars. Standards are chosen from the available

  12. Hubble Classification

    NASA Astrophysics Data System (ADS)

    Murdin, P.

    2000-11-01

    A classification scheme for galaxies, devised in its original form in 1925 by Edwin P Hubble (1889-1953), and still widely used today. The Hubble classification recognizes four principal types of galaxy—elliptical, spiral, barred spiral and irregular—and arranges these in a sequence that is called the tuning-fork diagram....

  13. Accurate Unlexicalized Parsing

    Microsoft Academic Search

    Dan Klein; Christopher D. Manning

    2003-01-01

    We demonstrate that an unlexicalized PCFG can parse much more accurately than previously shown, by making use of simple, linguistically motivated state splits, which break down false independence assumptions latent in a vanilla treebank grammar. Indeed, its performance of 86.36% (LP/LR F1) is better than that of early lexicalized PCFG models, and surprisingly close to the current state-of-the-art. This result has potential uses beyond establishing a strong

  14. Molecular identification and phylogenetic study of Demodex caprae.

    PubMed

    Zhao, Ya-E; Cheng, Juan; Hu, Li; Ma, Jun-Xian

    2014-10-01

    The DNA barcode has been widely used in species identification and phylogenetic analysis since 2003, but there have been no reports for Demodex. In this study, to obtain an appropriate DNA barcode for Demodex, molecular identification of Demodex caprae based on mitochondrial cox1 was conducted. First, individual adults and eggs of D. caprae were obtained for genomic DNA (gDNA) extraction; second, a mitochondrial cox1 fragment was amplified, cloned, and sequenced; third, the cox1 fragments of D. caprae were aligned with those of other Demodex species retrieved from GenBank; finally, the intra- and interspecific divergences were computed and phylogenetic trees were reconstructed to analyze phylogenetic relationships in Demodex. Results obtained from seven 429-bp fragments of D. caprae showed that sequence identities were above 99.1% among three adults and four eggs. The intraspecific divergences in D. caprae, Demodex folliculorum, Demodex brevis, and Demodex canis were 0.0-0.9%, 0.5-0.9%, 0.0-0.2%, and 0.0-0.5%, respectively, while the interspecific divergences between D. caprae and D. folliculorum, D. canis, and D. brevis were 20.3-20.9%, 21.8-23.0%, and 25.0-25.3%, respectively. The interspecific divergences were 10 times higher than the intraspecific ones, indicating a considerable barcoding gap. Furthermore, the phylogenetic trees showed that the four Demodex species clustered separately, representing independent species, and that Demodex folliculorum clustered with canine Demodex, D. caprae, and D. brevis in sequence. In conclusion, the selected 429-bp mitochondrial cox1 fragment is an appropriate DNA barcode for molecular classification, identification, and phylogenetic analysis of Demodex. D. caprae is an independent species and D. folliculorum is closer to D. canis than to D. caprae or D. brevis. PMID:25132566
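
    The barcoding-gap check described in this abstract can be illustrated with a minimal sketch. It assumes an uncorrected p-distance (proportion of differing sites) on pre-aligned sequences; the abstract does not say which divergence measure the authors used. The function names and the 10-bp toy sequences are hypothetical and are not real cox1 data.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def barcoding_gap(seqs_by_species):
    """Return (largest intraspecific distance, smallest interspecific distance)."""
    labelled = [(sp, s) for sp, seqs in seqs_by_species.items() for s in seqs]
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(labelled, 2):
        # Same species label -> intraspecific pair, otherwise interspecific.
        (intra if sp1 == sp2 else inter).append(p_distance(s1, s2))
    return max(intra), min(inter)


# Toy, made-up 10-bp "barcodes" (assumption for illustration only).
toy = {
    "species_A": ["ACGTACGTAC", "ACGTACGTAT"],
    "species_B": ["TTGTACCTAC", "TTGTACCTAC"],
}
max_intra, min_inter = barcoding_gap(toy)
# A gap exists when every between-species distance exceeds every within-species one.
has_gap = min_inter > max_intra
```

    A gap is reported only when the smallest interspecific distance exceeds the largest intraspecific one, which is the comparison the abstract draws between the two ranges of divergences.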

  15. A Phylogenetic Re-Analysis of Groupers with Applications for Ciguatera Fish Poisoning

    PubMed Central

    Schoelinck, Charlotte; Hinsinger, Damien D.; Dettaï, Agnès; Cruaud, Corinne; Justine, Jean-Lou

    2014-01-01

    Background: Ciguatera fish poisoning (CFP) is a significant public health problem caused by dinoflagellates. It is responsible for one of the highest reported incidences of seafood-borne illness, and groupers are commonly reported as a source of CFP due to their position in the food chain. Given the role of recent climate change in harmful algal blooms, CFP cases might become more frequent and more geographically widespread. Since there is no appropriate treatment for CFP, the most efficient solution is to regulate fish consumption. Such a strategy can only work if the fish sold are correctly identified, and it has been repeatedly shown that misidentifications and species substitutions occur in fish markets. Methods: We provide here both a DNA-barcoding reference for groupers, and a new phylogenetic reconstruction based on five genes and a comprehensive taxonomical sampling. We analyse the correlation between geographic range of species and their susceptibility to ciguatera accumulation, and the co-occurrence of ciguatoxins in closely related species, using both character mapping and statistical methods. Results: Misidentifications were encountered in public databases, precluding accurate species identifications. Epinephelinae now includes only twelve genera (vs. 15 previously). Comparisons with the ciguatera incidences show that in some genera most species are ciguateric, but statistical tests display only a moderate correlation with the phylogeny. Atlantic species were rarely contaminated, with ciguatera occurrences being restricted to the South Pacific. Conclusions: The recent changes in classification based on the reanalyses of the relationships within Epinephelidae have an impact on the interpretation of the ciguatera distribution in the genera. In this context and to improve the monitoring of fish trade and safety, we need to obtain extensive data on contamination at the species level. Accurate species identifications through DNA barcoding are thus an essential tool in controlling CFP since meal remnants in CFP cases can be easily identified with molecular tools. PMID:25093850

  16. EM for phylogenetic topology reconstruction on nonhomogeneous data

    PubMed Central

    2014-01-01

    Background: The reconstruction of the phylogenetic tree topology of four taxa remains one of the main challenges in phylogenetics. Its difficulties lie in considering evolutionary models that are not too restrictive, and in correctly dealing with the long-branch attraction problem. The correct reconstruction of 4-taxon trees is crucial for making quartet-based methods work and being able to recover large phylogenies. Methods: We adapt the well-known expectation-maximization algorithm to evolutionary Markov models on phylogenetic 4-taxon trees. We then use this algorithm to estimate the substitution parameters, compute the corresponding likelihood, and infer the most likely quartet. Results: In this paper we consider an expectation-maximization method for maximizing the likelihood of (time-nonhomogeneous) evolutionary Markov models on trees. We study its success in reconstructing 4-taxon topologies and its performance as an input method in quartet-based phylogenetic reconstruction methods such as QFIT and QuartetSuite. Our results show that the method proposed here outperforms neighbor-joining and the usual (time-homogeneous continuous-time) maximum likelihood methods on 4-leaved trees with among-lineage instantaneous rate heterogeneity, and performs similarly to the usual continuous-time maximum likelihood when the data satisfy the assumptions of both methods. Conclusions: The method presented in this paper is well suited for reconstructing the topology of any number of taxa via quartet-based methods and is highly accurate, especially for largely divergent trees and time-nonhomogeneous data. PMID:24938507

  17. Phylogenetic definitions and taxonomic philosophy

    Microsoft Academic Search

    Kevin de Queiroz

    1992-01-01

    An examination of the post-Darwinian history of biological taxonomy reveals an implicit assumption that the definitions of taxon names consist of lists of organismal traits. That assumption represents a failure to grant the concept of evolution a central role in taxonomy, and it causes conflicts between traditional methods of defining taxon names and evolutionary concepts of taxa. Phylogenetic definitions of

  18. Alignment-Free Phylogenetic Reconstruction

    NASA Astrophysics Data System (ADS)

    Daskalakis, Constantinos; Roch, Sebastien

    We introduce the first polynomial-time phylogenetic reconstruction algorithm under a model of sequence evolution allowing insertions and deletions (or indels). Given appropriate assumptions, our algorithm requires sequence lengths growing polynomially in the number of leaf taxa. Our techniques are distance-based and largely bypass the problem of multiple alignment.

  19. Molecular Evolution and Phylogenetic Analysis

    E-print Network

    Pollock, David

    Molecular Evolution and Phylogenetic Analysis, by David Pollock and Richard Goldstein. Introduction: All of biology is based on evolution. Evolution is the organizing principle for understanding the shared history of all biological organisms. Evolution describes the similarities between different organisms, as well

  20. Sirevirus LTR retrotransposons: phylogenetic misconceptions in the plant world

    PubMed Central

    2013-01-01

    Sireviruses are an ancient and plant-specific LTR retrotransposon genus. They possess a unique genome structure that is characterized by a plethora of highly conserved sequence motifs in key domains of the non-coding genome, and often, by the presence of an envelope-like gene. Recently, their crucial role in the organization of the maize genome, where Sireviruses occupy approximately 21% of its nuclear content, was revealed, followed by an analysis of their distribution across the plant kingdom. It is now suggested that Sireviruses have been a major mediator of the evolution of many plant genomes. However, the name ‘Sirevirus’ has caused confusion in the scientific community in regards to their classification within the LTR retrotransposon order and their relationship with viruses - a situation that is not unique to Sireviruses, but also affects other LTR retrotransposon genera. Here, we clarify the phylogenetic position of Sireviruses as typical LTR retrotransposons of the Copia superfamily and explain that the confusion stems from the discrepancy in the categorization of LTR retrotransposons by the two main classification systems: the International Committee on the Taxonomy of Viruses (ICTV) system and the unified classification system for eukaryotic transposable elements. While the name ‘Sirevirus’ has been given by ICTV, we show that the transposable element system, which is more suitable for eukaryotic genome studies, lacks an appropriate taxonomic level for describing them. We urge for this inconsistency to be addressed. Finally, we provide data suggesting that of the three ICTV-proposed genera of the Pseudoviridae (that is, Copia) family, only Sireviruses form a monophyletic group, while the phylogenetic distinction between Pseudoviruses and Hemiviruses is unclear. We conclude that because of their ongoing important contribution to the classification of transposable elements, these schemes need to be frequently revisited and revised - as shown by the example of the Sirevirus LTR retrotransposon genus. PMID:23452336

  1. Mineral Classification

    NSDL National Science Digital Library

    This problem set challenges students to determine the chemical classification of minerals based on their chemical formula (provided). For oxygen-bearing minerals, students must also provide the valences of the various cations.

  2. Herbicide Classification

    NSDL National Science Digital Library

    This lesson focuses on understanding the classification system into which herbicides are organized. Terms of classification, the classification hierarchy, examples of classification, and a brief overview of the eight modes of action are all discussed in this lesson. Once this is understood, it is much easier to grasp which herbicides are similar and why they may cause certain symptoms in weeds and plants alike. Objectives: 1. Understand how herbicides are classified and why it is important for managing herbicide resistance. 2. Understand the importance of classifying herbicides by mode of action rather than chemical family. 3. Be able to tell the difference between mode of action and site of action. 4. Be able to differentiate between herbicide families, modes of action, and sites of action. 5. Understand common names, trade names, and sites of absorption.

  3. LABEL: Fast and Accurate Lineage Assignment with Assessment of H5N1 and H9N2 Influenza A Hemagglutinins

    PubMed Central

    Shepard, Samuel S.; Davis, C. Todd; Bahl, Justin; Rivailler, Pierre; York, Ian A.; Donis, Ruben O.

    2014-01-01

    The evolutionary classification of influenza genes into lineages is a first step in understanding their molecular epidemiology and can inform the subsequent implementation of control measures. We introduce a novel approach called Lineage Assignment By Extended Learning (LABEL) to rapidly determine cladistic information for any number of genes without the need for time-consuming sequence alignment, phylogenetic tree construction, or manual annotation. Instead, LABEL relies on hidden Markov model profiles and support vector machine training to hierarchically classify gene sequences by their similarity to pre-defined lineages. We assessed LABEL by analyzing the annotated hemagglutinin genes of highly pathogenic (H5N1) and low pathogenicity (H9N2) avian influenza A viruses. Using the WHO/FAO/OIE H5N1 evolution working group nomenclature, the LABEL pipeline quickly and accurately identified the H5 lineages of uncharacterized sequences. Moreover, we developed an updated clade nomenclature for the H9 hemagglutinin gene and show a similarly fast and reliable phylogenetic assessment with LABEL. While this study was focused on hemagglutinin sequences, LABEL could be applied to the analysis of any gene and shows great potential to guide molecular epidemiology activities, accelerate database annotation, and provide a data sorting tool for other large-scale bioinformatic studies. PMID:24466291

  4. Accurate quantum chemical calculations

    NASA Technical Reports Server (NTRS)

    Bauschlicher, Charles W., Jr.; Langhoff, Stephen R.; Taylor, Peter R.

    1989-01-01

    An important goal of quantum chemical calculations is to provide an understanding of chemical bonding and molecular electronic structure. A second goal, the prediction of energy differences to chemical accuracy, has been much harder to attain. First, the computational resources required to achieve such accuracy are very large, and second, it is not straightforward to demonstrate that an apparently accurate result, in terms of agreement with experiment, does not result from a cancellation of errors. Recent advances in electronic structure methodology, coupled with the power of vector supercomputers, have made it possible to solve a number of electronic structure problems exactly using the full configuration interaction (FCI) method within a subspace of the complete Hilbert space. These exact results can be used to benchmark approximate techniques that are applicable to a wider range of chemical and physical problems. The methodology of many-electron quantum chemistry is reviewed. Methods are considered in detail for performing FCI calculations. The application of FCI methods to several three-electron problems in molecular physics is discussed. A number of benchmark applications of FCI wave functions are described. Atomic basis sets and the development of improved methods for handling very large basis sets are discussed: these are then applied to a number of chemical and spectroscopic problems; to transition metals; and to problems involving potential energy surfaces. Although the experiences described give considerable grounds for optimism about the general ability to perform accurate calculations, there are several problems that have proved less tractable, at least with current computer resources, and these and possible solutions are discussed.

  5. HIV classification using coalescent theory

    SciTech Connect

    Zhang, Ming [Los Alamos National Laboratory]; Leitner, Thomas K [Los Alamos National Laboratory]; Korber, Bette T [Los Alamos National Laboratory]

    2008-01-01

    Algorithms for subtype classification and breakpoint detection of HIV-1 sequences are based on a classification system of HIV-1. Hence, their quality depends heavily on this system. Owing to the way the current HIV-1 nomenclature was created, it contains inconsistencies such as the following: the phylogenetic distance between subtypes B and D is remarkably small compared with other pairs of subtypes; in fact, it is more like the distance between a pair of sub-subtypes (Robertson et al., 2000); subtypes E and I no longer exist, since they were discovered to be composed of recombinants (Robertson et al., 2000); it is currently discussed whether, instead of CRF02 being a recombinant of subtypes A and G, subtype G should be designated as a circulating recombinant form (CRF) and CRF02 as a subtype (Abecasis et al., 2007); and there are 8 complete and over 400 partial HIV genomes in the LANL database that belong neither to a subtype nor to a CRF (denoted by U). Moreover, the current classification system is somewhat arbitrary, like all complex classification systems that were created manually. It is therefore desirable to deduce the classification system of HIV systematically by an algorithm. Of course, this problem is not restricted to HIV, but applies to all fast-mutating and recombining viruses. Our work addresses the simpler subproblem of scoring classifications of given input sequences of some virus species (a classification denotes a partition of the input sequences into several subtypes and CRFs). To this end, we reconstruct ancestral recombination graphs (ARGs) of the input sequences under restrictions determined by the given classification. These restrictions are imposed to ensure that the reconstructed ARGs do not contradict the classification under consideration. Then, we find the ARG with maximal probability by means of Markov chain Monte Carlo methods. The probability of the most probable ARG is interpreted as a score for the classification. To our knowledge, this particular problem has not been addressed before. The software package Lamarc (Kuhner et al., 2000) allows for sampling ARGs, but it assumes that recombination events involve only one breakpoint. However, HIV recombinants usually have more than one breakpoint. Moreover, Lamarc does not perform explicit breakpoint detection, but tries to find breakpoints by chance. Although this approach is suitable for most situations, it will not lead to satisfying results in the case of highly recombining viruses with multiple breakpoints.

  6. Learning classification trees

    NASA Technical Reports Server (NTRS)

    Buntine, Wray

    1991-01-01

    Algorithms for learning classification trees have had successes in artificial intelligence and statistics over many years. How a tree learning algorithm can be derived from Bayesian decision theory is outlined. This introduces Bayesian techniques for splitting, smoothing, and tree averaging. The splitting rule turns out to be similar to Quinlan's information gain splitting rule, while smoothing and averaging replace pruning. Comparative experiments with reimplementations of a minimum encoding approach, Quinlan's C4, and Breiman et al.'s CART show that the full Bayesian algorithm is consistently as good as, or more accurate than, these other approaches, though at a computational price.
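
    For readers unfamiliar with the information gain splitting rule that the abstract uses as its point of comparison, the following is a minimal sketch of that standard criterion (entropy of the parent node minus the size-weighted entropy of the child nodes). It is not the Bayesian splitting rule derived in the report; the function names and the toy labels are assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(labels, groups):
    """Entropy of `labels` minus the size-weighted entropy of its partition `groups`."""
    n = len(labels)
    remainder = sum(len(g) / n * entropy(g) for g in groups if g)
    return entropy(labels) - remainder


# Toy data (hypothetical): 4 positive and 4 negative examples.
parent = ["yes"] * 4 + ["no"] * 4
perfect_split = [["yes"] * 4, ["no"] * 4]            # children are pure
useless_split = [["yes", "yes", "no", "no"]] * 2     # children mirror the parent

gain_perfect = information_gain(parent, perfect_split)
gain_useless = information_gain(parent, useless_split)
```

    A tree learner using this criterion would choose, at each node, the candidate split with the largest gain; here the first split is preferred over the second.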

  7. Conformation of phylogenetic relationship of Penaeidae shrimp based on morphometric and molecular investigations.

    PubMed

    Rajakumaran, P; Vaseeharan, B; Jayakumar, R; Chidambara, R

    2014-01-01

    An accurate understanding of the phylogenetic relationships among Penaeidae shrimp is important for both academic research and the fisheries industry. Morphometric and randomly amplified polymorphic DNA (RAPD) analyses were used to infer the phylogenetic relationships among 13 Penaeidae shrimp species. For the morphometric analysis, forty variables and the total length of the shrimp were measured for each species, and the effect of size variation was removed. The size-normalized values obtained were subjected to UPGMA (Unweighted Pair-Group Method with Arithmetic Mean) cluster analysis. For the RAPD analysis, the four primers showed reliable differentiation between species, and the correlation coefficients between the DNA banding patterns of the 13 Penaeidae species were used to construct a UPGMA dendrogram. The phylogenetic relationships obtained from the morphometric and molecular analyses were found to be congruent. We conclude that, because the results of the morphometric investigation concur with the molecular ones, the phylogenetic relationships obtained for the studied Penaeidae can be considered reliable. PMID:25536818
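
    The UPGMA clustering named in this abstract can be sketched in a few lines. This is a generic illustration, not the authors' pipeline: it assumes a symmetric matrix of pairwise distances is already available (the study derived its inputs from size-normalized morphometric values and RAPD banding-pattern correlations), and the function name, taxon labels, and distances below are made up.

```python
def upgma(names, matrix):
    """Build a UPGMA tree topology as nested tuples.

    names:  list of taxon labels.
    matrix: symmetric list-of-lists of pairwise distances, in the order of `names`.
    """
    # Active clusters: id -> (subtree, number of leaves).
    trees = {i: (name, 1) for i, name in enumerate(names)}
    # Pairwise distances between active clusters, keyed by (smaller id, larger id).
    d = {(i, j): matrix[i][j]
         for i in range(len(names)) for j in range(i + 1, len(names))}
    next_id = len(names)
    while len(trees) > 1:
        i, j = min(d, key=d.get)                  # closest pair of clusters
        (ti, ni), (tj, nj) = trees.pop(i), trees.pop(j)
        new_d = {}
        for k in trees:
            dik = d[(min(i, k), max(i, k))]
            djk = d[(min(j, k), max(j, k))]
            # UPGMA update: average weighted by the sizes of the merged clusters.
            new_d[(k, next_id)] = (ni * dik + nj * djk) / (ni + nj)
        # Discard distances that involve the two merged clusters.
        d = {pair: v for pair, v in d.items() if i not in pair and j not in pair}
        d.update(new_d)
        trees[next_id] = ((ti, tj), ni + nj)
        next_id += 1
    tree, _size = next(iter(trees.values()))
    return tree


# Hypothetical distances among four taxa (not data from the study).
taxa = ["A", "B", "C", "D"]
dist = [
    [0, 2, 6, 10],
    [2, 0, 6, 10],
    [6, 6, 0, 10],
    [10, 10, 10, 0],
]
topology = upgma(taxa, dist)
```

    With these toy distances, A and B are joined first, then C, then D. Only the topology is returned; branch heights (half the merge distance in UPGMA) are omitted to keep the sketch short.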

  8. Phylogenetic methods in natural product research.

    PubMed

    Schmitt, Imke; Barker, F Keith

    2009-12-01

    Natural products researchers are increasingly employing evolutionary analyses of genes and gene products that rely on phylogenetic trees. The field of phylogenetic inference and of evolutionary analyses based on phylogenies is growing at an amazing rate, making it difficult to keep up with the latest methodologies. Here, we summarize phylogenetic applications in natural products research, and review methods and software useful for carrying out analyses inferring or using phylogenetic trees. We include an updated overview of available alignment methods and programs, as well as a selection of some useful phylogenetic analysis tools. This review covers primarily the period 2000-2009 for applications of phylogenetic methods in natural product research, and 1990-2009 for phylogenetic methods, with some references going back to the 1960s. PMID:19936388

  9. Comparison of Tree-Child Phylogenetic Networks

    E-print Network

    Cardona, Gabriel; Valiente, Gabriel

    2007-01-01

    Phylogenetic networks are a generalization of phylogenetic trees that allow for the representation of non-treelike evolutionary events, like recombination, hybridization, or lateral gene transfer. In this paper, we present and study a new class of phylogenetic networks, called tree-child phylogenetic networks, where every non-extant species has some descendant through mutation. We provide an injective representation of these networks as multisets of vectors of natural numbers, their path multiplicity vectors, and we use this representation to define a distance on this class and to give an alignment method for pairs of these networks. To the best of our knowledge, they are respectively the first true distance and the first alignment method defined on a meaningful class of phylogenetic networks strictly extending the class of phylogenetic trees. Simple, polynomial algorithms for reconstructing a tree-child phylogenetic network from its path multiplicity vectors, for computing the distance between two tree-child...
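
    The path multiplicity vectors mentioned in this abstract can be illustrated as follows, assuming the usual reading of the term: for each node, the vector whose i-th entry counts the directed paths from that node to the i-th leaf. The function name, the dictionary-based network encoding, and the small example network are hypothetical.

```python
from functools import lru_cache


def path_multiplicity_vectors(children, leaves):
    """Map every node to the tuple (m_1, ..., m_n), where m_i is the number of
    distinct directed paths from that node to the i-th leaf.

    children: dict mapping each internal node to a list of its child nodes.
    leaves:   ordered list of leaf labels.
    """
    leaf_index = {leaf: i for i, leaf in enumerate(leaves)}

    @lru_cache(maxsize=None)
    def mu(node):
        if node in leaf_index:
            # A leaf reaches only itself, by the trivial path.
            vec = [0] * len(leaves)
            vec[leaf_index[node]] = 1
            return tuple(vec)
        # An internal node's vector is the sum of its children's vectors.
        total = [0] * len(leaves)
        for child in children[node]:
            for i, m in enumerate(mu(child)):
                total[i] += m
        return tuple(total)

    return {node: mu(node) for node in list(children) + list(leaves)}


# Hypothetical network: hybrid node "h" has two parents, "a" and "b".
network = {
    "r": ["a", "b"],
    "a": ["x", "h"],
    "b": ["h", "z"],
    "h": ["y"],
}
vectors = path_multiplicity_vectors(network, ["x", "y", "z"])
```

    In this example the root has two paths to leaf y (one through each parent of the hybrid node), so its vector is (1, 2, 1). The distance and alignment method of the paper are built on such vectors and are not reproduced here.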

  10. Point estimates in phylogenetic reconstructions

    PubMed Central

    Benner, Philipp; Bačák, Miroslav; Bourguignon, Pierre-Yves

    2014-01-01

    Motivation: The construction of statistics for summarizing posterior samples returned by a Bayesian phylogenetic study has so far been hindered by the poor geometric insights available into the space of phylogenetic trees. Ad hoc methods such as the derivation of a consensus tree make up for the ill-definition of the usual concepts of posterior mean, while bootstrap methods mitigate the absence of a sound concept of variance. While yielding satisfactory results with sufficiently concentrated posterior distributions, such methods fall short of providing a faithful summary of posterior distributions if the data do not offer compelling evidence for a single topology. Results: Building upon previous work of Billera et al., summary statistics such as sample mean, median and variance are defined as the Fréchet mean, geometric median and variance, respectively. Their computation is enabled by recently published works, and embeds an algorithm for computing shortest paths in the space of trees. In a study of the phylogeny of a set of plants, where several tree topologies occur in the posterior sample, the posterior mean correctly balances the contributions from the different topologies, where a consensus tree would be biased. Comparisons of the posterior mean, median and consensus trees with the ground truth using simulated data also reveal the benefits of a sound averaging method when reconstructing phylogenetic trees. Availability and implementation: We provide two independent implementations of the algorithm for computing Fréchet means, geometric medians and variances in the space of phylogenetic trees. TFBayes: https://github.com/pbenner/tfbayes, TrAP: https://github.com/bacak/TrAP. Contact: philipp.benner@mis.mpg.de PMID:25161244

  11. On distances between phylogenetic trees

    Microsoft Academic Search

    B. DasGupta; X. He; T. Jiang; M. Li; J. T. Tromp; L. Zhang

    1997-01-01

    Different phylogenetic trees for the same group of species are often produced either by procedures that use diverse optimality criteria [18] or from different genes [12] in the study of molecular evolution. Comparing these trees to find their similarities (e.g. agreement or consensus) and dissimilarities, i.e. distance, is thus an important issue in computational molecular biology. The nearest neighbor interchange (nni) distance [26, 24,

  12. A classification of the grouse (Aves: Tetraoninae) based on mitochondrial DNA sequences

    Microsoft Academic Search

    R. J. Gutiérrez; George F. Barrowclough; Jeffrey G. Groth

    We propose a new classification of the grouse that brings their taxonomy into agreement with our molecular phylogenetic studies. Our analyses provide, for the first time, a robust estimate of the evolutionary history of these birds. These analyses are based on aligned sequences of 3,809 basepairs of five complete mitochondrial genes. Our classification does not require novel genera and gen-

  13. Accurate and Reliable Cancer Classification Based on Pathway-Markers and Subnetwork-Markers

    E-print Network

    Su, Junjie

    2012-02-14

    Finding reliable gene markers for accurate disease classification is very challenging due to a number of reasons, including the small sample size of typical clinical data, high noise in gene expression measurements, and the heterogeneity across...

  14. Séance: reference-based phylogenetic analysis for 18S rRNA studies.

    PubMed

    Medlar, Alan; Aivelo, Tuomas; Löytynoja, Ari

    2014-11-30

    Background: Marker gene studies often use short amplicons spanning one or more hypervariable regions from an rRNA gene to interrogate the community structure of uncultured environmental samples. Target regions are chosen for their discriminatory power, but the limited phylogenetic signal of short high-throughput sequencing reads precludes accurate phylogenetic analysis. This is particularly unfortunate in the study of microscopic eukaryotes where horizontal gene flow is limited and the rRNA gene is expected to accurately reflect the species phylogeny. A promising alternative to full phylogenetic analysis is phylogenetic placement, where a reference phylogeny is inferred using the complete marker gene and iteratively extended with the short sequences from a metagenetic sample under study. Results: Based on the phylogenetic placement approach we built Séance, a community analysis pipeline focused on the analysis of 18S marker gene data. Séance combines the alignment extension and phylogenetic placement capabilities of the Pagan multiple sequence alignment program with a suite of tools to preprocess, cluster and visualise datasets composed of many samples. We showcase Séance by analysing 454 data from a longitudinal study of intestinal parasite communities in wild rufous mouse lemurs (Microcebus rufus) as well as in simulation. We demonstrate both improved OTU picking at higher levels of sequence similarity for 454 data and show the accuracy of phylogenetic placement to be comparable to maximum likelihood methods for lower numbers of taxa. Conclusions: Séance is an open source community analysis pipeline that provides reference-based phylogenetic analysis for rRNA marker gene studies. Whilst in this article we focus on studying nematodes using the 18S marker gene, the concepts are generic and reference data for alternative marker genes can be easily created. Séance can be downloaded from http://wasabiapp.org/software/seance/. PMID:25433763

  15. On-the-fly Classifying Sonar with Accurate Range and Bearing Estimation

    E-print Network

    On-the-fly Classifying Sonar with Accurate Range and Bearing Estimation Lindsay Kleeman Intelligent transducer pulse coded sonar system that performs target localisation in two dimensions and classification second. 1 Introduction Sonar classification provides a fast means for determining geometric properties

  16. Novel phylogenetic studies of genomic sequence fragments derived from uncultured microbe mixtures in environmental and clinical samples.

    PubMed

    Abe, Takashi; Sugawara, Hideaki; Kinouchi, Makoto; Kanaya, Shigehiko; Ikemura, Toshimichi

    2005-01-01

    A self-organizing map (SOM) was developed as a novel bioinformatics strategy for phylogenetic classification of sequence fragments obtained from pooled genome samples of uncultured microbes in environmental and clinical samples. This phylogenetic classification was possible without either orthologous sequence sets or sequence alignments. We first constructed SOMs for tetranucleotide frequencies in 210,000 5 kb sequence fragments obtained from 1502 prokaryotes for which at least 10 kb of genomic sequence has been deposited in public DNA databases. The sequences could be classified primarily according to phylogenetic groups without information regarding the species. We used the SOM method to classify sequence fragments derived from environmental samples of the Sargasso Sea and of an acidophilic biofilm growing in acid mine drainage. Phylogenetic diversity of the environmental sequences was effectively visualized on a single map. Sequences that were derived from a single genome but cloned independently could be reassociated in silico. G + C% has been used for a long period as a fundamental parameter for phylogenetic classification of microbes, but the G + C% is apparently too simple a parameter to differentiate a wide variety of known species. Oligonucleotide frequency can be used to distinguish the species because oligonucleotide frequencies vary significantly among their genomes. PMID:16769690
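
    The tetranucleotide frequencies that serve as input to the SOM can be computed as in the sketch below. It covers only the feature-extraction step, not the self-organizing map itself, and it makes assumptions the abstract does not specify: overlapping windows, strand-specific counts (no merging of reverse complements), and skipping of windows that contain ambiguous bases. The function name and example sequence are hypothetical.

```python
from collections import Counter
from itertools import product


def tetranucleotide_frequencies(seq):
    """Return the relative frequency of each of the 256 tetranucleotides in `seq`.

    Overlapping windows are counted; windows containing characters other than
    A, C, G, T (for example N) are skipped.
    """
    seq = seq.upper()
    valid = set("ACGT")
    counts = Counter(
        seq[i:i + 4]
        for i in range(len(seq) - 3)
        if set(seq[i:i + 4]) <= valid
    )
    total = sum(counts.values())
    kmers = ("".join(p) for p in product("ACGT", repeat=4))
    return {k: (counts[k] / total if total else 0.0) for k in kmers}


# Toy fragment (made up); real inputs in the study were 5 kb fragments.
freqs = tetranucleotide_frequencies("ACGTACGTNACGT")
```

    Each fragment becomes a 256-dimensional vector that sums to 1, which is the kind of composition vector a SOM or another classifier can then be trained on.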

  17. TreeFam: a curated database of phylogenetic trees of animal gene families

    Microsoft Academic Search

    Heng Li; Avril Coghlan; Jue Ruan; Lachlan James M. Coin; Jean-karim Hériché; Lara Osmotherly; Ruiqiang Li; Tao Liu; Zhang Zhang; Lars Bolund; Gane Ka-shu Wong; Wei-mou Zheng; Paramvir Dehal; Jun Wang; Richard Durbin

    2006-01-01

    TreeFam is a database of phylogenetic trees of gene families found in animals. It aims to develop a curated resource that presents the accurate evolutionary history of all animal gene families, as well as reliable ortholog and paralog assignments. Curated families are being added progressively, based on seed alignments and trees in a similar fashion to Pfam.

  18. Accurate spectral color measurements

    NASA Astrophysics Data System (ADS)

    Hiltunen, Jouni; Jaeaeskelaeinen, Timo; Parkkinen, Jussi P. S.

    1999-08-01

    Surface color measurement is of importance in a very wide range of industrial applications including paint, paper, printing, photography, textiles, plastics and so on. For demanding color measurements, a spectral approach is often needed. One can measure a color spectrum with a spectrophotometer using calibrated standard samples as a reference. Because it is impossible to define absolute color values of a sample, we always work with approximations. The human eye can perceive color differences as small as 0.5 CIELAB units and thus distinguish millions of colors. This 0.5 unit difference should be the goal for precise color measurements. This limit is not a problem if we only want to measure the color difference of two samples, but if we want to know at the same time exact color coordinate values, accuracy problems arise. The values of two instruments can be astonishingly different. The accuracy of the instrument used in color measurement may depend on various errors such as photometric non-linearity, wavelength error, integrating sphere dark level error, and integrating sphere error in both specular included and specular excluded modes. Thus correction formulas should be used to get more accurate results. Another question is how many channels, i.e. wavelengths, we are using to measure a spectrum. It is obvious that the sampling interval should be short to get more precise results. Furthermore, the result we get is always a compromise of measuring time, conditions and cost. Sometimes we have to use a portable system, or the shape and the size of samples make it impossible to use sensitive equipment. In this study a small set of calibrated color tiles measured with the Perkin Elmer Lambda 18 and the Minolta CM-2002 spectrophotometers are compared. In the paper we explain the typical error sources of spectral color measurements, and show which accuracy demands a good colorimeter should meet.

  19. Classification Fun

    NSDL National Science Digital Library

    Shubinski, Carol

    2012-06-11

    Taxonomic information shows the evolutionary relationships between organisms. In this lesson plan, students will classify organisms by kingdom and apply their own understanding of classification to identify organisms. The students should already have an understanding of the basics of the five kingdoms and the seven categories of classification. The document includes a pre-test on the topic to gauge student understanding and two classroom activities. The activity is intended for sixth grade students, and should take three to four class periods to complete.

  20. Choosing among Partition Models in Bayesian Phylogenetics

    PubMed Central

    Fan, Yu; Wu, Rui; Chen, Ming-Hui; Kuo, Lynn; Lewis, Paul O.

    2011-01-01

    Bayesian phylogenetic analyses often depend on Bayes factors (BFs) to determine the optimal way to partition the data. The marginal likelihoods used to compute BFs, in turn, are most commonly estimated using the harmonic mean (HM) method, which has been shown to be inaccurate. We describe a new more accurate method for estimating the marginal likelihood of a model and compare it with the HM method on both simulated and empirical data. The new method generalizes our previously described stepping-stone (SS) approach by making use of a reference distribution parameterized using samples from the posterior distribution. This avoids one challenging aspect of the original SS method, namely the need to sample from distributions that are close (in the Kullback–Leibler sense) to the prior. We specifically address the choice of partition models and find that using the HM method can lead to a strong preference for an overpartitioned model. In contrast to the HM method and the original SS method, we show using simulated data that the generalized SS method is strikingly more precise (repeatable BF values of the same data and partition model) and yields BF values that are much more reasonable than those produced by the HM method. Comparisons of HM and generalized SS methods on an empirical data set demonstrate that the generalized SS method tends to choose simpler partition schemes that are more in line with expectation based on inferred patterns of molecular evolution. The generalized SS method shares with thermodynamic integration the need to sample from a series of distributions in addition to the posterior. Such dedicated path-based Markov chain Monte Carlo analyses appear to be a cost of estimating marginal likelihoods accurately. PMID:20801907
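    For reference, the harmonic mean (HM) estimator that the abstract criticizes takes the harmonic mean of the likelihoods of posterior samples. The sketch below evaluates it in log space; the function name and the input format (a plain list of per-sample log-likelihoods) are assumptions for illustration. It does not implement the generalized stepping-stone method.

```python
import math


def log_marginal_likelihood_hm(log_likelihoods):
    """Harmonic mean estimate of the log marginal likelihood.

    log_likelihoods: log-likelihood values of samples drawn from the
    posterior (hypothetical input format).
    Computes log(n) - logsumexp(-log L_i) with a max-shift for stability.
    """
    n = len(log_likelihoods)
    if n == 0:
        raise ValueError("need at least one posterior sample")
    neg = [-x for x in log_likelihoods]
    shift = max(neg)
    log_sum = shift + math.log(sum(math.exp(v - shift) for v in neg))
    return math.log(n) - log_sum


estimate = log_marginal_likelihood_hm([0.0, math.log(0.5)])
```

    Because the sum is dominated by the samples with the lowest likelihood, a few extreme draws can swing the estimate, which is consistent with the instability the abstract reports.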

  1. Neuromuscular disease classification system

    NASA Astrophysics Data System (ADS)

    Sáez, Aurora; Acha, Begoña; Montero-Sánchez, Adoración; Rivas, Eloy; Escudero, Luis M.; Serrano, Carmen

    2013-06-01

    Diagnosis of neuromuscular diseases is based on subjective visual assessment of biopsies from patients by the pathologist specialist. A system for objective analysis and classification of muscular dystrophies and neurogenic atrophies through muscle biopsy images of fluorescence microscopy is presented. The procedure starts with an accurate segmentation of the muscle fibers using mathematical morphology and a watershed transform. A feature extraction step is carried out in two parts: 24 features that pathologists take into account to diagnose the diseases and 58 structural features that the human eye cannot see, based on the assumption that the biopsy is considered as a graph, where the nodes are represented by each fiber, and two nodes are connected if two fibers are adjacent. A feature selection using sequential forward selection and sequential backward selection methods, a classification using a Fuzzy ARTMAP neural network, and a study of grading the severity are performed on these two sets of features. A database consisting of 91 images was used: 71 images for the training step and 20 as the test. A classification error of 0% was obtained. It is concluded that the addition of features undetectable by the human visual inspection improves the categorization of atrophic patterns.

  2. Phycas: software for bayesian phylogenetic analysis.

    PubMed

    Lewis, Paul O; Holder, Mark T; Swofford, David L

    2015-05-01

    Phycas is open source, freely available Bayesian phylogenetics software written primarily in C++ but with a Python interface. Phycas specializes in Bayesian model selection for nucleotide sequence data, particularly the estimation of marginal likelihoods, central to computing Bayes Factors. Marginal likelihoods can be estimated using newer methods (Thermodynamic Integration and Generalized Steppingstone) that are more accurate than the widely used Harmonic Mean estimator. In addition, Phycas supports two posterior predictive approaches to model selection: Gelfand-Ghosh and Conditional Predictive Ordinates. The General Time Reversible family of substitution models, as well as a codon model, are available, and data can be partitioned with all parameters unlinked except tree topology and edge lengths. Phycas provides for analyses in which the prior on tree topologies allows polytomous trees as well as fully resolved trees, and provides for several choices for edge length priors, including a hierarchical model as well as the recently described compound Dirichlet prior, which helps avoid overly informative induced priors on tree length. PMID:25577605

  3. Multisensor classification of sedimentary rocks

    NASA Technical Reports Server (NTRS)

    Evans, Diane

    1988-01-01

    A comparison is made between linear discriminant analysis and supervised classification results based on signatures from the Landsat TM, the Thermal Infrared Multispectral Scanner (TIMS), and airborne SAR, alone and combined into extended spectral signatures for seven sedimentary rock units exposed on the margin of the Wind River Basin, Wyoming. Results from a linear discriminant analysis showed that training-area classification accuracies based on the multisensor data were improved an average of 15 percent over TM alone, 24 percent over TIMS alone, and 46 percent over SAR alone, with similar improvement resulting when supervised multisensor classification maps were compared to supervised, individual sensor classification maps. When training area signatures were used to map spectrally similar materials in an adjacent area, the average classification accuracy improved 19 percent using the multisensor data over TM alone, 2 percent over TIMS alone, and 11 percent over SAR alone. It is concluded that certain sedimentary lithologies may be accurately mapped using a single sensor, but classification of a variety of rock types can be improved using multisensor data sets that are sensitive to different characteristics such as mineralogy and surface roughness.

  4. Phylogenetic analysis of North American Elymus and the monogenomic Triticeae (Poaceae) using three chloroplast DNA data sets

    Microsoft Academic Search

    Roberta J. Mason-Gamer; Nancy L. Orme; Claire M. Anderson

    2002-01-01

    Although the monogenomic genera of the Triticeae have been analyzed in numerous biosystematic studies, the allopolyploid genera have not been as extensively studied within a phylogenetic framework. We focus on North American species of Elymus, which, under the current genomic system of classification, are almost all allotetraploid, combining the St genome of Pseudoroegneria with the H genome of Hordeum. We

  5. Simple and fast classification of non-LTR retrotransposons based on phylogeny of their RT domain protein sequences

    PubMed Central

    Kapitonov, Vladimir V.; Tempel, Sébastien; Jurka, Jerzy

    2009-01-01

    The rapidly growing number of sequenced genomes requires fast and accurate computational tools for analysis of different transposable elements (TEs). In this paper we focus on a rapid and reliable procedure for classification of autonomous non-LTR retrotransposons based on alignment and clustering of their reverse transcriptase (RT) domains. Typically, the RT domain protein sequences encoded by different non-LTR retrotransposons are similar to each other in terms of significant BLASTP E-values. Therefore, they can be easily detected by routine BLASTP searches of genomic DNA sequences coding for proteins similar to the RT domains of known non-LTR retrotransposons. However, detailed classification of non-LTR retrotransposons, i.e. their assignment to specific clades, is a slow and complex procedure that is not formalized or integrated as a standard set of computational methods and data. Here we describe a tool (RTclass1) designed for the fast and accurate automated assignment of novel non-LTR retrotransposons to known or novel clades using phylogenetic analysis of the RT domain protein sequences. RTclass1 classifies a particular non-LTR retrotransposon based on its RT domain in less than 10 minutes on a standard desktop computer and achieves 99.5% accuracy. RTclass1 works either as a standalone program installed locally or as a web server that can be accessed remotely by uploading sequence data through the internet (http://www.girinst.org/RTphylogeny/RTclass1). PMID:19651192

  6. Image Classification

    NSDL National Science Digital Library

    Cote, Paul

    In this exercise, students get experience with image classification. Images are an increasingly important source of information about land cover and land use over time because comparisons of historic and current images can provide an estimate of change in the landscape.

  7. Use of whole genome sequences to develop a molecular phylogenetic framework for Rhodococcus fascians and the Rhodococcus genus

    PubMed Central

    Creason, Allison L.; Davis, Edward W.; Putnam, Melodie L.; Vandeputte, Olivier M.; Chang, Jeff H.

    2014-01-01

    The accurate diagnosis of diseases caused by pathogenic bacteria requires a stable species classification. Rhodococcus fascians is the only documented member of its ill-defined genus that is capable of causing disease on a wide range of agriculturally important plants. Comparisons of genome sequences generated from isolates of Rhodococcus associated with diseased plants revealed a level of genetic diversity consistent with them representing multiple species. To test this, we generated a tree based on more than 1700 homologous sequences from plant-associated isolates of Rhodococcus, and obtained support from additional approaches that measure and cluster based on genome similarities. Results were consistent in supporting the definition of new Rhodococcus species within clades containing phytopathogenic members. We also used the genome sequences, along with other rhodococcal genome sequences to construct a molecular phylogenetic tree as a framework for resolving the Rhodococcus genus. Results indicated that Rhodococcus has the potential for having 20 species and also confirmed a need to revisit the taxonomic groupings within Rhodococcus. PMID:25237311

  8. Zooflagellate phylogeny and classification.

    PubMed

    Cavalier-Smith, T

    1995-01-01

    Zooflagellates are non-photosynthetic flagellates without plastids or cell walls which feed by phagocytosis or endocytosis. They are the most diverse of all eukaryotes and gave rise directly or indirectly to most, if not all, other groups of eukaryotes. They are here classified into thirteen or fourteen phyla, spread across four of the seven eukaryote kingdoms that I now recognize: (1) the probably primitively amitochondrial and entirely non-photosynthetic Archezoa; (2) the usually aerobic but predominantly non-photosynthetic Protozoa; (3) the always aerobic and usually photosynthetic Cryptista; (4) the always aerobic and predominantly photosynthetic Chromista. Whether the few non-photosynthetic haptophytes also lack plastids and thus are zooflagellates in the present sense is unclear. Six phyla (Archamoebae and Metamonada within the Archezoa; Percolozoa, Parabasala, Opalozoa, and Choanozoa within the Protozoa) consist largely or entirely of zooflagellates. One protozoan phylum (Euglenozoa) consists predominantly of zooflagellate families and genera, with a minority only of phytoflagellate genera: the photosynthetic euglenoids are probably all descended from a non-photosynthetic euglenoid which acquired a photosynthetic endosymbiont related to the ancestor of green algae. In the phylum Dinozoa (i.e. dinoflagellates and protalveolates) most classes consist purely of zooflagellates, but the majority of species are photosynthetic. The photosynthetic chlorarachneans are related to the sarcomonad zooflagellates and to the filose amoebae, so that the classes Chlorarachnea and Sarcomonadea are now placed in the phylum Rhizopoda, which is also modified by segregating the lobose amoebae as the phylum Amoebozoa. Although most zooflagellates are primitively without photosynthesis, there is good molecular evidence for the secondary origin of the zooflagellate condition by the loss of plastids in the case of the colourless pedinellids. A classification of 62 orders including zooflagellates grouped into 36 classes consisting primarily of zooflagellates, and four classes containing a few zooflagellates is presented; the ultrastructural and molecular evidence for the phylogenetic ideas underlying the classification is summarized. PMID:8868448

  9. Phylogenetic Conservatism in Plant Phenology

    NASA Technical Reports Server (NTRS)

    Davies, T. Jonathan; Wolkovich, Elizabeth M.; Kraft, Nathan J. B.; Salamin, Nicolas; Allen, Jenica M.; Ault, Toby R.; Betancourt, Julio L.; Bolmgren, Kjell; Cleland, Elsa E.; Cook, Benjamin I.; Crimmins, Theresa M.; Mazer, Susan J.; McCabe, Gregory J.; Pau, Stephanie; Regetz, Jim; Schwartz, Mark D.; Travers, Steven E.

    2013-01-01

    Phenological events (defined points in the life cycle of a plant or animal) have been regarded as highly plastic traits, reflecting flexible responses to various environmental cues. The ability of a species to track, via shifts in phenological events, the abiotic environment through time might dictate its vulnerability to future climate change. Understanding the predictors and drivers of phenological change is therefore critical. Here, we evaluated evidence for phylogenetic conservatism (the tendency for closely related species to share similar ecological and biological attributes) in phenological traits across flowering plants. We aggregated published and unpublished data on timing of first flower and first leaf, encompassing 4000 species at 23 sites across the Northern Hemisphere. We reconstructed the phylogeny for the set of included species, first, using the software program Phylomatic, and second, from DNA data. We then quantified phylogenetic conservatism in plant phenology within and across sites. We show that more closely related species tend to flower and leaf at similar times. By contrasting mean flowering times within and across sites, however, we illustrate that it is not the time of year that is conserved, but rather the phenological responses to a common set of abiotic cues. Our findings suggest that species cannot be treated as statistically independent when modelling phenological responses. Closely related species tend to resemble each other in the timing of their life-history events, a likely product of evolutionarily conserved responses to environmental cues. The search for the underlying drivers of phenology must therefore account for species' shared evolutionary histories.

  10. Multipolar consensus for phylogenetic trees.

    PubMed

    Bonnard, Cécile; Berry, Vincent; Lartillot, Nicolas

    2006-10-01

    Collections of phylogenetic trees are usually summarized using consensus methods. These methods build a single tree, supposed to be representative of the collection. However, in the case of heterogeneous collections of trees, the resulting consensus may be poorly resolved (strict consensus, majority-rule consensus, ...), or may perform arbitrary choices among mutually incompatible clades, or splits (greedy consensus). Here, we propose an alternative method, which we call the multipolar consensus (MPC). Its aim is to display all the splits having a support above a predefined threshold, in a minimum number of consensus trees, or poles. We show that the problem is equivalent to a graph-coloring problem, and propose an implementation of the method. Finally, we apply the MPC to real data sets. Our results indicate that, typically, all the splits down to a weight of 10% can be displayed in no more than 4 trees. In addition, in some cases, biologically relevant secondary signals, which would not have been present in any of the classical consensus trees, are indeed captured by our method, indicating that the MPC provides a convenient exploratory method for phylogenetic analysis. The method was implemented in a package freely available at http://www.lirmm.fr/~cbonnard/MPC.html PMID:17060203
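    The graph-coloring view described in the abstract can be sketched as follows: splits above the support threshold are nodes, incompatible splits are joined by an edge, and each color class is a set of mutually compatible splits that could be shown in one consensus tree (pole). The greedy coloring, the function names, and the representation of a split as a frozenset holding one side are assumptions for illustration. This is not the authors' implementation and does not guarantee a minimum number of poles.

```python
def compatible(a, b, taxa):
    """Two splits are compatible iff one of the four side intersections is empty."""
    a_rest, b_rest = taxa - a, taxa - b
    return (not (a & b) or not (a & b_rest)
            or not (a_rest & b) or not (a_rest & b_rest))


def greedy_poles(split_support, taxa, threshold):
    """Greedily group splits with support >= threshold into compatible sets."""
    kept = sorted(
        (s for s, w in split_support.items() if w >= threshold),
        key=lambda s: (-split_support[s], sorted(s)),  # strongest splits first
    )
    poles = []
    for s in kept:
        for pole in poles:
            if all(compatible(s, t, taxa) for t in pole):
                pole.append(s)
                break
        else:
            poles.append([s])  # no existing pole can host this split
    return poles


taxa = frozenset("ABCDE")
support = {
    frozenset("AB"): 0.6,
    frozenset("ABC"): 0.5,
    frozenset("BC"): 0.3,   # conflicts with AB, so it needs a second pole
    frozenset("DE"): 0.05,  # below the threshold, dropped
}
poles = greedy_poles(support, taxa, threshold=0.1)
```

    In this toy input the split BC conflicts with AB, so it ends up in a second pole rather than being discarded as it would be in a single greedy consensus tree.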

  11. Phylogenetic Origins of Brain Organisers

    PubMed Central

    Robertshaw, Ellen; Kiecker, Clemens

    2012-01-01

    The regionalisation of the nervous system begins early in embryogenesis, concomitant with the establishment of the anteroposterior (AP) and dorsoventral (DV) body axes. The molecular mechanisms that drive axis induction appear to be conserved throughout the animal kingdom and may be phylogenetically older than the emergence of bilateral symmetry. As a result of this process, groups of patterning genes that are equally well conserved are expressed at specific AP and DV coordinates of the embryo. In the emerging nervous system of vertebrate embryos, this initial pattern is refined by local signalling centres, secondary organisers, that regulate patterning, proliferation, and axonal pathfinding in adjacent neuroepithelium. The main secondary organisers for the AP neuraxis are the midbrain-hindbrain boundary, zona limitans intrathalamica, and anterior neural ridge and for the DV neuraxis the notochord, floor plate, and roof plate. A search for homologous secondary organisers in nonvertebrate lineages has led to controversy over their phylogenetic origins. Based on a recent study in hemichordates, it has been suggested that the AP secondary organisers evolved at the base of the deuterostome superphylum, earlier than previously thought. According to this view, the lack of signalling centres in some deuterostome lineages is likely to reflect a secondary loss due to adaptive processes. We propose that the relative evolutionary flexibility of secondary organisers has contributed to a broader morphological complexity of nervous systems in different clades. PMID:24278699

  12. Molecular phylogenetics and character evolution of morphologically diverse groups, Dendrobium section Dendrobium and allies

    PubMed Central

    Takamiya, Tomoko; Wongsawad, Pheravut; Sathapattayanon, Apirada; Tajima, Natsuko; Suzuki, Shunichiro; Kitamura, Saki; Shioda, Nao; Handa, Takashi; Kitanaka, Susumu; Iijima, Hiroshi; Yukawa, Tomohisa

    2014-01-01

    It is always difficult to construct coherent classification systems for plant lineages having diverse morphological characters. The genus Dendrobium, one of the largest genera in the Orchidaceae, includes ~1100 species, and enormous morphological diversification has hindered the establishment of consistent classification systems covering all major groups of this genus. Given the particular importance of species in Dendrobium section Dendrobium and allied groups as floriculture and crude drug genetic resources, there is an urgent need to establish a stable classification system. To clarify phylogenetic relationships in Dendrobium section Dendrobium and allied groups, we analysed the macromolecular characters of the group. Phylogenetic analyses of 210 taxa of Dendrobium were conducted on DNA sequences of internal transcribed spacer (ITS) regions of 18S–26S nuclear ribosomal DNA and the maturase-coding gene (matK) located in an intron of the plastid gene trnK using maximum parsimony and Bayesian methods. The parsimony and Bayesian analyses revealed 13 distinct clades in the group comprising section Dendrobium and its allied groups. Results also showed paraphyly or polyphyly of sections Amblyanthus, Aporum, Breviflores, Calcarifera, Crumenata, Dendrobium, Densiflora, Distichophyllae, Dolichocentrum, Holochrysa, Oxyglossum and Pedilonum. On the other hand, the monophyly of section Stachyobium was well supported. It was found that many of the morphological characters that have been believed to reflect phylogenetic relationships are, in fact, the result of convergence. As such, many of the sections that have been recognized up to this point were found to not be monophyletic, so recircumscription of sections is required. PMID:25107672

  13. Molecular and Morphological Analyses Reveal Phylogenetic Relationships of Stingrays Focusing on the Family Dasyatidae (Myliobatiformes)

    PubMed Central

    Lim, Kean Chong; Lim, Phaik-Eem; Chong, Ving Ching; Loh, Kar-Hoe

    2015-01-01

    Elucidating the phylogenetic relationships of the current but problematic Dasyatidae (Order Myliobatiformes) was the first priority of the current study. Here, we studied three molecular gene markers of 43 species (COI gene), 33 species (ND2 gene) and 34 species (RAG1 gene) of stingrays to draft out the phylogenetic tree of the order. Nine character states were identified and used to confirm the molecularly constructed phylogenetic trees. Eight or more clades (at different hierarchical levels) were identified for COI, ND2 and RAG1 genes in the Myliobatiformes including four clades containing members of the present Dasyatidae, thus rendering the latter non-monophyletic. The uncorrected p-distance between these four ‘Dasyatidae’ clades when compared to the distance between formally known families confirmed that these four clades should be elevated to four separate families. We suggest a revision of the present classification, retaining the Dasyatidae (Dasyatis and Taeniurops species) but adding three new families namely, Neotrygonidae (Neotrygon and Taeniura species), Himanturidae (Himantura species) and Pastinachidae (Pastinachus species). Our result indicated the need to further review the classification of Dasyatis microps. By resolving the non-monophyletic problem, the suite of nine character states enables the natural classification of the Myliobatiformes into at least thirteen families based on morphology. PMID:25867639
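    The uncorrected p-distance mentioned in the abstract is the proportion of aligned sites at which two sequences differ. A minimal sketch follows; the function name and the choice to skip sites containing gaps or ambiguous characters are assumptions for illustration, and the study's own software settings are not stated in the abstract.

```python
def p_distance(seq1, seq2, skip_chars="-?N"):
    """Uncorrected p-distance between two aligned sequences.

    Sites where either sequence has a gap or ambiguity code are skipped
    (pairwise deletion); this handling is an assumption.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a in skip_chars or b in skip_chars:
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


d = p_distance("ACGTACGT", "ACGTTCGA")
```

    Averaging such pairwise values within and between clades gives the kind of between-group comparison the abstract refers to.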

  14. Molecular phylogenetics and character evolution of morphologically diverse groups, Dendrobium section Dendrobium and allies.

    PubMed

    Takamiya, Tomoko; Wongsawad, Pheravut; Sathapattayanon, Apirada; Tajima, Natsuko; Suzuki, Shunichiro; Kitamura, Saki; Shioda, Nao; Handa, Takashi; Kitanaka, Susumu; Iijima, Hiroshi; Yukawa, Tomohisa

    2014-01-01

    It is always difficult to construct coherent classification systems for plant lineages having diverse morphological characters. The genus Dendrobium, one of the largest genera in the Orchidaceae, includes ~1100 species, and enormous morphological diversification has hindered the establishment of consistent classification systems covering all major groups of this genus. Given the particular importance of species in Dendrobium section Dendrobium and allied groups as floriculture and crude drug genetic resources, there is an urgent need to establish a stable classification system. To clarify phylogenetic relationships in Dendrobium section Dendrobium and allied groups, we analysed the macromolecular characters of the group. Phylogenetic analyses of 210 taxa of Dendrobium were conducted on DNA sequences of internal transcribed spacer (ITS) regions of 18S-26S nuclear ribosomal DNA and the maturase-coding gene (matK) located in an intron of the plastid gene trnK using maximum parsimony and Bayesian methods. The parsimony and Bayesian analyses revealed 13 distinct clades in the group comprising section Dendrobium and its allied groups. Results also showed paraphyly or polyphyly of sections Amblyanthus, Aporum, Breviflores, Calcarifera, Crumenata, Dendrobium, Densiflora, Distichophyllae, Dolichocentrum, Holochrysa, Oxyglossum and Pedilonum. On the other hand, the monophyly of section Stachyobium was well supported. It was found that many of the morphological characters that have been believed to reflect phylogenetic relationships are, in fact, the result of convergence. As such, many of the sections that have been recognized up to this point were found to not be monophyletic, so recircumscription of sections is required. PMID:25107672

  15. Propionibacterium acnes Types I and II Represent Phylogenetically Distinct Groups

    PubMed Central

    McDowell, Andrew; Valanne, Susanna; Ramage, Gordon; Tunney, Michael M.; Glenn, Josephine V.; McLorinan, Gregory C.; Bhatia, Ajay; Maisonneuve, Jean-Francois; Lodes, Michael; Persing, David H.; Patrick, Sheila

    2005-01-01

    Although two phenotypes of the opportunistic pathogen Propionibacterium acnes (types I and II) have been described, epidemiological investigations of their roles in different infections have not been widely reported. Using immunofluorescence microscopy with monoclonal antibodies (MAbs) QUBPa1 and QUBPa2, specific for types I and II, respectively, we investigated the prevalences of the two types among 132 P. acnes isolates. Analysis of isolates from failed prosthetic hip implants (n = 40) revealed approximately equal numbers of type I and II organisms. Isolates from failed prosthetic hip-associated bone (n = 6) and tissue (n = 38) samples, as well as isolates from acne (n = 22), dental infections (n = 8), and skin removed during surgical incision (n = 18) were predominately of type I. A total of 11 (8%) isolates showed atypical MAb labeling and could not be conclusively identified. Phylogenetic analysis of P. acnes by nucleotide sequencing revealed the 16S rRNA gene to be highly conserved between types I and II. In contrast, sequence analysis of recA and a putative hemolysin gene (tly) revealed significantly greater type-specific polymorphisms that corresponded to phylogenetically distinct cluster groups. All 11 isolates with atypical MAb labeling were identified as type I by sequencing. Within the recA and tly phylogenetic trees, nine of these isolates formed a cluster distinct from other type I organisms, suggesting a further phylogenetic subdivision within type I. Our study therefore demonstrates that the phenotypic differences between P. acnes types I and II reflect deeper differences in their phylogeny. Furthermore, nucleotide sequencing provides an accurate method for identifying the type status of P. acnes isolates. PMID:15634990

  16. Propionibacterium acnes types I and II represent phylogenetically distinct groups.

    PubMed

    McDowell, Andrew; Valanne, Susanna; Ramage, Gordon; Tunney, Michael M; Glenn, Josephine V; McLorinan, Gregory C; Bhatia, Ajay; Maisonneuve, Jean-Francois; Lodes, Michael; Persing, David H; Patrick, Sheila

    2005-01-01

    Although two phenotypes of the opportunistic pathogen Propionibacterium acnes (types I and II) have been described, epidemiological investigations of their roles in different infections have not been widely reported. Using immunofluorescence microscopy with monoclonal antibodies (MAbs) QUBPa1 and QUBPa2, specific for types I and II, respectively, we investigated the prevalences of the two types among 132 P. acnes isolates. Analysis of isolates from failed prosthetic hip implants (n = 40) revealed approximately equal numbers of type I and II organisms. Isolates from failed prosthetic hip-associated bone (n = 6) and tissue (n = 38) samples, as well as isolates from acne (n = 22), dental infections (n = 8), and skin removed during surgical incision (n = 18) were predominately of type I. A total of 11 (8%) isolates showed atypical MAb labeling and could not be conclusively identified. Phylogenetic analysis of P. acnes by nucleotide sequencing revealed the 16S rRNA gene to be highly conserved between types I and II. In contrast, sequence analysis of recA and a putative hemolysin gene (tly) revealed significantly greater type-specific polymorphisms that corresponded to phylogenetically distinct cluster groups. All 11 isolates with atypical MAb labeling were identified as type I by sequencing. Within the recA and tly phylogenetic trees, nine of these isolates formed a cluster distinct from other type I organisms, suggesting a further phylogenetic subdivision within type I. Our study therefore demonstrates that the phenotypic differences between P. acnes types I and II reflect deeper differences in their phylogeny. Furthermore, nucleotide sequencing provides an accurate method for identifying the type status of P. acnes isolates. PMID:15634990

  17. A phylogenetic model for understanding the effect of gene duplication on cancer progression

    PubMed Central

    Ma, Qin; Reeves, Jaxk H.; Liberles, David A.; Yu, Lili; Chang, Zheng; Zhao, Jing; Cui, Juan; Xu, Ying; Liu, Liang

    2014-01-01

    As biotechnology advances rapidly, a tremendous amount of cancer genetic data has become available, providing an unprecedented opportunity for understanding the genetic mechanisms of cancer. To understand the effects of duplications and deletions on cancer progression, two genomes (normal and tumor) were sequenced from each of five stomach cancer patients in different stages (I, II, III and IV). We developed a phylogenetic model for analyzing stomach cancer data. The model assumes that duplication and deletion occur in accordance with a continuous time Markov Chain along the branches of a phylogenetic tree attached with five extended branches leading to the tumor genomes. Moreover, coalescence times of the phylogenetic tree follow a coalescence process. The simulation study suggests that the maximum likelihood approach can accurately estimate parameters in the phylogenetic model. The phylogenetic model was applied to the stomach cancer data. We found that the expected number of changes (duplication and deletion) per gene for the tumor genomes is significantly higher than that for the normal genomes. The goodness-of-fit test suggests that the phylogenetic model with constant duplication and deletion rates can adequately fit the duplication data for the normal genomes. The analysis found nine duplicated genes that are significantly associated with stomach cancer. PMID:24371277

  18. ATV: display and manipulation of annotated phylogenetic trees

    E-print Network

    Eddy, Sean

    ATV: display and manipulation of annotated phylogenetic trees 8/10/01 1 ATV: display and manipulation of annotated phylogenetic trees Christian M. Zmasek and Sean R. Eddy Howard Hughes Medical: {zmasek,eddy}@genetics.wustl.edu Key words: tree display, tree viewer, phylogenetic tree, java

  19. CLASSIFICATION OF PROCEDURES INTERNATIONAL CLASSIFICATION OF DISEASES

    E-print Network

    Laksanacharoen, Sathaporn

    CLASSIFICATION OF PROCEDURES INTERNATIONAL CLASSIFICATION OF DISEASES 9th REVISION CLINICAL MODIFICATION PREFACE This sixth edition of the International Classification of Diseases, 9th coding. The International Classification of Diseases, 9th Revision, published by the World Health

  20. The neuron classification problem.

    PubMed

    Bota, Mihail; Swanson, Larry W

    2007-11-01

    A systematic account of neuron cell types is a basic prerequisite for determining the vertebrate nervous system global wiring diagram. With comprehensive lineage and phylogenetic information unavailable, a general ontology based on structure-function taxonomy is proposed and implemented in a knowledge management system, and a prototype analysis of select regions (including retina, cerebellum, and hypothalamus) presented. The supporting Brain Architecture Knowledge Management System (BAMS) Neuron ontology is online and its user interface allows queries about terms and their definitions, classification criteria based on the original literature and "Petilla Convention" guidelines, hierarchies, and relations, with annotations documenting each ontology entry. Combined with three BAMS modules for neural regions, connections between regions and neuron types, and molecules, the Neuron ontology provides a general framework for physical descriptions and computational modeling of neural systems. The knowledge management system interacts with other web resources, is accessible in both XML and RDF/OWL, is extendible to the whole body, and awaits large-scale data population requiring community participation for timely implementation. PMID:17582506

  1. Primate Classification

    NSDL National Science Digital Library

    In this lesson students learn how classification of organisms is based on evolutionary relationships. They will also learn how primates are categorized, and how they are related. Students transfer examples (names) of primates from their location in an outline hierarchy of primate groups into a set of nested boxes reflecting that same hierarchy. A cladogram can then be drawn illustrating how these groups are related in an evolutionary way.

  2. Triangle Classification

    NSDL National Science Digital Library

    2010-12-29

    This geometry lesson from Illuminations presents the Triangle Classification problem. Students will attempt to classify the triangles formed in a plane when a randomly selected point is connected to the endpoints of a given line segment. Students should have access to a computer with internet access for the lesson. The material is intended for grades 9-12 and should require 1 class period to complete.
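    The classification task described in this record can be sketched in a few lines. The record does not state which categories the lesson uses; the sketch below assumes classification by largest angle (acute, right, obtuse), and the function name and tolerance are hypothetical.

```python
def classify_triangle(a, b, p, tol=1e-9):
    """Classify the triangle formed by segment endpoints a, b and a point p.

    Points are (x, y) tuples. Returns "acute", "right", "obtuse", or
    "degenerate" when p lies on the line through a and b.
    """
    def squared_distance(u, v):
        return (u[0] - v[0]) ** 2 + (u[1] - v[1]) ** 2

    # A zero cross product means the three points are collinear.
    cross = (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])
    if abs(cross) <= tol:
        return "degenerate"

    # Compare the two shorter squared sides with the longest one
    # (converse of the Pythagorean theorem).
    s = sorted([squared_distance(a, b), squared_distance(a, p), squared_distance(b, p)])
    diff = s[0] + s[1] - s[2]
    if abs(diff) <= tol:
        return "right"
    return "acute" if diff > 0 else "obtuse"


if __name__ == "__main__":
    print(classify_triangle((0, 0), (4, 0), (0, 3)))
```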

  3. [Molecular phylogenetic analysis of the genus Abies (Pinaceae) based on the nucleotide sequence of chloroplast DNA].

    PubMed

    Semerikova, S A; Semerikov, V L

    2014-01-01

    A phylogenetic study of firs (Abies Mill.) was conducted using nucleotide sequences of several chloroplast DNA regions with a total length of 5580 bp. The analysis included 37 taxa, which represented the main evolutionary lineages of the genus, and Keteleeria davidiana. According to the phylogenetic reconstruction, the Abies species were subdivided into six main groups, generally corresponding to their geographic distribution. The phylogenetic tree had three basal clades. All of these clades contained American species, and only one of them contained Eurasian species. The divergence time calibrations, based on paleobotanical data and the chloroplast DNA mutation rate estimates in Pinaceae, produced similar results. The age of diversification among the clades of the present-day Abies was estimated as the end of the Oligocene to the beginning of the Miocene. The age of the separation of Mediterranean firs from the Asian-North American branch corresponds to the Miocene. The age of diversification within the young groups of Mediterranean, Asian, and boreal American firs (A. lasiocarpa, A. balsamea, A. fraseri) was estimated as the Pliocene-Pleistocene. Based on the phylogenetic reconstruction obtained, the most plausible biogeographic scenarios were suggested. It is noted that the existing systematic classification of the genus Abies strongly contradicts the phylogenetic reconstruction and requires revision. PMID:25711008

  4. Protein classification based on text document classification techniques.

    PubMed

    Cheng, Betty Yee Man; Carbonell, Jaime G; Klein-Seetharaman, Judith

    2005-03-01

    The need for accurate, automated protein classification methods continues to increase as advances in biotechnology uncover new proteins. G-protein coupled receptors (GPCRs) are a particularly difficult superfamily of proteins to classify due to extreme diversity among its members. Previous comparisons of BLAST, k-nearest neighbor (k-NN), hidden Markov model (HMM) and support vector machine (SVM) using alignment-based features have suggested that classifiers at the complexity of SVM are needed to attain high accuracy. Here, analogous to document classification, we applied Decision Tree and Naive Bayes classifiers with chi-square feature selection on counts of n-grams (i.e. short peptide sequences of length n) to this classification task. Using the GPCR dataset and evaluation protocol from the previous study, the Naive Bayes classifier attained an accuracy of 93.0 and 92.4% in level I and level II subfamily classification respectively, while SVM has a reported accuracy of 88.4 and 86.3%. This is a 39.7 and 44.5% reduction in residual error for level I and level II subfamily classification, respectively. The Decision Tree, while inferior to SVM, outperforms HMM in both level I and level II subfamily classification. For those GPCR families whose profiles are stored in the Protein FAMilies database of alignments and HMMs (PFAM), our method performs comparably to a search against those profiles. Finally, our method can be generalized to other protein families by applying it to the superfamily of nuclear receptors with 94.5, 97.8 and 93.6% accuracy in family, level I and level II subfamily classification respectively. PMID:15645499
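    Two quantities in this abstract are easy to make concrete: the n-gram counts used as features, and the "reduction in residual error" derived from the reported accuracies. The sketch below is illustrative only; the function names are hypothetical, the peptide string is made up, and the classifier and chi-square feature selection themselves are not reproduced.

```python
from collections import Counter


def ngram_counts(sequence, n):
    """Count overlapping n-grams (short peptide substrings of length n)."""
    return Counter(sequence[i:i + n] for i in range(len(sequence) - n + 1))


def residual_error_reduction(baseline_accuracy, new_accuracy):
    """Percentage of the baseline's error that the new classifier removes.

    Both accuracies are percentages in the range 0-100.
    """
    baseline_error = 100.0 - baseline_accuracy
    new_error = 100.0 - new_accuracy
    return 100.0 * (baseline_error - new_error) / baseline_error


if __name__ == "__main__":
    # Toy peptide string, not taken from the GPCR dataset.
    print(ngram_counts("MKMK", 2))
    # Accuracies reported in the abstract: SVM versus Naive Bayes.
    print(round(residual_error_reduction(88.4, 93.0), 1))  # level I
    print(round(residual_error_reduction(86.3, 92.4), 1))  # level II
```

    With the reported accuracies the two calls give 39.7 and 44.5, matching the figures quoted in the abstract: for level I, the error falls from 11.6% to 7.0%, and 4.6 / 11.6 is about 39.7%.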

  5. Staghorn classification: Platform for morphometry assessment

    PubMed Central

    Mishra, Shashikant; Bhattu, Amit S.; Sabnis, Ravindra B.; Desai, Mahesh R.

    2014-01-01

    Introduction: The majority of staghorn classifications do not incorporate volumetric stone burden assessment. Accurate volumetric data can easily be acquired with the increasingly available computerized tomography (CT) scan. This manuscript reviews the available staghorn stone classifications and rationalizes the morphometry-based classification. Materials and Methods: A PubMed search was performed for articles concerning staghorn classification and morphometry. Twenty abstracts were shortlisted from a total of 43 published abstracts. In view of the paucity of manuscripts on staghorn morphometry (4), older staghorn classifications were analyzed with the aim of determining the optimal one with relevance to the percutaneous nephrolithotomy (PCNL) monotherapy outcome. Results: All available staghorn classifications are limited with non-widespread applicability. The traditional partial and complete staghorn are limited due to non-descript stone volumetric data and considerable overlap of the intermediate ones in either group. A lack of standardized definition limits intergroup comparison as well. Staghorn morphometry is a recent addition to the clinical classification profiling of a staghorn calculus. It comprises extensive CT volumetric stone distribution assessment of a staghorn in a given pelvi–calyceal anatomy. It allows meaningful clinical classification of staghorn stones from a contemporary PCNL monotherapy perspective. Conclusions: Morphometry-based classification affords clinically relevant nomenclature in predicting the outcome of PCNL for staghorn stones. Further research is required to reduce the complexity associated with measuring the volumetric stone distribution in a given calyceal system. PMID:24497688

  6. Phylogenetic analysis of the spirochetes.

    PubMed

    Paster, B J; Dewhirst, F E; Weisburg, W G; Tordoff, L A; Fraser, G J; Hespell, R B; Stanton, T B; Zablen, L; Mandelco, L; Woese, C R

    1991-10-01

    The 16S rRNA sequences were determined for species of Spirochaeta, Treponema, Borrelia, Leptospira, Leptonema, and Serpula, using a modified Sanger method of direct RNA sequencing. Analysis of aligned 16S rRNA sequences indicated that the spirochetes form a coherent taxon composed of six major clusters or groups. The first group, termed the treponemes, was divided into two subgroups. The first treponeme subgroup consisted of Treponema pallidum, Treponema phagedenis, Treponema denticola, a thermophilic spirochete strain, and two species of Spirochaeta, Spirochaeta zuelzerae and Spirochaeta stenostrepta, with an average interspecies similarity of 89.9%. The second treponeme subgroup contained Treponema bryantii, Treponema pectinovorum, Treponema saccharophilum, Treponema succinifaciens, and rumen strain CA, with an average interspecies similarity of 86.2%. The average interspecies similarity between the two treponeme subgroups was 84.2%. The division of the treponemes into two subgroups was verified by single-base signature analysis. The second spirochete group contained Spirochaeta aurantia, Spirochaeta halophila, Spirochaeta bajacaliforniensis, Spirochaeta litoralis, and Spirochaeta isovalerica, with an average similarity of 87.4%. The Spirochaeta group was related to the treponeme group, with an average similarity of 81.9%. The third spirochete group contained borrelias, including Borrelia burgdorferi, Borrelia anserina, Borrelia hermsii, and a rabbit tick strain. The borrelias formed a tight phylogenetic cluster, with average similarity of 97%. The borrelia group shared a common branch with the Spirochaeta group and was closer to this group than to the treponemes. A single spirochete strain isolated from the shrew constituted the fourth group. The fifth group was composed of strains of Serpula (Treponema) hyodysenteriae and Serpula (Treponema) innocens. The two species of this group were closely related, with a similarity of greater than 99%. Leptonema illini, Leptospira biflexa, and Leptospira interrogans formed the sixth and most deeply branching group. The average similarity within this group was 83.2%. This study represents the first demonstration that pathogenic and saprophytic Leptospira species are phylogenetically related. The division of the spirochetes into six major phylogenetic clusters was defined also by sequence signature elements. These signature analyses supported the conclusion that the spirochetes represent a monophyletic bacterial phylum. PMID:1917844

  7. Taxonomy, Phylogenetics, and Philip S. Ward

    E-print Network

    Ward, Philip S.

    Chapter 1 Taxonomy, Phylogenetics, and Evolution Philip S. Ward 1.1 Introduction Since their origin features of evolutionary history. Species-level taxonomy has advanced more fitfully than ant phylogenetics and features of their biology are discussed. The state of species-level taxonomy is evaluated

  8. A Novel Approach for Compressing Phylogenetic Trees

    E-print Network

    Williams, Tiffani

    A Novel Approach for Compressing Phylogenetic Trees Suzanne J. Matthews, Seung-Jin Sul, and Tiffani,sulsj,tlw}@cse.tamu.edu Abstract. Phylogenetic trees are tree structures that depict relationships between organisms. Popular analysis techniques often produce large collections of candidate trees, which are expensive to store. We

  9. The challenge of constructing large phylogenetic trees

    Microsoft Academic Search

    Michael J. Sanderson; Amy C. Driskell

    2003-01-01

    The amount of sequence data available to reconstruct the evolutionary history of genes and species has increased 20-fold in the past decade. Consequently the size of phylogenetic analyses has grown as well, and phylogenetic methods, algorithms and their implementations have struggled to keep pace. Computational and other challenges raised by this burgeoning database emerge at several stages of analysis, from

  10. Phylogenetic fields of species: cross-species patterns of phylogenetic structure and geographical coexistence

    PubMed Central

    Villalobos, Fabricio; Rangel, Thiago F.; Diniz-Filho, José Alexandre F.

    2013-01-01

    Differential coexistence among species underlies geographical patterns of biodiversity. Understanding such patterns has relied either on ecological or historical approaches applied separately. Recently, macroecology and community phylogenetics have tried to integrate both ecological and historical approaches. However, macroecology is mostly non-phylogenetic, whereas community phylogenetics is largely focused on local scales. Here, we propose a conceptual framework to link macroecology and community phylogenetics by exploring the evolutionary context of large-scale species coexistence, introducing the phylogenetic field concept. This is defined as the phylogenetic structure of species co-occurrence within a focal species' geographical range. We developed concepts and methods for analysing phylogenetic fields and applied them to study coexistence patterns of the bat family Phyllostomidae. Our analyses showed that phyllostomid bats coexist mostly with closely related species, revealing a north–south gradient from overdispersed to clustered phylogenetic fields. Patterns at different phylogenetic levels (i.e. all species versus close relatives only) presented the same gradient. Results support the tropical niche conservatism hypothesis, potentially mediated by higher speciation rates in the region of origin coupled with shared environmental preferences among species. The phylogenetic field approach enables species-based community phylogenetics, instead of those that are site-based, allowing the description of historical processes at more appropriate macroecological and biogeographic scales. PMID:23390100

  11. Phylogenetic fields of species: cross-species patterns of phylogenetic structure and geographical coexistence.

    PubMed

    Villalobos, Fabricio; Rangel, Thiago F; Diniz-Filho, José Alexandre F

    2013-04-01

    Differential coexistence among species underlies geographical patterns of biodiversity. Understanding such patterns has relied either on ecological or historical approaches applied separately. Recently, macroecology and community phylogenetics have tried to integrate both ecological and historical approaches. However, macroecology is mostly non-phylogenetic, whereas community phylogenetics is largely focused on local scales. Here, we propose a conceptual framework to link macroecology and community phylogenetics by exploring the evolutionary context of large-scale species coexistence, introducing the phylogenetic field concept. This is defined as the phylogenetic structure of species co-occurrence within a focal species' geographical range. We developed concepts and methods for analysing phylogenetic fields and applied them to study coexistence patterns of the bat family Phyllostomidae. Our analyses showed that phyllostomid bats coexist mostly with closely related species, revealing a north-south gradient from overdispersed to clustered phylogenetic fields. Patterns at different phylogenetic levels (i.e. all species versus close relatives only) presented the same gradient. Results support the tropical niche conservatism hypothesis, potentially mediated by higher speciation rates in the region of origin coupled with shared environmental preferences among species. The phylogenetic field approach enables species-based community phylogenetics, instead of those that are site-based, allowing the description of historical processes at more appropriate macroecological and biogeographic scales. PMID:23390100

  12. Phylogenetic analysis of the genus Hordeum using repetitive DNA sequences.

    PubMed

    Svitashev, S; Bryngelsson, T; Vershinin, A; Pedersen, C; Säll, T; von Bothmer, R

    1994-12-01

    A set of six cloned barley (Hordeum vulgare) repetitive DNA sequences was used for the analysis of phylogenetic relationships among 31 species (46 taxa) of the genus Hordeum, using molecular hybridization techniques. In situ hybridization experiments showed dispersed organization of the sequences over all chromosomes of H. vulgare and the wild barley species H. bulbosum, H. marinum and H. murinum. Southern blot hybridization revealed different levels of polymorphism among barley species and the RFLP data were used to generate a phylogenetic tree for the genus Hordeum. Our data are in good agreement with the classification system which suggests the division of the genus into four major groups, containing the genomes I, X, Y, and H. However, our investigation also supports previous molecular studies of barley species where the unique position of H. bulbosum has been pointed out. In our experiments, H. bulbosum generally had hybridization patterns different from those of H. vulgare, although both carry the I genome. Based on our results we present a hypothesis concerning the possible origin and phylogeny of the polyploid barley species H. secalinum, H. depressum and the H. brachyantherum complex. PMID:24178086

  13. Phylogenetic Relationships of American Willows (Salix L., Salicaceae)

    PubMed Central

    Lauron-Moreau, Aurélien; Pitre, Frédéric E.; Argus, George W.; Labrecque, Michel; Brouillet, Luc

    2015-01-01

    Salix L. is the largest genus in the family Salicaceae (450 species). Several classifications have been published, but taxonomic subdivision has been under continuous revision. Our goal is to establish the phylogenetic structure of the genus using molecular data on all American willows, using three DNA markers. This complete phylogeny of American willows allows us to propose a biogeographic framework for the evolution of the genus. Material was obtained for the 122 native and introduced willow species of America. Sequences were obtained from the ITS (ribosomal nuclear DNA) and two plastid regions, matK and rbcL. Phylogenetic analyses (parsimony, maximum likelihood, Bayesian inference) were performed on the data. Geographic distribution was mapped onto the tree. The species tree provides strong support for a division of the genus into two subgenera, Salix and Vetrix. Subgenus Salix comprises temperate species from the Americas and Asia, and their disjunction may result from Tertiary events. Subgenus Vetrix is composed of boreo-arctic species of the Northern Hemisphere and their radiation may coincide with the Quaternary glaciations. Sixteen species have ambiguous positions; genetic diversity is lower in subg. Vetrix. A molecular phylogeny of all species of American willows has been inferred. It needs to be tested and further resolved using other molecular data. Nonetheless, the genus clearly has two clades that have distinct biogeographic patterns. PMID:25880993

  14. Phylogenetic relationships of American willows (Salix L., Salicaceae).

    PubMed

    Lauron-Moreau, Aurélien; Pitre, Frédéric E; Argus, George W; Labrecque, Michel; Brouillet, Luc

    2015-01-01

    Salix L. is the largest genus in the family Salicaceae (450 species). Several classifications have been published, but taxonomic subdivision has been under continuous revision. Our goal is to establish the phylogenetic structure of the genus using molecular data on all American willows, using three DNA markers. This complete phylogeny of American willows allows us to propose a biogeographic framework for the evolution of the genus. Material was obtained for the 122 native and introduced willow species of America. Sequences were obtained from the ITS (ribosomal nuclear DNA) and two plastid regions, matK and rbcL. Phylogenetic analyses (parsimony, maximum likelihood, Bayesian inference) were performed on the data. Geographic distribution was mapped onto the tree. The species tree provides strong support for a division of the genus into two subgenera, Salix and Vetrix. Subgenus Salix comprises temperate species from the Americas and Asia, and their disjunction may result from Tertiary events. Subgenus Vetrix is composed of boreo-arctic species of the Northern Hemisphere and their radiation may coincide with the Quaternary glaciations. Sixteen species have ambiguous positions; genetic diversity is lower in subg. Vetrix. A molecular phylogeny of all species of American willows has been inferred. It needs to be tested and further resolved using other molecular data. Nonetheless, the genus clearly has two clades that have distinct biogeographic patterns. PMID:25880993

  15. SUNPLIN: Simulation with Uncertainty for Phylogenetic Investigations

    PubMed Central

    2013-01-01

    Background: Phylogenetic comparative analyses usually rely on a single consensus phylogenetic tree in order to study evolutionary processes. However, most phylogenetic trees are incomplete with regard to species sampling, which may critically compromise analyses. Some approaches have been proposed to integrate non-molecular phylogenetic information into incomplete molecular phylogenies. An expanded tree approach consists of adding missing species to random locations within their clade. The information contained in the topology of the resulting expanded trees can be captured by the pairwise phylogenetic distance between species and stored in a matrix for further statistical analysis. Thus, the random expansion and processing of multiple phylogenetic trees can be used to estimate the phylogenetic uncertainty through a simulation procedure. Because of the computational burden required, unless this procedure is efficiently implemented, the analyses are of limited applicability. Results: In this paper, we present efficient algorithms and implementations for randomly expanding and processing phylogenetic trees so that simulations involved in comparative phylogenetic analysis with uncertainty can be conducted in a reasonable time. We propose algorithms for both randomly expanding trees and calculating distance matrices. We made available the source code, which was written in the C++ language. The code may be used as a standalone program or as a shared object in the R system. The software can also be used as a web service through the link: http://purl.oclc.org/NET/sunplin/. Conclusion: We compare our implementations to similar solutions and show that significant performance gains can be obtained. Our results open up the possibility of accounting for phylogenetic uncertainty in evolutionary and ecological analyses of large datasets. PMID:24229408
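    The two operations named in this abstract, random expansion of a tree and pairwise phylogenetic distance, can be sketched as follows. This is not the SUNPLIN implementation (which the abstract says is written in C++) and makes no attempt at its efficiency. It assumes an ultrametric tree stored as a dictionary mapping each non-root node to a (parent, branch length) pair; all function and node names are hypothetical.

```python
import random


def depth(parent, node):
    """Sum of branch lengths from node up to the root."""
    d = 0.0
    while node in parent:
        node, length = parent[node]
        d += length
    return d


def edges_in_clade(parent, clade):
    """Nodes strictly below clade; each one names the edge leading to it."""
    below = []
    for node in parent:
        cur = node
        while cur in parent:
            cur = parent[cur][0]
            if cur == clade:
                below.append(node)
                break
    return below


def expand(parent, clade, new_leaf, tree_height, rng):
    """Attach new_leaf at a random point on a random edge inside clade."""
    child = rng.choice(sorted(edges_in_clade(parent, clade)))
    par, length = parent[child]
    cut = rng.uniform(0.0, length)  # distance from par to the new internal node
    mid = "_split_" + new_leaf
    new = dict(parent)
    new[mid] = (par, cut)
    new[child] = (mid, length - cut)
    # Pendant length chosen so the new tip sits at the common tip depth.
    new[new_leaf] = (mid, tree_height - (depth(parent, par) + cut))
    return new


def patristic(parent, a, b):
    """Path length between a and b through their most recent common ancestor."""
    dist_a = {a: 0.0}
    node, d = a, 0.0
    while node in parent:
        node, length = parent[node]
        d += length
        dist_a[node] = d
    node, d = b, 0.0
    while node not in dist_a:
        node, length = parent[node]
        d += length
    return d + dist_a[node]


if __name__ == "__main__":
    tree = {"X": ("R", 1.0), "A": ("X", 1.0), "B": ("X", 1.0), "C": ("R", 2.0)}
    expanded = expand(tree, "X", "D", 2.0, random.Random(0))
    print(patristic(expanded, "D", "A"), patristic(expanded, "D", "C"))
```

    Repeating expand with different random seeds and collecting the patristic distances for every species pair yields the set of distance matrices over which phylogenetic uncertainty can be summarized.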

  16. Beach Classification

    NSDL National Science Digital Library

    Lisa Davis

    This activity provides students with in-class practice in landscape interpretation using slides of beaches shown by the instructor. Students view a select number of slides and are asked to classify each beach shown using the Wright and Short Beach Classification (dissipative, reflective, and intermediate) by visually identifying the landforms and processes of each beach type. The outcome of this activity is that students practice identifying landforms and processes and applying their observations and interpretations of geomorphic features and processes for an applied purpose. Designed for a geomorphology course. Has minimal/no quantitative component.

  17. Accurate determination of inflationary perturbations

    E-print Network

    Ian J Grivell; Andrew R Liddle

    1996-07-18

    We use a numerical code for accurate computation of the amplitude of linear density perturbations and gravitational waves generated by single-field inflation models to study the accuracy of existing analytic results based on the slow-roll approximation. We use our code to calculate the coefficient of an expansion about the exact analytic result for power-law inflation; this generates a fitting function which can be applied to all inflationary models to obtain extremely accurate results. In the appropriate limit our results confirm the Stewart-Lyth analytic second-order calculation, and we find that their results are very accurate for inflationary models favoured by current observational constraints.

  18. Accurate Monitor 1.2

    NSDL National Science Digital Library

    With many computer users developing their own Web sites, some of them may be interested in monitoring how search engines may be ranking their site. This latest edition of Accurate Monitor may prove useful, as it allows individuals to find the position of their Web site in search engines like Altavista and Google. Additionally, Accurate Monitor can generate advanced statistics and monitor plugins, along with providing a flexible interface system. This version of Accurate Monitor is compatible with all systems running Windows 95 and higher.

  19. Explaining diversity in metagenomic datasets by phylogenetic-based feature weighting.

    PubMed

    Albanese, Davide; De Filippo, Carlotta; Cavalieri, Duccio; Donati, Claudio

    2015-03-01

    Metagenomics is revolutionizing our understanding of microbial communities, showing that their structure and composition have profound effects on the ecosystem and in a variety of health and disease conditions. Despite the flourishing of new analysis methods, current approaches based on statistical comparisons between high-level taxonomic classes often fail to identify the microbial taxa that are differentially distributed between sets of samples, since in many cases the taxonomic schema do not allow an adequate description of the structure of the microbiota. This constitutes a severe limitation to the use of metagenomic data in therapeutic and diagnostic applications. To provide a more robust statistical framework, we introduce a class of feature-weighting algorithms that discriminate the taxa responsible for the classification of metagenomic samples. The method unambiguously groups the relevant taxa into clades without relying on pre-defined taxonomic categories, thus including in the analysis also those sequences for which a taxonomic classification is difficult. The phylogenetic clades are weighted and ranked according to their abundance measuring their contribution to the differentiation of the classes of samples, and a criterion is provided to define a reduced set of most relevant clades. Applying the method to public datasets, we show that the data-driven definition of relevant phylogenetic clades accomplished by our ranking strategy identifies features in the samples that are lost if phylogenetic relationships are not considered, improving our ability to mine metagenomic datasets. Comparison with supervised classification methods currently used in metagenomic data analysis highlights the advantages of using phylogenetic information. PMID:25815895

  20. Explaining Diversity in Metagenomic Datasets by Phylogenetic-Based Feature Weighting

    PubMed Central

    Albanese, Davide; De Filippo, Carlotta; Cavalieri, Duccio; Donati, Claudio

    2015-01-01

    Metagenomics is revolutionizing our understanding of microbial communities, showing that their structure and composition have profound effects on the ecosystem and in a variety of health and disease conditions. Despite the flourishing of new analysis methods, current approaches based on statistical comparisons between high-level taxonomic classes often fail to identify the microbial taxa that are differentially distributed between sets of samples, since in many cases the taxonomic schema do not allow an adequate description of the structure of the microbiota. This constitutes a severe limitation to the use of metagenomic data in therapeutic and diagnostic applications. To provide a more robust statistical framework, we introduce a class of feature-weighting algorithms that discriminate the taxa responsible for the classification of metagenomic samples. The method unambiguously groups the relevant taxa into clades without relying on pre-defined taxonomic categories, thus including in the analysis also those sequences for which a taxonomic classification is difficult. The phylogenetic clades are weighted and ranked according to their abundance measuring their contribution to the differentiation of the classes of samples, and a criterion is provided to define a reduced set of most relevant clades. Applying the method to public datasets, we show that the data-driven definition of relevant phylogenetic clades accomplished by our ranking strategy identifies features in the samples that are lost if phylogenetic relationships are not considered, improving our ability to mine metagenomic datasets. Comparison with supervised classification methods currently used in metagenomic data analysis highlights the advantages of using phylogenetic information. PMID:25815895

  1. The logical basis of phylogenetic taxonomy.

    PubMed

    Sereno, Paul C

    2005-08-01

    Phylogenetic taxonomy, like modern Linnean taxonomy, was modeled on a phylogenetic tree rather than a cladogram and, like its predecessor, perpetuates the use of morphology as a means of recognizing clades. Both practices have generated confusion in graphical representation, operational terminology, and definitional rationale in phylogenetic taxonomy, the history of which is traced. The following points are made: (1) cladograms, rather than trees or hybrid cladogram-trees, provide the framework for the simplest graphical depiction of phylogenetic definitions; (2) a complete notational scheme for phylogenetic definitions is presented that distinguishes symbolic notation from shorthand and longhand versions; (3) phylogenetic definitions are composed of three components (paradigm, specifier, qualifier) arranged in two fundamental patterns, node and stem; (4) apomorphies do not constitute a fundamental definitional pattern but rather serve to qualify a stem-based definition (as do time and geographic range); (5) formulation of phylogenetic definitions involves three heuristic criteria (stability, simplicity, prior use); (6) reasoned definitional revision is encouraged and better defined (textual substitution, first- and second-order revision); and (7) a database, TaxonSearch, allows rapid recall of taxonomic and definitional information. PMID:16109704

  2. Discriminating the effects of phylogenetic hypothesis, tree resolution and clade age estimates on phylogenetic signal measurements.

    PubMed

    Seger, G D S; Duarte, L D S; Debastiani, V J; Kindel, A; Jarenkow, J A

    2013-09-01

    Understanding how species traits evolved over time is central to comprehending the assembly rules that govern the phylogenetic structure of communities. The measurement of phylogenetic signal (PS) in ecologically relevant traits is a first step to understand phylogenetically structured community patterns. The different methods available to estimate PS make it difficult to choose which is most appropriate. Furthermore, alternative phylogenetic tree hypotheses, node resolution and clade age estimates might influence PS measurements. In this study, we evaluated to what extent these parameters affect different methods of PS analysis, and discuss advantages and disadvantages when selecting which method to use. We measured fruit/seed traits and flowering/fruiting phenology of endozoochoric species occurring in Southern Brazilian Araucaria forests and evaluated their PS using Mantel regressions, phylogenetic eigenvector regressions (PVR) and K statistic. Mantel regressions always gave less significant results compared to PVR and K statistic in all combinations of phylogenetic trees constructed. Moreover, a better phylogenetic resolution affected PS, independently of the method used to estimate it. Morphological seed traits tended to show higher PS than diaspore traits, while PS in flowering/fruiting phenology depended mostly on the method used to estimate it. This study demonstrates that different PS estimates are obtained depending on the chosen method and the phylogenetic tree resolution. This finding has implications for inferences on phylogenetic niche conservatism or ecological processes determining phylogenetic community structure. PMID:23368095
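    Of the three methods compared in this abstract, the Mantel approach is the simplest to illustrate. The sketch below is a basic Mantel permutation test (Pearson correlation between the off-diagonal entries of a phylogenetic distance matrix and a trait distance matrix, with species labels permuted), not necessarily the exact procedure the authors ran; the matrices are invented and the function names are hypothetical.

```python
import random
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def mantel(phylo_dist, trait_dist, n_perm, rng):
    """Return (observed correlation, one-sided permutation p-value)."""
    n = len(phylo_dist)
    pairs = list(combinations(range(n), 2))
    x = [phylo_dist[i][j] for i, j in pairs]
    observed = pearson(x, [trait_dist[i][j] for i, j in pairs])
    hits = 0
    order = list(range(n))
    for _ in range(n_perm):
        # Permute species labels, i.e. rows and columns together.
        rng.shuffle(order)
        y = [trait_dist[order[i]][order[j]] for i, j in pairs]
        if pearson(x, y) >= observed:
            hits += 1
    # Count the observed arrangement itself as one of the permutations.
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Invented distances for four species forming two close pairs.
    phylo = [[0, 1, 4, 4], [1, 0, 4, 4], [4, 4, 0, 1], [4, 4, 1, 0]]
    trait = [[0, 1, 4, 4], [1, 0, 4, 4], [4, 4, 0, 1], [4, 4, 1, 0]]
    print(mantel(phylo, trait, 99, random.Random(1)))
```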

  3. Evolutionary models of phylogenetic trees.

    PubMed Central

    Pinelis, Iosif

    2003-01-01

    The most widely used evolutionary model for phylogenetic trees is the equal-rates Markov (ERM) model. A problem is that the ERM model predicts less imbalance than observed for trees inferred from real data; in fact, the observed imbalance tends to fall between the values predicted by the ERM model and those predicted by the proportional-to-distinguishable-arrangements (PDA) model. Here, a continuous multi-rate (MR) family of evolutionary models is presented which contains entire subfamilies corresponding to both the PDA and ERM models. Furthermore, this MR family covers an entire range from 'completely balanced' to 'completely unbalanced' models. In particular, the MR family contains other known evolutionary models. The MR family is very versatile and virtually free of assumptions on the character of evolution; yet it is highly susceptible to rigorous analyses. In particular, such analyses help to uncover adaptability, quasi-stabilization and prolonged stasis as major possible causes of the imbalance. However, the MR model is functionally simple and requires only three parameters to reproduce the observed imbalance. PMID:12965036
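
    To make the notion of tree imbalance concrete, here is a small sketch that grows trees under the ERM (Yule) process and scores them with the Colless index. The Colless index is used only as one common imbalance measure and is an assumption of this example; the abstract does not say which measure the paper uses, and the MR family itself is not implemented here.

```python
import random

def simulate_erm_tree(n_leaves, rng):
    """Grow a binary tree under the equal-rates Markov (Yule) model.

    Returns a dict mapping node id -> (left, right) for internal nodes
    and node id -> None for leaves; node 0 is the root.
    """
    children = {0: None}
    leaves = [0]
    next_id = 1
    while len(leaves) < n_leaves:
        # Every current leaf is equally likely to be the next one to split.
        node = leaves.pop(rng.randrange(len(leaves)))
        left, right = next_id, next_id + 1
        next_id += 2
        children[node] = (left, right)
        children[left] = children[right] = None
        leaves += [left, right]
    return children

def colless_index(children, root=0):
    """Sum over internal nodes of |leaves under left - leaves under right|."""
    total = 0

    def leaf_count(node):
        nonlocal total
        kids = children[node]
        if kids is None:
            return 1
        left, right = leaf_count(kids[0]), leaf_count(kids[1])
        total += abs(left - right)
        return left + right

    leaf_count(root)
    return total

if __name__ == "__main__":
    rng = random.Random(42)
    scores = [colless_index(simulate_erm_tree(32, rng)) for _ in range(200)]
    print("mean Colless index over 200 ERM trees:", sum(scores) / len(scores))
```

    A fully balanced tree scores 0 and a caterpillar tree with n leaves scores (n-1)(n-2)/2, so simulated ERM values can be placed on that scale.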

  4. Phylogenetic mapping of bacterial morphology

    NASA Technical Reports Server (NTRS)

    Siefert, J. L.; Fox, G. E.

    1998-01-01

    The availability of a meaningful molecular phylogeny for bacteria provides a context for examining the historical significance of various developments in bacterial evolution. Herein, the classical morphological descriptions of selected members of the domain Bacteria are mapped upon the genealogical ancestry deduced from comparison of small-subunit rRNA sequences. For the species examined in this study, a distinct pattern emerges which indicates that the coccus shape has arisen and accumulated independently multiple times in separate lineages and typically survived as a persistent end-state morphology. At least two other morphologies persist but have evolved only once. This study demonstrates that although bacterial morphology is not useful in defining bacterial phylogeny, it is remarkably consistent with that phylogeny once it is known. An examination of the experimental evidence available for morphogenesis as well as microbial fossil evidence corroborates these findings. It is proposed that the accumulation of persistent morphologies is a result of the biophysical properties of peptidoglycan and their genetic control, and that an evolved body-plan strategy based on peptidoglycan may have been a fate-sealing step in the evolution of Bacteria. More generally, this study illustrates that significant evolutionary insights can be obtained by examining biological and biochemical data in the context of a reliable phylogenetic structure.

  5. A method of alignment masking for refining the phylogenetic signal of multiple sequence alignments.

    PubMed

    Rajan, Vaibhav

    2013-03-01

    Inaccurate inference of positional homologies in multiple sequence alignments and systematic errors introduced by alignment heuristics obfuscate phylogenetic inference. Alignment masking, the elimination of phylogenetically uninformative or misleading sites from an alignment before phylogenetic analysis, is a common practice in phylogenetic analysis. Although masking is often done manually, automated methods are necessary to handle the much larger data sets being prepared today. In this study, we introduce the concept of subsplits and demonstrate their use in extracting phylogenetic signal from alignments. We design a clustering approach for alignment masking where each cluster contains similar columns, with similarity defined on the basis of compatible subsplits; our approach then identifies noisy clusters and eliminates them. Trees inferred from the columns in the retained clusters are found to be topologically closer to the reference trees. We test our method on numerous standard benchmarks (both synthetic and biological data sets) and compare its performance with other methods of alignment masking. We find that our method can eliminate sites more accurately than other methods, particularly on divergent data, and can improve the topologies of the inferred trees in likelihood-based analyses. Software available upon request from the author. PMID:23193120

  6. A parallel learning algorithm for text classification

    Microsoft Academic Search

    Canasai Kruengkrai; Chuleerat Jaruskulchai

    2002-01-01

    Text classification is the process of classifying documents into predefined categories based on their content. Existing supervised learning algorithms to automatically classify text need sufficient labeled documents to learn accurately. Applying the Expectation-Maximization (EM) algorithm to this problem is an alternative approach that utilizes a large pool of unlabeled documents to augment the available labeled documents. Unfortunately, the time needed

  7. Deterministic Model for Acute Myelogenous Leukemia Classification

    E-print Network

    Chronopoulos, Anthony T.

    -- Leukemia is a type of cancer that affects the blood and the bone marrow. Manual data analysis is time consuming and not accurate. Attempts to build partial/full automated systems based on segmentation and classification of cells are present in literature, but they are still in prototype stage. Most of the existing

  8. Trends and concepts in fern classification

    PubMed Central

    Christenhusz, Maarten J. M.; Chase, Mark W.

    2014-01-01

    Background and Aims Throughout the history of fern classification, familial and generic concepts have been highly labile. Many classifications and evolutionary schemes have been proposed during the last two centuries, reflecting different interpretations of the available evidence. Knowledge of fern structure and life histories has increased through time, providing more evidence on which to base ideas of possible relationships, and classification has changed accordingly. This paper reviews previous classifications of ferns and presents ideas on how to achieve a more stable consensus. Scope An historical overview is provided from the first to the most recent fern classifications, from which conclusions are drawn on past changes and future trends. The problematic concept of family in ferns is discussed, with a particular focus on how this has changed over time. The history of molecular studies and the most recent findings are also presented. Key Results Fern classification generally shows a trend from highly artificial, based on an interpretation of a few extrinsic characters, via natural classifications derived from a multitude of intrinsic characters, towards more evolutionary circumscriptions of groups that do not in general align well with the distribution of these previously used characters. It also shows a progression from a few broad family concepts to systems that recognized many more narrowly and highly controversially circumscribed families; currently, the number of families recognized is stabilizing somewhere between these extremes. Placement of many genera was uncertain until the arrival of molecular phylogenetics, which has rapidly been improving our understanding of fern relationships. As a collective category, the so-called ‘fern allies’ (e.g. Lycopodiales, Psilotaceae, Equisetaceae) were unsurprisingly found to be polyphyletic, and the term should be abandoned. 
Lycopodiaceae, Selaginellaceae and Isoëtaceae form a clade (the lycopods) that is sister to all other vascular plants, whereas the whisk ferns (Psilotaceae), often included in the lycopods or believed to be associated with the first vascular plants, are sister to Ophioglossaceae and thus belong to the fern clade. The horsetails (Equisetaceae) are also members of the fern clade (sometimes inappropriately called ‘monilophytes’), but, within that clade, their placement is still uncertain. Leptosporangiate ferns are better understood, although deep relationships within this group are still unresolved. Earlier, almost all leptosporangiate ferns were placed in a single family (Polypodiaceae or Dennstaedtiaceae), but these families have been redefined to narrower more natural entities. Conclusions Concluding this paper, a classification is presented based on our current understanding of relationships of fern and lycopod clades. Major changes in our understanding of these families are highlighted, illustrating issues of classification in relation to convergent evolution and false homologies. Problems with the current classification and groups that still need study are pointed out. A summary phylogenetic tree is also presented. A new classification in which Aspleniaceae, Cyatheaceae, Polypodiaceae and Schizaeaceae are expanded in comparison with the most recent classifications is presented, which is a modification of those proposed by Smith et al. (2006, 2008) and Christenhusz et al. (2011). These classifications are now finding a wider acceptance and use, and even though a few amendments are made based on recently published results from molecular analyses, we have aimed for a stable family and generic classification of ferns. PMID:24532607

  9. Distribution of phylogenetic diversity under random extinction

    E-print Network

    Beata Faller; Fabio Pardi; Mike Steel

    2007-08-02

    Phylogenetic diversity is a measure for describing how much of an evolutionary tree is spanned by a subset of species. If one applies this to the (unknown) subset of current species that will still be present at some future time, then this `future phylogenetic diversity' provides a measure of the impact of various extinction scenarios in biodiversity conservation. In this paper we study the distribution of future phylogenetic diversity under a simple model of extinction (a generalized `field of bullets' model). We show that the distribution of future phylogenetic diversity converges to a normal distribution as the number of species grows (under mild conditions, which are necessary). We also describe an algorithm to compute the distribution efficiently, provided the edge lengths are integral, and briefly outline the significance of our findings for biodiversity conservation.
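
    The expectation of future phylogenetic diversity under this kind of model follows from linearity: an edge contributes its length whenever at least one species below it survives. The sketch below computes only that expectation, not the full distribution studied in the paper. It assumes a rooted tree, independent survival probabilities per species, and a hypothetical dict-based tree encoding chosen for this example.

```python
def expected_future_pd(tree, survival, root="root"):
    """Expected future PD when species survive independently.

    tree: dict mapping each internal node to a list of (child, edge_length).
    survival: dict mapping each leaf name to its survival probability.
    An edge counts toward PD if at least one leaf below it survives.
    """
    expected = 0.0

    def prob_all_extinct(node):
        nonlocal expected
        if node not in tree:  # leaf
            return 1.0 - survival[node]
        prob = 1.0
        for child, length in tree[node]:
            q = prob_all_extinct(child)  # P(no survivor below this edge)
            expected += length * (1.0 - q)
            prob *= q
        return prob

    prob_all_extinct(root)
    return expected

if __name__ == "__main__":
    # Toy tree ((A:2, B:2):1, C:3) with made-up survival probabilities.
    tree = {"root": [("x", 1.0), ("C", 3.0)], "x": [("A", 2.0), ("B", 2.0)]}
    survival = {"A": 0.5, "B": 0.5, "C": 1.0}
    print(expected_future_pd(tree, survival))
```

    For the toy tree, the pendant edges of A and B each contribute 2 × 0.5, the internal edge contributes 1 × (1 − 0.25), and the edge to C contributes 3, giving 5.75.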

  10. Evolutionary Trees and phylogenetics: An algebraic perspective

    E-print Network

    Allman, Elizabeth S.

    . MBE (1988) 5:626-644. Gorilla AAGCTTCACCGGCGCAGTTGTTCTTATAATTGCCCACGGACTTACATCAT... Orangutan_Mac. Chimpanzee Human Gorilla Orangutan Gibbon Squirrel_Monkey Tarsier 0.1 Evolutionary trees and phylogenetics 6 sequences: Gorilla AAGCTTCACCGGCGCAGTTGTTCTTATAATTGCCCACGGACTTACATCAT... Orangutan

  11. Phylogenetics of Hydroidolina (Hydrozoa: Cnidaria)

    E-print Network

    Dunn, Casey

    Phylogenetics of Hydroidolina (Hydrozoa: Cnidaria). Paulyn Cartwright, Nathaniel M. Evans, Casey ... Cnidaria. Submitted 30 November 2007; accepted 12 May 2008. INTRODUCTION: Hydroidolina (=Leptolina

  12. Multiple Sequence Alignment Errors and Phylogenetic Reconstruction

    E-print Network

    Graur, Dan

    Multiple Sequence Alignment Errors and Phylogenetic Reconstruction. Thesis submitted for the degree ... and Ron Ophir. To them, my deepest gratitude. "Time flies like an arrow; fruit flies like a banana." - Groucho ... Alignment Reconstruction

  13. Pareto-optimal phylogenetic tree reconciliation

    E-print Network

    Libeskind-Hadas, Ran

    Motivation: Phylogenetic tree reconciliation is a widely used method for reconstructing the evolutionary histories of gene families and species, hosts and parasites and other dependent pairs of entities. Reconciliation is ...

  14. Phylogenetics and the origin of species

    PubMed Central

    Avise, John C.; Wollenberg, Kurt

    1997-01-01

    A recent criticism that the biological species concept (BSC) unduly neglects phylogeny is examined under a novel modification of coalescent theory that considers multiple, sex-defined genealogical pathways through sexual organismal pedigrees. A competing phylogenetic species concept (PSC) also is evaluated from this vantage. Two analytical approaches are employed to capture the composite phylogenetic information contained within the braided assemblages of hereditary pathways of a pedigree: (i) consensus phylogenetic trees across allelic transmission routes and (ii) composite phenograms from quantitative values of organismal coancestry. Outcomes from both approaches demonstrate that the supposed sharp distinction between biological and phylogenetic species concepts is illusory. Historical descent and reproductive ties are related aspects of phylogeny and jointly illuminate biotic discontinuity. PMID:9223259

  15. Classification of the acanthocephala.

    PubMed

    Amin, Omar M

    2013-09-01

    In 1985, Amin presented a new system for the classification of the Acanthocephala in Crompton and Nickol's (1985) book 'Biology of the Acanthocephala' and recognized the concepts of Meyer (1931, 1932, 1933) and Van Cleave (1936, 1941, 1947, 1948, 1949, 1951, 1952). This system became the standard for the taxonomy of this group and remains so to date. Many changes have taken place and many new genera and species, as well as higher taxa, have been described since. An updated version of the 1985 scheme incorporating new concepts in molecular taxonomy, gene sequencing and phylogenetic studies is presented. The hierarchy has undergone a total face lift with Amin's (1987) addition of a new class, Polyacanthocephala (and a new order and family) to remove inconsistencies in the class Palaeacanthocephala. Amin and Ha (2008) added a third order (and a new family) to the Palaeacanthocephala, Heteramorphida, which combines features from the palaeacanthocephalan families Polymorphidae and Heteracanthocephalidae. Other families and subfamilies have been added but some have been eliminated, e.g. the three subfamilies of Arythmacanthidae: Arhythmacanthinae Yamaguti, 1935; Neoacanthocephaloidinae Golvan, 1960; and Paracanthocephaloidinae Golvan, 1969. Amin (1985) listed 22 families, 122 genera and 903 species (4, 4 and 14 families; 13, 28 and 81 genera; 167, 167 and 569 species in Archiacanthocephala, Eoacanthocephala and Palaeacanthocephala, respectively). The number of taxa listed in the present treatment is 26 families (18% increase), 157 genera (29%), and 1298 species (44%) (4, 4 and 16; 18, 29 and 106; 189, 255 and 845, in the same order), which also includes 1 family, 1 genus and 4 species in the class Polyacanthocephala Amin, 1987, and 3 genera and 5 species in the fossil family Zhijinitidae. PMID:24261131

  16. A Phylogenetic analysis of the Southern Shift

    E-print Network

    Thomas, Erik Robert

    1989-01-01

    A PHYLOGENETIC ANALYSIS OF THE SOUTHERN SHIFT A Thesis by ERIK ROBERT THOMAS Submitted to the Office of Graduate Studies of Texas A&M University in partial fulfillment of the requirements for the degree of MASTER OF ARTS December 1989... Major Subject: English A PHYLOGENETIC ANALYSIS OF THE SOUTHERN SHIFT A Thesis by ERIK ROBERT THOMAS Approved as to style and content by: Guy Bailey (Chair of Committee) Barbara Johnstone (Member) Robert H. Benson (Member) J. Lawrence Mitchell...

  17. Terrain classification for a UGV

    NASA Astrophysics Data System (ADS)

    Sarwal, Alok; Baker, Chris; Rosenblum, Mark

    2005-05-01

    This work addresses the issue of Terrain Classification that can be applied for path planning for an Unmanned Ground Vehicle (UGV) platform. We are interested in classification of features such as rocks, bushes, trees and dirt roads. Currently, the data is acquired from a color camera mounted on the UGV, and range data from a second sensor can be added in the future. The classification is accomplished by first coarsely segmenting a frame and then refining the initial segmentations through a convenient user interface. After the first frame, temporal information is exploited to improve the quality of the image segmentation and help classification adapt to changes due to ambient lighting, shadows, and scene changes as the platform moves. The Mean Shift Classifier algorithm provides segmentation of the current frame data. We have tested the above algorithms with four sequences of frames acquired in an environment with terrain representative of the type we expect to see in the field. The results from this algorithm were compared with accurate manually segmented (ground-truth) data for each frame in the sequence.

  18. A statistical approach to root system classification

    PubMed Central

    Bodner, Gernot; Leitner, Daniel; Nakhforoosh, Alireza; Sobotik, Monika; Moder, Karl; Kaul, Hans-Peter

    2013-01-01

    Plant root systems have a key role in ecology and agronomy. Despite the rapid increase in root studies, there is still no classification that allows distinguishing among distinctive characteristics within the diversity of rooting strategies. Our hypothesis is that a multivariate approach for “plant functional type” identification in ecology can be applied to the classification of root systems. The classification method presented is based on a data-defined statistical procedure without a priori decision on the classifiers. The study demonstrates that principal component based rooting types provide efficient and meaningful multi-trait classifiers. The classification method is exemplified with simulated root architectures and morphological field data. Simulated root architectures showed that morphological attributes with spatial distribution parameters capture most distinctive features within root system diversity. While developmental type (tap vs. shoot-borne systems) is a strong, but coarse classifier, topological traits provide the most detailed differentiation among distinctive groups. Adequacy of commonly available morphologic traits for classification is supported by field data. Rooting types emerging from measured data were mainly distinguished as diameter/weight-dominated and density-dominated types. Similarity of root systems within distinctive groups was the joint result of phylogenetic relation and environmental as well as human selection pressure. We concluded that the data-defined classification is appropriate for integration of knowledge obtained with different root measurement methods and at various scales. Currently root morphology is the most promising basis for classification due to widely used common measurement protocols. To capture details of root diversity, efforts in architectural measurement techniques are essential. PMID:23914200

  19. How does cognition evolve? Phylogenetic comparative psychology

    PubMed Central

    Matthews, Luke J.; Hare, Brian A.; Nunn, Charles L.; Anderson, Rindy C.; Aureli, Filippo; Brannon, Elizabeth M.; Call, Josep; Drea, Christine M.; Emery, Nathan J.; Haun, Daniel B. M.; Herrmann, Esther; Jacobs, Lucia F.; Platt, Michael L.; Rosati, Alexandra G.; Sandel, Aaron A.; Schroepfer, Kara K.; Seed, Amanda M.; Tan, Jingzhi; van Schaik, Carel P.; Wobber, Victoria

    2014-01-01

    Now more than ever animal studies have the potential to test hypotheses regarding how cognition evolves. Comparative psychologists have developed new techniques to probe the cognitive mechanisms underlying animal behavior, and they have become increasingly skillful at adapting methodologies to test multiple species. Meanwhile, evolutionary biologists have generated quantitative approaches to investigate the phylogenetic distribution and function of phenotypic traits, including cognition. In particular, phylogenetic methods can quantitatively (1) test whether specific cognitive abilities are correlated with life history (e.g., lifespan), morphology (e.g., brain size), or socio-ecological variables (e.g., social system), (2) measure how strongly phylogenetic relatedness predicts the distribution of cognitive skills across species, and (3) estimate the ancestral state of a given cognitive trait using measures of cognitive performance from extant species. Phylogenetic methods can also be used to guide the selection of species comparisons that offer the strongest tests of a priori predictions of cognitive evolutionary hypotheses (i.e., phylogenetic targeting). Here, we explain how an integration of comparative psychology and evolutionary biology will answer a host of questions regarding the phylogenetic distribution and history of cognitive traits, as well as the evolutionary processes that drove their evolution. PMID:21927850

  20. Phylogenetic structure in tropical hummingbird communities

    PubMed Central

    Graham, Catherine H.; Parra, Juan L.; Rahbek, Carsten; McGuire, Jimmy A.

    2009-01-01

    How biotic interactions, current and historical environment, and biogeographic barriers determine community structure is a fundamental question in ecology and evolution, especially in diverse tropical regions. To evaluate patterns of local and regional diversity, we quantified the phylogenetic composition of 189 hummingbird communities in Ecuador. We assessed how species and phylogenetic composition changed along environmental gradients and across biogeographic barriers. We show that humid, low-elevation communities are phylogenetically overdispersed (coexistence of distant relatives), a pattern that is consistent with the idea that competition influences the local composition of hummingbirds. At higher elevations communities are phylogenetically clustered (coexistence of close relatives), consistent with the expectation of environmental filtering, which may result from the challenge of sustaining an expensive means of locomotion at high elevations. We found that communities in the lowlands on opposite sides of the Andes tend to be phylogenetically similar despite their large differences in species composition, a pattern implicating the Andes as an important dispersal barrier. In contrast, along the steep environmental gradient between the lowlands and the Andes we found evidence that species turnover is comprised of relatively distantly related species. The integration of local and regional patterns of diversity across environmental gradients and biogeographic barriers provides insight into the potential underlying mechanisms that have shaped community composition and phylogenetic diversity in one of the most species-rich, complex regions of the world. PMID:19805042

  1. Multilocus assessment of phylogenetic relationships in Alytes (Anura, Alytidae).

    PubMed

    Maia-Carvalho, Bruno; Gonçalves, Helena; Ferrand, Nuno; Martínez-Solano, Iñigo

    2014-10-01

    With the advent of large multilocus datasets, molecular systematics is experiencing very rapid progress, but important challenges remain regarding data analysis and interpretation. Midwife toads (genus Alytes) exemplify two of the most widespread problems for accurate phylogenetic reconstruction: discerning the causes of discordance between gene trees, and resolving short internodes produced during rapid, successive splitting events. The three species in subgenus Baleaphryne (A. maurus, A. dickhilleni and A. muletensis), the sister group to A. obstetricans, have disjunct and highly restricted geographical ranges, which are thought to result from old vicariant events affecting their common ancestor, but their phylogenetic relationships are still unresolved. In this study we re-address the phylogeny of Alytes with a special focus on the relationships in Baleaphryne with a multilocus dataset including >9000 base pairs of mitochondrial DNA and four nuclear markers (3142bp) in all recognized taxa, including all subspecies of A. obstetricans. Both concatenation and species tree analyses suggest that A. muletensis, endemic to the Balearic island of Mallorca, is the sister taxon to a clade comprising the southeastern Iberian endemic A. dickhilleni and the North African A. maurus. This scenario is consistent with palaeogeological evidence associated with the fragmentation of the Betic-Rifean Massif, followed by the opening of the Strait of Gibraltar. On the other hand, analyses of intraspecific variation in A. obstetricans are inconclusive regarding relationships between major clades and conflict with current subspecific taxonomy. PMID:24931729

  2. Indel reliability in indel-based phylogenetic inference.

    PubMed

    Ashkenazy, Haim; Cohen, Ofir; Pupko, Tal; Huchon, Dorothée

    2014-12-01

    It is often assumed that it is unlikely that the same insertion or deletion (indel) event occurred at the same position in two independent evolutionary lineages, and thus, indel-based inference of phylogeny should be less subject to homoplasy compared with standard inference which is based on substitution events. Indeed, indels were successfully used to solve debated evolutionary relationships among various taxonomical groups. However, indels are never directly observed but rather inferred from the alignment and thus indel-based inference may be sensitive to alignment errors. It is hypothesized that phylogenetic reconstruction would be more accurate if it relied only on a subset of reliable indels instead of the entire indel data. Here, we developed a method to quantify the reliability of indel characters by measuring how often they appear in a set of alternative multiple sequence alignments. Our approach is based on the assumption that indels that are consistently present in most alternative alignments are more reliable compared with indels that appear only in a small subset of these alignments. Using simulated and empirical data, we studied the impact of filtering and weighting indels by their reliability scores on the accuracy of indel-based phylogenetic reconstruction. The new method is available as a web-server at http://guidance.tau.ac.il/RELINDEL/. PMID:25409663
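
    The reliability idea described in this abstract, scoring an indel by how often it recurs across alternative alignments, can be sketched as follows. The encoding of an indel as a (start, end) tuple, the function names and the threshold are assumptions made for illustration; they do not reproduce the RELINDEL implementation.

```python
def indel_reliability(base_indels, alternative_alignments):
    """Score each indel by the fraction of alternative alignments containing it.

    base_indels: iterable of hashable indel characters from the base alignment.
    alternative_alignments: list of sets, one set of indels per alternative MSA.
    """
    n_alt = len(alternative_alignments)
    return {
        indel: sum(indel in alt for alt in alternative_alignments) / n_alt
        for indel in base_indels
    }

def filter_reliable(scores, threshold):
    """Keep only indels whose reliability score reaches the threshold."""
    return {indel for indel, score in scores.items() if score >= threshold}

if __name__ == "__main__":
    # Hypothetical indels encoded as (start_column, end_column).
    base = [(10, 12), (40, 40), (77, 81)]
    alternatives = [
        {(10, 12), (77, 81)},
        {(10, 12), (40, 40), (77, 81)},
        {(10, 12), (78, 81)},
        {(10, 12), (77, 81)},
    ]
    scores = indel_reliability(base, alternatives)
    print(scores)
    print(filter_reliable(scores, 0.75))
```

    The same scores could be used as character weights instead of a hard filter, which corresponds to the weighting variant mentioned in the abstract.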

  3. The Accuracy of Fast Phylogenetic Methods for Large Datasets Luay Nakhleh Bernard M.E. Moret Usman Roshan

    E-print Network

    Moret, Bernard

    phylogenetic reconstruction methods are designed to be used on biomolecular (i.e., DNA, RNA, or amino of genomes. In particular, sequence-based reconstruction will play an important role, especially in resolving that reconstruction methods must be both fast and robust as well as accurate. We study the accuracy, convergence rate

  4. A close phylogenetic relationship between Sipuncula and Annelida evidenced from the complete mitochondrial genome sequence of Phascolosoma esculenta

    Microsoft Academic Search

    Xin Shen; Xiaoyin Ma; Jianfeng Ren; Fangqing Zhao

    2009-01-01

    BACKGROUND: There are many advantages to the application of complete mitochondrial (mt) genomes in the accurate reconstruction of phylogenetic relationships in Metazoa. Although over one thousand metazoan genomes have been sequenced, the taxonomic sampling is highly biased, left with many phyla without a single representative of complete mitochondrial genome. Sipuncula (peanut worms or star worms) is a small taxon of

  5. Multisensor data fusion for supervised land-cover classification using Bayesian and geostatistical techniques

    Microsoft Academic Search

    No-Wook Park; Wooil M. Moon; Kwang-Hoon Chi; Byung-Doo Kwon

    2002-01-01

    We propose a geostatistical approach incorporated to the Bayesian data fusion technique for supervised classification of multi-sensor remote sensing data. The classification based only on the traditional spectral approach cannot preserve the accurate spatial information and can result in unrealistic classification results. To obtain accurate spatial/contextual information, the indicator kriging that allows one to estimate the probability of occurrence of

  6. Molecular phylogenetics in Hydra, a classical model in evolutionary developmental biology.

    PubMed

    Hemmrich, Georg; Anokhin, Boris; Zacharias, Helmut; Bosch, Thomas C G

    2007-07-01

    Among the earliest diverging animal phyla are the Cnidaria. Freshwater polyps of the genus Hydra (Cnidaria, Hydrozoa) have long been of general interest because different species of Hydra reveal fundamental principles that underlie development, differentiation, regeneration and also symbiosis. The phylogenetic relationships among the Hydra species most commonly used in current research are not resolved yet. Here we estimate the phylogenetic relations among eight scientifically important members of the genus Hydra with molecular data from two nuclear (18S rDNA, 28S rDNA) and two mitochondrial (16S rRNA, cytochrome oxidase subunit I (COI)) genes. The phylogenetic trees obtained by maximum parsimony (MP), maximum likelihood (ML) and Bayesian inference (BI) methods were generally compatible with present morphological classification patterns. However, the present analysis also bears on several long-standing questions about Hydra systematics and reveals some characteristics of the phylogenetic relationships of this genus that were unknown so far. It indicates that Hydra viridissima, the only species in Hydra, which contains symbiotic algae, might be considered as the sister group to all other species within this genus. Analyses of both nuclear and mitochondrial sequences support the view that Hydra oligactis and Hydra circumcincta are sisters to all other Hydra species. Unexpectedly, we also find that in contrast to its initial description, the strain used for making transgenic Hydra, Hydra vulgaris (strain AEP) is more closely related to Hydra carnea than to other species of Hydra. PMID:17174108

  7. pplacer: linear time maximum-likelihood and Bayesian phylogenetic placement of sequences onto a fixed reference tree

    PubMed Central

    2010-01-01

    Background Likelihood-based phylogenetic inference is generally considered to be the most reliable classification method for unknown sequences. However, traditional likelihood-based phylogenetic methods cannot be applied to large volumes of short reads from next-generation sequencing due to computational complexity issues and lack of phylogenetic signal. "Phylogenetic placement," where a reference tree is fixed and the unknown query sequences are placed onto the tree via a reference alignment, is a way to bring the inferential power offered by likelihood-based approaches to large data sets. Results This paper introduces pplacer, a software package for phylogenetic placement and subsequent visualization. The algorithm can place twenty thousand short reads on a reference tree of one thousand taxa per hour per processor, has essentially linear time and memory complexity in the number of reference taxa, and is easy to run in parallel. Pplacer features calculation of the posterior probability of a placement on an edge, which is a statistically rigorous way of quantifying uncertainty on an edge-by-edge basis. It also can inform the user of the positional uncertainty for query sequences by calculating expected distance between placement locations, which is crucial in the estimation of uncertainty with a well-sampled reference tree. The software provides visualizations using branch thickness and color to represent number of placements and their uncertainty. A simulation study using reads generated from 631 COG alignments shows a high level of accuracy for phylogenetic placement over a wide range of alignment diversity, and the power of edge uncertainty estimates to measure placement confidence. Conclusions Pplacer enables efficient phylogenetic placement and subsequent visualization, making likelihood-based phylogenetics methodology practical for large collections of reads; it is freely available as source code, binaries, and a web service. PMID:21034504
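
    A rough sketch of how per-edge placement uncertainty can be summarized is given below. It normalizes per-edge log-likelihoods into weights and computes an expected distance between placement locations from them. This is a simplification under stated assumptions: the input dictionaries, the function names and the pairwise distance table are hypothetical, and the posterior probability described in the abstract would additionally involve a prior and integration over branch lengths, which this example does not attempt.

```python
import math

def placement_weights(log_likelihoods):
    """Normalize per-edge log-likelihoods into weights that sum to 1."""
    top = max(log_likelihoods.values())
    # Subtract the maximum before exponentiating to avoid underflow.
    unnorm = {edge: math.exp(ll - top) for edge, ll in log_likelihoods.items()}
    total = sum(unnorm.values())
    return {edge: w / total for edge, w in unnorm.items()}

def expected_placement_distance(weights, distances):
    """Weighted average distance between two independently drawn placements.

    distances: dict keyed by frozenset({edge_a, edge_b}) giving the tree
    distance between the two placement locations (hypothetical input).
    """
    return sum(
        wa * wb * distances[frozenset((a, b))]
        for a, wa in weights.items()
        for b, wb in weights.items()
        if a != b
    )

if __name__ == "__main__":
    # Made-up log-likelihoods for one query on two candidate edges.
    logs = {"edge_1": -100.0, "edge_2": -100.0 - math.log(3)}
    w = placement_weights(logs)
    print(w)
    print(expected_placement_distance(w, {frozenset(("edge_1", "edge_2")): 2.0}))
```

    A query whose weight is spread over distant edges yields a large expected distance, while weight spread over neighboring edges yields a small one, which is why the distance summary is informative on a densely sampled reference tree.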

  8. A Bayesian Phylogenetic Method to Estimate Unknown Sequence Ages

    PubMed Central

    Shapiro, Beth; Drummond, Alexei J.; Suchard, Marc A.; Pybus, Oliver G.; Rambaut, Andrew

    2011-01-01

    Heterochronous data sets comprise molecular sequences sampled at different points in time. If the temporal range of the sampled sequences is large relative to the rate of mutation, the sampling times can directly calibrate evolutionary rates to calendar time. Here, we extend this calibration process to provide a full probabilistic method that utilizes temporal information in heterochronous data sets to estimate sampling times (leaf-ages) for sequences for which this information is unavailable. Our method is similar to relaxing the constraints of the molecular clock on specific lineages within a phylogenetic tree. Using a combination of synthetic and empirical data sets, we demonstrate that the method estimates leaf-ages reliably and accurately. Potential applications of our approach include incorporating samples of uncertain or radiocarbon-infinite age into ancient DNA analyses, evaluating the temporal signal in a particular sequence or data set, and exploring the reliability of sequence ages that are somehow contentious. PMID:20889726

  9. Improving Marginal Likelihood Estimation for Bayesian Phylogenetic Model Selection

    PubMed Central

    Xie, Wangang; Lewis, Paul O.; Fan, Yu; Kuo, Lynn; Chen, Ming-Hui

    2011-01-01

    The marginal likelihood is commonly used for comparing different evolutionary models in Bayesian phylogenetics and is the central quantity used in computing Bayes Factors for comparing model fit. A popular method for estimating marginal likelihoods, the harmonic mean (HM) method, can be easily computed from the output of a Markov chain Monte Carlo analysis but often greatly overestimates the marginal likelihood. The thermodynamic integration (TI) method is much more accurate than the HM method but requires more computation. In this paper, we introduce a new method, steppingstone sampling (SS), which uses importance sampling to estimate each ratio in a series (the “stepping stones”) bridging the posterior and prior distributions. We compare the performance of the SS approach to the TI and HM methods in simulation and using real data. We conclude that the greatly increased accuracy of the SS and TI methods argues for their use instead of the HM method, despite the extra computation needed. PMID:21187451
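
    The contrast between the harmonic mean and steppingstone estimators can be sketched on a toy problem. The example below is not a phylogenetic model: it assumes normally distributed data with unit variance and a normal prior on the mean, chosen because the marginal likelihood is then known in closed form and each power posterior can be sampled exactly (no MCMC is needed). The power schedule, taken from evenly spaced quantiles of a Beta(0.3, 1) distribution, the number of stones, and all function names are assumptions made for illustration.

```python
import math
import random


def log_mean_exp(values):
    """Numerically stable log(mean(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))


def make_toy_model(y, prior_sd):
    """Data y_i ~ N(theta, 1), prior theta ~ N(0, prior_sd**2)."""
    n, sy, syy = len(y), sum(y), sum(v * v for v in y)
    t2 = prior_sd ** 2

    def log_lik(theta):
        return (-0.5 * n * math.log(2 * math.pi)
                - 0.5 * (syy - 2 * theta * sy + n * theta * theta))

    def sample_power_posterior(beta, rng):
        # prior * likelihood**beta is again normal for this model
        prec = 1.0 / t2 + beta * n
        return rng.gauss(beta * sy / prec, 1.0 / math.sqrt(prec))

    # Closed-form log marginal likelihood, used only as a reference value
    exact = (-0.5 * n * math.log(2 * math.pi) - 0.5 * math.log(1 + n * t2)
             - 0.5 * (syy - t2 * sy * sy / (1 + n * t2)))
    return log_lik, sample_power_posterior, exact


def steppingstone(log_lik, sampler, n_stones, n_draws, rng):
    """Sum of log ratios, each estimated by importance sampling."""
    betas = [(k / n_stones) ** (1 / 0.3) for k in range(n_stones + 1)]
    total = 0.0
    for b_prev, b_next in zip(betas, betas[1:]):
        draws = [sampler(b_prev, rng) for _ in range(n_draws)]
        total += log_mean_exp([(b_next - b_prev) * log_lik(t) for t in draws])
    return total


def harmonic_mean(log_lik, sampler, n_draws, rng):
    """Harmonic mean of likelihoods over draws from the posterior."""
    draws = [sampler(1.0, rng) for _ in range(n_draws)]
    return -log_mean_exp([-log_lik(t) for t in draws])


rng = random.Random(1)
y = [rng.gauss(1.0, 1.0) for _ in range(20)]
log_lik, sampler, exact = make_toy_model(y, prior_sd=10.0)
ss = steppingstone(log_lik, sampler, n_stones=32, n_draws=2000, rng=rng)
hm = harmonic_mean(log_lik, sampler, n_draws=5000, rng=rng)
print(f"exact {exact:.3f}  steppingstone {ss:.3f}  harmonic mean {hm:.3f}")
```

    With a vague prior such as the one assumed here, the harmonic mean estimate is dominated by a few rare posterior draws and typically lands above the reference value, whereas the steppingstone estimate spreads the work over many small, well-behaved ratios.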

  10. Inferring ancient divergences requires genes with strong phylogenetic signals.

    PubMed

    Salichos, Leonidas; Rokas, Antonis

    2013-05-16

    To tackle incongruence, the topological conflict between different gene trees, phylogenomic studies couple concatenation with practices such as rogue taxon removal or the use of slowly evolving genes. Phylogenomic analysis of 1,070 orthologues from 23 yeast genomes identified 1,070 distinct gene trees, which were all incongruent with the phylogeny inferred from concatenation. Incongruence severity increased for shorter internodes located deeper in the phylogeny. Notably, whereas most practices had little or negative impact on the yeast phylogeny, the use of genes or internodes with high average internode support significantly improved the robustness of inference. We obtained similar results in analyses of vertebrate and metazoan phylogenomic data sets. These results question the exclusive reliance on concatenation and associated practices, and argue that selecting genes with strong phylogenetic signals and demonstrating the absence of significant incongruence are essential for accurately reconstructing ancient divergences. PMID:23657258
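
    One simple way to quantify the topological conflict described above is to count the bipartitions (splits) that two trees do not share, the Robinson-Foulds distance. The sketch below is an illustration only and is not the measure used in the study; it assumes each unrooted tree has already been reduced to its non-trivial splits, each written as the set of taxa on one side, and the helper names are hypothetical.

```python
def canonical_splits(splits, taxa):
    """Store every split as the side that does NOT contain a reference taxon,
    so the same split written from either side compares equal."""
    taxa = frozenset(taxa)
    ref = min(taxa)
    out = set()
    for side in splits:
        side = frozenset(side)
        if ref in side:
            side = taxa - side
        out.add(side)
    return out


def rf_distance(splits_a, splits_b, taxa):
    """Number of splits present in exactly one of the two trees."""
    return len(canonical_splits(splits_a, taxa) ^ canonical_splits(splits_b, taxa))


taxa = {"A", "B", "C", "D", "E"}
gene_tree_1 = [{"A", "B"}, {"D", "E"}]   # ((A,B),C,(D,E))
gene_tree_2 = [{"A", "C"}, {"D", "E"}]   # ((A,C),B,(D,E))
conflict = rf_distance(gene_tree_1, gene_tree_2, taxa)
```

    Two gene trees are congruent when this distance is zero; counting the distinct split sets across a collection of gene trees gives the number of distinct topologies.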

  11. Evaluating Support for the Current Classification of Eukaryotic Diversity

    PubMed Central

    Parfrey, Laura Wegener; Barbero, Erika; Lasser, Elyse; Dunthorn, Micah; Bhattacharya, Debashish; Patterson, David J; Katz, Laura A

    2006-01-01

    Perspectives on the classification of eukaryotic diversity have changed rapidly in recent years, as the four eukaryotic groups within the five-kingdom classification—plants, animals, fungi, and protists—have been transformed through numerous permutations into the current system of six “supergroups.” The intent of the supergroup classification system is to unite microbial and macroscopic eukaryotes based on phylogenetic inference. This supergroup approach is increasing in popularity in the literature and is appearing in introductory biology textbooks. We evaluate the stability and support for the current six-supergroup classification of eukaryotes based on molecular genealogies. We assess three aspects of each supergroup: (1) the stability of its taxonomy, (2) the support for monophyly (single evolutionary origin) in molecular analyses targeting a supergroup, and (3) the support for monophyly when a supergroup is included as an out-group in phylogenetic studies targeting other taxa. Our analysis demonstrates that supergroup taxonomies are unstable and that support for groups varies tremendously, indicating that the current classification scheme of eukaryotes is likely premature. We highlight several trends contributing to the instability and discuss the requirements for establishing robust clades within the eukaryotic tree of life. PMID:17194223

  12. Probabilistic Graphical Model Representation in Phylogenetics

    PubMed Central

    Höhna, Sebastian; Heath, Tracy A.; Boussau, Bastien; Landis, Michael J.; Ronquist, Fredrik; Huelsenbeck, John P.

    2014-01-01

    Recent years have seen a rapid expansion of the model space explored in statistical phylogenetics, emphasizing the need for new approaches to statistical model representation and software development. Clear communication and representation of the chosen model is crucial for: (i) reproducibility of an analysis, (ii) model development, and (iii) software design. Moreover, a unified, clear and understandable framework for model representation lowers the barrier for beginners and nonspecialists to grasp complex phylogenetic models, including their assumptions and parameter/variable dependencies. Graphical modeling is a unifying framework that has gained in popularity in the statistical literature in recent years. The core idea is to break complex models into conditionally independent distributions. The strength lies in the comprehensibility, flexibility, and adaptability of this formalism, and the large body of computational work based on it. Graphical models are well-suited to teach statistical models, to facilitate communication among phylogeneticists and in the development of generic software for simulation and statistical inference. Here, we provide an introduction to graphical models for phylogeneticists and extend the standard graphical model representation to the realm of phylogenetics. We introduce a new graphical model component, tree plates, to capture the changing structure of the subgraph corresponding to a phylogenetic tree. We describe a range of phylogenetic models using the graphical model framework and introduce modules to simplify the representation of standard components in large and complex models. Phylogenetic model graphs can be readily used in simulation, maximum likelihood inference, and Bayesian inference using, for example, Metropolis–Hastings or Gibbs sampling of the posterior distribution. [Computation; graphical models; inference; modularization; statistical phylogenetics; tree plate.] PMID:24951559

  13. Phylogenetics of the phlebotomine sand fly group Verrucarum (Diptera: Psychodidae: Lutzomyia).

    PubMed

    Cohnstaedt, Lee W; Beati, Lorenza; Caceres, Abraham G; Ferro, Cristina; Munstermann, Leonard E

    2011-06-01

    Within the sand fly genus Lutzomyia, the Verrucarum species group contains several of the principal vectors of American cutaneous leishmaniasis and human bartonellosis in the Andean region of South America. The group encompasses 40 species for which the taxonomic status, phylogenetic relationships, and role of each species in disease transmission remain unresolved. Mitochondrial cytochrome c oxidase I (COI) phylogenetic analysis of a 667-bp fragment supported the morphological classification of the Verrucarum group into series. Genetic sequences from seven species were grouped in well-supported monophyletic lineages. Four species, however, clustered in two paraphyletic lineages that indicate conspecificity--the Lutzomyia longiflocosa-Lutzomyia sauroida pair and the Lutzomyia quasitownsendi-Lutzomyia torvida pair. COI sequences were also evaluated as a taxonomic tool based on interspecific genetic variability within the Verrucarum group and the intraspecific variability of one of its members, Lutzomyia verrucarum, across its known distribution. PMID:21633028
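
    The comparison of interspecific and intraspecific variability mentioned above can be sketched with uncorrected pairwise distances (the proportion of differing sites). The sequences and species labels below are invented toy data, not COI sequences from the study, and the study may have used a different distance measure; the function names are hypothetical.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences,
    ignoring sites where either sequence has a gap or ambiguity code."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x in "ACGT" and y in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def mean_intra_inter(records):
    """records: list of (species_label, aligned_sequence).
    Returns (mean within-species distance, mean between-species distance)."""
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        (intra if sp1 == sp2 else inter).append(p_distance(s1, s2))
    return sum(intra) / len(intra), sum(inter) / len(inter)


toy = [
    ("species_A", "ACGTACGTAC"),
    ("species_A", "ACGTACGTAT"),
    ("species_B", "ACGAACCTAC"),
    ("species_B", "ACGAACCTAC"),
]
intra, inter = mean_intra_inter(toy)
```

    A marker is useful for species identification when between-species distances are clearly larger than within-species distances; species pairs whose between-species distance falls within the within-species range are candidates for conspecificity.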

  14. Teaching Molecular Phylogenetics through Investigating a Real-World Phylogenetic Problem

    ERIC Educational Resources Information Center

    Zhang, Xiaorong

    2012-01-01

    A phylogenetics exercise is incorporated into the "Introduction to biocomputing" course, a junior-level course at Savannah State University. This exercise is designed to help students learn important concepts and practical skills in molecular phylogenetics through solving a real-world problem. In this application, students are required to identify…

  15. Isaac Newton Institute for Mathematical Sciences, Cambridge, UK Phylogenetics: New data, new Phylogenetic challenges

    E-print Network

    Warnow, Tandy

    Isaac Newton Institute for Mathematical Sciences, Cambridge, UK Phylogenetics: New data, new molecular evolution. The workshop is taking place at the Isaac Newton Institute, where the highly Phylogenetic challenges Follow-up Meeting 20-24 JUNE 2011 in association with the Newton Institute programme

  16. Classification challenges in perfectionism.

    PubMed

    Rice, Kenneth G; Richardson, Clarissa M E

    2014-10-01

    High performance expectations are central to perfectionism, but because most participants endorse high standards, it becomes difficult for practitioners and researchers to accurately screen for perfectionists. We addressed problems linked to the measurement and classification of perfectionism by testing various strategies aimed at broadening the range and skew of scores on the Standards subscale from the Almost Perfect Scale-Revised (APS-R; Slaney, Mobley, Trippi, Ashby, & Johnson, 1996). Randomly assigned participants (N = 506) completed the APS-R following standard instructions or 1 of 2 variations, one prompting participants to consider their responses in light of a normal distribution of scores and another in which participants used a visual analog (slider) scale. The visual analog scale produced more differentiated scores, but range restrictions and skewed distributions remained for all 3 variations. Statistical transformations improved skew. Factor mixture modeling was conducted using transformed and nontransformed perfectionism scores along with criterion indicators of emotion regulation (reappraisal or suppression), perceived stress, and depression. Results supported a 3-class model, although more balanced distributions of classes emerged than were previously reported. Perfectionists were differentiated from nonperfectionists by their higher standards scores. Maladaptive perfectionists scored highest among the classes on most self-critical perfectionism indicators, suppression, perceived stress, and depression. Adaptive perfectionists had the lowest levels of perceived stress and depression and scored highest on reappraisal. Both perfectionist classes had generally comparable concerns about mistakes, but criterion indicators suggested those were more problematic for maladaptive perfectionists. Results supported the value of incorporating adaptive and maladaptive criterion indicators in classification models. PMID:25111705

  17. Threat Diversity Will Erode Mammalian Phylogenetic Diversity in the Near Future

    PubMed Central

    Jono, Clémentine M. A.; Pavoine, Sandrine

    2012-01-01

    To reduce the accelerating rate of phylogenetic diversity loss, many studies have searched for mechanisms that could explain why certain species are at risk, whereas others are not. In particular, it has been demonstrated that species might be affected by both extrinsic threat factors as well as intrinsic biological traits that could render a species more sensitive to extinction; here, we focus on extrinsic factors. Recently, the International Union for Conservation of Nature developed a new classification of threat types, including climate change, urbanization, pollution, agriculture and aquaculture, and harvesting/hunting. We have used this new classification to analyze two main factors that could explain the expected future loss of mammalian phylogenetic diversity: 1. differences in the type of threats that affect mammals and 2. differences in the number of major threats that accumulate for a single species. Our results showed that Cetartiodactyla, Diprotodontia, Monotremata, Perissodactyla, Primates, and Proboscidea could lose a high proportion of their current phylogenetic diversity in the coming decades. In contrast, Chiroptera, Didelphimorphia, and Rodentia could lose less phylogenetic diversity than expected if extinctions were random. Some mammalian clades, including Marsupiala, Chiroptera, and a subclade of Primates, are affected by particular threat types, most likely due solely to their geographic locations and associations with particular habitats. However, regardless of the geography, habitat, and taxon considered, it is not the threat type, but the threat diversity that determines the extinction risk for species and clades. Thus, some mammals might be randomly located in areas subjected to a large diversity of threats; they might also accumulate detrimental traits that render them sensitive to different threats, which is a characteristic that could be associated with large body size. Any action reducing threat diversity is expected to have a significant impact on future mammalian phylogeny. PMID:23029443
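
    The notion of an expected loss of phylogenetic diversity can be made concrete with a small calculation. The sketch below assumes that species go extinct independently with given probabilities, so that a branch is lost only if every species descending from it is lost; the tree, branch lengths, and probabilities are invented, and this is a generic formulation rather than the procedure used in the study.

```python
import math


def expected_pd_loss(branches, p_extinct):
    """Expected phylogenetic diversity lost under independent extinctions.

    branches:  list of (branch_length, set_of_descendant_species).
    p_extinct: dict mapping species to its extinction probability.
    """
    return sum(length * math.prod(p_extinct[sp] for sp in tips)
               for length, tips in branches)


# Toy tree ((A:1,B:1):2,C:3) written as branches with their descendant tips.
toy_branches = [
    (1.0, {"A"}),
    (1.0, {"B"}),
    (2.0, {"A", "B"}),
    (3.0, {"C"}),
]
toy_risk = {"A": 0.5, "B": 0.5, "C": 0.1}

total_pd = sum(length for length, _ in toy_branches)
loss = expected_pd_loss(toy_branches, toy_risk)
```

    Comparing such a value with the loss obtained when the same extinction probabilities are shuffled among species is one way to ask whether a clade stands to lose more or less diversity than expected under random extinction.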

  18. Molecular phylogenetics of subtribe Aeridinae (Orchidaceae): insights from plastid matK and nuclear ribosomal ITS sequences.

    PubMed

    Topik, Hidayat; Yukawa, Tomohisa; Ito, Motomi

    2005-08-01

    We conducted phylogenetic analyses using two DNA sequence data sets derived from matK, the maturase-coding gene located in an intron of the plastid gene trnK, and the internal transcribed spacer region of 18S-26S nuclear ribosomal DNA to examine relationships in subtribe Aeridinae (Orchidaceae). Specifically, we investigated (1) phylogenetic relationships among genera in the subtribe, (2) the congruence between previous classifications of the subtribe and the phylogenetic relationships inferred from the molecular data, and (3) evolutionary trends of taxonomically important characters of the subtribe, such as pollinia, a spurred lip, and a column foot. In all, 75 species representing 62 genera in subtribe Aeridinae were examined. Our analyses provided the following insights: (1) monophyly of subtribe Aeridinae was tentatively supported in which 14 subclades reflecting phylogenetic relationships can be recognized, (2) results are inconsistent with previous classifications of the subtribe, and (3) repeated evolution of previously emphasized characters such as pollinia number and apertures, length of spur, and column foot was confirmed. It was found that the inconsistencies are mainly caused by homoplasy of these characters. At the genus level, Phalaenopsis, Cleisostoma, and Sarcochilus are shown to be non-monophyletic. PMID:16025359

  19. Genomic Repeat Abundances Contain Phylogenetic Signal

    PubMed Central

    Dodsworth, Steven; Chase, Mark W.; Kelly, Laura J.; Leitch, Ilia J.; Macas, Jiří; Novák, Petr; Piednoël, Mathieu; Weiss-Schneeweiss, Hanna; Leitch, Andrew R.

    2015-01-01

    A large proportion of genomic information, particularly repetitive elements, is usually ignored when researchers are using next-generation sequencing. Here we demonstrate the usefulness of this repetitive fraction in phylogenetic analyses, utilizing comparative graph-based clustering of next-generation sequence reads, which results in abundance estimates of different classes of genomic repeats. Phylogenetic trees are then inferred based on the genome-wide abundance of different repeat types treated as continuously varying characters; such repeats are scattered across chromosomes and in angiosperms can constitute a majority of nuclear genomic DNA. In six diverse examples, five angiosperms and one insect, this method provides generally well-supported relationships at interspecific and intergeneric levels that agree with results from more standard phylogenetic analyses of commonly used markers. We propose that this methodology may prove especially useful in groups where there is little genetic differentiation in standard phylogenetic markers. At the same time as providing data for phylogenetic inference, this method additionally yields a wealth of data for comparative studies of genome evolution. PMID:25261464

  20. Genomic repeat abundances contain phylogenetic signal.

    PubMed

    Dodsworth, Steven; Chase, Mark W; Kelly, Laura J; Leitch, Ilia J; Macas, Jiří; Novák, Petr; Piednoël, Mathieu; Weiss-Schneeweiss, Hanna; Leitch, Andrew R

    2015-01-01

    A large proportion of genomic information, particularly repetitive elements, is usually ignored when researchers are using next-generation sequencing. Here we demonstrate the usefulness of this repetitive fraction in phylogenetic analyses, utilizing comparative graph-based clustering of next-generation sequence reads, which results in abundance estimates of different classes of genomic repeats. Phylogenetic trees are then inferred based on the genome-wide abundance of different repeat types treated as continuously varying characters; such repeats are scattered across chromosomes and in angiosperms can constitute a majority of nuclear genomic DNA. In six diverse examples, five angiosperms and one insect, this method provides generally well-supported relationships at interspecific and intergeneric levels that agree with results from more standard phylogenetic analyses of commonly used markers. We propose that this methodology may prove especially useful in groups where there is little genetic differentiation in standard phylogenetic markers. At the same time as providing data for phylogenetic inference, this method additionally yields a wealth of data for comparative studies of genome evolution. PMID:25261464

  1. Prioritizing Populations for Conservation Using Phylogenetic Networks

    PubMed Central

    Volkmann, Logan; Martyn, Iain; Moulton, Vincent; Spillner, Andreas; Mooers, Arne O.

    2014-01-01

    In the face of inevitable future losses to biodiversity, ranking species by conservation priority seems more than prudent. Setting conservation priorities within species (i.e., at the population level) may be critical as species ranges become fragmented and connectivity declines. However, existing approaches to prioritization (e.g., scoring organisms by their expected genetic contribution) are based on phylogenetic trees, which may be poor representations of differentiation below the species level. In this paper we extend evolutionary isolation indices used in conservation planning from phylogenetic trees to phylogenetic networks. Such networks better represent population differentiation, and our extension allows populations to be ranked in order of their expected contribution to the set. We illustrate the approach using data from two imperiled species: the spotted owl Strix occidentalis in North America and the mountain pygmy-possum Burramys parvus in Australia. Using previously published mitochondrial and microsatellite data, we construct phylogenetic networks and score each population by its relative genetic distinctiveness. In both cases, our phylogenetic networks capture the geographic structure of each species: geographically peripheral populations harbor less-redundant genetic information, increasing their conservation rankings. We note that our approach can be used with all conservation-relevant distances (e.g., those based on whole-genome, ecological, or adaptive variation) and suggest it be added to the assortment of tools available to wildlife managers for allocating effort among threatened populations. PMID:24586451

  2. Worldwide phylogenetic relationship of avian poxviruses

    USGS Publications Warehouse

    Gyuranecz, Miklós; Foster, Jeffrey T.; Dán, Ádám; Ip, Hon S.; Egstad, Kristina F.; Parker, Patricia G.; Higashiguchi, Jenni M.; Skinner, Michael A.; Höfle, Ursula; Kreizinger, Zsuzsa; Dorrestein, Gerry M.; Solt, Szabolcs; Sós, Endre; Kim, Young Jun; Uhart, Marcela; Pereda, Ariel; González-Hein, Gisela; Hidalgo, Hector; Blanco, Juan-Manuel; Erdélyi, Károly

    2013-01-01

    Poxvirus infections have been found in 230 species of wild and domestic birds worldwide in both terrestrial and marine environments. This ubiquity raises the question of how infection has been transmitted and globally dispersed. We present a comprehensive global phylogeny of 111 novel poxvirus isolates in addition to all available sequences from GenBank. Phylogenetic analysis of Avipoxvirus genus has traditionally relied on one gene region (4b core protein). In this study we have expanded the analyses to include a second locus (DNA polymerase gene), allowing for a more robust phylogenetic framework, finer genetic resolution within specific groups and the detection of potential recombination. Our phylogenetic results reveal several major features of avipoxvirus evolution and ecology and propose an updated avipoxvirus taxonomy, including three novel subclades. The characterization of poxviruses from 57 species of birds in this study extends the current knowledge of their host range and provides the first evidence of the phylogenetic effect of genetic recombination of avipoxviruses. The repeated occurrence of avian family or order-specific grouping within certain clades (e.g. starling poxvirus, falcon poxvirus, raptor poxvirus, etc.) indicates a marked role of host adaptation, while the sharing of poxvirus species within prey-predator systems emphasizes the capacity for cross-species infection and limited host adaptation. Our study provides a broad and comprehensive phylogenetic analysis of the Avipoxvirus genus, an ecologically and environmentally important viral group, to formulate a genome sequencing strategy that will clarify avipoxvirus taxonomy.

  3. Worldwide Phylogenetic Relationship of Avian Poxviruses

    PubMed Central

    Foster, Jeffrey T.; Dán, Ádám; Ip, Hon S.; Egstad, Kristina F.; Parker, Patricia G.; Higashiguchi, Jenni M.; Skinner, Michael A.; Höfle, Ursula; Kreizinger, Zsuzsa; Dorrestein, Gerry M.; Solt, Szabolcs; Sós, Endre; Kim, Young Jun; Uhart, Marcela; Pereda, Ariel; González-Hein, Gisela; Hidalgo, Hector; Blanco, Juan-Manuel; Erdélyi, Károly

    2013-01-01

    Poxvirus infections have been found in 230 species of wild and domestic birds worldwide in both terrestrial and marine environments. This ubiquity raises the question of how infection has been transmitted and globally dispersed. We present a comprehensive global phylogeny of 111 novel poxvirus isolates in addition to all available sequences from GenBank. Phylogenetic analysis of the Avipoxvirus genus has traditionally relied on one gene region (4b core protein). In this study we expanded the analyses to include a second locus (DNA polymerase gene), allowing for a more robust phylogenetic framework, finer genetic resolution within specific groups, and the detection of potential recombination. Our phylogenetic results reveal several major features of avipoxvirus evolution and ecology and propose an updated avipoxvirus taxonomy, including three novel subclades. The characterization of poxviruses from 57 species of birds in this study extends the current knowledge of their host range and provides the first evidence of the phylogenetic effect of genetic recombination of avipoxviruses. The repeated occurrence of avian family or order-specific grouping within certain clades (e.g., starling poxvirus, falcon poxvirus, raptor poxvirus, etc.) indicates a marked role of host adaptation, while the sharing of poxvirus species within prey-predator systems emphasizes the capacity for cross-species infection and limited host adaptation. Our study provides a broad and comprehensive phylogenetic analysis of the Avipoxvirus genus, an ecologically and environmentally important viral group, to formulate a genome sequencing strategy that will clarify avipoxvirus taxonomy. PMID:23408635

  4. The phylogenetic significance of colour patterns in marine teleost larvae

    PubMed Central

    Baldwin, Carole C

    2013-01-01

    Ichthyologists, natural-history artists, and tropical-fish aquarists have described, illustrated, or photographed colour patterns in adult marine fishes for centuries, but colour patterns in marine fish larvae have largely been neglected. Yet the pelagic larval stages of many marine fishes exhibit subtle to striking, ephemeral patterns of chromatophores that warrant investigation into their potential taxonomic and phylogenetic significance. Colour patterns in larvae of over 200 species of marine teleosts, primarily from the western Caribbean, were examined from digital colour photographs, and their potential utility in elucidating evolutionary relationships at various taxonomic levels was assessed. Larvae of relatively few basal marine teleosts exhibit erythrophores, xanthophores, or iridophores (i.e. nonmelanistic chromatophores), but one or more of those types of chromatophores are visible in larvae of many basal marine neoteleosts and nearly all marine percomorphs. Whether or not the presence of nonmelanistic chromatophores in pelagic marine larvae diagnoses any major teleost taxonomic group cannot be determined based on the preliminary survey conducted, but there is a trend toward increased colour from elopomorphs to percomorphs. Within percomorphs, patterns of nonmelanistic chromatophores may help resolve or contribute evidence to existing hypotheses of relationships at multiple levels of classification. Mugilid and some beloniform larvae share a unique ontogenetic transformation of colour pattern that lends support to the hypothesis of a close relationship between them. Larvae of some tetraodontiforms and lophiiforms are strikingly similar in having the trunk enclosed in an inflated sac covered with xanthophores, a character that may help resolve the relationships of these enigmatic taxa. Colour patterns in percomorph larvae also appear to diagnose certain groups at the interfamilial, familial, intergeneric, and generic levels. Slight differences in generic colour patterns, including whether the pattern comprises xanthophores or erythrophores, often distinguish species. The homology, ontogeny, and possible functional significance of colour patterns in larvae are discussed. Considerably more investigation of larval colour patterns in marine teleosts is needed to assess fully their value in phylogenetic reconstruction. PMID:24039297

  5. Time-Accurate Computational Simulation

    NASA Technical Reports Server (NTRS)

    Pao, S. Paul; Buning, Pieter G.

    2004-01-01

    Time-accurate CFD may offer a faster approach to S&C aerodynamic database population than conventional point-by-point steady-state CFD. We would directly simulate α-, β-sweeps or other configuration movements typical of measurement sequences in wind tunnels. A second objective is to demonstrate potential applications to the assessment of S&C dynamic derivatives by simulating vehicle motions such as free-to-roll, and nonlinearities such as the trends of aerodynamic forces near CL-max or flow hysteresis.

  6. Molecular phylogeny of Arcoidea with emphasis on Arcidae species (Bivalvia: Pteriomorphia) along the coast of China: Challenges to current classification of arcoids.

    PubMed

    Feng, Yanwei; Li, Qi; Kong, Lingfeng

    2015-04-01

    The current classifications of arcoids are based on phenetic similarity, which displays considerable convergence in several shell and anatomical characters, challenging phylogenetic analysis. Independent molecular analysis of DNA sequences is often necessary for accurate taxonomic assignments of arcoids, especially when morphological characters are equivocal. Here we present molecular evidence of the phylogenetic relationships among arcoid species based on Bayesian inference and Maximum Likelihood analyses of three nuclear genes (18S rRNA, 28S rRNA, and histone H3) and two mitochondrial genes (COI and 12S). Tree topologies are discussed by considering traditional arrangements of taxonomic units and previous molecular studies. The results confirm the monophyly of the order Arcoida, the family Noetiidae, and the subfamilies Anadarinae and Striarcinae, with support for the inclusion of the Glycymerididae in the Arcoidea. The subfamily Arcinae and the genera Arca, Barbatia, Scapharca, Anadara, and Glycymeris are non-monophyletic, suggesting that taxonomic issues still remain. The families Noetiidae, Cucullaeidae, and Glycymerididae appear as subgroups within, rather than sister groups to, the Arcidae. This study strongly suggests the need to carry out a taxonomic revision of the Arcoidea, especially the Arcidae, through combined analysis of morphological, paleontological, and molecular data. PMID:25721537

  7. Comparative genomic analysis and phylogenetic position of Theileria equi

    PubMed Central

    2012-01-01

    Background Transmission of arthropod-borne apicomplexan parasites that cause disease and result in death or persistent infection represents a major challenge to global human and animal health. First described in 1901 as Piroplasma equi, this re-emergent apicomplexan parasite was renamed Babesia equi and subsequently Theileria equi, reflecting an uncertain taxonomy. Understanding mechanisms by which apicomplexan parasites evade immune or chemotherapeutic elimination is required for development of effective vaccines or chemotherapeutics. The continued risk of transmission of T. equi from clinically silent, persistently infected equids impedes the goal of returning the U.S. to non-endemic status. Therefore, comparative genomic analysis of T. equi was undertaken to: 1) identify genes contributing to immune evasion and persistence in equid hosts, 2) identify genes involved in PBMC infection biology and 3) define the phylogenetic position of T. equi relative to sequenced apicomplexan parasites. Results The known immunodominant proteins EMA1, 2 and 3 were discovered to belong to a ten-member gene family with a mean amino acid identity, in pairwise comparisons, of 39%. Importantly, the amino acid diversity of EMAs is distributed throughout the length of the proteins. Eight of the EMA genes were simultaneously transcribed. As the agents that cause bovine theileriosis infect and transform host cell PBMCs, we confirmed that T. equi infects equine PBMCs; however, there is no evidence of host cell transformation. Indeed, a number of genes identified as potential manipulators of the host cell phenotype are absent from the T. equi genome. Phylogenetic positioning of T. equi relative to seven apicomplexan parasites, using deduced amino acid sequences from 150 genes, placed it as a sister taxon to Theileria spp. Conclusions The EMA family does not fit the paradigm for classical antigenic variation, and we propose a novel model describing the role of the EMA family in persistence. T. equi has lost the putative genes for host cell transformation, or the genes were acquired by T. parva and T. annulata after divergence from T. equi. Our analysis identified 50 genes that will be useful for definitive phylogenetic classification of T. equi and closely related organisms. PMID:23137308

  8. Avian vocalizations and phylogenetic signal

    PubMed Central

    McCracken, Kevin G.; Sheldon, Frederick H.

    1997-01-01

    The difficulty of separating genetic and ecological components of vocalizations has discouraged biologists from using vocal characters to reconstruct phylogenetic and ecological history. By considering the physics of vocalizations in terms of habitat structure, we predict which of five vocal characters of herons are most likely to be influenced by ecology and which by phylogeny, and test this prediction against a molecular-based phylogeny. The characters most subject to ecological convergence, and thus of least phylogenetic value, are first peak-energy frequency and frequency range, because sound penetration through vegetation depends largely on frequency. The most phylogenetically informative characters are number of syllables, syllable structure, and fundamental frequency, because these are more reflective of behavior and syringeal structure. Continued study of the physical principles that distinguish between potentially informative and convergent vocal characters and general patterns of homology in such characters should lead to wider use of vocalizations in the study of evolutionary history. PMID:9108064

  9. Definitions in phylogenetic taxonomy: critique and rationale.

    PubMed

    Sereno, P C

    1999-06-01

    A general rationale for the formulation and placement of taxonomic definitions in phylogenetic taxonomy is proposed, and commonly used terms such as "crown taxon" or "node-based definition" are more precisely defined. In the formulation of phylogenetic definitions, nested reference taxa stabilize taxonomic content. A definitional configuration termed a node-stem triplet also stabilizes the relationship between the trio of taxa at a branchpoint, in the face of local change in phylogenetic relationships or addition/deletion of taxa. Crown-total taxonomies use survivorship as a criterion for placement of node-stem triplets within a taxonomic hierarchy. Diversity, morphology, and tradition also constitute heuristic criteria for placement of node-stem triplets. PMID:12066711

  10. A comprehensive phylogenetic analysis of deadenylases.

    PubMed

    Pavlopoulou, Athanasia; Vlachakis, Dimitrios; Balatsos, Nikolaos A A; Kossida, Sophia

    2013-01-01

    Deadenylases catalyze the shortening of the poly(A) tail at the messenger ribonucleic acid (mRNA) 3'-end in eukaryotes. Therefore, these enzymes influence mRNA decay, and constitute a major emerging group of promising anti-cancer pharmacological targets. Herein, we conducted full phylogenetic analyses of the deadenylase homologs in all available genomes in an effort to investigate evolutionary relationships between the deadenylase families and to identify invariant residues, which probably play key roles in the function of deadenylation across species. Our study includes both major Asp-Glu-Asp-Asp (DEDD) and exonuclease-endonuclease-phosphatase (EEP) deadenylase superfamilies. The phylogenetic analysis has provided us with important information regarding conserved and invariant deadenylase amino acids across species. Knowledge of the phylogenetic properties and evolution of the domain of deadenylases provides the foundation for the targeted drug design in the pharmaceutical industry and modern exonuclease anti-cancer scientific research. PMID:24348009

  11. Construction of the Platform for Phylogenetic Analysis

    NASA Astrophysics Data System (ADS)

    Meng, Zhen; Lin, Xiaoguang; He, Xing; Gao, Yanping; Liu, Hongmei; Liu, Yong; Zhou, Yuanchun; Li, Jianhui; Chen, Zhiduan; Zhang, Shouzhou; Li, Yong

    Drawing on the history of efforts to build the tree of life from genetic and genomic information, and on effective strategies and methods for its construction, this paper carries out a business process analysis and application design. Based on this analysis, it implements a phylogenetic analysis platform for the land plants. The platform extracts molecular data in batch from international public databases, with automated acquisition and cleaning functions that help users understand the state of peer data. The process of phylogenetic reconstruction includes several public modes and tools, such as batch extraction, multiple sequence alignment, cleaning and editing, tree reconstruction, phylogeny evaluation and visualization. All these procedures require a number of interactive interfaces for automatic phylogenetic tree generation and for decision-making support in experiments.

  12. A Comprehensive Phylogenetic Analysis of Deadenylases

    PubMed Central

    Pavlopoulou, Athanasia; Vlachakis, Dimitrios; Balatsos, Nikolaos A.A.; Kossida, Sophia

    2013-01-01

    Deadenylases catalyze the shortening of the poly(A) tail at the messenger ribonucleic acid (mRNA) 3'-end in eukaryotes. Therefore, these enzymes influence mRNA decay, and constitute a major emerging group of promising anti-cancer pharmacological targets. Herein, we conducted full phylogenetic analyses of the deadenylase homologs in all available genomes in an effort to investigate evolutionary relationships between the deadenylase families and to identify invariant residues, which probably play key roles in the function of deadenylation across species. Our study includes both major Asp-Glu-Asp-Asp (DEDD) and exonuclease-endonuclease-phosphatase (EEP) deadenylase superfamilies. The phylogenetic analysis has provided us with important information regarding conserved and invariant deadenylase amino acids across species. Knowledge of the phylogenetic properties and evolution of the domain of deadenylases provides the foundation for the targeted drug design in the pharmaceutical industry and modern exonuclease anti-cancer scientific research. PMID:24348009

  13. A Novel Approach for Compressing Phylogenetic Trees

    NASA Astrophysics Data System (ADS)

    Matthews, Suzanne J.; Sul, Seung-Jin; Williams, Tiffani L.

    Phylogenetic trees are tree structures that depict relationships between organisms. Popular analysis techniques often produce large collections of candidate trees, which are expensive to store. We introduce TreeZip, a novel algorithm to compress phylogenetic trees based on their shared evolutionary relationships. We evaluate TreeZip's performance on fourteen tree collections ranging from 2,505 trees on 328 taxa to 150,000 trees on 525 taxa corresponding to 0.6 MB to 434 MB in storage. Our results show that TreeZip is very effective, typically compressing a tree file to less than 2% of its original size. When coupled with standard compression methods such as 7zip, TreeZip can compress a file to less than 1% of its original size. Our results strongly suggest that TreeZip is very effective at compressing phylogenetic trees, which allows for easier exchange of data with colleagues around the world.

  14. Molecular phylogenetics of the hummingbird genus Coeligena.

    PubMed

    Parra, Juan Luis; Remsen, J V; Alvarez-Rebolledo, Mauricio; McGuire, Jimmy A

    2009-11-01

    Advances in the understanding of biological radiations along tropical mountains depend on the knowledge of phylogenetic relationships among species. Here we present a species-level molecular phylogeny based on a multilocus dataset for the Andean hummingbird genus Coeligena. We compare this phylogeny to previous hypotheses of evolutionary relationships and use it as a framework to understand patterns in the evolution of sexual dichromatism and in the biogeography of speciation within the Andes. Previous phylogenetic hypotheses based mostly on similarities in coloration conflicted with our molecular phylogeny, emphasizing the unreliability of color characters for phylogenetic inference. Two major clades, one monochromatic and the other dichromatic, were found in Coeligena. Closely related species were either allopatric or parapatric on opposite mountain slopes. No sister lineages replaced each other along an elevational gradient. Our results indicate the importance of geographic isolation for speciation in this group and the potential interaction between isolation and sexual selection to promote diversification. PMID:19596453

  15. Ensemble sparse classification of Alzheimer's disease.

    PubMed

    Liu, Manhua; Zhang, Daoqiang; Shen, Dinggang

    2012-04-01

    High-dimensional pattern classification methods, e.g., support vector machines (SVM), have been widely investigated for analysis of structural and functional brain images (such as magnetic resonance imaging (MRI)) to assist the diagnosis of Alzheimer's disease (AD) including its prodromal stage, i.e., mild cognitive impairment (MCI). Most existing classification methods extract features from neuroimaging data and then construct a single classifier to perform classification. However, due to the noise and small sample size of neuroimaging data, it is challenging to train a single global classifier that is robust enough to achieve good classification performance. In this paper, instead of building a single global classifier, we propose a local patch-based subspace ensemble method which builds multiple individual classifiers based on different subsets of local patches and then combines them for more accurate and robust classification. Specifically, to capture the local spatial consistency, each brain image is partitioned into a number of local patches and a subset of patches is randomly selected from the patch pool to build a weak classifier. Here, the sparse representation-based classifier (SRC) method, which has been shown to be effective for classification of image data (e.g., face), is used to construct each weak classifier. Then, multiple weak classifiers are combined to make the final decision. We evaluate our method on 652 subjects (including 198 AD patients, 225 MCI and 229 normal controls) from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database using MR images. The experimental results show that our method achieves an accuracy of 90.8% and an area under the ROC curve (AUC) of 94.86% for AD classification and an accuracy of 87.85% and an AUC of 92.90% for MCI classification, respectively, demonstrating a very promising performance of our method compared with the state-of-the-art methods for AD/MCI classification using MR images. PMID:22270352

  16. Maximizing the phylogenetic diversity of seed banks.

    PubMed

    Griffiths, Kate E; Balding, Sharon T; Dickie, John B; Lewis, Gwilym P; Pearce, Tim R; Grenyer, Richard

    2015-04-01

    Ex situ conservation efforts such as those of zoos, botanical gardens, and seed banks will form a vital complement to in situ conservation actions over the coming decades. It is therefore necessary to pay the same attention to the biological diversity represented in ex situ conservation facilities as is often paid to protected-area networks. Building the phylogenetic diversity of ex situ collections will strengthen our capacity to respond to biodiversity loss. Since 2000, the Millennium Seed Bank Partnership has banked seed from 14% of the world's plant species. We assessed the taxonomic, geographic, and phylogenetic diversity of the Millennium Seed Bank collection of legumes (Leguminosae). We compared the collection with all known legume genera, their known geographic range (at country and regional levels), and a genus-level phylogeny of the legume family constructed for this study. Over half the phylogenetic diversity of legumes at the genus level was represented in the Millennium Seed Bank. However, pragmatic prioritization of species of economic importance and endangerment has led to the banking of a less-than-optimal phylogenetic diversity and prioritization of range-restricted species risks an underdispersed collection. The current state of the phylogenetic diversity of legumes in the Millennium Seed Bank could be substantially improved through the strategic banking of relatively few additional taxa. Our method draws on tools that are widely applied to in situ conservation planning, and it can be used to evaluate and improve the phylogenetic diversity of ex situ collections. PMID:25196170

  17. Cirrhosis classification based on texture classification of random features.

    PubMed

    Liu, Hui; Shao, Ying; Guo, Dongmei; Zheng, Yuanjie; Zhao, Zuowei; Qiu, Tianshuang

    2014-01-01

    Accurate staging of hepatic cirrhosis is important in investigating the cause and slowing down the effects of cirrhosis. Computer-aided diagnosis (CAD) can provide doctors with an alternative second opinion and help them choose a specific treatment based on an accurate cirrhosis stage. MRI has many advantages, including high resolution for soft tissue, no radiation, and multiparameter imaging modalities. So in this paper, multisequence MRIs, including T1-weighted, T2-weighted, arterial, portal venous, and equilibrium phase, are applied. However, CAD does not meet the clinical needs of cirrhosis and few researchers are concerned with it at present. Cirrhosis is characterized by the presence of widespread fibrosis and regenerative nodules in the liver, leading to different texture patterns at different stages. So, extracting texture features is the primary task. Compared with typical gray level cooccurrence matrix (GLCM) features, texture classification from random features provides an effective alternative, and we adopt it and propose CCTCRF for triple classification (normal, early, and middle and advanced stage). CCTCRF needs no strong assumptions beyond the sparse character of the image, captures sufficient texture information, involves a concise and effective process, and makes case decisions with high accuracy. Experimental results illustrate its satisfactory performance and are compared with those of a typical NN with GLCM. PMID:24707317

  18. Cirrhosis Classification Based on Texture Classification of Random Features

    PubMed Central

    Shao, Ying; Guo, Dongmei; Zheng, Yuanjie; Zhao, Zuowei; Qiu, Tianshuang

    2014-01-01

    Accurate staging of hepatic cirrhosis is important in investigating the cause and slowing down the effects of cirrhosis. Computer-aided diagnosis (CAD) can provide doctors with an alternative second opinion and help them choose a specific treatment based on an accurate cirrhosis stage. MRI has many advantages, including high resolution for soft tissue, no radiation, and multiparameter imaging modalities. So in this paper, multisequence MRIs, including T1-weighted, T2-weighted, arterial, portal venous, and equilibrium phase, are applied. However, CAD does not meet the clinical needs of cirrhosis and few researchers are concerned with it at present. Cirrhosis is characterized by the presence of widespread fibrosis and regenerative nodules in the liver, leading to different texture patterns at different stages. So, extracting texture features is the primary task. Compared with typical gray level cooccurrence matrix (GLCM) features, texture classification from random features provides an effective alternative, and we adopt it and propose CCTCRF for triple classification (normal, early, and middle and advanced stage). CCTCRF needs no strong assumptions beyond the sparse character of the image, captures sufficient texture information, involves a concise and effective process, and makes case decisions with high accuracy. Experimental results illustrate its satisfactory performance and are compared with those of a typical NN with GLCM. PMID:24707317

  19. A Phylogenetic Analysis of the Brassicales Clade Based on an Alignment-Free Sequence Comparison Method

    PubMed Central

    Hatje, Klas; Kollmar, Martin

    2012-01-01

    Phylogenetic analyses reveal the evolutionary derivation of species. A phylogenetic tree can be inferred from multiple sequence alignments of proteins or genes. The alignment of whole genome sequences of higher eukaryotes is a computationally intensive and ambitious task, as is the computation of phylogenetic trees based on these alignments. To overcome these limitations, we here used an alignment-free method to compare genomes of the Brassicales clade. For each nucleotide sequence a Chaos Game Representation (CGR) can be computed, which represents each nucleotide of the sequence as a point in a square defined by the four nucleotides as vertices. Each CGR is therefore a unique fingerprint of the underlying sequence. If the CGRs are divided by grid lines, each grid square denotes the occurrence of oligonucleotides of a specific length in the sequence (Frequency Chaos Game Representation, FCGR). Here, we used distance measures between FCGRs to infer phylogenetic trees of Brassicales species. Three types of data were analyzed because of their different characteristics: (A) Whole genome assemblies as far as available for species belonging to the Malvidae taxon. (B) EST data of species of the Brassicales clade. (C) Mitochondrial genomes of the Rosids branch, a supergroup of the Malvidae. The trees reconstructed based on the Euclidean distance method are in general agreement with single gene trees. The Fitch–Margoliash and neighbor-joining algorithms resulted in similar to identical trees. Here, for the first time we have applied the bootstrap re-sampling concept to trees based on FCGRs to determine the support of the branchings. FCGRs have the advantage that they are fast to calculate, and can be used as additional information to alignment-based data and morphological characteristics to improve the phylogenetic classification of species in ambiguous cases. PMID:22952468
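
    The FCGR idea described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the corner layout in CORNER_BITS, the function names, and the choice to normalize counts before taking the Euclidean distance are all assumptions made for illustration. The grid cell of each k-mer is computed with integer arithmetic, which should correspond to binning the CGR point on a 2^k x 2^k grid.

```python
import math

# Assumed corner layout (one common convention, not taken from the abstract):
# each nucleotide maps to the (x, y) bits of its corner of the unit square.
CORNER_BITS = {"A": (0, 0), "C": (0, 1), "G": (1, 1), "T": (1, 0)}


def fcgr(seq, k):
    """Count k-mers on a 2^k x 2^k grid, returned flattened in row-major order.

    The cell of a k-mer is the grid square its CGR point falls into: the most
    recent nucleotide supplies the most significant bit of each coordinate.
    """
    n = 2 ** k
    counts = [0] * (n * n)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if any(base not in CORNER_BITS for base in kmer):
            continue  # skip k-mers containing ambiguous bases such as N
        col = row = 0
        for base in reversed(kmer):  # last base first = most significant bit
            bit_x, bit_y = CORNER_BITS[base]
            col = (col << 1) | bit_x
            row = (row << 1) | bit_y
        counts[row * n + col] += 1
    return counts


def normalize(counts):
    """Turn raw counts into frequencies so sequences of unequal length compare."""
    total = sum(counts)
    return [c / total for c in counts] if total else list(counts)


def fcgr_distance(seq_a, seq_b, k=3):
    """Euclidean distance between the normalized FCGRs of two sequences."""
    a = normalize(fcgr(seq_a, k))
    b = normalize(fcgr(seq_b, k))
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

    A pairwise matrix of such distances could then be passed to a distance-based tree method such as neighbor joining; that step is omitted here.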

  20. Understanding phylogenetic incongruence: lessons from phyllostomid bats

    PubMed Central

    Dávalos, Liliana M; Cirranello, Andrea L; Geisler, Jonathan H; Simmons, Nancy B

    2012-01-01

    All characters and trait systems in an organism share a common evolutionary history that can be estimated using phylogenetic methods. However, differential rates of change and the evolutionary mechanisms driving those rates result in pervasive phylogenetic conflict. These drivers need to be uncovered because mismatches between evolutionary processes and phylogenetic models can lead to high confidence in incorrect hypotheses. Incongruence between phylogenies derived from morphological versus molecular analyses, and between trees based on different subsets of molecular sequences has become pervasive as datasets have expanded rapidly in both characters and species. For more than a decade, evolutionary relationships among members of the New World bat family Phyllostomidae inferred from morphological and molecular data have been in conflict. Here, we develop and apply methods to minimize systematic biases, uncover the biological mechanisms underlying phylogenetic conflict, and outline data requirements for future phylogenomic and morphological data collection. We introduce new morphological data for phyllostomids and outgroups and expand previous molecular analyses to eliminate methodological sources of phylogenetic conflict such as taxonomic sampling, sparse character sampling, or use of different algorithms to estimate the phylogeny. We also evaluate the impact of biological sources of conflict: saturation in morphological changes and molecular substitutions, and other processes that result in incongruent trees, including convergent morphological and molecular evolution. Methodological sources of incongruence play some role in generating phylogenetic conflict, and are relatively easy to eliminate by matching taxa, collecting more characters, and applying the same algorithms to optimize phylogeny. 
The evolutionary patterns uncovered are consistent with multiple biological sources of conflict, including saturation in morphological and molecular changes, adaptive morphological convergence among nectar-feeding lineages, and incongruent gene trees. Applying methods to account for nucleotide sequence saturation reduces, but does not completely eliminate, phylogenetic conflict. We ruled out paralogy, lateral gene transfer, and poor taxon sampling and outgroup choices among the processes leading to incongruent gene trees in phyllostomid bats. Uncovering and countering the possible effects of introgression and lineage sorting of ancestral polymorphism on gene trees will require great leaps in genomic and allelic sequencing in this species-rich mammalian family. We also found evidence for adaptive molecular evolution leading to convergence in mitochondrial proteins among nectar-feeding lineages. In conclusion, the biological processes that generate phylogenetic conflict are ubiquitous, and overcoming incongruence requires better models and more data than have been collected even in well-studied organisms such as phyllostomid bats. PMID:22891620

  1. Understanding phylogenetic incongruence: lessons from phyllostomid bats.

    PubMed

    Dávalos, Liliana M; Cirranello, Andrea L; Geisler, Jonathan H; Simmons, Nancy B

    2012-11-01

    All characters and trait systems in an organism share a common evolutionary history that can be estimated using phylogenetic methods. However, differential rates of change and the evolutionary mechanisms driving those rates result in pervasive phylogenetic conflict. These drivers need to be uncovered because mismatches between evolutionary processes and phylogenetic models can lead to high confidence in incorrect hypotheses. Incongruence between phylogenies derived from morphological versus molecular analyses, and between trees based on different subsets of molecular sequences has become pervasive as datasets have expanded rapidly in both characters and species. For more than a decade, evolutionary relationships among members of the New World bat family Phyllostomidae inferred from morphological and molecular data have been in conflict. Here, we develop and apply methods to minimize systematic biases, uncover the biological mechanisms underlying phylogenetic conflict, and outline data requirements for future phylogenomic and morphological data collection. We introduce new morphological data for phyllostomids and outgroups and expand previous molecular analyses to eliminate methodological sources of phylogenetic conflict such as taxonomic sampling, sparse character sampling, or use of different algorithms to estimate the phylogeny. We also evaluate the impact of biological sources of conflict: saturation in morphological changes and molecular substitutions, and other processes that result in incongruent trees, including convergent morphological and molecular evolution. Methodological sources of incongruence play some role in generating phylogenetic conflict, and are relatively easy to eliminate by matching taxa, collecting more characters, and applying the same algorithms to optimize phylogeny. 
The evolutionary patterns uncovered are consistent with multiple biological sources of conflict, including saturation in morphological and molecular changes, adaptive morphological convergence among nectar-feeding lineages, and incongruent gene trees. Applying methods to account for nucleotide sequence saturation reduces, but does not completely eliminate, phylogenetic conflict. We ruled out paralogy, lateral gene transfer, and poor taxon sampling and outgroup choices among the processes leading to incongruent gene trees in phyllostomid bats. Uncovering and countering the possible effects of introgression and lineage sorting of ancestral polymorphism on gene trees will require great leaps in genomic and allelic sequencing in this species-rich mammalian family. We also found evidence for adaptive molecular evolution leading to convergence in mitochondrial proteins among nectar-feeding lineages. In conclusion, the biological processes that generate phylogenetic conflict are ubiquitous, and overcoming incongruence requires better models and more data than have been collected even in well-studied organisms such as phyllostomid bats. PMID:22891620

  2. Jumping Emerging Substrings in Image Classification

    NASA Astrophysics Data System (ADS)

    Kobyliński, Łukasz; Walczak, Krzysztof

    We propose a new image classification scheme based on the idea of mining jumping emerging substrings between classes of images represented by visual features. Jumping emerging substrings (JES) are string patterns, which occur frequently in one set of string data and are absent in another. By representing images in symbolic manner, according to their color and texture characteristics, we enable mining of JESs in sets of visual data and use mined patterns to create efficient and accurate classifiers. In this paper we describe our approach to image representation and provide experimental results of JES-based classification of well-known image datasets.
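
    The definition of a jumping emerging substring given above can be made concrete with a small brute-force Python sketch. It is an illustration only, not the mining algorithm of the paper: the function names, the min_support and max_len parameters, and the toy strings are hypothetical, and a practical miner would avoid enumerating every substring.

```python
from collections import Counter


def substrings(s, max_len):
    """Return the set of distinct substrings of s with length 1..max_len."""
    return {
        s[i:j]
        for i in range(len(s))
        for j in range(i + 1, min(len(s), i + max_len) + 1)
    }


def jumping_emerging_substrings(target, background, min_support=0.5, max_len=4):
    """Substrings found in at least min_support of the target strings
    and in none of the background strings (hypothetical helper)."""
    if not target:
        return set()
    support = Counter()
    for s in target:
        support.update(substrings(s, max_len))  # each string counts once
    seen_in_background = set()
    for s in background:
        seen_in_background |= substrings(s, max_len)
    return {
        pattern
        for pattern, count in support.items()
        if count / len(target) >= min_support
        and pattern not in seen_in_background
    }
```

    With the toy classes ["abcd", "xbcy"] and ["abx", "cdy"] and min_support=1.0, the only pattern returned is "bc": it occurs in every string of the first class and in none of the second. Patterns of this kind could then serve as evidence for class membership in a classifier.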

  3. BIMLR: a method for constructing rooted phylogenetic networks from rooted phylogenetic trees.

    PubMed

    Wang, Juan; Guo, Maozu; Xing, Linlin; Che, Kai; Liu, Xiaoyan; Wang, Chunyu

    2013-09-15

    Rooted phylogenetic trees constructed from different datasets (e.g. from different genes) are often conflicting with one another, i.e. they cannot be integrated into a single phylogenetic tree. Phylogenetic networks have become an important tool in molecular evolution, and rooted phylogenetic networks are able to represent conflicting rooted phylogenetic trees. Hence, the development of appropriate methods to compute rooted phylogenetic networks from rooted phylogenetic trees has attracted considerable research interest of late. The CASS algorithm proposed by van Iersel et al. is able to construct much simpler networks than other available methods, but it is extremely slow, and the networks it constructs are dependent on the order of the input data. Here, we introduce an improved CASS algorithm, BIMLR. We show that BIMLR is faster than CASS and less dependent on the input data order. Moreover, BIMLR is able to construct much simpler networks than almost all other methods. BIMLR is available at http://nclab.hit.edu.cn/wangjuan/BIMLR/. PMID:23816409

  4. Phylogenetic analyses of Vitis (Vitaceae) based on complete chloroplast genome sequences: effects of taxon sampling and phylogenetic methods on resolving relationships among rosids

    PubMed Central

    Jansen, Robert K; Kaittanis, Charalambos; Saski, Christopher; Lee, Seung-Bum; Tomkins, Jeffrey; Alverson, Andrew J; Daniell, Henry

    2006-01-01

    Background The Vitaceae (grape) is an economically important family of angiosperms whose phylogenetic placement is currently unresolved. Recent phylogenetic analyses based on one to several genes have suggested several alternative placements of this family, including sister to Caryophyllales, asterids, Saxifragales, Dilleniaceae or to the rest of rosids, though support for these different results has been weak. There has been a recent interest in using complete chloroplast genome sequences for resolving phylogenetic relationships among angiosperms. These studies have clarified relationships among several major lineages but they have also emphasized the importance of taxon sampling and the effects of different phylogenetic methods for obtaining accurate phylogenies. We sequenced the complete chloroplast genome of Vitis vinifera and used these data to assess relationships among 27 angiosperms, including nine taxa of rosids. Results The Vitis vinifera chloroplast genome is 160,928 bp in length, including a pair of inverted repeats of 26,358 bp that are separated by small and large single copy regions of 19,065 bp and 89,147 bp, respectively. The gene content and order of Vitis is identical to many other unrearranged angiosperm chloroplast genomes, including tobacco. Phylogenetic analyses using maximum parsimony and maximum likelihood were performed on DNA sequences of 61 protein-coding genes for two datasets with 28 or 29 taxa, including eight or nine taxa from four of the seven currently recognized major clades of rosids. Parsimony and likelihood phylogenies of both data sets provide strong support for the placement of Vitaceae as sister to the remaining rosids. However, the position of the Myrtales and support for the monophyly of the eurosid I clade differ between the two data sets and the two methods of analysis. In parsimony analyses, the inclusion of Gossypium is necessary to obtain trees that support the monophyly of the eurosid I clade. 
However, maximum likelihood analyses place Cucumis as sister to the Myrtales and therefore do not support the monophyly of the eurosid I clade. Conclusion Phylogenies based on DNA sequences from complete chloroplast genome sequences provide strong support for the position of the Vitaceae as the earliest diverging lineage of rosids. Our phylogenetic analyses support recent assertions that inadequate taxon sampling and incorrect model specification for concatenated multi-gene data sets can mislead phylogenetic inferences when using whole chloroplast genomes for phylogeny reconstruction. PMID:16603088

  5. Automatic classification of blank substrate defects

    NASA Astrophysics Data System (ADS)

    Boettiger, Tom; Buck, Peter; Paninjath, Sankaranarayanan; Pereira, Mark; Ronald, Rob; Rost, Dan; Samir, Bhamidipati

    2014-10-01

    Mask preparation stages are crucial in mask manufacturing, since the mask later acts as a template for a considerable number of dies on the wafer. Defects on the initial blank substrate, and subsequent cleaned and coated substrates, can have a profound impact on the usability of the finished mask. This emphasizes the need for early and accurate identification of blank substrate defects and the risk they pose to the patterned reticle. While Automatic Defect Classification (ADC) is a well-developed technology for inspection and analysis of defects on patterned wafers and masks in the semiconductor industry, ADC for mask blanks is still in the early stages of adoption and development. Calibre ADC is a powerful analysis tool for fast, accurate, consistent and automatic classification of defects on mask blanks. Accurate, automated classification of mask blanks leads to better usability of blanks by enabling defect avoidance technologies during mask writing. Detailed information on blank defects can help to select appropriate job-decks to be written on the mask by defect avoidance tools [1][4][5]. Smart algorithms separate critical defects from the potentially large number of non-critical defects or false defects detected at various stages during mask blank preparation. Mechanisms used by Calibre ADC to identify and characterize defects include defect location and size, signal polarity (dark, bright) in both transmitted and reflected review images, and distinguishing defect signals from background noise in defect images. The Calibre ADC engine then uses a decision tree to translate this information into a defect classification code. Using this automated process improves classification accuracy, repeatability and speed, while avoiding the subjectivity of human judgment compared to the alternative of manual defect classification by trained personnel [2]. 
This paper focuses on the results from the evaluation of the Automatic Defect Classification (ADC) product at MP Mask Technology Center (MPMask). The Calibre ADC tool was qualified on production mask blanks against the manual classification. The classification accuracy of ADC is greater than 95% for critical defects with an overall accuracy of 90%. Sensitivity to weak defect signals and locating the defect in the images are challenges we are resolving. The performance of the tool has been demonstrated on multiple mask types and is ready for deployment in full volume mask manufacturing production flow. Implementation of Calibre ADC is estimated to reduce the misclassification of critical defects by 60-80%.

  6. The New Higher Level Classification of Eukaryotes with Emphasis on the Taxonomy of Protists

    Microsoft Academic Search

    SINA M. ADL; ALASTAIR G. B. SIMPSON; MARK A. FARMER; ROBERT A. ANDERSEN; O. ROGER ANDERSON; JOHN R. BARTA; SAMUEL S. BOWSER; GUY BRUGEROLLE; ROBERT A. FENSOME; SUZANNE FREDERICQ; TIMOTHY Y. JAMES; SERGEI KARPOV; PAUL KUGRENS; JOHN KRUG; LOUISE A. LEWIS; JEAN LODGE; DENIS H. LYNN; DAVID G. MANN; RICHARD M. MCCOURT; LEONEL MENDOZA; OJVIND MOESTRUP; SHARON E. MOZLEY-STANDRIDGE; THOMAS A. NERAD; CAROL A. SHEARER; ALEXEY V. SMIRNOV; FREDERICK W. SPIEGEL; MAX F. J. R. TAYLOR

    2005-01-01

    This revision of the classification of unicellular eukaryotes updates that of Levine et al. (1980) for the protozoa and expands it to include other protists. Whereas the previous revision was primarily to incorporate the results of ultrastructural studies, this revision incorporates results from both ultrastructural research since 1980 and molecular phylogenetic studies. We propose a scheme that is based on

  7. NASA Position Classification Handbook

    NASA Technical Reports Server (NTRS)

    1987-01-01

    The NASA Position Classification Handbook provides: a concise unitary reference document covering most aspects of position classification within NASA, information regarding the characteristics of NASA's own position classification program--the NASA Supplemental Classification System--and its origins, information concerning responsibilities of various levels of NASA management for position classification, and information concerning overall operation of a classification program. The provisions of this handbook pertain to the position classification function agency-wide. Although it will be particularly useful to personnel specialists, it also can serve as a convenient reference on position classification for line managers and supervisors and administrative personnel who deal with personnel management matters. Recommendations or questions concerning the content of this handbook should be directed to Director, Personnel Programs Division (Code NP), NASA Headquarters.

  8. Accurate vacuum-polarization calculations

    NASA Astrophysics Data System (ADS)

    Persson, Hans; Lindgren, Ingvar; Salomonson, Sten; Sunnergren, Per

    1993-10-01

    A numerical scheme for evaluating the part of the one-photon vacuum-polarization effect not accounted for by the Uehling potential (the Wichmann-Kroll effect) is presented. The method can be used with an arbitrary atomic model potential describing the bound electrons. Benchmark results for this effect are presented for hydrogenlike levels using a uniform nuclear-charge distribution. The effects of direct and exchange electron screening on the vacuum polarization are discussed in connection with the accurately measured 2p1/2-2s1/2 transition in lithiumlike uranium.

  9. Efficient and accurate fragmentation methods.

    PubMed

    Pruitt, Spencer R; Bertoni, Colleen; Brorsen, Kurt R; Gordon, Mark S

    2014-09-16

    Conspectus Three novel fragmentation methods that are available in the electronic structure program GAMESS (general atomic and molecular electronic structure system) are discussed in this Account. The fragment molecular orbital (FMO) method can be combined with any electronic structure method to perform accurate calculations on large molecular species with no reliance on capping atoms or empirical parameters. The FMO method is highly scalable and can take advantage of massively parallel computer systems. For example, the method has been shown to scale nearly linearly on up to 131,000 processor cores for calculations on large water clusters. There have been many applications of the FMO method to large molecular clusters, to biomolecules (e.g., proteins), and to materials that are used as heterogeneous catalysts. The effective fragment potential (EFP) method is a model potential approach that is fully derived from first principles and has no empirically fitted parameters. Consequently, an EFP can be generated for any molecule by a simple preparatory GAMESS calculation. The EFP method provides accurate descriptions of all types of intermolecular interactions, including Coulombic interactions, polarization/induction, exchange repulsion, dispersion, and charge transfer. The EFP method has been applied successfully to the study of liquid water, π-stacking in substituted benzenes and in DNA base pairs, solvent effects on positive and negative ions, electronic spectra and dynamics, non-adiabatic phenomena in electronic excited states, and nonlinear excited state properties. The effective fragment molecular orbital (EFMO) method is a merger of the FMO and EFP methods, in which interfragment interactions are described by the EFP potential, rather than the less accurate electrostatic potential. The use of EFP in this manner facilitates the use of a smaller value for the distance cut-off (Rcut). 
Rcut determines the distance at which EFP interactions replace fully quantum mechanical calculations on fragment-fragment (dimer) interactions. The EFMO method is both more accurate and more computationally efficient than the most commonly used FMO implementation (FMO2), in which all dimers are explicitly included in the calculation. While the FMO2 method itself does not incorporate three-body interactions, such interactions are included in the EFMO method via the EFP self-consistent induction term. Several applications (ranging from clusters to proteins) of the three methods are discussed to demonstrate their efficacy. The EFMO method will be especially exciting once the analytic gradients have been completed, because this will allow geometry optimizations, the prediction of vibrational spectra, reaction path following, and molecular dynamics simulations using the method. PMID:24810424
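
For orientation, the two-body FMO expansion (FMO2) referred to above is commonly written as a sum of fragment (monomer) energies plus pairwise corrections. The notation below is an assumption of this note rather than taken from the record: E_I is the energy of fragment I and E_IJ the energy of the dimer formed by fragments I and J, each evaluated in the embedding field of the remaining fragments.

```latex
E^{\mathrm{FMO2}} \;=\; \sum_{I} E_I \;+\; \sum_{I>J} \bigl( E_{IJ} - E_I - E_J \bigr)
```

Each term in the second sum is the pair interaction of fragments I and J. The expansion stops at dimers, so explicit three-body terms do not appear, which is consistent with the abstract's remark that FMO2 itself does not incorporate three-body interactions.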

  10. Comparative evolutionary diversity and phylogenetic structure across multiple forest dynamics plots: a mega-phylogeny approach

    PubMed Central

    Erickson, David L.; Jones, Frank A.; Swenson, Nathan G.; Pei, Nancai; Bourg, Norman A.; Chen, Wenna; Davies, Stuart J.; Ge, Xue-jun; Hao, Zhanqing; Howe, Robert W.; Huang, Chun-Lin; Larson, Andrew J.; Lum, Shawn K. Y.; Lutz, James A.; Ma, Keping; Meegaskumbura, Madhava; Mi, Xiangcheng; Parker, John D.; Fang-Sun, I.; Wright, S. Joseph; Wolf, Amy T.; Ye, W.; Xing, Dingliang; Zimmerman, Jess K.; Kress, W. John

    2014-01-01

    Forest dynamics plots, which now span longitudes, latitudes, and habitat types across the globe, offer unparalleled insights into the ecological and evolutionary processes that determine how species are assembled into communities. Understanding phylogenetic relationships among species in a community has become an important component of assessing assembly processes. However, the application of evolutionary information to questions in community ecology has been limited in large part by the lack of accurate estimates of phylogenetic relationships among individual species found within communities, and is particularly limiting in comparisons between communities. Therefore, streamlining and maximizing the information content of these community phylogenies is a priority. To test the viability and advantage of a multi-community phylogeny, we constructed a multi-plot mega-phylogeny of 1347 species of trees across 15 forest dynamics plots in the ForestGEO network using DNA barcode sequence data (rbcL, matK, and psbA-trnH) and compared community phylogenies for each individual plot with respect to support for topology and branch lengths, which affect evolutionary inference of community processes. The levels of taxonomic differentiation across the phylogeny were examined by quantifying the frequency of resolved nodes throughout. In addition, three phylogenetic distance (PD) metrics that are commonly used to infer assembly processes were estimated for each plot [PD, Mean Phylogenetic Distance (MPD), and Mean Nearest Taxon Distance (MNTD)]. Lastly, we examine the partitioning of phylogenetic diversity among community plots through quantification of inter-community MPD and MNTD. Overall, evolutionary relationships were highly resolved across the DNA barcode-based mega-phylogeny, and phylogenetic resolution for each community plot was improved when estimated within the context of the mega-phylogeny. 
Likewise, when compared with phylogenies for individual plots, estimates of phylogenetic diversity in the mega-phylogeny were more consistent, thereby removing a potential source of bias at the plot-level, and demonstrating the value of assessing phylogenetic relationships simultaneously within a mega-phylogeny. An unexpected result of the comparisons among plots based on the mega-phylogeny was that the communities in the ForestGEO plots in general appear to be assemblages of more closely related species than expected by chance, and that differentiation among communities is very low, suggesting deep floristic connections among communities and new avenues for future analyses in community ecology. PMID:25414723
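
Two of the metrics named in this abstract, MPD and MNTD, can be computed directly from a matrix of pairwise (patristic) distances between species. The sketch below illustrates the standard definitions only. The distance values and species labels are invented, and it does not reproduce the authors' pipeline or the PD metric, which requires the tree's branch lengths.

```python
from itertools import combinations


def mpd(dist, taxa):
    """Mean pairwise distance over all distinct pairs of taxa in a community."""
    pairs = list(combinations(taxa, 2))
    return sum(dist[a][b] for a, b in pairs) / len(pairs)


def mntd(dist, taxa):
    """Mean distance from each taxon to its nearest relative in the community."""
    nearest = [min(dist[a][b] for b in taxa if b != a) for a in taxa]
    return sum(nearest) / len(nearest)


# Hypothetical symmetric patristic distances among four species.
dist = {
    "A": {"A": 0.0, "B": 2.0, "C": 6.0, "D": 6.0},
    "B": {"A": 2.0, "B": 0.0, "C": 6.0, "D": 6.0},
    "C": {"A": 6.0, "B": 6.0, "C": 0.0, "D": 4.0},
    "D": {"A": 6.0, "B": 6.0, "C": 4.0, "D": 0.0},
}

plot = ["A", "B", "C", "D"]  # species recorded in one hypothetical plot
print(mpd(dist, plot))   # average over the six species pairs
print(mntd(dist, plot))  # average of each species' nearest-neighbour distance
```

Both functions read distances from one shared matrix, so values for different plots are comparable when that matrix comes from a single tree covering all plots, which is the motivation the abstract gives for the mega-phylogeny.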

  11. Comparative evolutionary diversity and phylogenetic structure across multiple forest dynamics plots: a mega-phylogeny approach.

    PubMed

    Erickson, David L; Jones, Frank A; Swenson, Nathan G; Pei, Nancai; Bourg, Norman A; Chen, Wenna; Davies, Stuart J; Ge, Xue-Jun; Hao, Zhanqing; Howe, Robert W; Huang, Chun-Lin; Larson, Andrew J; Lum, Shawn K Y; Lutz, James A; Ma, Keping; Meegaskumbura, Madhava; Mi, Xiangcheng; Parker, John D; Fang-Sun, I; Wright, S Joseph; Wolf, Amy T; Ye, W; Xing, Dingliang; Zimmerman, Jess K; Kress, W John

    2014-01-01

    Forest dynamics plots, which now span longitudes, latitudes, and habitat types across the globe, offer unparalleled insights into the ecological and evolutionary processes that determine how species are assembled into communities. Understanding phylogenetic relationships among species in a community has become an important component of assessing assembly processes. However, the application of evolutionary information to questions in community ecology has been limited in large part by the lack of accurate estimates of phylogenetic relationships among individual species found within communities, and is particularly limiting in comparisons between communities. Therefore, streamlining and maximizing the information content of these community phylogenies is a priority. To test the viability and advantage of a multi-community phylogeny, we constructed a multi-plot mega-phylogeny of 1347 species of trees across 15 forest dynamics plots in the ForestGEO network using DNA barcode sequence data (rbcL, matK, and psbA-trnH) and compared community phylogenies for each individual plot with respect to support for topology and branch lengths, which affect evolutionary inference of community processes. The levels of taxonomic differentiation across the phylogeny were examined by quantifying the frequency of resolved nodes throughout. In addition, three phylogenetic distance (PD) metrics that are commonly used to infer assembly processes were estimated for each plot [PD, Mean Phylogenetic Distance (MPD), and Mean Nearest Taxon Distance (MNTD)]. Lastly, we examine the partitioning of phylogenetic diversity among community plots through quantification of inter-community MPD and MNTD. Overall, evolutionary relationships were highly resolved across the DNA barcode-based mega-phylogeny, and phylogenetic resolution for each community plot was improved when estimated within the context of the mega-phylogeny. 
Likewise, when compared with phylogenies for individual plots, estimates of phylogenetic diversity in the mega-phylogeny were more consistent, thereby removing a potential source of bias at the plot-level, and demonstrating the value of assessing phylogenetic relationships simultaneously within a mega-phylogeny. An unexpected result of the comparisons among plots based on the mega-phylogeny was that the communities in the ForestGEO plots in general appear to be assemblages of more closely related species than expected by chance, and that differentiation among communities is very low, suggesting deep floristic connections among communities and new avenues for future analyses in community ecology. PMID:25414723

  12. Osteology and phylogenetic interrelationships of sturgeons (Acipenseridae)

    Microsoft Academic Search

    Eric K. Findeis

    Sturgeons (Acipenseridae) are an ancient and unique assemblage of fishes historically important to discussions of actinopterygian evolution. Despite their basal position within Actinopterygii, rigorous comparative morphological studies of acipenserids have never been made, and most ideas about acipenserid evolution hinge on an untested impression that shovelnose sturgeons (Scaphirhynchini) are phylogenetically primitive. This impression promoted ideas that: (1) the earliest acipenserids were

  13. Quantifying MCMC Exploration of Phylogenetic Tree Space

    PubMed Central

    Whidden, Chris; Matsen, Frederick A.

    2015-01-01

    In order to gain an understanding of the effectiveness of phylogenetic Markov chain Monte Carlo (MCMC), it is important to understand how quickly the empirical distribution of the MCMC converges to the posterior distribution. In this article, we investigate this problem on phylogenetic tree topologies with a metric that is especially well suited to the task: the subtree prune-and-regraft (SPR) metric. This metric directly corresponds to the minimum number of MCMC rearrangements required to move between trees in common phylogenetic MCMC implementations. We develop a novel graph-based approach to analyze tree posteriors and find that the SPR metric is much more informative than simpler metrics that are unrelated to MCMC moves. In doing so, we show conclusively that topological peaks do occur in Bayesian phylogenetic posteriors from real data sets as sampled with standard MCMC approaches, investigate the efficiency of Metropolis-coupled MCMC (MCMCMC) in traversing the valleys between peaks, and show that conditional clade distribution (CCD) can have systematic problems when there are multiple peaks. PMID:25631175

  14. Phylogenetic inference via sequential Monte Carlo.

    PubMed

    Bouchard-Côté, Alexandre; Sankararaman, Sriram; Jordan, Michael I

    2012-07-01

    Bayesian inference provides an appealing general framework for phylogenetic analysis, able to incorporate a wide variety of modeling assumptions and to provide a coherent treatment of uncertainty. Existing computational approaches to Bayesian inference based on Markov chain Monte Carlo (MCMC) have not, however, kept pace with the scale of the data analysis problems in phylogenetics, and this has hindered the adoption of Bayesian methods. In this paper, we present an alternative to MCMC based on Sequential Monte Carlo (SMC). We develop an extension of classical SMC based on partially ordered sets and show how to apply this framework--which we refer to as PosetSMC--to phylogenetic analysis. We provide a theoretical treatment of PosetSMC and also present experimental evaluation of PosetSMC on both synthetic and real data. The empirical results demonstrate that PosetSMC is a very promising alternative to MCMC, providing up to two orders of magnitude faster convergence. We discuss other factors favorable to the adoption of PosetSMC in phylogenetics, including its ability to estimate marginal likelihoods, its ready implementability on parallel and distributed computing platforms, and the possibility of combining with MCMC in hybrid MCMC-SMC schemes. Software for PosetSMC is available at http://www.stat.ubc.ca/~bouchard/PosetSMC. PMID:22223445

  15. Phylogenetic identification of lateral genetic transfer events

    Microsoft Academic Search

    Robert G Beiko; Nicholas Hamilton

    2006-01-01

    BACKGROUND: Lateral genetic transfer can lead to disagreements among phylogenetic trees comprising sequences from the same set of taxa. Where topological discordance is thought to have arisen through genetic transfer events, tree comparisons can be used to identify the lineages that may have shared genetic information. An 'edit path' of one or more transfer events can be represented with a

  16. El Compás Flamenco: A Phylogenetic Analysis

    Microsoft Academic Search

    J. Miguel; Godfried T. Toussaint

    2004-01-01

    The flamenco music of Andalucia in Southern Spain is characterized by hand clapping patterns in which the underlying meter is manifested through accented claps. A phylogenetic analysis of the five 12/8 time metric timelines used in Flamenco music is presented using two distance measures: the chronotonic distance of Gustafson and a new distance measure called the

  17. Large-Scale Inference of Phylogenetic Trees

    E-print Network

    Poirazi, Yiota

    Large-Scale Inference of Phylogenetic Trees Alexandros Stamatakis Institute of Computer Science Trees Alexandros Stamatakis As of July 1st 2006 Swiss Institute of Bioinformatics at Lausanne Tree-of-life New insights in medical & biological research © Alexandros Stamatakis, March 2006

  18. Quantifying MCMC Exploration of Phylogenetic Tree Space.

    PubMed

    Whidden, Chris; Matsen, Frederick A

    2015-05-01

    In order to gain an understanding of the effectiveness of phylogenetic Markov chain Monte Carlo (MCMC), it is important to understand how quickly the empirical distribution of the MCMC converges to the posterior distribution. In this article, we investigate this problem on phylogenetic tree topologies with a metric that is especially well suited to the task: the subtree prune-and-regraft (SPR) metric. This metric directly corresponds to the minimum number of MCMC rearrangements required to move between trees in common phylogenetic MCMC implementations. We develop a novel graph-based approach to analyze tree posteriors and find that the SPR metric is much more informative than simpler metrics that are unrelated to MCMC moves. In doing so, we show conclusively that topological peaks do occur in Bayesian phylogenetic posteriors from real data sets as sampled with standard MCMC approaches, investigate the efficiency of Metropolis-coupled MCMC (MCMCMC) in traversing the valleys between peaks, and show that conditional clade distribution (CCD) can have systematic problems when there are multiple peaks. PMID:25631175

  19. Zen and the art of phylogenetic inference

    E-print Network

    Czygrinow, Andrzej

    by Darwin in 1859 as "descent with modification": descent - an unbroken lineage of information flow through modification - `change', occurs within lineages by various processes, such as mutation, selection, genetic and diversification of gene families and genomes Phylogenetics of extinct organisms via analyses of "ancient DNA

  20. Molecular Phylogenetics of Mastodon and Tyrannosaurus rex

    Microsoft Academic Search

    Chris L. Organ; Mary H. Schweitzer; Wenxia Zheng; Lisa M. Freimark; Lewis C. Cantley; John M. Asara

    2008-01-01

    We report a molecular phylogeny for a nonavian dinosaur, extending our knowledge of trait evolution within nonavian dinosaurs into the macromolecular level of biological organization. Fragments of collagen alpha1(I) and alpha2(I) proteins extracted from fossil bones of Tyrannosaurus rex and Mammut americanum (mastodon) were analyzed with a variety of phylogenetic methods. Despite missing sequence data, the mastodon groups with elephant

  1. Generic circumscription in Menyanthaceae: A phylogenetic evaluation

    Microsoft Academic Search

    Nicholas P. Tippery; Donald H. Les; Donald J. Padgett; Surrey W. L. Jacobs

    2008-01-01

    Menyanthaceae consist of five genera of aquatic and wetland plants distributed worldwide. The three monotypic genera (Liparophyllum, Menyanthes, and Nephrophyllidium) are clearly differentiated morphologically, but the two larger genera (Nymphoides and Villarsia) contain several taxa of uncertain affinity. We undertook a phylogenetic analysis, using a combination of morphological and molecular data, to resolve relationships among species and to evaluate the

  2. Tutorial on Phylogenetic Tree Estimation Junhyong Kim

    E-print Network

    Kim, Junhyong

    Tutorial on Phylogenetic Tree Estimation Junhyong Kim Department of Ecology and Evolutionary of Computer Science University of Texas Austin, TX e-mail: tandy@cs.utexas.edu 1 Tutorial Summary All polynomial time methods that can handle large evolutionary datasets. This tutorial will present

  3. On the analysis of phylogenetically paired designs

    PubMed Central

    Funk, Jennifer L; Rakovski, Cyril S; Macpherson, J Michael

    2015-01-01

    As phylogenetically controlled experimental designs become increasingly common in ecology, the need arises for a standardized statistical treatment of these datasets. Phylogenetically paired designs circumvent the need for resolved phylogenies and have been used to compare species groups, particularly in the areas of invasion biology and adaptation. Despite the widespread use of this approach, the statistical analysis of paired designs has not been critically evaluated. We propose a mixed model approach that includes random effects for pair and species. These random effects introduce a “two-layer” compound symmetry variance structure that captures both the correlations between observations on related species within a pair as well as the correlations between the repeated measurements within species. We conducted a simulation study to assess the effect of model misspecification on Type I and II error rates. We also provide an illustrative example with data containing taxonomically similar species and several outcome variables of interest. We found that a mixed model with species and pair as random effects performed better in these phylogenetically explicit simulations than two commonly used reference models (no or single random effect) by optimizing Type I error rates and power. The proposed mixed model produces acceptable Type I and II error rates despite the absence of a phylogenetic tree. This design can be generalized to a variety of datasets to analyze repeated measurements in clusters of related subjects/species.
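
The "two-layer" compound symmetry structure described above can be made concrete with a worked covariance calculation. The notation here is an assumption introduced for illustration, not the authors' own: y_ijk is measurement k on species j within pair i, g_ij is a fixed group indicator (for example invasive versus native), and p_i and s_ij are the random effects for pair and for species within pair.

```latex
\begin{aligned}
y_{ijk} &= \mu + \beta\, g_{ij} + p_i + s_{ij} + \varepsilon_{ijk}, \\
p_i &\sim N(0, \sigma_p^2), \qquad
s_{ij} \sim N(0, \sigma_s^2), \qquad
\varepsilon_{ijk} \sim N(0, \sigma^2), \\[4pt]
\operatorname{Cov}(y_{ijk},\, y_{i'j'k'}) &=
\begin{cases}
\sigma_p^2 + \sigma_s^2 + \sigma^2 & \text{same observation}, \\
\sigma_p^2 + \sigma_s^2 & \text{same species, different measurement}, \\
\sigma_p^2 & \text{same pair, different species}, \\
0 & \text{different pairs}.
\end{cases}
\end{aligned}
```

Under these assumptions, repeated measurements within a species share both random effects, while the two species of a pair share only the pair effect. That gives the two levels of correlation the abstract describes without requiring a resolved phylogenetic tree.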

  4. The challenge of constructing large phylogenetic trees

    E-print Network

    Sanderson, Mike

    .2500 species [6]. Plant phylogenetic analyses have also become large in the genome `direction'. The large number of plant model systems has spawned numerous EST (expressed sequence tag) projects, which of this distribution for all plant proteins in a recent release of GenBank. A few species have been sequenced for many

  5. Mitochondrial phylogenetics and evolution of mysticete whales.

    PubMed

    Sasaki, Takeshi; Nikaido, Masato; Hamilton, Healy; Goto, Mutsuo; Kato, Hidehiro; Kanda, Naohisa; Pastene, Luis; Cao, Ying; Fordyce, R; Hasegawa, Masami; Okada, Norihiro

    2005-02-01

    The phylogenetic relationships among baleen whales (Order: Cetacea) remain uncertain despite extensive research in cetacean molecular phylogenetics and a potential morphological sample size of over 2 million animals harvested. Questions remain regarding the number of species and the monophyly of genera, as well as higher order relationships. Here, we approach mysticete phylogeny with complete mitochondrial genome sequence analysis. We determined complete mtDNA sequences of 10 extant Mysticeti species, inferred their phylogenetic relationships, and estimated node divergence times. The mtDNA sequence analysis concurs with previous molecular studies in the ordering of the principal branches, with Balaenidae (right whales) as sister to all other mysticetes base, followed by Neobalaenidae (pygmy right whale), Eschrichtiidae (gray whale), and finally Balaenopteridae (rorquals + humpback whale). The mtDNA analysis further suggests that four lineages exist within the clade of Eschrichtiidae + Balaenopteridae, including a sister relationship between the humpback and fin whales, and a monophyletic group formed by the blue, sei, and Bryde's whales, each of which represents a newly recognized phylogenetic relationship in Mysticeti. We also estimated the divergence times of all extant mysticete species, accounting for evolutionary rate heterogeneity among lineages. When the mtDNA divergence estimates are compared with the mysticete fossil record, several lineages have molecular divergence estimates strikingly older than indicated by paleontological data. We suggest this discrepancy reflects both a large amount of ancestral polymorphism and long generation times of ancestral baleen whale populations. PMID:15805012

  6. MOLECULAR PHYLOGENETIC RELATIONSHIPS AMONG DIABROTICA SPECIES (ACCESSION NO. AF195198)

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Corn rootworms of the genus Diabrotica (Coleoptera: Chrysomelidae) are the most serious pest of corn in midwestern United States. Despite their economic importance, phylogenetic relationships within the genus remain unclear. Phylogenetic analysis of five Diabrotica was undertaken using DNA sequences...

  7. MOLECULAR PHYLOGENETIC RELATIONSHIPS AMONG DIABROTICA SPECIES (ACCESSION NO. AF195202)

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Corn rootworms of the genus Diabrotica (Coleoptera: Chrysomelidae) are the most serious pest of corn in midwestern United States. Despite their economic importance, phylogenetic relationships within the genus remain unclear. Phylogenetic analysis of five Diabrotica was undertaken using DNA sequences...

  8. Phylogenetic character mapping of proteomic diversity shows high correlation with subspecific phylogenetic diversity in Trypanosoma cruzi.

    PubMed

    Telleria, Jenny; Biron, David G; Brizard, Jean-Paul; Demettre, Edith; Séveno, Martial; Barnabé, Christian; Ayala, Francisco J; Tibayrenc, Michel

    2010-11-23

    We performed a phylogenetic character mapping on 26 stocks of Trypanosoma cruzi, the parasite responsible for Chagas disease, and 2 stocks of the sister taxon T. cruzi marinkellei to test for possible associations between T. cruzi-subspecific phylogenetic diversity and levels of protein expression, as examined by proteomic analysis and mass spectrometry. We observed a high level of correlation (P < 10(-4)) between genetic distance, as established by multilocus enzyme electrophoresis, and proteomic dissimilarities estimated by proteomic Euclidian distances. Several proteins were found to be specifically associated to T. cruzi phylogenetic subdivisions (discrete typing units). This study explores the previously uncharacterized links between infraspecific phylogenetic diversity and gene expression in a human pathogen. It opens the way to searching for new vaccine and drug targets and for identification of specific biomarkers at the subspecific level of pathogens. PMID:21059959

  9. Phylogenetic character mapping of proteomic diversity shows high correlation with subspecific phylogenetic diversity in Trypanosoma cruzi

    PubMed Central

    Telleria, Jenny; Biron, David G.; Brizard, Jean-Paul; Demettre, Edith; Séveno, Martial; Barnabé, Christian; Ayala, Francisco J.; Tibayrenc, Michel

    2010-01-01

    We performed a phylogenetic character mapping on 26 stocks of Trypanosoma cruzi, the parasite responsible for Chagas disease, and 2 stocks of the sister taxon T. cruzi marinkellei to test for possible associations between T. cruzi–subspecific phylogenetic diversity and levels of protein expression, as examined by proteomic analysis and mass spectrometry. We observed a high level of correlation (P < 10^-4) between genetic distance, as established by multilocus enzyme electrophoresis, and proteomic dissimilarities estimated by proteomic Euclidian distances. Several proteins were found to be specifically associated to T. cruzi phylogenetic subdivisions (discrete typing units). This study explores the previously uncharacterized links between infraspecific phylogenetic diversity and gene expression in a human pathogen. It opens the way to searching for new vaccine and drug targets and for identification of specific biomarkers at the subspecific level of pathogens. PMID:21059959

  10. Cnidarian phylogenetic relationships as revealed by mitogenomics

    PubMed Central

    2013-01-01

    Background Cnidaria (corals, sea anemones, hydroids, jellyfish) is a phylum of relatively simple aquatic animals characterized by the presence of the cnidocyst: a cell containing a giant capsular organelle with an eversible tubule (cnida). Species within Cnidaria have life cycles that involve one or both of the two distinct body forms, a typically benthic polyp, which may or may not be colonial, and a typically pelagic mostly solitary medusa. The currently accepted taxonomic scheme subdivides Cnidaria into two main assemblages: Anthozoa (Hexacorallia + Octocorallia) – cnidarians with a reproductive polyp and the absence of a medusa stage – and Medusozoa (Cubozoa, Hydrozoa, Scyphozoa, Staurozoa) – cnidarians that usually possess a reproductive medusa stage. Hypothesized relationships among these taxa greatly impact interpretations of cnidarian character evolution. Results We expanded the sampling of cnidarian mitochondrial genomes, particularly from Medusozoa, to reevaluate phylogenetic relationships within Cnidaria. Our phylogenetic analyses based on a mitochogenomic dataset support many prior hypotheses, including monophyly of Hexacorallia, Octocorallia, Medusozoa, Cubozoa, Staurozoa, Hydrozoa, Carybdeida, Chirodropida, and Hydroidolina, but reject the monophyly of Anthozoa, indicating that the Octocorallia + Medusozoa relationship is not the result of sampling bias, as proposed earlier. Further, our analyses contradict Scyphozoa [Discomedusae + Coronatae], Acraspeda [Cubozoa + Scyphozoa], as well as the hypothesis that Staurozoa is the sister group to all the other medusozoans. Conclusions Cnidarian mitochondrial genomic data contain phylogenetic signal informative for understanding the evolutionary history of this phylum. Mitogenome-based phylogenies, which reject the monophyly of Anthozoa, provide further evidence for the polyp-first hypothesis. 
By rejecting the traditional Acraspeda and Scyphozoa hypotheses, these analyses suggest that the shared morphological characters in these groups are plesiomorphies, originated in the branch leading to Medusozoa. The expansion of mitogenomic data along with improvements in phylogenetic inference methods and use of additional nuclear markers will further enhance our understanding of the phylogenetic relationships and character evolution within Cnidaria. PMID:23302374

  11. APE: Analyses of Phylogenetics and Evolution in R language

    Microsoft Academic Search

    Emmanuel Paradis; Julien Claude; Korbinian Strimmer

    2004-01-01

    Summary: Analysis of Phylogenetics and Evolution (APE) is a package written in the R language for use in molecular evolution and phylogenetics. APE provides both utility functions for reading and writing data and manipulating phylogenetic trees, as well as several advanced methods for phylogenetic and evolutionary analysis (e.g. comparative and population genetic methods). APE takes advantage of

  12. Combining Substrate Specificity Analysis with Support Vector Classifiers Reveals Feruloyl Esterase as a Phylogenetically Informative Protein Group

    PubMed Central

    Olivares-Hernández, Roberto; Sunner, Hampus; Frisvad, Jens C.; Olsson, Lisbeth; Nielsen, Jens; Panagiotou, Gianni

    2010-01-01

    Background Our understanding of how fungi evolved to develop a variety of ecological niches is limited but of fundamental biological importance. Specifically, the evolution of enzymes affects how well species can adapt to new environmental conditions. Feruloyl esterases (FAEs) are enzymes able to hydrolyze the ester bonds linking ferulic acid to plant cell wall polysaccharides. The diversity of substrate specificities found in the FAE family shows that this family is old enough to have experienced the emergence and loss of many activities. Methodology/Principal Findings In this study we evaluate the relative activity of FAEs against a variety of model substrates as a novel predictive tool for Ascomycota taxonomic classification. Our approach consists of two analytical steps: (1) an initial unsupervised analysis to cluster the FAEs substrate specificity data which were generated by cultivation of 34 Ascomycota strains and then an analysis of the produced enzyme cocktail against 10 substituted cinnamate and phenylalkanoate methyl esters, (2) a second, supervised analysis for training a predictor built on these substrate activities. By applying both linear and non-linear models we were able to correctly predict the taxonomic Class (~86% correct classification), Order (~88% correct classification) and Family (~88% correct classification) that the 34 Ascomycota belong to, using the activity profiles of the FAEs. Conclusion/Significance The good correlation with the FAEs substrate specificities that we have defined via our phylogenetic analysis not only suggests that FAEs are phylogenetically informative proteins but it is also a considerable step towards improved FAEs functional prediction. PMID:20877647

  13. PHYLO-ASP: Phylogenetic Systematics with Answer Set Programming

    E-print Network

    Erdem, Esra

    PHYLO-ASP: Phylogenetic Systematics with Answer Set Programming Esra Erdem Faculty of Engineering,3,4], and the second step is studied in [5,6]. We call our ASP-based approach to phylogenetic tree and phylogenetic network reconstruction as PHYLO-ASP. We illustrated the applicability and effectiveness of PHYLO-ASP

  14. Phylogenetic Hidden Markov Models Adam Siepel and David Haussler

    E-print Network

    Keinan, Alon

    Phylogenetic Hidden Markov Models Adam Siepel and David Haussler Center for Biomolecular Science and Engineering University of California, Santa Cruz Santa Cruz, CA 95064, USA Phylogenetic hidden Markov models. In addition, we discuss how hidden Markov models (HMMs), phylogenetic models, and phylo-HMMs all can

  15. Detecting lateral gene transfers by statistical reconciliation of phylogenetic forests

    PubMed Central

    2010-01-01

    Background To understand the evolutionary role of Lateral Gene Transfer (LGT), accurate methods are needed to identify transferred genes and infer their timing of acquisition. Phylogenetic methods are particularly promising for this purpose, but the reconciliation of a gene tree with a reference (species) tree is computationally hard. In addition, the application of these methods to real data raises the problem of sorting out real and artifactual phylogenetic conflict. Results We present Prunier, a new method for phylogenetic detection of LGT based on the search for a maximum statistical agreement forest (MSAF) between a gene tree and a reference tree. The program is flexible as it can use any definition of "agreement" among trees. We evaluate the performance of Prunier and two other programs (EEEP and RIATA-HGT) for their ability to detect transferred genes in realistic simulations where gene trees are reconstructed from sequences. Prunier proposes a single scenario that compares to the other methods in terms of sensitivity, but shows higher specificity. We show that LGT scenarios carry a strong signal about the position of the root of the species tree and could be used to identify the direction of evolutionary time on the species tree. We use Prunier on a biological dataset of 23 universal proteins and discuss their suitability for inferring the tree of life. Conclusions The ability of Prunier to take into account branch support in the process of reconciliation allows a gain in complexity, in comparison to EEEP, and in accuracy in comparison to RIATA-HGT. Prunier's greedy algorithm proposes a single scenario of LGT for a gene family, but its quality always compares to the best solutions provided by the other algorithms. When the root position is uncertain in the species tree, Prunier is able to infer a scenario per root at a limited additional computational cost and can easily run on large datasets. 
Prunier is implemented in C++, using the Bio++ library and the phylogeny program Treefinder. It is available at: http://pbil.univ-lyon1.fr/software/prunier PMID:20550700

  16. Accurate extraction of the News

    E-print Network

    Shrirang S. Deshingkar

    2006-09-14

    We propose a new scheme for extracting gravitational radiation from a characteristic numerical simulation of a spacetime. This method is similar in conception to our earlier work, but its analytical and numerical implementation is different. The scheme is based on a direct transformation to Bondi coordinates, and the gravitational waves are extracted by calculating the Bondi news function in Bondi coordinates. The entire calculation is done in a way that makes the implementation easy when we use a uniform Bondi angular grid at $\mathcal{I}^+$. Using a uniform Bondi grid for the news calculation has the added advantage that we have to solve only ordinary differential equations instead of partial differential equations. For the test problems this new scheme allows us to extract gravitational radiation much more accurately than the previous schemes.

  17. Guidelines for accurate TOD measurement

    NASA Astrophysics Data System (ADS)

    Bijl, Piet; Valeton, J. M.

    1999-07-01

    Guidelines to perform Triangle Orientation Discrimination (TOD) measurements are given in the present paper. The optimal range of test pattern sizes and contrasts is specified, as well as the required number of presentations for a threshold estimate. Special attention is paid to the statistical analysis. A standard frequency-of-seeing curve is fitted to the observer data in order to obtain 75%-correct thresholds. A χ² statistic provides an objective criterion for acceptance or rejection of the threshold estimates. Finally, a complete TOD curve is obtained by fitting a weighted least-squares polynomial through the 75%-correct thresholds. Further, a simple Go-NoGo screening procedure with objective pass/fail criteria, based on the TOD methodology, is proposed. With the TOD methodology, accurate sensor performance measurement and Go-NoGo testing have become very easy to carry out. Therefore, the investment in a thoroughly designed measurement setup will pay itself back easily.
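
    As a rough illustration of how a 75%-correct threshold can be read off a fitted curve, the sketch below assumes a Weibull-shaped psychometric function and a guess rate of 0.25 (four possible triangle orientations). Neither the functional form nor the parameter names (alpha, beta, guess) come from the abstract; they are assumptions made for the example, and the fitting step itself is not shown.

```python
import math


def weibull_pc(x, alpha, beta, guess=0.25):
    """Probability of a correct response at stimulus level x.

    Assumed form: guess + (1 - guess) * (1 - exp(-(x / alpha) ** beta)).
    """
    return guess + (1.0 - guess) * (1.0 - math.exp(-((x / alpha) ** beta)))


def threshold(alpha, beta, p_target=0.75, guess=0.25):
    """Stimulus level at which the assumed curve reaches p_target."""
    if not guess < p_target < 1.0:
        raise ValueError("p_target must lie strictly between guess and 1")
    # Rescale the target to the chance-corrected range (0, 1), then invert.
    corrected = (p_target - guess) / (1.0 - guess)
    return alpha * (-math.log(1.0 - corrected)) ** (1.0 / beta)


if __name__ == "__main__":
    # Hypothetical fitted parameters, for illustration only.
    x75 = threshold(alpha=2.0, beta=3.0)
    print(x75, weibull_pc(x75, alpha=2.0, beta=3.0))
```

    Inverting the curve analytically keeps the threshold consistent with whatever parameters the fit returns; a different curve shape would need its own inverse.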

  18. A new generation patient classification system.

    PubMed

    Giovannetti, P; Johnson, J M

    1990-05-01

    The traditional approaches to monitoring reliability and validity are time consuming and costly and frequently lead to the abandonment of reasonable classification instruments. With the advent of microcomputers and relational database software, nursing executives are now able to track reliability and validity on an ongoing basis. Because patient classification data are retained, they can be used to provide more accurate workload predictions and descriptions. Further, the ability to identify the source of problems with unit-specific instruments or nurse classifiers leads to the required corrective mechanism. Increased confidence in the reliability and validity of the workload data provides the staff nurse with direct evidence of the accuracy of the patient classification instrument and the nursing executive with a significantly stronger tool in budget planning and tracking, contract negotiations and financial risk reduction. PMID:2335788

  19. DNA sequence support for a close phylogenetic relationship between some storks and New World vultures.

    PubMed Central

    Avise, J C; Nelson, W S; Sibley, C G

    1994-01-01

    Nucleotide sequences from the mitochondrial cytochrome b gene were used to address a controversial suggestion that New World vultures are related more closely to storks than to Old World vultures. Phylogenetic analyses of 1-kb sequences from 18 relevant avian species indicate that the similarities in morphology and behavior between New World and Old World vultures probably manifest convergent adaptations associated with carrion-feeding, rather than propinquity of descent. Direct sequence evidence for a close phylogenetic alliance between at least some New World vultures and storks lends support to conclusions reached previously from DNA-DNA hybridization methods and detailed morphology-based appraisals, and it illustrates how mistaken assumptions of homology for organismal adaptations can compromise biological classifications. However, there was a lack of significant resolution for most other branches in the cytochrome b phylogenetic reconstructions. This irresolution is most likely attributable to a close temporal clustering of nodes, rather than to ceiling effects (mutational saturation) producing an inappropriate window of resolution for the cytochrome b sequences. PMID:8197203

  20. Diversity of Clonostachys species assessed by molecular phylogenetics and MALDI-TOF mass spectrometry.

    PubMed

    Abreu, Lucas M; Moreira, Gláucia M; Ferreira, Douglas; Rodrigues-Filho, Edson; Pfenning, Ludwig H

    2014-12-01

    We assessed the species diversity among 45 strains of Clonostachys from different substrates and localities in Brazil using molecular phylogenetics, and compared the results with the phenotypic classification of strains obtained from matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF MS). Phylogenetic analyses were based on beta tubulin (Tub), ITS-LSU rDNA, and a combined Tub-ITS DNA dataset. MALDI-TOF MS analyses were performed using intact conidia and conidiophores of strains cultivated on oatmeal agar and 4% malt extract agar. Six known species were identified: Clonostachys byssicola, Clonostachys candelabrum, Clonostachys pseudochroleuca, Clonostachys rhizophaga, Clonostachys rogersoniana, and Clonostachys rosea. Two clades and two singleton lineages did not correspond to known species represented in the reference DNA dataset and were identified as Clonostachys sp. 1-4. Multivariate cluster analyses of MALDI-TOF MS data classified the strains into eight clusters and three singletons, corresponding to the ten identified species plus one additional cluster containing two strains of C. rogersoniana that split from the other co-specific strains. The consistent results of MALDI-TOF MS supported the identification of strains assigned to C. byssicola and C. pseudochroleuca, which did not form well supported clades in all phylogenetic analyses, but formed distinct clusters in the MALDI-TOF dendrograms. PMID:25457948

  1. mtDNA Diversity and Phylogenetic State of Korean Cattle Breed, Chikso

    PubMed Central

    Kim, Jae-Hwan; Byun, Mi Jeong; Kim, Myung-Jick; Suh, Sang Won; Ko, Yeoung-Gyu; Lee, Chang Woo; Jung, Kyoung-Sub; Kim, Eun Sung; Yu, Dae Jung; Kim, Woo Hyun; Choi, Seong-Bok

    2013-01-01

    In order to analyze the genetic diversity and phylogenetic status of the Korean Chikso breed, we determined sequences of the mtDNA cytochrome b (cyt b) gene and performed phylogenetic analysis using 239 individuals from 5 Chikso populations. Five non-synonymous mutations out of a total of 15 polymorphic sites were identified among 239 cyt b coding sequences. Thirteen haplotypes were defined, and haplotype diversity was 0.4709, ranging from 0.2577 to 0.6114. Thirty-five haplotypes (C1–C35) were classified among 9 Asian and 3 European breeds. C2 was a major haplotype that contained 206 sequences (64.6%) from all breeds used. C3–C13 haplotypes were Chikso-specific haplotypes. C1 and C2 haplotypes contained 80.5% of cyt b sequences of Hanwoo, Yanbian, Zaosheng and JB breeds. In phylogenetic analyses, the Chikso breed was placed within the B. taurus lineage and was genetically more closely related to two Chinese breeds than to Korean brown cattle, Hanwoo. These results suggest that Chikso and Hanwoo have a genetic difference based on the mtDNA cyt b gene as well as their coat color, sufficient for classification as a separate breed. PMID:25049772
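
    For readers unfamiliar with the statistic, haplotype diversity can be sketched as follows. The example assumes Nei's unbiased estimator, h = n/(n-1) * (1 - sum of squared haplotype frequencies); the abstract does not say which estimator was used. The function name and the sample labels are hypothetical and are not the study's data.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity: n/(n-1) * (1 - sum(p_i ** 2)).

    `haplotypes` holds one haplotype label per sampled individual.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sampled sequences")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


if __name__ == "__main__":
    # Hypothetical sample of ten individuals (illustrative labels only).
    sample = ["C2"] * 6 + ["C1"] * 2 + ["C3", "C4"]
    print(round(haplotype_diversity(sample), 4))
```

    The statistic is 0 when every individual carries the same haplotype and approaches 1 when nearly every individual carries a different one.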

  2. Molecular phylogenetics unveils the ancient evolutionary origins of the enigmatic fairy armadillos.

    PubMed

    Delsuc, Frédéric; Superina, Mariella; Tilak, Marie-Ka; Douzery, Emmanuel J P; Hassanin, Alexandre

    2012-02-01

    Fairy armadillos or pichiciegos (Xenarthra, Dasypodidae) are among the most elusive mammals. Due to their subterranean and nocturnal lifestyle, their basic biology and evolutionary history remain virtually unknown. Two distinct species with allopatric distributions are recognized: Chlamyphorus truncatus is restricted to central Argentina, while Calyptophractus retusus occurs in the Gran Chaco of Argentina, Paraguay, and Bolivia. To test their monophyly and resolve their phylogenetic affinities within armadillos, we obtained sequence data from modern and museum specimens for two mitochondrial genes (12S RNA [MT-RNR1] and NADH dehydrogenase 1 [MT-ND1]) and two nuclear exons (breast cancer 1 early onset exon 11 [BRCA1] and von Willebrand factor exon 28 [VWF]). Phylogenetic analyses provided a reference phylogeny and timescale for living xenarthran genera. Our results reveal monophyletic pichiciegos as members of a major armadillo subfamily (Chlamyphorinae). Their strictly fossorial lifestyle probably evolved as a response to the Oligocene aridification that occurred in South America after their divergence from Tolypeutinae around 32 million years ago (Mya). The ancient divergence date (~17 Mya) for separation between the two species supports their taxonomic classification into distinct genera. The synchronicity with Middle Miocene marine incursions along the Paraná river basin suggests a vicariant origin for pichiciegos by the disruption of their ancestral range. Their phylogenetic distinctiveness and rarity in the wild argue in favor of high conservation priority. PMID:22122941

  3. mtDNA Diversity and Phylogenetic State of Korean Cattle Breed, Chikso.

    PubMed

    Kim, Jae-Hwan; Byun, Mi Jeong; Kim, Myung-Jick; Suh, Sang Won; Ko, Yeoung-Gyu; Lee, Chang Woo; Jung, Kyoung-Sub; Kim, Eun Sung; Yu, Dae Jung; Kim, Woo Hyun; Choi, Seong-Bok

    2013-02-01

    In order to analyze the genetic diversity and phylogenetic status of the Korean Chikso breed, we determined sequences of the mtDNA cytochrome b (cyt b) gene and performed phylogenetic analysis using 239 individuals from 5 Chikso populations. Five non-synonymous mutations out of a total of 15 polymorphic sites were identified among 239 cyt b coding sequences. Thirteen haplotypes were defined, and haplotype diversity was 0.4709, ranging from 0.2577 to 0.6114. Thirty-five haplotypes (C1-C35) were classified among 9 Asian and 3 European breeds. C2 was a major haplotype that contained 206 sequences (64.6%) from all breeds used. C3-C13 haplotypes were Chikso-specific haplotypes. C1 and C2 haplotypes contained 80.5% of cyt b sequences of Hanwoo, Yanbian, Zaosheng and JB breeds. In phylogenetic analyses, the Chikso breed was placed within the B. taurus lineage and was genetically more closely related to two Chinese breeds than to Korean brown cattle, Hanwoo. These results suggest that Chikso and Hanwoo have a genetic difference based on the mtDNA cyt b gene as well as their coat color, sufficient for classification as a separate breed. PMID:25049772

  4. A phylogenetic analysis of Prunus and the Amygdaloideae (Rosaceae) using ITS sequences of nuclear ribosomal DNA.

    PubMed

    Lee, S; Wen, J

    2001-01-01

    The economically important plum or cherry genus (Prunus) and the subfamily Amygdaloideae of the Rosaceae have a controversial taxonomic history due to the lack of a phylogenetic framework. Phylogenetic analysis using the ITS sequences of nuclear ribosomal DNA (nrDNA) was conducted to construct the evolutionary history and evaluate the historical classifications of Prunus and the Amygdaloideae. The analyses suggest two major groups within the Amygdaloideae: (1) Prunus s.l. (sensu lato) and Maddenia, and (2) Exochorda, Oemleria, and Prinsepia. The ITS phylogeny supports the recent treatment of including Exochorda (formerly in the Spiraeoideae) in the Amygdaloideae. Maddenia is found to be nested within Prunus s.l. in the parsimony and distance analyses, but basal to Prunus s.l. in the maximum likelihood analysis. Within Prunus, two major groups are recognizable: (1) the Amygdalus-Prunus group, and (2) the Cerasus-Laurocerasus-Padus group. The clades in the ITS phylogeny are not congruent with most subgeneric groups in the widely used classification of Prunus by Rehder. A broadly defined Prunus is supported. PMID:11159135

  5. [Sequence variation of mitochondrial cytochrome b gene and phylogenetic relationships among twelve species of Charadriiformes].

    PubMed

    Chen, Xiao-Fang; Wang, Xiang; Yuan, Xiao-Dong; Tang, Min-Qian; Li, Yu-Xiang; Guo, Yu-Mei; Li, Qing-Wei

    2003-05-01

    Studies of the phylogenetic relationships of the Charadriiformes have been largely based on conservative morphological characters. During the past 10 years, many studies on the evolutionary biology of birds adopted phylogenetic information obtained from mitochondrial DNA, but little work on the Charadriiformes has been reported to date. Therefore, the phylogenetic relationships and classification of the Charadriiformes remain controversial. In this study, we try to shed light on these relationships via DNA sequence analysis of the mitochondrial Cyt b gene in 12 species of Charadriiformes. It was a preliminary study of the origin and evolution of the species using nucleotide sequence data. Using well-known PCR techniques, the complete mitochondrial Cyt b gene sequences were amplified and sequenced from Charadrius mongolus, Charadrius alexandrinus, Numenius madagascariensis, Numenius arquata, Numenius phaeopus, Tringa totanus, Tringa glareola, Xenus cinereus, Arenaria interpres, Calidris tenuirostris, Recurvirostra avosetta and Haematopus ostralegus. The 1143 bp long DNA sequences of the gene from these species were obtained, in which 381 variable sites were identified without insertions or deletions. The nucleic acid sequence variation of the mitochondrial Cyt b gene was 5.16%-16.01% among these species. Phylogenetic trees constructed using the NJ, MP and ML methods with Ciconia ciconia as the outgroup indicate that the 12 species of Charadriiformes examined in this study are clustered in two major clades. The first clade includes T. totanus, T. glareola, A. interpres, C. tenuirostris, X. cinereus, N. madagascariensis, N. arquata and N. phaeopus. The second one includes C. mongolus, C. alexandrinus, R. avosetta and H. ostralegus. Our molecular data show that the phylogenetic relationships among species of Scolopacidae are consistent with the classification based on morphological studies; R. avosetta and H. ostralegus are relatively close and form a sister group, and then form a paraphyletic group with a sister group comprising C. mongolus and C. alexandrinus. The results support Sibley's view that R. avosetta and H. ostralegus, which form the Recurvirostrinae, belong to the Charadriidae, and that the Charadriidae divide into two subfamilies, Recurvirostrinae and Charadriinae. PMID:12924155
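
    A percentage of sequence variation between two aligned sequences is commonly computed as an uncorrected p-distance, the fraction of aligned sites that differ. The sketch below assumes that measure; the abstract does not state which distance was used. The function name and the toy sequences are hypothetical.

```python
def p_distance(seq_a, seq_b):
    """Uncorrected p-distance: fraction of aligned sites that differ.

    Both sequences must already be aligned and of equal length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    mismatches = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return mismatches / len(seq_a)


if __name__ == "__main__":
    # Toy 10-site alignment with two mismatches (illustrative only).
    print(p_distance("ACGTACGTAC", "ACGTTCGTAA"))
```

    On this measure, 59 mismatches across a 1143 bp alignment would give about 5.16%, and 183 mismatches about 16.01%.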

  6. Phylogenetic and Biological Significance of Evolutionary Elements from Metazoan Mitochondrial Genomes

    PubMed Central

    Yuan, Jianbo; Zhu, Qingming; Liu, Bin

    2014-01-01

    The evolutionary history of living species is usually inferred through the phylogenetic analysis of molecular and morphological information using various mathematical models. New challenges in phylogenetic analysis are centered mostly on the search for accurate and efficient methods to handle the huge amounts of sequence data generated from newer genome sequencing. The next major challenge is the determination of relationships between the evolution of structural elements and their functional implementation, which has been largely ignored in previous analyses. Here, we describe the discovery of structural elements in metazoan mitochondrial genomes, termed key K-strings, that can serve as a basis for phylogenetic tree construction. Although comprising only a small fraction (0.73%) of all K-strings, these key K-strings are pivotal to the tree construction because they allow for a significant reduction in the computational time required to construct phylogenetic trees and, more importantly, they significantly improve the results of phylogenetic inference. The trees constructed from the key K-strings were consistent overall with our current view of metazoan phylogeny and exhibited a more rational topology than the trees constructed by using other conventional methods. Surprisingly, the key K-strings tended to accumulate in the conserved regions of the original sequences, most likely due to strong selection pressure. Furthermore, the special structural features of the key K-strings should have some potential applications in the study of the structure-function relationships of proteins and in the determination of the evolutionary trajectory of species. The novelty and potential importance of key K-strings lead us to believe that they are essential evolutionary elements. As such, they may play important roles in the process of species evolution and their physical existence. Further studies could lead to discoveries regarding the relationship between evolution and processes of speciation. PMID:24465405
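
    To make the term concrete, a K-string is taken here to be a contiguous substring of length K. The sketch below only counts overlapping K-strings in one sequence; how the paper selects the "key" subset is not described in the abstract and is not attempted. The function names and the toy sequence are hypothetical.

```python
from collections import Counter


def kstring_counts(sequence, k):
    """Count every overlapping substring of length k in `sequence`."""
    if k <= 0 or k > len(sequence):
        raise ValueError("k must be between 1 and len(sequence)")
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def kstring_frequencies(sequence, k):
    """Relative frequency of each K-string (a simple composition vector)."""
    counts = kstring_counts(sequence, k)
    total = sum(counts.values())
    return {kstring: count / total for kstring, count in counts.items()}


if __name__ == "__main__":
    # Toy sequence, for illustration only.
    print(kstring_counts("ATGATGA", 3))
    print(kstring_frequencies("ATGATGA", 3))
```

    Frequency vectors of this kind can be compared between genomes without an alignment, which is one reason K-string approaches scale to large datasets.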

  7. Phylogenetic and biological significance of evolutionary elements from metazoan mitochondrial genomes.

    PubMed

    Yuan, Jianbo; Zhu, Qingming; Liu, Bin

    2014-01-01

    The evolutionary history of living species is usually inferred through the phylogenetic analysis of molecular and morphological information using various mathematical models. New challenges in phylogenetic analysis are centered mostly on the search for accurate and efficient methods to handle the huge amounts of sequence data generated from newer genome sequencing. The next major challenge is the determination of relationships between the evolution of structural elements and their functional implementation, which has been largely ignored in previous analyses. Here, we describe the discovery of structural elements in metazoan mitochondrial genomes, termed key K-strings, that can serve as a basis for phylogenetic tree construction. Although comprising only a small fraction (0.73%) of all K-strings, these key K-strings are pivotal to the tree construction because they allow for a significant reduction in the computational time required to construct phylogenetic trees and, more importantly, they significantly improve the results of phylogenetic inference. The trees constructed from the key K-strings were consistent overall with our current view of metazoan phylogeny and exhibited a more rational topology than the trees constructed by using other conventional methods. Surprisingly, the key K-strings tended to accumulate in the conserved regions of the original sequences, most likely due to strong selection pressure. Furthermore, the special structural features of the key K-strings should have some potential applications in the study of the structure-function relationships of proteins and in the determination of the evolutionary trajectory of species. The novelty and potential importance of key K-strings lead us to believe that they are essential evolutionary elements. As such, they may play important roles in the process of species evolution and their physical existence. Further studies could lead to discoveries regarding the relationship between evolution and processes of speciation. PMID:24465405

  8. Thirteen Camellia chloroplast genome sequences determined by high-throughput sequencing: genome structure and phylogenetic relationships

    PubMed Central

    2014-01-01

    Background Camellia is an economically and phylogenetically important genus in the family Theaceae. Owing to frequent hybridization and polyploidization, it is taxonomically and phylogenetically ranked as one of the most challenging taxa in plants. Sequence comparisons of chloroplast (cp) genomes are of great interest because they provide robust evidence for taxonomic studies, species identification and understanding the mechanisms that underlie the evolution of the Camellia species. Results Eight complete cp genomes and five draft cp genome sequences of Camellia species were determined using Illumina sequencing technology via a combined strategy of de novo and reference-guided assembly. The Camellia cp genomes exhibited a typical circular structure that was rather conserved in genomic structure and the synteny of gene order. Differences in repeat sequences, simple sequence repeats, indels and substitutions were further examined among five complete cp genomes, representing a wide phylogenetic diversity in the genus. A total of fifteen molecular markers were identified with more than 1.5% sequence divergence that may be useful for further phylogenetic analysis and species identification of Camellia. Our results showed that, rather than functional constraints, it is the regional constraints that strongly affect sequence evolution of the cp genomes. In a substantial improvement over prior studies, evolutionary relationships of the section Thea were determined on the basis of phylogenomic analyses of cp genome sequences. Conclusions Despite a high degree of conservation between the Camellia cp genomes, sequence variation among species could still be detected, representing a wide phylogenetic diversity in the genus. Furthermore, phylogenomic analysis was conducted using 18 complete cp genomes and 5 draft cp genome sequences of Camellia species. Our results support Chang’s taxonomical treatment that C. pubicosta may be classified into sect. Thea, and indicate that the taxonomical value of the number of ovaries should be reconsidered when classifying the Camellia species. The availability of these cp genomes provides valuable genetic information for accurately identifying species, clarifying taxonomy and reconstructing the phylogeny of the genus Camellia. PMID:25001059

  9. Highly Incomplete Taxa Can Rescue Phylogenetic Analyses from the Negative Impacts of Limited Taxon Sampling

    PubMed Central

    Wiens, John J.; Tiu, Jonathan

    2012-01-01

    Background Phylogenies are essential to many areas of biology, but phylogenetic methods may give incorrect estimates under some conditions. A potentially common scenario of this type is when few taxa are sampled and terminal branches for the sampled taxa are relatively long. However, the best solution in such cases (i.e., sampling more taxa versus more characters) has been highly controversial. A widespread assumption in this debate is that added taxa must be complete (no missing data) in order to save analyses from the negative impacts of limited taxon sampling. Here, we evaluate whether incomplete taxa can also rescue analyses under these conditions (empirically testing predictions from an earlier simulation study). Methodology/Principal Findings We utilize DNA sequence data from 16 vertebrate species with well-established phylogenetic relationships. In each replicate, we randomly sample 4 species, estimate their phylogeny (using Bayesian, likelihood, and parsimony methods), and then evaluate whether adding in the remaining 12 species (which have 50, 75, or 90% of their data replaced with missing data cells) can improve phylogenetic accuracy relative to analyzing the 4 complete taxa alone. We find that in those cases where sampling few taxa yields an incorrect estimate, adding taxa with 50% or 75% missing data can frequently (>75% of relevant replicates) rescue Bayesian and likelihood analyses, recovering accurate phylogenies for the original 4 taxa. Even taxa with 90% missing data can sometimes be beneficial. Conclusions We show that adding taxa that are highly incomplete can improve phylogenetic accuracy in cases where analyses are misled by limited taxon sampling. These surprising empirical results confirm those from simulations, and show that the benefits of adding taxa may be obtained with unexpectedly small amounts of data. 
These findings have important implications for the debate on sampling taxa versus characters, and for studies attempting to resolve difficult phylogenetic problems. PMID:22900065

  10. Hierarchical Markov random-field modeling for texture classification in chest radiographs

    NASA Astrophysics Data System (ADS)

    Vargas-Voracek, Rene; Floyd, Carey E., Jr.; Nolte, Loren W.; McAdams, Page

    1996-04-01

    A hierarchical Markov random field (MRF) modeling approach is presented for the classification of textures in selected regions of interest (ROIs) of chest radiographs. The procedure integrates possible texture classes and their spatial definition with other components present in an image such as noise and background trend. Classification is performed as a maximum a-posteriori (MAP) estimation of texture class and involves an iterative Gibbs-sampling technique. Two cases are studied: classification of lung parenchyma versus bone and classification of normal lung parenchyma versus miliary tuberculosis (MTB). Accurate classification was obtained for all examined cases, showing the potential of the proposed modeling approach for texture analysis of radiographic images.

  11. Sparse Representation for Tumor Classification Based on Feature Extraction Using Latent Low-Rank Representation

    PubMed Central

    Zheng, Chun-Hou; Zhang, Jun; Wang, Hong-Qiang

    2014-01-01

    Accurate tumor classification is crucial to the proper treatment of cancer. To date, sparse representation (SR) has shown strong performance for tumor classification. This paper proposes a new SR-based method for tumor classification using gene expression data. In the proposed method, we first use latent low-rank representation to extract salient features and remove noise from the original sample data. Then we use a sparse representation classifier (SRC) to build the tumor classification model. The experimental results on several real-world data sets show that our method is more efficient and more effective than previous classification methods including SVM, SRC, and LASSO. PMID:24678505

  12. Security classification of information

    SciTech Connect

    Quist, A.S.

    1993-04-01

    This document is the second of a planned four-volume work that comprehensively discusses the security classification of information. The main focus of Volume 2 is on the principles for classification of information. Included herein are descriptions of the two major types of information that governments classify for national security reasons (subjective and objective information), guidance to use when determining whether information under consideration for classification is controlled by the government (a necessary requirement for classification to be effective), information disclosure risks and benefits (the benefits and costs of classification), standards to use when balancing information disclosure risks and benefits, guidance for assigning classification levels (Top Secret, Secret, or Confidential) to classified information, guidance for determining how long information should be classified (classification duration), classification of associations of information, classification of compilations of information, and principles for declassifying and downgrading information. Rules or principles of certain areas of our legal system (e.g., trade secret law) are sometimes mentioned to provide added support to some of those classification principles.

  13. Recursive heuristic classification

    NASA Technical Reports Server (NTRS)

    Wilkins, David C.

    1994-01-01

    The author will describe a new problem-solving approach called recursive heuristic classification, whereby a subproblem of heuristic classification is itself formulated and solved by heuristic classification. This allows the construction of more knowledge-intensive classification programs in a way that yields a clean organization. Further, standard knowledge acquisition and learning techniques for heuristic classification can be used to create, refine, and maintain the knowledge base associated with the recursively called classification expert system. The method of recursive heuristic classification was used in the Minerva blackboard shell for heuristic classification. Minerva recursively calls itself every problem-solving cycle to solve the important blackboard scheduler task, which involves assigning a desirability rating to alternative problem-solving actions. Knowing these ratings is critical to the use of an expert system as a component of a critiquing or apprenticeship tutoring system. One innovation of this research is a method called dynamic heuristic classification, which allows selection among dynamically generated classification categories instead of requiring them to be pre-enumerated.

  14. Molecular phylogenetics of mastodon and Tyrannosaurus rex.

    PubMed

    Organ, Chris L; Schweitzer, Mary H; Zheng, Wenxia; Freimark, Lisa M; Cantley, Lewis C; Asara, John M

    2008-04-25

    We report a molecular phylogeny for a nonavian dinosaur, extending our knowledge of trait evolution within nonavian dinosaurs into the macromolecular level of biological organization. Fragments of collagen alpha1(I) and alpha2(I) proteins extracted from fossil bones of Tyrannosaurus rex and Mammut americanum (mastodon) were analyzed with a variety of phylogenetic methods. Despite missing sequence data, the mastodon groups with elephant and the T. rex groups with birds, consistent with predictions based on genetic and morphological data for mastodon and on morphological data for T. rex. Our findings suggest that molecular data from long-extinct organisms may have the potential for resolving relationships at critical areas in the vertebrate evolutionary tree that have, so far, been phylogenetically intractable. PMID:18436782

  15. How accurate is Limber's equation?

    E-print Network

    P. Simon

    2007-08-24

    The so-called Limber equation is widely used in the literature to relate the projected angular clustering of galaxies to the spatial clustering of galaxies in an approximate way. This paper gives estimates of where the regime of applicability of Limber's equation stops. Limber's equation is accurate for small galaxy separations but breaks down beyond a certain separation that depends mainly on the ratio sigma/R and to some degree on the power-law index, gamma, of spatial clustering xi; sigma is the one-sigma width of the galaxy distribution in comoving distance, and R the mean comoving distance. As a rule of thumb, a 10% relative error is reached at 260 sigma/R arcmin for gamma~1.6, if the spatial clustering is a power-law. More realistic xi are discussed in the paper. Limber's equation becomes increasingly inaccurate for larger angular separations. Ignoring this effect and blindly applying Limber's equation can possibly bias results for the inferred spatial correlation. In cases of doubt, or perhaps even in general, the paper suggests using the exact equation, which can easily be integrated numerically in the form given in the paper.
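
    The rule of thumb quoted in this abstract is a single line of arithmetic. The Python sketch below evaluates it; the function name and the sample values of sigma and R are illustrative assumptions, not taken from the paper, and the constant 260 applies only to the gamma~1.6 power-law case stated in the abstract.

```python
def limber_10pct_separation_arcmin(sigma: float, r_mean: float) -> float:
    """Angular separation (arcmin) at which Limber's equation reaches ~10% relative error.

    Uses the abstract's rule of thumb, 260 * sigma / R, stated for a power-law
    spatial correlation with gamma ~ 1.6. sigma and r_mean must share units.
    """
    if sigma <= 0 or r_mean <= 0:
        raise ValueError("sigma and r_mean must be positive")
    return 260.0 * sigma / r_mean


# Hypothetical example values (not from the paper): sigma = 100, R = 1000, same units.
theta = limber_10pct_separation_arcmin(100.0, 1000.0)
print(f"{theta:.1f} arcmin")
```

    Because only the ratio sigma/R enters, a narrower galaxy distribution at the same mean distance lowers the separation at which the approximation starts to fail.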

  16. How Accurate Is Peer Grading?

    PubMed Central

    Parks, John W.

    2010-01-01

    Previously we showed that weekly, written, timed, and peer-graded practice exams help increase student performance on written exams and decrease failure rates in an introductory biology course. Here we analyze the accuracy of peer grading, based on a comparison of student scores to those assigned by a professional grader. When students graded practice exams by themselves, they were significantly easier graders than a professional; overall, students awarded ~25% more points than the professional did. This difference represented ~1.33 points on a 10-point exercise, or 0.27 points on each of the five 2-point questions posed. When students graded practice exams as a group of four, the same student-expert difference occurred. The student-professional gap was wider for questions that demanded higher-order versus lower-order cognitive skills. Thus, students not only have a harder time answering questions on the upper levels of Bloom's taxonomy, they have a harder time grading them. Our results suggest that peer grading may be accurate enough for low-risk assessments in introductory biology. Peer grading can help relieve the burden on instructional staff posed by grading written answers, making it possible to add practice opportunities that increase student performance on actual exams. PMID:21123695

  17. Intraregional classification of wine via ICP-MS elemental fingerprinting.

    PubMed

    Coetzee, P P; van Jaarsveld, F P; Vanhaecke, F

    2014-12-01

    The feasibility of elemental fingerprinting in the classification of wines according to their provenance vineyard soil was investigated in the relatively small geographical area of a single wine district. Results for the Stellenbosch wine district (Western Cape Wine Region, South Africa), comprising an area of less than 1,000 km², suggest that classification of wines from different estates (120 wines from 23 estates) is indeed possible using accurate elemental data and multivariate statistical analysis based on a combination of principal component analysis, cluster analysis, and discriminant analysis. This is the first study to demonstrate the successful classification of wines at estate level in a single wine district in South Africa. The elements B, Ba, Cs, Cu, Mg, Rb, Sr, Tl and Zn were identified as suitable indicators. White and red wines were grouped in separate data sets to allow successful classification of wines. Correlation between wine classification and soil type distributions in the area was observed. PMID:24996361

  18. Classification of earth terrain using polarimetric synthetic aperture radar images

    NASA Technical Reports Server (NTRS)

    Lim, H. H.; Swartz, A. A.; Yueh, H. A.; Kong, J. A.; Shin, R. T.; Van Zyl, J. J.

    1989-01-01

    Supervised and unsupervised classification techniques are developed and used to classify the earth terrain components from SAR polarimetric images of San Francisco Bay and Traverse City, Michigan. The supervised techniques include the Bayes classifiers, normalized polarimetric classification, and simple feature classification using discriminants such as the absolute and normalized magnitude response of individual receiver channel returns and the phase difference between receiver channels. An algorithm is developed as an unsupervised technique which classifies terrain elements based on the relationship between the orientation angle and the handedness of the transmitting and receiving polarization states. It is found that supervised classification produces the best results when accurate classifier training data are used, while unsupervised classification may be applied when training data are not available.

  19. An aetiological classification of birth defects for epidemiological research

    PubMed Central

    Wellesley, D; Boyd, P; Dolk, H; Pattenden, S

    2005-01-01

    Background: Congenital anomaly registers collect data on antenatally and postnatally detected anomalies for surveillance, research, and public health purposes. Each anomaly is coded using the International Statistical Classification of Diseases and Related Health Problems (ICD-9/ICD-10) based on body systems, allowing accurate comparisons between registers for individual anomalies. When commencing an environmental, epidemiological study, it became clear to us that there is no standard classification that takes aetiology into account. This paper describes a new classification for use in studies addressing aetiology. Method: A classification system was evolved and piloted using cases in a study of geographical variation in congenital anomaly prevalence. Cases that were difficult to categorise were noted, and after discussion with a team of experts, the classification was adjusted accordingly. Results and conclusion: A robust, hierarchical method of classifying birth defects into eight categories has been produced, for use at source of data registration in conjunction with, but independent of, ICD coding. PMID:15635076

  20. The revised lung adenocarcinoma classification-an imaging guide.

    PubMed

    Gardiner, Natasha; Jogai, Sanjay; Wallis, Adam

    2014-10-01

    Advances in our understanding of the pathology, radiology and clinical behaviour of peripheral lung adenocarcinomas facilitated a more robust terminology and classification of these lesions. The International Association for the Study of Lung Cancer/American Thoracic Society/European Respiratory Society (IASLC/ATS/ERS) classification introduced new terminology to better reflect this heterogeneous group of adenocarcinomas formerly known as bronchoalveolar cell carcinoma (BAC). There is now a clear distinction between pre-invasive, minimally invasive and frankly invasive lesions. The radiographic appearance of these ranges from pure ground glass nodules to solid mass lesions. Radiologists must be aware of the new classification in order to work alongside multidisciplinary colleagues to allow accurate staging and treatment. This article reviews the new classification of lung adenocarcinomas. Management options of these lesions with particular focus on radiological implications of the new classification will be reviewed. PMID:25349704

  1. Genetic and phylogenetic clustering of enteroviruses

    Microsoft Academic Search

    T. Poyry; L. Kinnunen; T. Hyypia; B. Brown; C. Horsnell; T. Hovi; G. Stanway

    1996-01-01

    Genetic and phylogenetic analysis of enteroviruses showed that in the 5'NCR enteroviruses formed three clusters: polioviruses (PVs), coxsackievirus A type 21 (CAV21), CAV24 and enterovirus type 70 (ENV70) formed one cluster; coxsackievirus B isolates (CBVs), CAV9, CAV16, ENV71, echovirus type 11 (EV11), EV12 and all partially sequenced echoviruses and swine vesicular disease virus (SVDV) belonged to another cluster and bovine

  2. The phylogenetic position of Udonella (Platyhelminthes)

    Microsoft Academic Search

    D. T. J. Littlewood; K. Rohde; K. A. Clough

    1998-01-01

    Phylogenetic analysis of molecular data from complete 18S rRNA and partial 28S rRNA genes, of a variety of platyhelminths, places the enigmatic Udonella caligorum firmly as a monopisthocotylean monogenean. Both maximum parsimony and a modified distance measure, operating under a maximum likelihood model, gave identical solutions for each data set. These data further support morphological evidence from ultrastructural studies indicating

  3. Topology improves phylogenetic motif functional site predictions.

    PubMed

    Kc, Dukka B; Livesay, Dennis R

    2011-01-01

    Prediction of protein functional sites from sequence-derived data remains an open bioinformatics problem. We have developed a phylogenetic motif (PM) functional site prediction approach that identifies functional sites from alignment fragments that parallel the evolutionary patterns of the family. In our approach, PMs are identified by comparing tree topologies of each alignment fragment to that of the complete phylogeny. Herein, we bypass the phylogenetic reconstruction step and identify PMs directly from distance matrix comparisons. In order to optimize the new algorithm, we consider three different distance matrices and 13 different matrix similarity scores. We assess the performance of the various approaches on a structurally nonredundant data set that includes three types of functional site definitions. Without exception, the predictive power of the original approach outperforms the distance matrix variants. While the distance matrix methods fail to improve upon the original approach, our results are important because they clearly demonstrate that the improved predictive power is based on the topological comparisons, meaning that phylogenetic trees are a straightforward yet powerful way to improve functional site prediction accuracy. While complementary studies have shown that topology improves predictions of protein-protein interactions, this report represents the first demonstration that trees improve functional site predictions as well. PMID:21071810
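
    The distance-matrix variant described in this abstract scores each alignment fragment by how closely its pairwise distances track those of the whole family. The abstract does not name the 13 similarity scores it tested, so the Python sketch below assumes one plausible choice, the Pearson correlation over the upper triangles of the two matrices. All function names and the toy matrices are hypothetical. Note that the abstract reports these matrix-based variants did not outperform the original tree-topology comparison.

```python
from math import sqrt


def upper_triangle(matrix):
    """Flatten the strict upper triangle of a square distance matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant matrix carries no ordering information
    return cov / sqrt(var_x * var_y)


def matrix_similarity(fragment_dist, family_dist):
    """Score how closely a fragment's pairwise distances track the family's."""
    return pearson(upper_triangle(fragment_dist), upper_triangle(family_dist))


# Toy 3-sequence matrices (made-up numbers, for illustration only).
family = [[0.0, 1.0, 2.0], [1.0, 0.0, 3.0], [2.0, 3.0, 0.0]]
fragment_a = [[0.0, 2.0, 4.0], [2.0, 0.0, 6.0], [4.0, 6.0, 0.0]]  # same pattern, rescaled
fragment_b = [[0.0, 3.0, 2.0], [3.0, 0.0, 1.0], [2.0, 1.0, 0.0]]  # reversed pattern

print(matrix_similarity(fragment_a, family))
print(matrix_similarity(fragment_b, family))
```

    Under this assumed score, fragments whose distances rise and fall with the family's would be candidate motifs, while fragments with an unrelated or reversed pattern would be rejected.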

  4. Phylogenetic overdispersion in Floridian oak communities.

    PubMed

    Cavender-Bares, J; Ackerly, D D; Baum, D A; Bazzaz, F A

    2004-06-01

    Closely related species that occur together in communities and experience similar environmental conditions are likely to share phenotypic traits because of the process of environmental filtering. At the same time, species that are too similar are unlikely to co-occur because of competitive exclusion. In an effort to explain the coexistence of 17 oak species within forest communities in North Central Florida, we examined correlations between the phylogenetic relatedness of oak species, their degree of co-occurrence within communities and niche overlap across environmental gradients, and their similarity in ecophysiological and life-history traits. We show that the oaks are phylogenetically overdispersed because co-occurring species are more distantly related than expected by chance, and oaks within the same clade show less niche overlap than expected. Hence, communities are more likely to include members of both the red oak and the white + live oak clades than only members of one clade. This pattern of phylogenetic overdispersion arises because traits important for habitat specialization show evolutionary convergence. We hypothesize further that certain conserved traits permit coexistence of distantly related congeners. These results provide an explanation for how oak diversity is maintained at the community level in North Central Florida. PMID:15266381

  5. Exploring hierarchical visualization designs using phylogenetic trees

    NASA Astrophysics Data System (ADS)

    Li, Shaomeng; Crouser, R. Jordan; Griffin, Garth; Gramazio, Connor; Schulz, Hans-Jörg; Childs, Hank; Chang, Remco

    2015-01-01

    Ongoing research on information visualization has produced an ever-increasing number of visualization designs. Despite this activity, limited progress has been made in categorizing this large number of information visualizations. This makes understanding their common design features challenging, and obscures the yet unexplored areas of novel designs. With this work, we provide categorization from an evolutionary perspective, leveraging a computational model to represent evolutionary processes, the phylogenetic tree. The result - a phylogenetic tree of a design corpus of hierarchical visualizations - enables better understanding of the various design features of hierarchical information visualizations, and further illuminates the space in which the visualizations lie, through support for interactive clustering and novel design suggestions. We demonstrate these benefits with our software system, where a corpus of two-dimensional hierarchical visualization designs is constructed into a phylogenetic tree. This software system supports visual interactive clustering and suggesting for novel designs; the latter capacity is also demonstrated via collaboration with an artist who sketched new designs using our system.

  6. Phylogenetic relationships in Brassicaceae tribe Alysseae inferred from nuclear ribosomal and chloroplast DNA sequence data.

    PubMed

    Rešetnik, Ivana; Satovic, Zlatko; Schneeweiss, Gerald M; Liber, Zlatko

    2013-12-01

    Numerous molecular systematic studies within Brassicaceae have resulted in a strongly improved classification of the family, as morphologically defined units at and above the generic level were often found to poorly reflect phylogenetic relationships. Here, we focus on tribe Alysseae, which despite its size (accounting for about 7% of all species) has only received limited coverage in previous phylogenetic studies. Specifically, we want to test phylogenetic hypotheses implied by current tribal and generic circumscriptions and to put diversification within tribe Alysseae into a temporal context. To this end, sequence data from the nrDNA ITS and two plastid regions (ndhF gene, trnL-F intergenic spacer) were obtained for 176 accessions, representing 16 out of 17 currently recognized genera of the tribe, and were phylogenetically analysed, among others, using a relaxed molecular clock. Due to large discrepancies with respect to published ages of Brassicaceae, age estimates concerning Alysseae are, however, burdened with considerable uncertainty. The tribe is monophyletic and contains four strongly supported major clades and Alyssum homalocarpum, whose relationships among each other remain uncertain due to incongruences between nuclear and plastid DNA markers. The largest genus of the tribe, Alyssum, is not monophyletic and contains, apart from A. homalocarpum, two distinct lineages, corresponding to sections Alyssum, Psilonema, Gamosepalum and to sections Odontarrhena and Meniocus, respectively. Clypeola, whose monophyly is supported only by the plastid data, is very closely related to and possibly nested within the second Alyssum lineage. Species of the genus Fibigia intermingle with those of Alyssoides, Clastopus, Degenia, and Physoptychis, rendering Fibigia polyphyletic. The monotypic genera Leptoplax and Physocardamum are embedded in Bornmuellera. PMID:23850498

  7. Classification of Stellar Spectra

    NASA Astrophysics Data System (ADS)

    Garrison, R.; Murdin, P.

    2000-11-01

    How does a scientist approach the problem of trying to understand countless billions of objects? One of the first steps is to organize the data and set up a classification scheme which can provide the best insights into the nature of the objects. Perception and insight are the main purposes of classification. In astronomy, where there are 'billions and billions' of stars, classification is an ong...

  8. Classiology and soil classification

    NASA Astrophysics Data System (ADS)

    Rozhkov, V. A.

    2012-03-01

    Classiology can be defined as a science studying the principles and rules of classification of objects of any nature. The development of the theory of classification and the particular methods for classifying objects are the main challenges of classiology; to a certain extent, they are close to the challenges of pattern recognition. The methodology of classiology integrates a wide range of methods and approaches: from expert judgment to formal logic, multivariate statistics, and informatics. Soil classification assumes generalization of available data and practical experience, formalization of our notions about soils, and their representation in the form of an information system. As an information system, soil classification is designed to predict the maximum number of a soil's properties from the position of this soil in the classification space. The existing soil classification systems do not completely satisfy the principles of classiology. The violation of logical basis, poor structuring, low integrity, and inadequate level of formalization make these systems verbal schemes rather than classification systems sensu stricto. The concept of classification as listing (enumeration) of objects makes it possible to introduce the notion of the information base of classification. For soil objects, this is the database of soil indices (properties) that might be applied for generating target-oriented soil classification system. Mathematical methods enlarge the prognostic capacity of classification systems; they can be applied to assess the quality of these systems and to recognize new soil objects to be included in the existing systems. The application of particular principles and rules of classiology for soil classification purposes is discussed in this paper.

  9. Ecology/Geography Classification

    NSDL National Science Digital Library

    Brianne Meick

    This short lesson was designed in collaboration with a 7th grade Life Science teacher (Paul Jeffery). The idea behind the lesson is to help students better understand ecological and geographical classifications by teaching them at the same time in their Life Science class and their Geography class. Teaching the two classifications together will help reinforce the idea of classification. While this lesson would best be taught outdoors it can also be adapted to the indoors.

  10. Biomarker selection and classification of "-omics" data using a two-step bayes classification framework.

    PubMed

    Assawamakin, Anunchai; Prueksaaroon, Supakit; Kulawonganunchai, Supasak; Shaw, Philip James; Varavithya, Vara; Ruangrajitpakorn, Taneth; Tongsima, Sissades

    2013-01-01

    Identification of suitable biomarkers for accurate prediction of phenotypic outcomes is a goal for personalized medicine. However, current machine learning approaches are either too complex or perform poorly. Here, a novel two-step machine-learning framework is presented to address this need. First, a Naïve Bayes estimator is used to rank features from which the top-ranked will most likely contain the most informative features for prediction of the underlying biological classes. The top-ranked features are then used in a Hidden Naïve Bayes classifier to construct a classification prediction model from these filtered attributes. In order to obtain the minimum set of the most informative biomarkers, the bottom-ranked features are successively removed from the Naïve Bayes-filtered feature list one at a time, and the classification accuracy of the Hidden Naïve Bayes classifier is checked for each pruned feature set. The performance of the proposed two-step Bayes classification framework was tested on different types of -omics datasets including gene expression microarray, single nucleotide polymorphism microarray (SNParray), and surface-enhanced laser desorption/ionization time-of-flight (SELDI-TOF) proteomic data. The proposed two-step Bayes classification framework was equal to and, in some cases, outperformed other classification methods in terms of prediction accuracy, minimum number of classification markers, and computational time. PMID:24106694
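
    The pruning step in this abstract, which removes bottom-ranked features one at a time and rechecks accuracy after each removal, can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the `evaluate` callable stands in for training and scoring the Hidden Naïve Bayes classifier, the rule of keeping the smallest set that matches the best accuracy is an assumed stopping criterion, and the feature names and toy scoring function are invented for the example.

```python
def prune_ranked_features(ranked_features, evaluate):
    """Backward-prune a ranked feature list, one bottom-ranked feature at a time.

    ranked_features: feature names ordered best-first (e.g. by a Naive Bayes ranking).
    evaluate: callable taking a list of features and returning classification accuracy.
    Returns the smallest prefix of the ranking that attains the best accuracy seen,
    together with that accuracy.
    """
    best_features = list(ranked_features)
    best_accuracy = evaluate(best_features)
    current = list(ranked_features)
    while len(current) > 1:
        current = current[:-1]  # drop the lowest-ranked remaining feature
        accuracy = evaluate(current)
        if accuracy >= best_accuracy:  # ties favor the smaller feature set
            best_features, best_accuracy = list(current), accuracy
    return best_features, best_accuracy


# Hypothetical stand-in for a real classifier: informative features add accuracy,
# noise features cost a little.
INFORMATIVE = {"g1", "g2"}


def toy_evaluate(features):
    hits = sum(1 for f in features if f in INFORMATIVE)
    noise = len(features) - hits
    return 0.5 + 0.2 * hits - 0.01 * noise


selected, accuracy = prune_ranked_features(["g1", "g2", "n1", "n2"], toy_evaluate)
print(selected, accuracy)
```

    In a real pipeline `evaluate` would typically use cross-validation, since accuracy measured on the training data alone tends to favor larger feature sets.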

  11. INVENTORY AND CLASSIFICATION OF GREAT LAKES COASTAL WETLANDS FOR MONITORING AND ASSESSMENT AT LARGE SPATIAL SCALES

    EPA Science Inventory

    Monitoring aquatic resources for regional assessments requires an accurate and comprehensive inventory of the resource and useful classification of ecosystem similarities. Our research effort to create an electronic database and work with various ways to classify coastal wetlands...

  12. Hyperspectral Data Classification Using Factor Graphs

    NASA Astrophysics Data System (ADS)

    Makarau, A.; Müller, R.; Palubinskas, G.; Reinartz, P.

    2012-07-01

    Accurate classification of hyperspectral data is still a competitive task, and new classification methods are being developed to meet the needs of hyperspectral data use. The objective of this paper is to develop a new method for hyperspectral data classification that ensures classification model properties such as transferability, generalization, and probabilistic interpretation. Although factor graphs (undirected graphical models) are not widely employed in remote sensing tasks, these models possess important properties, such as the ability to represent complex systems when modeling estimation and decision-making tasks. In this paper we present a new method for hyperspectral data classification using factor graphs. A factor graph (a bipartite graph consisting of variable and factor vertices) allows factorization of a more complex function, leading to the definition of variables (employed to store input data), latent variables (which bridge an abstract class to the data), and factors (which define prior probabilities for spectral features and abstract classes, map the input data to a mixture of spectral features, and bridge that mixture to an abstract class). Latent variables play an important role by defining a two-level mapping of the input spectral features to a class. Configuration (learning) of the model on training data yields a parameter set that lets the model bridge the input data to a class. The classification algorithm is as follows. Spectral bands are separately pre-processed (using unsupervised clustering) so that they are defined on a finite domain (alphabet), which leads to a representation of the data as a multinomial distribution. The represented hyperspectral data is used as input evidence (the evidence vector is selected pixelwise) in a configured factor graph, and inference is run to obtain the posterior probability. Variational inference (mean field) gives plausible results with a low calculation time. Calculating the posterior probability for each class and comparing the probabilities leads to classification. Since the factor graphs operate on input data represented on an alphabet (the represented data transferred into a multinomial distribution), the number of training samples can be relatively low. Classification assessment on the Salinas hyperspectral data benchmark gave a competitive classification accuracy. Training data consisting of 20 randomly selected points per class gave an overall classification accuracy of 85.32% and a Kappa of 0.8358. Representing the input data on a finite domain avoids the curse of dimensionality, allowing the use of large hyperspectral data with a moderately high number of bands.
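
    The two figures reported in this abstract, overall accuracy and Kappa, are both derived from a confusion matrix. The Python sketch below shows the standard Cohen's kappa calculation; the function name and the toy 2-class matrix are assumptions made for illustration, and the numbers are not the Salinas results.

```python
def overall_accuracy_and_kappa(confusion):
    """Compute overall accuracy and Cohen's kappa from a square confusion matrix.

    confusion[i][j] counts samples of true class i that were predicted as class j.
    """
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    n = len(confusion)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(n)) / total
    # Chance agreement: product of matching row and column marginals.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(n)) for j in range(n)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)
    if expected == 1.0:
        # Kappa is undefined when chance agreement is total; this sketch returns 0.0.
        return observed, 0.0
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Toy 2-class confusion matrix (made-up counts, for illustration only).
toy = [[40, 10],
       [5, 45]]
acc, kappa = overall_accuracy_and_kappa(toy)
print(f"overall accuracy = {acc:.2f}, kappa = {kappa:.2f}")
```

    Kappa discounts the agreement expected by chance, which is why it is usually somewhat lower than the overall accuracy, as in the 85.32% versus 0.8358 pair quoted above.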

  13. The classification of lands managed for conservation: existing and proposed frameworks, with particular reference to Australia

    Microsoft Academic Search

    James A. Fitzsimons; Geoff Wescott

    2004-01-01

    Comprehensive classification systems to accurately account for lands managed for biodiversity conservation, are an essential component of conservation planning and policy. The current international classification systems for lands managed for nature conservation are reviewed, with a particular emphasis on Australia. The need for a broader, all-encompassing, categorisation of lands managed for conservation is presented and a proposed broader categorisation system

  14. Molecular classification of Pakistani collared dove through DNA barcoding.

    PubMed

    Awan, Ali Raza; Umar, Emma; Zia ul Haq, Muhammad; Firyal, Sehrish

    2013-11-01

    Pakistan is endowed with a diverse array of wild bird species, including collared doves, whose taxonomy has been little studied and reported. DNA barcoding is a geno-taxonomic tool that has been used for characterization of bird species using the mitochondrial cytochrome c oxidase I gene (COI). This study aimed to identify the taxonomic order of the Pakistani collared dove using DNA barcoding. To this end, we present a phylogenetic analysis of the Pakistani collared dove based on 650 base pairs of COI gene sequences. Analysis of the phylogenetic tree revealed that the Pakistani collared dove shared a common clade with the Eurasian collared dove (Streptopelia decaocto) and the African collared dove (Streptopelia roseogrisea), which indicated a super-species group in the Streptopelia genus. This is the first report of molecular classification of the Pakistani collared dove using DNA barcoding. PMID:24072655

  15. CTEP Simplified Disease Classification Overview

    Cancer.gov

    The CTEP Simplified Disease Classification (CTEP SDC) v1.0 is a restructured, more intuitive classification of diseases, designed to meet the needs of CTEP while still allowing reporting based on the

  16. Genetic characterization of Indian peste des petits ruminants virus (PPRV) by sequencing and phylogenetic analysis of fusion protein and nucleoprotein gene segments.

    PubMed

    Kerur, N; Jhala, M K; Joshi, C G

    2008-08-01

    Peste des petits ruminants (PPR) is an important viral disease of sheep and goats, endemic in India. The study was undertaken to characterize the local PPRV by sequencing fusion (F) protein and nucleoprotein (N) gene segments and phylogenetic analysis, so as to focus on genetic variation in the field viruses. Selected regions of the PPRV genome were amplified by RT-PCR from clinical samples collected from 32 sheep and goats, and the resulting amplicons were sequenced for phylogenetic analysis. The phylogenetic tree based on the 322bp F gene sequences of PPRV from five different locations clustered them into lineage 4 along with other Asian isolates. The 425bp N gene sequences, by contrast, revealed a different pattern of branching, yielding three distinct clusters for Nigerian, Turkish and Indian isolates. Thus, classification of PPRV into lineages based on the N gene sequences appeared to yield a better picture of molecular epidemiology for PPRV. PMID:17850836

  17. Classification in Art.

    ERIC Educational Resources Information Center

    DiMaggio, Paul

    1987-01-01

    Proposes a framework for analyzing the relationships between social structure, patterns of artistic consumption and production, and the classification of artistic genres. Societies' artistic classification systems vary along four dimensions: differentiation, hierarchy, universality, and boundary strength. These are affected by social structure…

  18. Library Classification 2020

    ERIC Educational Resources Information Center

    Harris, Christopher

    2013-01-01

    In this article the author explores how a new library classification system might be designed using some aspects of the Dewey Decimal Classification (DDC) and ideas from other systems to create something that works for school libraries in the year 2020. By examining what works well with the Dewey Decimal System, what features should be carried…

  19. [AO classification of fractures].

    PubMed

    Tanaka, Tadashi

    2003-10-01

    In the AO classification, the diagnosis of a fracture is expressed by its location (bone and segment) and morphological features, and is characterized by a hierarchical organization into triads. This classification is comprehensive and very useful in daily clinical practice. PMID:15775205

  20. DUTY STATEMENT CLASSIFICATION

    E-print Network

    studies, environmental impact reports, and Commission reports. (E) CLASSIFICATION: Planner I - EFS of Environmental Impact Reports submitted to the Commission and prepares assessments of those sections. (M) 5DUTY STATEMENT CLASSIFICATION: Planner I - EFS POSITION NUMBER: 760-4734-XXX CBID: R01 WORKING

  1. Engineering rock mass classifications

    Microsoft Academic Search

    Z. T. Bieniawski

    1989-01-01

    This book is a reference on rock mass classification, consolidating into one handy source information widely scattered through the literature. Includes new, unpublished material and case histories. Presents the fundamental concepts of classification schemes and critically appraises their practical application in industrial projects such as tunneling and mining.

  2. Phylogenetic Interrelationships of Ginglymodian Fishes (Actinopterygii: Neopterygii)

    PubMed Central

    López-Arbarello, Adriana

    2012-01-01

    The Ginglymodi is one of the most common, though poorly understood groups of neopterygians, which includes gars, macrosemiiforms, and “semionotiforms.” In particular, the phylogenetic relationships between the widely distributed “semionotiforms,” and between them and other ginglymodians have been enigmatic. Here, the phylogenetic relationships between eight of the 11 “semionotiform” genera, five genera of living and fossil gars and three macrosemiid genera, are analysed through cladistic analysis, based on 90 morphological characters and 37 taxa, including 7 out-group taxa. The results of the analysis show that the Ginglymodi includes two main lineages: Lepisosteiformes and †Semionotiformes. The genera †Pliodetes, †Araripelepidotes, †Lepidotes, †Scheenstia, and †Isanichthys are lepisosteiforms, and not semionotiforms, as previously thought, and these taxa extend the stratigraphic range of the lineage leading to gars back to the Early Jurassic. A monophyletic †Lepidotes is restricted to the Early Jurassic species, whereas the strongly tritoral species previously referred to †Lepidotes are referred to †Scheenstia. Other species previously referred to †Lepidotes represent other genera or new taxa. The macrosemiids are well nested within semionotiforms, together with †Semionotidae, here restricted to †Semionotus, and a new family including †Callipurbeckia n. gen. minor (previously referred to †Lepidotes), †Macrosemimimus, †Tlayuamichin, †Paralepidotus, and †Semiolepis. Due to the numerous taxonomic changes needed according to the phylogenetic analysis, this article also includes formal taxonomic definitions and diagnoses for all generic and higher taxa, which are new or modified. The study of Mesozoic ginglymodians confirms Patterson’s observation that these fishes show morphological affinities with both halecomorphs and teleosts. Therefore, the compilation of large data sets including the Mesozoic ginglymodians and the re-evaluation of several hypotheses of homology are essential to test the hypotheses of the Halecostomi vs. the Holostei, which is one of the major topics in the evolution of Mesozoic vertebrates and the origin of modern fish faunas. PMID:22808031

  3. PHYLOGENY AND CLASSIFICATION OF FINLAYA AND ALLIED TAXA (DIPTERA: CULICIDAE: AEDINI) BASED ON MORPHOLOGICAL DATA FROM ALL LIFE STAGES

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The phylogenetic relationships and generic assignments of ‘Finlaya’ and related taxa of uncertain taxonomic position in the classification of Aedini proposed by Reinert et al. (2004) are explored using 232 characters from eggs, fourth-instar larvae, pupae, adults and immature habitat coded for 116 e...

  4. Multi-locus phylogenetic analysis reveals the pattern and tempo of bony fish evolution

    PubMed Central

    Broughton, Richard E.; Betancur-R., Ricardo; Li, Chenhong; Arratia, Gloria; Ortí, Guillermo

    2013-01-01

    Over half of all vertebrates are “fishes”, which exhibit enormous diversity in morphology, physiology, behavior, reproductive biology, and ecology. Investigation of fundamental areas of vertebrate biology depends critically on a robust phylogeny of fishes, yet evolutionary relationships among the major actinopterygian and sarcopterygian lineages have not been conclusively resolved. Although a consensus phylogeny of teleosts has been emerging recently, it has been based on analyses of various subsets of actinopterygian taxa, but not on a full sample of all bony fishes. Here we conducted a comprehensive phylogenetic study on a broad taxonomic sample of 61 actinopterygian and sarcopterygian lineages (with a chondrichthyan outgroup) using a molecular data set of 21 independent loci. These data yielded a resolved phylogenetic hypothesis for extant Osteichthyes, including 1) reciprocally monophyletic Sarcopterygii and Actinopterygii, as currently understood, with polypteriforms as the first diverging lineage within Actinopterygii; 2) a monophyletic group containing gars and bowfin (= Holostei) as sister group to teleosts; and 3) the earliest diverging lineage among teleosts being Elopomorpha, rather than Osteoglossomorpha. Relaxed-clock dating analysis employing a set of 24 newly applied fossil calibrations reveals divergence times that are more consistent with paleontological estimates than previous studies. Establishing a new phylogenetic pattern with accurate divergence dates for bony fishes illustrates several areas where the fossil record is incomplete and provides critical new insights on diversification of this important vertebrate group. PMID:23788273

  5. Novel multi-sample scheme for inferring phylogenetic markers from whole genome tumor profiles

    PubMed Central

    Subramanian, Ayshwarya; Shackney, Stanley; Schwartz, Russell

    2013-01-01

    Computational cancer phylogenetics seeks to enumerate the temporal sequences of aberrations in tumor evolution, thereby delineating the evolution of possible tumor progression pathways, molecular subtypes and mechanisms of action. We previously developed a pipeline for constructing phylogenies describing evolution between major recurring cell types computationally inferred from whole-genome tumor profiles. The accuracy and detail of the phylogenies, however, depend on the identification of accurate, high-resolution molecular markers of progression, i.e., reproducible regions of aberration that robustly differentiate different subtypes and stages of progression. Here we present a novel hidden Markov model (HMM) scheme for the problem of inferring such phylogenetically significant markers through joint segmentation and calling of multi-sample tumor data. Our method classifies sets of genome-wide DNA copy number measurements into a partitioning of samples into normal (diploid) or amplified at each probe. It differs from other similar HMM methods in its design specifically for the needs of tumor phylogenetics, by seeking to identify robust markers of progression conserved across a set of copy number profiles. We show an analysis of our method in comparison to other methods on both synthetic and real tumor data, which confirms its effectiveness for tumor phylogeny inference and suggests avenues for future advances. PMID:24407301

  6. Probabilistic Graphical Model Representation in Phylogenetics

    E-print Network

    Höhna, Sebastian; Heath, Tracy A.; Boussau, Bastien; Landis, Michael J.; Ronquist, Fredrik; Huelsenbeck, John P.

    2014-06-20

    to use Bayesian methods to conduct these inferences, which means that we will need to specify prior probability distributions for the variables of our models. To simplify our analyses, we will sample five species: a dog, a bat, a rat, a human, and a koala... [Figure 2: graphical model with a Beta prior distribution (Beta parameters) on a Bernoulli parameter p; observed states (presence/absence) 1, 1, 1, 0, 0 listed with dog, bat, rat, human, koala]...
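
    The Beta prior and Bernoulli observations mentioned in this excerpt can be illustrated with a conjugate update. The following is only a minimal sketch: the excerpt does not give the Beta parameters, so a uniform Beta(1, 1) prior is assumed here, and the function names are hypothetical.

```python
from fractions import Fraction


def beta_bernoulli_posterior(alpha, beta, observations):
    """Conjugate update: Beta(alpha, beta) prior plus 0/1 observations."""
    successes = sum(observations)
    failures = len(observations) - successes
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution as an exact fraction."""
    return Fraction(alpha, alpha + beta)


# Presence/absence states as listed in the excerpt (dog, bat, rat, human, koala).
observed = [1, 1, 1, 0, 0]

# ASSUMPTION: uniform Beta(1, 1) prior; the excerpt leaves the parameters unspecified.
post_alpha, post_beta = beta_bernoulli_posterior(1, 1, observed)
posterior_mean = beta_mean(post_alpha, post_beta)
print(post_alpha, post_beta, posterior_mean)
```

    With a different prior only the two starting parameters change; the update rule stays the same.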

  7. Sequence and Phylogenetic Analysis of FAD Synthetase

    NASA Astrophysics Data System (ADS)

    Schubert, Luisa; Frago, Susana; Martínez-Júlvez, Marta; Medina, Milagros

    2006-08-01

    An evolutionary analysis of the sequences available to date for FAD synthetases has been carried out. Several identical conserved residues have been observed along the sequences of all the FAD synthetases analyzed, which might correlate with a role for these residues in the catalytic activity of the enzyme. Phylogenetic analysis shows that FAD synthetase sequences can be organized in two main clusters. One of them mainly contains temperature, pressure or pH resistant organisms, whereas in the other one organisms with pathogenic character can be found.

  8. Phylogenetic tree construction based on 2D graphical representation

    NASA Astrophysics Data System (ADS)

    Liao, Bo; Shan, Xinzhou; Zhu, Wen; Li, Renfa

    2006-04-01

    A new approach based on the two-dimensional (2D) graphical representation of the whole genome sequence [Bo Liao, Chem. Phys. Lett., 401(2005) 196.] is proposed to analyze the phylogenetic relationships of genomes. The evolutionary distances are obtained by measuring the differences among the 2D curves. Fuzzy theory is used to construct the phylogenetic tree. The phylogenetic relationships of H5N1 avian influenza virus illustrate the utility of our approach.

  9. Best Practices for Data Sharing in Phylogenetic Research

    PubMed Central

    Cranston, Karen; Harmon, Luke J.; O'Leary, Maureen A.; Lisle, Curtis

    2014-01-01

    As phylogenetic data becomes increasingly available, along with associated data on species’ genomes, traits, and geographic distributions, the need to ensure data availability and reuse becomes more and more acute. In this paper, we provide ten “simple rules” that we view as best practices for data sharing in phylogenetic research. These rules will help lead towards a future phylogenetics where data can easily be archived, shared, reused, and repurposed across a wide variety of projects. PMID:24987572

  10. Phylogenetic basis for a taxonomic dissection of the genus Clostridium

    Microsoft Academic Search

    E. Stackebrandt; I. Kramer; J. Swiderski; H. Hippe

    1999-01-01

    The 16S rDNA-based phylogenetic analysis of the genus Clostridium has been completed by determination of the phylogenetic position of the type strains of 15 species and two non-validated species. These strains are members of phylogenetic clusters I, III, IV, V, IX, XIVa and XVIII as defined previously by Collins et al. [Int. J. Syst. Bacteriol. 44 (1994) 812–826]. Members of

  11. A Novel Vehicle Classification Using Embedded Strain Gauge Sensors

    PubMed Central

    Zhang, Wenbin; Wang, Qi; Suo, Chunguang

    2008-01-01

    This paper presents a new vehicle classification and develops a traffic monitoring detector to provide reliable vehicle classification to aid traffic management systems. The basic principle of this approach is based on measuring the dynamic strain caused by vehicles across pavement to obtain the corresponding vehicle parameters – wheelbase and number of axles – to then accurately classify the vehicle. A system prototype with five embedded strain sensors was developed to validate the accuracy and effectiveness of the classification method. According to the special arrangement of the sensors and the different times at which a vehicle arrives at the sensors, one can estimate the vehicle's speed accurately, corresponding to the estimated vehicle wheelbase and number of axles. Because of measurement errors and vehicle characteristics, there is a lot of overlap between vehicle wheelbase patterns. Therefore, directly setting up a fixed threshold for vehicle classification often leads to low-accuracy results. Using the machine learning pattern recognition method to deal with this problem is believed to be one of the most effective approaches. In this study, support vector machines (SVMs) were used to integrate the classification features extracted from the strain sensors to automatically classify vehicles into five types, ranging from small vehicles to combination trucks, along the lines of the Federal Highway Administration vehicle classification guide. Test bench and field experiments are introduced in this paper. Two support vector machine classification algorithms (one-against-all, one-against-one) are used to classify single sensor data and multiple sensor combination data. Comparison of the two classification method results shows that the classification accuracy is very close using single data or multiple data. Our results indicate that multiclass SVM-based fusion of multiple sensor data, trained on the whole multisensor data set, significantly improves on the results from single sensor data.
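
    The speed and wheelbase estimation described in this abstract can be sketched with simple kinematics. This is an illustrative sketch only: the sensor spacing, the timestamps, the constant-speed assumption, and the function names are all hypothetical and are not taken from the paper.

```python
def estimate_speed(sensor_spacing_m, t_first_s, t_second_s):
    """Speed from the time the same axle takes to travel between two sensors."""
    dt = t_second_s - t_first_s
    if dt <= 0:
        raise ValueError("second sensor must be hit after the first")
    return sensor_spacing_m / dt


def estimate_wheelbases(speed_mps, axle_times_s):
    """Axle-to-axle distances from successive axle hits on ONE sensor.

    ASSUMPTION: the vehicle moves at constant speed over the sensor.
    """
    return [speed_mps * (later - earlier)
            for earlier, later in zip(axle_times_s, axle_times_s[1:])]


# Hypothetical readings: sensors 2.0 m apart, front axle hits them at 0.00 s and 0.10 s.
speed = estimate_speed(2.0, 0.00, 0.10)
# Hypothetical two-axle vehicle: axles hit the first sensor at 0.00 s and 0.14 s.
wheelbases = estimate_wheelbases(speed, [0.00, 0.14])
axle_count = len(wheelbases) + 1
print(speed, wheelbases, axle_count)
```

    The resulting wheelbase list and axle count are the kind of features the abstract says are passed to the SVM classifier; the classifier itself is not sketched here.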

  12. Phylogenetic relationships of Mesoamerican spider monkeys (Ateles geoffroyi): Molecular evidence suggests the need for a revised taxonomy.

    PubMed

    Morales-Jimenez, Alba Lucia; Cortés-Ortiz, Liliana; Di Fiore, Anthony

    2015-01-01

    Mesoamerican spider monkeys (Ateles geoffroyi sensu lato) are widely distributed from Mexico to northern Colombia. This group of primates includes many allopatric forms with morphologically distinct pelage color and patterning, but its taxonomy and phylogenetic history are poorly understood. We explored the genetic relationships among the different forms of Mesoamerican spider monkeys using mtDNA sequence data, and we offer a new hypothesis for the evolutionary history of the group. We collected up to ~800 bp of DNA sequence data from hypervariable region 1 (HV1) of the control region, or D-loop, of the mitochondrion for multiple putative subspecies of Ateles geoffroyi sensu lato. Both maximum likelihood and Bayesian reconstructions, using Ateles paniscus as an outgroup, showed that (1) A. fusciceps and A. geoffroyi form two different monophyletic groups and (2) currently recognized subspecies of A. geoffroyi are not monophyletic. Within A. geoffroyi, our phylogenetic analysis revealed little concordance between any of the classifications proposed for this taxon and their phylogenetic relationships; therefore, a new classification is needed for this group. Several possible clades with recent divergence times (1.7-0.8 Ma) were identified within Ateles geoffroyi sensu lato. Some previously recognized taxa were not separated by our data (e.g., A. g. vellerosus and A. g. yucatanensis), while one distinct clade had never been described as a different evolutionary unit based on pelage or geography (Ateles geoffroyi ssp. indet. from El Salvador). Based on well-supported phylogenetic relationships, our results challenge previous taxonomic arrangements for Mesoamerican spider monkeys. We suggest a revised arrangement based on our data and call for a thorough taxonomic revision of this group. PMID:25451800

  13. Progressive Classification Using Support Vector Machines

    NASA Technical Reports Server (NTRS)

    Wagstaff, Kiri; Kocurek, Michael

    2009-01-01

    An algorithm for progressive classification of data, analogous to progressive rendering of images, makes it possible to compromise between speed and accuracy. This algorithm uses support vector machines (SVMs) to classify data. An SVM is a machine learning algorithm that builds a mathematical model of the desired classification concept by identifying the critical data points, called support vectors. Coarse approximations to the concept require only a few support vectors, while precise, highly accurate models require far more support vectors. Once the model has been constructed, the SVM can be applied to new observations. The cost of classifying a new observation is proportional to the number of support vectors in the model. When computational resources are limited, an SVM of the appropriate complexity can be produced. However, if the constraints are not known when the model is constructed, or if they can change over time, a method for adaptively responding to the current resource constraints is required. This capability is particularly relevant for spacecraft (or any other real-time systems) that perform onboard data analysis. The new algorithm enables the fast, interactive application of an SVM classifier to a new set of data. The classification process achieved by this algorithm is characterized as progressive because a coarse approximation to the true classification is generated rapidly and thereafter iteratively refined. The algorithm uses two SVMs: (1) a fast, approximate one and (2) a slow, highly accurate one. New data are initially classified by the fast SVM, producing a baseline approximate classification. For each classified data point, the algorithm calculates a confidence index that indicates the likelihood that it was classified correctly in the first pass. Next, the data points are sorted by their confidence indices and progressively reclassified by the slower, more accurate SVM, starting with the items most likely to be incorrectly classified. The user can halt this reclassification process at any point, thereby obtaining the best possible result for a given amount of computation time. Alternatively, the results can be displayed as they are generated, providing the user with real-time feedback about the current accuracy of classification.
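
    The two-pass scheme described in this abstract can be sketched as follows. This is a hedged illustration, not the NASA implementation: the two classifiers are stand-in threshold functions rather than trained SVMs, and the names `progressive_classify`, `fast`, `slow`, and `budget` are hypothetical.

```python
def progressive_classify(items, fast, slow, budget):
    """Classify everything with `fast`, then refine the least confident items.

    fast(x) returns (label, confidence); slow(x) returns a label.
    `budget` is how many slow reclassifications are allowed before halting.
    """
    first_pass = [fast(x) for x in items]
    labels = [label for label, _ in first_pass]
    # Lowest confidence first: these are the items most likely to be wrong.
    order = sorted(range(len(items)), key=lambda i: first_pass[i][1])
    for i in order[:budget]:
        labels[i] = slow(items[i])
    return labels


# Stand-in classifiers (ASSUMPTION: toy 1-D thresholds, not SVMs).
def fast(x):
    return (1 if x > 0 else 0), abs(x)  # confidence = distance from boundary


def slow(x):
    return 1 if x > 0.5 else 0  # treated here as the accurate model


items = [-2.0, 0.1, 0.3, 3.0]
coarse = progressive_classify(items, fast, slow, budget=0)
partial = progressive_classify(items, fast, slow, budget=1)
refined = progressive_classify(items, fast, slow, budget=2)
print(coarse, partial, refined)
```

    Raising `budget` corresponds to letting the reclassification run longer before the user halts it.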

  14. Phylogenetic relationships among Opisthobranchia (Mollusca: Gastropoda) based on mitochondrial

    E-print Network

    Zardoya, Rafael

    Phylogenetic relationships among Opisthobranchia (Mollusca: Gastropoda) based on mitochondrial cox relationships among 37 species representing seven main lineages within Opisthobranchia (Mollusca: Gastropoda

  15. [Skeletal anatomy and phylogenetic position analysis of Gobiocypris rarus].

    PubMed

    Li, Xiao-Juan; Tang, Qiong-Ying; Liu, Huan-Zhang

    2013-08-01

    The phylogenetic position of Gobiocypris rarus, a small cyprinid fish of interest in many biological areas due to its unique characteristics, is still under debate. From a morphological viewpoint, it belongs to the Danioninae subfamily of Cyprinidae; however, recent molecular research recognizes it as a member of the Gobioninae subfamily. To investigate the phylogenetic position of Gobiocypris rarus, we prepared transparent skeleton specimens, selected 47 characteristics and reconstructed the phylogenetic tree using PAUP. The results indicated that Gobiocypris rarus was clustered with Gobioninae, which was in agreement with recent molecular phylogenetic conclusions. PMID:23913888

  16. Accurate indel prediction using paired-end short reads

    PubMed Central

    2013-01-01

    Background: One of the major open challenges in next generation sequencing (NGS) is the accurate identification of structural variants such as insertions and deletions (indels). Current methods for indel calling assign scores to different types of evidence or counter-evidence for the presence of an indel, such as the number of split read alignments spanning the boundaries of a deletion candidate or reads that map within a putative deletion. Candidates with a score above a manually defined threshold are then predicted to be true indels. As a consequence, structural variants detected in this manner contain many false positives. Results: Here, we present a machine learning based method which is able to discover and distinguish true from false indel candidates in order to reduce the false positive rate. Our method identifies indel candidates using a discriminative classifier based on features of split read alignment profiles and trained on true and false indel candidates that were validated by Sanger sequencing. We demonstrate the usefulness of our method with paired-end Illumina reads from 80 genomes of the first phase of the 1001 Genomes Project (http://www.1001genomes.org) in Arabidopsis thaliana. Conclusion: In this work we show that indel classification is a necessary step to reduce the number of false positive candidates. We demonstrate that missing classification may lead to spurious biological interpretations. The software is available at: http://agkb.is.tuebingen.mpg.de/Forschung/SV-M/. PMID:23442375

  17. Brain extraction based on locally linear representation-based classification.

    PubMed

    Huang, Meiyan; Yang, Wei; Jiang, Jun; Wu, Yao; Zhang, Yu; Chen, Wufan; Feng, Qianjin

    2014-05-15

    Brain extraction is an important procedure in brain image analysis. Although numerous brain extraction methods have been presented, enhancing brain extraction methods remains challenging because brain MRI images exhibit complex characteristics, such as anatomical variability and intensity differences across different sequences and scanners. To address this problem, we present a Locally Linear Representation-based Classification (LLRC) method for brain extraction. A novel classification framework is derived by introducing the locally linear representation to the classical classification model. Under this classification framework, a common label fusion approach can be considered as a special case and thoroughly interpreted. Locality is important to calculate fusion weights for LLRC; this factor is also considered to determine that Local Anchor Embedding is more applicable in solving locally linear coefficients compared with other linear representation approaches. Moreover, LLRC supplies a way to learn the optimal classification scores of the training samples in the dictionary to obtain accurate classification. The International Consortium for Brain Mapping and the Alzheimer's Disease Neuroimaging Initiative databases were used to build a training dataset containing 70 scans. To evaluate the proposed method, we used four publicly available datasets (IBSR1, IBSR2, LPBA40, and ADNI3T, with a total of 241 scans). Experimental results demonstrate that the proposed method outperforms the four common brain extraction methods (BET, BSE, GCUT, and ROBEX), and is comparable to the performance of BEaST, while being more accurate on some datasets compared with BEaST. PMID:24525169

  18. Implementing the North American Industry Classification System at BLS.

    ERIC Educational Resources Information Center

    Walker, James A.; Murphy, John B.

    2001-01-01

    The United States, Canada, and Mexico developed the North American Industry Classification System, which captures new and emerging industries, uses a unified concept to define industries, and is a consistent and comparable tool for measuring the nations' economies. Despite initial conversion difficulties, the new system will be a more accurate way…

  19. Addressing the Problems of Bayesian Network Classification of

    E-print Network

    Fah, Cheong Loong

    data of more than 4,000 segments shows the potential of our approach in pattern classification. Index a very active research topic in machine learning and data mining. One of the best known classifiers recognition tasks and diagnosis systems. Designing accurate classifiers from preclassified data has become

  20. Multiclass Cancer Classification Using Semisupervised Ellipsoid ARTMAP and

    E-print Network

    Anagnostopoulos, Georgios C.

    and, therefore, respond differently to the same treatment therapy. For example, for diffuse large B-cellMulticlass Cancer Classification Using Semisupervised Ellipsoid ARTMAP and Particle Swarm--It is crucial for cancer diagnosis and treatment to accurately identify the site of origin of a tumor

  1. Phylogenetics in the Bioinformatics Culture of Understanding

    PubMed Central

    Allaby, Robin G.

    2004-01-01

    Bioinformatics, as a relatively young discipline, has grown up in a world of high-throughput large volume data that requires automatic analysis to enable us to stay on top of it all. As a response, the bioinformatics discipline has developed strategies to find patterns in a ‘low signal : noise ratio’ environment. While the need to process large amounts of information and extract hypotheses is both laudable and inescapable, the pressures that such requirements have introduced can lead to short cuts and misapprehensions. This is particularly the case with reference to assumptions about the underlying evolutionary theories that are implicitly invoked by the algorithms utilised in the analysis pipelines. The classic example is the misuse of the term ‘homologous’ to mean ‘similar’ or even ‘functionally similar’, rather than the correct definition of ‘having the same evolutionary origin’, which may or may not imply similarity of function. In this review, we outline some of the common phylogenetic questions from a bioinformatics perspective that can be better addressed with a deeper understanding of evolutionary principles and show, with examples from the amidohydrolase and Toll families, that quite different conclusions can be drawn if such approaches are taken. This review focuses on the importance of the underlying evolutionary biology, rather than assessing the merits of different phylogenetic techniques. The relative merits of a priori and a posteriori inclusion of biological information are discussed. PMID:18629061

  2. Phylogenetic skew: an index of community diversity.

    PubMed

    Chen, Hungyen; Shao, Kwang-Tsao; Kishino, Hirohisa

    2015-02-01

    The distribution of divergence times between member species of a community reflects the pattern of species composition. In this study, we contrast the species composition of a community against the meta-community, which we define as the species composition of a set of target communities. We regard the collection of species that comprise a community as a sample from the set of member species of the meta-community, and interpret the pattern of the community species composition in terms of the type of species sampled from the meta-community. A newly defined effective species sampling proportion explains the amount of the difference between the divergence time distributions of the community and that of the meta-community, assuming random sampling. We propose a new index of phylogenetic skew (PS), as the ratio of the maximum-likelihood estimate of the effective species sampling proportion to the observed sampling proportion. A PS value of 1 is interpreted as random sampling. If the value is >1, the sampling is suspected to be phylogenetically skewed. If it is <1, systematic thinning of species is likely. Unlike other indices, the PS does not depend on species richness as long as the community has more than a few members of a species. Because it is possible to compare partially observed communities, the index may be effectively used in exploratory analysis to detect candidate communities with unique species compositions from a large number of communities. PMID:25580733
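
    The index defined in this abstract is a simple ratio, which the sketch below computes and interprets. The maximum-likelihood estimate of the effective sampling proportion is taken as a given input because the abstract does not describe how it is fitted; the tolerance used to call a value "about 1" and all names and numbers are hypothetical.

```python
def phylogenetic_skew(effective_proportion_mle, n_community, n_meta):
    """PS = estimated effective sampling proportion / observed sampling proportion."""
    observed_proportion = n_community / n_meta
    return effective_proportion_mle / observed_proportion


def interpret(ps, tol=0.05):
    """Verbal reading of PS following the abstract; `tol` is an ASSUMED cutoff."""
    if ps > 1 + tol:
        return "phylogenetically skewed sampling suspected"
    if ps < 1 - tol:
        return "systematic thinning of species likely"
    return "consistent with random sampling"


# Hypothetical community: 20 of 100 meta-community species, fitted proportion 0.4.
ps = phylogenetic_skew(0.4, 20, 100)
print(ps, interpret(ps))
```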

  3. High phylogenetic diversity among corticioid homobasidiomycetes.

    PubMed

    Larsson, Karl-Henrik; Larsson, Ellen; Kõljalg, Urmas

    2004-09-01

    Homobasidiomycetes display a variety of fruit body morphologies. Examples include gilled mushrooms, coral fungi, polypores and puffballs but also species with simple crust-like basidiomata, usually called corticioid fungi. The latter group has largely been neglected in recent studies of homobasidiomycete evolution. The major goal of the present study was to explore the impact that the addition of a wide selection of species with crust-like basidiomata would have on homobasidiomycete phylogeny. Two genes, 5.8S and 28S in the nuclear rDNA repeats, were sequenced and a data set with 178 taxa analysed using neighbour-joining and maximum parsimony methods. Support for clades was evaluated by bootstrap. Basal nodes generally received weak support and branching order for major clades remained largely unresolved. Twelve major groups were recovered and corticioid fungi make up a major or important constituent in most of them. Nine groups are strongly supported but support for euagarics and polyporoid clades is poor. Phlebioid fungi were in earlier studies merged with the polyporoid clade but are here identified as a separate clade. Athelia is allied with ectomycorrhizal genera, inter alia Piloderma and Amphinema, in a separate clade forming a sister group to the boletes. We conclude that corticioid fungi hold a considerable share of the phylogenetic diversity displayed by homobasidiomycetes, and should always be considered when phylogenetic studies of larger basidiomycetes are designed. PMID:15506012

  4. A Distance Measure for Genome Phylogenetic Analysis

    NASA Astrophysics Data System (ADS)

    Cao, Minh Duc; Allison, Lloyd; Dix, Trevor

    Phylogenetic analyses of species based on single genes or parts of the genomes are often inconsistent because of factors such as variable rates of evolution and horizontal gene transfer. The availability of more and more sequenced genomes allows phylogeny construction from complete genomes that is less sensitive to such inconsistency. For such long sequences, construction methods like maximum parsimony and maximum likelihood are often not possible due to their intensive computational requirement. Another class of tree construction methods, namely distance-based methods, require a measure of distances between any two genomes. Some measures such as evolutionary edit distance of gene order and gene content are computationally expensive or do not perform well when the gene content of the organisms is similar. This study presents an information theoretic measure of genetic distances between genomes based on the biological compression algorithm expert model. We demonstrate that our distance measure can be applied to reconstruct the consensus phylogenetic tree of a number of Plasmodium parasites from their genomes, the statistical bias of which would mislead conventional analysis methods. Our approach is also used to successfully construct a plausible evolutionary tree for the ?-Proteobacteria group whose genomes are known to contain many horizontally transferred genes.
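
    The general idea of a compression-based distance can be illustrated with a short sketch. Note the substitution: the study uses the expert model compressor, whereas the code below swaps in the normalized compression distance computed with the general-purpose zlib compressor, purely for illustration. The toy sequences and names are hypothetical.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (level 9)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance; NOT the expert-model measure of the study."""
    cx, cy, cxy = compressed_size(x), compressed_size(y), compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def distance_matrix(genomes):
    """Pairwise distances keyed by (name_a, name_b) with name_a < name_b."""
    names = sorted(genomes)
    return {(a, b): ncd(genomes[a], genomes[b])
            for a in names for b in names if a < b}


# Toy data (ASSUMPTION): random 3000-base sequences, not real genomes.
rng = random.Random(0)
ancestor = bytes(rng.choice(b"ACGT") for _ in range(3000))
mutated = bytearray(ancestor)
for pos in rng.sample(range(len(mutated)), 30):  # about 1% point mutations
    mutated[pos] = rng.choice(b"ACGT")
close = bytes(mutated)
far = bytes(rng.choice(b"ACGT") for _ in range(3000))

genomes = {"ancestor": ancestor, "close": close, "far": far}
d = distance_matrix(genomes)
print(d)
```

    A distance matrix of this kind is what a distance-based tree construction method takes as input; the tree-building step is not shown.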

  5. Phylogenetically-Informed Priorities for Amphibian Conservation

    PubMed Central

    Isaac, Nick J. B.; Redding, David W.; Meredith, Helen M.; Safi, Kamran

    2012-01-01

    The amphibian decline and extinction crisis demands urgent action to prevent further large numbers of species extinctions. Lists of priority species for conservation, based on a combination of species’ threat status and unique contribution to phylogenetic diversity, are one tool for the direction and catalyzation of conservation action. We describe the construction of a near-complete species-level phylogeny of 5713 amphibian species, which we use to create a list of evolutionarily distinct and globally endangered species (EDGE list) for the entire class Amphibia. We present sensitivity analyses to test the robustness of our priority list to uncertainty in species’ phylogenetic position and threat status. We find that both sources of uncertainty have only minor impacts on our ‘top 100’ list of priority species, indicating the robustness of the approach. By contrast, our analyses suggest that a large number of Data Deficient species are likely to be high priorities for conservation action from the perspective of their contribution to the evolutionary history. PMID:22952807
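
    The abstract does not spell out how threat status and evolutionary distinctiveness are combined. The sketch below uses the commonly cited EDGE formulation, EDGE = ln(1 + ED) + GE × ln(2), with GE running from 0 (Least Concern) to 4 (Critically Endangered); treating this as the scoring used in the study is an assumption, and the species names and ED values are invented for illustration.

```python
import math

# Red List categories mapped to GE weights (commonly cited EDGE convention).
GE_WEIGHT = {"LC": 0, "NT": 1, "VU": 2, "EN": 3, "CR": 4}


def edge_score(ed_million_years, category):
    """EDGE = ln(1 + ED) + GE * ln(2); ASSUMED to match the study's scoring."""
    return math.log(1 + ed_million_years) + GE_WEIGHT[category] * math.log(2)


def rank_species(records):
    """Sort (name, ED, category) records by descending EDGE score."""
    return sorted(records, key=lambda r: edge_score(r[1], r[2]), reverse=True)


# Hypothetical species records: (name, ED in million years, Red List category).
records = [
    ("species_a", 50.0, "CR"),
    ("species_b", 50.0, "LC"),
    ("species_c", 10.0, "EN"),
]
ranking = [name for name, _, _ in rank_species(records)]
print(ranking)
```

    Data Deficient species have no GE weight in this sketch, which is one reason the abstract treats them in a separate sensitivity analysis.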

  6. Novel accurate bacterial discrimination by MALDI-time-of-flight MS based on ribosomal proteins coding in S10-spc-alpha operon at strain level S10-GERMS.

    PubMed

    Tamura, Hiroto; Hotta, Yudai; Sato, Hiroaki

    2013-08-01

    Matrix-assisted laser-desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) is one of the most widely used mass-based approaches for bacterial identification and classification because of the simple sample preparation and extremely rapid analysis within a few minutes. To establish the accurate MALDI-TOF MS bacterial discrimination method at strain level, the ribosomal subunit proteins coded in the S10-spc-alpha operon, which encodes half of the ribosomal subunit protein and is highly conserved in eubacterial genomes, were selected as reliable biomarkers. This method, named the S10-GERMS method, revealed that the strains of genus Pseudomonas were successfully identified and discriminated at species and strain levels, respectively; therefore, the S10-GERMS method was further applied to discriminate the pathovar of P. syringae. The eight selected biomarkers (L24, L30, S10, S12, S14, S16, S17, and S19) suggested the rapid discrimination of P. syringae at the strain (pathovar) level. The S10-GERMS method appears to be a powerful tool for rapid and reliable bacterial discrimination and successful phylogenetic characterization. In this article, an overview of the utilization of results from the S10-GERMS method is presented, highlighting the characterization of the Lactobacillus casei group and discrimination of the bacteria of genera Bacillus and Sphingopyxis despite only two and one base difference in the 16S rRNA gene sequence, respectively. PMID:23686278

  7. Phylogenetic Framework and Molecular Signatures for the Main Clades of the Phylum Actinobacteria

    PubMed Central

    Gao, Beile

    2012-01-01

    Summary: The phylum Actinobacteria harbors many important human pathogens and also provides one of the richest sources of natural products, including numerous antibiotics and other compounds of biotechnological interest. Thus, a reliable phylogeny of this large phylum and the means to accurately identify its different constituent groups are of much interest. Detailed phylogenetic and comparative analyses of >150 actinobacterial genomes reported here form the basis for achieving these objectives. In phylogenetic trees based upon 35 conserved proteins, most of the main groups of Actinobacteria as well as a number of their suprageneric clades are resolved. We also describe large numbers of molecular markers consisting of conserved signature indels in protein sequences and whole proteins that are specific for either all Actinobacteria or their different clades (viz., orders, families, genera, and subgenera) at various taxonomic levels. These signatures independently support the existence of different phylogenetic clades, and based upon them, it is now possible to delimit the phylum Actinobacteria (excluding Coriobacteriia) and most of its major groups in clear molecular terms. The species distribution patterns of these markers also provide important information regarding the interrelationships among different main orders of Actinobacteria. The identified molecular markers, in addition to enabling the development of a stable and reliable phylogenetic framework for this phylum, also provide novel and powerful means for the identification of different groups of Actinobacteria in diverse environments. Genetic and biochemical studies on these Actinobacteria-specific markers should lead to the discovery of novel biochemical and/or other properties that are unique to different groups of Actinobacteria. PMID:22390973

  8. Entropy-based approach for selecting informative regions in the L1 gene of bovine papillomavirus for phylogenetic inference and primer design.

    PubMed

    Batista, M V A; Freitas, A C; Balbino, V Q

    2013-01-01

    Bovine papillomaviruses (BPVs) cause many benign and malignant lesions in cattle and other animals. Twelve BPV types have been identified so far, and several putative novel BPV types have been detected based on the analysis of L1 gene fragments, generated by FAP59/64 and MY11/09 primers. Phylogenetic trees are important in studies that describe novel BPV types. However, topological mistakes could be a problem in such studies. Therefore, we made use of entropy to find phylogenetically informative regions in the BPV L1 gene sequences from all 12 BPVs. Six data sets were created and phylogenetically compared to each other using neighbor-joining and maximum likelihood methods of phylogenetic tree reconstruction. We found two major regions in the L1 gene, using an entropy-based approach, which selects regions with low information complexity. More robust phylogenetic trees were obtained with these regions, when compared to the ones obtained with FAP59/64 and MY11/09 primers. More robust phylogenetic trees are important to accurately position novel BPV types, subtypes and variants. We conclude that an entropy-based approach is a good methodology for selecting regions of the L1 gene of BPVs that could be used to design more specific and sensitive degenerate primers, for the development of improved diagnostic methods. PMID:23420364
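
    The abstract does not give the entropy formula it uses, so the following is only a minimal sketch under the assumption that per-column Shannon entropy is computed over an aligned set of sequences and then averaged over a sliding window. The function names, the toy alignment and the window size are hypothetical and are not taken from the study.

```python
import math
from collections import Counter

def column_entropy(column):
    """Shannon entropy (in bits) of one alignment column, ignoring gap characters."""
    residues = [c for c in column.upper() if c != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    # p * log2(1/p) summed over the observed residues
    return sum((n / total) * math.log2(total / n) for n in counts.values())

def windowed_entropy(alignment, window):
    """Mean column entropy for every run of `window` consecutive columns."""
    n_cols = len(alignment[0])
    columns = ["".join(seq[i] for seq in alignment) for i in range(n_cols)]
    entropies = [column_entropy(col) for col in columns]
    return [sum(entropies[i:i + window]) / window
            for i in range(n_cols - window + 1)]

# Toy alignment (hypothetical data): columns 0-2 and 5 are invariant.
toy = ["ACGTAC", "ACGTTC", "ACGAGC", "ACGCCC"]
print(windowed_entropy(toy, window=2))
```

    Windows with a low mean entropy are conserved and windows with a high mean entropy are variable; which of the two is preferred for tree building or primer design depends on the selection criterion, which the abstract describes only as "low information complexity".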

  9. Classification, change-detection and accuracy assessment: Toward fuller automation

    NASA Astrophysics Data System (ADS)

    Podger, Nancy E.

    This research aims to automate methods for conducting change detection studies using remotely sensed images. Five major objectives were tested on two study sites, one encompassing Madison, Wisconsin, and the other Fort Hood, Texas. (Objective 1) Enhance accuracy assessments by estimating standard errors using bootstrap analysis. Bootstrap estimates of the standard errors were found to be comparable to parametric statistical estimates. Also, results show that bootstrapping can be used to evaluate the consistency of a classification process. (Objective 2) Automate the guided clustering classifier. This research shows that the guided clustering classification process can be automated while maintaining highly accurate results. Three different evaluation methods were used. (Evaluation 1) Appraised the consistency of 25 classifications produced from the automated system. The classifications differed from one another by only two to four percent. (Evaluation 2) Compared accuracies produced by the automated system to classification accuracies generated following a manual guided clustering protocol. Results: The automated system produced higher overall accuracies in 50 percent of the tests and was comparable for all but one of the remaining tests. (Evaluation 3) Assessed the time and effort required to produce accurate classifications. Results: The automated system produced classifications in less time and with less effort than the manual 'protocol' method. (Objective 3) Built a flexible, interactive software tool to aid in producing binary change masks. (Objective 4) Reduced by automation the amount of training data needed to classify the second image of a two-time-period change detection project. Locations of the training sites in 'unchanged' areas employed to classify the first image were used to identify sites where spectral information was automatically extracted from the second image. Results: The automatically generated training data produce classification accuracies similar to accuracies from a classification where training data were manually collected. (Objective 5) Decrease the effort needed for post-classification change analysis. Classification accuracy metrics, produced from a hybrid change detection analysis for the second time period, were found to be comparable to results generated from a conventional classification of the same image. Further research showed that pixel-by-pixel comparisons between classifications produced conflicting results.
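
    Objective 1 compares bootstrap and parametric standard errors of classification accuracy. The abstract gives no implementation details, so the sketch below only illustrates the general idea for overall accuracy, assuming a simple random sample of reference pixels scored as correct (1) or incorrect (0). The sample size, the number of replicates and the function names are hypothetical.

```python
import math
import random
import statistics

def bootstrap_se(correct_flags, n_replicates=2000, seed=0):
    """Bootstrap standard error of overall accuracy (resampling with replacement)."""
    rng = random.Random(seed)
    n = len(correct_flags)
    accuracies = []
    for _ in range(n_replicates):
        resample = rng.choices(correct_flags, k=n)
        accuracies.append(sum(resample) / n)
    return statistics.stdev(accuracies)

def parametric_se(correct_flags):
    """Binomial standard error sqrt(p * (1 - p) / n) of overall accuracy."""
    n = len(correct_flags)
    p = sum(correct_flags) / n
    return math.sqrt(p * (1 - p) / n)

# Hypothetical reference sample: 170 of 200 check pixels classified correctly.
flags = [1] * 170 + [0] * 30
print(bootstrap_se(flags), parametric_se(flags))
```

    For a sample of this kind the two estimates should be close, which is consistent with the comparability reported above; stratified or clustered reference samples would need a resampling scheme that respects the sampling design.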

  10. Land use/cover classification in the Brazilian Amazon using satellite images

    PubMed Central

    Lu, Dengsheng; Batistella, Mateus; Li, Guiying; Moran, Emilio; Hetrick, Scott; Freitas, Corina da Costa; Dutra, Luciano Vieira; Sant’Anna, Sidnei João Siqueira

    2013-01-01

    Land use/cover classification is one of the most important applications in remote sensing. However, mapping accurate land use/cover spatial distribution is a challenge, particularly in moist tropical regions, due to the complex biophysical environment and limitations of remote sensing data per se. This paper reviews experiments related to land use/cover classification in the Brazilian Amazon for a decade. Through comprehensive analysis of the classification results, it is concluded that spatial information inherent in remote sensing data plays an essential role in improving land use/cover classification. Incorporation of suitable textural images into multispectral bands and use of segmentation-based methods are valuable ways to improve land use/cover classification, especially for high spatial resolution images. Data fusion of multi-resolution images within optical sensor data is vital for visual interpretation, but may not improve classification performance. In contrast, integration of optical and radar data did improve classification performance when the proper data fusion method was used. Of the classification algorithms available, the maximum likelihood classifier is still an important method for providing reasonably good accuracy, but nonparametric algorithms, such as classification tree analysis, have the potential to provide better results. However, they often require more time to achieve parametric optimization. Proper use of hierarchical-based methods is fundamental for developing accurate land use/cover classification, mainly from historical remotely sensed data. PMID:24353353

  11. Land use/cover classification in the Brazilian Amazon using satellite images.

    PubMed

    Lu, Dengsheng; Batistella, Mateus; Li, Guiying; Moran, Emilio; Hetrick, Scott; Freitas, Corina da Costa; Dutra, Luciano Vieira; Sant'anna, Sidnei João Siqueira

    2012-09-01

    Land use/cover classification is one of the most important applications in remote sensing. However, mapping accurate land use/cover spatial distribution is a challenge, particularly in moist tropical regions, due to the complex biophysical environment and limitations of remote sensing data per se. This paper reviews experiments related to land use/cover classification in the Brazilian Amazon for a decade. Through comprehensive analysis of the classification results, it is concluded that spatial information inherent in remote sensing data plays an essential role in improving land use/cover classification. Incorporation of suitable textural images into multispectral bands and use of segmentation-based methods are valuable ways to improve land use/cover classification, especially for high spatial resolution images. Data fusion of multi-resolution images within optical sensor data is vital for visual interpretation, but may not improve classification performance. In contrast, integration of optical and radar data did improve classification performance when the proper data fusion method was used. Of the classification algorithms available, the maximum likelihood classifier is still an important method for providing reasonably good accuracy, but nonparametric algorithms, such as classification tree analysis, have the potential to provide better results. However, they often require more time to achieve parametric optimization. Proper use of hierarchical-based methods is fundamental for developing accurate land use/cover classification, mainly from historical remotely sensed data. PMID:24353353

  12. Phylogenetic estimation of timescales using ancient DNA: the effects of temporal sampling scheme and uncertainty in sample ages.

    PubMed

    Molak, Martyna; Lorenzen, Eline D; Shapiro, Beth; Ho, Simon Y W

    2013-02-01

    In recent years, ancient DNA has increasingly been used for estimating molecular timescales, particularly in studies of substitution rates and demographic histories. Molecular clocks can be calibrated using temporal information from ancient DNA sequences. This information comes from the ages of the ancient samples, which can be estimated by radiocarbon dating the source material or by dating the layers in which the material was deposited. Both methods involve sources of uncertainty. The performance of Bayesian phylogenetic inference depends on the information content of the data set, which includes variation in the DNA sequences and the structure of the sample ages. Various sources of estimation error can reduce our ability to estimate rates and timescales accurately and precisely. We investigated the impact of sample-dating uncertainties on the estimation of evolutionary timescale parameters using the software BEAST. Our analyses involved 11 published data sets and focused on estimates of substitution rate and root age. We show that, provided that samples have been accurately dated and have a broad temporal span, it might be unnecessary to account for sample-dating uncertainty in Bayesian phylogenetic analyses of ancient DNA. We also investigated the sample size and temporal span of the ancient DNA sequences needed to estimate phylogenetic timescales reliably. Our results show that the range of sample ages plays a crucial role in determining the quality of the results but that accurate and precise phylogenetic estimates of timescales can be made even with only a few ancient sequences. These findings have important practical consequences for studies of molecular rates, timescales, and population dynamics. PMID:23024187

  13. Mineral Classification Exercise

    NSDL National Science Digital Library

    Dexter Perkins

    This exercise is designed to help students think about the properties of minerals that are most useful for mineral classification and identification. Students are given a set of minerals and asked to come up with a hierarchical classification scheme (a "key") that can be used to identify different mineral species. They compare their results with the products of other groups. They test the various schemes by applying them to unknown samples. While doing this exercise, the students develop observational and interpretational skill. They also begin to think about the nature of classification systems.

  14. Applying species-tree analyses to deep phylogenetic histories: challenges and potential suggested from a survey of empirical phylogenetic studies.

    PubMed

    Lanier, Hayley C; Knowles, L Lacey

    2015-02-01

    Coalescent-based methods for species-tree estimation are becoming a dominant approach for reconstructing species histories from multi-locus data, with most of the studies examining these methodologies focused on recently diverged species. However, deeper phylogenies, such as the datasets that comprise many Tree of Life (ToL) studies, also exhibit gene-tree discordance. This discord may also arise from the stochastic sorting of gene lineages during the speciation process (i.e., reflecting the random coalescence of gene lineages in ancestral populations). It remains unknown whether guidelines regarding methodologies and numbers of loci established by simulation studies at shallow tree depths translate into accurate species relationships for deeper phylogenetic histories. We address this knowledge gap and specifically identify the challenges and limitations of species-tree methods that account for coalescent variance for deeper phylogenies. Using simulated data with characteristics informed by empirical studies, we evaluate both the accuracy of estimated species trees and the characteristics associated with recalcitrant nodes, with a specific focus on whether coalescent variance is generally responsible for the lack of resolution. By determining the proportion of coalescent genealogies that support a particular node, we demonstrate that (1) species-tree methods account for coalescent variance at deep nodes and (2) mutational variance - not gene-tree discord arising from the coalescent - posed the primary challenge for accurate reconstruction across the tree. For example, many nodes were accurately resolved despite predicted discord from the random coalescence of gene lineages and nodes with poor support were distributed across a range of depths (i.e., they were not restricted to particularly recent divergences). Given their broad taxonomic scope and large sampling of taxa, deep level phylogenies pose several potential methodological complications including difficulties with MCMC convergence and estimation of requisite population genetic parameters for coalescent-based approaches. Despite these difficulties, the findings generally support the utility of species-tree analyses for the estimation of species relationships throughout the ToL. We discuss strategies for successful application of species-tree approaches to deep phylogenies. PMID:25450097

  15. THE EFFECT OF THE GUIDE TREE ON MULTIPLE SEQUENCE ALIGNMENTS AND SUBSEQUENT PHYLOGENETIC ANALYSES

    E-print Network

    S. Nelesen; K. Liu; D. Zhao; C. R. Linder; T. Warnow

    Many multiple sequence alignment methods (MSAs) use guide trees in conjunction with a progressive alignment technique to generate a multiple sequence alignment but use differing techniques to produce the guide tree and to perform the progressive alignment. In this paper we explore the consequences of changing the guide tree used for the alignment routine. We evaluate four leading MSA methods (ProbCons, MAFFT, Muscle, and ClustalW) as well as a new MSA method (FTA, for “Fixed Tree Alignment”) which we have developed, on a wide range of simulated datasets. Although improvements in alignment accuracy can be obtained by providing better guide trees, in general there is little effect on the “accuracy” (measured using the SP-score) of the alignment by improving the guide tree. However, RAxML-based phylogenetic analyses of alignments based upon better guide trees tend to be much more accurate. This impact is particularly significant for ProbCons, one of the best MSA methods currently available, and our method, FTA. Finally, for very good guide trees, phylogenies based upon FTA alignments are more accurate than phylogenies based upon ProbCons alignments, suggesting that further improvements in phylogenetic accuracy may be obtained through algorithms of this type.
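
    The abstract measures alignment accuracy with the SP-score but does not define it. A common definition, assumed here, is the fraction of residue pairs aligned in the reference (true) alignment that are also aligned in the estimated alignment. The sketch below follows that definition; the function names and the toy alignments are hypothetical.

```python
from itertools import combinations

def aligned_pairs(alignment):
    """Set of residue pairs ((seq, pos), (seq, pos)) that share an alignment column."""
    pairs = set()
    next_pos = [0] * len(alignment)  # ungapped position reached in each sequence
    for col in range(len(alignment[0])):
        present = []
        for s, row in enumerate(alignment):
            if row[col] != "-":
                present.append((s, next_pos[s]))
                next_pos[s] += 1
        pairs.update(combinations(present, 2))
    return pairs

def sp_score(reference, estimated):
    """Fraction of reference residue pairs recovered by the estimated alignment."""
    ref_pairs = aligned_pairs(reference)
    est_pairs = aligned_pairs(estimated)
    return len(ref_pairs & est_pairs) / len(ref_pairs)

# Toy example: the estimated alignment misplaces one gap.
reference = ["AC-GT", "ACTGT"]
estimated = ["ACG-T", "ACTGT"]
print(sp_score(reference, estimated))  # 0.75
```

    Both alignments must list the same sequences in the same order, since pairs are matched by sequence index and ungapped position.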

  16. Segmentation assisted food classification for dietary assessment

    NASA Astrophysics Data System (ADS)

    Zhu, Fengqing; Bosch, Marc; Schap, TusaRebecca; Khanna, Nitin; Ebert, David S.; Boushey, Carol J.; Delp, Edward J.

    2011-03-01

    Accurate methods and tools to assess food and nutrient intake are essential for the association between diet and health. Preliminary studies have indicated that the use of a mobile device with a built-in camera to obtain images of the food consumed may provide a less burdensome and more accurate method for dietary assessment. We are developing methods to identify food items using a single image acquired from the mobile device. Our goal is to automatically determine the regions in an image where a particular food is located (segmentation) and correctly identify the food type based on its features (classification or food labeling). Images of foods are segmented using Normalized Cuts based on intensity and color. Color and texture features are extracted from each segmented food region. Classification decisions for each segmented region are made using support vector machine methods. The segmentation of each food region is refined based on feedback from the output of classifier to provide more accurate estimation of the quantity of food consumed.

  17. Segmentation Assisted Food Classification for Dietary Assessment

    PubMed Central

    Zhu, Fengqing; Bosch, Marc; Schap, TusaRebecca; Khanna, Nitin; Ebert, David S.; Boushey, Carol J.; Delp, Edward J.

    2011-01-01

    Accurate methods and tools to assess food and nutrient intake are essential for the association between diet and health. Preliminary studies have indicated that the use of a mobile device with a built-in camera to obtain images of the food consumed may provide a less burdensome and more accurate method for dietary assessment. We are developing methods to identify food items using a single image acquired from the mobile device. Our goal is to automatically determine the regions in an image where a particular food is located (segmentation) and correctly identify the food type based on its features (classification or food labeling). Images of foods are segmented using Normalized Cuts based on intensity and color. Color and texture features are extracted from each segmented food region. Classification decisions for each segmented region are made using support vector machine methods. The segmentation of each food region is refined based on feedback from the output of classifier to provide more accurate estimation of the quantity of food consumed. PMID:22128304

  18. Flying insect detection and classification with inexpensive sensors.

    PubMed

    Chen, Yanping; Why, Adena; Batista, Gustavo; Mafra-Neto, Agenor; Keogh, Eamonn

    2014-01-01

    An inexpensive, noninvasive system that could accurately classify flying insects would have important implications for entomological research, and allow for the development of many useful applications in vector and pest control for both medical and agricultural entomology. Given this, the last sixty years have seen many research efforts devoted to this task. To date, however, none of this research has had a lasting impact. In this work, we show that pseudo-acoustic optical sensors can produce superior data; that additional features, both intrinsic and extrinsic to the insect's flight behavior, can be exploited to improve insect classification; that a Bayesian classification approach allows classification models that are very robust to over-fitting to be learned efficiently; and that a general classification framework allows an arbitrary number of features to be incorporated easily. We demonstrate the findings with large-scale experiments that dwarf all previous works combined, as measured by the number of insects and the number of species considered. PMID:25350921

  19. Accurate floating point summation James Demmel

    E-print Network

    California at Berkeley, University of

    Accurate floating point summation James Demmel Yozo Hida May 8, 2002 Abstract We present and analyze several simple algorithms for accurately summing n floating point numbers $S = \sum_{i=1}^{n} s_i$ precision only (F 2f). We apply this result to the floating point formats in the (proposed revision of the
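
    The snippet above is truncated and does not show the authors' algorithms. As a general illustration of why summation order and error compensation matter, the sketch below uses a different, well-known technique, Neumaier's variant of Kahan compensated summation. It is not the method of the paper, and the function name is hypothetical.

```python
import math

def neumaier_sum(values):
    """Compensated summation: carry the rounding error of each addition separately."""
    total = 0.0
    compensation = 0.0
    for x in values:
        t = total + x
        if abs(total) >= abs(x):
            # low-order digits of x were lost in the addition
            compensation += (total - t) + x
        else:
            # low-order digits of total were lost in the addition
            compensation += (x - t) + total
        total = t
    return total + compensation

data = [1.0, 1e100, 1.0, -1e100]
print(sum(data))           # 0.0, both small terms are lost
print(neumaier_sum(data))  # 2.0
print(math.fsum(data))     # 2.0, exactly rounded sum from the standard library
```

    In practice `math.fsum` is usually the simpler choice in Python; the explicit loop is shown only to make the error-tracking step visible.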

  20. Accurate Structural Correlations from Maximum Likelihood Superpositions

    Microsoft Academic Search

    Douglas L. Theobald; Deborah S. Wuttke

    2008-01-01

    The cores of globular proteins are densely packed, resulting in complicated networks of structural interactions. These interactions in turn give rise to dynamic structural correlations over a wide range of time scales. Accurate analysis of these complex correlations is crucial for understanding biomolecular mechanisms and for relating structure to function. Here we report a highly accurate technique for inferring the

  1. Accurate estimation of sigma(exp 0) using AIRSAR data

    NASA Technical Reports Server (NTRS)

    Holecz, Francesco; Rignot, Eric

    1995-01-01

    During recent years signature analysis, classification, and modeling of Synthetic Aperture Radar (SAR) data as well as estimation of geophysical parameters from SAR data have received a great deal of interest. An important requirement for the quantitative use of SAR data is the accurate estimation of the backscattering coefficient sigma(exp 0). In terrain with relief variations radar signals are distorted due to the projection of the scene topography into the slant range-Doppler plane. The effect of these variations is to change the physical size of the scattering area, leading to errors in the radar backscatter values and incidence angle. For this reason the local incidence angle, derived from sensor position and Digital Elevation Model (DEM) data, must always be considered. Especially in the airborne case, the antenna gain pattern can be an additional source of radiometric error, because the radar look angle is not known precisely as a result of the aircraft motions and the local surface topography. Consequently, radiometric distortions due to the antenna gain pattern must also be corrected for each resolution cell, by taking into account aircraft displacements (position and attitude) and position of the backscatter element, defined by the DEM data. In this paper, a method to derive an accurate estimation of the backscattering coefficient using NASA/JPL AIRSAR data is presented. The results are evaluated in terms of geometric accuracy, radiometric variations of sigma(exp 0), and precision of the estimated forest biomass.
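
    The abstract states that the local incidence angle is derived from sensor position and DEM data but gives no formula. A minimal geometric sketch, assuming the terrain normal is taken from the DEM slope components and the angle is measured against the ground-to-sensor vector, is shown below. The function name, arguments and coordinate convention are hypothetical and do not describe the AIRSAR processing chain.

```python
import math

def local_incidence_angle(dz_dx, dz_dy, to_sensor):
    """Angle in degrees between the terrain normal and the ground-to-sensor vector.

    dz_dx, dz_dy: DEM slope components at the resolution cell.
    to_sensor:    (x, y, z) vector from the cell to the sensor, any length.
    """
    normal = (-dz_dx, -dz_dy, 1.0)  # upward normal of the surface z = f(x, y)
    dot = sum(n * s for n, s in zip(normal, to_sensor))
    norm = math.hypot(*normal) * math.hypot(*to_sensor)
    cosine = max(-1.0, min(1.0, dot / norm))  # guard against rounding just outside [-1, 1]
    return math.degrees(math.acos(cosine))

# Sensor seen 45 degrees above the horizon along +x (hypothetical geometry).
look = (1.0, 0.0, 1.0)
print(local_incidence_angle(0.0, 0.0, look))   # flat terrain, about 45 degrees
print(local_incidence_angle(-1.0, 0.0, look))  # slope facing the sensor, about 0 degrees
print(local_incidence_angle(1.0, 0.0, look))   # slope facing away, about 90 degrees
```

    The three cases show how relief alone changes the local incidence angle for a fixed sensor position, which is the effect the abstract says must be accounted for before sigma(exp 0) values are compared.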

  2. Evolution of rbcL among Lathyrus and Kupicha's classification.

    PubMed

    Marghali, S; Zitouna, N; Gharbi, M; Fadhlaoui, I; Trifi-Farah, N

    2014-01-01

    Phylogenetic relationships in the Lathyrus genus were examined using cpDNA data, particularly data attributed to the "barcode" rbcL gene to construct a possible evolutionary scenario. Plant barcoding can be used to differentiate between species within a genus and to conserve DNA within the same species. We assessed the phylogeny of 29 species of Lathyrus using maximum parsimony, maximum likelihood and the unweighted pair-group method with arithmetic mean. The classifications did not agree with current morphological and basic Lathyrus classification. Lathyrus belinensis is a new species that was not described by Kupicha; according to rbcL analysis, the species belongs in the Lathyrus genus. Additionally, the genus Lathyrus has undergone a rapid population expansion as indicated by neutral selection indices. PMID:25366764

  3. Postprocessing classification images

    NASA Technical Reports Server (NTRS)

    Kan, E. P.

    1979-01-01

    Program cleans up remote-sensing maps. It can be used with existing image-processing software. Remapped images closely resemble familiar resource information maps and can replace or supplement classification images not postprocessed by this program.

  4. 2000 Mathematics Subject Classification

    NSDL National Science Digital Library

    The proposed revision of the 1991 edition of the Mathematics Subject Classification, MSC2000, was officially announced at the International Congress of Mathematicians in Berlin (August 24, 1998). MSC2000 is a collaborative effort between the editors of Mathematical Reviews and Zentralblatt für Mathematik, two journals that review mathematics literature. Mathematical Reviews is a publication of the American Mathematical Society (AMS) (discussed in the July 14, 1995 issue of the Scout Report). The 1991 edition of the MSC (linked to from the MSC2000 site) contains "over 5,000 two-, three-, and five-digit classifications, each corresponding to a discipline of mathematics." The MSC2000 classification scheme is browseable; the 1991 edition has both browse and search functionality. Application of the new classification scheme will begin in the year 2000.

  5. Phylogenetic relationships of the mockingbirds and thrashers (Aves: Mimidae)

    Microsoft Academic Search

    Irby J. Lovette; Brian S. Arbogast; Robert L. Curry; Robert M. Zink; Carlos A. Botero; John P. Sullivan; Amanda L. Talaba; Rebecca B. Harris; Dustin R. Rubenstein; Robert E. Ricklefs; Eldredge Bermingham

    The mockingbirds, thrashers and allied birds in the family Mimidae are broadly distributed across the Americas. Many aspects of their phylogenetic history are well established, but there has been no previous phylogenetic study that included all species in this radiation. Our reconstructions based on mitochondrial and nuclear DNA sequence markers show that an early bifurcation separated the Mimidae into two

  6. Phylogenetic hypotheses and the utility of multiple sequence alignment

    Microsoft Academic Search

    Ward C. Wheeler; Gonzalo Giribet

    Abstract The role of Multiple Sequence Alignment in phylogenetic analysis is discussed in the context of data and hypothesis. Alignments cannot be observed in nature, hence are neither data nor “real” in the scientific sense. Observers gather sequence data as strings of nucleotides and phylogenetic hypotheses (= topologies) are tested with them on the basis of quantitative optimality criteria. This

  7. A phylogenetic supertree of the bats (Mammalia: Chiroptera).

    PubMed

    Jones, Kate E; Purvis, Andy; MacLarnon, Ann; Bininda-Emonds, Olaf R P; Simmons, Nancy B

    2002-05-01

    We present the first estimate of the phylogenetic relationships among all 916 extant and nine recently extinct species of bats (Mammalia: Chiroptera), a group that accounts for almost one-quarter of extant mammalian diversity. This phylogeny was derived by combining 105 estimates of bat phylogenetic relationships published since 1970 using the supertree construction technique of Matrix Representation with Parsimony (MRP). Despite the explosive growth in the number of phylogenetic studies of bats since 1990, phylogenetic relationships in the order have been studied non-randomly. For example, over one-third of all bat systematic studies to date have focused on relationships within Phyllostomidae, whereas relationships within clades such as Kerivoulinae and Murinae have never been studied using cladistic methods. Resolution in the supertree similarly differs among clades: overall resolution is poor (46.4% of a fully bifurcating solution) but reaches 100% in some groups (e.g. relationships within Mormoopidae). The supertree analysis does not support a recent proposal that Microchiroptera is paraphyletic with respect to Megachiroptera, as the majority of source topologies support microbat monophyly. Although it is not a substitute for comprehensive phylogenetic analyses of primary molecular and morphological data, the bat supertree provides a useful tool for future phylogenetic comparative and macroevolutionary studies. Additionally, it identifies clades that have been little studied, highlights groups within which relationships are controversial, and like all phylogenetic studies, provides preliminary hypotheses that can form starting points for future phylogenetic studies of bats. PMID:12056748
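
    The abstract names Matrix Representation with Parsimony but does not describe the encoding. In the standard coding, assumed here, each clade of each source tree becomes one binary character: taxa inside the clade score 1, other taxa in that source tree score 0, and taxa absent from that tree score "?". The sketch below builds such a matrix from source trees written as nested tuples; the tree representation, function names and toy trees are hypothetical, and the parsimony search over the matrix is not shown.

```python
def leaves(tree):
    """Frozen set of taxon names under a node (a taxon is a str, a clade is a tuple)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))

def clades(tree):
    """Yield the leaf set of every internal node below the root."""
    if isinstance(tree, str):
        return
    for child in tree:
        if not isinstance(child, str):
            yield leaves(child)
            yield from clades(child)

def mrp_matrix(source_trees):
    """Map each taxon to its string of MRP character states ('1', '0' or '?')."""
    all_taxa = sorted(set().union(*(leaves(t) for t in source_trees)))
    matrix = {taxon: "" for taxon in all_taxa}
    for tree in source_trees:
        tree_taxa = leaves(tree)
        for clade in clades(tree):
            for taxon in all_taxa:
                if taxon in clade:
                    matrix[taxon] += "1"
                elif taxon in tree_taxa:
                    matrix[taxon] += "0"
                else:
                    matrix[taxon] += "?"
    return matrix

# Two toy source trees with partly overlapping taxa.
trees = [(("A", "B"), ("C", "D")), (("A", "C"), "E")]
for taxon, row in mrp_matrix(trees).items():
    print(taxon, row)
```

    The "?" entries are what allow source trees with different taxon sets to be combined; the resulting matrix would then be analysed with a parsimony program to obtain the supertree.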

  8. A Memetic Algorithm for Phylogenetic Reconstruction with Maximum Parsimony

    E-print Network

    Hao, Jin-Kao

    Abstract. The Maximum Parsimony problem aims at reconstructing a phylogenetic tree from DNA, RNA or proteinA Memetic Algorithm for Phylogenetic Reconstruction with Maximum Parsimony Jean-Michel Richer1, is the formal name for the field within Biology that reconstructs evolutionary history of species. Phy
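
    The snippet above is truncated and does not show the memetic search itself. As background for the Maximum Parsimony criterion it mentions, the sketch below scores a fixed binary tree with Fitch's algorithm, which counts the minimum number of state changes each site requires. Searching over tree topologies, which is the hard part addressed by the paper, is not shown; the tree representation, function names and toy data are hypothetical.

```python
def fitch(tree, states):
    """Return (candidate state set, minimum number of changes) for one site.

    tree:   a taxon name (str) or a pair (left, right) of subtrees.
    states: dict mapping taxon name to its character state at this site.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, states)
    right_set, right_cost = fitch(right, states)
    common = left_set & right_set
    if common:
        return common, left_cost + right_cost
    # no shared state: one change is needed on a branch below this node
    return left_set | right_set, left_cost + right_cost + 1

def parsimony_score(tree, alignment):
    """Sum of per-site Fitch costs for an alignment given as {taxon: sequence}."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch(tree, {taxon: seq[i] for taxon, seq in alignment.items()})[1]
        for i in range(n_sites)
    )

alignment = {"A": "AAG", "B": "AAG", "C": "CAT", "D": "CAT"}
print(parsimony_score((("A", "B"), ("C", "D")), alignment))  # 2
print(parsimony_score((("A", "C"), ("B", "D")), alignment))  # 4
```

    The tree that groups A with B and C with D needs fewer changes and is therefore preferred under parsimony; a search heuristic repeats this scoring for many candidate topologies.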

  9. FOURIER ANALYSIS AND PHYLOGENETIC TREES STEVEN N. EVANS

    E-print Network

    Evans, Steven N.

    the most commonly used technique. Moreover, reconstruction from DNA sequence data has the added begins with a discussion of the sort of DNA sequence data that are used for phylogenetic reconstruction of phylogenetic invariants: a technique for reconstructing evolutionary family trees from DNA sequence data

  10. Representation in stochastic search for phylogenetic tree reconstruction

    E-print Network

    Shieber, Stuart

    Representation in stochastic search for phylogenetic tree reconstruction Griffin Weber a, Harvard University, USA Received 28 May 2005 Abstract Phylogenetic tree reconstruction is a process in which the ancestral relationships among a group of organisms are inferred from their DNA sequences

  11. Student Interpretations of Phylogenetic Trees in an Introductory Biology Course

    ERIC Educational Resources Information Center

    Dees, Jonathan; Momsen, Jennifer L.; Niemi, Jarad; Montplaisir, Lisa

    2014-01-01

    Phylogenetic trees are widely used visual representations in the biological sciences and the most important visual representations in evolutionary biology. Therefore, phylogenetic trees have also become an important component of biology education. We sought to characterize reasoning used by introductory biology students in interpreting taxa…

  12. Skewed Base Compositions, Asymmetric Transition Matrices, and Phylogenetic Invariants

    E-print Network

    Sankoff, David

    Skewed Base Compositions, Asymmetric Transition Matrices, and Phylogenetic Invariants V. FERRETTI a class of semigroups $S_c$ containing matrices of form $\begin{pmatrix} 1-a & ca \\ a & 1-ca \end{pmatrix}$ to account for A+T versus G+C as models for evolution where the phylogenetic inference problem involves species with skewed (AT-rich or AT

  13. Expectation Maximization for Combined Phylogenetic and Hidden Markov Models

    E-print Network

    Keinan, Alon

    Expectation Maximization for Combined Phylogenetic and Hidden Markov Models Adam Siepel December 5 with a combined phylogenetic and hidden Markov model. An efficient method is also shown for computing gradients be combined with hidden Markov models to create a very powerful hybrid model that captures spatial as well

  14. Combining Phylogenetic and Hidden Markov Models in Biosequence Analysis

    E-print Network

    Keinan, Alon

    Combining Phylogenetic and Hidden Markov Models in Biosequence Analysis Adam Siepel Center models of molecular evolution, which apply to individual sites, and hidden Markov models, which allow of secondary structure. In this paper, we review progress on combined phylogenetic and hidden Markov models

  15. The impact of multiple protein sequence alignment on phylogenetic estimation.

    PubMed

    Wang, Li-San; Leebens-Mack, Jim; Kerr Wall, P; Beckmann, Kevin; dePamphilis, Claude W; Warnow, Tandy

    2011-01-01

    Multiple sequence alignment is typically the first step in estimating phylogenetic trees, with the assumption being that as alignments improve, so will phylogenetic reconstructions. Over the last decade or so, new multiple sequence alignment methods have been developed to improve comparative analyses of protein structure, but these new methods have not been typically used in phylogenetic analyses. In this paper, we report on a simulation study that we performed to evaluate the consequences of using these new multiple sequence alignment methods in terms of the resultant phylogenetic reconstruction. We find that while alignment accuracy is positively correlated with phylogenetic accuracy, the amount of improvement in phylogenetic estimation that results from an improved alignment can range from quite small to substantial. We observe that phylogenetic accuracy is most highly correlated with alignment accuracy when sequences are most difficult to align, and that variation in alignment accuracy can have little impact on phylogenetic accuracy when alignment error rates are generally low. We discuss these observations and implications for future work. PMID:21566256

  16. Exploration of phylogenetic data using a global sequence analysis method

    PubMed Central

    Chapus, Charles; Dufraigne, Christine; Edwards, Scott; Giron, Alain; Fertil, Bernard; Deschavanne, Patrick

    2005-01-01

    Background: Molecular phylogenetic methods are based on alignments of nucleic or peptidic sequences. The tremendous increase in molecular data permits phylogenetic analyses of very long sequences and of many species, but also requires methods to help manage large datasets. Results: Here we explore the phylogenetic signal present in molecular data by genomic signatures, defined as the set of frequencies of short oligonucleotides present in DNA sequences. Although violating many of the standard assumptions of traditional phylogenetic analyses – in particular explicit statements of homology inherent in character matrices – the use of the signature does permit the analysis of very long sequences, even those that are unalignable, and is therefore most useful in cases where alignment is questionable. We compare the results obtained by traditional phylogenetic methods to those inferred by the signature method for two genes: RAG1, which is easily alignable, and 18S RNA, where alignments are often ambiguous for some regions. We also apply this method to a multigene data set of 33 genes for 9 bacteria and one archaeal species as well as to the whole genome of a set of 16 γ-proteobacteria. In addition to delivering phylogenetic results comparable to traditional methods, the comparison of signatures for the sequences involved in the bacterial example identified putative candidates for horizontal gene transfers. Conclusion: The signature method is therefore a fast tool for exploring phylogenetic data, providing not only a pretreatment for discovering new sequence relationships, but also for identifying cases of sequence evolution that could confound traditional phylogenetic analysis. PMID:16280081
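
    The abstract defines a genomic signature as the set of frequencies of short oligonucleotides in a DNA sequence. The sketch below computes such a frequency vector and compares two sequences without aligning them. The word length k, the Euclidean distance and the function names are assumptions made for illustration; the abstract does not specify which word length or distance the authors used.

```python
import math
from collections import Counter
from itertools import product

def signature(sequence, k=4):
    """Frequencies of all 4**k DNA words of length k, in lexicographic order."""
    seq = sequence.upper()
    words = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # words containing other symbols (e.g. N) are ignored
    total = sum(counts[w] for w in words)
    return [counts[w] / total if total else 0.0 for w in words]

def signature_distance(seq_a, seq_b, k=4):
    """Euclidean distance between two signatures (an assumed choice of metric)."""
    return math.dist(signature(seq_a, k), signature(seq_b, k))

print(signature("ACGT", k=1))                   # [0.25, 0.25, 0.25, 0.25]
print(signature_distance("AAAA", "CCCC", k=1))  # sqrt(2)
```

    A matrix of pairwise signature distances could then be passed to a distance-based tree method, which is how alignment-free comparisons are typically turned into a phylogeny.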

  17. Ensemble Sparse Classification of Alzheimer’s Disease

    PubMed Central

    Liu, Manhua; Zhang, Daoqiang; Shen, Dinggang

    2012-01-01

    High-dimensional pattern classification methods, e.g., support vector machines (SVM), have been widely investigated for analysis of structural and functional brain images (such as magnetic resonance imaging (MRI)) to assist the diagnosis of Alzheimer’s disease (AD) including its prodromal stage, i.e., mild cognitive impairment (MCI). Most existing classification methods extract features from neuroimaging data and then construct a single classifier to perform classification. However, due to noise and the small sample size of neuroimaging data, it is challenging to train a single global classifier that is robust enough to achieve good classification performance. In this paper, instead of building a single global classifier, we propose a local patch-based subspace ensemble method which builds multiple individual classifiers based on different subsets of local patches and then combines them for more accurate and robust classification. Specifically, to capture the local spatial consistency, each brain image is partitioned into a number of local patches and a subset of patches is randomly selected from the patch pool to build a weak classifier. Here, the sparse representation-based classification (SRC) method, which has been shown to be effective for classification of image data (e.g., face), is used to construct each weak classifier. Then, multiple weak classifiers are combined to make the final decision. We evaluate our method on 652 subjects (including 198 AD patients, 225 MCI and 229 normal controls) from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database using MR images. The experimental results show that our method achieves an accuracy of 90.8% and an area under the ROC curve (AUC) of 94.86% for AD classification and an accuracy of 87.85% and an AUC of 92.90% for MCI classification, respectively, demonstrating a very promising performance of our method compared with the state-of-the-art methods for AD/MCI classification using MR images. PMID:22270352
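
    The ensemble idea in this abstract (random subsets of patches, one weak classifier per subset, a combined decision) can be sketched as follows. This is not the authors' method: a nearest-centroid rule stands in for the sparse representation-based classifier, each patch is reduced to a single number, the decision is a plain majority vote, and all function names and default parameters are hypothetical.

```python
import random
from collections import Counter


def train_weak(X, y, patch_ids):
    """Nearest-centroid classifier restricted to the selected patches.

    Stand-in for SRC, which the abstract names but does not specify.
    """
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows)
                            for j in patch_ids]
    return patch_ids, centroids


def predict_weak(model, x):
    """Return the label whose centroid is closest on the selected patches."""
    patch_ids, centroids = model

    def sq_dist(centroid):
        return sum((x[j] - c) ** 2 for j, c in zip(patch_ids, centroid))

    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def train_ensemble(X, y, n_classifiers=25, patches_per_classifier=2, seed=0):
    """Train one weak classifier per random subset of patches."""
    rng = random.Random(seed)
    n_patches = len(X[0])
    return [train_weak(X, y, rng.sample(range(n_patches),
                                        patches_per_classifier))
            for _ in range(n_classifiers)]


def predict_ensemble(models, x):
    """Combine the weak classifiers by majority vote (assumed rule)."""
    votes = Counter(predict_weak(m, x) for m in models)
    return votes.most_common(1)[0][0]
```

    Each weak classifier sees only a few patches, so noise confined to one region affects only the classifiers that happened to sample it.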

  18. Application of COI Sequences in Studies of Phylogenetic Relationships Among 40 Apionidae Species

    PubMed Central

    Ptaszyńska, Aneta A.; Łętowski, Jacek; Gnat, Sebastian; Małek, Wanda

    2012-01-01

    The systematics of the family Apionidae, as well as the superfamily Curculionoidea, is currently in a state of flux. The comparative analyses of COI sequences from our studies shed some light on the systematics of these weevils. To study the relationship among the organisms of the family Apionidae, we determined the COI sequences of representatives of 23 species and 15 genera, i.e., Apion, Betulapion, Catapion, Ceratapion, Cyanapion, Eutrichapion, Exapion, Hemitrichapion, Holotrichapion, Ischnopterapion, Protapion, Pseudoperapion, Pseudoprotapion, Pseudostenapion, and Stenopterapion. Then, they were compared with the COI sequences of 19 species and eight genera from GenBank (Aspidapion, Ceratapion, Exapion, Ischnopterapion, Lepidapion, Omphalapion, Oxystoma, and Protapion). The phylogenetic relationships inferred from molecular data are similar to the classification system developed by Alonso-Zarazaga and Lyal (1999), with some exceptions within the tribe Oxystomatini, and genera Ceratapion and Exapion. PMID:22934614

  19. Occupational Classification System Manual

    NSDL National Science Digital Library

    Researchers may gain insight into the Bureau of Labor Statistics and Census Bureau occupational codes via the Occupational Classification System Manual (OCSM). A list of Major Occupation Group titles (MOGs) is provided as well as links to the Census Occupation Index--an alphabetical list of approximately 30,000 occupational titles. Further guidance in locating the proper occupation classification for research queries is outlined in the articles "Using the OCSM" and "Using the Census Index."

  20. Spline Classification Methods

    NASA Technical Reports Server (NTRS)

    Guseman, L. F., Jr.; Schumaker, L. L.

    1983-01-01

    The use of spline functions in the development of classification algorithms is discussed. A method is formulated for producing spline approximations to univariate density functions when each density function is described by a histogram of measurements. The resulting approximations are then incorporated into a Bayesian classification procedure for which the probability of misclassification can be readily computed. Some preliminary numerical results are presented to illustrate the method.
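
    The procedure outlined in this abstract (a histogram per class, a smooth approximation to each density, then a Bayesian decision) can be sketched under simplifying assumptions. A piecewise-linear interpolant through the bin centres, i.e. a degree-1 spline, stands in for the spline approximation, whose order and fitting method the abstract does not give. All function names are hypothetical.

```python
from bisect import bisect_right


def histogram_density(samples, lo, hi, n_bins):
    """Return (bin centres, density heights) of a normalized histogram."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for s in samples:
        if lo <= s < hi:
            # min() guards against rounding at the upper edge
            counts[min(int((s - lo) / width), n_bins - 1)] += 1
    n = sum(counts)
    if n == 0:
        raise ValueError("no samples fall inside [lo, hi)")
    centres = [lo + (i + 0.5) * width for i in range(n_bins)]
    heights = [c / (n * width) for c in counts]
    return centres, heights


def linear_spline(centres, heights, x):
    """Evaluate a degree-1 spline through the histogram points.

    Values outside the range of bin centres are clamped to the end heights.
    """
    if x <= centres[0]:
        return heights[0]
    if x >= centres[-1]:
        return heights[-1]
    i = bisect_right(centres, x) - 1
    t = (x - centres[i]) / (centres[i + 1] - centres[i])
    return heights[i] * (1 - t) + heights[i + 1] * t


def classify(x, densities, priors):
    """Bayes rule: pick the class maximizing prior times estimated density."""
    return max(densities,
               key=lambda c: priors[c] * linear_spline(*densities[c], x))
```

    With explicit density approximations for every class, the misclassification probability mentioned in the abstract could in principle be obtained by integrating the smaller of the weighted densities over the measurement range; that step is not shown here.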

  1. Potato Chip Classification

    NSDL National Science Digital Library

    1998-01-01

    This activity introduces the structure and function of a dichotomous key, in preparation for student identification of plant and animal specimens. It also reinforces the idea that there are many possible answers in science. Students will be able to classify specimens (in this case, potato chips) according to observable characteristics, prepare a key showing their classification system, use their key to identify a specimen, and recognize the validity of classmates' classification systems.

  2. Free classification of American English dialects by native and non-native listeners

    PubMed Central

    Clopper, Cynthia G.; Bradlow, Ann R.

    2009-01-01

    Most second language acquisition research focuses on linguistic structures, and less research has examined the acquisition of sociolinguistic patterns. The current study explored the perceptual classification of regional dialects of American English by native and non-native listeners using a free classification task. Results revealed similar classification strategies for the native and non-native listeners. However, the native listeners were more accurate overall than the non-native listeners. In addition, the non-native listeners were less able to make use of constellations of cues to accurately classify the talkers by dialect. However, the non-native listeners were able to attend to cues that were either phonologically or sociolinguistically relevant in their native language. These results suggest that non-native listeners can use information in the speech signal to classify talkers by regional dialect, but that their lack of signal-independent cultural knowledge about variation in the second language leads to less accurate classification performance. PMID:20161400

  3. Dental characters of the Quaternary tapirs in China, their significance in classification and phylogenetic assessment

    Microsoft Academic Search

    Haowen Tong

    2005-01-01

    Most of the Quaternary tapir fossils from China are isolated teeth. The purpose of this paper is to identify them and to extract systematic and evolutionary information from them. Based on morphology and W/L ratio, isolated teeth can be identified successfully. On the whole, the identification of P1, M3 and P2 is believed to be reliable, while it is difficult

  4. Phylogenetic comparison and classification of laccase and related multicopper oxidase protein sequences

    E-print Network

    James, Timothy

    sequences Patrik J. Hoegger1, Sreedhar Kilaru1, Timothy Y. James2, Jason R. Thacker2 and Ursula Kües1 1 Georg-August-University Göttingen, Institute of Forest Botany, Göttingen, Germany 2 Duke University, Buesgenweg 2, 37077 Göttingen, Germany Fax: +49 551392705 Tel: +49 5513914086 E-mail: phoegge

  5. A higher-level phylogenetic classification of the Fungi David S. HIBBETTa,

    E-print Network

    Lutzoni, François M.

    . Brandon MATHENYa , David J. MCLAUGHLINh , Martha J. POWELLi , Scott REDHEADj , Conrad L. SCHOCHk , Joseph Ave, Beltsville, MD 20705 USA n ABL Herbarium, Gerrit van der Veenstraat 107, NL-3762 XK Soest

  6. Predicting rates of interspecific interaction from phylogenetic trees.

    PubMed

    Nuismer, Scott L; Harmon, Luke J

    2015-01-01

    Integrating phylogenetic information can potentially improve our ability to explain species' traits, patterns of community assembly, the network structure of communities, and ecosystem function. In this study, we use mathematical models to explore the ecological and evolutionary factors that modulate the explanatory power of phylogenetic information for communities of species that interact within a single trophic level. We find that phylogenetic relationships among species can influence trait evolution and rates of interaction among species, but only under particular models of species interaction. For example, when interactions within communities are mediated by a mechanism of phenotype matching, phylogenetic trees make specific predictions about trait evolution and rates of interaction. In contrast, if interactions within a community depend on a mechanism of phenotype differences, phylogenetic information has little, if any, predictive power for trait evolution and interaction rate. Together, these results make clear and testable predictions for when and how evolutionary history is expected to influence contemporary rates of species interaction. PMID:25349102

  7. Open Reading Frame Phylogenetic Analysis on the Cloud

    PubMed Central

    2013-01-01

    Phylogenetic analysis has become essential in researching the evolutionary relationships between viruses. These relationships are depicted on phylogenetic trees, in which viruses are grouped based on sequence similarity. Viral evolutionary relationships are identified from open reading frames rather than from complete sequences. Recently, cloud computing has become popular for developing internet-based bioinformatics tools. Biocloud is an efficient, scalable, and robust bioinformatics computing service. In this paper, we propose a cloud-based open reading frame phylogenetic analysis service. The proposed service integrates the Hadoop framework, virtualization technology, and phylogenetic analysis methods to provide a high-availability, large-scale bioservice. In a case study, we analyze the phylogenetic relationships among Norovirus. Evolutionary relationships are elucidated by aligning different open reading frame sequences. The proposed platform correctly identifies the evolutionary relationships between members of Norovirus. PMID:23671843

  8. New weighting methods for phylogenetic tree reconstruction using multiple loci.

    PubMed

    Misawa, Kazuharu; Tajima, Fumio

    2012-08-01

    Efficient determination of evolutionary distances is important for the correct reconstruction of phylogenetic trees. The performance of the pooled distance required for reconstructing a phylogenetic tree can be improved by applying large weights to appropriate distances for reconstructing phylogenetic trees and small weights to inappropriate distances. We developed two weighting methods, the modified Tajima-Takezaki method and the modified least-squares method, for reconstructing phylogenetic trees from multiple loci. By computer simulations, we found that both of the new methods were more efficient in reconstructing correct topologies than the no-weight method. Hence, we reconstructed hominoid phylogenetic trees from mitochondrial DNA using our new methods, and found that the levels of bootstrap support were significantly increased by the modified Tajima-Takezaki and by the modified least-squares method. PMID:22871951

  9. Student interpretations of phylogenetic trees in an introductory biology course

    NASA Astrophysics Data System (ADS)

    Dees, Jonathan Andrew

    Phylogenetic trees are a common visual representation in biology, and the most important visual representation used in evolutionary biology. Thus, phylogenetic trees have also become an important component of biology education. We sought to determine what forms of reasoning are utilized by introductory biology students to interpret taxa relatedness on phylogenetic trees, what percentage of students correctly interpret taxa relatedness, and how these results alter in response to instruction and over time. Our students demonstrated a tendency for counting synapomorphies and nodes, rather than more common misinterpretations found in current literature. Students also struggled mightily with correctly interpreting phylogenetic trees, including many who exhibited memorization of correct reasoning. Broad initial instruction achieved little for phylogenetic tree understanding. More targeted instruction on evolutionary relationships improved understanding, but to a still unacceptable level. It appears these visual representations, which can directly affect student understanding of evolution, represent a formidable challenge for instructors.

  10. Analysis of complete mitochondrial genomes from extinct and extant rhinoceroses reveals lack of phylogenetic resolution

    PubMed Central

    Willerslev, Eske; Gilbert, M Thomas P; Binladen, Jonas; Ho, Simon YW; Campos, Paula F; Ratan, Aakrosh; Tomsho, Lynn P; da Fonseca, Rute R; Sher, Andrei; Kuznetsova, Tatanya V; Nowak-Kemp, Malgosia; Roth, Terri L; Miller, Webb; Schuster, Stephan C

    2009-01-01

    Background The scientific literature contains many examples where DNA sequence analyses have been used to provide definitive answers to phylogenetic problems that traditional (non-DNA based) approaches alone have failed to resolve. One notable example concerns the rhinoceroses, a group for which several contradictory phylogenies were proposed on the basis of morphology, then apparently resolved using mitochondrial DNA fragments. Results In this study we report the first complete mitochondrial genome sequences of the extinct ice-age woolly rhinoceros (Coelodonta antiquitatis), and the threatened Javan (Rhinoceros sondaicus), Sumatran (Dicerorhinus sumatrensis), and black (Diceros bicornis) rhinoceroses. In combination with the previously published mitochondrial genomes of the white (Ceratotherium simum) and Indian (Rhinoceros unicornis) rhinoceroses, this data set putatively enables reconstruction of the rhinoceros phylogeny. While the six species cluster into three strongly supported sister-pairings: (i) The black/white, (ii) the woolly/Sumatran, and (iii) the Javan/Indian, resolution of the higher-level relationships has no statistical support. The phylogenetic signal from individual genes is highly diffuse, with mixed topological support from different genes. Furthermore, the choice of outgroup (horse vs tapir) has considerable effect on reconstruction of the phylogeny. The lack of resolution is suggestive of a hard polytomy at the base of crown-group Rhinocerotidae, and this is supported by an investigation of the relative branch lengths. Conclusion Satisfactory resolution of the rhinoceros phylogeny may not be achievable without additional analyses of substantial amounts of nuclear DNA. This study provides a compelling demonstration that, in spite of substantial sequence length, there are significant limitations with single-locus phylogenetics. 
    We expect further examples of this to appear as next-generation, large-scale sequencing of complete mitochondrial genomes becomes commonplace in evolutionary studies. "The human factor in classification is nowhere more evident than in dealing with this superfamily (Rhinocerotoidea)." G. G. Simpson (1945) PMID:19432984

  11. Molecular phylogenetics, species diversity, and biogeography of the Andean lizards of the genus Proctoporus (Squamata: Gymnophthalmidae).

    PubMed

    Goicoechea, Noemí; Padial, José M; Chaparro, Juan C; Castroviejo-Fisher, Santiago; De la Riva, Ignacio

    2012-12-01

    The family Gymnophthalmidae comprises ca. 220 described species of Neotropical lizards distributed from southern Mexico to Argentina. It includes 36 genera, among them Proctoporus, which contains six currently recognized species occurring across the yungas forests and wet montane grasslands of the Amazonian versant of the Andes from central Peru to central Bolivia. Here, we investigate the phylogenetic relationships and species limits of Proctoporus and closely related taxa by analyzing 2121 base pairs of mitochondrial (12S, 16S, and ND4) and nuclear (c-mos) genes. Our taxon sampling of 92 terminals includes all currently recognized species of Proctoporus and 15 additional species representing the most closely related groups to the genus. Maximum parsimony, maximum likelihood and Bayesian phylogenetic analyses recovered a congruent, fully resolved, and strongly supported hypothesis of relationships that challenges previous phylogenetic hypotheses and classifications, and biogeographic scenarios. Our main results are: (i) discovery of a strongly supported clade that includes all species of Proctoporus and within which are nested the monotypic Opipeuter xestus (a genus that we consider a junior synonym of Proctoporus), and two species of Euspondylus, that are therefore transferred to Proctoporus; (ii) the paraphyly of Proctoporus bolivianus with respect to P. subsolanus, which is proposed as a junior synonym of P. bolivianus; (iii) the detection of seven divergent and reciprocally monophyletic lineages (five of them previously assigned to P. bolivianus) that are considered confirmed candidate species, which implies that more candidate species are awaiting formal description and naming than currently recognized species in the genus; (iv) rejection of the hypothesis that Proctoporus diversified following a south to north pattern parallel to the elevation of the Andes; (v) species diversity in Proctoporus is the result of in situ diversification through vicariance in the grasslands of the high Andes, with at least five dispersals contributing to montane forest species. PMID:22982151

  12. Close phylogenetic relationship between Angolan and Romanian HIV-1 subtype F1 isolates

    PubMed Central

    Guimarães, Monick L; Vicente, Ana Carolina P; Otsuki, Koko; da Silva, Rosa Ferreira FC; Francisco, Moises; da Silva, Filomena Gomes; Serrano, Ducelina; Morgado, Mariza G; Bello, Gonzalo

    2009-01-01

    Background Here, we investigated the phylogenetic relationships of the HIV-1 subtype F1 circulating in Angola with subtype F1 strains sampled worldwide and reconstructed the evolutionary history of this subtype in Central Africa. Methods Forty-six HIV-1-positive samples were collected in Angola in 2006 and subtyped at the env-gp41 region. Partial env-gp120 and pol-RT sequences and near full-length genomes from those env-gp41 subtype F1 samples were further generated. Phylogenetic analyses of partial and full-length subtype F1 strains isolated worldwide were carried out. The onset date of the subtype F1 epidemic in Central Africa was estimated using a Bayesian Markov chain Monte Carlo approach. Results Nine Angolan samples were classified as subtype F1 based on the analysis of the env-gp41 region. All nine Angolan sequences were also classified as subtype F1 in both env-gp120 and pol-RT genomic regions, and near full-length genome analysis of four of these samples confirmed their classification as "pure" subtype F1. Phylogenetic analyses of subtype F1 strains isolated worldwide revealed that isolates from the Democratic Republic of Congo (DRC) were the earliest branching lineages within the subtype F1 phylogeny. Most strains from Angola segregated in a monophyletic group together with Romanian sequences; whereas South American F1 sequences emerged as an independent cluster. The origin of the subtype F1 epidemic in Central Africa was estimated at 1958 (1934–1971). Conclusion "Pure" subtype F1 strains are common in Angola and seem to be the result of a single founder event. Subtype F1 sequences from Angola are closely related to those described in Romania, and only distantly related to the subtype F1 lineage circulating in South America. Original diversification of subtype F1 probably occurred within the DRC around the late 1950s. PMID:19386115

  13. Reconstruction of Family-Level Phylogenetic Relationships within Demospongiae (Porifera) Using Nuclear Encoded Housekeeping Genes

    PubMed Central

    Hill, Malcolm S.; Hill, April L.; Lopez, Jose; Peterson, Kevin J.; Pomponi, Shirley; Diaz, Maria C.; Thacker, Robert W.; Adamska, Maja; Boury-Esnault, Nicole; Cárdenas, Paco; Chaves-Fonnegra, Andia; Danka, Elizabeth; De Laine, Bre-Onna; Formica, Dawn; Hajdu, Eduardo; Lobo-Hajdu, Gisele; Klontz, Sarah; Morrow, Christine C.; Patel, Jignasa; Picton, Bernard; Pisani, Davide; Pohlmann, Deborah; Redmond, Niamh E.; Reed, John; Richey, Stacy; Riesgo, Ana; Rubin, Ewelina; Russell, Zach; Rützler, Klaus; Sperling, Erik A.; di Stefano, Michael; Tarver, James E.; Collins, Allen G.

    2013-01-01

    Background Demosponges are challenging for phylogenetic systematics because of their plastic and relatively simple morphologies and many deep divergences between major clades. To improve understanding of the phylogenetic relationships within Demospongiae, we sequenced and analyzed seven nuclear housekeeping genes involved in a variety of cellular functions from a diverse group of sponges. Methodology/Principal Findings We generated data from each of the four sponge classes (i.e., Calcarea, Demospongiae, Hexactinellida, and Homoscleromorpha), but focused on family-level relationships within demosponges. With data for 21 newly sampled families, our Maximum Likelihood and Bayesian-based approaches recovered previously phylogenetically defined taxa: Keratosap, Myxospongiaep, Spongillidap, Haploscleromorphap (the marine haplosclerids) and Democlaviap. We found conflicting results concerning the relationships of Keratosap and Myxospongiaep to the remaining demosponges, but our results strongly supported a clade of Haploscleromorphap+Spongillidap+Democlaviap. In contrast to hypotheses based on mitochondrial genome and ribosomal data, nuclear housekeeping gene data suggested that freshwater sponges (Spongillidap) are sister to Haploscleromorphap rather than part of Democlaviap. Within Keratosap, we found equivocal results as to the monophyly of Dictyoceratida. Within Myxospongiaep, Chondrosida and Verongida were monophyletic. A well-supported clade within Democlaviap, Tetractinellidap, composed of all sampled members of Astrophorina and Spirophorina (including the only lithistid in our analysis), was consistently revealed as the sister group to all other members of Democlaviap. Within Tetractinellidap, we did not recover monophyletic Astrophorina or Spirophorina. Our results also reaffirmed the monophyly of order Poecilosclerida (excluding Desmacellidae and Raspailiidae), and polyphyly of Hadromerida and Halichondrida. 
    Conclusions/Significance These results, using an independent nuclear gene set, confirmed many hypotheses based on ribosomal and/or mitochondrial genes, and they also identified clades with low statistical support or clades that conflicted with traditional morphological classification. Our results will serve as a basis for future exploration of these outstanding questions using more taxon- and gene-rich datasets. PMID:23372644

  14. Indigenous vs. International soil classification system in Ohangwena Region, Namibia

    NASA Astrophysics Data System (ADS)

    Prudat, Brice; Kuhn, Nikolaus J.; Bloemertz, Lena

    2014-05-01

    This poster will present soil diversity in North-Central Namibia, with a focus on soil fertility. It aims to show the correspondence and differences between an international and an indigenous soil classification system. International classifications, like the World Reference Base for Soil Resources (WRB), are very helpful tools for sharing information in soil science and agriculture. While these classifications are meaningful for understanding large-scale soil processes, they cannot capture or differentiate local specificities. On the other hand, the knowledge that farmers have of cultivated soils is very accurate and adapted to local agricultural use. However, this knowledge must be properly defined and translated before scientists can use it. Once it can be read by scientists, it provides very powerful tools for soil mapping and characterization. Analysis so far has focused on the area of Ondobe (30 km west of Eenhana, Ohangwena region). This area is located between two major systems, the Cuvelai floodplain to the west and the Kalahari Woodlands to the east. While all the cultivated soils from this region would be classified as Arenosols (WRB), the local classification differentiates five major soil types (Omutunda, Ehenge, Omufitu, Elondo, Ehenene). In the WRB classification, these soils correspond roughly to specific Arenosols: Hypereutric, Albic, Haplic, Rubic and Salic Arenosols, respectively. Further work will evaluate the local variation within each indigenous soil type. Hierarchical classification using soil field descriptors will be used to create statistical soil groups. These new groups will then be compared to each classification system.

  15. Detecting Phylogenetic Signals From Deep Roots of the Tree of Life

    E-print Network

    Amrine, Katherine Colleen Harris

    2013-01-01

    phylogenetic tree reconstruction DNA Deoxyribonucleic Acid (DNA → RNA → protein), the products of this universal process have been exploited for phylogenetic reconstruction ... DNA to code for a protein, or coding RNA, can be beneficial for phylogenetic tree reconstruction

  16. Weighted morphology: a new approach towards phylogenetic assessment of Nostocales (Cyanobacteria).

    PubMed

    Mishra, Swati; Bhargava, Poonam; Adhikary, Siba Prasad; Pradeep, Anubhav; Rai, Lal Chand

    2015-01-01

    The classification of the order Nostocales (Cyanobacteria) and the interrelationships of morphologically similar taxa are still debatable due to ever-changing morphological features. No attempt has been made to improve the morphological taxonomy despite the fact that it is the morphology that represents the totality of genes. To test the validity of morphological taxonomy and fine-tune the phylogenetic relationships within the order Nostocales, a new weighted morphology approach was applied using 76 isolates and their 16S rRNA gene sequences. Further, the study was extended with a morphological data set of the remaining 232 taxa for which no molecular data are yet available. Trichome aggregation, heterocyst shape, and akinete shape are suggested as important and stable features for identification. At 30% weight assignment to the selected morphological characters, the morphological taxonomy was found to be 36% compatible with the 16S tree. Adding weight to the morphological characters considerably improved the congruence between the morphology and 16S rRNA-based phylogenetic trees of the order Nostocales. When the weighting procedure was extended to all the Nostocalean members irrespective of molecular data availability, it was found that Nostoc sphaericum and Nostoc microscopicum closely assembled in a single clade. Closer arrangement of Aulosira and Nodularia represents the subfamily aulosirae (Bornet and Flahault Ann Sci Nat Bot 7:223-224, 1888), while the taxonomic affiliation of Cylindrospermum with Nostoc, Anabaena, and Raphidiopsis, representing the subfamily anabaenae (Bornet and Flahault Ann Sci Nat Bot 7:223-224, 1888), was resolved. PMID:24965370

  17. Phylogenetic Relationships in Bupleurum (Apiaceae) Based on Nuclear Ribosomal DNA ITS Sequence Data

    PubMed Central

    NEVES, SUSANA S.; WATSON, MARK F.

    2004-01-01

    • Background and Aims The genus Bupleurum has long been recognized as a natural group, but its infrageneric classification is controversial and has not yet been studied in the light of sequence data. • Methods Phylogenetic relationships among 32 species (35 taxa) of the genus Bupleurum were investigated by comparative sequencing of the ITS region of the 18–26S nuclear ribosomal DNA repeat. Exemplar taxa from all currently accepted sections and subsections of the genus were included, along with outgroups from four other early branching Apioideae genera (Anginon, Heteromorpha, Physospermum and Pleurospermum). • Key Results Phylogenies generated by maximum parsimony, maximum likelihood, and neighbour-joining methods show similar topologies, demonstrating monophyly of Bupleurum and the division of the genus into two major clades. This division is also supported by analysis of the 5.8S coding sequence alone. The first branching clade is formed by all the species of the genus with pinnate-reticulate veined leaves and B. rigidum with a unique type of leaf venation. The other major clade includes the remaining species studied, all of which have more or less parallel-veined leaves. • Conclusions These phylogenetic results do not agree with any previous classifications of the genus. Molecular data also suggest that the endemic Macaronesian species B. salicifolium is a neoendemic, as the sequence divergence between the populations in Madeira and Canary Islands, and closer mainland relatives in north-west Africa is small. All endemic north-west African taxa are included in a single unresolved but well-supported clade, and the low nucleotide variation of ITS suggests a recent radiation within this group. The only southern hemisphere species, B. mundii (southern Africa), is shown to be a neoendemic, apparently closely related to B. falcatum, a Eurasian species. PMID:14980972

  18. Extrahepatic biliary cancer: New staging classification

    PubMed Central

    Ganeshan, Dhakshinamoorthy; Moron, Fanny E; Szklaruk, Janio

    2012-01-01

    Tumor staging defines the point in the natural history of the malignancy when the diagnosis is made. The most common staging system for cancer is the tumor, node, metastases classification. Staging of cancers provides useful parameters in the determination of the extent of disease and prognosis. Cholangiocarcinomas are rare cancers that arise from the biliary epithelium. These tumors can occur anywhere along the biliary tree and have previously been divided into extrahepatic and intrahepatic lesions. Until recently the extrahepatic bile duct tumors were considered a single entity in the American Joint Committee on Cancer (AJCC) staging classification. The most recent changes to the AJCC classification of bile duct cancers divide the tumors into two major categories: proximal and distal tumors. This practical classification is based on anatomy and surgical management. High-quality cross-sectional computed tomography (CT) and/or magnetic resonance (MR) imaging of the abdomen is essential to accurately stage these tumors. Imaging plays an important role in diagnosis, localization, staging and optimal management of cholangiocarcinoma. For example, it helps to localize the tumor to either the perihilar or the distal bile duct, which are managed differently. Further, it helps to accurately stage the disease and identify the presence of significant nodal and distant metastasis, which may preclude surgery. Also, it helps to identify the extent of local invasion, which has a major impact on the management. For example, extensive involvement of the hepatic duct reaching up to second-order biliary radicals, or major vascular encasement of the portal vein or hepatic arteries, precludes curative surgery, and the patient may be managed with palliative therapy. Further, imaging helps to identify any anatomical variations in the hepatic arterial or venous circulation and biliary ductal system, which is vital information for surgical planning. This review presents the relevant clinical presentation and imaging acquisition and presentation for the accurate staging classification of bile duct tumors based on the new AJCC criteria. This will be performed with the assistance of anatomical diagrams and representative CT and MR images. The image interpretation must include all relevant imaging information for optimum staging. Detailed recommendations on the items required on the radiology report will be presented. PMID:22937214

  19. Threatened Species and the Potential Loss of Phylogenetic Diversity: Conservation Scenarios Based on Estimated Extinction Probabilities and Phylogenetic Risk Analysis

    Microsoft Academic Search

    DANIEL P. FAITH

    2008-01-01

    New species conservation strategies, including the EDGE of Existence (EDGE) program, have expanded threatened species assessments by integrating information about species' phylogenetic distinctiveness. Distinctiveness has been measured through simple scores that assign shared credit among species for evolutionary heritage represented by the deeper phylogenetic branches. A species with a high score combined with a high extinction probability receives

  20. Evolution and Classification of P-loop Kinases and Related Proteins

    Microsoft Academic Search

    Detlef D. Leipe; Eugene V. Koonin; L. Aravind

    2003-01-01

    Sequences and structures of all P-loop-fold proteins were compared with the aim of reconstructing the principal events in the evolution of P-loop-containing kinases. It is shown that kinases and some related proteins comprise a monophyletic assemblage within the P-loop NTPase fold. An evolutionary classification of these proteins was developed using standard phylogenetic methods, analysis of shared sequence and structural signatures,

  1. Phylogeny and Classification of the Trapdoor Spider Genus Myrmekiaphila: An Integrative Approach to Evaluating Taxonomic Hypotheses

    PubMed Central

    Bailey, Ashley L.; Brewer, Michael S.; Hendrixson, Brent E.; Bond, Jason E.

    2010-01-01

    Background Revised by Bond and Platnick in 2007, the trapdoor spider genus Myrmekiaphila comprises 11 species. Species delimitation and placement within one of three species groups was based on modifications of the male copulatory device. Because a phylogeny of the group was not available these species groups might not represent monophyletic lineages; species definitions likewise were untested hypotheses. The purpose of this study is to reconstruct the phylogeny of Myrmekiaphila species using molecular data to formally test the delimitation of species and species-groups. We seek to refine a set of established systematic hypotheses by integrating across molecular and morphological data sets. Methods and Findings Phylogenetic analyses comprising Bayesian searches were conducted for a mtDNA matrix composed of contiguous 12S rRNA, tRNA-val, and 16S rRNA genes and a nuclear DNA matrix comprising the glutamyl and prolyl tRNA synthetase gene each consisting of 1348 and 481 bp, respectively. Separate analyses of the mitochondrial and nuclear genome data and a concatenated data set yield M. torreya and M. millerae paraphyletic with respect to M. coreyi and M. howelli and polyphyletic fluviatilis and foliata species groups. Conclusions Despite the perception that molecular data present a solution to a crisis in taxonomy, studies like this demonstrate the efficacy of an approach that considers data from multiple sources. A DNA barcoding approach during the species discovery process would fail to recognize at least two species (M. coreyi and M. howelli) whereas a combined approach more accurately assesses species diversity and illuminates speciation pattern and process. Concomitantly these data also demonstrate that morphological characters likewise fail in their ability to recover monophyletic species groups and result in an unnatural classification. Optimizations of these characters demonstrate a pattern of “Dollo evolution” wherein a complex character evolves only once but is lost multiple times throughout the group's history. PMID:20856873

  2. Compression-based distance (CBD): a simple, rapid, and accurate method for microbiota composition comparison

    PubMed Central

    2013-01-01

    Background Perturbations in intestinal microbiota composition have been associated with a variety of gastrointestinal tract-related diseases. The alleviation of symptoms has been achieved using treatments that alter the gastrointestinal tract microbiota toward that of healthy individuals. Identifying differences in microbiota composition through the use of 16S rRNA gene hypervariable tag sequencing has profound health implications. Current computational methods for comparing microbial communities are usually based on multiple alignments and phylogenetic inference, making them time consuming and requiring exceptional expertise and computational resources. As sequencing data rapidly grows in size, simpler analysis methods are needed to meet the growing computational burdens of microbiota comparisons. Thus, we have developed a simple, rapid, and accurate method, independent of multiple alignments and phylogenetic inference, to support microbiota comparisons. Results We create a metric, called compression-based distance (CBD) for quantifying the degree of similarity between microbial communities. CBD uses the repetitive nature of hypervariable tag datasets and well-established compression algorithms to approximate the total information shared between two datasets. Three published microbiota datasets were used as test cases for CBD as an applicable tool. Our study revealed that CBD recaptured 100% of the statistically significant conclusions reported in the previous studies, while achieving a decrease in computational time required when compared to similar tools without expert user intervention. Conclusion CBD provides a simple, rapid, and accurate method for assessing distances between gastrointestinal tract microbiota 16S hypervariable tag datasets. PMID:23617892
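    The idea of using a compressor to estimate shared information between two datasets can be illustrated with a short sketch. The code below is not the CBD formula from the record above, which the abstract does not give; it uses the generic normalized compression distance with zlib as a stand-in, and the function names and the synthetic "tag" samples are assumptions made purely for illustration.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Size in bytes of the zlib-compressed data (maximum compression level)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance, a generic stand-in for a
    compression-based distance (not the published CBD definition)."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def make_sample(tags: list, rng: random.Random) -> bytes:
    """Hypothetical sample: the given tags, one per line, in shuffled order."""
    shuffled = tags[:]
    rng.shuffle(shuffled)
    return "\n".join(shuffled).encode()


rng = random.Random(0)
# Two unrelated, synthetic pools of 40 tags (100 bases each).
pool_1 = ["".join(rng.choice("ACGT") for _ in range(100)) for _ in range(40)]
pool_2 = ["".join(rng.choice("ACGT") for _ in range(100)) for _ in range(40)]

sample_a = make_sample(pool_1, rng)  # drawn from pool 1
sample_b = make_sample(pool_1, rng)  # same tags as sample_a, different order
sample_c = make_sample(pool_2, rng)  # entirely different tags

print("a vs b:", ncd(sample_a, sample_b))
print("a vs c:", ncd(sample_a, sample_c))
```

    Samples that share tags compress well when concatenated, so their distance comes out smaller than that of samples with no tags in common. No alignment or tree inference is involved, which is the property the abstract emphasizes.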

  3. Obtaining accurate measurement using redundant sensors

    E-print Network

    Burnett, Michael Scott

    1996-01-01

    Conventional wisdom suggests that, to accomplish accurate measurement, the sensors used must have high precision and excellent dynamic range. This generally results in sensor systems that are complex, costly, and often sensitive to environmental factors...

  4. Determination of Highly Accurate Heats of Formation

    NASA Technical Reports Server (NTRS)

    Lee, Timothy J.; Langhoff, Stephen R. (Technical Monitor)

    1996-01-01

    Two approaches for directly computing a molecular heat of formation based on sophisticated ab initio electronic structure theory are discussed and example calculations are presented for several molecules of interest in atmospheric chemistry including HNO and ClCN. The accuracy of these approaches for a small subset of the molecules is demonstrated by comparison to very accurate experimental data. A third approach for evaluating accurate heats of formation consists of combining experimental and theoretical data. The potential for this method to provide reliable thermochemical data on many molecules using accurate experimental heats of formation for only a small number of species is demonstrated by giving accurate ΔH_f^0 quantities for a host of fluorine, chlorine, and bromine oxide and nitrogen oxide compounds. Potential pitfalls in all three approaches are discussed.

  5. Accurate capacitive metrology for atomic force microscopy

    E-print Network

    Mazzeo, Aaron D. (Aaron David), 1979-

    2005-01-01

    This thesis presents accurate capacitive sensing metrology designed for a prototype atomic force microscope (AFM) originally developed in the MIT Precision Motion Control Lab. The capacitive measurements use a set of ...

  6. Scalable metagenomic taxonomy classification using a reference genome database

    PubMed Central

    Ames, Sasha K.; Hysom, David A.; Gardner, Shea N.; Lloyd, G. Scott; Gokhale, Maya B.; Allen, Jonathan E.

    2013-01-01

    Motivation: Deep metagenomic sequencing of biological samples has the potential to recover otherwise difficult-to-detect microorganisms and accurately characterize biological samples with limited prior knowledge of sample contents. Existing metagenomic taxonomic classification algorithms, however, do not scale well to analyze large metagenomic datasets, and balancing classification accuracy with computational efficiency presents a fundamental challenge. Results: A method is presented to shift computational costs to an off-line computation by creating a taxonomy/genome index that supports scalable metagenomic classification. Scalable performance is demonstrated on real and simulated data to show accurate classification in the presence of novel organisms on samples that include viruses, prokaryotes, fungi and protists. Taxonomic classification of the previously published 150 giga-base Tyrolean Iceman dataset was found to take <20 h on a single node 40 core large memory machine and provide new insights on the metagenomic contents of the sample. Availability: Software was implemented in C++ and is freely available at http://sourceforge.net/projects/lmat Contact: allen99@llnl.gov Supplementary information: Supplementary data are available at Bioinformatics online. PMID:23828782

  7. Investigating the performance of AIC in selecting phylogenetic models.

    PubMed

    Jhwueng, Dwueng-Chwuan; Huzurbazar, Snehalata; O'Meara, Brian C; Liu, Liang

    2014-08-01

    The popular likelihood-based model selection criterion, Akaike's Information Criterion (AIC), is a breakthrough mathematical result derived from information theory. AIC is an approximation to Kullback-Leibler (KL) divergence with the derivation relying on the assumption that the likelihood function has finite second derivatives. However, for phylogenetic estimation, given that tree space is discrete with respect to tree topology, the assumption of a continuous likelihood function with finite second derivatives is violated. In this paper, we investigate the relationship between the expected log likelihood of a candidate model, and the expected KL divergence in the context of phylogenetic tree estimation. We find that given the tree topology, AIC is an unbiased estimator of the expected KL divergence. However, when the tree topology is unknown, AIC tends to underestimate the expected KL divergence for phylogenetic models. Simulation results suggest that the degree of underestimation varies across phylogenetic models so that even for large sample sizes, the bias of AIC can result in selecting a wrong model. As the choice of phylogenetic models is essential for statistical phylogenetic inference, it is important to improve the accuracy of model selection criteria in the context of phylogenetics. PMID:24867284
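    For readers unfamiliar with the criterion discussed in this record, the standard definition is AIC = 2k - 2 ln L, where k is the number of free parameters and ln L is the maximized log-likelihood; the candidate with the lowest AIC is preferred. The sketch below only illustrates that arithmetic. The model names, log-likelihoods and parameter counts are invented for illustration and do not come from the paper.

```python
def aic(log_likelihood: float, num_params: int) -> float:
    """Akaike's Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * num_params - 2 * log_likelihood


# Hypothetical candidates: name -> (maximized log-likelihood, free parameters).
candidates = {
    "model_A": (-1234.5, 3),
    "model_B": (-1230.0, 6),
    "model_C": (-1229.8, 10),
}

scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)

for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: AIC = {score:.1f}")
print("selected:", best)
```

    In this toy comparison the richest model has the highest likelihood but is not selected, because the 2k penalty outweighs its small gain in fit. The abstract's point is that when the tree topology is also unknown, this penalty can be too small.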

  8. Student Interpretations of Phylogenetic Trees in an Introductory Biology Course

    PubMed Central

    Dees, Jonathan; Niemi, Jarad; Montplaisir, Lisa

    2014-01-01

    Phylogenetic trees are widely used visual representations in the biological sciences and the most important visual representations in evolutionary biology. Therefore, phylogenetic trees have also become an important component of biology education. We sought to characterize reasoning used by introductory biology students in interpreting taxa relatedness on phylogenetic trees, to measure the prevalence of correct taxa-relatedness interpretations, and to determine how student reasoning and correctness change in response to instruction and over time. Counting synapomorphies and nodes between taxa were the most common forms of incorrect reasoning, which presents a pedagogical dilemma concerning labeled synapomorphies on phylogenetic trees. Students also independently generated an alternative form of correct reasoning using monophyletic groups, the use of which decreased in popularity over time. Approximately half of all students were able to correctly interpret taxa relatedness on phylogenetic trees, and many memorized correct reasoning without understanding its application. Broad initial instruction that allowed students to generate inferences on their own contributed very little to phylogenetic tree understanding, while targeted instruction on evolutionary relationships improved understanding to some extent. Phylogenetic trees, which can directly affect student understanding of evolution, appear to offer introductory biology instructors a formidable pedagogical challenge. PMID:25452489

  9. Student interpretations of phylogenetic trees in an introductory biology course.

    PubMed

    Dees, Jonathan; Momsen, Jennifer L; Niemi, Jarad; Montplaisir, Lisa

    2014-01-01

    Phylogenetic trees are widely used visual representations in the biological sciences and the most important visual representations in evolutionary biology. Therefore, phylogenetic trees have also become an important component of biology education. We sought to characterize reasoning used by introductory biology students in interpreting taxa relatedness on phylogenetic trees, to measure the prevalence of correct taxa-relatedness interpretations, and to determine how student reasoning and correctness change in response to instruction and over time. Counting synapomorphies and nodes between taxa were the most common forms of incorrect reasoning, which presents a pedagogical dilemma concerning labeled synapomorphies on phylogenetic trees. Students also independently generated an alternative form of correct reasoning using monophyletic groups, the use of which decreased in popularity over time. Approximately half of all students were able to correctly interpret taxa relatedness on phylogenetic trees, and many memorized correct reasoning without understanding its application. Broad initial instruction that allowed students to generate inferences on their own contributed very little to phylogenetic tree understanding, while targeted instruction on evolutionary relationships improved understanding to some extent. Phylogenetic trees, which can directly affect student understanding of evolution, appear to offer introductory biology instructors a formidable pedagogical challenge. PMID:25452489

  10. Phylogenetic structure of angiosperm communities during tropical forest succession

    PubMed Central

    Letcher, Susan G.

    2010-01-01

    The phylogenetic structure of ecological communities can shed light on assembly processes, but the focus of phylogenetic structure research thus far has been on mature ecosystems. Here, I present the first investigation of phylogenetic community structure during succession. In a replicated chronosequence of 30 sites in northeastern Costa Rica, I found strong phylogenetic overdispersion at multiple scales: species present at local sites were a non-random assemblage, more distantly related than chance would predict. Phylogenetic overdispersion was evident when comparing the species present at each site with the regional species pool, the species pool found in each age category to the regional pool or the species present at each site to the pool of species found in sites of that age category. Comparing stem size classes within each age category, I found that during early succession, phylogenetic overdispersion is strongest in small stems. Overdispersion strengthens and spreads into larger size classes as succession proceeds, corroborating an existing model of forest succession. This study is the first evidence that succession leaves a distinct signature in the phylogenetic structure of communities. PMID:19801375

  11. Phylogeny and classification of Prunus sensu lato (Rosaceae).

    PubMed

    Shi, Shuo; Li, Jinlu; Sun, Jiahui; Yu, Jing; Zhou, Shiliang

    2013-11-01

    The classification of the economically important genus Prunus L. sensu lato (s.l.) is controversial due to the high levels of convergent or parallel evolution of morphological characters. In the present study, phylogenetic analyses of fifteen main segregates of Prunus s.l. represented by eighty-four species were conducted with maximum parsimony and Bayesian approaches using twelve chloroplast regions (atpB-rbcL, matK, ndhF, psbA-trnH, rbcL, rpL16, rpoC1, rps16, trnS-G, trnL, trnL-F and ycf1) and three nuclear genes (ITS, s6pdh and SbeI) to explore their infrageneric relationships. The results of these analyses were used to develop a new, phylogeny-based classification of Prunus s.l. Our phylogenetic reconstructions resolved three main clades of Prunus s.l. with strong support. We adopted a broad-sensed genus, Prunus, and recognised three subgenera corresponding to the three main clades: subgenus Padus, subgenus Cerasus and subgenus Prunus. Seven sections of subgenus Prunus were recognised. The dwarf cherries, which were previously assigned to subgenus Cerasus, were included in subgenus Prunus. One new section name, Prunus L. subgenus Prunus section Persicae (T. T. Yü & L. T. Lu) S. L. Zhou and one new species name, Prunus tianshanica (Pojarkov) S. Shi, were proposed. PMID:23945216

  12. A re-evaluation of phylogenetic relationships within reed warblers (Aves: Acrocephalidae) based on eight molecular loci and ISSR profiles.

    PubMed

    Arbabi, Tayebeh; Gonzalez, Javier; Wink, Michael

    2014-09-01

    Acrocephalidae is the most monomorphic family among passerines and has seen a long history of different classifications and successive revisions. In this study, we evaluated the phylogenetic relationships among 35 species of Acrocephalidae based on DNA sequences from five nuclear loci (MB, ODC, LDH, FIB5 and RAG-1), three mitochondrial genes (CYB, ND2 and COI) and genomic fingerprinting with ISSR-PCR. We could improve the resolution of phylogenetic relationships among many species, but despite the use of 6280 nucleotides, some deep-level relationships remain enigmatic. Lack of nodal support at some branches may be the result of rapid radiation. The last common ancestor of this family was dated to the Middle Miocene (14 MYA). In agreement with previous studies, we recovered the major clades of Acrocephalus, Iduna (except I. aedon), Hippolais, Nesillas and Calamonastides. We accept the current taxonomic position of Calamonastides gracilirostris as a monotypic genus and the inclusion of Iduna natalensis and I. similis within Iduna, but phylogenetic analyses based on mitochondrial and nuclear genes as well as ISSR profiles did not support the position of I. aedon in Iduna. Therefore, we resurrect the former genus Phragamaticola for this species in order to avoid paraphyletic clades. PMID:24910156

  13. Increased phylogenetic diversity of bovine viral diarrhoea virus type 1 isolates in England and Wales since 2001.

    PubMed

    Strong, R; Errington, J; Cook, R; Ross-Smith, N; Wakeley, P; Steinbach, F

    2013-03-23

    Currently, there are two recognised genotypes of Bovine viral diarrhoea virus (BVDV), type 1 and type 2. These genotypes are divided into subtypes based on phylogenetic analysis, namely a-p for BVDV-1 and a-c for BVDV-2. Within this study, the genetic heterogeneity of BVDV-1 in England and Wales was investigated and compared to the situation in 1996/1997. Viral RNA was extracted from 316 blood samples collected between 2004 and 2009 that were previously identified as BVDV-1 positive. A region of the 5' untranslated region (UTR) was amplified by RT-PCR and the PCR products were sequenced. Phylogenetic analysis of the 5'UTR demonstrated the existence of five subtypes of BVDV-1 circulating in England and Wales, namely BVDV-1a (244 samples), BVDV-1b (50), BVDV-1e (3), BVDV-1f (1) and BVDV-1i (18). Phylogenetic analysis of the nucleotide sequence for the N(pro) region of the viral genome supported the classification obtained with the 5'UTR. Given the fact that only three subtypes were detected in 1999 this report supports the notion that the restocking of cattle from continental Europe, after the mass culling during the Foot-and-Mouth outbreak in 2001 and slaughter of cattle due to bovine tuberculosis infection, has increased the genetic diversity of BVDV-1 subtypes in England and Wales in the past 10 years. PMID:23022681

  14. Phylogenetic Analysis of the Complete Mitochondrial Genome of Madurella mycetomatis Confirms Its Taxonomic Position within the Order Sordariales

    PubMed Central

    van de Sande, Wendy W. J.

    2012-01-01

    Background Madurella mycetomatis is the most common cause of human eumycetoma. The genus Madurella has been characterized by overall sterility on mycological media. Due to this sterility and the absence of other reliable morphological and ultrastructural characters, the taxonomic classification of Madurella has long been a challenge. Mitochondria are of monophyletic origin and mitochondrial genomes have been proven to be useful in phylogenetic analyses. Results The first complete mitochondrial DNA genome of a mycetoma-causative agent was sequenced using 454 sequencing. The mitochondrial genome of M. mycetomatis is a circular DNA molecule with a size of 45,590 bp, encoding for the small and the large subunit rRNAs, 27 tRNAs, 11 genes encoding subunits of respiratory chain complexes, 2 ATP synthase subunits, 5 hypothetical proteins, 6 intronic proteins including the ribosomal protein rps3. In phylogenetic analyses using amino acid sequences of the proteins involved in respiratory chain complexes and the 2 ATP synthases it appeared that M. mycetomatis clustered together with members of the order Sordariales and that it was most closely related to Chaetomium thermophilum. Analyses of the gene order showed that within the order Sordariales a similar gene order is found. Furthermore also the tRNA order seemed mostly conserved. Conclusion Phylogenetic analyses of fungal mitochondrial genomes confirmed that M. mycetomatis belongs to the order of Sordariales and that it was most closely related to Chaetomium thermophilum, with which it also shared a comparable gene and tRNA order. PMID:22701687

  15. Geographical Classification in Comics

    E-print Network

    unknown authors

    2009-01-01

    In this article, the concept of geographical classification, in itself not a particularly widespread method in cultural geography, is applied to the field of comics. Although geographical classification is already used in comics sometimes, it is rarely reflected upon. This article aims at closing this gap by addressing some issues concerning geographical classification and its application to works of art in general and comics in particular. Before moving on to comics, I'd like to start with some examples from the field of “classical” art to demonstrate the ubiquity of classifications in the world of art and art history. These examples will introduce some concepts and problems connected with art and geographical classification. Some of these are well-known, whereas others have been relatively neglected so far. These concepts will be applied to comics later in the article. As a first example, consider a typical floor plan of an art museum, say, the National Gallery in London. In the floor plan of the National

  16. Disentangling the Phylogenetic and Ecological Components of Spider Phenotypic Variation

    PubMed Central

    Gonçalves-Souza, Thiago; Diniz-Filho, José Alexandre Felizola; Romero, Gustavo Quevedo

    2014-01-01

    An understanding of how the degree of phylogenetic relatedness influences the ecological similarity among species is crucial to inferring the mechanisms governing the assembly of communities. We evaluated the relative importance of spider phylogenetic relationships and ecological niche (plant morphological variables) to the variation in spider body size and shape by comparing spiders at different scales: (i) between bromeliads and dicot plants (i.e., habitat scale) and (ii) among bromeliads with distinct architectural features (i.e., microhabitat scale). We partitioned the interspecific variation in body size and shape into phylogenetic components (which express trait values as expected from phylogenetic relationships among species) and ecological components (which express trait values independent of phylogenetic relationships). At the habitat scale, bromeliad spiders were larger and flatter than spiders associated with the surrounding dicots. At this scale, plant morphology sorted out closely related spiders. Our results showed that spider flatness is phylogenetically clustered at the habitat scale, whereas it is phylogenetically overdispersed at the microhabitat scale, although phylogenetic signal is present at both scales. Taken together, these results suggest that whereas at the habitat scale selective colonization affects spider body size and shape, at fine scales both selective colonization and adaptive evolution determine spider body shape. By partitioning the phylogenetic and ecological components of phenotypic variation, we were able to disentangle the evolutionary history of distinct spider traits and show that plant architecture plays a role in the evolution of spider body size and shape. We also discuss the relevance of considering multiple scales when studying phylogenetic community structure. PMID:24651264

  17. Improved Hierarchical Optimization-Based Classification of Hyperspectral Images Using Shape Analysis

    NASA Technical Reports Server (NTRS)

    Tarabalka, Yuliya; Tilton, James C.

    2012-01-01

    A new spectral-spatial method for classification of hyperspectral images is proposed. The HSegClas method is based on the integration of probabilistic classification and shape analysis within the hierarchical step-wise optimization algorithm. First, probabilistic support vector machines classification is applied. Then, at each iteration two neighboring regions with the smallest Dissimilarity Criterion (DC) are merged, and classification probabilities are recomputed. The important contribution of this work consists in estimating a DC between regions as a function of statistical, classification and geometrical (area and rectangularity) features. Experimental results are presented on a 102-band ROSIS image of the Center of Pavia, Italy. The developed approach yields more accurate classification results when compared to previously proposed methods.

  18. Comparisons of neural networks to standard techniques for image classification and correlation

    NASA Technical Reports Server (NTRS)

    Paola, Justin D.; Schowengerdt, Robert A.

    1994-01-01

    Neural network techniques for multispectral image classification and spatial pattern detection are compared to the standard techniques of maximum-likelihood classification and spatial correlation. The neural network produced a more accurate classification than maximum-likelihood of a Landsat scene of Tucson, Arizona. Some of the errors in the maximum-likelihood classification are illustrated using decision region and class probability density plots. As expected, the main drawback to the neural network method is the long time required for the training stage. The network was trained using several different hidden layer sizes to optimize both the classification accuracy and training speed, and it was found that one node per class was optimal. The performance improved when 3x3 local windows of image data were entered into the net. This modification introduces texture into the classification without explicit calculation of a texture measure. Larger windows were successfully used for the detection of spatial features in Landsat and Magellan synthetic aperture radar imagery.

  19. GPD: A Graph Pattern Diffusion Kernel for Accurate Graph Classification with Applications

    E-print Network

    Huan, Jun "Luke"

    - cation. In our method, we first identify all frequent patterns from a graph database. We then map subgraphs to graphs in the graph database and use a process we call "pattern diffusion" to label nodes searching graph databases [23], [32], [42], [44]. Much of the work on analyzing graph data previous

  20. ACCURATE POSTERIOR PROBABILITY ESTIMATES FOR CHANNEL EQUALIZATION USING GAUSSIAN PROCESSES FOR CLASSIFICATION

    E-print Network

    Pérez-Cruz, Fernando

    by the Spanish MEC (TIC2006- 13514-C01-01/TCM). This work has been partially supported by the Spanish MEC (TEC], radial basis functions (RBFs) [3], recurrent RBFs [4], self-organizing feature maps [5], wavelet neural

  1. Accurate Parental Classification of Overweight Adolescents' Weight Status: Does It Matter?

    Microsoft Academic Search

    Dianne Neumark-Sztainer; Melanie Wall; Patricia van den Berg

    2010-01-01

    OBJECTIVE. Our goal was to explore whether parents of overweight adolescents who recognize that their children are overweight engage in behaviors that are likely to help their adolescents with long-term weight management. METHODS. The study population included overweight adolescents (BMI 85th percentile) who participated in Project EAT (Eating Among Teens) I (1999) and II (2004) and their parents who were interviewed

  2. Using minimum DNA marker loci for accurate population classification in rice (Oryza sativa L.)

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Using few DNA markers to classify genetic background of a germplasm pool will help breeders make a quick decision while saving time and resources. WHICHLOCI is a computer program that selects the best combination of loci for population assignment through empiric analysis of molecular marker data. Th...

  3. Classification of mental disorders*

    PubMed Central

    Stengel, E.

    1959-01-01

    One of the fundamental difficulties in devising a classification of mental disorders is the lack of agreement among psychiatrists regarding the concepts upon which it should be based: diagnoses can rarely be verified objectively and the same or similar conditions are described under a confusing variety of names. This situation militates against the ready exchange of ideas and experiences and hampers progress. As a first step towards remedying this state of affairs, the author of the article below has undertaken a critical survey of existing classifications. He shows how some of the difficulties created by lack of knowledge regarding pathology and etiology may be overcome by the use of “operational definitions” and outlines the basic principles on which he believes a generally acceptable international classification might be constructed. If this can be done it should lead to a greater measure of agreement regarding the value of specific treatments for mental disorders and greatly facilitate a broad epidemiological approach to psychiatric research. PMID:13834299

  4. Commission 45: Spectral Classification

    NASA Astrophysics Data System (ADS)

    Gray, Richard O.; Nordström, Birgitta; Giridhar, Sunetra; Burgasser, Adam J.; Eyer, Laurent; Gupta, Ranjan; Hanson, Margaret M.; Irwin, Michael J.; Soubiran, Caroline

    2012-04-01

    This report gives an update on developments since the last General Assembly in Rio de Janeiro. Classification - both photometric and spectral - continues to play a vital role in stellar astrophysics and stellar surveys. During the past three years, rapid progress has been made in the classification of brown dwarfs, with the discovery of the first Y dwarfs and the introduction of a near-IR classification system for M- and L-dwarfs. The number of known L-dwarfs now exceeds 1000, and so peculiar types are beginning to show up. For instance, there is now enough material to define a low-gravity spectral sequence for the L0 - L5 dwarfs. In addition, a number of unusually blue L-dwarfs are now known. Large-area surveys, always of interest to Commission 45, have proliferated during this period, including RAVE, SEGUE, and WISE, with many more in the planning stages.

  5. The new revised classification of acute pancreatitis 2012.

    PubMed

    Sarr, Michael G; Banks, Peter A; Bollen, Thomas L; Dervenis, Christos; Gooszen, Hein G; Johnson, Colin D; Tsiotos, Gregory G; Vege, Santhi Swaroop

    2013-06-01

    This study aims to update the 1991 Atlanta Classification of acute pancreatitis, to standardize the reporting of and terminology of the disease and its complications. Important features of this classification have incorporated new insights into the disease learned over the last 20 years, including the recognition that acute pancreatitis and its complications involve a dynamic process involving two phases, early and late. The accurate and consistent description of acute pancreatitis will help to improve the stratification and reporting of new methods of care of acute pancreatitis across different practices, geographic areas, and countries. PMID:23632143

  6. Naive random subspace ensemble with linear classifiers for real-time classification of fMRI data

    E-print Network

    Kuncheva, Ludmila I.

    Keywords: functional magnetic resonance imaging (fMRI); online classification; naive labelling; classifier ensembles. Abstract: Functional magnetic resonance imaging (fMRI) provides a spatially accurate measure of brain ...

  7. A closer look at bacteroides: phylogenetic relationship and genomic implications of a life in the human gut.

    PubMed

    Karlsson, Fredrik H; Ussery, David W; Nielsen, Jens; Nookaew, Intawat

    2011-04-01

    The human gut is extremely densely inhabited by bacteria mainly from two phyla, Bacteroidetes and Firmicutes, and there is a great interest in analyzing whole-genome sequences for these species because of their relation to human health and disease. Here, we do whole-genome comparison of 105 Bacteroidetes/Chlorobi genomes to elucidate their phylogenetic relationship and to gain insight into what is separating the gut living Bacteroides and Parabacteroides genera from other Bacteroidetes/Chlorobi species. A comprehensive analysis shows that Bacteroides species have a higher number of extracytoplasmic function σ factors (ECF σ factors) and two component systems for extracellular signal transduction compared to other Bacteroidetes/Chlorobi species. A whole-genome phylogenetic analysis shows very little difference between the Parabacteroides and Bacteroides genera. Further analysis shows that Bacteroides and Parabacteroides species share a large common core of 1,085 protein families. Genome atlases illustrate that there are few and only small unique areas on the chromosomes of four Bacteroides/Parabacteroides genomes. Functional classification into clusters of orthologous groups shows that Bacteroides species are enriched in carbohydrate transport and metabolism proteins. Classification of proteins in KEGG metabolic pathways gives a detailed view of the genome's metabolic capabilities that can be linked to its habitat. Bacteroides pectinophilus and Bacteroides capillosus do not cluster together with other Bacteroides species, based on analysis of 16S rRNA sequence, whole-genome protein families and functional content; the 16S rRNA sequences of the two species suggest that they belong to the Firmicutes phylum. We have presented a more detailed and precise description of the phylogenetic relationships of members of the Bacteroidetes/Chlorobi phylum by whole genome comparison. Gut living Bacteroides have an enriched set of glycan, vitamin, and cofactor enzymes important for diet digestion. PMID:21222211

  8. DINOFLAJ: Dinoflagellate Classification Database

    NSDL National Science Digital Library

    Fensome, Robert A.

    Rob Fensome and colleagues at the Geological Survey of Canada (Atlantic division), Bedford Institute of Oceanography, have put together DINOFLAJ, a classification database on dinoflagellates. Best known in their relation to "red tides" and paralytic shellfish poisoning, dinoflagellates are single-celled organisms that occur worldwide. The DINOFLAJ database contains current classification information on "fossil and living dinoflagellates down to generic rank, and an index of fossil dinoflagellates at generic, specific, and infraspecific ranks." A glossary and an extensive reference list complete the site.

  9. Shark Teeth Classification

    NSDL National Science Digital Library

    Sally Creel

    2009-03-01

    On a recent autumn afternoon at Harmony Leland Elementary in Mableton, Georgia, students in a fifth-grade science class investigated the essential process of classification--the act of putting things into groups according to some common characteristics or attributes. While they may have honed these skills earlier in the week by grouping their own shoes or school supplies, this class provided the unique opportunity to classify objects that are inherently fascinating to students--shark teeth fossils! This article describes how by using the teeth to estimate the length of ancient sharks, students got a classification activity they could really sink their teeth into.

  10. Conjugate Gibbs sampling for Bayesian phylogenetic models.

    PubMed

    Lartillot, Nicolas

    2006-12-01

    We propose a new Markov Chain Monte Carlo (MCMC) sampling mechanism for Bayesian phylogenetic inference. This method, which we call conjugate Gibbs, relies on analytical conjugacy properties, and is based on an alternation between data augmentation and Gibbs sampling. The data augmentation step consists of sampling a detailed substitution history for each site, and across the whole tree, given the current value of the model parameters. Provided convenient priors are used, the parameters of the model can then be directly updated by a Gibbs sampling procedure, conditional on the current substitution history. Alternating between these two sampling steps yields an MCMC device whose equilibrium distribution is the posterior probability density of interest. We show, on real examples, that this conjugate Gibbs method leads to a significant improvement of the mixing behavior of the MCMC. In all cases, the decorrelation times of the resulting chains are smaller than those obtained by standard Metropolis-Hastings procedures by at least one order of magnitude. The method is particularly well suited to heterogeneous models, i.e. those assuming site-specific random variables. In particular, the conjugate Gibbs formalism allows one to propose efficient implementations of complex models, for instance assuming site-specific substitution processes, that would not be accessible to standard MCMC methods. PMID:17238840

  11. Recursive algorithms for phylogenetic tree counting

    PubMed Central

    2013-01-01

    Background: In Bayesian phylogenetic inference we are interested in distributions over a space of trees. The number of trees in a tree space is an important characteristic of the space and is useful for specifying prior distributions. When all samples come from the same time point and no prior information is available on divergence times, the tree counting problem is easy. However, when fossil evidence is used in the inference to constrain the tree or data are sampled serially, new tree spaces arise and counting the number of trees is more difficult. Results: We describe an algorithm that is polynomial in the number of sampled individuals for counting of resolutions of a constraint tree assuming that the number of constraints is fixed. We generalise this algorithm to counting resolutions of a fully ranked constraint tree. We describe a quadratic algorithm for counting the number of possible fully ranked trees on n sampled individuals. We introduce a new type of tree, called a fully ranked tree with sampled ancestors, and describe a cubic time algorithm for counting the number of such trees on n sampled individuals. Conclusions: These algorithms should be employed for Bayesian Markov chain Monte Carlo inference when fossil data are included or data are serially sampled. PMID:24164709
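
    As a small illustration of the "easy" case mentioned in this abstract (all samples from one time point, no constraints), the sketch below evaluates two standard closed-form counts: (2n-3)!! rooted binary labelled topologies and n!(n-1)!/2^(n-1) fully ranked trees (labelled histories) on n tips. It is not the constraint-tree or sampled-ancestor algorithm of the paper, and the function names are hypothetical.

```python
from math import factorial


def count_rooted_topologies(n):
    """Rooted binary labelled tree topologies on n tips: (2n-3)!!.

    Assumes contemporaneous tips and no constraints (the 'easy' case).
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n-3 (empty product for n <= 2).
    for k in range(3, 2 * n - 2, 2):
        count *= k
    return count


def count_ranked_trees(n):
    """Fully ranked binary trees (labelled histories) on n tips.

    Closed form n! (n-1)! / 2^(n-1), i.e. the product of C(k, 2) for
    k = 2..n: one choice of coalescing pair at each ranked internal node.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return factorial(n) * factorial(n - 1) // 2 ** (n - 1)


if __name__ == "__main__":
    for n in range(2, 7):
        print(n, count_rooted_topologies(n), count_ranked_trees(n))
```

    For n = 4 the two counts already differ (15 topologies versus 18 ranked trees), because a symmetric four-tip topology admits two orderings of its internal nodes.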

  12. The phylogenetic profile of mast cells.

    PubMed

    Crivellato, Enrico; Travan, Luciana; Ribatti, Domenico

    2015-01-01

    Mast cells (MCs) are tissue-based immune cells that participate in both innate and adaptive immunities as well as in tissue-remodelling processes. Their evolutionary history appears as a fascinating process, whose outline we can only partly reconstruct according to current remnant evidence. MCs have been identified in all vertebrate classes, and a cell population with the overall characteristics of higher vertebrate MCs is identifiable even in the most evolutionarily advanced fish species. In invertebrates, cells related to vertebrate MCs have been recognized in ascidians, a class of urochordates which appeared approximately 500 million years ago. These comprise the granular hemocyte with intermediate characteristics of basophils and MCs and the "test cell" (see below). Both types of cells contain histamine and heparin, and provide defensive functions. The test cell releases tryptase after stimulation with compound 48/80. A leukocyte ancestor operating in the context of a primitive local innate immunity probably represents the MC phylogenetic progenitor. This cell was likely involved in phagocytic and killing activity against pathogens and operated as a general inducer of inflammation. This early type of defensive cell possibly expressed concomitant tissue-reparative functions. With the advent of recombinase activating gene (RAG)-mediated adaptive immunity in the Cambrian era, some 550 million years ago, and the emergence of early vertebrates, MC progenitors differentiated towards a more complex cellular entity. Early MCs probably appeared in the last common ancestor we shared with hagfish, lamprey, and sharks about 450-500 million years ago. PMID:25388242

  13. Phylogenetic position of the spirochetal genus Cristispira.

    PubMed Central

    Paster, B J; Pelletier, D A; Dewhirst, F E; Weisburg, W G; Fussing, V; Poulsen, L K; Dannenberg, S; Schroeder, I

    1996-01-01

    Comparative sequence analysis of 16S rRNA genes was used to determine the phylogenetic relationship of the genus Cristispira to other spirochetes. Since Cristispira organisms cannot presently be grown in vitro, 16S rRNA genes were amplified directly from bacterial DNA isolated from Cristispira cell-laden crystalline styles of the oyster Crassostrea virginica. The amplified products were then cloned into Escherichia coli plasmids. Sequence comparisons of the gene coding for 16S rRNA (rDNA) insert of one clone, designated CP1, indicated that it was spirochetal. The sequence of the 16S rDNA insert of another clone was mycoplasmal. The CP1 sequence possessed most of the individual base signatures that are unique to 16S rRNA (or rDNA) sequences of known spirochetes. CP1 branched deeply among other spirochetal genera within the family Spirochaetaceae, and accordingly, it represents a separate genus within this family. A fluorescently labeled DNA probe designed from the CP1 sequence was used for in situ hybridization experiments to verify that the sequence obtained was derived from the observed Cristispira cells. PMID:8975621

  14. Pareto-optimal phylogenetic tree reconciliation

    PubMed Central

    Libeskind-Hadas, Ran; Wu, Yi-Chieh; Bansal, Mukul S.; Kellis, Manolis

    2014-01-01

    Motivation: Phylogenetic tree reconciliation is a widely used method for reconstructing the evolutionary histories of gene families and species, hosts and parasites and other dependent pairs of entities. Reconciliation is typically performed using maximum parsimony, in which each evolutionary event type is assigned a cost and the objective is to find a reconciliation of minimum total cost. It is generally understood that reconciliations are sensitive to event costs, but little is understood about the relationship between event costs and solutions. Moreover, choosing appropriate event costs is a notoriously difficult problem. Results: We address this problem by giving an efficient algorithm for computing Pareto-optimal sets of reconciliations, thus providing the first systematic method for understanding the relationship between event costs and reconciliations. This, in turn, results in new techniques for computing event support values and, for cophylogenetic analyses, performing robust statistical tests. We provide new software tools and demonstrate their use on a number of datasets from evolutionary genomic and cophylogenetic studies. Availability and implementation: Our Python tools are freely available at www.cs.hmc.edu/~hadas/xscape. Contact: mukul@engr.uconn.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:24932009
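
    To make the notion of Pareto optimality in this abstract concrete, the sketch below filters a list of candidate reconciliations, each summarised by a tuple of event counts, down to the non-dominated ones. A candidate is dominated if some other candidate uses no more events of every type and strictly fewer of at least one, so it can never be the cheapest under any positive event costs. This is a brute-force illustration of the general definition, not the efficient algorithm or the software described in the paper; the event types, the sample data and the function names are assumptions.

```python
def dominates(a, b):
    """True if event-count vector a is no worse than b everywhere
    and strictly better in at least one event type."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(candidates):
    """Return the non-dominated event-count vectors, sorted, without
    duplicates. Brute force: quadratic in the number of candidates."""
    unique = sorted(set(candidates))
    return [c for c in unique
            if not any(dominates(other, c) for other in unique)]


if __name__ == "__main__":
    # Hypothetical (duplications, transfers, losses) counts.
    candidates = [(2, 0, 3), (1, 1, 3), (2, 1, 3), (0, 2, 5), (1, 1, 3)]
    print(pareto_front(candidates))
```

    In this made-up example (2, 1, 3) is discarded because (2, 0, 3) achieves the same duplication and loss counts with one fewer transfer; the remaining three vectors each trade one event type against another.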

  15. Phylogenetic perspectives of nitrogen-fixing actinobacteria.

    PubMed

    Gtari, Maher; Ghodhbane-Gtari, Faten; Nouioui, Imen; Beauchemin, Nicholas; Tisa, Louis S

    2012-01-01

    It was assumed for a long time that the ability to catalyze atmospheric nitrogen (diazotrophy) has a narrow distribution among actinobacteria being limited to the genus Frankia. Recently, the number of nitrogen fixation (nifH) genes identified in other non-Frankia actinobacteria has dramatically increased and has opened investigation on the origin and emergence of diazotrophy among actinobacteria. During the last decade, Mycobacterium flavum, Corynebacterium autotrophicum and a fluorescent Arthrobacter sp. have been reported to have nitrogenase activity, but these studies have not been further verified. Additional reports of nitrogen fixation by Agromyces, Microbacterium, Corynebacterium and Micromonospora isolated from root nodules of leguminous and actinorhizal plants have increased. For several actinobacteria, nitrogen fixation was demonstrated by the ability to grow on nitrogen-free medium, acetylene reduction activity, 15N isotope dilution analysis and identification of a nifH gene via PCR amplification. Moreover, the analyses of draft genome sequences of actinobacteria including Slackia exigua, Rothia mucilaginosa and Gordonibacter pamelaeae have also revealed the presence of nifH-like sequences. Whether these nifH sequences are associated with effective nitrogen fixation in these actinobacteria taxa has not yet been demonstrated. These genes may be vertically or horizontally transferred and be silent sequences. These ideas merit further investigation. This minireview presents a phylogenetic comparison of nitrogen fixation gene (nifH) with the aim of elucidating the processes underlying the evolutionary history of this catalytic ability among actinobacteria. PMID:21779790

  16. A new phylogenetic group of Propionibacterium acnes.

    PubMed

    McDowell, Andrew; Perry, Alexandra L; Lambert, Peter A; Patrick, Sheila

    2008-02-01

    Immunofluorescence microscopy-based identification of presumptive Propionibacterium acnes isolates, using the P. acnes-specific mAb QUBPa3, revealed five organisms with an atypical cellular morphology. Unlike the coryneform morphology seen with P. acnes types I and II, these isolates exhibited long slender filaments (which formed large tangled aggregates) not previously described in P. acnes. No reaction with mAbs that label P. acnes types IA (QUBPa1) and II (QUBPa2) was observed. Nucleotide sequencing of the 16S rRNA gene (1484 bp) revealed the isolates to have between 99.8 and 99.9 % identity to the 16S rRNA gene of the P. acnes type IA, IB and II strains NCTC 737, KPA171202 and NCTC 10390, respectively. Analysis of the recA housekeeping gene (1047 bp) did reveal, however, a greater number of conserved nucleotide polymorphisms between the sequences from these isolates and those from NCTC 737 (98.9 % identity), KPA171202 (98.9 % identity) and NCTC 10390 (99.1 % identity). Phylogenetic investigations demonstrated that the isolates belong to a novel recA cluster or lineage distinct from P. acnes types I and II. We now propose this new grouping as P. acnes type III. The prevalence and clinical importance of this novel recA lineage amongst isolates of P. acnes remains to be determined. PMID:18201989

  17. Comprehensive Phylogenetic Analysis of Bacterial Reverse Transcriptases

    PubMed Central

    Toro, Nicolás; Nisa-Martínez, Rafael

    2014-01-01

    Much less is known about reverse transcriptases (RTs) in prokaryotes than in eukaryotes, with most prokaryotic enzymes still uncharacterized. Two surveys involving BLAST searches for RT genes in prokaryotic genomes revealed the presence of large numbers of diverse, uncharacterized RTs and RT-like sequences. Here, using consistent annotation across all sequenced bacterial species from GenBank and other sources via RAST, available from the PATRIC (Pathogenic Resource Integration Center) platform, we have compiled the data for currently annotated reverse transcriptases from completely sequenced bacterial genomes. RT sequences are broadly distributed across bacterial phyla, but green sulfur bacteria and cyanobacteria have the highest levels of RT sequence diversity (≤85% identity) per genome. By contrast, phylum Actinobacteria, for which a large number of genomes have been sequenced, was found to have a low RT sequence diversity. Phylogenetic analyses revealed that bacterial RTs could be classified into 17 main groups: group II introns, retrons/retron-like RTs, diversity-generating retroelements (DGRs), Abi-like RTs, CRISPR-Cas-associated RTs, group II-like RTs (G2L), and 11 other groups of RTs of unknown function. Proteobacteria had the highest potential functional diversity, as they possessed most of the RT groups. Group II introns and DGRs were the most widely distributed RTs in bacterial phyla. Our results provide insights into bacterial RT phylogeny and the basis for an update of annotation systems based on sequence/domain homology. PMID:25423096

  18. 28 CFR 17.26 - Derivative classification.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ...Derivative classification. 17.26 Section...Classified Information § 17.26 Derivative classification. (a) Persons...classification guides. (b) Persons who apply derivative classification markings shall...

  19. 28 CFR 17.26 - Derivative classification.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ...Derivative classification. 17.26 Section...Classified Information § 17.26 Derivative classification. (a) Persons...classification guides. (b) Persons who apply derivative classification markings shall...

  20. 28 CFR 17.26 - Derivative classification.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ...Derivative classification. 17.26 Section...Classified Information § 17.26 Derivative classification. (a) Persons...classification guides. (b) Persons who apply derivative classification markings shall...